---
title: LLM as Reranker
description: 'Flexible reranking using LLMs'
---

<Warning>
**This page has been superseded.** Please see [LLM Reranker](/components/rerankers/models/llm_reranker) for the complete and up-to-date documentation on using LLMs for reranking.
</Warning>

The LLM-based reranker provides maximum flexibility by using any large language model (LLM) to score document relevance. This approach allows custom prompts and domain-specific scoring logic.

## Supported LLM Providers

Any LLM provider supported by Mem0 can be used for reranking:

- **OpenAI**: GPT-4, GPT-3.5-turbo, etc.
- **Anthropic**: Claude models
- **Together**: Open-source models
- **Groq**: Fast inference
- **Ollama**: Local models
- And more...

## Configuration

```python Python
from mem0 import Memory

config = {
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "my_memories",
            "path": "./chroma_db"
        }
    },
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini"
        }
    },
    "reranker": {
        "provider": "llm",
        "config": {
            "model": "gpt-4o-mini",
            "provider": "openai",
            "api_key": "your-openai-api-key",  # or set OPENAI_API_KEY
            "top_k": 5,
            "temperature": 0.0
        }
    }
}

memory = Memory.from_config(config)
```

## Custom Scoring Prompt

You can provide a custom prompt for relevance scoring:

```python Python
custom_prompt = """You are a relevance scoring assistant. Rate how well this document answers the query.

Query: "{query}"
Document: "{document}"

Score from 0.0 to 1.0 where:
- 1.0: Perfect match, directly answers the query
- 0.8-0.9: Highly relevant, good match
- 0.6-0.7: Moderately relevant, partial match
- 0.4-0.5: Slightly relevant, limited useful information
- 0.0-0.3: Not relevant or no useful information

Provide only a single numerical score between 0.0 and 1.0."""

config["reranker"]["config"]["scoring_prompt"] = custom_prompt
```

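To make the role of the `{query}` and `{document}` placeholders concrete, here is a minimal sketch of how such a template could be filled and how a numeric score could be pulled out of the model's reply. It assumes the placeholders are substituted with `str.format`; `build_prompt` and `parse_score` are hypothetical helpers for illustration, not part of Mem0's API, and Mem0's internal handling may differ.

```python
import re

# Shortened stand-in for a scoring prompt; it uses the same two placeholders.
template = (
    'Query: "{query}"\n'
    'Document: "{document}"\n'
    "Provide only a single numerical score between 0.0 and 1.0."
)


def build_prompt(template: str, query: str, document: str) -> str:
    """Fill the {query} and {document} placeholders (assumes str.format semantics)."""
    return template.format(query=query, document=document)


def parse_score(reply: str, default: float = 0.0) -> float:
    """Extract the first number in an LLM reply and clamp it to [0.0, 1.0]."""
    match = re.search(r"\d*\.?\d+", reply)
    if match is None:
        return default  # no number found in the reply
    return min(1.0, max(0.0, float(match.group())))


prompt = build_prompt(template, "What is the user studying?", "I'm learning Python programming")
score = parse_score("Score: 0.85")
```

Under the `str.format` assumption, any literal braces in a custom prompt (for example a JSON snippet) would need to be doubled (`{{` and `}}`) so they are not treated as placeholders.
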
## Usage Example

```python Python
import os
from mem0 import Memory

# Set API key
os.environ["OPENAI_API_KEY"] = "your-api-key"

# Initialize memory with LLM reranker
config = {
    "vector_store": {"provider": "chroma"},
    "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini"}},
    "reranker": {
        "provider": "llm",
        "config": {
            "model": "gpt-4o-mini",
            "provider": "openai",
            "temperature": 0.0
        }
    }
}

memory = Memory.from_config(config)

# Add memories
messages = [
    {"role": "user", "content": "I'm learning Python programming"},
    {"role": "user", "content": "I find object-oriented programming challenging"},
    {"role": "user", "content": "I love hiking in national parks"}
]

memory.add(messages, user_id="david")

# Search with LLM reranking
results = memory.search("What programming topics is the user studying?", user_id="david")

for result in results['results']:
    print(f"Memory: {result['memory']}")
    print(f"Vector Score: {result['score']:.3f}")
    print(f"Rerank Score: {result['rerank_score']:.3f}")
    print()
```

```text Output
Memory: I'm learning Python programming
Vector Score: 0.856
Rerank Score: 0.920

Memory: I find object-oriented programming challenging
Vector Score: 0.782
Rerank Score: 0.850
```

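Because each result carries both `score` and `rerank_score`, it can be useful to check where the two orderings disagree. The sketch below uses values hand-copied from the sample output rather than a live call, and it assumes that higher is better for both scores, which may not hold for every vector store metric.

```python
# Sample data copied from the output above; a real run would use memory.search(...)["results"].
results = [
    {"memory": "I'm learning Python programming", "score": 0.856, "rerank_score": 0.920},
    {"memory": "I find object-oriented programming challenging", "score": 0.782, "rerank_score": 0.850},
]

# Order the same results by each score (assumption: higher means more relevant).
by_vector = sorted(results, key=lambda r: r["score"], reverse=True)
by_rerank = sorted(results, key=lambda r: r["rerank_score"], reverse=True)

# Memories whose position differs between the two orderings.
moved = [r["memory"] for v, r in zip(by_vector, by_rerank) if v["memory"] != r["memory"]]
print(moved)  # [] for this sample: both orderings agree
```
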
## Domain-Specific Scoring

Create specialized scoring for your domain:

```python Python
medical_prompt = """You are a medical relevance expert. Score how relevant this medical record is to the clinical query.

Clinical Query: "{query}"
Medical Record: "{document}"

Consider:
- Clinical relevance and accuracy
- Patient safety implications
- Diagnostic value
- Treatment relevance

Score from 0.0 to 1.0. Provide only the numerical score."""

config = {
    "reranker": {
        "provider": "llm",
        "config": {
            "model": "gpt-4o-mini",
            "provider": "openai",
            "scoring_prompt": medical_prompt,
            "temperature": 0.0
        }
    }
}
```

## Multiple LLM Providers

Use different LLM providers for reranking:

```python Python
# Using Anthropic Claude
anthropic_config = {
    "reranker": {
        "provider": "llm",
        "config": {
            "model": "claude-3-haiku-20240307",
            "provider": "anthropic",
            "temperature": 0.0
        }
    }
}

# Using local Ollama model
ollama_config = {
    "reranker": {
        "provider": "llm",
        "config": {
            "model": "llama2:7b",
            "provider": "ollama",
            "temperature": 0.0
        }
    }
}
```

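The provider snippets in this section contain only the `reranker` entry. One way to reuse them is to merge each into a shared base configuration. This is a minimal sketch: `with_reranker` is a hypothetical helper, not part of Mem0, and the base values are taken from the Usage Example.

```python
import copy


def with_reranker(base_config: dict, reranker_config: dict) -> dict:
    """Return a copy of base_config whose "reranker" entry comes from reranker_config."""
    merged = copy.deepcopy(base_config)  # leave the caller's base config untouched
    merged["reranker"] = copy.deepcopy(reranker_config["reranker"])
    return merged


base_config = {
    "vector_store": {"provider": "chroma"},
    "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini"}},
}

anthropic_config = {
    "reranker": {
        "provider": "llm",
        "config": {
            "model": "claude-3-haiku-20240307",
            "provider": "anthropic",
            "temperature": 0.0,
        },
    }
}

config = with_reranker(base_config, anthropic_config)
```

The merged `config` can then be passed to `Memory.from_config(config)` as in the Configuration section.
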
## Configuration Parameters

| Parameter | Description | Type | Default |
|-----------|-------------|------|---------|
| `model` | LLM model to use for scoring | `str` | `"gpt-4o-mini"` |
| `provider` | LLM provider name | `str` | `"openai"` |
| `api_key` | API key for the LLM provider | `str` | `None` |
| `top_k` | Maximum documents to return | `int` | `None` |
| `temperature` | Temperature for LLM generation | `float` | `0.0` |
| `max_tokens` | Maximum tokens for LLM response | `int` | `100` |
| `scoring_prompt` | Custom prompt template | `str` | Default prompt |

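The defaults in the table can be captured in a small typed container so that only deliberate overrides appear in application code. This is a sketch under stated assumptions: `LLMRerankerSettings` is a hypothetical helper, not a Mem0 class, and it assumes that omitting an option from the config is equivalent to accepting its default.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class LLMRerankerSettings:
    """Hypothetical container mirroring the defaults listed in the table above."""

    model: str = "gpt-4o-mini"
    provider: str = "openai"
    api_key: Optional[str] = None
    top_k: Optional[int] = None
    temperature: float = 0.0
    max_tokens: int = 100
    scoring_prompt: Optional[str] = None  # None stands for "use the default prompt"

    def to_reranker_config(self) -> dict:
        """Build the "reranker" config entry, omitting options left as None."""
        options = {key: value for key, value in asdict(self).items() if value is not None}
        return {"provider": "llm", "config": options}


reranker = LLMRerankerSettings(top_k=5).to_reranker_config()
```

The resulting dictionary has the same shape as the `"reranker"` entries used throughout this page.
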
## Advantages

- **Maximum Flexibility**: Custom prompts for any use case
- **Domain Expertise**: Leverage LLM knowledge for specialized domains
- **Interpretability**: Scoring behavior can be inspected and adjusted through the prompt
- **Multi-criteria**: Score based on multiple relevance factors

## Considerations

- **Latency**: Higher latency than specialized rerankers
- **Cost**: Each reranking operation incurs LLM API costs
- **Consistency**: Scores may vary slightly between runs
- **Prompt Engineering**: Requires careful prompt design

## Best Practices

1. **Temperature**: Use 0.0 for consistent scoring
2. **Prompt Design**: Be specific about scoring criteria
3. **Token Efficiency**: Keep prompts concise to reduce costs
4. **Caching**: Cache results for repeated queries when possible
5. **Fallback**: Handle API errors gracefully
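
The caching and fallback practices can be combined in a thin wrapper around the search call. The sketch below is one possible shape, not Mem0 functionality: `make_cached_search`, `search_fn`, and `fallback_fn` are hypothetical names, the cache is a plain in-memory dictionary with no expiry, and the two stand-in functions only simulate a reranked search and a plain search.

```python
from typing import Callable, Dict, Tuple

SearchFn = Callable[[str, str], dict]


def make_cached_search(search_fn: SearchFn, fallback_fn: SearchFn) -> SearchFn:
    """Wrap search_fn with an in-memory cache and a fallback used when it raises."""
    cache: Dict[Tuple[str, str], dict] = {}

    def search(query: str, user_id: str) -> dict:
        key = (query, user_id)
        if key in cache:
            return cache[key]  # repeated query: skip the LLM call entirely
        try:
            result = search_fn(query, user_id)
        except Exception:
            # Reranked search failed; serve the fallback and do not cache it.
            return fallback_fn(query, user_id)
        cache[key] = result
        return result

    return search


# Stand-in functions for demonstration only.
calls = {"n": 0}


def flaky_search(query: str, user_id: str) -> dict:
    calls["n"] += 1
    if query == "boom":
        raise RuntimeError("simulated API error")
    return {"results": [{"memory": f"match for {query}"}]}


def plain_search(query: str, user_id: str) -> dict:
    return {"results": []}


search = make_cached_search(flaky_search, plain_search)
first = search("python", "david")
second = search("python", "david")  # served from the cache
```

In a real application, `search_fn` might wrap `memory.search(query, user_id=user_id)` on a reranker-enabled instance, with `fallback_fn` calling an instance configured without a reranker. Cached entries would also need invalidating whenever new memories are added for that user.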