---
title: Sentence Transformer
description: 'Local reranking with HuggingFace cross-encoder models'
---

The Sentence Transformer reranker provides local reranking using HuggingFace cross-encoder models. It suits privacy-focused deployments where data must stay on-premises.

## Models

Any HuggingFace cross-encoder model can be used. Popular choices include:

- **`cross-encoder/ms-marco-MiniLM-L-6-v2`**: Default, good balance of speed and accuracy
- **`cross-encoder/ms-marco-TinyBERT-L-2-v2`**: Fastest, smaller model size
- **`cross-encoder/ms-marco-electra-base`**: Higher accuracy, larger model
- **`cross-encoder/stsb-distilroberta-base`**: Good for semantic similarity tasks

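One way to keep this choice explicit in application code is a small lookup keyed by priority. The sketch below is illustrative only: `MODEL_PRESETS`, its keys, and `pick_model` are hypothetical names and not part of mem0. Only the model names come from the list above.

```python
# Hypothetical presets: the keys are illustrative labels,
# the model names are the ones listed in this section.
MODEL_PRESETS = {
    "balanced": "cross-encoder/ms-marco-MiniLM-L-6-v2",
    "fastest": "cross-encoder/ms-marco-TinyBERT-L-2-v2",
    "accurate": "cross-encoder/ms-marco-electra-base",
    "similarity": "cross-encoder/stsb-distilroberta-base",
}


def pick_model(priority: str = "balanced") -> str:
    """Return the model name for a priority label, failing loudly on unknown labels."""
    try:
        return MODEL_PRESETS[priority]
    except KeyError:
        raise ValueError(
            f"unknown priority {priority!r}; expected one of {sorted(MODEL_PRESETS)}"
        ) from None


# The chosen name slots into the "rerank" section of a mem0 config.
rerank_config = {
    "provider": "sentence_transformer",
    "config": {"model": pick_model("fastest"), "device": "cpu"},
}
```
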
## Installation

```bash
pip install sentence-transformers
```

## Configuration

```python Python
from mem0 import Memory

config = {
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "my_memories",
            "path": "./chroma_db"
        }
    },
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini"
        }
    },
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": "cpu",  # or "cuda" for GPU
            "batch_size": 32,
            "show_progress_bar": False,
            "top_k": 5
        }
    }
}

memory = Memory.from_config(config)
```

## GPU Acceleration

For better performance, run the model on a GPU:

```python Python
config = {
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": "cuda",  # Use GPU
            "batch_size": 64   # larger batches suit GPUs with more memory
        }
    }
}
```

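If the same code has to run on machines with and without a GPU, the device can be chosen at startup instead of being hard-coded. The following is a minimal sketch: `detect_device` is a hypothetical helper, not a mem0 function, and it relies only on PyTorch's `torch.cuda.is_available()`, falling back to CPU when PyTorch is not importable.

```python
def detect_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # installed as a dependency of sentence-transformers
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = detect_device()

config = {
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": device,
            # 64 on GPU and 32 on CPU mirror the values used in this guide;
            # they are starting points, not tuned recommendations.
            "batch_size": 64 if device == "cuda" else 32,
        },
    }
}
```
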
## Usage Example

```python Python
from mem0 import Memory

# Initialize memory with local reranker
config = {
    "vector_store": {"provider": "chroma"},
    "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini"}},
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": "cpu"
        }
    }
}

memory = Memory.from_config(config)

# Add memories
messages = [
    {"role": "user", "content": "I love reading science fiction novels"},
    {"role": "user", "content": "My favorite author is Isaac Asimov"},
    {"role": "user", "content": "I also enjoy watching sci-fi movies"}
]

memory.add(messages, user_id="charlie")

# Search with local reranking
results = memory.search("What books does the user like?", user_id="charlie")

for result in results['results']:
    print(f"Memory: {result['memory']}")
    print(f"Vector Score: {result['score']:.3f}")
    print(f"Rerank Score: {result['rerank_score']:.3f}")
    print()
```

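To make the two scores concrete, here is a library-free sketch of what a reranking step does conceptually: each candidate receives a second score from a pairwise scorer, and the list is re-sorted by that score. This is not mem0's implementation. `rerank`, `overlap_score`, and the sample candidates (including their vector scores) are hypothetical stand-ins, and a real cross-encoder would take the place of the toy scorer.

```python
from typing import Callable, Dict, List, Optional


def overlap_score(query: str, text: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def rerank(
    query: str,
    results: List[Dict],
    score_fn: Callable[[str, str], float],
    top_k: Optional[int] = None,
) -> List[Dict]:
    """Attach a rerank_score to each result and sort by it, highest first."""
    scored = [{**r, "rerank_score": score_fn(query, r["memory"])} for r in results]
    scored.sort(key=lambda r: r["rerank_score"], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Made-up candidates in vector-score order, for illustration only.
candidates = [
    {"memory": "Enjoys watching sci-fi movies", "score": 0.81},
    {"memory": "Loves reading science fiction novels", "score": 0.78},
    {"memory": "Favorite author is Isaac Asimov", "score": 0.74},
]

top = rerank("reading novels", candidates, overlap_score, top_k=2)
for item in top:
    print(item["memory"], item["score"], item["rerank_score"])
```

The candidate that ranked second by vector score moves to the top because the pairwise scorer compares the query and the text directly, and `top_k` then trims the list.
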
## Custom Models

You can use any HuggingFace cross-encoder model:

```python Python
# Using a different model
config = {
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/stsb-distilroberta-base",
            "device": "cpu"
        }
    }
}
```

## Configuration Parameters

| Parameter | Description | Type | Default |
|-----------|-------------|------|---------|
| `model` | HuggingFace cross-encoder model name | `str` | `"cross-encoder/ms-marco-MiniLM-L-6-v2"` |
| `device` | Device to run model on (`cpu`, `cuda`, etc.) | `str` | `None` |
| `batch_size` | Batch size for processing documents | `int` | `32` |
| `show_progress_bar` | Show progress bar during processing | `bool` | `False` |
| `top_k` | Maximum documents to return | `int` | `None` |

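When configs are assembled in code, a small builder can catch misspelled keys before they reach the library. The sketch below is an assumption-laden helper, not part of mem0: `RERANK_DEFAULTS` and `build_rerank_config` are hypothetical names, the defaults are copied from the table above, and it assumes explicit `None` values are accepted the same way as omitted keys.

```python
# Defaults copied from the parameter table above.
RERANK_DEFAULTS = {
    "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
    "device": None,
    "batch_size": 32,
    "show_progress_bar": False,
    "top_k": None,
}


def build_rerank_config(**overrides) -> dict:
    """Merge overrides into the documented defaults and reject obvious mistakes."""
    unknown = set(overrides) - set(RERANK_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")

    settings = {**RERANK_DEFAULTS, **overrides}
    if not isinstance(settings["batch_size"], int) or settings["batch_size"] < 1:
        raise ValueError("batch_size must be a positive integer")
    if settings["top_k"] is not None and settings["top_k"] < 1:
        raise ValueError("top_k must be a positive integer or None")

    return {"provider": "sentence_transformer", "config": settings}


# Usage: only the values that differ from the defaults are passed.
rerank = build_rerank_config(device="cuda", batch_size=64, top_k=5)
```

The result can be placed under the `"rerank"` key of the dictionary passed to `Memory.from_config`.
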
## Advantages

- **Privacy**: All processing happens locally, with no external API calls
- **Cost**: No per-token charges after the initial model download
- **Customization**: Use any HuggingFace cross-encoder model
- **Offline**: Works without an internet connection once the model is downloaded

## Performance Considerations

- **First Run**: The model is downloaded on first use, which can take some time
- **Memory Usage**: The model is loaded into GPU or CPU memory, and larger models need more of it
- **Batch Size**: Tune `batch_size` to the memory available on the chosen device
- **Device**: GPU acceleration typically improves reranking speed

## Best Practices

1. **Model Selection**: Choose a model based on your accuracy versus speed requirements
2. **Device Management**: Use a GPU when one is available for better performance
3. **Batch Processing**: Process multiple documents together for efficiency
4. **Memory Monitoring**: Monitor system memory usage when running larger models
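
The batch processing practice above can be pictured as follows. A cross-encoder scores (query, document) pairs, and `batch_size` controls how many pairs are scored together. This sketch only shows the grouping step. `batched_pairs` is a hypothetical helper written for illustration, not a function from mem0 or sentence-transformers.

```python
from typing import Iterator, List, Sequence, Tuple


def batched_pairs(
    query: str, documents: Sequence[str], batch_size: int = 32
) -> Iterator[List[Tuple[str, str]]]:
    """Yield (query, document) pairs in groups of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pairs = [(query, doc) for doc in documents]
    for start in range(0, len(pairs), batch_size):
        yield pairs[start:start + batch_size]


# 70 placeholder documents split into batches of 32.
docs = [f"document {i}" for i in range(70)]
sizes = [len(batch) for batch in batched_pairs("example query", docs, batch_size=32)]
print(sizes)  # [32, 32, 6]
```

A smaller `batch_size` lowers peak memory at the cost of more passes, which is the trade-off to watch when monitoring memory with larger models.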