[Databricks Vector Search](https://docs.databricks.com/en/generative-ai/vector-search.html) is a serverless similarity search engine that lets you store a vector representation of your data, including metadata, in a vector database. With Vector Search, you can create auto-updating vector search indexes from Delta tables managed by Unity Catalog and query them with a simple API to return the most similar vectors.

### Usage

```python
import os
from mem0 import Memory

config = {
    "vector_store": {
        "provider": "databricks",
        "config": {
            "workspace_url": "https://your-workspace.databricks.com",
            "access_token": "your-access-token",
            "endpoint_name": "your-vector-search-endpoint",
            "index_name": "catalog.schema.index_name",
            "source_table_name": "catalog.schema.source_table",
            "embedding_dimension": 1536
        }
    }
}

m = Memory.from_config(config)
messages = [
    {"role": "user", "content": "I'm planning to watch a movie tonight. Any recommendations?"},
    {"role": "assistant", "content": "How about thriller movies? They can be quite engaging."},
    {"role": "user", "content": "I'm not a big fan of thriller movies but I love sci-fi movies."},
    {"role": "assistant", "content": "Got it! I'll avoid thriller recommendations and suggest sci-fi movies in the future."}
]
m.add(messages, user_id="alice", metadata={"category": "movies"})
```

### Config

Here are the parameters available for configuring Databricks Vector Search:

| Parameter | Description | Default Value |
| --- | --- | --- |
| `workspace_url` | The URL of your Databricks workspace | **Required** |
| `access_token` | Personal Access Token for authentication | `None` |
| `service_principal_client_id` | Service principal client ID (alternative to `access_token`) | `None` |
| `service_principal_client_secret` | Service principal client secret (required with `service_principal_client_id`) | `None` |
| `endpoint_name` | Name of the Vector Search endpoint | **Required** |
| `index_name` | Name of the vector index (Unity Catalog format: `catalog.schema.index`) | **Required** |
| `source_table_name` | Name of the source Delta table (Unity Catalog format: `catalog.schema.table`) | **Required** |
| `embedding_dimension` | Dimension of self-managed embeddings | `1536` |
| `embedding_source_column` | Column name for text when using Databricks-computed embeddings | `None` |
| `embedding_model_endpoint_name` | Databricks serving endpoint for embeddings | `None` |
| `embedding_vector_column` | Column name for self-managed embedding vectors | `embedding` |
| `endpoint_type` | Type of endpoint (`STANDARD` or `STORAGE_OPTIMIZED`) | `STANDARD` |
| `sync_computed_embeddings` | Whether to sync computed embeddings automatically | `True` |

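
Because four parameters are required and two of them must follow the three-part Unity Catalog naming scheme, it can help to check a config dict before handing it to `Memory.from_config`. The following is a minimal sketch of such a check. `check_databricks_config` is a hypothetical helper written for this page, not part of mem0, and it only encodes the rules stated in the table above.

```python
# Hypothetical pre-flight check; not part of mem0.
REQUIRED_KEYS = ("workspace_url", "endpoint_name", "index_name", "source_table_name")
ENDPOINT_TYPES = ("STANDARD", "STORAGE_OPTIMIZED")


def check_databricks_config(cfg):
    """Return a list of problems found in cfg; an empty list means none were found."""
    problems = []

    # Every parameter marked "Required" in the table must be present and non-empty.
    for key in REQUIRED_KEYS:
        if not cfg.get(key):
            problems.append(f"missing required parameter: {key}")

    # Index and source table names use the catalog.schema.name form.
    for key in ("index_name", "source_table_name"):
        name = cfg.get(key)
        if name:
            parts = name.split(".")
            if len(parts) != 3 or not all(parts):
                problems.append(f"{key} is not in catalog.schema.name form: {name!r}")

    # endpoint_type falls back to STANDARD, matching the documented default.
    endpoint_type = cfg.get("endpoint_type", "STANDARD")
    if endpoint_type not in ENDPOINT_TYPES:
        problems.append(f"unknown endpoint_type: {endpoint_type!r}")

    return problems


example = {
    "workspace_url": "https://your-workspace.databricks.com",
    "endpoint_name": "your-vector-search-endpoint",
    "index_name": "catalog.schema.index_name",
    "source_table_name": "source_table",  # missing catalog and schema
}
for problem in check_databricks_config(example):
    print(problem)
```

Returning a list rather than raising on the first error lets you report every misconfiguration in one pass.
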
### Authentication

Databricks Vector Search supports two authentication methods:

#### Service Principal (Recommended for Production)
```python
config = {
    "vector_store": {
        "provider": "databricks",
        "config": {
            "workspace_url": "https://your-workspace.databricks.com",
            "service_principal_client_id": "your-service-principal-id",
            "service_principal_client_secret": "your-service-principal-secret",
            "endpoint_name": "your-endpoint",
            "index_name": "catalog.schema.index_name",
            "source_table_name": "catalog.schema.source_table"
        }
    }
}
```

#### Personal Access Token (for Development)
```python
config = {
    "vector_store": {
        "provider": "databricks",
        "config": {
            "workspace_url": "https://your-workspace.databricks.com",
            "access_token": "your-personal-access-token",
            "endpoint_name": "your-endpoint",
            "index_name": "catalog.schema.index_name",
            "source_table_name": "catalog.schema.source_table"
        }
    }
}
```

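
To avoid hardcoding secrets in either variant, the credential keys can be filled in from the environment. The sketch below is one way to do that. `databricks_auth` is a hypothetical helper, the environment variable names are assumptions chosen for this example, and preferring the service principal over the token is a design choice of the sketch rather than documented mem0 behavior.

```python
# Hypothetical helper; the variable names below are assumptions for this sketch.
def databricks_auth(env):
    """Pick credential keys for the Databricks provider config from a mapping."""
    client_id = env.get("DATABRICKS_CLIENT_ID")
    client_secret = env.get("DATABRICKS_CLIENT_SECRET")
    token = env.get("DATABRICKS_TOKEN")

    # Prefer the service principal when both halves are present.
    if client_id and client_secret:
        return {
            "service_principal_client_id": client_id,
            "service_principal_client_secret": client_secret,
        }
    # A lone client ID or secret is almost certainly a misconfiguration.
    if client_id or client_secret:
        raise ValueError("service principal auth needs both the client ID and the client secret")
    if token:
        return {"access_token": token}
    raise ValueError("no Databricks credentials found")


# Example with a plain dict standing in for os.environ.
auth = databricks_auth({"DATABRICKS_TOKEN": "your-personal-access-token"})
provider_config = {
    "workspace_url": "https://your-workspace.databricks.com",
    "endpoint_name": "your-endpoint",
    "index_name": "catalog.schema.index_name",
    "source_table_name": "catalog.schema.source_table",
    **auth,
}
```

In real use you would pass `os.environ` instead of the literal dict; taking the mapping as a parameter keeps the helper easy to test.
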
### Embedding Options

#### Self-Managed Embeddings (Default)
Use your own embedding model and provide vectors directly:

```python
config = {
    "vector_store": {
        "provider": "databricks",
        "config": {
            # ... authentication config ...
            "embedding_dimension": 768,  # Match your embedding model
            "embedding_vector_column": "embedding"
        }
    }
}
```

#### Databricks-Computed Embeddings
Let Databricks compute embeddings from text using a serving endpoint:

```python
config = {
    "vector_store": {
        "provider": "databricks",
        "config": {
            # ... authentication config ...
            "embedding_source_column": "text",
            "embedding_model_endpoint_name": "e5-small-v2"
        }
    }
}
```

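
The two snippets above elide the shared connection and authentication keys. As a rough illustration of how the pieces fit together, the sketch below merges one shared base dict with each set of embedding options. `vector_store_config` and `BASE` are hypothetical names introduced for this example; only the config keys come from the parameter table.

```python
# Hypothetical helper; only the config keys themselves come from this page.
BASE = {
    "workspace_url": "https://your-workspace.databricks.com",
    "access_token": "your-personal-access-token",
    "endpoint_name": "your-endpoint",
    "index_name": "catalog.schema.index_name",
    "source_table_name": "catalog.schema.source_table",
}


def vector_store_config(base, **embedding_options):
    """Combine shared connection settings with embedding-specific options."""
    return {
        "vector_store": {
            "provider": "databricks",
            "config": {**base, **embedding_options},
        }
    }


# Self-managed: you supply vectors, so the dimension must match your model.
self_managed = vector_store_config(
    BASE,
    embedding_dimension=768,
    embedding_vector_column="embedding",
)

# Databricks-computed: name the text column and the serving endpoint.
computed = vector_store_config(
    BASE,
    embedding_source_column="text",
    embedding_model_endpoint_name="e5-small-v2",
)
```

Either result can then be passed to `Memory.from_config`, as in the Usage section.
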
### Important Notes

- **Delta Sync Index**: This implementation uses a Delta Sync Index, which syncs automatically with your source Delta table. Direct vector insert, update, and delete operations are not supported with Delta Sync and log a warning.
- **Unity Catalog**: Both the source table and the index must be named in Unity Catalog format (`catalog.schema.table_name`).
- **Endpoint Auto-Creation**: If the specified endpoint doesn't exist, it is created automatically.
- **Index Auto-Creation**: If the specified index doesn't exist, it is created automatically with the provided configuration.
- **Filter Support**: Results can be filtered by metadata fields. The filter syntax differs between `STANDARD` and `STORAGE_OPTIMIZED` endpoints.

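
To make the filter-syntax note concrete, here is a minimal sketch of expressing the same equality filter for each endpoint type. It assumes that `STANDARD` endpoints accept a dictionary filter and `STORAGE_OPTIMIZED` endpoints accept a SQL-like string, which is how the Databricks documentation describes the two; check that documentation for the exact grammar. `equality_filter` is a hypothetical helper, not the provider's own translation code, and it handles only simple `key = value` conditions.

```python
# Hypothetical helper illustrating the two filter shapes; equality only.
def equality_filter(metadata, endpoint_type="STANDARD"):
    """Express key == value conditions in the shape assumed for each endpoint type."""
    if endpoint_type == "STANDARD":
        # Assumed dictionary form: {"column": value, ...}
        return dict(metadata)
    if endpoint_type == "STORAGE_OPTIMIZED":
        # Assumed SQL-like form: "column = 'value' AND other = 3"
        clauses = []
        for key, value in metadata.items():
            if isinstance(value, str):
                # Double any single quotes so the string literal stays well-formed.
                literal = "'" + value.replace("'", "''") + "'"
            else:
                literal = str(value)
            clauses.append(f"{key} = {literal}")
        return " AND ".join(clauses)
    raise ValueError(f"unknown endpoint_type: {endpoint_type!r}")


wanted = {"category": "movies", "user_id": "alice"}
standard_filter = equality_filter(wanted, "STANDARD")
optimized_filter = equality_filter(wanted, "STORAGE_OPTIMIZED")
```
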