---
title: Pinecone
---

## Overview

Install the Pinecone dependencies with the following command:

```bash
pip install --upgrade pinecone-client pinecone-text
```

To use Pinecone as the vector database, set the `PINECONE_API_KEY` environment variable. You can find your API key on the [Pinecone dashboard](https://app.pinecone.io/).

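Before creating the app, it can help to confirm that the key is actually visible to your Python process. The sketch below is illustrative only: `require_pinecone_api_key` is a hypothetical helper and not part of Embedchain.

```python
import os


def require_pinecone_api_key(env=None):
    """Return the Pinecone API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; not part of Embedchain.
    """
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("PINECONE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "PINECONE_API_KEY is not set; copy it from the Pinecone dashboard."
        )
    return key
```

Calling `require_pinecone_api_key()` at startup fails fast with a readable message instead of a less obvious error later during index creation.
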
<CodeGroup>

```python main.py
from embedchain import App

# Load pinecone configuration from yaml file
app = App.from_config(config_path="pod_config.yaml")
# Or
app = App.from_config(config_path="serverless_config.yaml")
```

```yaml pod_config.yaml
vectordb:
  provider: pinecone
  config:
    metric: cosine
    vector_dimension: 1536
    index_name: my-pinecone-index
    pod_config:
      environment: gcp-starter
      metadata_config:
        indexed:
          - "url"
          - "hash"
```

```yaml serverless_config.yaml
vectordb:
  provider: pinecone
  config:
    metric: cosine
    vector_dimension: 1536
    index_name: my-pinecone-index
    serverless_config:
      cloud: aws
      region: us-west-2
```

</CodeGroup>

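The same settings can also be passed as a Python dictionary through `App.from_config(config=...)`, which is the form the hybrid search example on this page uses. The sketch below builds either variant; `build_pinecone_config` is a hypothetical helper, and its values are copied from the YAML files above.

```python
def build_pinecone_config(index_name, serverless=True):
    """Build a vectordb config dict mirroring the YAML examples.

    Hypothetical helper for illustration; not part of Embedchain.
    """
    db_config = {
        "metric": "cosine",
        "vector_dimension": 1536,
        "index_name": index_name,
    }
    if serverless:
        # Mirrors serverless_config.yaml
        db_config["serverless_config"] = {"cloud": "aws", "region": "us-west-2"}
    else:
        # Mirrors pod_config.yaml
        db_config["pod_config"] = {
            "environment": "gcp-starter",
            "metadata_config": {"indexed": ["url", "hash"]},
        }
    return {"vectordb": {"provider": "pinecone", "config": db_config}}


config = build_pinecone_config("my-pinecone-index", serverless=True)
# Pass the result to Embedchain: App.from_config(config=config)
```

Only one of `pod_config` and `serverless_config` is set at a time, matching the two YAML files.
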
<br />
<Note>
For more information about Pinecone index configuration, see the [Pinecone documentation](https://docs.pinecone.io/docs/manage-indexes#create-a-pod-based-index).
The `index_name` parameter is optional. If you omit it, the index name defaults to `{collection_name}-{vector_dimension}`.
</Note>

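As a rough illustration of the fallback naming pattern described in the note, the sketch below formats a name from a collection name and a vector dimension. The function and the sample collection name are hypothetical, and Embedchain may apply further normalization that is not shown here.

```python
def default_index_name(collection_name, vector_dimension):
    """Format a name following the documented `{collection_name}-{vector_dimension}` pattern.

    Hypothetical helper for illustration; not Embedchain's own implementation.
    """
    return f"{collection_name}-{vector_dimension}"


# "my-collection" is a made-up collection name used only for this example.
name = default_index_name("my-collection", 1536)
```
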
## Usage

### Hybrid search

The following example shows how to run hybrid search through Embedchain with Pinecone as the vector database.

```python
import os

from embedchain import App

config = {
    'app': {
        "config": {
            "id": "ec-docs-hybrid-search"
        }
    },
    'vectordb': {
        'provider': 'pinecone',
        'config': {
            'metric': 'dotproduct',
            'vector_dimension': 1536,
            'index_name': 'my-index',
            'serverless_config': {
                'cloud': 'aws',
                'region': 'us-west-2'
            },
            'hybrid_search': True,  # Remember to set this for hybrid search
        }
    }
}

# Initialize app
app = App.from_config(config=config)

# Add documents
app.add("/path/to/file.pdf", data_type="pdf_file", namespace="my-namespace")

# Query
app.query("<YOUR QUESTION HERE>", namespace="my-namespace")

# Chat
app.chat("<YOUR QUESTION HERE>", namespace="my-namespace")
```

Under the hood, Embedchain retrieves the relevant chunks from the documents you added by running a hybrid search on the Pinecone index.
For details on how Pinecone hybrid search works, refer to the [official documentation](https://docs.pinecone.io/docs/hybrid-search).

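To make the idea more concrete: hybrid search scores each chunk using both a dense embedding (semantic similarity) and a sparse vector (keyword match). One common way to balance the two, described in Pinecone's hybrid search guide, is to scale the dense vector by a weight `alpha` and the sparse values by `1 - alpha` before querying. Sparse-dense queries of this kind are the reason the example config uses the `dotproduct` metric. The sketch below illustrates that weighting in plain Python. It is an assumption for illustration and not necessarily what Embedchain does internally; the function name and sample vectors are hypothetical.

```python
def weight_hybrid_query(dense, sparse, alpha):
    """Scale a dense/sparse query pair by a convex weight.

    alpha=1.0 keeps only the dense (semantic) signal;
    alpha=0.0 keeps only the sparse (keyword) signal.
    Hypothetical helper for illustration; not part of Embedchain.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


# Made-up query vectors, far smaller than real embeddings.
dense_query = [0.2, 0.4, 0.6]
sparse_query = {"indices": [10, 45], "values": [0.5, 1.0]}
dense_w, sparse_w = weight_hybrid_query(dense_query, sparse_query, alpha=0.5)
```

With `alpha=0.5`, both signals contribute equally; moving `alpha` toward 1 favors semantic matches, and moving it toward 0 favors exact keyword matches.
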
<Snippet file="missing-vector-db-tip.mdx" />