---
title: Pinecone
---

## Overview

Install the Pinecone-related dependencies with the following command:

```bash
pip install --upgrade pinecone-client pinecone-text
```

To use Pinecone as the vector database, set the `PINECONE_API_KEY` environment variable to your API key, which you can find on the [Pinecone dashboard](https://app.pinecone.io/).

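If you prefer to set the key from Python rather than from your shell, a minimal sketch could look like the following. The value shown is a placeholder, not a real key, and `setdefault` is used so that a key already exported in your shell is not overwritten.

```python
import os

# Placeholder value: replace it with the key from your Pinecone dashboard.
# setdefault keeps any PINECONE_API_KEY that is already set in the environment.
os.environ.setdefault("PINECONE_API_KEY", "<YOUR_PINECONE_API_KEY>")
```
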
<CodeGroup>

```python main.py
from embedchain import App

# Load the Pinecone configuration from a YAML file (pod-based index)
app = App.from_config(config_path="pod_config.yaml")
# Or, for a serverless index
app = App.from_config(config_path="serverless_config.yaml")
```

```yaml pod_config.yaml
vectordb:
  provider: pinecone
  config:
    metric: cosine
    vector_dimension: 1536
    index_name: my-pinecone-index
    pod_config:
      environment: gcp-starter
      metadata_config:
        indexed:
          - "url"
          - "hash"
```

```yaml serverless_config.yaml
vectordb:
  provider: pinecone
  config:
    metric: cosine
    vector_dimension: 1536
    index_name: my-pinecone-index
    serverless_config:
      cloud: aws
      region: us-west-2
```

</CodeGroup>

<br />
<Note>
For more information about Pinecone configuration, see the [Pinecone documentation on managing indexes](https://docs.pinecone.io/docs/manage-indexes#create-a-pod-based-index).
You can optionally set `index_name` in the YAML config to specify the index name. If it is not provided, the index name defaults to `{collection_name}-{vector_dimension}`.
</Note>

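To make the default naming rule concrete, here is a small sketch that applies the `{collection_name}-{vector_dimension}` template literally. The helper `default_index_name` is hypothetical and written only for illustration; it is not an Embedchain function, and Embedchain may normalize the resulting name further.

```python
def default_index_name(collection_name: str, vector_dimension: int) -> str:
    """Hypothetical helper: fill in the documented default template."""
    return f"{collection_name}-{vector_dimension}"


# A collection called "my-collection" with 1536-dimensional vectors
name = default_index_name("my-collection", 1536)
print(name)
```
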
## Usage

### Hybrid search

The following example shows how to run hybrid search through Embedchain with Pinecone as the vector database.

```python
import os

from embedchain import App

# The Pinecone API key must be available in the environment
if "PINECONE_API_KEY" not in os.environ:
    raise RuntimeError("Set the PINECONE_API_KEY environment variable first")

config = {
    "app": {
        "config": {
            "id": "ec-docs-hybrid-search"
        }
    },
    "vectordb": {
        "provider": "pinecone",
        "config": {
            "metric": "dotproduct",
            "vector_dimension": 1536,
            "index_name": "my-index",
            "serverless_config": {
                "cloud": "aws",
                "region": "us-west-2"
            },
            "hybrid_search": True,  # Remember to set this for hybrid search
        }
    }
}

# Initialize app
app = App.from_config(config=config)

# Add documents
app.add("/path/to/file.pdf", data_type="pdf_file", namespace="my-namespace")

# Query
app.query("<YOUR QUESTION HERE>", namespace="my-namespace")

# Chat
app.chat("<YOUR QUESTION HERE>", namespace="my-namespace")
```

Under the hood, Embedchain retrieves the relevant chunks from the documents you added by running a hybrid search on the Pinecone index.
If you have questions about how Pinecone hybrid search works, refer to the [official documentation](https://docs.pinecone.io/docs/hybrid-search).

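As a rough illustration of what a hybrid score means, the sketch below combines a dense (semantic) dot product with a sparse (keyword) dot product using a weight `alpha`. The function names, the dict-based sparse representation, and `alpha` are assumptions made for this example. It does not reproduce the internal implementation of Embedchain or Pinecone. It does hint at why the example above uses the `dotproduct` metric: both parts of the score are plain dot products.

```python
from typing import Dict, List


def dense_dot(a: List[float], b: List[float]) -> float:
    """Dot product of two dense vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("dense vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def sparse_dot(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {index: value}."""
    return sum(value * b[index] for index, value in a.items() if index in b)


def hybrid_score(
    query_dense: List[float],
    query_sparse: Dict[int, float],
    doc_dense: List[float],
    doc_sparse: Dict[int, float],
    alpha: float = 0.5,
) -> float:
    """Weighted sum: alpha favors the dense part, (1 - alpha) the sparse part."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    dense_part = dense_dot(query_dense, doc_dense)
    sparse_part = sparse_dot(query_sparse, doc_sparse)
    return alpha * dense_part + (1.0 - alpha) * sparse_part


# Toy vectors: dense part is 0.5, sparse part is 2.0 (only index 3 overlaps)
score = hybrid_score([1.0, 0.0], {3: 2.0}, [0.5, 0.5], {3: 1.0, 7: 4.0}, alpha=0.5)
print(score)
```
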
<Snippet file="missing-vector-db-tip.mdx" />

