---
title: 🤖 Large language models (LLMs)
---

## Overview

Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.

<CardGroup cols={4}>
  <Card title="OpenAI" href="#openai"></Card>
  <Card title="Google AI" href="#google-ai"></Card>
  <Card title="Azure OpenAI" href="#azure-openai"></Card>
  <Card title="Anthropic" href="#anthropic"></Card>
  <Card title="Cohere" href="#cohere"></Card>
  <Card title="Together" href="#together"></Card>
  <Card title="Ollama" href="#ollama"></Card>
  <Card title="vLLM" href="#vllm"></Card>
  <Card title="Clarifai" href="#clarifai"></Card>
  <Card title="GPT4All" href="#gpt4all"></Card>
  <Card title="JinaChat" href="#jinachat"></Card>
  <Card title="Hugging Face" href="#hugging-face"></Card>
  <Card title="Llama2" href="#llama2"></Card>
  <Card title="Vertex AI" href="#vertex-ai"></Card>
  <Card title="Mistral AI" href="#mistral-ai"></Card>
  <Card title="AWS Bedrock" href="#aws-bedrock"></Card>
  <Card title="Groq" href="#groq"></Card>
  <Card title="NVIDIA AI" href="#nvidia-ai"></Card>
</CardGroup>

## OpenAI

To use OpenAI models, you have to set the `OPENAI_API_KEY` environment variable. You can obtain the OpenAI API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have obtained the key, you can use it like this:

```python
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

app = App()
app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```

To configure the parameters of the LLM, load the app from a [YAML config](https://github.com/embedchain/embedchain/blob/main/configs/chroma.yaml) file.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: 'gpt-4o-mini'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```
</CodeGroup>

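The YAML keys above map one-to-one onto a Python dictionary, which `App.from_config(config=...)` accepts in place of a file path (the dictionary form is used in later sections of this page). The sketch below builds that dictionary and runs a quick sanity check on it. The `validate_llm_config` helper is hypothetical and not part of Embedchain; its bounds assume OpenAI's documented ranges for `temperature` (0 to 2) and `top_p` (0 to 1).

```python
def validate_llm_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    llm = config.get("llm", {})
    params = llm.get("config", {})

    if not llm.get("provider"):
        problems.append("llm.provider is missing")

    # Assumed bounds: OpenAI accepts temperature in [0, 2] and top_p in [0, 1].
    temperature = params.get("temperature", 0.5)
    if not 0 <= temperature <= 2:
        problems.append(f"temperature {temperature} is outside [0, 2]")

    top_p = params.get("top_p", 1)
    if not 0 <= top_p <= 1:
        problems.append(f"top_p {top_p} is outside [0, 1]")

    max_tokens = params.get("max_tokens", 1000)
    if not isinstance(max_tokens, int) or max_tokens <= 0:
        problems.append(f"max_tokens {max_tokens!r} must be a positive integer")

    return problems


# Dictionary equivalent of the config.yaml shown above.
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.5,
            "max_tokens": 1000,
            "top_p": 1,
            "stream": False,
        },
    }
}

print(validate_llm_config(config))
```

If the returned list is empty, the same dictionary can be passed to `App.from_config(config=config)`.
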
### Function Calling
Embedchain supports OpenAI [Function calling](https://platform.openai.com/docs/guides/function-calling) with a single function. It accepts inputs in accordance with the [LangChain interface](https://python.langchain.com/docs/modules/model_io/chat/function_calling#legacy-args-functions-and-function_call).

<Accordion title="Pydantic Model">
```python
from pydantic import BaseModel, Field

class multiply(BaseModel):
    """Multiply two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")
```
</Accordion>

<Accordion title="Python function">
```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers together.

    Args:
        a: First integer
        b: Second integer
    """
    return a * b
```
</Accordion>
<Accordion title="OpenAI tool dictionary">
```python
multiply = {
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "Multiply two integers together.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {
                    "description": "First integer",
                    "type": "integer"
                },
                "b": {
                    "description": "Second integer",
                    "type": "integer"
                }
            },
            "required": [
                "a",
                "b"
            ]
        }
    }
}
```
</Accordion>

With any of the previous inputs, the OpenAI LLM can be queried to provide the appropriate arguments for the function.

```python
import os
from embedchain import App
from embedchain.llm.openai import OpenAILlm

os.environ["OPENAI_API_KEY"] = "sk-xxx"

# `multiply` is any one of the three definitions above
llm = OpenAILlm(tools=multiply)
app = App(llm=llm)

result = app.query("What is the result of 125 multiplied by fifteen?")
```

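The three input forms above describe the same schema. To make that correspondence concrete, here is a minimal sketch that derives an OpenAI-style tool dictionary from a plain Python function using the standard `inspect` module. The `function_to_tool` helper and the `JSON_TYPES` table are hypothetical names introduced for illustration; this is not how Embedchain or LangChain performs the conversion, and it leaves out per-parameter descriptions.

```python
import inspect

# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def function_to_tool(func) -> dict:
    """Build an OpenAI-style tool dictionary from a function's signature."""
    signature = inspect.signature(func)

    # One schema entry per parameter; unknown annotations fall back to "string".
    properties = {
        name: {"type": JSON_TYPES.get(param.annotation, "string")}
        for name, param in signature.parameters.items()
    }
    # Parameters without a default value are treated as required.
    required = [
        name
        for name, param in signature.parameters.items()
        if param.default is inspect.Parameter.empty
    ]
    # Use the first docstring line as the tool description.
    doc = inspect.getdoc(func) or ""
    description = doc.splitlines()[0] if doc else ""

    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def multiply(a: int, b: int) -> int:
    """Multiply two integers together.

    Args:
        a: First integer
        b: Second integer
    """
    return a * b


tool = function_to_tool(multiply)
print(tool["function"]["name"], tool["function"]["parameters"]["required"])
```
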
## Google AI

To use Google AI models, you have to set the `GOOGLE_API_KEY` environment variable. You can obtain the Google API key from [Google MakerSuite](https://makersuite.google.com/app/apikey).

<CodeGroup>
```python main.py
import os
from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("What is the net worth of Elon Musk?")
if app.llm.config.stream: # if stream is enabled, response is a generator
    for chunk in response:
        print(chunk)
else:
    print(response)
```

```yaml config.yaml
llm:
  provider: google
  config:
    model: gemini-pro
    max_tokens: 1000
    temperature: 0.5
    top_p: 1
    stream: false

embedder:
  provider: google
  config:
    model: 'models/embedding-001'
    task_type: "retrieval_document"
    title: "Embeddings for Embedchain"
```
</CodeGroup>

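The example above branches on `app.llm.config.stream` because a streamed response is a generator of chunks rather than a single string. If you would rather not consult the config at each call site, one option is to branch on the response type instead. The sketch below assumes chunks are plain strings; `collect_response` and `fake_stream` are hypothetical names, and `fake_stream` only stands in for a real streamed answer.

```python
from typing import Iterable, Union


def collect_response(response: Union[str, Iterable[str]]) -> str:
    """Return the full answer, whether `response` is a string or a stream of chunks."""
    if isinstance(response, str):
        return response

    parts = []
    for chunk in response:
        print(chunk, end="", flush=True)  # show text as it arrives
        parts.append(chunk)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


def fake_stream():
    """Stand-in for a streamed LLM response (illustration only)."""
    yield from ["Streaming ", "arrives ", "in ", "chunks."]


full_text = collect_response(fake_stream())
```
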
## Azure OpenAI

To use Azure OpenAI models, set the Azure OpenAI environment variables shown in the code block below:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://xxx.openai.azure.com/"
os.environ["AZURE_OPENAI_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "xxx"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: azure_openai
  config:
    model: gpt-4o-mini
    deployment_name: your_llm_deployment_name
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: azure_openai
  config:
    model: text-embedding-ada-002
    deployment_name: your_embedding_model_deployment_name
```
</CodeGroup>

You can find the list of models and deployment names on the [Azure OpenAI Platform](https://oai.azure.com/portal).

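Because this provider depends on four environment variables, a missing one is an easy mistake to make. A minimal sketch of a pre-flight check, using only the variable names from the example above; the `missing_env_vars` helper is hypothetical and not part of Embedchain.

```python
import os

# Variable names taken from the Azure OpenAI example on this page.
REQUIRED_AZURE_VARS = (
    "OPENAI_API_TYPE",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_KEY",
    "OPENAI_API_VERSION",
)


def missing_env_vars(names, environ=None) -> list:
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars(REQUIRED_AZURE_VARS)
if missing:
    print("Set these variables before creating the app:", ", ".join(missing))
```

The same helper works for any provider on this page by passing a different tuple of names.
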
## Anthropic

To use Anthropic's models, set the `ANTHROPIC_API_KEY` environment variable. You can find the key on their [Account Settings Page](https://console.anthropic.com/account/keys).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: anthropic
  config:
    model: 'claude-instant-1'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```

</CodeGroup>

## Cohere

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[cohere]'
```

Set `COHERE_API_KEY` as an environment variable. You can find the key on their [Account settings page](https://dashboard.cohere.com/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["COHERE_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: cohere
  config:
    model: large
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>

## Together

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[together]'
```

Set `TOGETHER_API_KEY` as an environment variable. You can find the key on their [Account settings page](https://api.together.xyz/settings/api-keys).

Once you have the API key, you are all set to use it with Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["TOGETHER_API_KEY"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: together
  config:
    model: togethercomputer/RedPajama-INCITE-7B-Base
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
```

</CodeGroup>

## Ollama

Set up Ollama by following the instructions at https://github.com/jmorganca/ollama

<CodeGroup>

```python main.py
import os
os.environ["OLLAMA_HOST"] = "http://127.0.0.1:11434"
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
    base_url: 'http://localhost:11434'
embedder:
  provider: ollama
  config:
    model: znbang/bge:small-en-v1.5-q8_0
    base_url: http://localhost:11434

```

</CodeGroup>

## vLLM

Set up vLLM by following the instructions in [their docs](https://docs.vllm.ai/en/latest/getting_started/installation.html).

<CodeGroup>

```python main.py
import os
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vllm
  config:
    model: 'meta-llama/Llama-2-70b-hf'
    temperature: 0.5
    top_p: 1
    top_k: 10
    stream: true
    trust_remote_code: true
```

</CodeGroup>

## Clarifai

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[clarifai]'
```

Set `CLARIFAI_PAT` as an environment variable. You can find the token on the [security page](https://clarifai.com/settings/security). Optionally, you can also pass the PAT as a parameter to the LLM/Embedder class.

Now you are all set to explore Embedchain.

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["CLARIFAI_PAT"] = "XXX"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")

# Now let's add some data.
app.add("https://www.forbes.com/profile/elon-musk")

# Query the app
response = app.query("what college degrees does elon musk have?")
```
Head to the [Clarifai Platform](https://clarifai.com/explore/models?page=1&perPage=24&filterData=%5B%7B%22field%22%3A%22use_cases%22%2C%22value%22%3A%5B%22llm%22%5D%7D%5D) to browse state-of-the-art LLMs for your use case.
To pass model inference parameters, use the `model_kwargs` argument in the config file. You can also use the `api_key` argument to pass `CLARIFAI_PAT` in the config.

```yaml config.yaml
llm:
  provider: clarifai
  config:
    model: "https://clarifai.com/mistralai/completion/models/mistral-7B-Instruct"
    model_kwargs:
      temperature: 0.5
      max_tokens: 1000
embedder:
  provider: clarifai
  config:
    model: "https://clarifai.com/clarifai/main/models/BAAI-bge-base-en-v15"
```
</CodeGroup>

## GPT4All

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[opensource]'
```

GPT4All is a free-to-use, locally running, privacy-aware chatbot. No GPU or internet required. You can use this with Embedchain using the following code:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: gpt4all
  config:
    model: 'orca-mini-3b-gguf2-q4_0.gguf'
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
```
</CodeGroup>

## JinaChat

First, set the `JINACHAT_API_KEY` environment variable. You can obtain the key from [their platform](https://chat.jina.ai/api).

Once you have the key, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["JINACHAT_API_KEY"] = "xxx"
# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: jina
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false
```
</CodeGroup>

## Hugging Face

Install related dependencies using the following command:

```bash
pip install --upgrade 'embedchain[huggingface-hub]'
```

First, set the `HUGGINGFACE_ACCESS_TOKEN` environment variable. You can obtain the token from [their platform](https://huggingface.co/settings/tokens).

You can load LLMs from Hugging Face in three ways:

- [Hugging Face Hub](#hugging-face-hub)
- [Hugging Face Local Pipelines](#hugging-face-local-pipelines)
- [Hugging Face Inference Endpoint](#hugging-face-inference-endpoint)

### Hugging Face Hub

To load the model from Hugging Face Hub, use the following code:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "xxx"

config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "model": "bigscience/bloom-1b7",
            "top_p": 0.5,
            "max_length": 200,
            "temperature": 0.1,
        },
    },
}

app = App.from_config(config=config)
```
</CodeGroup>

### Hugging Face Local Pipelines

To load a locally downloaded Hugging Face model, use the following code:

<CodeGroup>
```python main.py
from embedchain import App

config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "model": "Trendyol/Trendyol-LLM-7b-chat-v0.1",
            "local": True,  # Necessary if you want to run model locally
            "top_p": 0.5,
            "max_tokens": 1000,
            "temperature": 0.1,
        },
    }
}
app = App.from_config(config=config)
```
</CodeGroup>

### Hugging Face Inference Endpoint

You can also use [Hugging Face Inference Endpoints](https://huggingface.co/docs/inference-endpoints/index#-inference-endpoints) to access custom endpoints. First, set the `HUGGINGFACE_ACCESS_TOKEN` as above.

Then, load the app using a config dictionary:

<CodeGroup>

```python main.py
from embedchain import App

config = {
    "app": {"config": {"id": "my-app"}},
    "llm": {
        "provider": "huggingface",
        "config": {
            "endpoint": "https://api-inference.huggingface.co/models/gpt2",
            "model_params": {"temperature": 0.1, "max_new_tokens": 100}
        },
    },
}
app = App.from_config(config=config)

```
</CodeGroup>

Currently, only the `text-generation` and `text2text-generation` tasks are supported [[ref](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html?highlight=huggingfaceendpoint#)].

See LangChain's [Hugging Face endpoint](https://python.langchain.com/docs/integrations/chat/huggingface#huggingfaceendpoint) documentation for more information.

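The three loading modes differ only in a few keys of the `llm.config` block: the Hub takes a `model`, local pipelines add `local: True`, and inference endpoints take an `endpoint` plus `model_params`. A minimal sketch that makes those differences explicit, using only the keys shown in the examples above. The `huggingface_llm_config` helper is hypothetical and not part of Embedchain, and it leaves out the `app` block on the assumption that it is optional, as the Groq example on this page suggests.

```python
def huggingface_llm_config(mode: str, target: str, **params) -> dict:
    """Build a config dictionary for one of the three Hugging Face loading modes.

    mode:   "hub", "local", or "endpoint"
    target: a model id for "hub" and "local", or an endpoint URL for "endpoint"
    params: generation parameters such as temperature or top_p
    """
    if mode == "hub":
        llm_config = {"model": target, **params}
    elif mode == "local":
        # `local: True` asks for the model to be run on this machine.
        llm_config = {"model": target, "local": True, **params}
    elif mode == "endpoint":
        # Endpoint parameters are nested under `model_params`.
        llm_config = {"endpoint": target, "model_params": params}
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    return {"llm": {"provider": "huggingface", "config": llm_config}}


hub_config = huggingface_llm_config("hub", "bigscience/bloom-1b7", top_p=0.5, temperature=0.1)
endpoint_config = huggingface_llm_config(
    "endpoint",
    "https://api-inference.huggingface.co/models/gpt2",
    temperature=0.1,
    max_new_tokens=100,
)
print(hub_config["llm"]["config"])
print(endpoint_config["llm"]["config"])
```
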
## Llama2

Llama2 is integrated through [Replicate](https://replicate.com/). Set the `REPLICATE_API_TOKEN` environment variable. You can obtain the token from [their platform](https://replicate.com/account/api-tokens).

Once you have the token, load the app using the config yaml file:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["REPLICATE_API_TOKEN"] = "xxx"

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: llama2
  config:
    model: 'a16z-infra/llama13b-v2-chat:df7690f1994d94e96ad9d568eac121aecf50684a0b0963b25a41cc40061269e5'
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false
```
</CodeGroup>

## Vertex AI

Set up Google Cloud Platform application credentials by following the instructions on [GCP](https://cloud.google.com/docs/authentication/external/set-up-adc). Once setup is done, use the following code to create an app using VertexAI as provider:

<CodeGroup>

```python main.py
from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: vertexai
  config:
    model: 'chat-bison'
    temperature: 0.5
    top_p: 0.5
```
</CodeGroup>

## Mistral AI

Obtain the Mistral AI API key from their [console](https://console.mistral.ai/).

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["MISTRAL_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("what is the net worth of Elon Musk?")
# As of January 16, 2024, Elon Musk's net worth is $225.4 billion.

response = app.chat("which companies does elon own?")
# Elon Musk owns Tesla, SpaceX, Boring Company, Twitter, and X.

response = app.chat("what question did I ask you already?")
# You have asked me several times already which companies Elon Musk owns, specifically Tesla, SpaceX, Boring Company, Twitter, and X.
```

```yaml config.yaml
llm:
  provider: mistralai
  config:
    model: mistral-tiny
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
embedder:
  provider: mistralai
  config:
    model: mistral-embed
```
</CodeGroup>

## AWS Bedrock

### Setup
- Before using the AWS Bedrock LLM, make sure you have the appropriate model access from the [Bedrock Console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess).
- You will also need to authenticate the `boto3` client using one of the methods in the [AWS documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials).
- You can optionally export an `AWS_REGION`.


### Usage

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["AWS_REGION"] = "us-west-2"

app = App.from_config(config_path="config.yaml")
```

```yaml config.yaml
llm:
  provider: aws_bedrock
  config:
    model: amazon.titan-text-express-v1
    # check notes below for model_kwargs
    model_kwargs:
      temperature: 0.5
      topP: 1
      maxTokenCount: 1000
```
</CodeGroup>

<br />
<Note>
  The model arguments differ for each provider. Please refer to the [AWS Bedrock Documentation](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/providers) to find the appropriate arguments for your model.
</Note>

<br />

## Groq

[Groq](https://groq.com/) is the creator of the world's first Language Processing Unit (LPU), providing exceptional speed performance for AI workloads running on their LPU Inference Engine.


### Usage

To use LLMs from Groq, go to their [platform](https://console.groq.com/keys) and get an API key.

Set the API key as the `GROQ_API_KEY` environment variable, or pass it in your app configuration as shown in the example below.

<CodeGroup>

```python main.py
import os
from embedchain import App

# Set your API key here or pass as the environment variable
groq_api_key = "gsk_xxxx"

config = {
    "llm": {
        "provider": "groq",
        "config": {
            "model": "mixtral-8x7b-32768",
            "api_key": groq_api_key,
            "stream": True
        }
    }
}

app = App.from_config(config=config)
# Add your data source here
app.add("https://docs.embedchain.ai/sitemap.xml", data_type="sitemap")
app.query("Write a poem about Embedchain")

# In the realm of data, vast and wide,
# Embedchain stands with knowledge as its guide.
# A platform open, for all to try,
# Building bots that can truly fly.

# With REST API, data in reach,
# Deployment a breeze, as easy as a speech.
# Updating data sources, anytime, anyday,
# Embedchain's power, never sway.

# A knowledge base, an assistant so grand,
# Connecting to platforms, near and far.
# Discord, WhatsApp, Slack, and more,
# Embedchain's potential, never a bore.
```
</CodeGroup>

## NVIDIA AI

[NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) let you quickly use NVIDIA's AI models, such as Mixtral 8x7B and Llama 2, through NVIDIA's API. These models are available in the [NVIDIA NGC catalog](https://catalog.ngc.nvidia.com/ai-foundation-models), fully optimized and ready to use on NVIDIA's AI platform. They are designed for high speed and easy customization, ensuring smooth performance on any accelerated setup.


### Usage

To use LLMs from NVIDIA AI, create an account on the [NVIDIA NGC Service](https://catalog.ngc.nvidia.com/).

Generate an API key from their dashboard. Set the API key as the `NVIDIA_API_KEY` environment variable. Note that the `NVIDIA_API_KEY` will start with `nvapi-`.

Below is an example of how to use an LLM and an embedding model from NVIDIA AI:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ['NVIDIA_API_KEY'] = 'nvapi-xxxx'

config = {
    "app": {
        "config": {
            "id": "my-app",
        },
    },
    "llm": {
        "provider": "nvidia",
        "config": {
            "model": "nemotron_steerlm_8b",
        },
    },
    "embedder": {
        "provider": "nvidia",
        "config": {
            "model": "nvolveqa_40k",
            "vector_dimension": 1024,
        },
    },
}

app = App.from_config(config=config)

app.add("https://www.forbes.com/profile/elon-musk")
answer = app.query("What is the net worth of Elon Musk today?")
# Answer: The net worth of Elon Musk is subject to fluctuations based on the market value of his holdings in various companies.
# As of March 1, 2024, his net worth is estimated to be approximately $210 billion. However, this figure can change rapidly due to stock market fluctuations and other factors.
# Additionally, his net worth may include other assets such as real estate and art, which are not reflected in his stock portfolio.
```
</CodeGroup>

## Token Usage

To get the token usage and cost of a query, set `token_usage` to `true` in the config file. The response then includes the token details: `prompt_tokens`, `completion_tokens`, `total_tokens`, `total_cost`, `cost_currency`.
The following paid LLMs support token usage:
- OpenAI
- Vertex AI
- Anthropic
- Cohere
- Together
- Groq
- Mistral AI
- NVIDIA AI

Here is an example of how to use token usage:
<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ["OPENAI_API_KEY"] = "xxx"

app = App.from_config(config_path="config.yaml")

app.add("https://www.forbes.com/profile/elon-musk")

response = app.query("what is the net worth of Elon Musk?")
# {'answer': "Elon Musk's net worth is $209.9 billion as of 6/9/24.",
#   'usage': {'prompt_tokens': 1228,
#   'completion_tokens': 21,
#   'total_tokens': 1249,
#   'total_cost': 0.001884,
#   'cost_currency': 'USD'}
# }


response = app.chat("Which companies did Elon Musk found?")
# {'answer': 'Elon Musk founded six companies, including Tesla, which is an electric car maker, SpaceX, a rocket producer, and the Boring Company, a tunneling startup.',
#   'usage': {'prompt_tokens': 1616,
#   'completion_tokens': 34,
#   'total_tokens': 1650,
#   'total_cost': 0.002492,
#   'cost_currency': 'USD'}
# }
```

```yaml config.yaml
llm:
  provider: openai
  config:
    model: gpt-4o-mini
    temperature: 0.5
    max_tokens: 1000
    token_usage: true
```
</CodeGroup>

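Each response carries its own `usage` block, so tracking spend over a session means summing those blocks yourself. A minimal sketch, assuming responses have the dictionary shape shown in the comments above; the `add_usage` helper is hypothetical and not part of Embedchain, and the sample data reuses the figures from that example.

```python
def add_usage(total: dict, response: dict) -> dict:
    """Add one response's `usage` block to a running total and return the total."""
    usage = response["usage"]
    for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
        total[key] = total.get(key, 0) + usage[key]
    # Round to six decimals to keep float noise out of the running cost.
    total["total_cost"] = round(total.get("total_cost", 0.0) + usage["total_cost"], 6)
    total["cost_currency"] = usage["cost_currency"]
    return total


# Usage blocks copied from the two sample responses above; answers are shortened.
responses = [
    {
        "answer": "first answer",
        "usage": {
            "prompt_tokens": 1228,
            "completion_tokens": 21,
            "total_tokens": 1249,
            "total_cost": 0.001884,
            "cost_currency": "USD",
        },
    },
    {
        "answer": "second answer",
        "usage": {
            "prompt_tokens": 1616,
            "completion_tokens": 34,
            "total_tokens": 1650,
            "total_cost": 0.002492,
            "cost_currency": "USD",
        },
    },
]

session = {}
for response in responses:
    add_usage(session, response)

print(session["total_tokens"], session["total_cost"], session["cost_currency"])
```
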
If a model is missing and you'd like to add it to `model_prices_and_context_window.json`, please feel free to open a PR.

<br />

<Snippet file="missing-llm-tip.mdx" />