---
title: 📝 Github
---

1. Set up the GitHub loader by configuring it with a personal access token (PAT) for your GitHub account. See [GitHub's documentation](https://docs.github.com/en/enterprise-server@3.6/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#creating-a-personal-access-token) to learn how to create a PAT.
```Python
from embedchain.loaders.github import GithubLoader

# Replace the placeholder with your own personal access token
loader = GithubLoader(
    config={
        "token": "ghp_xxxx"
    }
)
```
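
A token written directly into source code is easy to leak through version control. The sketch below reads it from the environment instead. It assumes the token is exported in a variable named `GITHUB_TOKEN`; both that name and the helper `github_loader_config` are hypothetical and not part of Embedchain. Only the `token` config key comes from the example above.

```python
import os


def github_loader_config(env_var: str = "GITHUB_TOKEN") -> dict:
    """Build the loader config dict from an environment variable.

    Hypothetical helper: the variable name is an assumption made for
    illustration; only the "token" key comes from the loader example.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to a GitHub personal access token")
    return {"token": token}


# Usage (requires embedchain to be installed):
# from embedchain.loaders.github import GithubLoader
# loader = GithubLoader(config=github_loader_config())
```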

2. Once the loader is set up, create an app and load data with it:
```Python
import os
from embedchain.pipeline import Pipeline as App

os.environ["OPENAI_API_KEY"] = "sk-xxxx"

app = App()

# `loader` is the GithubLoader created in step 1
app.add("repo:embedchain/embedchain type:repo", data_type="github", loader=loader)

response = app.query("What is Embedchain?")
# Answer: Embedchain is a Data Platform for Large Language Models (LLMs). It allows users to seamlessly load, index, retrieve, and sync unstructured data in order to build dynamic, LLM-powered applications. There is also a JavaScript implementation called embedchain-js available on GitHub.
```
The app's `add` function accepts any valid GitHub query with qualifiers. It supports loading only GitHub code, repositories, issues, and pull requests.
<Note>
You must provide the `type:` and `repo:` qualifiers in the query. The `type:` qualifier can be a combination of `code`, `repo`, `pr`, `issue`, `branch`, and `file`. The `repo:` qualifier must be a valid GitHub repository name.
</Note>

<Card title="Valid queries" icon="lightbulb" iconType="duotone" color="#ca8b04">
- `repo:embedchain/embedchain type:repo` - to load the repository
- `repo:embedchain/embedchain type:branch name:feature_test` - to load a specific branch of the repository
- `repo:embedchain/embedchain type:file path:README.md` - to load a specific file of the repository
- `repo:embedchain/embedchain type:issue,pr` - to load the issues and pull requests of the repository
- `repo:embedchain/embedchain type:issue state:closed` - to load the closed issues of the repository
</Card>
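
The query strings above can also be assembled programmatically, which helps catch a missing or misspelled qualifier before any request is made. This is a minimal sketch: `build_github_query` and `VALID_TYPES` are hypothetical names, the `owner/name` check is an assumption drawn from the examples, and the loader's own validation may differ.

```python
# Type values listed in the note above
VALID_TYPES = {"code", "repo", "pr", "issue", "branch", "file"}


def build_github_query(repo: str, types: list[str], **qualifiers: str) -> str:
    """Compose a query string with the required repo: and type: qualifiers."""
    owner, sep, name = repo.partition("/")
    if not (owner and sep and name):
        raise ValueError(f"repo must look like 'owner/name', got {repo!r}")
    if not types:
        raise ValueError("at least one type is required")
    unknown = [t for t in types if t not in VALID_TYPES]
    if unknown:
        raise ValueError(f"unsupported type(s): {unknown}")
    parts = [f"repo:{repo}", f"type:{','.join(types)}"]
    # Extra qualifiers such as path=, name= or state= are appended in order
    parts += [f"{key}:{value}" for key, value in qualifiers.items()]
    return " ".join(parts)


print(build_github_query("embedchain/embedchain", ["issue", "pr"]))
# repo:embedchain/embedchain type:issue,pr
print(build_github_query("embedchain/embedchain", ["file"], path="README.md"))
# repo:embedchain/embedchain type:file path:README.md
```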

3. A chunker is created automatically for your GitHub data. If you want to provide your own chunker instead, you can do so as follows:
```Python
from embedchain.chunkers.common_chunker import CommonChunker
from embedchain.config.add_config import ChunkerConfig

github_chunker_config = ChunkerConfig(chunk_size=2000, chunk_overlap=0, length_function=len)
github_chunker = CommonChunker(config=github_chunker_config)

# `app` and `loader` come from the previous steps; any valid query works here
load_query = "repo:embedchain/embedchain type:repo"
app.add(load_query, data_type="github", loader=loader, chunker=github_chunker)
```
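
To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified fixed-window splitter. It is an illustration only and not the algorithm `CommonChunker` uses; `split_fixed` is a hypothetical function. With `length_function=len`, sizes are measured in characters, which is what this sketch assumes.

```python
def split_fixed(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters. Simplified
    illustration only; real chunkers usually prefer natural boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


text = "x" * 4500
print([len(c) for c in split_fixed(text, chunk_size=2000)])
# [2000, 2000, 500]
print([len(c) for c in split_fixed(text, chunk_size=2000, chunk_overlap=500)])
# [2000, 2000, 1500]
```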