repositories
loading repo index
repositories
loading repo index
repository
loading code, commits, and activity
Mirrored from https://github.com/benmaster82/Kwipu
stars
latest
clone command
git clone gitlawb://did:key:z6MkqRzA...RfoM/benmaster82-Kwi...git clone gitlawb://did:key:z6MkqRzA.../benmaster82-Kwi...908f0e4eAdd MCP badge18d ago





A local Graph RAG system that turns your markdown notes into a queryable knowledge graph. Ask questions in natural language and get answers that connect information across multiple files.
Built for Obsidian vaults but works with any folder of markdown files.
--llm-model, --embed-model[[wikilinks]] and YAML frontmatter into structured graph triples - LLM synonym expansion (optional, disable with --fast)
- Vector similarity search
- BM25 keyword scoring
- Temporal/metadata matching
--llm-model and --embed-modelllama3.1:8b, qwen2.5:7b, mistral:7b)nomic-embed-text)# Install dependencies pip install -r requirements.txt # Pull models in Ollama ollama pull llama3.1:8b ollama pull nomic-embed-text
Kwipu can run as an MCP server, allowing AI agents to query your knowledge graph directly. All processing happens locally via Ollama - the agent only sends the question and receives the answer.
Add to your claude_desktop_config.json (or equivalent MCP config):
{
"mcpServers": {
"kwipu": {
"command": "C:/path/to/python.exe",
"args": ["C:/path/to/kwipu_mcp_server.py"]
}
}
}
Replace paths with your actual Python and project locations. Requires Ollama running with the configured model.
# Full mode (default, all retrievers) python geode_graph.py # Fast mode (skips LLM synonym retriever, faster queries) python geode_graph.py --fast # Override models from CLI (no need to edit the file) python geode_graph.py --llm-model qwen2.5:7b --embed-model nomic-embed-text # Build with cloud model, then query with local model python geode_graph.py --llm-model gpt-oss:20b-cloud # After build completes, restart with: python geode_graph.py --llm-model qwen2.5:3b --fast
Place your markdown files in ./knowledge_base/ (or change KNOWLEDGE_DIR in the config). The system builds the graph on first run and watches for changes.
Your Notes (.md)
│
▼
┌─────────────────────┐
│ Pre-processing │ ← Extracts [[wikilinks]], YAML frontmatter
│ (lang_config.py) │ ← Infers relations from context (multilingual)
└─────────┬───────────┘
│
▼
┌─────────────────────┐
│ LLM Extraction │ ← Extracts additional entity-relation triples
│ (SimpleLLMPath) │
└─────────┬───────────┘
│
▼
┌─────────────────────┐
│ Property Graph │ ← Merges structural + LLM triples
│ Index │ ← Persisted to disk (storage_graph/)
└─────────┬───────────┘
│
▼
┌─────────────────────┐
│ Hybrid Retrieval │ ← Synonym + Vector + BM25 + Temporal
└─────────┬───────────┘
│
▼
┌─────────────────────┐
│ LLM Response │ ← Generates answer from retrieved context
└─────────────────────┘
├── geode_graph.py # Main application (terminal interface) ├── kwipu_mcp_server.py # MCP server for AI agent integration ├── lang_config.py # Multilingual configuration (stopwords, patterns, relations) ├── requirements.txt # Python dependencies ├── knowledge_base/ # Your notes go here │ └── examples/ # Example notes to get started └── storage_graph/ # Generated graph index (auto-created, gitignored)
Change KNOWLEDGE_DIR to your vault path:
KNOWLEDGE_DIR = "C:/Users/YourName/Documents/MyVault"
The system reads files without modifying them. It ignores .obsidian/ configuration files automatically.
| Model | RAM (Q4) | Quality | Speed per query (CPU) | Speed per query (GPU) |
|-------|----------|---------|----------------------|----------------------|
| 1B | ~2 GB | Basic | ~8s | ~2s |
| 3B | ~3 GB | Good | ~30-60s | ~5-8s |
| 7-8B | ~5-6 GB | Great | ~2-5 min | ~15-25s |
| 20B | ~12 GB | Best | Not practical | ~15s |
For serious use, 7B+ with a GPU is the sweet spot. The 3B is a good compromise for CPU-only setups.
First-time graph construction requires an LLM call for each document chunk. Subsequent runs load the graph from disk instantly. Times can vary ±2x depending on note length and model.
| Notes | GPU (7B) | CPU (7B) | CPU (3B) |
|-------|----------|----------|----------|
| 5 | ~2 min | ~8 min | ~4 min |
| 20 | ~8 min | ~30 min | ~15 min |
| 50 | ~20 min | ~1.5 hrs | ~40 min |
| 100 | ~40 min | ~3 hrs | ~1.5 hrs |
| 500+ | ~3 hrs | Not recommended | Not recommended |
Adding a single new file is incremental (~20-60s) and does not rebuild the full graph. Modifying an existing file also uses incremental update (delete + re-insert). Only file deletion triggers a full rebuild.
| Component | RAM | Notes |
|-----------|-----|-------|
| Ollama (LLM) | 2-14 GB | Depends on model size and quantization |
| Ollama (embeddings) | ~300 MB | nomic-embed-text |
| Kwipu (indexing) | 0.5-4 GB | Depends on number of notes |
| Kwipu (queries) | 200-500 MB | After graph is built |
| Total (7B Q4) | ~8-12 GB | Recommended minimum: 16 GB system RAM |
If your hardware is limited, you can use a powerful cloud model via Ollama to build the graph once, then switch to a smaller local model for daily queries. The graph is persisted to disk, so you only need the large model during construction.
# Step 1: Build the graph with a cloud model (one-time, high quality extraction) python geode_graph.py --llm-model gpt-oss:20b-cloud # Wait for "Graph built and saved successfully", then exit. # Step 2: Switch to a small local model for queries (fast, low resource) python geode_graph.py --llm-model qwen2.5:3b --fast
This gives you the best of both worlds: a high-quality graph built by a 20B+ model, with fast and lightweight queries on a 3B model. The graph structure (entities, relations, triples) doesn't change when you switch models - only the response generation uses the smaller model.
Note: If you change the embedding model (--embed-model), you must deletestorage_graph/and rebuild. Kwipu will detect the mismatch and warn you.
Contributions are welcome. Here's how to get started:
# Clone and setup git clone https://github.com/benmaster82/Kwipu.git cd Kwipu pip install -r requirements.txt
Areas where help is needed:
Guidelines:
Open an issue first if you want to discuss an approach before coding.
MIT