my-project-public

repository

loading code, commits, and activity

repositories

loading repo index

#1	# Clawd Memory Configuration
#2
#3	Clawd Memory works with zero configuration for local use. The Clawd layer adds `CLAWD_BRAIN_VAULT` for the markdown vault and reuses the Mnemosyne engine variables for SQLite storage, BEAM limits, vector compression, and LLM consolidation.
#4
#5	## Clawd Settings
#6
#7	\| Variable \| Default \| Description \|
#8	\|---\|---\|---\|
#9	\| `CLAWD_BRAIN_VAULT` \| `MemeBRain/vault` \| Markdown vault used by `ClawdBrain` and `clawd-brain` \|
#10	\| `MNEMOSYNE_DATA_DIR` \| `~/.hermes/mnemosyne/data` \| SQLite engine data directory \|
#11
#12	Example:
#13
#14	```bash
#15	export CLAWD_BRAIN_VAULT="/Users/8bit/bots/Cladwbot-solana/solana-clawd/MemeBRain/vault"
#16	export MNEMOSYNE_DATA_DIR="$HOME/.hermes/mnemosyne/data"
#17	python3 -m mnemosyne.clawd_brain init
#18	```
#19
#20	The sections below document the engine variables. They keep the `MNEMOSYNE_` prefix for compatibility.
#21
#22	## Data Directory
#23
#24	```bash
#25	MNEMOSYNE_DATA_DIR=~/.hermes/mnemosyne/data
#26	```
#27
#28	Default: `~/.hermes/mnemosyne/data`
#29
#30	The SQLite database file (`mnemosyne.db`) is created here on first use. The directory is created automatically.
#31
#32	This path defaults to `~/.hermes/` because Hermes persists that directory across sessions, including on ephemeral VMs (Fly.io, etc.).
#33
#34	## Memory Tiers
#35
#36	### Working Memory
#37
#38	\| Variable \| Default \| Description \|
#39	\|---\|---\|---\|
#40	\| `MNEMOSYNE_WM_MAX_ITEMS` \| `10000` \| Maximum items in working memory \|
#41	\| `MNEMOSYNE_WM_TTL_HOURS` \| `24` \| Time-to-live for working memory entries (hours) \|
#42
#43	### Episodic Memory
#44
#45	\| Variable \| Default \| Description \|
#46	\|---\|---\|---\|
#47	\| `MNEMOSYNE_EP_LIMIT` \| `50000` \| Maximum episodic memory entries \|
#48	\| `MNEMOSYNE_SLEEP_BATCH` \| `5000` \| Max working memories to fetch per consolidation cycle \|
#49
#50	### Scratchpad
#51
#52	\| Variable \| Default \| Description \|
#53	\|---\|---\|---\|
#54	\| `MNEMOSYNE_SP_MAX` \| `1000` \| Maximum scratchpad entries \|
#55
#56	### Recency
#57
#58	\| Variable \| Default \| Description \|
#59	\|---\|---\|---\|
#60	\| `MNEMOSYNE_RECENCY_HALFLIFE` \| `168` \| Recency decay halflife in hours (default: 1 week) \|
#61
#62	Affects how recent memories are scored relative to older ones during recall.
#63
#64	## Vector Compression
#65
#66	```bash
#67	MNEMOSYNE_VEC_TYPE=int8
#68	```
#69
#70	\| Value \| Size per vector \| Description \|
#71	\|---\|---\|---\|
#72	\| `float32` \| 1,536 bytes \| Full precision. Largest, most accurate. \|
#73	\| `int8` \| 384 bytes \| Default. Good balance of size vs. accuracy. \|
#74	\| `bit` \| 48 bytes \| 32× smaller than float32. Fastest, lowest precision. \|
#75
#76	All values use 384-dimensional vectors (bge-small-en-v1.5 embedding model).
#77
#78	## LLM Consolidation
#79
#80	### Local LLM (ctransformers / GGUF)
#81
#82	\| Variable \| Default \| Description \|
#83	\|---\|---\|---\|
#84	\| `MNEMOSYNE_LLM_ENABLED` \| `true` \| Enable LLM summarization during sleep cycle \|
#85	\| `MNEMOSYNE_LLM_N_CTX` \| `2048` \| Context window size for the local model \|
#86	\| `MNEMOSYNE_LLM_MAX_TOKENS` \| `256` \| Maximum output tokens per summary \|
#87	\| `MNEMOSYNE_LLM_N_THREADS` \| `4` \| CPU threads for local inference \|
#88	\| `MNEMOSYNE_LLM_REPO` \| `TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF` \| HuggingFace repo for GGUF model \|
#89	\| `MNEMOSYNE_LLM_FILE` \| `tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf` \| GGUF filename \|
#90
#91	### Remote LLM (OpenAI-compatible)
#92
#93	Use a remote model instead of local TinyLlama:
#94
#95	\| Variable \| Default \| Description \|
#96	\|---\|---\|---\|
#97	\| `MNEMOSYNE_LLM_BASE_URL` \| (none) \| OpenAI-compatible API base URL (e.g. `http://localhost:8080/v1`) \|
#98	\| `MNEMOSYNE_LLM_API_KEY` \| (none) \| API key for authenticated endpoints \|
#99	\| `MNEMOSYNE_LLM_MODEL` \| (none) \| Model identifier sent in requests \|
#100
#101	When `MNEMOSYNE_LLM_BASE_URL` is set, Mnemosyne uses the remote endpoint for consolidation. Falls back to local ctransformers if the remote is unreachable, then to AAAK encoding.
#102
#103	Works with: llama.cpp server, vLLM, Ollama, LM Studio, or any OpenAI-compatible API.
#104
#105	### Host LLM Adapter (Hermes / agent integration)
#106
#107	Route consolidation and fact extraction through a host-provided LLM (e.g., Hermes' authenticated `agent.auxiliary_client.call_llm`). Useful for OAuth-backed providers like `openai-codex` that don't fit the URL+API-key remote shape.
#108
#109	\| Variable \| Default \| Description \|
#110	\|---\|---\|---\|
#111	\| `MNEMOSYNE_HOST_LLM_ENABLED` \| `false` \| Opt in to host-adapter routing \|
#112	\| `MNEMOSYNE_HOST_LLM_PROVIDER` \| (none) \| Optional provider override, e.g. `openai-codex` \|
#113	\| `MNEMOSYNE_HOST_LLM_MODEL` \| (none) \| Optional model override, e.g. `gpt-5.1-mini` \|
#114	\| `MNEMOSYNE_HOST_LLM_N_CTX` \| `32000` \| Prompt-budget when host is the chosen path (TinyLlama-calibrated `LLM_N_CTX=2048` is too small for Codex/GPT-class) \|
#115
#116	When the host call fails, the adapter falls back to the local GGUF model rather than the remote URL. See [hermes-llm-integration.md](hermes-llm-integration.md) for the full behavior model and session-shutdown semantics.
#117
#118	### Fallback Chain
#119
#120	```
#121	0. Host LLM adapter (if MNEMOSYNE_HOST_LLM_ENABLED=true AND a backend is registered)
#122	↓ (on failure: skip remote, go to local)
#123	1. Remote LLM (if MNEMOSYNE_LLM_BASE_URL is set AND host is not enabled)
#124	↓ (on failure)
#125	2. Local LLM (ctransformers + TinyLlama GGUF)
#126	↓ (on failure or not installed)
#127	3. AAAK encoding (keyword-based, no LLM required)
#128	```
#129
#130	## Config File (config.yaml)
#131
#132	In addition to environment variables, Mnemosyne supports configuration via a `config.yaml` file. This is the recommended approach when running Mnemosyne as a Hermes plugin, as it allows configuring memory behavior in the same file as other Hermes settings.
#133
#134	### memory.mnemosyne
#135
#136	Place this section in your `config.yaml` under the top-level `memory` key:
#137
#138	```yaml
#139	memory:
#140	mnemosyne:
#141	# Enable automatic memory consolidation on session start/end
#142	auto_sleep: true
#143
#144	# Minimum number of working memories required before auto-sleep triggers.
#145	# Prevents consolidation on trivial sessions. Default: 20
#146	sleep_threshold: 20
#147
#148	# Regex patterns for content that should NOT be stored in memory.
#149	# Each pattern is matched against the content string using Python's re.search().
#150	# Useful for filtering out technical noise, stack traces, boilerplate, etc.
#151	ignore_patterns:
#152	- "^pip install"
#153	- "^npm install"
#154	- "^sudo "
#155	- "^Traceback \$most recent call last\$"
#156	```
#157
#158	### auto_sleep
#159
#160	Type: `bool` \| Default: `true`
#161
#162	When `true`, Mnemosyne automatically runs the sleep consolidation cycle (`consolidate_to_episodic()`) on session start and end. This offloads working memories into the episodic tier for long-term storage. Set to `false` if you only want to trigger sleep manually via the `mnemosyne_sleep` tool.
#163
#164	### sleep_threshold
#165
#166	Type: `int` \| Default: `20`
#167
#168	The minimum number of working memory entries required before auto-sleep triggers. This prevents consolidation from running on sessions that barely generated any memories. If the working memory count is below the threshold, the sleep cycle is skipped.
#169
#170	### ignore_patterns
#171
#172	Type: `list[str]` \| Default: `[]`
#173
#174	A list of regex patterns (Python `re` syntax) that filter content before it enters memory storage. If any pattern matches `re.search(pattern, content)`, the content is silently skipped — it will not be stored in working memory and will not appear in recalls.
#175
#176	This is useful for excluding:
#177
#178	- Shell commands (`^pip install`, `^npm run`, `^git `)
#179	- Error stack traces (`^Traceback`, `^Error:`, `^\s+at `)
#180	- Boilerplate text (`^---BEGIN`, `^#include`)
#181	- System-level chatter that pollutes memory
#182
#183	Example:
#184	```yaml
#185	memory:
#186	mnemosyne:
#187	ignore_patterns:
#188	- "^pip "
#189	- "^npm "
#190	- "^Traceback \$most recent call last\$"
#191	- "^Error:"
#192	- "^\\s+at "
#193	```
#194
#195	Patterns are applied at `remember()` time. Content that matches any pattern is discarded with a debug-level log.
#196
#197	## Optional Dependencies
#198
#199	```bash
#200	# Dense retrieval (semantic search)
#201	pip install fastembed>=0.3.0
#202
#203	# Local LLM consolidation
#204	pip install ctransformers>=0.2.27 huggingface-hub>=0.20
#205
#206	# Both
#207	pip install mnemosyne-memory[all]
#208	```
#209
#210	Without `fastembed`, Mnemosyne falls back to keyword-only retrieval (FTS5). It works, but semantic search and benchmark scores require it.
#211
#212	## Example Configuration
#213
#214	```bash
#215	# ~/.bashrc or .env
#216	export MNEMOSYNE_DATA_DIR=~/.hermes/mnemosyne/data
#217	export MNEMOSYNE_VEC_TYPE=int8
#218	export MNEMOSYNE_WM_MAX_ITEMS=10000
#219	export MNEMOSYNE_WM_TTL_HOURS=48
#220	export MNEMOSYNE_SLEEP_BATCH=3000
#221
#222	# Use Ollama for consolidation
#223	export MNEMOSYNE_LLM_BASE_URL=http://localhost:11434/v1
#224	export MNEMOSYNE_LLM_MODEL=llama3
#225
#226	# OR: when running under Hermes, route through Hermes' authenticated provider
#227	# (e.g., an OAuth-backed openai-codex subscription) instead of a remote URL
#228	export MNEMOSYNE_HOST_LLM_ENABLED=true
#229	```
#230

z6Mkq5mY3JWtxoxUobWcfNHm7AkRubgSWEZTkBVqZXJviFZ5/my-project-public