---
title: LiveKit
---

This guide shows how to build a memory-enabled voice assistant with LiveKit, Deepgram, OpenAI, and Mem0. The example agent is a context-aware travel planner.

## Prerequisites

Before you begin, make sure you have:

1. Installed the LiveKit Agents SDK along with the Silero, Deepgram, OpenAI, turn-detector, and noise-cancellation plugins:
   ```bash
   pip install livekit livekit-agents \
     livekit-plugins-silero \
     livekit-plugins-deepgram \
     livekit-plugins-openai \
     livekit-plugins-turn-detector \
     livekit-plugins-noise-cancellation
   ```

2. Installed the Mem0 SDK:
   ```bash
   pip install mem0ai
   ```

3. Set up your API keys in a `.env` file:
   ```sh
   LIVEKIT_URL=your_livekit_url
   LIVEKIT_API_KEY=your_livekit_api_key
   LIVEKIT_API_SECRET=your_livekit_api_secret
   DEEPGRAM_API_KEY=your_deepgram_api_key
   MEM0_API_KEY=your_mem0_api_key
   OPENAI_API_KEY=your_openai_api_key
   ```

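A missing key usually only shows up as a failed API call at runtime, so it can help to check the environment once at startup. The following is a minimal sketch under that assumption: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names introduced here, not part of LiveKit or Mem0, and the list simply mirrors the variables in the `.env` file above.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the .env file shown above.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "DEEPGRAM_API_KEY",
    "MEM0_API_KEY",
    "OPENAI_API_KEY",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> list:
    """Return the names of required variables that are unset or empty.

    Hypothetical helper: pass a mapping for testing, or leave `env` as None
    to inspect the real process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

One way to use it is to call `missing_env_vars()` right after `load_dotenv()` and stop with a clear message if the returned list is not empty.
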
> **Note**: You need a LiveKit account and a Deepgram account. You can find `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET` in the [LiveKit Cloud Console](https://cloud.livekit.io/); see the [LiveKit Documentation](https://docs.livekit.io/home/cloud/keys-and-tokens/) for more information. You can get `DEEPGRAM_API_KEY` from the [Deepgram Console](https://console.deepgram.com/); see the [Deepgram Documentation](https://developers.deepgram.com/docs/create-additional-api-keys) for more details.

## Code Breakdown

Let's break down the key components of this implementation using LiveKit Agents:

### 1. Setting Up Dependencies and Environment

```python
import os
import logging
from pathlib import Path
from dotenv import load_dotenv

from mem0 import AsyncMemoryClient

from livekit.agents import (
    JobContext,
    WorkerOptions,
    cli,
    ChatContext,
    ChatMessage,
    RoomInputOptions,
    Agent,
    AgentSession,
)
from livekit.plugins import openai, silero, deepgram, noise_cancellation
from livekit.plugins.turn_detector.english import EnglishModel

# Load environment variables
load_dotenv()

# Logger used by the agent code in the following sections
logger = logging.getLogger("memory_voice_agent")
```

### 2. Mem0 Client and Agent Definition

```python
# User ID for RAG data in Mem0
RAG_USER_ID = "livekit-mem0"
mem0_client = AsyncMemoryClient()

class MemoryEnabledAgent(Agent):
    """
    An agent that can answer questions using RAG (Retrieval Augmented Generation) with Mem0.
    """
    def __init__(self) -> None:
        super().__init__(
            instructions="""
                You are a helpful voice assistant.
                You are a travel guide named George and will help the user to plan a travel trip of their dreams.
                You should help the user plan for various adventures like work retreats, family vacations or solo backpacking trips.
                You should be careful to not suggest anything that would be dangerous, illegal or inappropriate.
                You can remember past interactions and use them to inform your answers.
                Use semantic memory retrieval to provide contextually relevant responses.
            """,
        )
        self._seen_results = set()  # Track previously seen result IDs
        logger.info(f"Mem0 Agent initialized. Using user_id: {RAG_USER_ID}")

    async def on_enter(self):
        self.session.generate_reply(
            instructions="Briefly greet the user and offer your assistance."
        )

    async def on_user_turn_completed(self, turn_ctx: ChatContext, new_message: ChatMessage) -> None:
        # Persist the user message in Mem0
        try:
            logger.info(f"Adding user message to Mem0: {new_message.text_content}")
            add_result = await mem0_client.add(
                [{"role": "user", "content": new_message.text_content}],
                user_id=RAG_USER_ID
            )
            logger.info(f"Mem0 add result (user): {add_result}")
        except Exception as e:
            logger.warning(f"Failed to store user message in Mem0: {e}")

        # RAG: Retrieve relevant context from Mem0 and inject as assistant message
        try:
            logger.info("About to await mem0_client.search for RAG context")
            search_results = await mem0_client.search(
                new_message.text_content,
                user_id=RAG_USER_ID,
            )
            logger.info(f"mem0_client.search returned: {search_results}")
            if search_results and search_results.get('results', []):
                context_parts = []
                for result in search_results.get('results', []):
                    paragraph = result.get("memory") or result.get("text")
                    if paragraph:
                        source = "mem0 Memories"
                        if "from [" in paragraph:
                            source = paragraph.split("from [")[1].split("]")[0]
                            paragraph = paragraph.split("]")[1].strip()
                        context_parts.append(f"Source: {source}\nContent: {paragraph}\n")
                if context_parts:
                    full_context = "\n\n".join(context_parts)
                    logger.info(f"Injecting RAG context: {full_context}")
                    turn_ctx.add_message(role="assistant", content=full_context)
                    await self.update_chat_ctx(turn_ctx)
        except Exception as e:
            logger.warning(f"Failed to inject RAG context from Mem0: {e}")

        await super().on_user_turn_completed(turn_ctx, new_message)
```

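The context-building step inside `on_user_turn_completed` can be easier to follow, and to test, as a standalone function. The sketch below is an illustration rather than the guide's own code: `build_context` is a hypothetical helper, the sample results are invented, and it assumes each search result carries an `id` field. It also shows one possible use for the `_seen_results` set, which the agent above creates but never reads: skipping memories that were already injected in an earlier turn.

```python
def build_context(results: list, seen_ids: set) -> str:
    """Format unseen memories as 'Source / Content' paragraphs.

    Hypothetical helper. `results` is a list of dicts shaped like the
    entries under 'results' in the search response used above; the 'id'
    key is an assumption about that shape.
    """
    parts = []
    for result in results:
        text = result.get("memory") or result.get("text")
        if not text:
            continue

        # Skip memories that were already injected in a previous turn.
        result_id = result.get("id")
        if result_id is not None:
            if result_id in seen_ids:
                continue
            seen_ids.add(result_id)

        # Pull out an optional "from [source]" tag, as the agent code does.
        source = "mem0 Memories"
        if "from [" in text:
            tagged = text.split("from [", 1)[1]
            if "]" in tagged:
                source, _, text = tagged.partition("]")
                text = text.strip()

        parts.append(f"Source: {source}\nContent: {text}\n")
    return "\n\n".join(parts)


# Invented sample data for illustration only.
sample_results = [
    {"id": "m1", "memory": "from [travel_notes] Prefers window seats"},
    {"id": "m2", "memory": "Likes hiking in the Alps"},
    {"id": "m1", "memory": "from [travel_notes] Prefers window seats"},
]
seen = set()
print(build_context(sample_results, seen))
```

Calling `build_context` a second time with the same `seen` set returns an empty string, so the same memory is not injected into the chat context twice.
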
### 3. Entrypoint and Session Setup

```python
async def entrypoint(ctx: JobContext):
    """Main entrypoint for the agent."""
    await ctx.connect()

    session = AgentSession(
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4.1-nano-2025-04-14"),
        tts=openai.TTS(voice="ash"),
        turn_detection=EnglishModel(),
        vad=silero.VAD.load(),
    )

    await session.start(
        agent=MemoryEnabledAgent(),
        room=ctx.room,
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
    )

    # Initial greeting
    await session.generate_reply(
        instructions="Greet the user warmly as George the travel guide and ask how you can help them plan their next adventure.",
        allow_interruptions=True
    )

# Run the application
if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

## Key Features of This Implementation

1. **Semantic Memory Retrieval**: Uses Mem0 to store user messages and retrieve contextually relevant memories.
2. **Voice Interaction**: Uses LiveKit for voice communication, with Deepgram speech-to-text, OpenAI text-to-speech, Silero voice activity detection, and turn detection.
3. **Intelligent Context Management**: Injects retrieved memories into the chat context before each reply.
4. **Travel Planning Specialization**: The agent's instructions configure it as a travel guide named George.
5. **Function Tools**: The agent can be extended with function tools for additional capabilities (this example does not define any).

## Running the Example

To run this example:

1. Install all required dependencies.
2. Set up your `.env` file with the necessary API keys.
3. Ensure your microphone and audio setup are configured.
4. Run the script with Python 3.11 or newer:
   ```sh
   python mem0-livekit-voice-agent.py start
   ```
   Or start the agent in console mode to run it inside your terminal:

   ```sh
   python mem0-livekit-voice-agent.py console
   ```
5. After the script starts, connect to the agent through the [LiveKit Agents Playground](https://agents-playground.livekit.io/) to start a conversation.

## Best Practices for Voice Agents with Memory

1. **Context Preservation**: Store enough context with each memory for effective retrieval.
2. **Privacy Considerations**: Implement secure memory management, for example by scoping memories to each user with a distinct `user_id` rather than the single shared `RAG_USER_ID` used in this example.
3. **Relevant Memory Filtering**: Use semantic search to retrieve only the most relevant memories.
4. **Error Handling**: Wrap memory operations in `try`/`except`, as the agent above does, so that a failed Mem0 call does not interrupt the conversation.

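In a voice agent, a slow memory call delays the spoken reply, so the error-handling practice is often paired with a timeout. The following is a minimal sketch of that idea using only `asyncio`. `safe_memory_call` is a hypothetical helper, not part of Mem0 or LiveKit, and the 2-second default timeout is an arbitrary assumption.

```python
import asyncio
import logging

logger = logging.getLogger("memory_voice_agent")


async def safe_memory_call(awaitable, *, timeout: float = 2.0, default=None):
    """Await a memory operation; return `default` on error or timeout.

    Hypothetical helper: the timeout value is an arbitrary choice and
    should be tuned to how much latency the conversation can tolerate.
    """
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout)
    except asyncio.TimeoutError:
        logger.warning("Memory operation timed out after %.1fs", timeout)
    except Exception as exc:
        logger.warning("Memory operation failed: %s", exc)
    return default


# Possible use inside the agent (names taken from the code above):
#   results = await safe_memory_call(
#       mem0_client.search(query, user_id=RAG_USER_ID), default={}
#   )
```

Returning a default instead of raising keeps the turn moving: the agent simply replies without retrieved context when Mem0 is slow or unavailable.
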
## Debugging Function Tools

- To run the script in debug mode, start the assistant in `dev` mode:
  ```sh
  python mem0-livekit-voice-agent.py dev
  ```

- When working with memory-enabled voice agents, use Python's `logging` module for effective debugging:

  ```python
  import logging

  # Set up logging
  logging.basicConfig(
      level=logging.DEBUG,
      format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
  )
  logger = logging.getLogger("memory_voice_agent")
  ```

- Check the logs for any issues with API keys, connectivity, or memory operations.
- Ensure your `.env` file is correctly configured and loaded.

<CardGroup cols={2}>
  <Card title="ElevenLabs Integration" icon="volume" href="/integrations/elevenlabs">
    Build conversational voice agents with ElevenLabs
  </Card>
  <Card title="Pipecat Integration" icon="waveform" href="/integrations/pipecat">
    Create real-time voice applications with Pipecat
  </Card>
</CardGroup>