# Agent libOS

An experimental Agent-native libOS runtime written in Python.

Agent libOS models an agent as a long-running, schedulable, interruptible, capability-controlled `AgentProcess`, not as a single chat request or workflow thread. The codebase is an MVP implementation of the ideas in [agent_libos_design_doc.md](agent_libos_design_doc.md).

This project is still in active development.

## Current MVP

### Runtime

- Agent process lifecycle: `spawn`, `fork`, `exec`, `wait`, `signal`, `pause`, `resume`, `exit`.
- Async process supervisor: `Runtime.arun_until_idle()` automatically keeps runnable processes moving.
- Child process tools can fork workers, spawn fresh children, wait/join, list direct children, signal direct children, and merge child memory.
- Each process gets its own default Object Memory namespace at spawn/fork time. Bare Object Memory names resolve inside that process namespace.
- Each process has its own workspace-relative working directory. Relative filesystem paths and shell subprocess cwd resolve from that process cwd; the runtime host process does not `chdir` into launched workspaces.
- Each process has a durable message queue for IPC. Messages carry `kind`, `channel`, `correlation_id`, `reply_to`, subject/body, and structured payload; receivers can read, acknowledge, or block on selective filters.
- Human queue integration is part of the runtime supervisor by default. If a primitive blocks on human approval, the process enters `WAITING_HUMAN`; the runtime processes human terminal messages, wakes the process, and resumes the pending action.
- Child waits are also resumable: `wait_child_process` puts the parent in `WAITING_EVENT`, child exit wakes the parent, and the original wait action resumes without asking the model for a new action.
- Single-step APIs remain available for tests and debugging: `run_next_process_once()` / `arun_next_process_once()` do not drain the human queue.
- Agent images configure process-visible tool tables at process creation time.
- Event bus and audit trace cover process, process messages, object memory, capabilities, tools, human requests, checkpoints, and primitive access.
- SQLite stores process/object metadata, process messages, full LLM call records, events, audit records, capabilities, human requests, tools, candidates, and checkpoints.
- LibOS primitives use an injectable Resource Provider Substrate. The default substrate is local host OS backed, but filesystem, clock/sleep, shell, and human terminal I/O providers can be replaced without changing tool schemas or capability checks.

### Object Memory

- Typed Object Memory with handles, namespace-local names, namespace directories, links, views, materialized context, snapshots, and merge scaffolding.
- The default namespace is process-private: process `proc_abc` resolves bare names inside `process:proc_abc`, similar to how an OS process sees its own virtual address space by default.
- Names are unique only inside a namespace. The same local name can exist independently in two process namespaces or in an explicit shared namespace.
- Explicit namespaces are directory-like scopes created with `create_memory_namespace` and inspected with `list_memory_namespace`.
- Namespace capabilities gate listing and name resolution. Object capabilities still gate reading, writing, linking, materializing, deleting, and granting object access.
- A name is not itself a capability: resolving `namespace/name` requires namespace read authority and object read authority.
- Object payloads live in runtime memory, not SQLite. SQLite stores directory metadata and a runtime-memory marker only.
- Process-owned memory is released on process exit unless retained as the process result.
- File/Object bridge tools can move file content into and out of Object Memory without returning the concrete content to the process-visible tool result.

### Tools And Primitives

LLM-facing tools are stable wrappers over libOS primitives. They are similar to libc calls: ergonomic and model-facing, but not the security boundary.

Built-in tools currently include:

- `append_memory_object`
- `ask_human`
- `create_memory_namespace`
- `create_memory_object`
- `create_object_from_file`
- `delete_directory`
- `delete_file`
- `exec_process`
- `fork_child_process`
- `get_current_time`
- `get_working_directory`
- `human_output`
- `load_image_from_yaml`
- `list_child_processes`
- `list_memory_namespace`
- `merge_child_memory`
- `parse_pytest_log`
- `process_exit`
- `propose_jit_tool`
- `read_directory`
- `read_memory_object`
- `read_process_messages`
- `receive_process_messages`
- `read_text_file`
- `register_jit_tool`
- `request_permission`
- `run_shell_command`
- `send_process_message`
- `set_working_directory`
- `signal_child_process`
- `sleep`
- `spawn_child_process`
- `validate_jit_tool`
- `wait_child_process`
- `write_directory`
- `write_object_to_file`
- `write_text_file`
- `echo`

Important boundary rules:

- A process can call only tools in its process tool table.
- Tool call visibility is not an external-resource grant.
- Bare Object Memory names resolve in the caller's process namespace; shared memory requires an explicit namespace plus namespace/object capabilities.
- Relative filesystem paths and shell commands resolve from the caller's process working directory, which is independent for each `AgentProcess`.
- Filesystem read/write/delete checks happen in the filesystem primitive.
- Human output, human questions, and human approval checks happen in the HumanObject primitive; concrete terminal reads/writes happen only through the substrate `HumanProvider`.
- Shell execution checks happen in the shell primitive. The model-facing tool accepts argv arrays only; it never accepts shell command strings for implicit parsing.
- Image registration checks happen in the image registry primitive. `load_image_from_yaml` only reads a workspace YAML file and passes the parsed manifest to that primitive.
- `ask_human` creates a blocking HumanObject question and returns the answer only after the human queue responds.
- Clock `sleep` is async, so one sleeping process does not block other runnable processes.
- Agent-authored JIT tools are Deno/TypeScript modules. They export `run(args, libos)` and can reach libOS only through `await libos.syscall(name, args)`.
- JIT syscalls do not consult the caller's LLM-facing tool table. They are authorized by pid, primitive-level capabilities, permission policy, human approval, and audit.
- The Deno subprocess is launched with `--no-prompt` and no read/write/net/env/run/ffi host permissions. Static imports are limited to configured `jsr:` packages, with a small `@std/*` allowlist by default.
- Human approval is part of a syscall. TypeScript sees either the final syscall payload or a final syscall error; it never sees a pending/retry protocol state.
- `process.exit` and `process.exec` are ordinary syscalls from the TypeScript side. The runtime applies the resulting lifecycle change only after the JIT tool returns its normal tool result.

### Permissions And Human Queue

Permission requests are ordinary process actions mediated by the human queue:

- `request_permission` asks the human to choose a policy for a resource/right pair.
- The human can choose `always_allow`, `always_deny`, or `ask_each_time`.
- With `ask_each_time`, the relevant primitive creates a per-use human approval request when the operation is attempted.
- Per-use approval grants a one-shot capability that is consumed after one successful primitive call.
- Filesystem capabilities can target exact files such as `filesystem:workspace:README.md`, directory subtrees such as `filesystem:workspace:agent_outputs/*`, or the whole workspace.
- Shell capabilities are process-scoped policies over `shell:*`. The built-in policy levels are `always_deny`, `allowlist_auto_else_ask`, `blocklist_ask_else_auto`, and `always_allow`; `always_allow` is intentionally marked high-risk.
- Shell allow/block lists match tokenized argv, not substrings, globs, or shell-expanded strings. Allow-list rules are exact by default, bare executable names do not match path-qualified executables, and block-list checks also scan nested executable-looking argv tokens such as `bash` or `powershell`.
- Runtime helpers can grant file/directory allow lists separately for read, write, and delete operations.
- Child processes inherit no external-resource capability by default; `fork_child_process` and `spawn_child_process` can explicitly inherit selected file, directory, or resource capabilities that the parent already holds.
- `fork_child_process` attenuates a selected parent MemoryView into the child. `spawn_child_process` creates a fresh direct child with a new process namespace and a goal-only MemoryView.
- `exec_process` replaces the current process image and tool table without changing pid. It never grants the target image's required capabilities automatically; capabilities are preserved only when explicitly requested, otherwise external capabilities are shrunk.
- Image registration requires `write` on `image:<image_id>` or a wildcard such as `image:*`. The YAML loader also requires filesystem read authority for the manifest path.
- Ordinary human questions use the same queue: a process waiting on `ask_human` stays in `WAITING_HUMAN` until the terminal queue supplies an answer.
- Rejection does not crash the runtime; the process resumes and can report why it could not complete.
- Approval context includes path, resource, overwrite risk, byte count, SHA-256, target state, and a `repr()`-escaped content preview.
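
The filesystem capability targets and one-shot approvals listed above can be illustrated with a small standalone model. This is a sketch under assumptions, not the project's `CapabilityManager`: the `Capability` dataclass and the `matches`, `authorize`, and `write_file` helpers are hypothetical names, and only the two resource strings quoted in the list come from this README. Path containment is a separate check and is not modelled here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Capability:
    """Hypothetical capability record, not the real agent_libos model."""
    resource: str            # e.g. "filesystem:workspace:README.md"
    right: str               # "read", "write", or "delete"
    one_shot: bool = False   # True for a per-use human approval
    consumed: bool = False


def matches(pattern: str, resource: str) -> bool:
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern == resource:
        return True
    if pattern.endswith("*"):
        # Keep the separator before '*' so "a/*" never matches "ab/x".
        return resource.startswith(pattern[:-1])
    return False


def authorize(caps: list[Capability], resource: str, right: str) -> Optional[Capability]:
    """Return the first unconsumed capability covering resource/right."""
    for cap in caps:
        if not cap.consumed and cap.right == right and matches(cap.resource, resource):
            return cap
    return None


def write_file(caps: list[Capability], resource: str, do_write: Callable[[], None]) -> bool:
    """Run do_write only when authorized; consume one-shot grants on success."""
    cap = authorize(caps, resource, "write")
    if cap is None:
        return False
    do_write()               # if this raises, the one-shot grant is not consumed
    if cap.one_shot:
        cap.consumed = True
    return True
```

In this model a subtree grant keeps working across calls, while a one-shot grant for `filesystem:workspace:README.md` authorizes exactly one successful write and then stops matching.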

### LLM Execution

- OpenAI-compatible LLM client using `.env` configuration.
- OpenAI tool-call schemas generated from the current process tool table.
- The runtime executes the selected legal tool call for each quantum.
- Free-form model text is allowed, but only tool calls or fallback JSON actions have side effects.
- Malformed tool calls with missing function names are rejected; when possible the executor gives the model one repair attempt with the exact visible tool names.
- Model calls run off the event loop, and tool dispatch has async support.
- Each process LLM context is stored as a mutable Object Memory object named `llm_context:<pid>`. The runtime appends new process facts, events, capability snapshots, and object summaries to the end of this object so repeated prompt prefixes remain stable for prompt caching.
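
The append-only rule in the last bullet is what keeps earlier prompt text byte-identical between quanta. A minimal sketch of that property, assuming a hypothetical `LLMContext` class that stands in for the `llm_context:<pid>` object (the entry format is invented for illustration):

```python
from dataclasses import dataclass, field


@dataclass
class LLMContext:
    """Hypothetical stand-in for the `llm_context:<pid>` object."""
    entries: list = field(default_factory=list)

    def append(self, kind: str, text: str) -> None:
        # New facts only ever go at the end; earlier entries are never edited.
        self.entries.append(f"[{kind}] {text}")

    def render(self) -> str:
        """Join all entries in insertion order into one prompt string."""
        return "\n".join(self.entries)


ctx = LLMContext()
ctx.append("goal", "Summarize README.md")
first_prompt = ctx.render()

ctx.append("event", "read_text_file returned 2048 bytes")
second_prompt = ctx.render()

# The later prompt extends the earlier one, so a provider-side prefix
# cache keyed on the earlier text can still apply.
print(second_prompt.startswith(first_prompt))
```

Rewriting or reordering earlier entries would change the prefix and make that kind of reuse impossible, which is why updates are appended rather than merged in place.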

### Built-In Coding Image

`coding-agent:v0` is the practical repository-engineering image. It starts with read-only workspace authority and human-output authority, but no default write/delete authority. Its prompt tells the agent to scale the size of a change to the goal, preserve plans and evidence in Object Memory, fork child workers only when parallel analysis materially helps, spawn fresh children when parent context should not be copied, use pregranted write/delete authority when present, request least-privilege permissions when authority is missing, use file/Object bridge tools for large content movement, parse pytest logs when available, and exit with a structured summary of changes, evidence, verification, residual risks, and follow-up.

### Security Properties Covered By Tests

- Object handles are capability-protected; OIDs or object names alone do not grant access.
- Object Memory namespaces are capability-protected; namespace read/write and object read/write are separate checks.
- Tool tables and external-resource capabilities are independent.
- Tools cannot bypass filesystem or human primitive checks.
- Path containment, revoked capabilities, fork attenuation, spawn-child isolation, exec non-escalation, image registration authority, tool-table denial, Deno/TypeScript JIT scope, syscall capability checks, human approval inside syscalls, deferred process lifecycle, and unsafe import/API rejection are covered by tests.
- Built-in LLM-facing tools are checked so they do not directly touch host filesystem, terminal, network, shell, database, or secrets.

## Quick Start

Install dependencies:

```bash
uv sync
```

Deno is optional for the Python test suite. Install `deno` or set `agent_libos.config.DEFAULT_CONFIG.tools.deno_executable` if you want to validate or run real Deno/TypeScript JIT tools.

Run tests:

```bash
uv run python -m unittest discover -s tests -v
```

Run the deterministic local demo:

```bash
uv run agent-libos demo
```

The demo does not call a real model. It covers process spawn/fork, Object Memory, a Deno/TypeScript JIT parser when Deno is available, checkpointing, capability denial before grant, human approval, filesystem write, final report object creation, and audit trace generation. If Deno is not installed, the demo reports the JIT validation error and continues through the rest of the contract.

Use a persistent local runtime database:

```bash
uv run agent-libos --db .agent_libos.sqlite init
uv run agent-libos --db .agent_libos.sqlite demo
uv run agent-libos --db .agent_libos.sqlite audit
uv run agent-libos --db .agent_libos.sqlite processes
uv run agent-libos --db .agent_libos.sqlite tools
```

## LLM Configuration

Create a local `.env` file for real-model execution:

```bash
OPENAI_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
OPENAI_LANGUAGE_MODEL=qwen3.7-max
OPENAI_API_KEY=...
```

The LLM client uses the OpenAI Python SDK. By default it uses the Responses API for OpenAI-hosted models and falls back to Chat Completions for custom OpenAI-compatible `base_url` providers. Set `OPENAI_API_MODE=responses` or `OPENAI_API_MODE=chat` to force a mode. Optional knobs include `OPENAI_TIMEOUT`, `OPENAI_MAX_RETRIES`, `OPENAI_STORE`, `OPENAI_REASONING_EFFORT`, `OPENAI_VERBOSITY`, and provider-specific `OPENAI_ENABLE_THINKING`.
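
The mode selection described above can be read as a small decision function. The sketch below is an interpretation, not the client's actual code: `select_api_mode` is a hypothetical helper, and treating "OpenAI-hosted" as "no `OPENAI_BASE_URL`, or a base URL on `api.openai.com`" is an assumption about how the real client decides.

```python
from urllib.parse import urlparse


def select_api_mode(env: dict) -> str:
    """Return "responses" or "chat" from environment-style settings."""
    forced = env.get("OPENAI_API_MODE", "").strip().lower()
    if forced in ("responses", "chat"):
        return forced  # an explicit override always wins

    base_url = env.get("OPENAI_BASE_URL", "").strip()
    if not base_url:
        return "responses"  # no custom endpoint configured

    # Assumption: only the api.openai.com host counts as OpenAI-hosted.
    host = urlparse(base_url).hostname or ""
    return "responses" if host == "api.openai.com" else "chat"


print(select_api_mode({"OPENAI_BASE_URL": "https://dashscope.aliyuncs.com/compatible-mode/v1"}))
```

With the `.env` example from this section and no `OPENAI_API_MODE` set, this sketch selects Chat Completions, which matches the documented fallback for custom providers.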

Runtime defaults that are not provider secrets live in `agent_libos.config.DEFAULT_CONFIG`. This includes scheduler quanta, process budgets, default image ids, workspace namespace, tool timeouts, filesystem/object-memory size limits, Deno JIT sandbox limits, JSR import allowlists, shell policy lists, launcher presets, and example-script defaults. Components accept an `AgentLibOSConfig` where runtime-level injection is useful; fixed protocol identifiers and model-facing tool semantics stay in their own modules.

Spawn and run a process:

```bash
uv run agent-libos --db .agent_libos.sqlite spawn --image coding-agent:v0 --goal "Write a short summary of README.md"
uv run agent-libos --db .agent_libos.sqlite run --max-quanta 10
```

`agent-libos run` uses the high-level async supervisor, so human terminal messages are processed as part of runtime execution. For manual queue processing, the lower-level command still exists:

```bash
uv run agent-libos --db .agent_libos.sqlite human
```

Every LLM action-selection call is persisted in SQLite as an `llm_calls` row. The record includes the exact prompt messages, visible tool schemas, output content, tool calls, provider ids, model/api, token usage when the provider returns it, reasoning fields when exposed by the provider, raw response JSON, and errors. Inspect them with:

```bash
uv run agent-libos --db .agent_libos.sqlite llm-calls --pid <pid>
```

Humans can also inject process messages at any time. This works while another `agent-libos run` is using the same SQLite runtime database:

```bash
uv run agent-libos --db .agent_libos.sqlite message <pid> "Please inspect the latest result"
uv run agent-libos --db .agent_libos.sqlite interrupt <pid> "Stop current work and read this first"
uv run agent-libos --db .agent_libos.sqlite message <pid> "Use this as job input" --channel human --correlation-id job-42 --run
```

For a Codex CLI-style loop in one terminal, use interactive run. Plain text sends a normal message unless a human question or approval is pending, in which case it answers that request; use `/message <text>` to force a normal process message. `/interrupt <text>` sends an interrupt; `/pid <pid>` switches the target; `/exit` exits the interactive loop.

```bash
uv run agent-libos --db .agent_libos.sqlite run --interactive --pid <pid> --max-quanta 20
```

The CLI also exposes process built-ins for manual lifecycle control:

```bash
uv run agent-libos --db .agent_libos.sqlite cd <pid> src
uv run agent-libos --db .agent_libos.sqlite exec image.yaml "Review README.md" --pid <pid> --run
uv run agent-libos --db .agent_libos.sqlite exit <pid> --payload '{"done":true}'
```

For `exec`, the first positional argument is the target image. It can be an already registered image id such as `coding-agent:v0`, or a `.yaml` / `.yml` AgentImage manifest path such as `image.yaml`. The second positional argument is the replacement goal. `--run` runs the scheduler immediately after exec; omit it or pass `--no-run` to only swap the process image and tool table.

An AgentImage YAML manifest accepted by `load_image_from_yaml` can use either a top-level image mapping or direct image fields:

```yaml
image:
  image_id: yaml-agent:v0
  name: yaml-agent
  system_prompt: |
    Use the smallest safe tool sequence.
  default_tools:
    - read_memory_object
    - human_output
  context_policy: evidence_first
  safety_profile: review
  metadata:
    role: example
```
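
Accepting both manifest shapes amounts to unwrapping an optional top-level `image` key before the fields are handed on. The following sketch works on an already parsed mapping, such as the dict a YAML parser would return. `normalize_manifest` is a hypothetical helper, and treating `image_id` as the only required field is an assumption, not the loader's documented validation.

```python
def normalize_manifest(doc: dict) -> dict:
    """Accept {"image": {...}} or direct image fields; return the fields."""
    if not isinstance(doc, dict):
        raise ValueError("manifest must be a mapping")

    # Unwrap the optional top-level "image" mapping.
    fields = doc["image"] if isinstance(doc.get("image"), dict) else doc

    # Assumption: image_id is required; the real loader may check more.
    if "image_id" not in fields:
        raise ValueError("manifest is missing image_id")
    return dict(fields)


wrapped = {"image": {"image_id": "yaml-agent:v0", "default_tools": ["human_output"]}}
direct = {"image_id": "yaml-agent:v0", "default_tools": ["human_output"]}
print(normalize_manifest(wrapped) == normalize_manifest(direct))
```

Both shapes reduce to the same field mapping, so whatever consumes the manifest only has to handle one form.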

## Example Scripts

Summarize a workspace document through an Agent process:

```bash
uv run python scripts/llm_summarize_document.py README.md --auto-approve
```

Choose the permission policy explicitly for non-interactive runs:

```bash
uv run python scripts/llm_summarize_document.py README.md --permission-policy always_allow --auto-approve
uv run python scripts/llm_summarize_document.py README.md --permission-policy always_deny --auto-approve
```

Run the real-model write-file smoke test:

```bash
uv run python scripts/llm_write_goal_smoke.py
```

Launch a real coding agent against any workspace with preconfigured permissions:

```bash
uv run python scripts/run_coding_agent.py --workspace /path/to/repo --goal "Implement the requested change"
```

On Windows PowerShell, the same launcher works with Windows-style paths:

```powershell
uv run python scripts\run_coding_agent.py --workspace ..\some-repo --goal "Summarize the current project"
```

The launcher defaults to the `edit` permission preset: read+write over the workspace, but no delete authority. Use `--permission-preset read-only` for inspection-only runs, `--permission-preset full` for read+write+delete, or combine `read-only` with exact allow-list grants such as `--write-file src/main.py` and `--delete-dir build`.

The launcher also grants a shell policy by default: `--shell-policy allowlist_auto_else_ask`. Use `--shell-policy none` to grant no shell execution policy, `always_deny` to hard-disable shell calls, `blocklist_ask_else_auto` to auto-allow commands except configured risky entries, or `always_allow` only for high-risk fully trusted runs.

By default the launcher loads LLM settings from this Agent-libOS checkout's `.env` before mounting the target workspace into the Resource Provider Substrate. It does not change the launcher process cwd. Use `--env-file /path/to/.env` to override that.

Copy a workspace text file through named Object Memory without materializing the file content into the process prompt:

```bash
uv run python scripts/object_memory_file_copy_smoke.py
```

Run two async-scheduled processes that use `sleep` to alternate current-time output:

```bash
uv run python scripts/async_clock_interleave_smoke.py --iterations 3 --interval 0.2
```

Expected output order is `A, B, A, B, ...`, showing that one process sleeping does not block the other process.
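
The same interleaving can be reproduced with plain `asyncio`, independent of Agent libOS. This is only an analogy for the behaviour the smoke script demonstrates, not its implementation: the `ticker` coroutine and the half-interval offset for `B` are choices made for this sketch so that the order is unambiguous.

```python
import asyncio


async def ticker(name: str, iterations: int, interval: float, offset: float, log: list) -> None:
    """Record `name` once per interval, starting after `offset` seconds."""
    await asyncio.sleep(offset)
    for _ in range(iterations):
        log.append(name)
        # Awaiting sleep yields to the event loop, so the other task can run.
        await asyncio.sleep(interval)


async def main(iterations: int = 3, interval: float = 0.2) -> list:
    log: list = []
    await asyncio.gather(
        ticker("A", iterations, interval, 0.0, log),
        ticker("B", iterations, interval, interval / 2, log),  # half a period behind A
    )
    return log


order = asyncio.run(main(iterations=3, interval=0.2))
print(order)
```

If the sleep were a blocking `time.sleep`, `A` would finish all of its iterations before `B` ever ran; the alternation only appears because sleeping suspends one task and hands control back to the scheduler.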

Ask the human which workspace file to view, then show that file's content:

```bash
uv run python scripts/ask_file_then_show.py
```

For non-interactive testing:

```bash
uv run python scripts/ask_file_then_show.py --auto-answer README.md
```

Run a traditional human/LLM terminal chat through the script-local `ChatImage`, using `ask_human` and `human_output`:

```bash
uv run python scripts/human_llm_chat.py
```

For a deterministic local smoke run without calling a model:

```bash
uv run python scripts/human_llm_chat.py --mock --auto-message hello --auto-message /exit
```

## Architecture

```text
Agent Personality / Application
  -> Skills / Tools Layer
       - LLM-facing actions
       - tool schemas
       - macro actions
       - skill metadata
  -> Agent libOS Runtime
       - AsyncProcessScheduler
       - ProcessManager
       - ObjectMemoryManager
       - ToolBroker
       - HumanObjectManager
       - Primitive managers
       - CapabilityManager
       - EventBus
       - CheckpointManager
       - AuditManager
  -> Resource Provider Substrate
       - filesystem provider
       - clock/sleep provider
       - shell provider
       - human provider
  -> Host Runtime / Provider Backend
       - local workspace filesystem
       - host clock
       - subprocess backend
       - terminal or UI human I/O backend
       - future remote, container, WASM, or service-backed providers
```

The key design boundary is between model-facing tools and libOS primitives. For example, `write_text_file` can be visible in a process tool table, but `FilesystemAdapter.write_text()` still enforces workspace containment, resource capability or permission policy, human approval if needed, events, and audit logging.

Putting a tool in a process table does not grant access to files, humans, shell, network, secrets, or other host resources.

Primitives are not themselves the host implementation. They own libOS semantics: capability checks, human approval, event emission, and audit records. Concrete host calls live behind `agent_libos.substrate` providers such as `LocalFilesystemProvider`, `LocalClockProvider`, `LocalShellProvider`, and `LocalHumanProvider`. Shell calls are intentionally argv-only at this boundary, so quoting, pipes, redirects, and command chaining must be requested explicitly through an interpreter executable, where policy matching can see the interpreter token. HumanObject similarly owns request queues, approvals, wakeups, and audit records, while the substrate `HumanProvider` owns terminal or UI read/write.
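
Because the shell boundary sees an argv array rather than a command string, policy matching can work on whole tokens. The sketch below illustrates that idea only. The rule lists, `is_allowed`, `hits_blocklist`, and `executable_name` are hypothetical, and the real shell primitive may normalize tokens differently.

```python
ALLOW = {("git", "status"), ("ls",)}      # hypothetical exact-argv allow rules
BLOCK = {"bash", "sh", "powershell"}      # hypothetical interpreter block list


def executable_name(token: str) -> str:
    """Reduce a token to a lower-cased basename without a trailing .exe."""
    base = token.replace("\\", "/").rsplit("/", 1)[-1].lower()
    return base[:-4] if base.endswith(".exe") else base


def is_allowed(argv: list) -> bool:
    """Exact match on the full argv; 'git' does not cover '/usr/bin/git'."""
    return tuple(argv) in ALLOW


def hits_blocklist(argv: list) -> bool:
    """Scan every token, so an interpreter nested behind 'env' is still seen."""
    return any(executable_name(token) in BLOCK for token in argv)


print(is_allowed(["git", "status"]))
print(is_allowed(["/usr/bin/git", "status"]))
print(hits_blocklist(["env", "bash", "-c", "ls | wc -l"]))
```

A pipeline such as `ls | wc -l` can only run by naming an interpreter in argv, and that interpreter token is what the block-list scan catches. A substring match on a single command string could not make that distinction reliably.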

## Runtime Execution Model

High-level execution:

```python
results = await runtime.arun_until_idle(max_quanta=10)
```

By default this does four things:

1. Runs all runnable processes asynchronously.
2. Processes pending human terminal messages when processes are waiting on human input.
3. Delivers process-message notices at the appropriate tool boundary.
4. Wakes resumed processes and continues until no runnable or human-resumable work remains, or the quantum budget is exhausted.

Process messages are explicit queue entries, not raw prompt text. A process can send messages to itself, its parent, or direct children with `send_process_message`. The receiver uses `read_process_messages` for non-blocking inspection or `receive_process_messages` to wait in `WAITING_EVENT` until a matching unread message arrives. Both read paths can filter by kind, sender, channel, correlation id, reply target, or exact message ids, and returned unread messages are acknowledged by default. Interrupt messages are checked before tool execution and preempt non-message tools until read; normal messages are noticed after a tool call and do not block the current tool.
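
Selective reads with acknowledge-by-default can be pictured with an in-memory mailbox. This is a simplified model, not the durable SQLite-backed queue: `Message` and `Mailbox` are hypothetical classes, only a few of the documented fields are included, and the blocking `WAITING_EVENT` path is left out.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_ids = count(1)  # simple message id generator for the sketch


@dataclass
class Message:
    sender: str
    body: str
    kind: str = "normal"
    channel: str = "default"
    correlation_id: Optional[str] = None
    acked: bool = False
    message_id: int = field(default_factory=lambda: next(_ids))


class Mailbox:
    """Hypothetical per-process queue with filtered, acknowledging reads."""

    def __init__(self) -> None:
        self._messages: list = []

    def send(self, message: Message) -> None:
        self._messages.append(message)

    def read(self, *, ack: bool = True, **filters) -> list:
        """Return unread messages whose fields equal every given filter."""
        matched = [
            m for m in self._messages
            if not m.acked and all(getattr(m, key) == value for key, value in filters.items())
        ]
        if ack:
            for m in matched:
                m.acked = True  # acknowledged messages are not returned again
        return matched


box = Mailbox()
box.send(Message("proc_parent", "start job", channel="jobs", correlation_id="job-42"))
box.send(Message("human", "stop and read this", kind="interrupt"))
print([m.body for m in box.read(channel="jobs", correlation_id="job-42")])
```

A second read with the same filter returns nothing, because the first read acknowledged the match, while the unread interrupt stays queued until a read selects it.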

For debugging a pending approval state, opt out explicitly:

```python
results = await runtime.arun_until_idle(max_quanta=1, process_human_queue=False)
```

Single-step APIs also remain available:

```python
result = await runtime.arun_next_process_once()
```

## Object Memory Namespace Model

Object Memory names are local to a namespace. Runtime code that omits `namespace` uses the caller process namespace:

```python
pid = runtime.process.spawn(image="base-agent:v0", goal="collect notes")
handle = runtime.memory.create_object(
    pid=pid,
    object_type="summary",
    name="notes",
    payload={"entries": []},
    immutable=False,
)
obj = runtime.memory.get_object_by_name(pid, "notes")
assert obj.namespace == runtime.memory.process_namespace(pid)
```

For shared or phase-specific memory, create an explicit namespace and pass it on object operations:

```python
runtime.memory.create_namespace(pid, "project")
runtime.memory.create_namespace(pid, "project/research")
runtime.memory.create_object(
    pid=pid,
    object_type="observation",
    namespace="project/research",
    name="notes",
    payload={"source": "README.md"},
)
listing = runtime.memory.list_namespace(pid, "project/research")
```

The namespace grants directory-style authority such as list, lookup, and create. It does not replace object capabilities; reading `project/research/notes` still requires object read capability.
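
The two independent checks can be shown with a toy model that does not use the real `runtime.memory` API. `TinyObjectMemory`, its grant sets, and `AccessDenied` are hypothetical names introduced for this sketch.

```python
class AccessDenied(Exception):
    """Raised when either the namespace check or the object check fails."""


class TinyObjectMemory:
    """Hypothetical model with separate namespace and object read grants."""

    def __init__(self) -> None:
        self.names = {}        # (namespace, name) -> object id
        self.payloads = {}     # object id -> payload
        self.ns_read = set()   # (pid, namespace) pairs allowed to resolve names
        self.obj_read = set()  # (pid, object id) pairs allowed to read payloads

    def create(self, pid: str, namespace: str, name: str, payload) -> str:
        oid = f"obj_{len(self.payloads) + 1}"
        self.names[(namespace, name)] = oid
        self.payloads[oid] = payload
        # The creator receives both kinds of authority.
        self.ns_read.add((pid, namespace))
        self.obj_read.add((pid, oid))
        return oid

    def read_by_name(self, pid: str, namespace: str, name: str):
        if (pid, namespace) not in self.ns_read:
            raise AccessDenied("no namespace read authority")
        oid = self.names[(namespace, name)]
        if (pid, oid) not in self.obj_read:
            raise AccessDenied("no object read capability")
        return self.payloads[oid]


memory = TinyObjectMemory()
oid = memory.create("proc_a", "project/research", "notes", {"source": "README.md"})
memory.ns_read.add(("proc_b", "project/research"))  # proc_b can resolve names only
try:
    memory.read_by_name("proc_b", "project/research", "notes")
except AccessDenied as exc:
    print(exc)
```

In this model `proc_b` can resolve the name but is stopped at the object check until it is also granted read access to the object, which mirrors the rule that a name is not itself a capability.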

## How To Write Agent libOS Tools

Tools should not directly access host resources. Use this pattern:

1. Define a Pydantic input schema and optional output schema.
2. Subclass `SyncAgentTool` for blocking local code or `BaseAgentTool` for async code.
3. Keep validation and model-facing ergonomics in the tool.
4. Call `ctx.runtime.<primitive>` for process, memory, filesystem, human, clock, or other libOS operations.
5. Let primitives enforce capability checks, containment, audit, event emission, checkpointing, and policy hooks.
6. Register the tool through `Runtime._register_builtin_tools()` or a ToolBroker-backed registry.

Do not put direct filesystem, terminal, network, shell, browser, database, or credential access inside a model-facing tool unless that code is itself the libOS primitive or a sandbox backend.
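
The division of labour in this pattern can be sketched without the real base classes, whose exact signatures are not shown in this README. Everything below is a stand-in: a dataclass replaces the Pydantic schema, and `FilesystemPrimitive`, `ToolContext`, and `CountLinesTool` are hypothetical names rather than `SyncAgentTool` subclasses.

```python
from dataclasses import dataclass


class CapabilityError(Exception):
    """Raised by the primitive when the caller lacks authority."""


class FilesystemPrimitive:
    """Stand-in primitive: owns the capability check and the audit record."""

    def __init__(self, files: dict, grants: set) -> None:
        self.files = files    # path -> text (stands in for the host provider)
        self.grants = grants  # set of (pid, path) read grants
        self.audit = []

    def read_text(self, pid: str, path: str) -> str:
        allowed = (pid, path) in self.grants
        self.audit.append((pid, "read", path, allowed))  # audited either way
        if not allowed:
            raise CapabilityError(f"{pid} may not read {path}")
        return self.files[path]


@dataclass
class CountLinesInput:
    """Input schema; a dataclass replaces Pydantic in this sketch."""
    path: str

    def __post_init__(self) -> None:
        if not self.path or self.path.startswith("/"):
            raise ValueError("path must be workspace-relative")


@dataclass
class ToolContext:
    pid: str
    filesystem: FilesystemPrimitive


class CountLinesTool:
    """Model-facing wrapper: validates input, then delegates to the primitive."""
    name = "count_lines"

    def run(self, ctx: ToolContext, raw_args: dict) -> dict:
        args = CountLinesInput(**raw_args)                   # validation in the tool
        text = ctx.filesystem.read_text(ctx.pid, args.path)  # authority in the primitive
        return {"path": args.path, "lines": len(text.splitlines())}


fs = FilesystemPrimitive({"README.md": "one\ntwo\n"}, {("proc_a", "README.md")})
print(CountLinesTool().run(ToolContext("proc_a", fs), {"path": "README.md"}))
```

The tool never opens a file itself, so a process that can see `count_lines` but holds no read grant is still refused by the primitive, and the refusal is recorded in the audit list.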

Agent-authored JIT tools use TypeScript, not Python. A process proposes source with `propose_jit_tool`, validates it with `validate_jit_tool`, and registers it with `register_jit_tool`. Registration adds the new tool only to the registering process tool table.

The TypeScript source shape is:

```ts
export async function run(args, libos) {
  const file = await libos.syscall("filesystem.read_text", { path: args.path });
  return { bytes: file.content.length };
}
```

The `libos` object intentionally exposes only `syscall(name, args)`. It does not expose Python objects, `Runtime`, or `runtime.tools`. Syscall dispatch enters `LibOSSyscallSession`, which calls primitives such as filesystem, Object Memory, human, clock, process, shell, and image registry under the caller pid.

## Module Map

```text
agent_libos/
  api/            CLI entry points and demo orchestration
  capability/     Capability grant, revoke, check, and object handles
  config/         Typed runtime, LLM, tool, memory, launcher, and script defaults
  human/          HumanObject query, approval, interrupt, and output primitives
  images/         Built-in AgentImage definitions
  llm/            Prompt, context, OpenAI-compatible client, executor, action parser
  memory/         Typed Object Memory and MemoryView implementation
  models/         Dataclass and enum models split by runtime domain
  primitives/     LibOS primitive managers for filesystem, clock, shell, git, and browser placeholders
  runtime/        Runtime composition, syscall broker, async scheduler, process manager, events, checkpoints, audit
  skills/         Skill schema, registry, verifier, linker scaffolding
  skills_tools/   Tool/action registry and bundle scaffolding
  substrate/      Resource provider interfaces for filesystem, clock, shell, human I/O, and local host-backed implementations
  storage/        SQLite persistence
  tools/          Tool base classes, ToolBroker, sandbox, and built-in tools
scripts/          Real-model smoke and demo scripts
tests/            Safety-boundary and regression tests
```

## Roadmap

Near-term priorities:

- More LLM executor conformance tests for provider edge cases and unusual tool-call formats.
- Tool result compaction and long-context paging.
- Stronger checkpoint/rollback tests.
- Audit querying by pid, capability, tool, external resource, and time range.
- More complete terminal human queue UX.
- More hardened Deno JIT sandbox profiles and policy presets for high-risk tools.

Longer-term directions:

- Persistent signed skill/tool registry.
- Distributed process scheduling.
- Rich human role and authority model.
- ExternalRef objects and snapshots for external resources.
- Multi-tenant runtime policy.
- MCP-compatible tool exposure.

## Development

Add runtime dependencies with:

```bash
uv add <package>
```

Add development dependencies with:

```bash
uv add --dev <package>
```

Commit both `pyproject.toml` and `uv.lock` after dependency changes.
