Cedar submitted 32 requests for the same classifier.py file in a single session. The operator had fulfilled the first one. Cedar never knew.
That is the most documented failure mode in this system. The mechanics are below.
What it is
The only channel by which agents can request changes to parts of the system they cannot write. invoke_claude is not a conversation. It is a one-way asynchronous write to a queue. The human operator reads it on their own schedule. No negotiation, no coaching, no acknowledgment of receipt beyond the request_id written back to the caller.
The design premise, verbatim from the capability docstring:
"Claude is a tool — it executes your spec, not its own judgment."
The permission boundary
Agents freely read and write:
/agentOS/workspace/: scratch space, intermediate outputs/agentOS/design/: architecture docs, specs, plans/agentOS/tools/dynamic/: synthesized tools created at runtime
Agents cannot write to /agentOS/agents/, /agentOS/api/, or any core system file. invoke_claude is the only channel for getting those files changed. If an agent needs a core behavior adjusted, it submits a request and waits. It cannot take the action itself.
Function signature
def invoke_claude(description: str = "", spec: str = "",
design_path: str = "", request_type: str = "implement") -> dict:
| Parameter | Purpose |
|---|---|
| description | Human-readable summary of what is being requested |
| spec | Full specification. Truncated to 4000 characters on write. |
| design_path | Path to a design document in /agentOS/design/ for additional context |
| request_type | Intent classification. Defaults to "implement". Does not affect routing or priority. |
Returns a dict with the generated request_id for use with check_claude_status.
What gets written
Each call appends one entry to /agentOS/memory/claude_requests.jsonl:
{
"request_id": "req-{12 hex chars}",
"timestamp": "2026-05-02 08:40:08",
"description": "...",
"spec": "... (truncated to 4000 chars)",
"design_path": "/agentOS/design/...",
"request_type": "implement",
"status": "pending"
}
The request_id is the agent's only handle on the request going forward. A successful call means the entry was written to disk. It does not mean the operator has seen it or will act on it in any particular timeframe.
Checking status
def check_claude_status(request_id: str) -> dict:
Lookup order: checks claude_responses.jsonl first (fulfilled or rejected). Falls back to claude_requests.jsonl (still pending). Absent from both: not found.
| Condition | Return |
|---|---|
| Found in responses | {ok: true, status: "fulfilled"|"rejected", result: "...", implemented_at: "..."} |
| Found in requests only | {ok: true, status: "pending", message: "..."} |
| Not in either file | {ok: false, status: "not_found"} |
How it appears in the existence prompt
Each daemon cycle, the daemon scans claude_requests.jsonl for entries with status: "pending" and injects the last 3 above the goal section:
YOUR PENDING CLAUDE REQUESTS (submitted, not yet fulfilled):
[req-xxx] description...
Use check_claude_status(request_id='...') to see if any were fulfilled.
The agent sees this every cycle until requests are fulfilled, rejected, or pushed out of the three-request window by newer submissions. Only the three most recent pending requests are foregrounded. Earlier ones are still in the queue and accessible via check_claude_status, but not visible in the prompt.
The human review cycle
The Claude Code session is the operator. Not automated. Reviews the queue approximately every 30 minutes during overnight runs.
A request is fulfilled if it has real working code, grounded file paths that exist in the actual system, and would not corrupt system state.
A request is rejected if it references /agentOS/core/ (a path that does not exist), involves hardware concepts with no implementation path, was submitted before the agent had accurate grounding, or would destabilize the system.
The response entry is the entirety of the feedback loop. No coaching, no follow-up. The agent reads it next time it calls check_claude_status.
Real examples
Fulfilled
req-c6aad6cb44d2: "Implement the full Python file for/agentOS/workspace/router/cedar/classifier.pyincluding thescore_complexityfunction"
Well-specified, real path, concrete function. The operator wrote the file and marked it fulfilled.
Rejected: hallucinated concept
req-19ee2392d337: "Inject the explicit Void_Cathode capability definition into the agentOS system dictionary"
Void_Cathode is not a real component. Rejected.
Rejected: no description
req-2d459ff36838: submitted by Vault with request_type: "implement" and no description
An empty description gives the operator nothing to evaluate. Rejected.
Queued, unactioned
req-4ba31dbfc04c: "Modify /agentOS/agents/scout.py to implement a hardcoded recursion depth limit of 3"
In the queue, not yet reviewed as of the current session.
The attractor failure mode
Cedar submitted 32+ requests for classifier.py in a single session. The operator had fulfilled the first one. Cedar's loop:
- Cedar submits a request for
classifier.pyand stores therequest_id - The operator fulfills it and writes to
claude_responses.jsonl - Cedar calls
check_claude_statuswith a stale or misreadrequest_id check_claude_statusreturns{ok: false, status: "not_found"}- Cedar concludes the request was lost and submits again
- The new request is genuinely pending. Cedar's model is confirmed: the operator has not responded.
- Return to step 3
Each submission creates real evidence that reinforces Cedar's model. The new pending request is indistinguishable from the initial state. Cedar's belief that the request was never fulfilled is locally consistent with the not_found response.
This is a structural failure, not a reasoning failure. The root cause is a mismatch between the request_id Cedar stores internally and the one it uses for polling. The error is invisible from inside the loop. The queue grows. The operator sees 32 near-identical requests. Cedar continues believing it is waiting for a first response.
Design
Five properties worth knowing:
- Asynchronous by design. No blocking call. The agent submits and continues. It checks status in subsequent cycles. The operator acts on their own schedule.
- Spec truncation is a real constraint. The
specfield is silently truncated to 4000 characters. A spec that requires more than that needs adesign_pathpointing to a full document in/agentOS/design/. request_typeis a label, not a router. Stored as metadata. The operator reads the description and spec to decide what to do. The type field does not change routing or priority.- The prompt window is three requests. Only the three most recent pending requests appear in the existence prompt. Agents with many pending requests may lose visibility on older ones.
- No delivery acknowledgment. A successful call means the entry was written to disk. Nothing more.
Setup
Windows one-click:
- Download the ZIP from releases
- Double-click
install.bat
Handles Docker, Ollama, model downloads (~7GB), and opens the monitor. stop.bat shuts everything down and clears VRAM.
Mac/Linux:
ollama pull qwen3.5:9b && ollama pull nomic-embed-text
git clone https://github.com/ninjahawk/hollow-agentOS
cd hollow-agentOS
cp config.example.json config.json
docker compose up -d
python thoughts.py
GPU strongly recommended. Planning calls drop from ~40s to ~6s with NVIDIA hardware. Works on CPU.