Troubleshooting
If install.bat or python3 hollow.py failed, check this page first.
Installation failures
“Docker Compose failed: mounts denied”
Cause: Docker Desktop is sandboxing filesystem access and your project path isn’t in its allowlist.
Fix on Linux/macOS:
- Open Docker Desktop → Settings → Resources → File Sharing
- Click
+and add the directory containing yourhollow-agentOSclone - Click Apply & Restart
- Re-run setup
If your project is at /mnt/... or /media/...: Docker Desktop’s VM can’t see external drives on Linux. Move to ~/hollow-agentOS or switch to native Docker engine.
“Could not select device driver ‘nvidia’ with capabilities: [[gpu]]”
Cause: You have an NVIDIA GPU and NVIDIA drivers installed, but Docker can’t reach the GPU because nvidia-container-toolkit isn’t installed.
Fix:
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
The setup wizard auto-detects this and falls back to CPU-only mode, but you’ll lose significant speed. Install the toolkit and re-run.
“Docker Compose failed: yaml”
Cause: Older versions (pre-5.7.x) had a regex bug in the no-GPU fallback that produced invalid YAML. Fixed by switching to a line-based pass.
Fix: Update to v5.7.32 or later. If you’re already on latest and still seeing yaml errors, report it with the full error.
Setup hangs on “Pulling images”
Cause: Slow internet or GHCR is briefly down.
Fix: Wait it out (the images total ~1 GB), or docker compose pull manually to retry. If GHCR is down, you can build locally, the compose file falls back to build: blocks.
“Permission denied” on Linux
Cause: Docker daemon isn’t accessible to your user.
Fix: Add yourself to the docker group:
sudo usermod -aG docker $USER
newgrp docker # or log out and back in
Runtime issues
“Agents aren’t doing anything”
Check:
- Is the API alive?
curl http://localhost:7777/healthshould return JSON. - Is the model loaded?
curl http://localhost:11434/api/psshould show your model in themodelslist. - Is the daemon producing logs?
docker logs hollow-api --since 5m. You should see cycle entries every ~6 seconds.
If model isn’t loaded: Ollama may have evicted it under VRAM pressure. Try ollama run qwen3.6:35b-a3b "hi" to warm it up. If it errors, your VRAM is full from something else.
If daemon is silent for >5 minutes: it may be stuck. Restart with python3 hollow.py stop then python3 hollow.py.
“Cycle worker timeout” errors
Cause: A goal cycle exceeded the 1500s budget, usually means Ollama is slow or the agent picked an over-ambitious goal.
Behavior: Not fatal. The next cycle continues with the timed-out agent in “cooling” state for a few cycles, then it rejoins.
If it’s happening every cycle: Check Ollama health (/api/ps). If models are repeatedly being unloaded, you don’t have enough VRAM, switch to a smaller model in config.json.
“Validation hangs” / “VALIDATE_START with no END”
Cause: Specific failure mode where Ollama evicts the model mid-request despite keep_alive=-1. The httpx client doesn’t always surface this as a timeout.
Mitigation: v5.7.32 wraps validation Ollama calls in a wall-clock-bounded thread pool. If it still hangs, the cycle timeout (1500s) will recover.
“Agents keep abandoning the same goal”
Cause: Either the goal is genuinely impossible (e.g, targeting a file path that doesn’t exist) or the validation semantic check is too strict for the artifact produced.
What the substrate does: After 5 validation failures the goal is permanently abandoned and any partial artifacts are deleted. The agent picks a new goal next cycle.
If a specific goal text loops 5+ times: v5.7.32 has a same-text loop guard that rejects the model’s output and falls back to a different goal source (open question / peer file / journal). If you’re on an older version, update.
Operator panel issues
“Panel won’t open”
Windows: panel.bat requires Python 3.12+ on PATH and pywebview installed (pip install pywebview).
Mac/Linux: Run python3 panel.py directly. The error message will tell you what’s missing.
“Panel opens but everything is dimmed/disabled”
Cause: API isn’t reachable.
Fix: Check the API is running: curl http://localhost:7777/health. If it isn’t, start the stack: python3 hollow.py (or launch.bat).
When all else fails
Clean slate:
python3 hollow.py stop
docker compose down -v # this nukes container state
rm -rf memory/audit.log memory/goals/* memory/identity/* # nukes agent memory
python3 hollow.py # fresh start
This is destructive, you lose all agent history. But sometimes a baseline gets contaminated by a bad early run and the only way back is a reset.
The operator panel has a Nuke button that does this for you.
Still stuck?
Report it here, please include:
- Your OS, GPU model (or “none”), Docker version, Python version
- What you ran
- The full error (please.
2>&1 | tee error.logand paste the lot) - A screenshot for GUI errors