The instrument
blackbox-agent
A flight recorder and destructive-action gate for AI agents. Every tool call lands in a hash-chained, tamper-evident log; the dangerous ones stop for a human first. Open source, MIT, zero runtime dependencies. Every built-in rule traces back to a case file on this site.
npm install -g blackbox-agent
Two failures, one tool
Every incident we've investigated reduces to the same two missing pieces:
Nothing sat between the agent deciding and the action executing. blackbox-agent policy-checks every tool call in the execution path itself. Recursive deletes, DROP TABLE, volume/snapshot/backup deletion, force pushes, terraform destroy — matched calls require a human. Approvals are one-shot and bound to the exact call.
After the incident, nobody could prove what the agent actually did — histories get rewritten during recovery (case 001's 65 commits no longer exist anywhere). The black box writes every call, decision, and result as a hash chain: edit or delete any past event and bbx verify reports the exact broken link.
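To make the hash-chain idea concrete, here is a minimal sketch in Python. It is not blackbox-agent's actual log format or code: the field names (`prev`, `payload`, `hash`), the all-zero genesis value, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration. Each entry's hash covers the previous entry's hash, so changing or removing any past event breaks the link at that position.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first event


def event_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes to the same bytes.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> None:
    # Link the new entry to the hash of the entry before it.
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "prev": prev_hash,
        "payload": payload,
        "hash": event_hash(prev_hash, payload),
    })


def verify(log: list):
    """Return the index of the first broken link, or None if the chain is intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        expected = event_hash(prev_hash, entry["payload"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return i
        prev_hash = entry["hash"]
    return None


log = []
append(log, {"tool": "bash", "args": "ls"})
append(log, {"tool": "bash", "args": "rm -rf build"})
append(log, {"tool": "bash", "args": "git status"})
```

In this sketch, `verify(log)` returns `None` for the untouched log. Editing the payload of the second entry, or deleting that entry outright, makes it return `1`, the position of the first link that no longer checks out.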
Claude Code — 60 seconds
Add to .claude/settings.json:
{
  "hooks": {
    "PreToolUse": [{ "matcher": "*", "hooks": [{ "type": "command", "command": "blackbox-agent hook" }] }],
    "PostToolUse": [{ "matcher": "*", "hooks": [{ "type": "command", "command": "blackbox-agent hook" }] }]
  }
}
Safe calls flow through untouched. Destructive calls surface Claude Code's native approval prompt, with the rule that fired. Everything is recorded either way.
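The gate's behavior (safe calls pass, matched calls stop with the rule that fired, approvals are one-shot and bound to the exact call) can be sketched roughly as follows. Everything here is hypothetical: the rule ids, the regular expressions, the `Approvals` class, the fingerprint scheme, and the expiry parameter are illustrative assumptions, not blackbox-agent's real rule set or API.

```python
import hashlib
import json
import re
import time

# Hypothetical rules for illustration only; not the tool's built-in rule set.
RULES = [
    ("recursive-delete", re.compile(r"\brm\s+-\w*r", re.IGNORECASE)),
    ("drop-table", re.compile(r"\bdrop\s+table\b", re.IGNORECASE)),
    ("force-push", re.compile(r"\bgit\s+push\b.*(--force|\s-f\b)")),
    ("terraform-destroy", re.compile(r"\bterraform\s+destroy\b")),
]


def decide(command: str):
    """Return ("ask", rule_id) for the first matching rule, else ("allow", None)."""
    for rule_id, pattern in RULES:
        if pattern.search(command):
            return ("ask", rule_id)
    return ("allow", None)


def call_fingerprint(tool: str, command: str) -> str:
    # Assumed scheme: hash the exact call so an approval cannot be reused
    # for a different command.
    body = json.dumps({"tool": tool, "command": command}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()[:12]


class Approvals:
    """One-shot approvals keyed by call fingerprint, with an expiry window."""

    def __init__(self, ttl_seconds: float = 600.0):
        self.ttl = ttl_seconds
        self._granted = {}  # fingerprint -> expiry timestamp

    def approve(self, fingerprint: str, now: float = None) -> None:
        now = time.time() if now is None else now
        self._granted[fingerprint] = now + self.ttl

    def consume(self, fingerprint: str, now: float = None) -> bool:
        now = time.time() if now is None else now
        # pop() removes the approval whether or not it is still valid,
        # so each approval can be used at most once.
        expiry = self._granted.pop(fingerprint, None)
        return expiry is not None and now <= expiry
```

Binding the approval to a fingerprint of the whole call is the design point: approving `rm -rf build` does not approve `rm -rf /`, and a second attempt at the same call needs a fresh approval.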
Any MCP server — wrap it
{
  "mcpServers": {
    "database": {
      "command": "blackbox-agent",
      "args": ["wrap", "--", "npx", "@example/postgres-mcp"]
    }
  }
}
A blocked call returns a tool error the model can read:
[blackbox:drop-database] ASK: destroys stored data.
A human can approve exactly this call by running:
bbx approve 2c1a163a9deb
— then retry the identical call within 10 minutes.
Honest limits
- The chain proves integrity, not authorship — an attacker with filesystem access can regenerate the whole log. Defends against silent edits, not root.
- If the recorder can't write, the gate escalates to ask — an unrecorded destructive action is the exact thing this exists to prevent.
- For guarantees that survive the box itself — anchored head hashes, retention, team-wide evidence, compliance exports — a hosted vault is in the works. The recorder stays free and open.
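The second limit above, escalating to ask when the recorder can't write, is a fail-closed pattern. A minimal sketch, assuming a hypothetical `record` callable that raises `OSError` when the log cannot be written (the function names are illustrative, not the tool's internals):

```python
def gate(decision: str, record) -> str:
    """Fail closed: if the event cannot be recorded, never return a silent "allow"."""
    try:
        record(decision)
    except OSError:
        # No record could be written, so require a human rather than proceed unlogged.
        return "ask"
    return decision


def working_recorder(decision: str) -> None:
    # Stand-in for a successful log write.
    pass


def broken_recorder(decision: str) -> None:
    # Stand-in for a failed log write, e.g. a full disk or a read-only path.
    raise OSError("cannot write log")
```

With `working_recorder`, `gate("allow", ...)` returns `"allow"`. With `broken_recorder`, the same call returns `"ask"`.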
Read what it was built from
The rules aren't hypothetical. Start with the night that taught us hole-by-hole how four defenses fail at once.
Case file 001
All incidents