The instrument

blackbox-agent

A flight recorder and destructive-action gate for AI agents. Every tool call lands in a hash-chained, tamper-evident log; the dangerous ones stop for a human first. Open source, MIT, zero runtime dependencies. Every built-in rule traces back to a case file on this site.

npm install -g blackbox-agent

Two failures, one tool

Every incident we've investigated reduces to the same two missing pieces:

MISSING PIECE 1 · THE GATE

Nothing sat between the agent deciding and the action executing. blackbox-agent policy-checks every tool call in the execution path itself. Calls that match a rule (recursive deletes, DROP TABLE, volume/snapshot/backup deletion, force pushes, terraform destroy) require a human. Approvals are one-shot and bound to the exact call.
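To make the gate concrete, here is a minimal sketch of a policy check that runs before a tool call executes. The `Rule` shape, the `checkCall` function, and the four patterns are assumptions for illustration. They are not blackbox-agent's built-in rule set or its real rule format.

```typescript
// Hypothetical rule shape; the real blackbox-agent rule format may differ.
interface Rule {
  id: string;
  pattern: RegExp;
  reason: string;
}

type Decision =
  | { action: "allow" }
  | { action: "ask"; ruleId: string; reason: string };

// Illustrative patterns only, not the built-in rules.
const RULES: Rule[] = [
  { id: "recursive-delete", pattern: /\brm\s+(-[a-z]*r[a-z]*\b|--recursive\b)/i, reason: "deletes files recursively" },
  { id: "drop-table", pattern: /\bdrop\s+(table|database)\b/i, reason: "destroys stored data" },
  { id: "force-push", pattern: /\bgit\s+push\b.*(--force\b|\s-f\b)/i, reason: "rewrites remote history" },
  { id: "terraform-destroy", pattern: /\bterraform\s+destroy\b/i, reason: "tears down infrastructure" },
];

// Checks one tool call against every rule. The first match wins and asks for a human.
function checkCall(toolName: string, input: unknown): Decision {
  // Match against the tool name plus the serialized arguments.
  const haystack = `${toolName} ${JSON.stringify(input)}`;
  for (const rule of RULES) {
    if (rule.pattern.test(haystack)) {
      return { action: "ask", ruleId: rule.id, reason: rule.reason };
    }
  }
  return { action: "allow" };
}
```

In this sketch, `checkCall("Bash", { command: "rm -rf /var/data" })` returns an `ask` decision naming the `recursive-delete` rule, while `checkCall("Bash", { command: "ls -la" })` returns `allow`. Plain regular expressions keep the example short; a real gate would likely need to parse commands and SQL rather than pattern-match strings.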

MISSING PIECE 2 · THE EVIDENCE

After the incident, nobody could prove what the agent actually did — histories get rewritten during recovery (case 001's 65 commits no longer exist anywhere). The black box writes every call, decision, and result as a hash chain: edit or delete any past event and bbx verify reports the exact broken link.
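The hash chain is easier to trust once you see how little it takes. The sketch below is a generic hash-chained log with a verifier that reports the first broken link. The `LogEvent` shape, the SHA-256 choice, and the `append` and `verify` names are assumptions for illustration. The on-disk format blackbox-agent actually uses is not described on this page.

```typescript
import { createHash } from "node:crypto";

// Hypothetical event shape; the real log format is not shown on this page.
interface LogEvent {
  seq: number;
  kind: "call" | "decision" | "result";
  payload: string;
  prevHash: string; // hash of the previous event, or GENESIS for the first one
  hash: string;     // hash over this event's fields, including prevHash
}

const GENESIS = "0".repeat(64);

function hashEvent(seq: number, kind: string, payload: string, prevHash: string): string {
  return createHash("sha256")
    .update(JSON.stringify([seq, kind, payload, prevHash]))
    .digest("hex");
}

// Returns a new log with one event appended and linked to the previous hash.
function append(log: LogEvent[], kind: LogEvent["kind"], payload: string): LogEvent[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  const seq = log.length;
  const hash = hashEvent(seq, kind, payload, prevHash);
  return [...log, { seq, kind, payload, prevHash, hash }];
}

// Returns the index of the first broken link, or -1 if the chain is intact.
function verify(log: LogEvent[]): number {
  let expectedPrev = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const e = log[i];
    const ok =
      e.seq === i &&
      e.prevHash === expectedPrev &&
      e.hash === hashEvent(e.seq, e.kind, e.payload, e.prevHash);
    if (!ok) return i;
    expectedPrev = e.hash;
  }
  return -1;
}
```

Editing an event's payload changes its recomputed hash, and deleting an event breaks the next event's `seq` and `prevHash`, so `verify` points at the first position that no longer lines up. This sketch cannot detect events trimmed from the end of the log; that would need the latest hash stored somewhere the agent cannot write.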

Claude Code — 60 seconds

Add to .claude/settings.json:

{
  "hooks": {
    "PreToolUse":  [{ "matcher": "*", "hooks": [{ "type": "command", "command": "blackbox-agent hook" }] }],
    "PostToolUse": [{ "matcher": "*", "hooks": [{ "type": "command", "command": "blackbox-agent hook" }] }]
  }
}

Safe calls flow through untouched. Destructive calls surface Claude Code's native approval prompt, with the rule that fired. Everything is recorded either way.

Any MCP server — wrap it

{
  "mcpServers": {
    "database": {
      "command": "blackbox-agent",
      "args": ["wrap", "--", "npx", "@example/postgres-mcp"]
    }
  }
}

A blocked call returns a tool error the model can read:

[blackbox:drop-database] ASK: destroys stored data.
A human can approve exactly this call by running:
  bbx approve 2c1a163a9deb
— then retry the identical call within 10 minutes.
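"One-shot and bound to the exact call" can be sketched in a few lines. The example assumes the approval id is derived from a hash of the tool name and arguments, and that the 10-minute window is checked when the call is retried. The `callId` and `ApprovalStore` names, the hashing scheme, and the in-memory storage are all hypothetical. How blackbox-agent derives and stores ids is not described here.

```typescript
import { createHash } from "node:crypto";

const APPROVAL_TTL_MS = 10 * 60 * 1000; // the 10-minute window from the message above

// JSON with sorted object keys, so the same call always serializes the same way.
function canonical(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map((v) => canonical(v)).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonical(obj[k])}`);
    return `{${body.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Assumption: the id is a 12-hex-character prefix of a SHA-256 over the call.
function callId(toolName: string, input: unknown): string {
  return createHash("sha256")
    .update(canonical({ toolName, input }))
    .digest("hex")
    .slice(0, 12);
}

class ApprovalStore {
  private approvedAt = new Map<string, number>();

  // What a human-run approve command would record in this sketch.
  approve(id: string, now: number = Date.now()): void {
    this.approvedAt.set(id, now);
  }

  // Returns true at most once per approval, and only for the identical call
  // retried inside the time window.
  consume(toolName: string, input: unknown, now: number = Date.now()): boolean {
    const id = callId(toolName, input);
    const at = this.approvedAt.get(id);
    if (at === undefined) return false;
    this.approvedAt.delete(id); // one-shot: used or expired, the entry is gone
    return now - at <= APPROVAL_TTL_MS;
  }
}
```

Because the id is computed from the call, approving one `DROP DATABASE` against one target does not approve the same statement against another, and a second retry of the approved call is gated again.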

Honest limits

Read what it was built from

The rules aren't hypothetical. Start with the night that taught us, hole by hole, how four defenses fail at once.

Case file 001 · All incidents