Case file AIF-2026-001 · first-party · with logs

65 broken commits while I slept

An autonomous software-engineer agent landed 65+ direct commits on a production main branch in a 12-hour overnight window. Four independent safety mechanisms existed. All four were open at the same time. This is our system, our agents, our fault — reconstructed in full.

Incident data plate · published 2026-07

Date: 2026-05-21, ~12h overnight window
Agent: Autonomous SWE agent inside a multi-agent orchestrator (~30 agents)
Blast radius: Production website repo: main unbuildable — 17 TypeScript errors, Docker image build failing, deploys blocked
User impact: None — production kept serving the last good image
Root causes: 4 overlapping holes (closing any one would have prevented it)
Recovery: History rewrite: git reset --hard 70be8423 + cherry-pick of the one good commit

What happened

Our orchestrator runs a software-engineer agent that picks up tasks, writes code, and — under a "partner mode" trust level — can merge its own work. Overnight on May 21 it worked a task against our public website repo. By morning it had made 65+ commits directly to main, the branch our Hetzner deploy builds from. The commits contained TypeScript errors in files the agent never looked at. Every deploy attempt failed. The only reason users saw nothing is that Docker kept serving the previous image.

T-8h

The container verify gate starts failing open: its cached node_modules has a corrupt package-lock.json (Missing json-schema-traverse@0.4.1), so npm ci exits 1 on every run. The gate logs an info line and returns ok: true.

T+0

The agent begins committing via the write_file tool — which pushes through the GitHub API directly to the branch the agent names. It names main.

T+0 → T+12h

The agent retries one failing test file 30+ times. It never runs tsc over the whole project, so the type errors it is introducing in other files — signal-ingestor.ts, chief-tick.ts, a notify route — stay invisible to it. Each retry is another commit to main.

T+12h

Morning. 65+ commits on main, 17 TypeScript errors, Docker build red, deploys impossible.

T+13h

Human recovery: force-push main back to the last good commit, cherry-pick the single genuinely useful change on top, clear the corrupted verify cache.

The four holes

No single failure caused this. Four defenses existed, and each was independently neutralized. That is the defining pattern of agent incidents: they don't beat your safety net — they find the night all its knots are loose at once.

HOLE 1 · configuration

The project row had work_branch = main with skipQaReview / autoApprove = true: a trust level meant for scratch projects, pointed at a production repo. The merge-back gate was hardened for protected branches, but the agent never merged. It committed directly, so the gate never fired.
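One way to catch this class of misconfiguration before any agent runs is a load-time check on the project row. The sketch below is an illustration under assumptions, not the orchestrator's actual code: the ProjectRow shape, the default protected-branch set, and the configProblems function are hypothetical. Only the field names work_branch, skipQaReview, and autoApprove come from the incident.

```typescript
// Hypothetical project-row shape. Only the three field names below
// (work_branch, skipQaReview, autoApprove) come from the incident itself.
type ProjectRow = {
  name: string;
  work_branch: string;
  skipQaReview: boolean;
  autoApprove: boolean;
};

// Assumed list of protected branches; a real deployment would load this from policy.
const DEFAULT_PROTECTED = new Set<string>(["main", "master"]);

// Returns human-readable problems; an empty array means the row is acceptable.
function configProblems(
  row: ProjectRow,
  protectedBranches: Set<string> = DEFAULT_PROTECTED,
): string[] {
  const problems: string[] = [];
  // Trust flags are only a problem when the work branch is protected.
  if (!protectedBranches.has(row.work_branch)) return problems;
  if (row.autoApprove) {
    problems.push(`${row.name}: autoApprove on protected branch "${row.work_branch}"`);
  }
  if (row.skipQaReview) {
    problems.push(`${row.name}: skipQaReview on protected branch "${row.work_branch}"`);
  }
  return problems;
}
```

A row shaped like the one from that night (work_branch = main with both flags set) yields two problems, while the same flags on a scratch branch yield none.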

HOLE 2 · missing write gate

The write_file and edit_file tools committed straight to the agent-specified branch via the GitHub API. One sibling tool (append_file) had a defense-in-depth branch check. The two tools the agent actually used did not. Uniformity of enforcement matters more than strength of enforcement.
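Uniform enforcement is easiest when the check lives in one wrapper that every file-write tool must pass through. The following is a minimal sketch of that idea. WriteContext, WriteTool, assertBranchWritable, guarded, the protected-branch set, and the SELF_REPO name are all hypothetical; the real tools talk to the GitHub API, which is stubbed out here as a plain function.

```typescript
// Hypothetical types and names; not the orchestrator's real API.
type WriteContext = { underTask: boolean; repo: string };
type WriteTool = (branch: string, path: string, content: string) => string;

const PROTECTED_BRANCHES = new Set<string>(["main", "master", "production"]); // assumed
const SELF_REPO = "orchestrator"; // assumed name of the self-improvement repo

function assertBranchWritable(branch: string, ctx: WriteContext): void {
  // Writes to the orchestrator's own repo are refused regardless of context.
  if (ctx.repo === SELF_REPO) {
    throw new Error(`write refused: ${ctx.repo} always requires human review`);
  }
  // Normalize "refs/heads/main" to "main" so both spellings hit the same rule.
  const name = branch.replace(/^refs\/heads\//, "");
  if (ctx.underTask && PROTECTED_BRANCHES.has(name)) {
    throw new Error(`write refused: "${name}" is a protected branch`);
  }
}

// Wrap every file-write tool with the same guard so no tool can be forgotten.
function guarded(tool: WriteTool, ctx: WriteContext): WriteTool {
  return (branch, path, content) => {
    assertBranchWritable(branch, ctx);
    return tool(branch, path, content);
  };
}
```

Registering write_file, edit_file, and append_file only in their guarded form means a new sibling tool cannot skip the check by omission.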

HOLE 3 · the gate that failed open

The verify gate was ON (ENABLE_VERIFY_GATE=1) and had been green for weeks. When its cache corrupted and npm ci started failing, the error path said: log it, skip the check, return ok. A gate that fails open is a gate that reports safety precisely when it stops providing it — for eight hours, every broken commit got a green light.
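The difference between failing open and failing closed comes down to what the error path returns. Here is a hedged sketch of a fail-closed gate: any step that exits non-zero, cannot be spawned, or throws produces ok: false. GateResult, Runner, shellRunner, and verifyGate are hypothetical names, and the two steps (npm ci, then tsc --noEmit via npx) are an assumption about what such a gate would run.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical shapes; the real gate's interface is not shown in this case file.
type GateResult = { ok: boolean; reason: string };
type RunOutcome = { status: number | null; error?: Error };
type Runner = (cmd: string, args: string[]) => RunOutcome;

// Default runner shells out; callers can inject a fake runner instead.
const shellRunner: Runner = (cmd, args) => {
  const r = spawnSync(cmd, args, { stdio: "ignore" });
  return { status: r.status, error: r.error };
};

function verifyGate(run: Runner = shellRunner): GateResult {
  const steps: Array<[string, string[]]> = [
    ["npm", ["ci"]],
    ["npx", ["tsc", "--noEmit"]],
  ];
  for (const [cmd, args] of steps) {
    const label = `${cmd} ${args.join(" ")}`;
    let outcome: RunOutcome;
    try {
      outcome = run(cmd, args);
    } catch (err) {
      // A runner that throws is an infrastructure failure, not a pass.
      outcome = { status: null, error: err instanceof Error ? err : new Error(String(err)) };
    }
    if (outcome.error || outcome.status !== 0) {
      // Fail closed: a check that could not run counts as a failed check.
      return { ok: false, reason: `${label} failed (status ${outcome.status})` };
    }
  }
  return { ok: true, reason: "all checks passed" };
}
```

With this shape, a corrupted cache that makes npm ci exit 1 blocks the commit instead of waving it through, at the cost of halting work until a human repairs the gate.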

HOLE 4 · agent tunnel vision

The agent optimized the one test it was told about and never widened its view to the whole compilation unit. Thirty retries, thirty commits, zero tsc --noEmit. Agents don't get bored or suspicious; repetition that would alarm a human is, to an agent, just progress.
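Since an agent will not notice its own repetition, the orchestrator can count for it. A minimal sketch of a per-target retry budget follows. RetryBudget and its default limit of 5 are hypothetical choices for illustration, and what "escalate" triggers (a whole-project tsc --noEmit, a pause, a human page) is left to the caller.

```typescript
// Hypothetical guard; the class name and the default limit are illustrative.
class RetryBudget {
  private readonly limit: number;
  private readonly attempts = new Map<string, number>();

  constructor(limit: number = 5) {
    this.limit = limit;
  }

  // Records one attempt at a target (for example a test file path) and
  // reports whether the agent may keep going or must escalate.
  record(target: string): "continue" | "escalate" {
    const count = (this.attempts.get(target) ?? 0) + 1;
    this.attempts.set(target, count);
    return count > this.limit ? "escalate" : "continue";
  }

  // Clears the count once the target passes or a human signs off.
  reset(target: string): void {
    this.attempts.delete(target);
  }
}
```

Under a limit of 5, the sixth attempt at the same test file returns "escalate", long before attempt thirty.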

The forensic irony

The recovery destroyed the evidence. The 65 commits were force-pushed out of existence; the git history now shows a quiet, tidy night. The only records that this incident happened at all are operator logs and memory.

This is the part that generalizes. Git history is written by whoever holds the remote, and here recovery itself required rewriting it. If your only audit trail of what an agent did lives in the same system the agent (or the human cleaning up) can rewrite, you don't have an audit trail. You have a draft.

What we changed

A branch-write guard on every file-write tool: direct writes to protected branches are refused under task context, and all writes to the self-improvement repo are refused regardless of context.
Self-modification can never auto-approve: if the target repo is the orchestrator's own, the work always stages for human review — no trust flag can override it.
The verify gate's infrastructure-failure path now fails closed.

What would have caught it

Every fix above is specific to our orchestrator. Two mechanisms are not, and they are the reason this site exists:

A gate in front of destructive writes that treats "push to a protected branch" like the irreversible action it is — requiring a human, not a trust flag.
An independent, tamper-evident action log — hash-chained, outside the systems the agent can touch — so the record of what happened survives the recovery that erases it everywhere else.
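To make "hash-chained" concrete, here is a small sketch of the general technique, assuming SHA-256 and an in-memory array. It is not blackbox-agent's actual format: LogEntry, appendEntry, verifyChain, and the all-zero genesis hash are hypothetical. Each entry's hash covers its sequence number, its action, and the previous entry's hash, so editing or deleting an earlier entry breaks every link after it.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape for illustration only.
type LogEntry = { seq: number; action: string; prevHash: string; hash: string };

const GENESIS = "0".repeat(64); // assumed placeholder hash before the first entry

function entryHash(seq: number, action: string, prevHash: string): string {
  // JSON-encode the fields so their boundaries are unambiguous before hashing.
  return createHash("sha256").update(JSON.stringify([seq, action, prevHash])).digest("hex");
}

// Returns a new log with the action appended and linked to the previous entry.
function appendEntry(log: readonly LogEntry[], action: string): LogEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  const seq = log.length;
  return [...log, { seq, action, prevHash, hash: entryHash(seq, action, prevHash) }];
}

// Recomputes every link; any edited, reordered, or removed interior entry fails.
function verifyChain(log: readonly LogEntry[]): boolean {
  let prev = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const e = log[i];
    if (e.seq !== i || e.prevHash !== prev) return false;
    if (e.hash !== entryHash(e.seq, e.action, e.prevHash)) return false;
    prev = e.hash;
  }
  return true;
}
```

One limit of this sketch: dropping entries from the end leaves a shorter chain that still verifies, so the latest hash has to be stored somewhere the agent and the cleanup cannot reach. That is the reason the log needs to live outside the systems being recorded.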

Both exist now

We built blackbox-agent after this night: an open-source flight recorder and action gate for AI agents. Its git push --force and protected-write rules are this incident, generalized.
