The Twilio Incident

February 28, 2026

A code bug triggered a runaway SMS process through Twilio. In five days:

Total messages sent:     1,039,939
Failed:                    528,000
Undelivered:               394,000
Actually delivered:        117,000
Still queued when caught:   33,900

The code had a loop that was supposed to send follow-up SMS messages to leads who hadn't responded. The loop didn't have a proper termination condition. It also didn't check whether a message had already been sent. It ran against the production database.
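The two missing guards are easy to state in code. A minimal sketch of the loop with both guards in place (the names `MAX_FOLLOWUPS`, `send_sms`, and `sent_log` are hypothetical, not from the actual codebase):

```python
# Hypothetical sketch of the follow-up loop WITH the two guards the
# original code lacked: a hard termination condition and an
# already-sent check. Not the actual production code.

MAX_FOLLOWUPS = 3  # hard cap: the loop cannot run away

def send_followups(leads, send_sms, sent_log):
    """Send at most MAX_FOLLOWUPS messages per lead, never twice.

    leads:    iterable of (lead_id, phone) for non-responders
    send_sms: callable that actually sends one message
    sent_log: set of (lead_id, attempt) keys already sent (idempotency)
    """
    sent = 0
    for lead_id, phone in leads:
        for attempt in range(1, MAX_FOLLOWUPS + 1):  # guard 1: bounded
            key = (lead_id, attempt)
            if key in sent_log:                      # guard 2: never re-send
                continue
            send_sms(phone, f"Follow-up #{attempt}")
            sent_log.add(key)
            sent += 1
    return sent
```

Run it twice with the same `sent_log` and the second pass sends nothing, which is exactly the property the runaway loop lacked.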

Over a million messages went out in hours, not days. The database buckled under the write load; the connection pool was overwhelmed. Chase flagged the charges as fraud: first a $100 decline, then $200. But Twilio's queue doesn't stop when your credit card declines. The messages kept queuing and retrying. 33,900 were still sitting in the pipeline when we finally killed the process.

Leads who had opted out received messages. Leads who had already converted received re-engagement texts. Some leads got the same message dozens of times. The TCPA exposure — sending unsolicited messages to people who had explicitly opted out — is the kind of thing that generates class action lawsuits.

The engineering team's response: "The deploy didn't land in production."

It had. The evidence was in a million message logs.

The Incident Timeline

Day 0:  Bad merge → runaway loop
        1M+ messages sent
        DB overwhelmed, connections maxed
        Chase fraud alert, cards declined
Day 5:  Kill process

What I Built After — Three Layers of Defense

Layer 1: Code-Level Hooks
         pre-bash-safeguard blocks dangerous commands.
         Weakest: the same system that caused the bug.

Layer 2: Infrastructure Limits
         Twilio daily caps, Postgres statement timeouts.
         Strong: code can't override infrastructure.

Layer 3: External Watchdog
         Standalone script, separate Twilio number.
         Strongest: fully independent of the app.

The real lesson: hooks are instructions to the AI. The AI is the thing that caused the disaster. You're asking the thing that broke your system to follow rules about not breaking your system. The wall isn't in the code. It's in the infrastructure around the code. Twilio caps on Twilio's side. Postgres limits on Postgres's side. A watchdog that runs independently of everything else.
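On the Postgres side, those limits live in the database itself, where no application bug can reach them. A sketch of the kind of settings involved (the role name `app_user` and the specific values are assumptions, not the production config):

```sql
-- Hard limits enforced by Postgres itself, not by application code.
-- Role name "app_user" and values are hypothetical.

-- Kill any statement that runs longer than 5 seconds for this role.
ALTER ROLE app_user SET statement_timeout = '5s';

-- Cap how many simultaneous connections the app can hold open.
ALTER ROLE app_user CONNECTION LIMIT 20;
```

With a connection limit in place, a runaway write loop exhausts its own connection budget instead of starving everything else on the database.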

Twilio offered $5,119 back — 75% of the charges. I took it immediately.

That was the moment I stopped trusting humans with production systems. Not because humans are bad at engineering. But because the cost of a mistake in a system processing 30,000 conversations a day is catastrophic, and humans make mistakes at a rate that's incompatible with that scale.

The engineers who manually intervened to kill the queue — the same team I was about to fire — did solid crisis response. They canceled the 33,900 queued messages via the Twilio API. I was grateful. I told them so. They went right back to doing nothing the next week. They fixed the acute crisis but never built anything to prevent the next one. That pattern — heroic firefighting, zero prevention — is why the team no longer exists.

Everything I built after traces back to this incident:

# From bash-guard.sh — the PreToolUse hook born from this incident
#
# THREE jobs, each deterministic:
#   1. BLOCK destructive commands (SQL, kubectl mutations, rm -rf)
#   2. DETECT deploy commands (set IS_DEPLOY flag)
#   3. GATE git push → require fresh-eyes + syntax check
#
# Exit codes: 0 = allow, 2 = BLOCK
# This hook NEVER prints instructions for Claude to follow.
# It either BLOCKS or ALLOWS. That's it.
#
# STATE PHILOSOPHY (Mar 22, 2026):
#   All checks are STATELESS or session-scoped.
#   No persistent marker files.

But the hook is the seatbelt. The infrastructure limits are the crash barriers on the highway. Twilio's daily send cap, set in the Twilio console, is something no code bug can override. Postgres connection limits and statement timeouts mean the database says no when the application goes insane. The external watchdog — a standalone Python script on the Mac Studio in its own tmux session, querying the database every five minutes via a completely separate Twilio number — texts my personal phone if anything looks wrong.
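The watchdog's core loop is deliberately boring. A minimal sketch under stated assumptions: the production version queries Postgres and texts via a separate Twilio number, while here `sqlite3` stands in for the database and `alert` is any callable (in production, a Twilio send wrapped in a function). The threshold, table, and column names are hypothetical:

```python
import sqlite3
import time

# Hypothetical watchdog sketch. Thresholds and schema are assumptions;
# sqlite3 stands in for the production Postgres database.

HOURLY_LIMIT = 500          # max messages/hour before alerting
CHECK_INTERVAL_SECS = 300   # query every five minutes

def check_once(conn, alert, limit=HOURLY_LIMIT):
    """One watchdog cycle: count recent sends, alert if over the limit."""
    row = conn.execute(
        "SELECT COUNT(*) FROM messages "
        "WHERE sent_at > datetime('now', '-1 hour')"
    ).fetchone()
    recent = row[0]
    if recent > limit:
        alert(f"WATCHDOG: {recent} messages in the last hour (limit {limit})")
        return False
    return True

def run(conn, alert):
    # Runs forever in its own tmux session, independent of the app.
    while True:
        check_once(conn, alert)
        time.sleep(CHECK_INTERVAL_SECS)
```

The point is what the sketch does not contain: no shared code with the app, no shared credentials, nothing the app (or the AI that edits the app) can touch.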

The AI that builds my code never sees the watchdog script. Never edits it. Never deploys it. It exists in a different tmux session, on a different code path, using different credentials. Defense in depth, where each layer is physically separate from the others.

Every safety mechanism in my system is a monument to this incident. Every hook is a scar. The CLAUDE.md database safety section, the bottom-up analysis law, the fresh-eyes review gate, the pre-push check — all of it starts here, in a million text messages and a database on its knees.