$ cat choices/observability.md

Observability

the call

I treat observability as the thing that makes verification possible: you can’t fix, or trust, what you can’t see. The bar isn’t “the tests are green”; it’s “I can watch this work the way a real user hits it, and prove it’s right.” When generation is free, the signal is the job.

why

When you ship constantly and generation is cheap, the scarce skill is knowing what’s actually true in production, and observability is how you know. Real metrics, traces, error rates, and real-user signals are what turn “we deployed” into “we deployed and it works.” Observability gates a canary, tells you when a clean-looking diff is quietly wrong, and answers the question tests can’t: not “did the code do what we said,” but “is the system doing what the user needs.” Logs tell you what you thought to log. Observability lets you ask the questions you didn’t know you’d need to ask.

The tools I reach for: OpenTelemetry for instrumentation, vendor-neutral on purpose so the data is yours and the backend stays swappable instead of locked to one bill. Sentry for errors, and SigNoz when I want an OTel-native platform I can run and own outright. Same instinct as everywhere else on this list: instrument through an open standard, and own the layer that matters.

when I don’t

This isn’t a mandate to hoard dashboards. A wall of graphs nobody reads is theater, and vanity metrics are worse than no metrics because they manufacture false confidence. I instrument what answers a real question (is this flow completing? is the error rate up on this release?), not everything that can be counted. And observability has a real cost in cardinality, storage, and vendor bills, so I instrument with intent, not reflex.

in production

The sharpest use is as the gate: a release advances only when the real signals (error rate, latency, flow-completion) hold green for a bake window, and it rolls back automatically the moment they don’t. That’s observability doing the verifying a human can’t do at fleet speed. The signal, not someone’s confidence, decides what ships. It’s the other half of robust CI: CI proves the code was right before merge; observability proves the system is right after deploy. As the Volume essay puts it, knowing what’s true is the whole job now.

see: choices / robust-ci · writing / Volume Is Free Now
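
A minimal sketch of that gate, to make the shape concrete. Everything named here is an assumption for illustration: the `Sample` fields, the threshold values, and the `gate` function are hypothetical, not taken from any particular deploy tool. It rolls back on the first red sample and advances only once a full bake window of green samples has accumulated.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ADVANCE = "advance"
    HOLD = "hold"
    ROLLBACK = "rollback"


@dataclass(frozen=True)
class Sample:
    """One reading of the canary's real signals (hypothetical fields)."""
    t: float                # seconds since the canary started taking traffic
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float   # 95th-percentile latency in milliseconds
    flow_completion: float  # fraction of users finishing the key flow


@dataclass(frozen=True)
class Thresholds:
    # Illustrative defaults only; real limits come from your own baselines.
    max_error_rate: float = 0.01
    max_p95_latency_ms: float = 400.0
    min_flow_completion: float = 0.95


def is_green(s: Sample, th: Thresholds) -> bool:
    """A sample is green only if every signal is within its limit."""
    return (
        s.error_rate <= th.max_error_rate
        and s.p95_latency_ms <= th.max_p95_latency_ms
        and s.flow_completion >= th.min_flow_completion
    )


def gate(samples: list[Sample], th: Thresholds, bake_seconds: float) -> Decision:
    """Roll back on any red sample; advance only after a full green bake."""
    if not samples:
        return Decision.HOLD  # no signal yet, so no decision
    if any(not is_green(s, th) for s in samples):
        return Decision.ROLLBACK
    covered = samples[-1].t - samples[0].t
    return Decision.ADVANCE if covered >= bake_seconds else Decision.HOLD


if __name__ == "__main__":
    th = Thresholds()
    # Ten minutes of healthy readings, one per minute.
    healthy = [Sample(float(t), 0.002, 180.0, 0.98) for t in range(0, 601, 60)]
    print(gate(healthy, th, bake_seconds=600))
    # The same release with an error spike at minute three.
    spiked = healthy[:3] + [Sample(180.0, 0.05, 180.0, 0.98)]
    print(gate(spiked, th, bake_seconds=600))
```

The design choice worth noticing is that "not enough data yet" is its own answer (`HOLD`), separate from green and red. A gate that treats silence as green is the confident-wrong failure in code form.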

the principle under it

Verification is the core competency now, and observability is how verification scales past what one human can eyeball. Measure what’s true, gate on it, and let the signal decide. “It looked fine” and “the tests passed” are exactly the confidence that ships the confident-wrong failure. The teams that win the agent era won’t be the ones generating the most; they’ll be the ones who can prove what’s working.

the gaps — what it costs even when it’s right

Dashboards-as-theater. A graph nobody acts on is cost with no return, and a roomful of them creates the illusion of control. Signal you don’t gate on or alert from is just decoration.

It isn’t free. Cardinality, retention, and vendor bills scale with how much you instrument. Measure everything and the observability bill becomes its own line item to defend.
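
The cardinality part is easy to underestimate, so here is the arithmetic as a small sketch. It assumes the worst case, where every combination of label values actually occurs; the label names and counts are made up for illustration, and `series_count` is a hypothetical helper, not part of any metrics library.

```python
from math import prod


def series_count(label_values: dict[str, int]) -> int:
    """Worst-case time series for one metric: the product of the
    number of distinct values each label can take."""
    return prod(label_values.values())


# A request counter labeled by route, status code, and region.
base = {"route": 40, "status_code": 8, "region": 5}
print(series_count(base))       # 1600

# The same counter with one unbounded label added.
with_user = {**base, "user_id": 50_000}
print(series_count(with_user))  # 80000000
```

One extra label turned 1,600 series into 80 million. That is what instrumenting by reflex looks like on the bill.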

It tells you what, not why. A red signal points at the symptom; a human still has to diagnose the cause. And too many alerts are the same as none: alert fatigue trains the team to ignore the one that mattered.

used at

choices / robust-ci · choices / the-agent-fleet · writing / Volume Is Free Now