/builds/vibe-check
// damage meters for AI-assisted development
In WoW, I ran damage meters obsessively: DPS, interrupts, and deaths for every M+ run. You can't improve what you can't measure.
vibe-check does the same thing for AI-assisted coding sessions. It analyzes git history and tells you if that session was actually productive or just busy.
// the 5 core metrics
How tight are feedback loops?
Building or debugging?
Does code stick?
How long stuck?
What % productive?
Trust Pass Rate is THE key metric—it measures whether you vibed at the right level.
npx @boshu2/vibe-check// why this matters
AI reliability varies by task type. Formatting is nearly always correct. Architecture needs line-by-line verification. The vibe levels answer the question: when can you trust AI output, and when do you need to verify every line?
Declaring the level upfront forces you to think about what kind of task you're doing. After the session, compare what actually happened to what you expected.
// the 40% rule
Gene Kim and Steve Yegge found a hard threshold in their research. When context utilization stays under 40%, success rate is 98%. Above 60%, it drops to 24%. The AI starts forgetting instructions and contradicting itself.
This is why spiral detection matters. When you're stuck in a fix loop, context fills up fast.
// the insight
Git history doesn't lie. It's easy to feel productive, but are you actually shipping or just spinning? The commits tell the truth.
vibe-check analyzes your commit patterns to detect debug spirals before they consume your whole session. If you're stuck for 30 minutes on the same thing, that's a wipe—reset, do some research, and come back with a plan.
"Last week, the CLI flagged a spiral at 18 minutes. I realized I was arguing with the LLM about a circular dependency. I stepped away, drew the schema on paper, and fixed it in one commit. Without the alert, I would have wasted two hours."
// results
I've run this methodology across 200+ sessions over the past year. When I follow the discipline, it works. When I skip calibration because I'm in a hurry, I pay for it in rework.
// for autonomous agents
vibe-check measures human-AI collaboration sessions. For autonomous agents that run without a human in the loop, 12-Factor AgentOps applies DevOps and SRE patterns to AI systems.
12-Factor AgentOps →