moodmosaic

Oracles, Traces, Triage

In 2026, “agentic” tooling is moving fast enough that yesterday’s workflow advice goes stale quickly. This series is my attempt to write down the parts that seem durable: how to give agents norms instead of scripts, how to coordinate multiple agents through the repo, and how those patterns connect back to fuzzing and stateful testing.

Why

Skepticism is healthy here. Agent outputs still need oracles, reproducibility, and a bar for correctness. The tools are changing quickly; the craft is not.

Agents can draft tests fast; the hard part is still choosing the right oracles and insisting on reproducible failures.

Examples use Claude Code because that’s what I run day-to-day, but the patterns are meant to travel to any agent that can read a codebase, run checks, and write down findings.

This is not a tutorial. It’s a practitioner’s notebook.

The lens: oracles, traces, triage

If there’s a unifying theme here, it’s that most bug-finding systems succeed or fail on three things:

That’s also a useful way to read the series:

Articles

Companion post

If you want the deeper motivation for “why traces,” start here:

Next

Background

Most of what I know about testing came from shipping production systems and learning in public through open source: contributing to AutoFixture starting around 2011, then maintaining Hedgehog, which once powered Echidna, an early and widely used property-based fuzzer for Ethereum smart contracts.

Along the way: Fare for regex-constrained test generation, a SplitMix port for reproducible failure discovery. Consensus fuzzers at Stacks that caught a production bug a 533-line integration test couldn’t reproduce.

That background is why I’m interested in AI tooling—not as a replacement for any of this, but as a way to do more of it.

Feedback

The ideas in this series come from daily practice—shipping agent-assisted testing tools for real protocol security work. But daily practice has blind spots.

If you think I’m wrong about something, I’d like to hear it. If you think I’m right but missing a nuance, I’d especially like to hear that.


Next: Agent Skills and claude-lint