The studio was remote—Justin in his home office in the Bay Area, the host in a podcast studio in Brooklyn. Two camera feeds, two microphone levels, a green light that meant "recording." The host had done his homework. He'd read the factory.strongdm.ai site, skimmed the Attractor repo, and prepared questions that were three levels deeper than the usual "so tell me about AI coding."
Justin settled into the conversation the way he settled into everything: deliberately, with a patience that could be mistaken for slowness by people who didn't know better. He explained the factory's origin. The October 2024 inflection point with Claude 3.5. The moment he realized that long-horizon agentic workflows were compounding correctness instead of errors. The decision to remove humans from the code-writing loop entirely.
The host nodded along. Good questions followed. What did the daily workflow look like? How did scenarios differ from traditional tests? What was the satisfaction metric, really? Justin answered each one with the structured clarity that characterized everything he said. He used analogies from manufacturing, from biology, from quality engineering. He described the Digital Twin Universe. He described Leash's containment model.
Then the host asked the question.
"But who reviews the AI's code?"
Justin's pause lasted four seconds. In podcast time, four seconds is an eternity. The host's face showed a flicker of concern—had the audio dropped? Was there a connection issue? But Justin's camera feed was steady, his expression unchanged. He was thinking. Not searching for an answer. Choosing how to frame an answer that would challenge the question's premise without being dismissive.
"Nobody reviews the AI's code," Justin said finally. "That's the point. And I need to explain why that's not the alarming statement it sounds like."
He explained. Code review was a heuristic, he said. A proxy for the thing you actually cared about, which was: does the software do what it's supposed to do? Humans reviewed code because they couldn't run every scenario every time. Doing so was too expensive, too slow, too dependent on infrastructure that didn't exist. So instead, a human looked at the code and made a judgment call based on experience and pattern matching.
"The factory replaces that heuristic with a measurement," Justin continued. "Scenarios run against every change. Satisfaction is computed. The question isn't 'does this code look right to a human?' The question is 'does this code produce correct behavior across all known scenarios?' One of those questions scales. The other doesn't."
The host was quiet for a moment. Then: "So you're saying code review was always a hack?"
"I'm saying code review was a brilliant adaptation to constraints that no longer apply. When you can validate behavior directly, at scale, you don't need a human to squint at diffs and guess."
Jay and Navan listened to the episode when it aired two weeks later. They'd heard Justin make this argument before, in standups and planning sessions. But hearing it delivered to an external audience, with that four-second pause, gave it a different weight. The pause wasn't hesitation. It was Justin deciding to be precise in a medium that rewarded being fast.
Navan noted, in his physical notebook, two words: "The Pause." He underlined them twice.
Four seconds of silence on a podcast. That's trust in your audience. Most guests would rush to fill that gap. Justin used it to think. The answer was better for the wait.