Backend Swap

The experiment was simple. Take pipeline twelve—a mid-complexity workflow that generated a CLI tool from an NLSpec—and swap the codergen backend from Claude to Codex. One change in the model stylesheet. Nothing else.

Jay made the change on a Wednesday afternoon. He edited the stylesheet, saved it, and ran the pipeline. Navan watched from his desk, trying to look casual about it, the way someone tries to look casual while watching a controlled detonation.

The pipeline started. The TUI lit up. Nodes activated in sequence. The codergen node reached out to Codex instead of Claude, and Codex responded. Tokens flowed. Code was generated. The assessment node ran. The satisfaction score came back.

0.87.

"That's just under range," Navan said. The Claude configuration typically scored between 0.88 and 0.93 on this pipeline.

"Just under range on the first iteration," Jay clarified. "It hasn't converged yet."

The pipeline looped. Second iteration: 0.91. Third: 0.93. Convergence. The pipeline advanced to deployment. Total time: eleven minutes. The Claude configuration averaged nine minutes. Two minutes slower, same destination.

Jay looked at the DOT file. It was identical. Not a single character had changed. The pipeline definition—the nodes, the edges, the conditions, the flow—was exactly what it had been that morning. The only difference was which brain was thinking behind the codergen node.

"The DOT file is agnostic," Jay said. "It genuinely does not care who's writing the code. Claude, Codex, Gemini—the pipeline is the same. The workflow is the same. The expectation is the same. Only the engine changes."

"Like swapping a car engine," Navan said. "The roads don't change. The destination doesn't change. The dashboard still works. You just get there with a different sound under the hood."

Justin had been reading the results from his office. He walked over. "Run it with Gemini."

Jay edited the stylesheet again. Same pipeline. Third engine. Gemini scored 0.82 on the first pass, converged to 0.90 after four iterations. Fourteen minutes. Different rhythm, different convergence curve, same result within tolerance.

"Three backends," Justin said. "Same DOT file. Same scenarios. All three converge to passing satisfaction. The pipeline is model-independent."

"It's not really model-independent," Navan said thoughtfully. "It's model-indifferent. It doesn't ignore the model. It just doesn't depend on any specific one. The pipeline expresses what needs to happen. The model decides how."

"Model-indifferent," Justin repeated, tasting the phrase. "I like that. Put it in the spec."

Navan was already writing.

Software Factory Archive
