Before the factory, before the twins, before he knew what a satisfaction metric was, Navan wrote a chess engine in Swift.
It wasn't a good chess engine, by competitive standards. It couldn't beat Stockfish. It couldn't beat most rated players. But it understood the game at the level that mattered for what Navan was trying to learn: it could look ahead, evaluate positions, and prune branches that weren't worth exploring.
Minimax with alpha-beta pruning. The classic algorithm. You build a tree of all possible moves, your opponent's responses, your responses to those responses, and so on, as deep as your compute budget allows. Then you evaluate the leaf nodes—is this position good for me or bad?—and propagate those values back up the tree, assuming both players play optimally. Alpha-beta pruning cuts away branches that can't possibly lead to a better outcome than one you've already found. You don't explore every possibility. You explore the right ones.
Navan hadn't thought about the chess engine in months. Then Justin asked him to design a new scenario suite for the Google Sheets twin, and something clicked.
"The problem," Navan said, standing at Justin's whiteboard with a marker in his hand, "is that there are too many possible user trajectories. A user can create a sheet, share it, modify permissions, add formulas, delete rows, re-share, revoke access, export—the combinatorial space is enormous."
"So prune it," Justin said.
Navan stopped drawing. He looked at the whiteboard, where he'd sketched a tree of branching user actions. A game tree. He was already looking at a game tree and hadn't realized it.
"Alpha-beta," Navan said quietly.
"What?"
"In chess, you don't explore branches where the opponent already has a winning response to your move. You prune them. In scenario design, we don't need to explore trajectories where the user has already encountered a blocking failure. If the Sheets twin throws a 500 on the share operation, every downstream path from that point is dominated by the error state. We can prune the entire subtree."
Justin leaned back in his chair. "You're treating the user as one player and the system as the other."
"The user tries to accomplish their goal. The system responds. Some responses advance the goal, some block it. The satisfaction metric is the evaluation function at the leaf nodes. We propagate it back up and prune the branches that are already dominated."
"How deep can you search?"
"With the twins running locally? No rate limits. No API costs. We can go as deep as the scenario spec allows. Thousands of trajectories per hour."
Justin stood up and looked at the whiteboard. Navan had drawn a perfect game tree without intending to. Branching nodes, leaf evaluations, pruned subtrees marked with X's. It was the chess engine all over again, except the pieces were API calls and the board was a digital twin.
"Build it," Justin said.
Navan didn't build it. He wrote the spec. The agents built it. But the thinking—the game-tree thinking, the pruning instinct, the understanding that you don't explore every possibility, you explore the right ones—that was his. The chess engine had taught him that, years ago, in a language he still loved, on a problem that had nothing and everything to do with the factory.
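The mapping Navan sketched on the whiteboard can be expressed compactly. This is a hypothetical sketch, not the factory's actual spec: the `(outcome, children)` node shape, the `BLOCKED` sentinel, and the satisfaction scores are all invented for illustration. The key move is that a blocking failure dominates its entire subtree, so the subtree is never explored:

```python
# Sentinel for a blocking failure, e.g. the Sheets twin returning a 500
# on a share operation. (Illustrative; not a real factory construct.)
BLOCKED = None

def best_satisfaction(node):
    """Explore a scenario tree of (outcome, children) pairs and return the
    best reachable satisfaction score. Leaves carry the satisfaction metric
    as their evaluation; a BLOCKED outcome prunes everything beneath it."""
    outcome, children = node
    if outcome is BLOCKED:
        return 0.0            # error state: no downstream path can recover
    if not children:
        return outcome        # leaf: satisfaction metric as evaluation
    return max(best_satisfaction(child) for child in children)
```

In a tree where one branch hits a `BLOCKED` share call, its descendants contribute nothing, no matter what scores they would have carried; the search spends its budget on trajectories that are still live.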