Twin Drift Detection

The alert came in at 3:47 AM on a Tuesday, which is when infrastructure problems prefer to announce themselves. The weekly drift detection pipeline had run, and the Slack twin had failed seventeen comparison tests that it had passed the previous week.

Jay saw the alert when he woke up. By the time he opened his laptop, Navan had already triaged it.

"Slack shipped an API update over the weekend," Navan said via their morning Slack message—the real Slack, not the twin. "They changed the response format for the conversations.history endpoint. The reply_count field is now optional instead of always present. Messages with zero replies used to return "reply_count": 0. Now they just omit the field entirely."

Seventeen scenarios relied on the presence of that field. The drift detection pipeline had caught the discrepancy because it ran the same API calls against both the real Slack and the Slack twin every Sunday night, then diffed the responses.

"How does the pipeline work exactly?" Jay asked. He'd seen the alerts before but never dug into the machinery.

"Three phases," Navan explained. "Phase one: we run a curated set of API operations against the real service and record the responses. Phase two: we run the identical operations against the twin and record those responses. Phase three: we compare. Status codes, response structure, field presence, value types, error messages. Anything that differs gets flagged as drift."

"And you run this weekly?"

"Weekly for all six twins. The curated set covers about sixty percent of each twin's API surface. We can't test everything—some operations are destructive, some require state that's expensive to set up—but sixty percent catches most behavioral changes."

The Slack drift was straightforward to fix. The spec update was simple: The reply_count field in message objects is now optional. When absent, the value should be interpreted as zero. An agent generated the code change. The twin's behavioral model was updated. The seventeen scenarios passed again by noon.

"This is the tax," Justin said when he joined the conversation that afternoon. "Every twin is a promise to stay current. The services don't notify us when they change. They don't version their behavioral quirks. They just change, and our twins have to notice and adapt."

"At least we have the pipeline," Jay said. "Without automated drift detection, we'd be finding these discrepancies when scenarios fail for no apparent reason. Phantom failures that look like twin bugs but are actually reality moving underneath us."

"We had those before the pipeline," Navan confirmed. "In August, the Jira twin had a drift that went undetected for three weeks. The real Jira had changed how it sorted custom fields in search results. Our scenarios were passing, but the results were in a slightly different order than production would return. It didn't break anything, but it undermined fidelity."

"Order-dependent drift," Jay said. "The most insidious kind. Everything looks right until you look at the sequence."

The pipeline ran every Sunday. The twins drifted between Sundays. It was a weekly ritual of realignment, the factory's way of acknowledging that the world was not static and neither were its models of the world.

Software Factory Archive

Kudos: 54