The rule was simple, and Jay had written it on a sticky note that lived on the edge of his monitor: If the SDK works with the twin, the twin is correct.
The phrase in the factory documentation was more formal: "popular publicly available reference SDK client libraries" as compatibility targets. What it meant in practice was this: take the official Okta SDK for Python, point it at the Okta twin instead of Okta's servers, and run every operation the SDK supported. If the SDK completed without errors, the twin was behaviorally compatible. If the SDK threw an exception, the twin had a fidelity gap.
"It's the ultimate integration test," Navan said. "We didn't write these SDKs. The service providers wrote them. They encode every assumption the provider makes about their own API. If our twin satisfies those assumptions, it satisfies the API contract."
They ran the Okta SDK compatibility suite every night. Two hundred and eighteen API operations across user management, group management, application management, and authentication flows. The twin passed two hundred and eleven of them. The seven failures were documented, tracked, and slowly shrinking.
The Jira SDK was harder. Atlassian's official Python client had been deprecated in favor of a REST API that people accessed through generic HTTP libraries. So the team used three different community SDKs as reference targets. Each SDK made slightly different assumptions about the API's behavior. Where the SDKs disagreed, the team tested against real Jira to determine which assumption was correct.
"We found a bug in one of the community SDKs that way," Jay said. "It was sending a malformed JQL query for board-scoped searches. Real Jira accepted it because Jira is lenient about JQL syntax. Our twin rejected it because we'd modeled the documented syntax strictly. We had to decide: match Jira's lenient behavior, or match the documentation?"
"What did you decide?" Justin asked.
"We matched Jira's lenient behavior. The twin's job is behavioral compatibility, not specification compliance. If real Jira accepts it, the twin accepts it. Even when Jira is wrong."
Justin nodded. That was the right call, and he didn't need to say so.
The Google SDKs were the most comprehensive. Google published official client libraries in Python, Java, Node.js, Go, Ruby, PHP, and .NET. The team ran compatibility suites in four of those languages. Each language's SDK exercised the API differently—different serialization, different retry logic, different error handling. The Sheets twin had to produce responses that satisfied all four.
"The Go SDK sends requests with a slightly different content-type header than the Python SDK," Navan noted during a compatibility review. "Both are valid. The twin accepts both. But it took us three days to figure out why the Go SDK was getting 415 errors that the Python SDK wasn't."
Jay updated his sticky note. He crossed out the old text and wrote a new version: If all the SDKs work with the twin, the twin is correct. If one doesn't, check the SDK first.
Finding bugs in community SDKs through twin testing is a wonderful side effect. The twin becomes a compliance oracle for the ecosystem.