The Migration

Day one of the migration was the quiet kind of terrifying. Not the fire-alarm kind, not the server-is-down kind. The kind where everything is working and you know that one mistake will make it stop working and nobody will notice until it's too late.

They were moving every agent conversation history into CXDB. Months of accumulated context. Thousands of sessions. Hundreds of thousands of individual messages, tool calls, error reports, and structured outputs. All of it currently living in a collection of JSON files, SQLite databases, and one deeply regrettable spreadsheet that Navan refused to discuss.

"The source data is in eleven different formats," Jay said, looking at the inventory he'd compiled. "Eleven. One per agent configuration. Some of them are newline-delimited JSON. Some are SQLite with custom schemas. Some are—what is this one, Navan?"

"Don't ask."

"It looks like it's CSV with embedded JSON in the third column."

"I said don't ask."

Jay wrote a converter for each format. Eleven converters, each one reading its source format and emitting CXDB turns through the Go client library. Each converter validated every field against the type registry before writing. Each converter logged every turn it processed, every blob it stored, every hash it generated.

The process was not fast. Jay did not want it to be fast. He wanted it to be correct. He set the batch size to one hundred turns, with a verification step after each batch: read back every turn just written, compare it byte-for-byte against what the converter had produced from the source data, and halt if anything differed.

"This is going to take three days," Navan observed, watching the progress counter.

"Three days of correct is better than three hours of maybe," Jay replied.

Day one: the newline-delimited JSON sources. The cleanest format, the easiest conversion. Forty-two thousand turns migrated. Zero discrepancies on readback. The BLAKE3 hashes matched. The DAG structure was intact. Every parent pointer pointed to a valid turn.

Day two: the SQLite databases. Trickier. Some of them had nullable timestamp columns. Some had conversations stored as single text blobs rather than individual messages. Jay's converters parsed, split, and reconstructed. Navan spot-checked a random sample of five hundred turns by hand, comparing the CXDB output against the original SQLite rows.

"Five hundred for five hundred," Navan reported. "Perfect match."

Day three: the rest. The odd formats. The CSV with embedded JSON. The one directory that was just a tree of text files named by Unix timestamp. Jay processed them all with the same care, the same batch size, the same verification step.

At 4:47 PM on the third day, the last turn was written and verified. Jay ran the final count: 283,416 turns across 4,891 conversations, stored in 197,203 unique blobs after deduplication. Zero data lost. Zero discrepancies. Zero corrupted hashes.

Navan taped a note to his monitor: 283,416 / 283,416.

Jay turned off the converters and deleted nothing. The source data stayed where it was for another month, just in case. It was never needed.

