
Software Factory Archive


BLAKE3

Rating: General Audiences
Fandom: StrongDM Software Factory
Characters: Jay Taylor, Navan Chauhan, Justin McCarthy
Tags: CXDB, BLAKE3, Hashing, Benchmarking, Content Addressing
Words: 461
Published: 2025-09-22

Jay had a spreadsheet. He was not proud of the spreadsheet—it was ugly, the columns were misaligned, and he'd used Comic Sans for the header row as a joke that had somehow calcified into permanence. But the data in the spreadsheet was beautiful.

Five hash algorithms. Five contenders for the job of content-addressing every blob in CXDB. SHA-256, SHA-512, BLAKE2b, xxHash, and BLAKE3. Each one benchmarked against the same corpus: ten thousand agent conversation turns ranging from 200 bytes to 4 megabytes, representing a realistic cross-section of what CXDB would actually store.

"SHA-256 is the obvious safe choice," Jay said, narrating to Navan as the numbers rolled in. "Everyone uses it. Every library supports it. It's boring and reliable."

"But?" Navan prompted.

"But look at the throughput." Jay pointed at the column. SHA-256 managed 400 megabytes per second on their hardware. Respectable. Unremarkable. "Now look at BLAKE3."

BLAKE3 processed the same corpus at 4.2 gigabytes per second.

Navan blinked. "That's—ten times faster?"

"Ten point five." Jay scrolled to the breakdown. "BLAKE3 is built for parallelism. It uses a Merkle tree internally, so it can hash multiple chunks simultaneously across cores. On our eight-core dev machines, it scales almost linearly. On production hardware with more cores, it'll be even faster."
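A rough sketch of the tree idea Jay describes: split the input into fixed-size chunks, hash each chunk on its own, then combine neighbouring hashes until one root is left. This is not BLAKE3 itself. The node hash is an assumed stand-in (standard-library BLAKE2b truncated to 32 bytes), the `tree_hash` name and the 0x00/0x01 domain-separation prefixes are illustrative choices, and only the 1 KiB chunk size is borrowed from BLAKE3.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024  # BLAKE3 also uses 1 KiB chunks; the rest is a simplification


def _h(data: bytes) -> bytes:
    # Stand-in node hash: 256-bit BLAKE2b, NOT the BLAKE3 compression function.
    return hashlib.blake2b(data, digest_size=32).digest()


def tree_hash(data: bytes, workers: int = 4) -> bytes:
    """Hash chunks independently, then combine pairs up to a single root."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)] or [b""]
    # Leaves do not depend on each other, so they can be hashed concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        level = list(pool.map(lambda c: _h(b"\x00" + c), chunks))
    # Combine neighbours until one root remains; an odd node is carried up unchanged.
    while len(level) > 1:
        nxt = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

The thread pool is there only to show that the leaves are independent; in CPython it should not be expected to reproduce the near-linear scaling in the story. The root is the same whatever the worker count, which is what makes the parallel and serial paths interchangeable.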

"What about collision resistance?"

"256-bit output, same as SHA-256. Cryptographically secure, peer-reviewed, published. It's not some toy hash function someone hacked together in a weekend. The designers include the people behind BLAKE2, and the compression function descends from ChaCha."

Navan studied the spreadsheet. xxHash was fast too—nearly as fast as BLAKE3—but it wasn't cryptographic. It was built for hash tables and checksums, not content addressing. One collision in a content-addressed store and you'd be serving the wrong blob to the wrong agent. Not an option.
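To make the collision worry concrete, here is a minimal in-memory sketch of a content-addressed store, where a blob's key is simply the digest of its bytes. The `BlobStore` class is hypothetical and says nothing about how CXDB is actually built, and the digest is an assumed stand-in (256-bit BLAKE2b from the standard library) rather than BLAKE3.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store: the key is the digest of the content."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Stand-in digest: 256-bit BLAKE2b, not BLAKE3.
        key = hashlib.blake2b(data, digest_size=32).hexdigest()
        # Identical content maps to the same key and is stored only once.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


store = BlobStore()
key = store.put(b"agent prompt")
assert store.get(key) == b"agent prompt"
```

Because `put` trusts the key, two different blobs with the same digest would silently resolve to whichever was stored first. That is the failure the paragraph above rules out by insisting on a cryptographic hash.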

SHA-512 was cryptographically solid but even slower than SHA-256 on their hardware. BLAKE2b was respectable, around 900 megabytes per second, but BLAKE3 still lapped it by a factor of more than four.
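A comparison like Jay's can be approximated with a small timing harness. The sketch below is an assumption-laden stand-in, not his benchmark: the corpus is five random blobs rather than ten thousand conversation turns, the `throughput_mb_s` helper is hypothetical, and it covers only the algorithms that ship in Python's `hashlib` (BLAKE3 and xxHash would need third-party packages).

```python
import hashlib
import os
import time


def throughput_mb_s(name: str, corpus: list[bytes]) -> float:
    """Return megabytes hashed per second for one hashlib algorithm."""
    total = sum(len(blob) for blob in corpus)
    start = time.perf_counter()
    for blob in corpus:
        hashlib.new(name, blob).digest()
    elapsed = time.perf_counter() - start
    return total / elapsed / 1e6


# Hypothetical corpus spanning the 200-byte to 4-megabyte range from the story.
corpus = [os.urandom(n) for n in (200, 4_096, 65_536, 1_000_000, 4_000_000)]
results = {name: throughput_mb_s(name, corpus) for name in ("sha256", "sha512", "blake2b")}
for name, rate in results.items():
    print(f"{name:8s} {rate:8.1f} MB/s")
```

The absolute numbers depend heavily on the machine and the Python build, so they will not match the spreadsheet. The ratios quoted in the story do follow from its own figures: 4.2 GB/s over 400 MB/s is 10.5, and 4.2 GB/s over 900 MB/s is roughly 4.7.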

"The thing that clinches it," Jay said, pulling up the Rust crate documentation, "is that the BLAKE3 crate is maintained by the algorithm designers themselves. It's not a third-party binding. The reference implementation is in Rust. We're writing the server in Rust. It's alignment all the way down."

He sent the benchmark results to Justin in a three-line message. Algorithm, throughput, recommendation. No preamble, no hedging.

Justin's reply came back in eleven seconds: Ship it.

Navan added the BLAKE3 dependency to the Cargo.toml that afternoon. The first content hash appeared in the database log at 3:47 PM, a 256-bit digest of a twenty-word agent prompt that would eventually become the root of a conversation tree containing forty thousand turns.

Jay deleted the spreadsheet. Then he recreated it without Comic Sans. Some decisions, once made, deserved clean records.

Kudos: 67

hash_aficionado 2025-09-24

The Comic Sans spreadsheet becoming the basis for a critical infrastructure decision is peak engineering culture. Also BLAKE3 absolutely deserves the win here.

crypto_casual 2025-09-25

Love that the reference implementation being in Rust was the clincher. When your language choice and your algorithm choice align like that, the universe is telling you something.
