Skip to content
nevzheng
Go back

Building strata

I just left Google BigQuery after four years. The work was interesting, but I missed building things that were mine. I wanted a project that was fun, challenging, and a little ambitious — something where I could learn, play, and reconnect with the part of engineering I got into this for. I also wanted something tangible I could point to and say: I built this, I understand it end to end, come talk to me about databases.

So I’m building an LSM storage engine from scratch in Rust. I’m calling it strata.

This isn’t a tutorial. It’s what I ran into along the way and what I’m learning from it.

Table of contents

Open Table of contents

Why an LSM Tree?

I have an ambition to build a full database from scratch someday. When I looked at the options, LSM trees stood out. Compared to B-trees, they’re easier to reason about — the core data structures are immutable, the write path is append-only, and you don’t have to fight concurrency bugs from the start. I wanted to optimize for correctness and developer experience, not raw performance on day one.

There’s also something elegant about the design. Every write is just an append. On-disk files are never modified, only created and deleted. The complexity lives in how you merge and organize data across levels, and that part is genuinely fun to think about.

The name comes from geology — layers of rock built up over time. Levels, tiers, strata. It also follows the naming tradition of storage engine projects that sound like they could be indie rock bands.

Designing Before Building

Before writing code, I spent time reading about LSM design, taking notes, and identifying specific things I wanted to get right early. Some stuff I wanted to explore:

Building It

I used Claude as a design partner and code generator throughout. It let me think at the design level — “should the WAL use segments or truncation?” — instead of fighting syntax for an hour. I like working at this level. It feels more productive, and honestly I either spend the time I save on things I value outside of work, or I just end up spending more time coding and engaging with the design.

After building it, I wasn’t fully confident I could explain every detail. Not because I didn’t understand the design, but because generating code at speed creates a gap between “I understand the architecture” and “I can trace every line.” So I did something that turned out to be really valuable — I had the AI interview me about my own system.

Walk through a write. Walk through a read. Walk through recovery. Explain your decisions.

It surfaced real issues. My range scan implementation probably stops too early — it might return results from only the first level that has matches instead of merging across all levels. I also realized my compaction wasn’t handling tombstones correctly in all cases — if a delete and a put for the same key end up in different SSTables, you have to be really careful about the order you process them or you’ll resurrect deleted data.

These weren’t things I would have caught from running happy-path tests. They came from having to explain my own system out loud and getting pushed on the details.

Embracing Imperfection

Perfectionism has cost me a lot. Not just in engineering — in my life. I’ve spent too much time not shipping things, not sharing things, not starting things, because they weren’t ready yet. It’s exhausting and it doesn’t actually produce better work. It just produces less of it.

There’s a Japanese concept called wabi-sabi — finding beauty in things that are incomplete, imperfect, impermanent. I’ve been thinking about that a lot lately. strata is my practice at it. My level implementation is janky. Compaction doesn’t drop old versions. The WAL segments aren’t implemented. There are too many clones and vector materializations. I know all of this, and I’m sharing it anyway.

Kaizen over perfection. The code is broken in ways I can name, and I can fix them one at a time, at my own pace.

What’s Next

There’s a lot to do:

The bigger dream is to add a SQL engine on top. Maybe GlueSQL, maybe something I hand-roll. That’s the whole point of owning the storage layer — when I’m ready to build up, I know exactly what’s underneath.

If you’re curious, the code is at github.com/nevzheng/strata-db. It’s very much a work in progress.

I might write up some technical deep dives later if I feel like it. No promises.

If you enjoyed this post or found it useful, consider buying me a coffee. ☕

Discuss on Hacker News

If you’re interested in storage engines, these are worth looking at:

RocksDB is the reference implementation for modern LSM design. Facebook’s fork of LevelDB, used everywhere from CockroachDB to TiKV. Over a decade of development and still actively evolving.

LevelDB is Google’s original, written by Jeff Dean and Sanjay Ghemawat. Simpler than RocksDB and great for reading the source to understand core LSM mechanics.

Pebble is CockroachDB’s storage engine in Go. Started as a LevelDB port, now heavily diverged. Clean codebase with well-documented design decisions.

BadgerDB is Dgraph’s engine in Go. It separates keys and values based on the WiscKey paper — keys live in the LSM tree, values in a separate log.

sled is the most prominent Rust embedded database, though it’s a lock-free B+ tree rather than an LSM. Worth studying for Rust-specific idioms in storage engine design.

mini-lsm is a great step-by-step LSM tutorial in Rust, written by someone I know from CMU. Highly recommend if you’re building your own.

AgateDB is a Rust port of BadgerDB from the TiKV project. Another good Rust LSM reference.

References


Share this post on:

Next Post
Ow! My Foot!