Replacing Option<SmithyTraits> with Option<Box<SmithyTraits>> and a custom serde deserializer cut a Rust program’s memory from 895 MB to 420 MB, saving 475 MB.
Key Takeaways
Option<BigStruct> occupies at least the full struct size wherever it is stored inline (stack, Vec element, or struct field), even when None; Option<Box<BigStruct>> collapses to one pointer-sized word when None.
The null-pointer niche optimization applies to Option<Box<T>>, so it is exactly the size of a bare Box<T>: wrapping the box in Option adds no space overhead.
A custom deserialize_with function lets serde skip heap allocation entirely for all-None structs, storing None instead of Some(Box::new(empty)).
Measuring with jemalloc (tikv-jemallocator + tikv-jemalloc-ctl behind a profile feature flag) gave before-and-after allocation figures that confirmed the savings were real.
The added CPU cost of deserializing-then-discarding empty structs was offset by lower memory pressure, making the overall task faster.
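The size claims and the normalization step can be checked with a minimal sketch that uses only the standard library. The SmithyTraits definition here is a hypothetical stand-in (the field names are invented, not taken from the article), and box_if_nonempty is an assumed helper name. In a serde setup, logic like this would plausibly sit in the body of the deserialize_with function, after the struct has been deserialized by value.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the article's SmithyTraits: a wide struct
// whose fields are all optional. Field names are invented.
#[derive(Debug, Default, PartialEq)]
struct SmithyTraits {
    documentation: Option<String>,
    deprecated_message: Option<String>,
    since: Option<String>,
    json_name: Option<String>,
    pattern: Option<String>,
    sensitive: Option<bool>,
}

impl SmithyTraits {
    // True when no field carries data.
    fn is_empty(&self) -> bool {
        *self == SmithyTraits::default()
    }
}

// Assumed helper: box only structs that hold data, so an all-None
// struct is stored as None and never reaches the heap.
fn box_if_nonempty(traits: SmithyTraits) -> Option<Box<SmithyTraits>> {
    if traits.is_empty() {
        None
    } else {
        Some(Box::new(traits))
    }
}

fn main() {
    println!("SmithyTraits:              {} bytes", size_of::<SmithyTraits>());
    println!("Option<SmithyTraits>:      {} bytes", size_of::<Option<SmithyTraits>>());
    println!("Option<Box<SmithyTraits>>: {} bytes", size_of::<Option<Box<SmithyTraits>>>());

    // The inline Option is never smaller than the struct it wraps.
    assert!(size_of::<Option<SmithyTraits>>() >= size_of::<SmithyTraits>());
    // The boxed Option is exactly one pointer wide.
    assert_eq!(size_of::<Option<Box<SmithyTraits>>>(), size_of::<usize>());
    // An empty struct normalizes to None.
    assert!(box_if_nonempty(SmithyTraits::default()).is_none());
}
```

The exact byte counts printed depend on the target and on the real struct's fields, so the sketch asserts only the relationships between the sizes.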
Hacker News Comment Review
Commenters extended the technique to string types: Box<str> drops the unused capacity word that String carries, and CompactString handles short strings inline, both useful when data is text-heavy.
Discussion surfaced a dual-representation hazard: None and Some(Box::new(all-None struct)) encode the same logical state, which pollutes pattern matching and leaves a trap for anyone constructing the type outside the deserializer. Keeping the storage-optimized types private was the consensus fix.
The heap fragmentation caveat in the article drew pushback: well-maintained allocators like jemalloc handle small, repeated allocations without significant fragmentation, so the warning applies mainly to glibc malloc in edge cases.
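One way to read the "keep the storage types private" suggestion is sketched below. Shape, Traits, and every method name are hypothetical, chosen only for illustration. The point is that a single constructor normalizes the empty case, and accessors hide the nested Option so callers never match on the storage layout.

```rust
mod model {
    // Hypothetical trait bag; names are invented for illustration.
    #[derive(Debug, Default, PartialEq)]
    pub struct Traits {
        pub documentation: Option<String>,
        pub deprecated: Option<bool>,
    }

    pub struct Shape {
        // Private field: code outside this module cannot store
        // Some(Box::new(Traits::default())) here.
        traits: Option<Box<Traits>>,
    }

    impl Shape {
        // The only constructor; an all-None bag is stored as None.
        pub fn new(traits: Traits) -> Self {
            let traits = if traits == Traits::default() {
                None
            } else {
                Some(Box::new(traits))
            };
            Shape { traits }
        }

        // Accessors flatten the two Option layers into one answer.
        pub fn documentation(&self) -> Option<&str> {
            self.traits.as_ref()?.documentation.as_deref()
        }

        pub fn is_deprecated(&self) -> bool {
            self.traits
                .as_ref()
                .and_then(|t| t.deprecated)
                .unwrap_or(false)
        }

        // Exposed only so the normalization can be observed.
        pub fn has_traits(&self) -> bool {
            self.traits.is_some()
        }
    }
}

fn main() {
    let empty = model::Shape::new(model::Traits::default());
    assert!(!empty.has_traits());
    assert_eq!(empty.documentation(), None);

    let documented = model::Shape::new(model::Traits {
        documentation: Some("A widget.".to_string()),
        deprecated: None,
    });
    assert_eq!(documented.documentation(), Some("A widget."));
    assert!(!documented.is_deprecated());
}
```

Because the field is private to the module, the invariant "None means no traits" holds for every value built through the public API, which removes the second representation rather than asking callers to remember it.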
Notable Comments
@_alphageek: recommends dhat-rs for pinpointing exactly which fields and call sites are consuming memory, replacing guesswork about where to start.
@el_pollo_diablo: notes the dual-representation problem creates diverging pattern-match paths in every field access, not just construction; argues for private storage types with a clean public API.
@tialaramex: details Box<str> and CompactString as orthogonal wins for string fields, since String always stores length, capacity, and pointer regardless of content.
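The String versus Box<str> point can be illustrated with a short standard-library sketch. The freeze helper is a hypothetical name. The three-word size of String reflects the current standard library layout rather than a documented guarantee, so the code asserts only that Box<str> is two words and smaller than String. CompactString comes from an external crate and is not shown.

```rust
use std::mem::size_of;

// Hypothetical helper: convert a finished String into a Box<str>.
// into_boxed_str drops any excess capacity, so only the pointer and
// the length remain in the handle.
fn freeze(s: String) -> Box<str> {
    s.into_boxed_str()
}

fn main() {
    // String: pointer, length, and capacity in the current std layout.
    println!("String:   {} bytes", size_of::<String>());
    // Box<str>: a fat pointer holding pointer and length.
    println!("Box<str>: {} bytes", size_of::<Box<str>>());

    assert_eq!(size_of::<Box<str>>(), 2 * size_of::<usize>());
    assert!(size_of::<Box<str>>() < size_of::<String>());

    // The text itself is unchanged by the conversion.
    let name = freeze(String::from("documentation"));
    assert_eq!(&*name, "documentation");
}
```

The trade-off is that Box<str> cannot grow in place, so it suits fields that are written once during deserialization and only read afterwards.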