Δ-Mem: Efficient Online Memory for Large Language Models


TLDR

  • arXiv paper proposes δ-mem, a frozen-backbone memory add-on using an 8×8 associative state matrix to boost long-context LLM performance without fine-tuning.

Key Takeaways

  • δ-mem attaches a fixed-size online state to a frozen full-attention backbone, applying low-rank corrections to attention during generation via delta-rule learning.
  • Only an 8×8 state matrix is needed; average score improves 1.10× over the frozen backbone and 1.15× over the strongest non-δ-mem memory baseline.
  • Largest gains on memory-heavy benchmarks: 1.31× on MemoryAgentBench, 1.20× on LoCoMo, with general capabilities largely preserved.
  • Requires no full fine-tuning, backbone replacement, or explicit context extension, making it a drop-in augmentation path for existing LLMs.
  • Addresses the core failure mode of naive context-window expansion: cost and poor utilization of distant context.
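To make the mechanism in the takeaways concrete: the summary says δ-mem maintains a small (8×8) associative state matrix updated online via the delta rule. The paper's exact formulation is not reproduced here, so the following is a minimal sketch of a classic delta-rule fast-weight update on an 8×8 state; the function name, the write-strength parameter `beta`, and the key/value setup are all illustrative assumptions, not the authors' code.

```python
import numpy as np

def delta_rule_update(S, k, v, beta=0.5):
    """One online delta-rule write: nudge S so that S @ k moves toward v.

    S    : (d, d) associative state matrix (hypothetical stand-in for
           the 8x8 state quoted in the summary)
    k, v : (d,) key and value vectors
    beta : write strength (assumed hyperparameter)
    """
    pred = S @ k                           # current retrieval for key k
    S = S + beta * np.outer(v - pred, k)   # rank-1 delta-rule correction
    return S

d = 8
rng = np.random.default_rng(0)
S = np.zeros((d, d))
k = rng.standard_normal(d)
k /= np.linalg.norm(k)                     # unit-norm key
v = rng.standard_normal(d)

# With a unit-norm key, each write shrinks the retrieval error by
# a factor of (1 - beta), so repeated writes converge S @ k toward v.
for _ in range(20):
    S = delta_rule_update(S, k, v)

print(np.allclose(S @ k, v, atol=1e-2))    # → True
```

Note that each update is a rank-1 outer product, which is consistent with the "low-rank corrections" described above: the stored associations can later be read out and added to attention without touching the frozen backbone weights.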

Hacker News Comment Review

  • Commenters are skeptical about real-world utility; the dominant ask is production evidence, especially for coding agents, not benchmark scores.
  • One commenter raised a separate angle: sharing memory across users to save energy. The paper does not address this, and the suggestion may reflect confusion about the mechanism, since the state is a per-session cache rather than a shared store.
