DeepSeek-V4-Flash is the first local model strong enough to make LLM activation steering practically worth trying for independent engineers.
Key Takeaways
- Steering manipulates model activations mid-inference using a "steering vector" derived by diffing activations across prompt pairs with and without a target concept.
- Anthropic uses sparse autoencoders for more sophisticated feature extraction, but primarily for interpretability and safety, not capability tuning.
- Most basic steering use cases are outcompeted by prompt engineering; steering only wins for concepts that resist prompting (like "intelligence") or that would otherwise require massive token context.
- antirez's DwarfStar 4 strips llama.cpp down to run only DeepSeek-V4-Flash and makes steering a first-class feature, though it is currently limited to toy examples like verbosity control.
- The author is skeptical that ambitious goals ("knows my codebase" or "intelligence" vectors) will work without reducing to full fine-tuning, and expects the open-source community to settle the question within six months.
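The diffing recipe in the first takeaway can be sketched in a few lines. This is a minimal numpy toy, not DwarfStar's implementation: the `fake_activations` function here simulates a layer's hidden states, whereas in a real setup the activations would come from forward hooks on a model's residual stream at a chosen layer.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN_DIM = 16  # toy stand-in for a transformer layer's hidden size

def fake_activations(prompts, concept_strength):
    """Simulate layer activations; the 'concept' shifts them along one axis.

    In practice these would be captured with a forward hook on a real model.
    """
    concept_axis = np.zeros(HIDDEN_DIM)
    concept_axis[0] = 1.0
    base = rng.normal(size=(len(prompts), HIDDEN_DIM))
    return base + concept_strength * concept_axis

# 1. Collect activations for paired prompts with and without the target concept.
with_concept = fake_activations(["Be extremely verbose."] * 32, concept_strength=3.0)
without_concept = fake_activations(["Answer normally."] * 32, concept_strength=0.0)

# 2. The steering vector is the mean difference of the two activation sets.
steering_vector = with_concept.mean(axis=0) - without_concept.mean(axis=0)

# 3. At inference time, add the (scaled) vector to the layer's hidden state.
def steer(hidden_state, vector, scale=1.0):
    return hidden_state + scale * vector

h = rng.normal(size=HIDDEN_DIM)
h_steered = steer(h, steering_vector, scale=2.0)
```

The `scale` knob is where the practical fiddliness lives: too small and the effect vanishes into prompt-level noise, too large and generation degrades, which is part of why prompting often wins for simple concepts.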