DeepSeek V4 -- almost on the frontier, a fraction of the price

TLDR

  • DeepSeek releases V4-Pro (1.6T params, 49B active) and V4-Flash (284B params, 13B active) under the MIT license, priced far below frontier competitors.

Key Takeaways

  • V4-Pro at $1.74/$3.48 per million input/output tokens undercuts Gemini 3.1 Pro ($2/$12) and GPT-5.4 ($2.50/$15); V4-Flash at $0.14/$0.28 undercuts GPT-5.4 Nano.
  • Efficiency gains explain the pricing: at 1M-token context, V4-Pro needs only 27% of V3.2’s per-token FLOPs and 10% of its KV cache, via new HCA and mCH attention methods.
  • V4-Pro is now the largest open-weights MoE at 1.6T total parameters, surpassing Kimi K2.6 (1.1T) and DeepSeek V3.2 (685B); weights are 865GB on Hugging Face.
  • DeepSeek’s own benchmarks place V4-Pro roughly 3-6 months behind GPT-5.4 and Gemini 3.1 Pro on reasoning; V4-Pro-Max with extended reasoning tokens closes some of that gap.
  • Both models support 1M-token context; a quantized Flash may run on a 128GB M5 MacBook Pro, and Pro may be streamable from disk.
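
The price and size claims above reduce to simple arithmetic. The sketch below works through it using only the figures quoted in the takeaways. The 200k/20k-token workload is a hypothetical example, the price pairs are read here as input/output rates, and GB is taken to mean 10^9 bytes; none of this accounts for discounts or caching.

```python
# List prices in USD per million tokens, as (input, output), copied from the
# takeaways above. Reading each pair as input/output is an assumption.
PRICES = {
    "V4-Pro": (1.74, 3.48),
    "V4-Flash": (0.14, 0.28),
    "Gemini 3.1 Pro": (2.00, 12.00),
    "GPT-5.4": (2.50, 15.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at list price, ignoring discounts and caching."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def bits_per_param(size_gb: float, params: float) -> float:
    """Average bits stored per parameter, assuming decimal GB (1e9 bytes)."""
    return size_gb * 1e9 * 8 / params


if __name__ == "__main__":
    # Hypothetical workload: a 200k-token prompt with a 20k-token response.
    for model in PRICES:
        print(f"{model:>15}: ${request_cost(model, 200_000, 20_000):.4f}")

    # 865GB of weights spread over 1.6T parameters.
    print(f"V4-Pro checkpoint: {bits_per_param(865, 1.6e12):.2f} bits/param")
    # Upper bound on bits/param for 284B parameters to fit in 128GB at all,
    # before leaving any room for the KV cache, activations, or the OS.
    print(f"V4-Flash in 128GB: {bits_per_param(128, 284e9):.2f} bits/param max")
```

On this made-up workload V4-Pro costs roughly half of what GPT-5.4 does at list price, and the gap widens as the output share grows, because the output rates differ by more than 4x while the input rates differ by about 1.4x. The 128GB bound works out to about 3.6 bits per parameter, which is consistent with only a quantized Flash being a candidate for that machine.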

Hacker News Comment Review

  • Commenters broadly confirm the quality and cost advantages in real coding workflows, with Flash handling routine refactors cheaply and Pro doing architectural planning; cost per task is dramatically lower than with frontier alternatives.
  • A practical caveat concerns true effective cost: V4-Pro and Kimi K2.6 can emit long reasoning-token streams, which erodes the headline price advantage on short or moderate-context tasks; official API discounts also complicate direct comparisons.
  • Data privacy is a recurring concern: DeepSeek’s official API makes no data privacy guarantees, and requests to Chinese-hosted models via OpenRouter send data to those hosts; commenters note the issue gets less scrutiny than equivalent controversies involving Western providers.
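
The effective-cost caveat can be made concrete with a small calculation. This is a minimal sketch that assumes reasoning tokens are billed at the output rate, which the summary does not state. The task size and the rival's reasoning-token count are hypothetical, and the function names are illustrative only.

```python
# USD per million tokens as (input, output), from the takeaways above.
V4_PRO = (1.74, 3.48)
GPT_5_4 = (2.50, 15.00)


def effective_cost(price, input_tokens, answer_tokens, reasoning_tokens):
    """USD cost when reasoning tokens are billed at the output rate (assumed)."""
    price_in, price_out = price
    billed_output = answer_tokens + reasoning_tokens
    return (input_tokens * price_in + billed_output * price_out) / 1_000_000


def break_even_reasoning(cheap, target_cost, input_tokens, answer_tokens):
    """Reasoning tokens the cheaper model can emit before costing target_cost."""
    price_in, price_out = cheap
    output_budget = (target_cost * 1_000_000 - input_tokens * price_in) / price_out
    return output_budget - answer_tokens


if __name__ == "__main__":
    # Hypothetical short task: 2k-token prompt, 500-token answer.
    # The rival is assumed to spend 1,000 reasoning tokens on it.
    rival = effective_cost(GPT_5_4, 2_000, 500, 1_000)
    limit = break_even_reasoning(V4_PRO, rival, 2_000, 500)
    print(f"rival cost: ${rival:.4f}")
    print(f"V4-Pro matches that cost at about {limit:,.0f} reasoning tokens")
```

Under these assumptions V4-Pro would need to emit several times the rival's reasoning tokens before the list-price advantage disappears, so how much the caveat matters depends on how verbose the model actually is on a given task.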

Notable Comments

  • @antirez: live demo of V4-Flash running locally on a 128GB MacBook, confirming the local-inference path is already viable.
  • @rurban: asked V4-Pro to produce a working arm64 compiler port and got one in 30 minutes, a job previously impractical at frontier pricing.
