Poolside releases Laguna M.1 (225B-A23B MoE) and open-weight Laguna XS.2 (33B-A3B MoE, Apache 2.0), both agentic coding models trained from scratch for long-horizon tasks.
Key Takeaways
Laguna XS.2 (33B total, 3B activated) hits 44.5% SWE-bench Pro and 30.1% Terminal-Bench 2.0; weights downloadable now under Apache 2.0, NVFP4 variant included for Blackwell hardware.
Laguna M.1 (225B-A23B, 30T tokens, 6,144 Hopper GPUs) scores 46.9% SWE-bench Pro and 40.7% Terminal-Bench 2.0; closed API only, free during research preview.
Alongside the model weights, Poolside is shipping the same ACP (Agent Client Protocol) server harness it uses internally for agent RL training and evaluation.
Both models were trained entirely in-house using the Titan codebase, Muon optimizer, and async on-policy RL; synthetic data makes up ~13% of the XS.2 pre-training mix (4.4T+ synthetic tokens across the family).
AutoMixer framework runs ~60 proxy models per sweep to optimize data mixture proportions, targeting code, math, STEM, and common sense tradeoffs without manual heuristics.
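The summary does not describe how AutoMixer works internally, so the following is only a minimal sketch of the general idea of a proxy-model mixture sweep: sample candidate data mixtures, score each with a cheap proxy, and keep the best. The function names, the random-search strategy, and the toy `proxy_score` objective are all assumptions made for illustration; a real sweep would train and evaluate a small model where `proxy_score` is called.

```python
import random

# Domains named in the summary; the keys themselves are illustrative.
DOMAINS = ["code", "math", "stem", "common_sense"]


def sample_mixture(rng):
    """Draw one mixture uniformly from the probability simplex (Dirichlet(1, ..., 1))."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in DOMAINS]
    total = sum(draws)
    return {domain: x / total for domain, x in zip(DOMAINS, draws)}


def proxy_score(mixture):
    """Hypothetical stand-in for 'train a small proxy model on this mixture and evaluate it'.

    The target proportions below are made up so the example has an optimum;
    they are not Poolside's actual data mix.
    """
    target = {"code": 0.5, "math": 0.2, "stem": 0.2, "common_sense": 0.1}
    return -sum((mixture[d] - target[d]) ** 2 for d in DOMAINS)


def run_sweep(n_proxies=60, seed=0):
    """Score n_proxies candidate mixtures and return the best (mixture, score) pair."""
    rng = random.Random(seed)
    candidates = [sample_mixture(rng) for _ in range(n_proxies)]
    scored = [(proxy_score(m), m) for m in candidates]
    best_score, best_mixture = max(scored, key=lambda pair: pair[0])
    return best_mixture, best_score
```

Calling `run_sweep()` returns a dictionary of per-domain proportions that sum to 1, plus its toy score. Plain random search is used here only because it is the simplest thing that fits the "many proxies per sweep" description; the actual framework may well fit a predictive model over the proxy results instead.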
Hacker News Comment Review
Commenters broadly noted that Qwen3.6 35B-A3B beats Laguna XS.2 on Terminal-Bench 2.0 (51.5 vs 30.1) and also clearly outscores the much larger M.1 (225B-A23B, 40.7), raising questions about what Poolside’s model training investment actually buys over frontier open-weight competitors.
The decision to co-release the ACP harness was treated as the genuinely differentiated move: it’s the same runtime exercised in production RL rather than a demo wrapper, which is rare among lab releases.
Early testers reported fast inference and strong ACP spec adherence via the “pool” agent in Zed, though benchmark-vs-real-use gaps remain an open question given the Terminal-Bench numbers.
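The size of the gaps the commenters point to can be checked with a few lines of arithmetic. The sketch below uses only the Terminal-Bench 2.0 scores and parameter counts quoted in this summary; the helper names `gap` and `active_fraction` are illustrative.

```python
# Terminal-Bench 2.0 scores (percent) as quoted in this summary.
TERMINAL_BENCH = {
    "Qwen3.6 35B-A3B": 51.5,
    "Laguna M.1": 40.7,
    "Laguna XS.2": 30.1,
}

# (total, activated) parameters in billions, read off the model descriptions.
PARAMS_B = {
    "Laguna M.1": (225, 23),
    "Laguna XS.2": (33, 3),
}


def gap(model_a, model_b):
    """Percentage-point lead of model_a over model_b on Terminal-Bench 2.0."""
    return round(TERMINAL_BENCH[model_a] - TERMINAL_BENCH[model_b], 1)


def active_fraction(model):
    """Share of a MoE model's parameters that are activated per token."""
    total, active = PARAMS_B[model]
    return active / total
```

With these numbers, `gap("Qwen3.6 35B-A3B", "Laguna XS.2")` is 21.4 points and `gap("Qwen3.6 35B-A3B", "Laguna M.1")` is 10.8 points. Both Laguna models activate roughly a tenth of their parameters per token (3 of 33B for XS.2, 23 of 225B for M.1).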
Notable Comments
@vijgaurav: “Most labs dump the model and make you figure out the agent layer yourself” – shipping the RL-exercised harness is the distinguishing detail.
@jaen: Side-by-side Terminal-Bench 2.0 comparison shows Qwen3.6 35B-A3B at 51.5 vs XS.2 at 30.1, a gap that persists even against M.1 at 40.7.