GitHub Copilot’s June 2026 shift to usage-based billing confirms flat-rate AI subscriptions were never economically viable against variable token costs.
Key Takeaways
Microsoft lost $20+/month per Copilot user on a $10 subscription; some individual users cost the company $80/month (WSJ, Oct 2023).
One Copilot “premium request” consumes roughly $11 worth of tokens: a ~60k-token context window, tool calls, and multiple internal model turns.
Newer reasoning models increase per-task token burn, so total inference costs rose even as per-token list prices declined.
Anthropic allowed users to burn ~$8 in compute per $1 of subscription revenue; OpenAI’s ratio is similarly skewed.
AI labs deliberately obscured usage behind “tokens,” “messages,” and percentage gauges, preventing users from modeling their own costs before pricing structures shifted.
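The per-request figure above can be sanity-checked with a rough token-cost model. The prices and token counts below are illustrative assumptions (frontier-tier list prices, context re-sent on every internal turn), not the article's exact inputs:

```python
# Rough cost model for one "premium request" that makes several internal
# model turns over a large context. All prices and counts are assumptions
# for illustration; real provider pricing and caching behavior vary.

INPUT_PRICE_PER_MTOK = 15.00   # assumed $/1M input tokens
OUTPUT_PRICE_PER_MTOK = 75.00  # assumed $/1M output tokens

def request_cost(context_tokens: int, output_tokens_per_turn: int, turns: int) -> float:
    """Cost of a request that re-sends the full context on each internal turn."""
    input_cost = turns * context_tokens * INPUT_PRICE_PER_MTOK / 1_000_000
    output_cost = turns * output_tokens_per_turn * OUTPUT_PRICE_PER_MTOK / 1_000_000
    return input_cost + output_cost

# A 60k-token context, ~1k output tokens per turn, and 10 internal turns
# (tool calls plus reasoning) lands near the article's ~$11 figure.
print(f"${request_cost(60_000, 1_000, 10):.2f}")  # prints "$9.75"
```

Most of the cost is input-side: re-sending a 60k-token context ten times dominates the bill, which is why agentic multi-turn workflows burn tokens so much faster than single completions.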
Hacker News Comment Review
The article’s core subsidy claim is contested: commenters cite ~80% gross margins at frontier API providers, and point to Kimi K2.6 serving profitably at $4 per 1M tokens, as evidence that token-level economics are not underwater. On that view, the subsidy lives in flat-rate packaging, not in inference itself.
The variable-cost framing is partially challenged: GPU data-center costs are mostly fixed (hardware amortization dominates), so utilization-driven electricity variability is marginal. The subscription mismatch is real, but the cost mechanism the article describes is imprecise.
Usage-based billing introduces a cost-overrun anxiety that flat-rate plans deliberately removed; commenters note this changes user behavior and product perception as much as it changes unit economics.
Notable Comments
@joshjob42: Kimi K2.6 is profitable at $4 per 1M tokens and roughly Sonnet-level in quality; frontier labs are not subsidizing tokens, they are subsidizing flat-rate plan packaging.
@Glyptodon: At $7k+/year per engineer in token spend, buying high-RAM GPU workstations to run local models starts to undercut cloud pricing over a three-year horizon.
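The break-even claim in the last comment reduces to simple amortization arithmetic. The workstation price and power cost below are illustrative assumptions; only the ~$7k/year token spend comes from the comment:

```python
# Break-even sketch: local GPU workstation vs. cloud token spend.
# Hardware and power figures are illustrative assumptions, not quotes.

WORKSTATION_COST = 12_000.0     # assumed one-time cost of a high-RAM GPU box
POWER_COST_PER_YEAR = 1_000.0   # assumed electricity at heavy utilization
CLOUD_SPEND_PER_YEAR = 7_000.0  # per-engineer token spend cited in the comment

def breakeven_years(hw: float, power_per_year: float, cloud_per_year: float) -> float:
    """Years until cumulative cloud spend exceeds local hardware plus power."""
    return hw / (cloud_per_year - power_per_year)

years = breakeven_years(WORKSTATION_COST, POWER_COST_PER_YEAR, CLOUD_SPEND_PER_YEAR)
print(f"break-even after {years:.1f} years")  # prints "break-even after 2.0 years"
```

Under these assumptions the hardware pays for itself in two years, comfortably inside the three-year horizon the comment cites; the sketch ignores model-quality gaps and maintenance, which cut the other way.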