AI pricing war erupts as OpenAI and DeepSeek take opposite paths
In 24 hours last week, OpenAI and DeepSeek made opposite bets on what frontier AI is worth. One says it is a closed product that just got more expensive. The other says it is open infrastructure that just got dramatically cheaper. The price gap between the two ends of the market is now wider than it has been in years, and the comfortable middle that most coding agents have been routing through is thinning out.
Until last week, you could pick a model on a fairly smooth price-performance curve. There was a top tier, a middle tier, and a budget tier, and most workloads found a comfortable home somewhere on the slope. That curve still exists, but it has stretched. What used to be a continuous gradient now looks more like two clusters with a gap in between, and developers building agents, coding assistants, and high-volume inference pipelines now have to think harder about which side to route to.
The 24-hour split
On April 23, OpenAI shipped GPT-5.5, priced at $5 per million input tokens and $30 per million output tokens, exactly double the GPT-5.4 rate of $2.50 and $15. The model has a 1M-token context window and scores 82.7% on Terminal-Bench 2.0, up from 75.1% for GPT-5.4. OpenAI argues that the price hike is offset by token efficiency, claiming that GPT-5.5 uses fewer tokens to complete the same Codex task. The company has not published a precise effective-cost figure on its launch page, so the per-task economics depend on the workload.
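A quick sanity check on that claim, using only the published per-token rates: at exactly double the price, GPT-5.5 has to finish the same task in half the tokens just to break even. The task token counts below are invented placeholders, not OpenAI figures.

```python
# Back-of-the-envelope: how much token efficiency GPT-5.5 needs
# to match GPT-5.4's per-task cost. The per-1M-token rates are
# the published prices; the task token counts are placeholders.

def task_cost(in_tokens: int, out_tokens: int, in_rate: float, out_rate: float) -> float:
    """Cost in dollars for one task at per-1M-token rates."""
    return in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate

# Hypothetical Codex task: 200k input tokens, 40k output tokens on GPT-5.4.
old = task_cost(200_000, 40_000, in_rate=2.50, out_rate=15.00)           # $1.10

# At double the rates, GPT-5.5 breaks even only if it uses half the tokens.
new_parity = task_cost(100_000, 20_000, in_rate=5.00, out_rate=30.00)    # $1.10

print(f"GPT-5.4 task: ${old:.2f}  GPT-5.5 at 50% tokens: ${new_parity:.2f}")
```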
On April 24, DeepSeek released V4-Pro and V4-Flash. V4-Pro is listed at $1.74 per million input tokens and $3.48 per million output tokens, with launch-discount pricing in effect through May 5, 2026. V4-Flash is priced at $0.14 input and $0.28 output. Both ship under the MIT license with full open weights on Hugging Face, and both default to a 1M-token context window. V4-Pro hits 80.6% on SWE-bench Verified per the model card, within striking distance of Claude Opus 4.6.
The widening gap for AI costs
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context |
| --- | --- | --- | --- |
| OpenAI GPT-5.5 | $5.00 | $30.00 | 1M tokens |
| Anthropic Opus 4.7 | $5.00 | $25.00 | 1M tokens |
| DeepSeek V4-Pro | $1.74 | $3.48 | 1M tokens |
| DeepSeek V4-Flash | $0.14 | $0.28 | 1M tokens |
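For readers who want to plug in their own traffic, here is a small helper that prices a monthly workload against the table above. Only the per-token rates come from the published pricing; the example volumes are placeholders.

```python
# Price a monthly workload against the published per-1M-token rates.
# The volumes below are illustrative placeholders, not real traffic.

PRICES = {  # (input $/1M tokens, output $/1M tokens)
    "gpt-5.5":  (5.00, 30.00),
    "opus-4.7": (5.00, 25.00),
    "v4-pro":   (1.74, 3.48),
    "v4-flash": (0.14, 0.28),
}

def monthly_cost(model: str, in_tok: float, out_tok: float) -> float:
    in_rate, out_rate = PRICES[model]
    return in_tok / 1e6 * in_rate + out_tok / 1e6 * out_rate

# Example: 2B input tokens and 300M output tokens per month.
for model in PRICES:
    print(f"{model:10s} ${monthly_cost(model, 2e9, 3e8):>10,.0f}/mo")
```

At that example volume, the spread runs from roughly $19,000 a month on GPT-5.5 to under $400 on V4-Flash, which is the gap the rest of this piece is about.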
What OpenAI is actually selling
GPT-5.5 is not just a smarter model. It is the centerpiece of a stack. Codex inherits the upgrade with expanded computer use, browser interaction, and longer agentic runs. In ChatGPT, it is the default model for the Plus, Pro, Business, and Enterprise tiers. The API gets it with the same 1M-token context window as the consumer surface.
The bet is that intelligence, the serving stack, the agent harness, and computer use are one product, and that product is worth twice the per-token price of the previous generation. Greg Brockman framed it during the launch briefing as a model that takes a sequence of actions, uses tools, checks its own work, and keeps going until a task is finished. The customer is the enterprise that wants the whole thing from a single vendor, with a single API key, a single safety review, and a single billing line. OpenAI is not selling tokens. It is selling outcomes, and outcomes are now priced accordingly.
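In code, the single-vendor pitch is one client, one key, one model id. A minimal sketch with the openai Python client follows; the `gpt-5.5` identifier is assumed from the launch naming, and the agentic tool wiring is elided.

```python
# Minimal sketch of the single-vendor path: one client, one key,
# one model id. The model name follows the launch naming; check
# the API docs for the exact identifier before relying on it.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-5.5",  # assumed id per the launch announcement
    messages=[
        {"role": "system", "content": "You are a coding agent."},
        {"role": "user", "content": "Refactor utils.py to remove dead code."},
    ],
)
print(resp.choices[0].message.content)
```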
What DeepSeek is actually shipping
V4 is not a price war move. The pricing is downstream of three different decisions.
The first is architectural. V4-Pro is a Mixture-of-Experts model with 1.6 trillion total parameters and 49 billion active per token. V4-Flash runs 284 billion total with 13 billion active. DeepSeek's model card describes a hybrid attention scheme that combines compressed sparse attention with heavily compressed attention, designed to reduce FLOPs and KV-cache size at the 1M-token context length. The model achieves near-frontier benchmark scores while activating a small fraction of its weights per token. Smarter architecture, less compute.
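The parameter counts make the efficiency argument concrete. A rough sketch using the common approximation that a transformer forward pass costs about two FLOPs per active parameter per token; that rule of thumb is a general estimate, not a DeepSeek figure.

```python
# Rough MoE arithmetic from the published parameter counts.
# The ~2 FLOPs per active parameter per token rule is a common
# approximation for transformer forward passes, not a DeepSeek number.

models = {
    "V4-Pro":   {"total": 1.6e12, "active": 49e9},
    "V4-Flash": {"total": 284e9,  "active": 13e9},
}

for name, p in models.items():
    frac = p["active"] / p["total"]
    flops_per_token = 2 * p["active"]  # forward pass, approximate
    print(f"{name}: {frac:.1%} of weights active, "
          f"~{flops_per_token / 1e9:.0f} GFLOPs per decoded token")
```

V4-Pro activates about 3% of its weights per token, V4-Flash under 5%, which is where the pricing headroom comes from.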
The second is distribution. The MIT license is the most permissive open-source license available. Anyone can download the weights, host them, fine-tune them, embed them in a product, and ship that product commercially. V4-Flash at 13B active parameters runs on a multi-GPU cluster that mid-size teams can afford. V4-Pro requires more serious infrastructure, but the option exists. DeepSeek is betting that frontier intelligence becomes infrastructure the way Linux did, and that the lab releasing the weights captures the ecosystem rather than the runtime margin.
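Self-hosting under an MIT license is a download rather than a contract. A minimal sketch with the Hugging Face transformers library; the repository id is a guess at DeepSeek's naming convention, and even the Flash variant needs multiple GPUs in practice.

```python
# Sketch of self-hosting the open weights via transformers.
# The repo id is a guess at DeepSeek's naming; check the actual
# Hugging Face listing. device_map="auto" shards the model
# across whatever GPUs are visible.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "deepseek-ai/DeepSeek-V4-Flash"  # hypothetical repo id

tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype="auto",      # use the checkpoint's native precision
    device_map="auto",       # shard across available GPUs
    trust_remote_code=True,  # DeepSeek models typically ship custom code
)

inputs = tok("def quicksort(xs):", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```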
The third is hardware. On the same day, Huawei announced that its Ascend supernodes offer full support for V4 inference. Reuters reported that V4 was adapted for Huawei's most advanced Ascend AI chips and that Huawei said its chips were used for part of V4-Flash's training.
The middle is thinning, not gone
Before last week, a developer building a coding agent had a comfortable middle option. GPT-5.4 at $2.50 and $15 sat in a sweet spot. Cheap enough to scale, smart enough for most agentic work, hosted by a vendor everyone trusts. That tier is still on the price list, but it is no longer the flagship, and the new flagship costs twice as much.
GPT-5.5 took the upper slot at $5 and $30. V4-Pro took the lower slot at one-ninth of GPT-5.5 on output, before any discount. V4-Flash sits another order of magnitude below that. Anthropic's Opus 4.7 at roughly $5 input and $25 output sits next to GPT-5.5 in the premium tier, not in the gap between premium and open-weight.
For developers, the choice is no longer about picking a point on a smooth curve. The choice is which economics to route to for which task: pay for the integrated product or run the open infrastructure. Many production stacks will end up routing across both, because the price gap is now wide enough to justify the engineering cost of routing logic.
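In its simplest form, that routing logic is a tiered dispatch on estimated task difficulty. The heuristic and thresholds below are illustrative only; production routers usually rely on a learned classifier or historical success rates rather than a rule this crude.

```python
# Crude tiered router across the two economic clusters. The
# difficulty heuristic and thresholds are illustrative only.

def estimate_difficulty(task: str) -> float:
    """Toy heuristic: long or structural tasks score as harder (0..1)."""
    score = min(len(task) / 2000, 1.0)
    if any(k in task for k in ("refactor", "migrate", "architecture")):
        score = max(score, 0.8)
    return score

def route(task: str) -> str:
    d = estimate_difficulty(task)
    if d < 0.3:
        return "v4-flash"  # bulk, low-stakes work
    if d < 0.8:
        return "v4-pro"    # most agentic coding
    return "gpt-5.5"       # hard tasks where the premium tier earns its price

print(route("rename a variable"))                      # -> v4-flash
print(route("migrate the auth service to OAuth2"))     # -> gpt-5.5
```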
What's next
The cost frontier no longer behaves like a smooth curve. It is two clusters of economics with a stretched gap in the middle, and the gap is not going to close on its own in the near term. OpenAI will continue to release fast and price up, because the integrated product is the moat. DeepSeek will continue to release open weights and price down, because the commodity infrastructure thesis depends on adoption. Both can be right for different workloads, and the same agent can route between both within a single task.