Por amor al código: code snippets and reflections on technology

Something I wrote today about the recent releases and the IPOs, and the endless posts by people predicting that the bubble is going to burst, this time for sure.

I think we are getting to the top of the S-curve, but not because tokens are too expensive and the models are brute-forcing their way through. I think training is getting too expensive, especially because these companies plan to go public and need the numbers to look good for investors (the S&P 500 already blocked SpaceX, and OpenAI and Anthropic will follow by the same reasoning).

I think the piece is wrong, though, about the price of tokens. This is the usual argument from those who use only Anthropic and extrapolate their own use (or abuse) to all users… But the analysis gets several points wrong:

  • The cost per task doesn’t have to go up so dramatically if you maintain the quality of the code: keep it modular, add tests, refactor often, etc. You need a human in the loop for this, but the reality is that you need a human in the loop for many reasons. Using the agents naively to brute-force their way through will only take you so far.

  • The cost of the same task (“cost of intelligence”) is going down with every model release, and more so if you look at open source models. I showed last month that a task you do with Opus 4.7 can be done with GLM5 at one tenth of the cost, and I didn’t even try DeepseekV4 or Qwen3.5 Max Plus, which are even cheaper.

  • Not all users use the models the same way. Some heavy users burn through $5,000 worth of usage on the $100 subscription, but I bet there are many more who don’t use up their whole budget.
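To make the difference between cost per token and cost per task concrete, here is a rough sketch of the arithmetic. Every number in it is made up for illustration: the model names, the per-million-token prices and the token counts are assumptions, not real price lists or measurements. It only shows that a model priced at one tenth per token can stay much cheaper per task even if it needs twice the tokens to finish.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Hypothetical per-million-token prices in USD (not real price lists)."""
    name: str
    input_per_mtok: float
    output_per_mtok: float


def task_cost(pricing: ModelPricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one task, given the tokens it consumed."""
    total = (
        input_tokens * pricing.input_per_mtok
        + output_tokens * pricing.output_per_mtok
    )
    return total / 1_000_000


# Illustrative numbers only: the second model is priced at one tenth of the first.
frontier = ModelPricing("frontier-model", input_per_mtok=10.0, output_per_mtok=50.0)
open_weights = ModelPricing("open-weights-model", input_per_mtok=1.0, output_per_mtok=5.0)

# Assumption: the cheaper model needs twice the tokens to finish the same task.
frontier_cost = task_cost(frontier, input_tokens=200_000, output_tokens=20_000)
open_cost = task_cost(open_weights, input_tokens=400_000, output_tokens=40_000)

print(f"{frontier.name}: ${frontier_cost:.2f} per task")
print(f"{open_weights.name}: ${open_cost:.2f} per task")
print(f"cheaper model costs {open_cost / frontier_cost:.0%} of the frontier model")
```

With these invented inputs the cheaper model still comes out at a fraction of the frontier cost per task, which is the point of the second bullet: what matters is tokens used times price per token, not either one alone.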

As for the claim that the cost of intelligence is going down, we have some numbers. Artificial Analysis publishes a chart of the cost of running its benchmarks with each model. The chart shows that GPT5.5 medium costs less than 1/3 as much as GPT5.4 xhigh while both get the same score, and Opus 4.8 is cheaper than Opus 4.7 with a better score…

Artificial Analysis Chart 1

Artificial Analysis Chart 2

Minimax M3 is cheaper still, and in my personal experience you can do many coding tasks with DeepseekV4 or Minimax 2.5, which cost even less.

So, to me the analysis is completely backwards: cost per token is going down, tokens will soon be a commodity, and margins will shrink for the big labs, except on frontier models while they keep an edge. Inference is profitable even for Anthropic, and many cloud providers like Amazon or Azure count on that. The numbers don’t add up right now because of the huge investment in more datacenters, made because demand is so high…

The problem is the cost of training, which also goes up exponentially with each generation and is difficult to justify to investors (unless they are AI-pilled) when you could just “pause training”, put all that compute into inference and make even more money. Hence the post from Anthropic asking to “pause” training. The cost of research and security is the main reason for the battles inside OpenAI and arguably for the creation of Anthropic.

Even if OpenAI and Anthropic crash and burn tomorrow, or everyone stops training because it is no longer worth it, the open source models we already have are enough to keep using agents and dramatically change the industry.