AI's False Efficiency Curve: How To Save And Protect Your Margins
The popular narrative around AI economics is changing. Moore's Law conditioned us to expect that smarter, faster computing would steadily get cheaper. At the unit level, that expectation holds for AI: per-token costs are indeed declining. But the number of tokens consumed per task is growing exponentially, so total costs spike. The tension here is important: on paper, inference is getting cheaper, yet in practice, total spend keeps rising.
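The arithmetic behind that tension can be sketched in a few lines. All numbers below are illustrative assumptions, not real pricing: a per-token price that falls 30% a year against per-task token consumption that triples a year.

```python
# Toy model of the false efficiency curve (hypothetical numbers).
price_per_million = 10.0   # dollars per 1M tokens, year 0 (illustrative)
tokens_per_task = 50_000   # tokens a typical task consumes, year 0 (illustrative)

for year in range(4):
    cost_per_task = price_per_million / 1_000_000 * tokens_per_task
    print(f"Year {year}: ${price_per_million:.2f}/1M tokens, "
          f"{tokens_per_task:,.0f} tokens/task -> ${cost_per_task:.2f}/task")
    price_per_million *= 0.70   # unit cost drops 30% per year
    tokens_per_task *= 3        # consumption per task triples per year
```

Even though the unit price falls every year, the cost per task climbs from $0.50 to over $4.60 by year three, because consumption growth outpaces the price decline.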