Outsourcing plus LocalAI will soon become more economical vs. Frontier labs
We keep hearing that the inference costs are supposed to be on a downward trajectory but they are evidently not, not for the frontier US labs anyways. GPT 5.5 ($5/$30) that released less than 2 months after GPT-5.4 doubled the API pricing across the board. GPT 5.5 costs over 3x of what GPT-5 cost 8 months ago ($1.25/$10). Gemini 3.5 Flash ($1.50/$9.00) tripled the API pricing across the board over its predecessor Gemini-3-flash-preview ($0.50/$3.00) which was already price-hiked from its predecessor 2.5 Flash (0.30/$2.50)
"When discussing LLM pricing, people are missing the plot. The subscription token price is 10x-40x cheaper than API pricing. Your 90$ Claude subscriptions give you close to $1000 to $4000 in equivalent API token pricing. The second issue is that the quality of the model “operator”"
By: Chyzwar