
AI is eating your cloud budget

AI workloads demand a fundamentally different approach to cloud economics.
Written by Patrick Jamal
Published on March 28, 2026 · Last updated on April 1, 2026

Cloud providers will spend nearly $600 billion on AI infrastructure this year, a 40% jump from 2025, according to S&P Global. Enterprises are the biggest spenders, and most have no idea what their final number will be: 80% of companies miss their AI infrastructure cost forecasts by more than 25%.

And it's getting worse, not better.

The unsustainable burden on FinOps teams

Leadership wants AI. The budget has to come from somewhere. So they tell FinOps to find savings in existing cloud spend and redirect it toward AI initiatives.

Ninety-eight percent of FinOps teams now manage AI spend, up from 31% just two years ago. But headcount hasn't grown to match.

The same 8–10 person teams are now responsible for public cloud, AI spend, SaaS, software licensing, private cloud, and data center costs.

Traditional cloud optimization required deep knowledge of one cost domain. AI adds token billing, model selection, accelerator utilization, output variance, and rapid vendor changes, all moving at once. You can't layer five new responsibilities onto the same team and expect the same quality of governance.

AI cost management is the number-one skill gap FinOps teams report needing to fill. And the mandate to "find savings to fund AI" creates a vicious cycle: pressure to cut cloud costs fast leads to blunt-instrument approaches (shutting down environments, restricting usage) rather than strategic optimization that actually lasts.

Where AI costs really hide in the modern enterprise

The most visible AI expense is token pricing for LLM APIs, but it's actually the smallest part of the problem. Inference costs have dropped 9× to 900× per year, depending on the model class. GPT-4-class capabilities that cost $60 per million input tokens in 2023 now run at $2.50 to $5.00 per million. But cheaper tokens don't mean cheaper AI.
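One reason, before hidden costs even enter the picture: consumption tends to grow faster than unit prices fall. A back-of-the-envelope sketch makes the point (the per-million prices come from the figures above; the token volumes are invented for illustration):

```python
# Hypothetical illustration: falling token prices vs. growing usage.
# Prices are from the article; the monthly token volumes are assumptions.

PRICE_2023 = 60.00  # $ per million input tokens (GPT-4-class, 2023)
PRICE_2026 = 3.75   # $ per million input tokens (midpoint of $2.50-$5.00)

tokens_2023 = 50e6     # 50M tokens/month: a small pilot
tokens_2026 = 5_000e6  # 5B tokens/month: AI features in production

spend_2023 = tokens_2023 / 1e6 * PRICE_2023
spend_2026 = tokens_2026 / 1e6 * PRICE_2026

print(f"2023 pilot spend:      ${spend_2023:,.0f}/month")   # $3,000
print(f"2026 production spend: ${spend_2026:,.0f}/month")   # $18,750
# A 16x price drop, more than wiped out by a 100x jump in consumption.
```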

Data platforms are the top surprise

According to the 2025 State of AI Cost Management report from Benchmarkit and Mavvrik, data platforms are the number-one source of unexpected AI spend, cited by 56% of enterprises. The data preparation, feature engineering, vector databases, and storage layers that feed AI models cost more than the models themselves.

Network and egress add up fast

AI workloads are data-hungry. Training pipelines move massive datasets between storage and compute. Inference serving requires low-latency responses distributed across regions. Every data movement incurs egress charges.

On hyperscalers, egress starts at $0.08–0.09 per GB and compounds quickly. Move 10 TB of training data and you're looking at $900 in transfer fees alone, before a single model runs.
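A minimal sketch of that math (flat-rate assumption; real egress pricing is tiered and varies by provider and destination):

```python
# Back-of-the-envelope egress cost at the hyperscaler list rates above.

EGRESS_PER_GB = 0.09  # $ per GB, upper end of the $0.08-0.09 range

def egress_cost(terabytes: float, rate_per_gb: float = EGRESS_PER_GB) -> float:
    """Transfer fee for moving `terabytes` out of a region (1 TB = 1,000 GB)."""
    return terabytes * 1_000 * rate_per_gb

print(f"${egress_cost(10):,.0f}")  # $900 for a single 10 TB training-data move
```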

GPU pricing is a starting point, not a total

Storage I/O bottlenecks force teams to over-provision compute. Inter-region transfers for distributed training add bandwidth charges. NAT gateways, API gateways, and logging accumulate. The gap between GPU list price and the actual monthly bill is routinely 50% to 100% on hyperscalers.
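Here's a hypothetical monthly bill that illustrates the gap. Every figure below is an assumption for the sake of the sketch, not a quoted price:

```python
# Illustrative monthly bill for one cloud GPU instance. The GPU rate is a
# placeholder; the surrounding line items are assumptions meant to show how
# the 50-100% gap between list price and actual bill accumulates.

gpu_list_price = 2.50 * 730  # $/hr x ~730 hrs/month = $1,825

extras = {
    "block + object storage":          250,
    "storage I/O / provisioned IOPS":  180,
    "inter-region transfer":           400,
    "NAT + API gateways":              150,
    "logging + monitoring":            120,
}

bill = gpu_list_price + sum(extras.values())
overhead = (bill / gpu_list_price - 1) * 100

print(f"GPU list price: ${gpu_list_price:,.0f}")
print(f"Actual bill:    ${bill:,.0f}  (+{overhead:.0f}% over list)")
```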

AI infrastructure needs a workload-first approach

The enterprises managing AI costs effectively share one trait: they treat AI infrastructure as an architecture decision, not a procurement default.

That means evaluating each AI workload on its own merits:

Training workloads with massive compute requirements and predictable schedules often benefit from reserved capacity or on-premises GPU clusters. When cloud costs exceed 60–70% of equivalent on-premises acquisition costs, the math favors capital investment for steady-state training (a rough break-even sketch follows below).

Inference workloads serving global users need distributed compute close to end users, because centralized hyperscale inference creates latency bottlenecks and egress penalties. Distributed platforms like Akamai Cloud Inference push AI processing to the edge, running models on an infrastructure network built for distribution, and deliver 86% lower inference costs, 3× higher throughput, and 60% lower latency than centralized alternatives.

Experimentation and fine-tuning workloads with bursty, unpredictable usage are well-suited to elastic cloud resources, where you pay only for actual consumption. The key is keeping these workloads portable, so they're not locked to a single provider's GPU ecosystem.

Data pipelines feeding AI models need an architecture that minimizes movement. Co-locating storage with compute, using tiered caching, and choosing providers with favorable egress economics directly reduce the hidden costs that catch most organizations off guard.
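As promised above, a rough break-even sketch for the training-workload decision. The 65% threshold applies the 60–70% rule of thumb; the dollar inputs are hypothetical:

```python
# Rough cloud-vs-on-prem break-even for steady-state training.
# Only the threshold comes from the rule of thumb above; all dollar
# figures are assumed inputs for illustration.

def months_to_breakeven(onprem_capex: float, monthly_cloud: float,
                        threshold: float = 0.65) -> float:
    """Months until cumulative cloud spend crosses `threshold` of
    the on-premises acquisition cost."""
    return onprem_capex * threshold / monthly_cloud

# Example: a $2M GPU cluster vs. $150k/month of equivalent reserved capacity.
print(f"{months_to_breakeven(2_000_000, 150_000):.1f} months")  # ~8.7 months
```

Past that point, every additional month of steady-state training in the cloud is spend that could have gone toward owned capacity.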

Maxima has helped organizations cut AI infrastructure costs by 35% or more by matching workloads to the right platform rather than defaulting to whichever hyperscaler won the first contract.

The window for optimizing infrastructure for AI is closing

Gartner projects AI will account for 41.5% of all IT spending in 2026, up from 31.7% last year. By 2027, it could reach half. Every month an organization operates without AI cloud cost governance is a month of margin erosion.

Yet 79% of executives believe AI won't significantly contribute to revenue until 2030, according to the IBM Institute for Business Value.

The organizations that move early on workload-first AI infrastructure, treating placement as a strategic decision, not a default, will carry a structural cost advantage into the years when AI revenue finally arrives.

The ones that don't will spend 2027 wondering where the budget went.
