Platform Engineering 2.0: Manage AI Costs and Risks Without Rebuilding Infrastructure - Platform Engineering
Published On: July 1, 2026 | By Steven Vaughan-Nichols
Platform engineering teams have spent the better part of a decade building something admirable: standardized Kubernetes clusters, CI/CD pipelines, internal developer platforms (IDPs), and self-service infrastructure that let developers ship applications safely, efficiently, and repeatedly. That foundation held up well until AI arrived and changed every assumption underneath it.
The debate over DevOps versus platform engineering now feels quaint.
A far more consequential challenge has taken its place: How do you build, govern, isolate, and operate AI workloads on infrastructure that was never designed to carry them? Broadcom and PlatformEngineering.org say the answer lies in Broadcom’s new Platform Engineering 2.0 framework. Broadcom says it drew on its collective enterprise experience across AI, software, and private cloud infrastructure to create the framework.
Platform Engineering 2.0 builds on the foundations of Platform Engineering 1.0 rather than replacing them. The framework is designed as a natural progression from existing platform investments, not a departure from them.
The Gap that AI Exposed
Platform Engineering 1.0 was built for containerized, developer-centric, human-paced workflows. AI broke that model in three distinct ways.
First, AI introduces prompts and retrieved content as a live, unpredictable input channel. Data is no longer just data: it is an executable influence on model behavior. As a result, the isolation properties that once held for compute and storage no longer hold reliably.
Second, AI workloads span multiple execution planes at once: SaaS APIs, fine-tuned hosted models, on-premises inference, retrieval layers, and tool-calling agents that reach deep into existing systems. The platform was never designed to govern across all of those simultaneously.
Third, and most critically, AI moves the trust boundary away from the application — and into the interplay between models, tools, data sources, humans, and non-human agents steering them. That is not a gap you can patch. It is a structural problem. In addition to introducing new security and governance concerns, AI also creates operational fragmentation as teams independently adopt different models, tooling, retrieval approaches, and observability practices.
The consequences are already landing in enterprise budgets. At June’s FinOps X conference, Mike Eisenstein, Accenture’s FinOps Global Practice Lead, relayed a CIO’s account of Claude API costs escalating from $250,000 per day to $400,000 per day in a single month. As J.R. Storment, Executive Director of the FinOps Foundation, put it plainly: “AI’s rate of change is exceptionally fast. What’s a good policy one day can be out of date the next week.”
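To put the quoted figures in perspective, the short calculation below works out the growth rate and the annualized run rate they imply. The two daily amounts come from the account above; the 365-day flat-rate annualization is an illustrative assumption, not something the CIO or the speakers stated.

```python
# Figures quoted in the article: daily Claude API spend at the start
# and at the end of a single month.
daily_before = 250_000  # USD per day
daily_after = 400_000   # USD per day

# Relative growth over that month.
growth = (daily_after - daily_before) / daily_before

# Assumption (not from the article): spend stays flat at each daily
# rate for a full 365-day year.
annual_before = daily_before * 365
annual_after = daily_after * 365

print(f"Growth in one month: {growth:.0%}")
print(f"Annualized run rate before: ${annual_before:,}")
print(f"Annualized run rate after:  ${annual_after:,}")
```

Under that assumption, a 60% jump in one month moves the run rate from roughly $91 million to $146 million a year.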
This is not sustainable — and application-level fixes won’t solve it.
Why App-Level Fixes Don’t Scale
The instinctive response has been to let each application team wrap its own guardrails around its AI use case. Chatbots add prompt hardening. Document tools bolt on access checks. Code assistants get separate logging. Every team builds its own perimeter.
That reflex has structural limits. Policy interpretation fragments across teams. For example, “no PII to external models” means something different in marketing than it does in finance. Security leaders cannot answer basic questions about which models are running, where, and under what policies, because the answers are scattered across dozens of services and vendor consoles.
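One way to picture the alternative to fragmented interpretation is a single, shared definition of the rule that every team calls. The sketch below is a minimal illustration only: the function name `check_no_pii_to_external`, the `PolicyDecision` type, and the two regex detectors are hypothetical, and real PII detection would need far more than two patterns.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately simple detectors (assumption for illustration).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass(frozen=True)
class PolicyDecision:
    allowed: bool
    reasons: tuple  # names of the detectors that matched


def check_no_pii_to_external(prompt: str, model_is_external: bool) -> PolicyDecision:
    """One shared reading of 'no PII to external models' for every team."""
    if not model_is_external:
        # The rule only restricts traffic that leaves the organization.
        return PolicyDecision(True, ())
    hits = tuple(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(prompt)
    )
    return PolicyDecision(allowed=not hits, reasons=hits)


# Marketing and finance call the same function, so the rule cannot drift.
marketing = check_no_pii_to_external("Email jane@example.com the offer", True)
finance = check_no_pii_to_external("Summarize Q3 revenue by region", True)
print(marketing)
print(finance)
```

The point of the design is less the detectors than the ownership: the definition lives in one place, so a change to the rule reaches every team at once.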
Shadow AI compounds this further. It is more dangerous than shadow IT ever was — not just because of data exposure risk, but because of the cost profile it creates.
AI governance confined to documents and application code is not enough. Security responsibilities must move down into the platform itself.
Two of the framework’s pillars are foundational: model governance as a control plane and workload isolation as a structural guarantee.
Model Governance as a Control Plane
Most enterprises now run multiple models across multiple providers. As the AI security firm Airia notes, “model-specific governance breaks at scale.” It proposes “a control layer that sits above the model level — one that enforces policy, logs decisions, and monitors behavior regardless of which underlying model is executing a task.”
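A control layer of the kind Airia describes can be sketched in a few lines. This is an illustration under stated assumptions, not Broadcom's or Airia's implementation: `ControlLayer`, `ModelEntry`, the `[restricted]` prompt tag, and the model names are all hypothetical, and the sketch only authorizes and logs requests without calling any provider.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class ModelEntry:
    name: str       # registry key (hypothetical names used below)
    provider: str   # e.g. "anthropic", "openai", "on-prem"
    external: bool  # True if traffic leaves the organization


# A policy inspects (model, team, prompt) and returns a denial reason or None.
Policy = Callable[[ModelEntry, str, str], Optional[str]]


def no_restricted_data_to_external(entry: ModelEntry, team: str, prompt: str) -> Optional[str]:
    # Assumption: sensitive prompts carry a "[restricted]" tag.
    if entry.external and "[restricted]" in prompt:
        return "restricted data sent to external model"
    return None


@dataclass
class ControlLayer:
    registry: dict = field(default_factory=dict)
    policies: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def register(self, entry: ModelEntry) -> None:
        self.registry[entry.name] = entry

    def authorize(self, model_name: str, team: str, prompt: str) -> bool:
        entry = self.registry.get(model_name)
        if entry is None:
            reasons = ["model not in registry"]
        else:
            # Every policy runs for every model, whoever the provider is.
            results = (policy(entry, team, prompt) for policy in self.policies)
            reasons = [reason for reason in results if reason is not None]
        allowed = not reasons
        # Log the decision, not the prompt text, to keep the log itself low-risk.
        self.audit_log.append(
            {"model": model_name, "team": team, "allowed": allowed, "reasons": reasons}
        )
        return allowed


layer = ControlLayer(policies=[no_restricted_data_to_external])
layer.register(ModelEntry("hosted-chat", "anthropic", external=True))
layer.register(ModelEntry("local-llm", "on-prem", external=False))

hosted_ok = layer.authorize("hosted-chat", "finance", "[restricted] payroll export")
local_ok = layer.authorize("local-llm", "finance", "[restricted] payroll export")
unknown_ok = layer.authorize("unlisted-model", "marketing", "hello")
print(hosted_ok, local_ok, unknown_ok)
```

Because the registry, the policies, and the audit log sit above any single provider, the questions of which models are running, where, and under what policies can be answered from one place.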
Platform Engineering 2.0 turns that principle into a concrete service: a central model registry and routing layer; unified authentication; policy enforcement applied uniformly across OpenAI, Anthropic, or any on-premises model; and a single pane of glass for audit,...