Announcing Envoy AI Gateway 1.0 — A Stable, Production-Ready AI Gateway | Envoy AI Gateway
Skip to main content
Today we're thrilled to announce Envoy AI Gateway 1.0 — the first stable, generally available<br>release of the open source AI gateway built on CNCF's Envoy Gateway .
When we shipped v0.1 in February 2025,<br>we closed that post with three words: "Onward to 1.0!" Sixteen months and many releases later,<br>backed by a community of maintainers and adopters across the industry, we're here. 1.0 means you can<br>build on Envoy AI Gateway with confidence: a control-plane API we're committing to keep stable, running<br>on the same battle-tested Envoy foundation that already powers production traffic at the world's largest<br>companies.
🔒 What 1.0 Means: Stability You Can Build On
The headline of 1.0 isn't a single new feature — it's a promise . Our<br>release policy has always said that<br>we'd cut the major v1.0.0 release once we had a first stable control-plane API. That moment is here.
For stable APIs, the commitment is simple and strong:
We will never break the APIs unless there is a critical security issue , and we will always provide<br>a migration path in the release notes if we ever must.
Concretely, that means:
GuaranteeWhat it means for youStable CRDs The resources you author — AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, MCPRoute, all served at v1beta1 — won't break under you.Predictable upgrades Upgrading the controller won't break a valid, migrated configuration.Documented migrations Any future change that requires action will ship with a clear, documented upgrade path.<br>This is the foundation enterprises have been asking for — the ability to standardize on a single,<br>provider-agnostic AI gateway without betting on a moving target.
🛣️ The Road from 0.1 to 1.0
v0.1 launched with a unified API in front of two providers and the essentials: upstream<br>authorization and token-based rate limiting. 1.0 is a different animal. Here's how far the project has<br>come:
Capabilityv0.1 (Feb 2025)1.0AI providers 2 (OpenAI, AWS Bedrock)16 , with cross-provider request/response translationAPI surface Chat completionsChat, completions, embeddings, image generation, audio (transcription / translation / speech), and the OpenAI Responses APIMCP (Model Context Protocol) —A full MCP gateway : server multiplexing, tool routing & filtering, and fine-grained authorizationMultimodal —Image, audio, and video inputs across supported providersObservability Basic metricsOpenTelemetry tracing, OpenInference, GenAI token metrics, separate reasoning-token accountingMulti-tenancy & routing Token rate limitingHostname-based routing, model virtualization, and quota-aware rate limitingControl-plane API v1alpha1 (experimental)v1beta1 (stable)<br>Sixteen providers, integrated through a single OpenAI-compatible interface, including OpenAI, Azure<br>OpenAI, Google Gemini, Google Vertex AI, AWS Bedrock, Anthropic, Mistral, Cohere, Groq, Together AI,<br>DeepInfra, DeepSeek, Hunyuan, SambaNova, Grok, and the Tetrate Agent Router Service.
✨ What's in 1.0
One API, every provider
Point your application at a single OpenAI-compatible endpoint and let the gateway handle<br>provider-specific translation, authentication, and routing. Switch or mix providers without touching<br>application code — and use model virtualization to keep your app code stable while routing changes<br>underneath:
backendRefs:
- name: openai-backend
modelNameOverride: "gpt-4o"
- name: anthropic-backend
modelNameOverride: "claude-opus-4"
This is the key to A/B testing, gradual migrations, multi-provider strategies, and safeguarding against<br>vendor lock-in.
Provider authentication, handled for you
BackendSecurityPolicy keeps provider credentials out of your application and centralizes upstream auth —<br>API keys, AWS, Azure, and GCP cloud-native identity (including Workload Identity), all managed at the<br>gateway.
An MCP gateway for the agent era
Aggregate multiple Model Context Protocol servers behind one endpoint, route and filter the tools<br>clients can see, and enforce fine-grained, CEL-based authorization — so tools/list only ever<br>returns what a caller is actually allowed to use.
Enterprise observability, built in
Token-aware metrics, OpenTelemetry tracing with OpenInference compatibility (for evaluation tools like<br>Arize Phoenix), and separate accounting for reasoning tokens give you the cost control and visibility<br>that AI workloads demand.
Standards all the way down
Built on the Kubernetes Gateway API and the<br>Gateway API Inference Extension, Envoy AI Gateway<br>is an additive layer on Envoy Gateway — it expands what Envoy can do for GenAI traffic without changing<br>how you already deploy and operate it.
🌍 Built by a Community, on CNCF Envoy
1.0 is the work of a genuinely cross-industry community. Maintainers come from Tetrate, Bloomberg,<br>Tencent, Netflix, and Nutanix , alongside a growing roster of independent contributors who join our weekly<br>community...