Envoy AI Gateway 1.0 – A Stable, Production-Ready AI Gateway

Announcing Envoy AI Gateway 1.0 — A Stable, Production-Ready AI Gateway | Envoy AI Gateway

Today we're thrilled to announce Envoy AI Gateway 1.0 — the first stable, generally available release of the open source AI gateway built on CNCF's Envoy Gateway .

When we shipped v0.1 in February 2025, we closed that post with three words: "Onward to 1.0!" Sixteen months and many releases later, backed by a community of maintainers and adopters across the industry, we're here. 1.0 means you can build on Envoy AI Gateway with confidence: a control-plane API we're committing to keep stable, running on the same battle-tested Envoy foundation that already powers production traffic at the world's largest companies.

🔒 What 1.0 Means: Stability You Can Build On

The headline of 1.0 isn't a single new feature — it's a promise . Our release policy has always said that we'd cut the major v1.0.0 release once we had a first stable control-plane API. That moment is here.

For stable APIs, the commitment is simple and strong:

We will never break the APIs unless there is a critical security issue , and we will always provide a migration path in the release notes if we ever must.

Concretely, that means:

GuaranteeWhat it means for youStable CRDs The resources you author — AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, MCPRoute, all served at v1beta1 — won't break under you.Predictable upgrades Upgrading the controller won't break a valid, migrated configuration.Documented migrations Any future change that requires action will ship with a clear, documented upgrade path. This is the foundation enterprises have been asking for — the ability to standardize on a single, provider-agnostic AI gateway without betting on a moving target.

🛣️ The Road from 0.1 to 1.0

v0.1 launched with a unified API in front of two providers and the essentials: upstream authorization and token-based rate limiting. 1.0 is a different animal. Here's how far the project has come:

Capabilityv0.1 (Feb 2025)1.0AI providers 2 (OpenAI, AWS Bedrock)16 , with cross-provider request/response translationAPI surface Chat completionsChat, completions, embeddings, image generation, audio (transcription / translation / speech), and the OpenAI Responses APIMCP (Model Context Protocol) —A full MCP gateway : server multiplexing, tool routing & filtering, and fine-grained authorizationMultimodal —Image, audio, and video inputs across supported providersObservability Basic metricsOpenTelemetry tracing, OpenInference, GenAI token metrics, separate reasoning-token accountingMulti-tenancy & routing Token rate limitingHostname-based routing, model virtualization, and quota-aware rate limitingControl-plane API v1alpha1 (experimental)v1beta1 (stable) Sixteen providers, integrated through a single OpenAI-compatible interface, including OpenAI, Azure OpenAI, Google Gemini, Google Vertex AI, AWS Bedrock, Anthropic, Mistral, Cohere, Groq, Together AI, DeepInfra, DeepSeek, Hunyuan, SambaNova, Grok, and the Tetrate Agent Router Service.

✨ What's in 1.0

One API, every provider

Point your application at a single OpenAI-compatible endpoint and let the gateway handle provider-specific translation, authentication, and routing. Switch or mix providers without touching application code — and use model virtualization to keep your app code stable while routing changes underneath:

backendRefs:

- name: openai-backend

modelNameOverride: "gpt-4o"

- name: anthropic-backend

modelNameOverride: "claude-opus-4"

This is the key to A/B testing, gradual migrations, multi-provider strategies, and safeguarding against vendor lock-in.

Provider authentication, handled for you

BackendSecurityPolicy keeps provider credentials out of your application and centralizes upstream auth — API keys, AWS, Azure, and GCP cloud-native identity (including Workload Identity), all managed at the gateway.

An MCP gateway for the agent era

Aggregate multiple Model Context Protocol servers behind one endpoint, route and filter the tools clients can see, and enforce fine-grained, CEL-based authorization — so tools/list only ever returns what a caller is actually allowed to use.

Enterprise observability, built in

Token-aware metrics, OpenTelemetry tracing with OpenInference compatibility (for evaluation tools like Arize Phoenix), and separate accounting for reasoning tokens give you the cost control and visibility that AI workloads demand.

Standards all the way down

Built on the Kubernetes Gateway API and the Gateway API Inference Extension, Envoy AI Gateway is an additive layer on Envoy Gateway — it expands what Envoy can do for GenAI traffic without changing how you already deploy and operate it.

🌍 Built by a Community, on CNCF Envoy

1.0 is the work of a genuinely cross-industry community. Maintainers come from Tetrate, Bloomberg, Tencent, Netflix, and Nutanix , alongside a growing roster of independent contributors who join our weekly community...

Envoy AI Gateway 1.0 – A Stable, Production-Ready AI Gateway

Related Articles

Claude Fable 5

US Government directive to suspend access to Fable 5 and Mythos 5

Is AI ruining our skills? Early results are in – and they're not good

The Anatomy of an AI-Native Org

Apertus – Open Foundation Model for Sovereign AI