Netflix Simplified Batch Compute with Kueue

How Netflix Simplified Batch Compute with Kueue | by Netflix Technology Blog | Jun, 2026 | Netflix TechBlog

Netflix TechBlog

Learn about Netflix’s world class engineering efforts, company culture, product developments and more.

How Netflix Simplified Batch Compute with Kueue

Netflix Technology Blog

6 min read · 4 days ago

By Alvin Bao, Alex Petrov, Jennifer Lai, Aidan Sherr, and Samartha Chandrashekar

As part of the journey to make Netflix’s compute infrastructure more Kubernetes-native, we have leaned into incorporating components from the Kubernetes ecosystem into our container platform, Titus. One example is our use of Kueue, a cloud-native job queueing system for batch workloads, which has largely replaced the custom queueing and scheduling logic in our homegrown managed batch solution, Compute Managed Batch (CMB). In this post, we’ll give an overview of what motivated the migration, how we migrated millions of batch jobs to Kueue, and what Kueue allows us to offer as a Compute platform.

Brief Overview of CMB and Titus

CMB is a managed batch solution that allows users and applications to execute and manage workloads that run to completion. Workloads are organized in a tenant hierarchy and queued for ordered execution by priority, and capacity is managed on a per-tenant basis. Workloads submitted to CMB are then run on Titus. The features of Titus relevant to CMB are workload federation across multiple cells (Kubernetes clusters) and federated capacity reservations. This means CMB can talk to a single Titus endpoint to get or submit workloads and update capacity reservations without having to worry about the underlying cell/cluster topology.

CMB Tenant Hierarchy

Tenants provide a grouping mechanism for jobs submitted on behalf of certain organizations, platforms, or applications. Users can create and organize tenants however best suits their organization or use case. For example, an organization may use a single tenant across several applications, or a complex hierarchical structure that matches its team and application ownership structure.

Tenants are associated with a capacity configuration. The capacity configuration defines the amount of compute capacity available to the tenant and provides certain guarantees around isolation from other tenants. It contains a weight (used for fair sharing) and resource dimensions.

There are two types of tenants in CMB:

Internal Tenants: meant to facilitate the creation of a tree of tenants. An internal tenant’s children can be both internal and leaf tenants. Internal tenants themselves do not accept work and thus do not have associated queues.

Leaf Tenants: can accept work and have queues associated with them. Leaf tenants cannot have any children.

With regard to capacity configuration, tenants can use two types of capacity:

Reserved Capacity

For internal tenants, if a user specifies reserved capacity, it is fair-shared across the subtree and usable by the leaf tenants under that internal tenant.

For leaf tenants, if a user specifies reserved capacity, it partitions capacity within the hierarchy so that other tenants cannot reserve the same resources. Those reserved resources are not shared with any other tenant, ensuring throughput for a given leaf tenant.

Shared Capacity

The Compute team maintains a global pool of shared capacity that any tenant can burst into, in addition to its reserved capacity. Reservations are not required to use CMB, so a tenant can run entirely on shared capacity. The pool is fair-shared across tenants, but in CMB this applied only at admission: CMB had no preemption, so once a job was admitted, it ran to completion regardless of shifts in fair-share demand.

Kueue changes the semantics for both types of capacity, as covered in the section on fair sharing and preemption.

Here is an example of what a tenant hierarchy looks like:

[Figure: example tenant hierarchy]
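To make the tenant model above a little more concrete, here is a minimal Python sketch of internal and leaf tenants with a weight and a single reserved resource dimension. Everything in it is an assumption for illustration: the `Tenant` class, its field names, and the `leaf_shares` helper are hypothetical and are not CMB's or Kueue's actual schema or algorithm. In particular, the split shown is a plain weight-proportional division of an internal tenant's reservation among its children, which ignores demand and the shared pool.

```python
from dataclasses import dataclass, field


@dataclass
class Tenant:
    """Hypothetical tenant node; field names are illustrative, not CMB's schema."""
    name: str
    weight: float = 1.0         # relative weight for fair sharing among siblings
    reserved_cpus: float = 0.0  # one resource dimension only, for brevity
    internal: bool = False      # internal tenants hold children, leaf tenants hold a queue
    children: list["Tenant"] = field(default_factory=list)
    queue: list[str] = field(default_factory=list)

    def add_child(self, child: "Tenant") -> "Tenant":
        # Leaf tenants cannot have any children.
        if not self.internal:
            raise ValueError(f"leaf tenant {self.name!r} cannot have children")
        self.children.append(child)
        return child

    def submit(self, job_id: str) -> None:
        # Internal tenants do not accept work and have no queue.
        if self.internal:
            raise ValueError(f"internal tenant {self.name!r} does not accept work")
        self.queue.append(job_id)


def leaf_shares(tenant: Tenant, inherited: float = 0.0) -> dict[str, float]:
    """Return {leaf name: CPUs} under a simple weight-proportional split.

    Assumption: an internal tenant's reservation (plus whatever it inherited)
    is divided among its children by weight, and a leaf keeps its own
    reservation on top of what it inherits.
    """
    available = inherited + tenant.reserved_cpus
    if not tenant.internal:
        return {tenant.name: available}
    total_weight = sum(child.weight for child in tenant.children)
    shares: dict[str, float] = {}
    for child in tenant.children:
        shares.update(leaf_shares(child, available * child.weight / total_weight))
    return shares


# Usage: one internal tenant reserving 100 CPUs, two leaf tenants weighted 3:1.
root = Tenant("org", internal=True, reserved_cpus=100)
ml = root.add_child(Tenant("ml", weight=3))
etl = root.add_child(Tenant("etl", weight=1, reserved_cpus=20))
ml.submit("job-1")
print(leaf_shares(root))  # {'ml': 75.0, 'etl': 45.0}
```

Here `etl` ends up with its 25-CPU share of the parent's reservation plus its own 20 reserved CPUs, which other tenants cannot claim. A real fair-sharing scheduler would also account for actual demand and for borrowing from the shared pool, which this sketch deliberately leaves out.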

[Figure: CMB User/Application Workload Submission Flow]

[Figure: CMB User/Application Tenant Management Flow]

Why Kueue?

CMB was created in 2018, before or alongside many of the open-source batch compute offerings available today. Over the years, as the Kubernetes ecosystem has evolved, many of the features that CMB offered or strived to offer (for example, fair sharing, hierarchical tenants, capacity management, and priority queueing) have been included in these open-source projects. In addition, it became increasingly cumbersome to develop new features such as preemption when CMB was so far removed from the underlying Kubernetes cluster.

The team took a look at what it would take to...
