Kubernetes CPU requests and limits, explained through cgroups

Kubernetes Resource Management: CPU Request and Limit in Practice

Aleksander Roszig, April 4, 2026

Kubernetes has been with us for 11 years now, and resource management is one of its most fundamental functions. Yet it remains one of the most common sources of problems we observe in projects: many people either don’t know the resource management mechanisms or use them incorrectly, and it is still one of the most discussed topics in the Kubernetes world. In this article, I’d like to explain how CPU request and CPU limit work. This post focuses on CPU; the next part (Kubernetes memory request and limit) discusses memory management.

Kubernetes Resource Management: How Does CPU Resource Allocation Work?

The answer is: cgroups. Cgroups version 1 has been available since Linux kernel 2.6.24 (January 2008). Cgroups are the foundation of containerization technologies like Docker, Podman, and LXC. Today, most systems use cgroups v2, which was declared stable in kernel 4.5 in 2016, and this is the version we’ll reference from here on.

“Control groups, usually called cgroups, are a Linux kernel feature that enables organizing processes into hierarchical groups, whose use of various types of resources can then be limited and monitored.”
- https://man7.org/linux/man-pages/man7/cgroups.7.html

In practice, this means that all processes in a container can be placed in a single cgroup, for which we set controller parameters such as CPU or memory. The kubelet, which is responsible for creating containers, uses the following controllers, among others: cpu, cpuset, memory, hugetlb, and pids.

When setting CPU requests and limits, we’re interested in the cpu controller. According to the Linux kernel documentation:

“CPU controllers regulate the distribution of processor cycles.
This controller implements weight and absolute bandwidth limitation models for standard scheduling policy, as well as absolute bandwidth allocation model for real-time scheduling policy.”

We’ll focus on the standard (normal) scheduling policy, because that’s how Kubernetes works by default. With certain changes, it’s possible to run real-time applications on Kubernetes, but it’s not the default platform for this type of workload, and real-time scheduling is not yet supported by the cpu controller in cgroups version 2.

CPU Request: Weight Model (weight)

A CPU request in Kubernetes corresponds to the weight model in cgroups. It is the amount of CPU we are guaranteed under load. If we set in a Deployment:

    resources:
      requests:
        cpu: 500m

Kubernetes converts this to a shares value (in cgroups v1) or a weight (in cgroups v2). It does this using the MilliCPUToShares() function:

    // MilliCPUToShares converts the milliCPU to CFS shares.
    func MilliCPUToShares(milliCPU int64) uint64 {
        if milliCPU == 0 {
            // Docker converts zero milliCPU to unset, which maps to kernel default
            // for unset: 1024. Return 2 here to really match kernel default for
            // zero milliCPU.
            return MinShares
        }
        // Conceptually (milliCPU / milliCPUToCPU) * sharesPerCPU, but factored to improve rounding.
        shares := (milliCPU * SharesPerCPU) / MilliCPUToCPU
        if shares < MinShares {
            return MinShares
        }
        if shares > MaxShares {
            return MaxShares
        }
        return uint64(shares)
    }

In simplified form:

    shares = milliCPU * 1024 / 1000

For our request of 500m, the calculation is:

    500 * 1024 / 1000 = 512

This value goes to the part of cgroups that is responsible for CPU time allocation. In cgroups v1 this is the cpu.shares parameter, and in cgroups v2 it is cpu.weight (where the shares value is additionally rescaled into the cpu.weight range of 1 to 10000). This means our pod carries a relative weight of 512 shares compared with other pods, which have their own shares/weight values.

Next, the value is used by the Linux scheduler. In many distributions that ship kernel 6.6 or newer, such as Amazon Linux 2023, Ubuntu 22.04.5+, 24.04+, and Red Hat Enterprise Linux 10, the newer EEVDF scheduler is already in use.
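The request-to-shares conversion described above can be tried out with a small standalone program. This is a minimal sketch, not the kubelet's own code: the lower-case names are hypothetical, and the constants (a floor of 2 shares, an assumed ceiling of 262144, and 1024 shares per CPU) are assumptions chosen to mirror the values discussed in the text.

```go
package main

import "fmt"

// Assumed constants for illustration only; they mirror the values
// used in the conversion discussed in the article.
const (
	minShares     = 2      // floor applied to zero or very small requests
	maxShares     = 262144 // assumed upper bound on the shares value
	sharesPerCPU  = 1024   // shares that correspond to one full CPU
	milliCPUToCPU = 1000   // milliCPU units in one CPU
)

// sharesFromMilliCPU is a hypothetical helper that converts a CPU request
// in milliCPU into a shares value clamped to [minShares, maxShares].
func sharesFromMilliCPU(milliCPU int64) uint64 {
	if milliCPU == 0 {
		return minShares
	}
	// Multiply before dividing so integer division loses less precision.
	shares := (milliCPU * sharesPerCPU) / milliCPUToCPU
	if shares < minShares {
		return minShares
	}
	if shares > maxShares {
		return maxShares
	}
	return uint64(shares)
}

func main() {
	// Print the shares value for a handful of sample requests.
	for _, request := range []int64{0, 1, 50, 100, 150, 500, 1000} {
		fmt.Printf("request %4dm -> shares %d\n", request, sharesFromMilliCPU(request))
	}
}
```

For the 500m request from the example this yields 512, while a request of 1m is raised to the minimum of 2.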
The Linux scheduler is the part of the kernel that handles CPU resource allocation. Even if you sometimes see the “CFS” name in metrics or in Kubernetes code, which refers to an older scheduler added to the Linux kernel in version 2.6.23 (October 2007), your node still uses whatever scheduler your Linux system provides.

How Does CPU Sharing Work in Kubernetes?

Let’s assume we have 3 pods, and for each of them a cgroup has been created with the following values:

    cgroup   milliCPU parameter
    G1       150
    G2       100
    G3       50

The total sum of weights is 300. The Linux scheduler allocates CPU time proportionally to these weights, provided that all three cgroups are active and competing for CPU time. We can also calculate shares = milliCPU * 1024 / 1000 to see how this translates to percentages:

    cgroup   milliCPU parameter   CPU time allocation   shares
    G1       150                  ~50% (150/300)        ~50% (153/306)
    G2       100                  ~33% (100/300)        ~33% (102/306)
    G3       50                   ~17% (50/300)         ~17% (51/306)

We set how much time each cgroup can get at full CPU...
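The proportional split for G1, G2, and G3 can be reproduced with a short calculation. The sketch below is illustrative only: cpuFractions is a hypothetical helper, and it models just the arithmetic of dividing CPU time by weight while every group is busy, not the scheduler itself.

```go
package main

import "fmt"

// cpuFractions returns each weight divided by the sum of all weights,
// i.e. the fraction of CPU time a group would receive while all groups
// are active and competing for the CPU.
func cpuFractions(weights []int64) []float64 {
	var total int64
	for _, w := range weights {
		total += w
	}
	fractions := make([]float64, len(weights))
	if total == 0 {
		return fractions // no weights, nothing to split
	}
	for i, w := range weights {
		fractions[i] = float64(w) / float64(total)
	}
	return fractions
}

func main() {
	names := []string{"G1", "G2", "G3"}
	milliCPU := []int64{150, 100, 50}

	// The same requests expressed as shares (integer division, as in the text).
	shares := make([]int64, len(milliCPU))
	for i, m := range milliCPU {
		shares[i] = m * 1024 / 1000
	}

	byMilli := cpuFractions(milliCPU)
	byShares := cpuFractions(shares)
	for i, name := range names {
		fmt.Printf("%s: %5.1f%% by milliCPU, %5.1f%% by shares (%d)\n",
			name, byMilli[i]*100, byShares[i]*100, shares[i])
	}
}
```

Because both columns are built from the same ratios, the percentages agree whether they are computed from milliCPU values or from the derived shares.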
