The guaranteed contiguous memory allocator [LWN.net]
The guaranteed contiguous memory allocator
By Jonathan Corbet
March 21, 2025
As a system runs and its memory becomes fragmented, allocating large, physically contiguous regions of memory becomes increasingly difficult. Much effort over the years has gone into avoiding the need to make such allocations whenever possible, but there are times when they simply cannot be avoided. The kernel's contiguous memory allocator (CMA) subsystem attempts to make such allocations possible, but it has never been a perfect solution. Suren Baghdasaryan is trying to improve that situation with the guaranteed contiguous memory allocator patch set, which includes work from Minchan Kim as well.
In the distant past, Dan Magenheimer introduced the concept of transcendent memory: memory that is not directly addressable, but which can be used opportunistically by the kernel for caching or other purposes. Most of the transcendent-memory work has since gone unused and been removed from the kernel, but the idea persists, and this patch series makes use of it to provide guaranteed CMA.
Specifically, the patch set includes a subsystem called "cleancache", which is a concept that was proposed by Magenheimer in 2012. If the kernel has to dump a page of data, but would like to keep that data around if possible, it can put it into the cleancache, which will stash it aside somewhere. Should the need for that data arise, the kernel can copy it back out of the cleancache, if it is still there. Meanwhile, the page that initially contained that data can be reclaimed for other uses.
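The put-and-maybe-get-back behavior described above can be modeled in a few lines of user-space Python. This is only a sketch of the semantics, not kernel code; the class name `ToyCleancache`, its methods, and the oldest-first eviction policy are all assumptions made for illustration and do not come from the patch set.

```python
class ToyCleancache:
    """User-space toy model of cleancache semantics (hypothetical, not kernel code)."""

    def __init__(self, capacity):
        self.capacity = capacity   # number of pages the cache can hold
        self.pages = {}            # key -> bytes; dicts keep insertion order

    def put(self, key, data):
        """Stash a copy of a page; the oldest entry is dropped when full."""
        if self.capacity <= 0:
            return
        if key not in self.pages and len(self.pages) >= self.capacity:
            oldest = next(iter(self.pages))
            del self.pages[oldest]
        self.pages[key] = bytes(data)

    def get(self, key):
        """Return the cached copy, or None if it is no longer there."""
        return self.pages.get(key)

    def drop(self, key):
        """Discard an entry; the cache is free to do this at any time."""
        self.pages.pop(key, None)
```

The important property is that `get()` is allowed to fail: a caller must always be able to recover the data some other way, which is why only data that can be dropped at will belongs in such a cache.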
Guaranteed CMA then builds on cleancache by allocating a region of physically contiguous memory at boot, when such allocations are relatively easy. That memory is then turned into a cleancache and made available to the kernel. Whenever the memory-management system reclaims pages of file-backed memory, it can choose to place the data from those pages into the cleancache. Should that data be needed, an attempt will be made to retrieve it from the cleancache before rereading it from disk. The memory reserved for CMA is thus available to the kernel when not allocated to a CMA user, but in a restricted way.
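As a rough illustration of that reclaim and read path, the sketch below uses plain dictionaries to stand in for the page cache, the cleancache, and the disk. The function names `reclaim_page()` and `read_page()` and the `stats` counters are hypothetical; they are not taken from the patches.

```python
def reclaim_page(key, page_cache, cleancache):
    """Reclaim a file-backed page, offering its data to the cleancache."""
    data = page_cache.pop(key)   # the original page is now free for other uses
    cleancache[key] = data       # droppable copy kept in the reserved region


def read_page(key, cleancache, disk, stats):
    """Return page data, trying the cleancache before rereading from disk."""
    data = cleancache.get(key)
    if data is not None:
        stats["cache_hits"] += 1
        return data
    stats["disk_reads"] += 1     # the copy was dropped; fall back to the disk
    return disk[key]
```

A read that finds its data in the cleancache avoids the disk entirely, while a read after the copy has been dropped simply behaves as it would have without the cache.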
At some point, some kernel subsystem will need a large, physically contiguous buffer. Requesting that buffer from the guaranteed CMA subsystem will result in an allocation from the reserved memory, after dropping any cached data that happens to be in the allocated region. This allocation can happen quickly, since that data has been cached with the explicit stipulation that it can be dropped at any time. This approach was proposed by SeongJae Park and Kim in 2014.
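The allocation step can be sketched with a toy model of the reserved region, in which each page frame is free, holding droppable cached data, or allocated to a CMA user. Everything here (the `ToyGuaranteedRegion` class, its first-fit search, and its method names) is an assumption made for illustration and says nothing about how the patch set is actually implemented.

```python
class ToyGuaranteedRegion:
    """Toy model: a reserved range of page frames shared by cache and CMA."""

    FREE, CACHED, ALLOCATED = "free", "cached", "allocated"

    def __init__(self, nr_frames):
        self.state = [self.FREE] * nr_frames
        self.cached_key = [None] * nr_frames   # which cached page is in a frame

    def cache_page(self, key):
        """Store droppable data in any free frame; False if none is free."""
        for frame, state in enumerate(self.state):
            if state == self.FREE:
                self.state[frame] = self.CACHED
                self.cached_key[frame] = key
                return True
        return False

    def is_cached(self, key):
        return key in self.cached_key

    def alloc_contiguous(self, nr_frames):
        """First-fit search for adjacent frames not held by a CMA user."""
        if nr_frames <= 0:
            raise ValueError("nr_frames must be positive")
        run = 0
        for frame, state in enumerate(self.state):
            run = 0 if state == self.ALLOCATED else run + 1
            if run == nr_frames:
                start = frame - nr_frames + 1
                for f in range(start, frame + 1):
                    self.state[f] = self.ALLOCATED
                    self.cached_key[f] = None   # cached data is just discarded
                return start
        return None
```

Note that the allocation never copies or migrates anything: frames holding cached data are treated exactly like free frames, which is the source of the speed described above.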
This new subsystem is integrated with the existing CMA API, so CMA users need not change to make use of it. The reserved region is set up by way of a devicetree property explicitly requesting the "guaranteed" behavior.
The end result is a version of CMA that is guaranteed to succeed as long as the total allocations do not exceed the size of the reserved area; existing CMA has a higher likelihood of failure. Since CMA usage is often restricted to a problematic device or two with known needs, sizing the reserved area for a specific system should be straightforward.
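The sizing exercise amounts to simple arithmetic: round each buffer that must be allocatable at the same time up to whole pages and add the results. The sketch below assumes a 4096-byte page and a hypothetical helper named `reserved_size()`; it ignores any alignment requirements that real allocations may carry.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def reserved_size(buffer_sizes, page_size=PAGE_SIZE):
    """Bytes to reserve so every listed buffer can be allocated at once."""
    pages = sum(-(-size // page_size) for size in buffer_sizes)  # ceiling division
    return pages * page_size
```

For example, `reserved_size([1, 4096, 4097])` rounds the three buffers up to one, one, and two pages respectively, for a total of four pages.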
The other advantage of guaranteed CMA is latency; if the memory is available, it can be allocated quickly. CMA in current kernels may have to migrate data out of the allocated region first, which takes time. The downside is that the memory reserved for guaranteed CMA can only be used for data that can be dropped at will; that will increase the pressure on the rest of the memory in the system.
This patch series was posted just ahead of the 2025 Linux Storage, Filesystem, Memory-Management, and BPF Summit, where it is currently scheduled for a discussion in the memory-management track. There will probably not be a lot of comments on it ahead of that discussion. The patches are relatively small, though, and do not intrude into the memory-management subsystem on systems where CMA is not in use, so we might just see a transcendent-memory application actually go forward, some 15 years after the idea was first proposed.