Meta uses CXL to reuse old DDR4 and cut some inference fleets by 25%

Zuck saves Meta bucks by reusing memory from old servers with a custom CXL ASIC

SYSTEMS

In production on millions of boxes and the payoff is a 25% reduction in machines needed for some inference workloads

Simon Sharwood

APAC Editor

Published Mon 29 Jun 2026 // 11:43 UTC

Meta is recovering DDR4 memory from old servers, installing it in new machines, and using a custom Compute Express Link (CXL) ASIC to share the memory across applications – without encountering latency problems.

The social networking giant calls its tech "Vistara" and will present it at ISCA 2026 on Monday, but The Register found the company's paper ahead of the talk. Our sister site, Blocks and Files, also happens to have reported on this on Friday.

The document opens with the admission that Meta can't increase the amount of memory in around 40 percent of its vast server fleet, meaning millions of servers can't handle some of its workloads. That's unfortunate because the expected service life of its servers is three to five years, but memory is useful for seven to ten years.

Meta's response is to rip DDR4 DIMMs from old servers, put them into new machines that rely on DDR5, and turn it all into a pool of capacity – which in theory makes it possible to compose virtual servers that share resources across multiple physical hosts.

The paper points out that CXL is hard to put into production because sharing memory across hosts can mean low bandwidth, high latency, and extra computing overheads to manage additional memory layers. Those problems can arise in systems that combine different memory technologies.

Meta wanted to blend memory types in a single machine but found off-the-shelf CXL kit can't do the job. "Most CXL solutions bundle DRAM with the controller – preventing DIMM reuse – and often omit DDR4 support, which is a requirement for repurposing older memory," the paper states. "Additionally, their high power consumption and high cost further limit their appeal."

To make CXL sing, Meta created a custom ASIC called "Vistara."

"At its core, the Vistara ASIC is designed to bridge DDR4 memory to host processors via a CXL 2.0/1.1-compliant PCIe Gen5 x16 interface," the paper explains. "Each Vistara ASIC integrates two independent 72-bit DDR4 memory channels, supporting speeds up to 3,200 MT/s and up to 256 GB per chip with 64 GB DIMMs." A pair of custom RISC-V processors drive the ASICs.

Vistara hardware lives in devices Meta calls a "MemServer" powered by an AMD Turin processor packing 158 cores and running 316 threads. Each MemServer combines 768 GB of DDR5 memory with 256 GB of DDR4 connected through Vistara ASICs.
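For readers who want to sanity-check the per-ASIC figures quoted from the paper, here is a rough back-of-envelope sketch in Python. It is our arithmetic, not Meta's: it assumes each 72-bit DDR4 channel carries 64 data bits plus 8 ECC bits (the usual ECC DIMM layout), and it uses the standard PCIe Gen5 rate of 32 GT/s per lane with 128b/130b encoding. The results are theoretical peaks that ignore protocol overheads, so real throughput will be lower.

```python
# Back-of-envelope peaks for one Vistara ASIC, built from the figures quoted above.
# Assumption (not from the paper): a 72-bit channel = 64 data bits + 8 ECC bits.

DDR4_CHANNELS = 2            # "two independent 72-bit DDR4 memory channels"
DDR4_RATE_MT_S = 3200        # "speeds up to 3,200 MT/s"
DATA_BITS_PER_CHANNEL = 64   # assumed data width; the other 8 bits carry ECC
DIMM_GB = 64                 # "64 GB DIMMs"
ASIC_CAPACITY_GB = 256       # "up to 256 GB per chip"

PCIE_GEN5_GT_S = 32          # PCIe Gen5 signalling rate per lane, per direction
PCIE_LANES = 16              # "PCIe Gen5 x16 interface"
PCIE_ENCODING = 128 / 130    # 128b/130b line encoding efficiency


def ddr4_peak_gb_s(channels=DDR4_CHANNELS, rate_mt_s=DDR4_RATE_MT_S,
                   data_bits=DATA_BITS_PER_CHANNEL):
    """Theoretical peak DDR4 bandwidth behind one ASIC, in GB/s."""
    bytes_per_transfer = data_bits / 8
    return channels * rate_mt_s * 1e6 * bytes_per_transfer / 1e9


def pcie_peak_gb_s(lanes=PCIE_LANES, gt_s=PCIE_GEN5_GT_S,
                   encoding=PCIE_ENCODING):
    """Theoretical peak PCIe bandwidth per direction, in GB/s."""
    return lanes * gt_s * encoding / 8


# How many DIMMs the quoted capacity implies, in total and per channel.
dimms_per_asic = ASIC_CAPACITY_GB // DIMM_GB
dimms_per_channel = dimms_per_asic // DDR4_CHANNELS

if __name__ == "__main__":
    print(f"DDR4 peak behind one ASIC: {ddr4_peak_gb_s():.1f} GB/s")
    print(f"PCIe Gen5 x16 peak, one direction: {pcie_peak_gb_s():.1f} GB/s")
    print(f"DIMMs per ASIC: {dimms_per_asic} ({dimms_per_channel} per channel)")
```

Under those assumptions, two DDR4-3200 channels top out at about 51.2 GB/s, which fits inside the roughly 63 GB/s a Gen5 x16 link offers in each direction, and the 256 GB ceiling works out to four 64 GB DIMMs, two per channel.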

"The Vistara CXL cards are installed in dedicated rear-accessible slots within each MemServer chassis," the paper reveals. "To manage the increased thermal load from high-density memory and CXL devices, the chassis employs directed airflow with high-capacity fans that channel cool air directly across the Vistara modules, for stable operation under heavy workloads."

The software side of Vistara sees the DDR4 presented to the OS "as a distinct, CPU-less NUMA node, separate from the local DRAM nodes directly attached to the processor." Meta's platforms first use all available local DDR5, then employ the CXL-attached DDR4 when needed.
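The CPU-less NUMA node arrangement the paper describes is something any Linux admin can look for on their own kit. The sketch below is ours, not Meta's code: it reads the standard sysfs directory `/sys/devices/system/node`, treats a node with an empty `cpulist` file as CPU-less, and then lists nodes in a "local first, CPU-less last" order. The helper names `numa_nodes` and `fallback_order` are hypothetical, and the ordering is only an illustration of the local-then-CXL idea. The kernel's real allocation order depends on NUMA distances and memory policy.

```python
from pathlib import Path


def numa_nodes(root="/sys/devices/system/node"):
    """Map each NUMA node ID to True if it has CPUs, False if it is CPU-less.

    A node whose cpulist file is empty has memory but no CPUs, which is how
    memory-only expanders typically show up. Returns {} if root is absent.
    """
    result = {}
    for entry in Path(root).glob("node[0-9]*"):
        cpulist = entry / "cpulist"
        if not cpulist.is_file():
            continue
        node_id = int(entry.name[len("node"):])
        result[node_id] = cpulist.read_text().strip() != ""
    return result


def fallback_order(nodes):
    """Illustrative ordering only: nodes with CPUs first, CPU-less nodes last."""
    return sorted(nodes, key=lambda node_id: (not nodes[node_id], node_id))


if __name__ == "__main__":
    nodes = numa_nodes()
    for node_id in fallback_order(nodes):
        kind = "has CPUs (local DRAM)" if nodes[node_id] else "CPU-less (memory only)"
        print(f"node{node_id}: {kind}")
```

On a machine with no memory-only nodes, every entry will simply report that it has CPUs.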

Zuck's house of hyperscale hypnotism makes this happen with custom tweaks to the Linux CXL driver. "All Linux kernel CXL driver code in use for Vistara is either present in the upstream kernel, or is on its way to being included in the upstream kernel," the paper states.

The paper says Meta has put this CXL stuff to work "in hyperscale infrastructure with millions of servers, across a variety of production workloads, including disaggregated ML inference (embedding tables in recommendation systems), big data processing, databases, distributed caches, and CI/CD build systems."

Some workloads, including big data tools such as Spark and Hive, use terabyte and petabyte-scale datasets, and need hundreds of gigabytes of memory per job. The paper says out-of-memory events in those workloads can "disrupt critical business analytics and ML pipelines."

"The expanded memory headroom provided by CXL enhances system reliability," the paper explains. "By mitigating the risk of out-of-memory (OOM) events, CXL reduces the frequency of job failures and the associated overhead of job restarts and resource fragmentation by 33 percent."

Meta says the system also cuts infrastructure costs. "These deployments have demonstrated large benefits, such as reducing the server count by up to 25...
