Decoupling Compute and Memory for Async GPUs

yiyingzhang | 1 pt | 0 comments

Cool open-source project that introduces a new programming model for decoupling compute and memory on NVIDIA GPUs that support asynchronous memory operations (e.g., Hopper). It reports a 12% performance improvement over SOTA with 67% less kernel code.

Paper: VDCores: Resource Decoupled Programming and Execution for Asynchronous GPU, arXiv:2605.03190
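
For anyone unfamiliar with the general idea, here is a rough CPU-side sketch of what decoupling a memory stage from a compute stage can look like. It is only an analogy written in Python with threads and a bounded queue. It is not the VDCores API and not GPU code, and the names `run_decoupled`, `memory_stage`, `compute_stage`, and `depth` are all made up for illustration.

```python
import queue
import threading


def run_decoupled(tiles, depth=2):
    """Overlap a 'memory' stage and a 'compute' stage through a bounded queue.

    Hypothetical illustration only: `depth` plays the role of the number of
    tiles that may be in flight between the two stages.
    """
    buf = queue.Queue(maxsize=depth)
    results = []

    def memory_stage():
        # Stands in for issuing asynchronous loads: copy each tile into
        # the staging buffer without waiting for compute to finish.
        for tile in tiles:
            buf.put(list(tile))
        buf.put(None)  # sentinel: no more tiles

    def compute_stage():
        # Consumes tiles as they become available, in arrival order.
        while True:
            tile = buf.get()
            if tile is None:
                break
            results.append(sum(x * x for x in tile))

    producer = threading.Thread(target=memory_stage)
    consumer = threading.Thread(target=compute_stage)
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

Calling `run_decoupled([[1, 2], [3, 4]])` returns one sum of squares per tile. The point of the sketch is just that neither stage has to know how the other is scheduled; they only agree on the buffer between them.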

memory decoupling compute gpus programming asynchronous