Scalable GPU Acceleration of Scalar Functions in Analytical Databases

Scalable GPU Acceleration of Scalar Functions in Analytical Databases: Compilation, Benchmarking, and Optimization - Microsoft Research

Kaushik Rajan

Sampath Rajendra

Momin Al-Ghosien

Nicolas Bruno

Carlo Curino

Matteo Interlandi

Yinan Li

Lukas M. Maas

Craig Peeper

Surajit Chaudhuri

Johannes Gehrke

VLDB 2026 | August 2026

Accelerating SQL query execution with GPUs is a central focus in database research. While prior systems have achieved notable speedups by offloading relational operators, the acceleration of the wide range of scalar functions that are supported by analytical engines remains unaddressed. Our analysis reveals that many scalar functions incur substantial computational overhead and often constitute the primary bottleneck in analytical queries on CPUs. This observation motivates a systematic exploration of the opportunities and challenges in accelerating scalar functions on GPUs.

Unlike relational operators, which are few in number and standardized, production databases support hundreds of scalar functions. The absence of a standardized specification, combined with this diversity, renders manual GPU porting infeasible. To address this, we present an LLVM-MLIR-based compiler toolchain that automatically translates the CPU-based implementations of scalar functions from production databases into efficient GPU kernels, while preserving their original semantics. Our approach lifts scalar functions to a high-level intermediate representation, applies resource-optimizing transformations, and generates GPU assembly code, supporting all relevant data types, parameters, and database context variables.
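The lift, transform, and generate stages described above can be illustrated with a toy model. The following is a minimal sketch, not the authors' toolchain: the expression classes (`Col`, `Const`, `BinOp`), the constant-folding pass (a simple stand-in for the paper's resource-optimizing transformations), and the `launch` helper that mimics one GPU thread per row are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical mini-IR standing in for a high-level intermediate representation.
@dataclass(frozen=True)
class Col:
    name: str

@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class BinOp:
    op: str            # "+" or "*"
    left: "Expr"
    right: "Expr"

Expr = Union[Col, Const, BinOp]

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def fold_constants(e):
    """Transformation pass: evaluate subtrees whose operands are all constants."""
    if isinstance(e, BinOp):
        left, right = fold_constants(e.left), fold_constants(e.right)
        if isinstance(left, Const) and isinstance(right, Const):
            return Const(OPS[e.op](left.value, right.value))
        return BinOp(e.op, left, right)
    return e

def compile_kernel(e):
    """'Code generation': build a function that evaluates e for one row index."""
    if isinstance(e, Col):
        return lambda cols, i: cols[e.name][i]
    if isinstance(e, Const):
        return lambda cols, i: e.value
    f, g, op = compile_kernel(e.left), compile_kernel(e.right), OPS[e.op]
    return lambda cols, i: op(f(cols, i), g(cols, i))

def launch(kernel, cols, n):
    """Stand-in for a GPU launch: one logical thread per row."""
    return [kernel(cols, i) for i in range(n)]

# Scalar expression price * (1 + 0.25), with a foldable constant subtree.
expr = BinOp("*", Col("price"), BinOp("+", Const(1.0), Const(0.25)))
optimized = fold_constants(expr)      # becomes price * 1.25
kernel = compile_kernel(optimized)
result = launch(kernel, {"price": [100.0, 8.0]}, 2)
```

The point of the sketch is the shape of the pipeline: the scalar function is represented once, optimized once, and then applied independently to every row, which is what makes such functions a natural fit for data-parallel hardware.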

As existing benchmarks do not sufficiently stress-test scalar functions in analytical queries, we introduce a variant of TPC-H that utilizes scalar functions while preserving the original query intent. Integrating our GPU kernels into a state-of-the-art GPU database system, we demonstrate substantial performance gains over a leading CPU database that uses slightly more expensive hardware: 7.6× on enhanced TPC-H and 6.4× on production queries, further widening the gap between GPU and CPU databases. The generated kernels deliver performance comparable to hand-optimized GPU implementations, establishing our approach as a scalable and practical solution for accelerating scalar functions on GPUs.
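To make "utilizes scalar functions while preserving the original query intent" concrete, here is a small sketch of what such a rewrite could look like. It is an assumption for illustration only and is not taken from the paper's benchmark: it takes the well-known TPC-H Q1 filter (ship date at most 90 days before 1998-12-01) and expresses it a second way through a date-difference scalar function, then checks that both forms select the same rows. The function names are hypothetical.

```python
from datetime import date, timedelta

CUTOFF = date(1998, 12, 1)   # reference date used by the TPC-H Q1 filter
DELTA_DAYS = 90              # default Q1 substitution parameter

def plain_predicate(ship_date):
    """Original-style filter: a plain date comparison."""
    return ship_date <= CUTOFF - timedelta(days=DELTA_DAYS)

def scalar_function_predicate(ship_date):
    """Hypothetical rewrite: the same filter via a date-difference function."""
    return (CUTOFF - ship_date).days >= DELTA_DAYS

ship_dates = [date(1998, 9, 1), date(1998, 9, 2),
              date(1998, 9, 3), date(1998, 12, 1)]

plain = [plain_predicate(d) for d in ship_dates]
rewritten = [scalar_function_predicate(d) for d in ship_dates]
```

Both predicates keep exactly the rows shipped on or before 1998-09-02, so the query's meaning is unchanged while the per-row work now goes through a scalar function.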
