Analyzing Bytes: Pre-Disassembly Static Binary Analysis
Huan Nguyen
Soumyakant Priyadarshan
ChenCheng Jiang
R. Sekar
Proceedings of the ACM on Programming Languages, Association for Computing Machinery (2026), pp. 1127-1151
Abstract
Binary code analysis plays a central role in numerous applications in software security, performance optimization, reverse engineering, and so on. Existing techniques need to first disassemble binaries into functions in assembly code before an analysis can be performed. However, disassembly and function identification have proven to be major challenges for complex variable-length instruction sets such as the x86. A recent trend has been to use static analysis to improve the accuracy of these tasks. This raises a chicken-and-egg problem: a disassembly is needed for static analysis, but a static analysis is needed for accurate disassembly! We overcome this problem by developing a novel static analysis approach that can operate before committing to a disassembly. Our analysis operates on the output of exhaustive disassembly that considers each possible offset in a binary as an instruction, and constructs what is known as a super-set control-flow graph (CFG). The central technical challenge in analyzing this CFG is that it mixes legitimate instructions with unintended ones, causing analysis results from invalid code paths to pollute legitimate ones. To overcome this challenge, we begin with a key new insight that if we focus on backward analyses, we can ensure accuracy of analysis results at intended instructions even though we have no idea where these intended instructions are! Moreover, our analysis operates in time that is linear in the size of the binary. Specifically, in O(n) total time, it yields analysis results for every one of the n offsets in an n-byte binary. For this task, it is orders of magnitude faster than previous techniques, as the previous techniques typically need to repeat the analysis many times.
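The idea of analyzing every offset before committing to a disassembly can be illustrated with a small sketch. This is not the authors' algorithm. It assumes a hypothetical four-opcode toy instruction set (`TOY_ISA`) with fall-through control flow only, and it uses register liveness as a stand-in backward analysis. Every name in it is invented for illustration. Because each toy instruction's only successor lies at a higher offset, one reverse sweep yields a result for each of the n offsets in O(n) time. A real instruction set such as x86 has jumps and calls, which this sketch does not attempt to handle.

```python
from typing import List, Optional, Tuple

# Hypothetical toy ISA (an assumption, not x86): opcode -> (mnemonic, length in bytes).
TOY_ISA = {0x00: ("nop", 1), 0x01: ("use", 2), 0x02: ("def", 2), 0x03: ("ret", 1)}
NUM_REGS = 4  # two-byte instructions carry a register operand in 0..3


def decode(code: bytes, off: int) -> Optional[Tuple[str, int, int]]:
    """Decode one toy instruction at `off`; return (mnemonic, length, reg) or None."""
    entry = TOY_ISA.get(code[off])
    if entry is None:
        return None  # unknown opcode
    name, length = entry
    if off + length > len(code):
        return None  # instruction would run past the end of the buffer
    reg = code[off + 1] if length == 2 else 0
    if reg >= NUM_REGS:
        return None  # operand is not a valid register number
    return name, length, reg


def live_in_at_every_offset(code: bytes) -> List[Optional[int]]:
    """Backward liveness over the superset of decodings.

    Returns one entry per byte offset: a bitmask of registers live on entry,
    or None if the offset does not decode or falls through into bytes that
    cannot be code. Runs as a single reverse sweep over the buffer.
    """
    n = len(code)
    live: List[Optional[int]] = [None] * n
    for off in range(n - 1, -1, -1):
        insn = decode(code, off)
        if insn is None:
            continue
        name, length, reg = insn
        if name == "ret":
            live[off] = 0  # nothing is live after a return in this toy model
            continue
        succ = off + length  # the only successor: fall-through
        if succ >= n or live[succ] is None:
            continue  # falls off the end or into undecodable bytes
        out = live[succ]
        if name == "use":
            live[off] = out | (1 << reg)   # a read makes the register live
        elif name == "def":
            live[off] = out & ~(1 << reg)  # a write kills the register
        else:
            live[off] = out                # nop passes liveness through
    return live


# Intended stream: def r0; use r0; use r1; ret (offsets 0, 2, 4, 6).
code = bytes([0x02, 0x00, 0x01, 0x00, 0x01, 0x01, 0x03])
print(live_in_at_every_offset(code))  # [2, 3, 3, 2, 2, None, 0]
```

In this sketch the result at an offset depends only on that offset's successor, so the entries at the intended offsets 0, 2, 4, and 6 are unaffected by the unintended decodings at offsets 1, 3, and 5, even though the sweep never learns which offsets are intended.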