What science can tell us about C and C++'s security (2020)



Alex Gaynor

Hi, I'm Alex. I'm a software resilience engineer. I care about building systems that work. I've worked for the government, in the private sector, and on open source. I'm based in Washington, DC.


What science can tell us about C and C++'s security

Wed, May 27, 2020

There are not a lot of very strong empirical results in the field of programming languages. This is probably because there’s a huge number of variables to control for, and most of the subjects available to researchers are CS undergraduates. However, I have recently found a result replicated across numerous codebases, which as far as I can tell makes it one of the most robust findings in the field:

If you have a very large (millions of lines of code) codebase, written in a memory-unsafe programming language (such as C or C++), you can expect at least 65% of your security vulnerabilities to be caused by memory unsafety.

This result has been reproduced across:

Android (cite): “Our data shows that issues like use-after-free, double-free, and heap buffer overflows generally constitute more than 65% of High & Critical security bugs in Chrome and Android.”

Android’s bluetooth and media components (cite): “Use-after-free (UAF), integer overflows, and out of bounds (OOB) reads/writes comprise 90% of vulnerabilities with OOB being the most common.”

iOS and macOS (cite): “Across the entirety of iOS 12 Apple has fixed 261 CVEs, 173 of which were memory unsafety. That’s 66.3% of all vulnerabilities.” and “Across the entirety of Mojave Apple has fixed 298 CVEs, 213 of which were memory unsafety. That’s 71.5% of all vulnerabilities.”

Chrome (cite): “The Chromium project finds that around 70% of our serious security bugs are memory safety problems.”

Microsoft (cite): “~70% of the vulnerabilities Microsoft assigns a CVE each year continue to be memory safety issues”

Firefox’s CSS subsystem (cite): “If we’d had a time machine and could have written this component in Rust from the start, 51 (73.9%) of these bugs would not have been possible.”

Ubuntu’s Linux kernel (cite): “65% of CVEs behind the last six months of Ubuntu security updates to the Linux kernel have been memory unsafety.”
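As a quick sanity check, the two percentages quoted for iOS and macOS can be recomputed from the CVE counts given in the same quotes. This is a minimal Python sketch; the function and variable names are my own and do not come from the cited sources.

```python
def memory_unsafety_share(memory_unsafe_cves: int, total_cves: int) -> float:
    """Return the memory-unsafety share as a percentage, rounded to one decimal."""
    return round(100 * memory_unsafe_cves / total_cves, 1)

# Counts taken from the iOS 12 and Mojave quotes above.
ios_12_share = memory_unsafety_share(173, 261)
mojave_share = memory_unsafety_share(213, 298)

print(f"iOS 12: {ios_12_share}%")
print(f"Mojave: {mojave_share}%")
```

Both values come out above the 65% floor stated in the headline finding.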

And these numbers are in line with what we’ve seen in 0days that have been discovered being exploited.

This observation has been reproduced across numerous very large code bases, built by different companies, started at different points in time, and using different development methodologies. I’m not aware of any counter-examples. The one thing they have in common is being written in a memory-unsafe programming language: C or C++.

Based on this evidence, I’m prepared to conclude that using memory-unsafe programming languages is bad for security. This would be an exciting result! Empirically demonstrated technical interventions to improve software are rare. And memory-unsafety vulnerabilities are one of the only kinds that we know how to completely eliminate, by choosing memory-safe languages. However, it’s critical we approach this question as rational empiricists, and see if the evidence really merits the conclusion that memory-unsafe programming languages are bad for security.

Let’s consider the Venn diagram of vulnerabilities:

There are vulnerabilities that can exist only in memory-unsafe languages (e.g. buffer overflows or use-after-frees)

There are vulnerabilities that can exist in any programming language (e.g. SQL injection or XSS)

There are vulnerabilities that can only exist in memory-safe languages (e.g. use of eval on untrusted inputs; eval tends to only exist in very high-level languages, which are all memory-safe)
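To make the first and third categories concrete, here is a small Python sketch. It assumes Python as a stand-in for "a very high-level, memory-safe language"; `read_item` and `unsafe_calculator` are hypothetical helpers invented for illustration. The out-of-range read that could touch adjacent memory in C or C++ is rejected with an exception, while `eval` on untrusted text runs whatever code the input contains.

```python
import os

def read_item(buffer: list, index: int) -> int:
    # In C or C++ an out-of-range index can read or write adjacent memory.
    # Here the bound is checked at runtime and the access fails loudly.
    return buffer[index]

def unsafe_calculator(expression: str):
    # Hypothetical helper: evaluating untrusted text executes it as code.
    return eval(expression)

buffer = [1, 2, 3, 4]
try:
    read_item(buffer, 10)
    out_of_bounds_caught = False
except IndexError:
    out_of_bounds_caught = True

# Attacker-style input: not arithmetic at all, just code that gets executed.
# The payload here is deliberately harmless (it only reads the working directory).
leaked = unsafe_calculator("__import__('os').getcwd()")

print("out-of-bounds access rejected:", out_of_bounds_caught)
print("value returned by injected code:", leaked)
```

The first failure mode is closed off by the language; the second is only possible because the language offers `eval` in the first place.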

So the first set contains at least 65% of the vulnerabilities in these types of codebases, and logically the second set must contain at most 35% of the vulnerabilities. So if we change programming language to something memory-safe, we get rid of at least 65% of our vulnerabilities. But does the magnitude of the other sets change?
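The bookkeeping behind that question can be sketched in a few lines of Python. Every number below is hypothetical: `total` is an invented vulnerability count, and `new_safe_only` is a made-up placeholder for any vulnerabilities that exist only in memory-safe languages. The first set is removed, the second set holds whatever is left and is assumed to stay the same size, and the third set is whatever the new language adds.

```python
def remaining_vulnerabilities(total: int, memory_unsafe_share: float,
                              new_safe_only: int = 0) -> float:
    """Estimate the count left after moving to a memory-safe language."""
    memory_unsafe = total * memory_unsafe_share   # first set: eliminated
    language_neutral = total - memory_unsafe      # second set: assumed unchanged
    return language_neutral + new_safe_only       # third set: newly possible

total = 1000  # hypothetical count for a large C or C++ codebase
after = remaining_vulnerabilities(total, 0.65, new_safe_only=12)
reduction = 1 - after / total

print(f"remaining: {after:.0f} of {total}")
print(f"net reduction: {reduction:.1%}")
```

Under these assumptions the net reduction stays close to 65% unless the third set is large, which is why the size of that set matters to the argument.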

I posit that the second set stays the same size: there’s no reason or evidence to think that porting C++ to a memory-safe language results in additional SQL injection.
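One way to see why the second set is independent of memory safety: SQL injection comes from how a query string is assembled, not from how memory is managed. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table, the rows, and both helper functions are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)])

def find_user_unsafe(name: str):
    # Vulnerable in any language: untrusted text is spliced into the SQL itself.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(name: str):
    # Parameterized query: the input is bound as data, never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

payload = "nobody' OR '1'='1"
leaked_rows = find_user_unsafe(payload)   # the injected OR clause matches every row
safe_rows = find_user_safe(payload)       # no user has this literal name

print("unsafe query returned", len(leaked_rows), "rows")
print("safe query returned", len(safe_rows), "rows")
```

The same mistake and the same fix carry over to C++ essentially unchanged, which is consistent with the claim that a port neither adds nor removes bugs of this kind.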

Our third set is vulnerabilities that are specific to memory-safe languages. Actual use of eval in production code is incredibly rare in my experience, however its cousin “unsafe deserialization” does occur in the real world. To investigate its frequency, I looked into Java’s unsafe deserialization on Android. Based on research I reviewed, Android as a whole appears to have had maybe a dozen of these. Basically every month it has more memory-unsafety issues than it’s had vulnerabilities of this class all time. So I believe this class to be orders of magnitude...

