american fuzzy lop
Written and maintained by Michal Zalewski
Copyright 2013, 2014, 2015, 2016 Google Inc. All rights reserved.<br>Released under terms and conditions of Apache License, Version 2.0.
For new versions and additional information, check out:<br>http://lcamtuf.coredump.cx/afl/
To compare notes with other users or get notified about major new features,<br>send a mail to .
** See QuickStartGuide.txt if you don't have time to read this file. **
1) Challenges of guided fuzzing
Fuzzing is one of the most powerful and proven strategies for identifying<br>security issues in real-world software; it is responsible for the vast<br>majority of remote code execution and privilege escalation bugs found to date<br>in security-critical software.
Unfortunately, fuzzing is also relatively shallow; blind, random mutations<br>make it very unlikely to reach certain code paths in the tested code, leaving<br>some vulnerabilities firmly outside the reach of this technique.
There have been numerous attempts to solve this problem. One of the early<br>approaches - pioneered by Tavis Ormandy - is corpus distillation. The method<br>relies on coverage signals to select a subset of interesting seeds from a<br>massive, high-quality corpus of candidate files, and then fuzz them by<br>traditional means. The approach works exceptionally well, but requires such<br>a corpus to be readily available. In addition, block coverage measurements<br>provide only a very simplistic understanding of program state, and are less<br>useful for guiding the fuzzing effort in the long haul.
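The selection step of corpus distillation can be illustrated with a greedy pass over per-seed coverage data. This is a minimal sketch, not the method used by any particular tool: the function name distill_corpus, the seed names, and the use of integer block IDs are assumptions made for the example.

```python
from typing import Dict, List, Set


def distill_corpus(coverage: Dict[str, Set[int]]) -> List[str]:
    """Greedily pick seeds until every observed block is represented.

    `coverage` maps a (hypothetical) seed file name to the set of basic-block
    IDs that the seed exercised in the target.
    """
    remaining: Set[int] = set().union(*coverage.values())
    selected: List[str] = []
    while remaining:
        # Pick the seed covering the most still-uncovered blocks; iterating
        # over sorted names makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda name: len(coverage[name] & remaining))
        selected.append(best)
        remaining -= coverage[best]
    return selected


if __name__ == "__main__":
    demo = {
        "a.png": {1, 2, 3},
        "b.png": {2, 3},
        "c.png": {3, 4},
        "d.png": {1, 2, 3, 4, 5},
    }
    print(distill_corpus(demo))
```

A greedy pass does not guarantee the smallest possible subset, but it is cheap and usually close enough when the candidate corpus is large.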
Other, more sophisticated research has focused on techniques such as program<br>flow analysis ("concolic execution"), symbolic execution, or static analysis.<br>All these methods are extremely promising in experimental settings, but tend<br>to suffer from reliability and performance problems in practical uses - and<br>currently do not offer a viable alternative to "dumb" fuzzing techniques.
2) The afl-fuzz approach
American Fuzzy Lop is a brute-force fuzzer coupled with an exceedingly simple<br>but rock-solid instrumentation-guided genetic algorithm. It uses a modified<br>form of edge coverage to effortlessly pick up subtle, local-scale changes to<br>program control flow.
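To see why edge coverage picks up changes that block coverage misses, consider the toy model below. It is a sketch under stated assumptions: the class EdgeCoverage, the helper trace, the map size, and the XOR-and-shift indexing are illustrative choices, and this document does not spell out the exact scheme the tool uses.

```python
MAP_SIZE = 1 << 16  # assumed size of the coverage map, for illustration only


class EdgeCoverage:
    """Toy model of a hit-count map indexed by (previous block, current block)."""

    def __init__(self) -> None:
        self.hits = bytearray(MAP_SIZE)
        self.prev = 0

    def visit(self, block_id: int) -> None:
        """Record the transition from the previous block to `block_id`."""
        index = (block_id ^ self.prev) % MAP_SIZE
        self.hits[index] = min(self.hits[index] + 1, 255)  # saturating counter
        # Shifting the previous ID means A->B and B->A usually land in
        # different slots, so the direction of a transition is preserved.
        self.prev = block_id >> 1

    def edges(self) -> set:
        """Return the indices of all map slots that were hit at least once."""
        return {i for i, count in enumerate(self.hits) if count}


def trace(path):
    """Replay a sequence of (hypothetical) block IDs and return the edge set."""
    cov = EdgeCoverage()
    for block in path:
        cov.visit(block)
    return cov.edges()


if __name__ == "__main__":
    A, B, C, D = 0x1234, 0x5678, 0x9ABC, 0xDEF0
    # Both paths touch the same four blocks, yet their edge sets differ.
    print(trace([A, B, C, D]) == trace([A, C, B, D]))
```

The two paths in the demo are indistinguishable to a block-coverage tool, since both visit the same four blocks; tracking transitions is what separates them.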
Simplifying a bit, the overall algorithm can be summed up as:
1) Load user-supplied initial test cases into the queue,
2) Take next input file from the queue,
3) Attempt to trim the test case to the smallest size that doesn't alter<br>the measured behavior of the program,
4) Repeatedly mutate the file using a balanced and well-researched variety<br>of traditional fuzzing strategies,
5) If any of the generated mutations resulted in a new state transition<br>recorded by the instrumentation, add mutated output as a new entry in the<br>queue.
6) Go to 2.
The discovered test cases are also periodically culled to eliminate ones that<br>have been obsoleted by newer, higher-coverage finds; and undergo several other<br>instrumentation-driven effort minimization steps.
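The numbered steps above can be sketched in a few dozen lines. This is a deliberately simplified model, not the actual implementation: toy_target stands in for an instrumented program, the names trim, mutate, and fuzz are hypothetical, only one mutation strategy is shown, and queue culling is left out.

```python
import random
from typing import Callable, FrozenSet, List

Target = Callable[[bytes], FrozenSet[str]]


def toy_target(data: bytes) -> FrozenSet[str]:
    """Hypothetical instrumented program: returns the transitions it hit."""
    seen = {"start"}
    if data[:1] == b"F":
        seen.add("F")
        if data[1:2] == b"U":
            seen.add("FU")
            if data[2:3] == b"Z":
                seen.add("FUZ")
    return frozenset(seen)


def trim(data: bytes, run: Target) -> bytes:
    """Step 3: drop bytes as long as the measured behavior stays the same."""
    baseline = run(data)
    i = 0
    while i < len(data):
        candidate = data[:i] + data[i + 1:]
        if run(candidate) == baseline:
            data = candidate  # this byte was not needed
        else:
            i += 1            # this byte matters; keep it
    return data


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Step 4: a single simple strategy (overwrite or append a random byte)."""
    buf = bytearray(data)
    if buf and rng.random() < 0.5:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def fuzz(seeds: List[bytes], run: Target, rounds: int,
         mutations_per_round: int, seed: int = 0) -> List[bytes]:
    if not seeds:
        raise ValueError("at least one initial test case is required")
    rng = random.Random(seed)
    queue = list(seeds)                                  # step 1
    seen = set()
    for entry in queue:
        seen |= run(entry)
    position = 0
    for _ in range(rounds):
        queue[position] = trim(queue[position], run)     # steps 2 and 3
        current = queue[position]
        for _ in range(mutations_per_round):
            candidate = mutate(current, rng)             # step 4
            coverage = run(candidate)
            if not coverage <= seen:                     # step 5: something new
                seen |= coverage
                queue.append(candidate)
        position = (position + 1) % len(queue)           # step 6
    return queue


if __name__ == "__main__":
    for entry in fuzz([b"hello"], toy_target, rounds=50, mutations_per_round=200):
        print(entry)
```

Passing the random seed explicitly keeps runs reproducible, which makes it easier to compare the effect of different mutation strategies in a toy setting like this one.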
As a side result of the fuzzing process, the tool creates a small,<br>self-contained corpus of interesting test cases. These are extremely useful<br>for seeding other, labor- or resource-intensive testing regimes - for example,<br>for stress-testing browsers, office applications, graphics suites, or<br>closed-source tools.
The fuzzer is thoroughly tested to deliver out-of-the-box performance far<br>superior to blind fuzzing or coverage-only tools.
3) Instrumenting programs for use with AFL
When source code is available, instrumentation can be injected by a companion<br>tool that works as a drop-in replacement for gcc or clang in any standard build<br>process for third-party code.
The instrumentation has a fairly modest performance impact; in conjunction with<br>other optimizations implemented by afl-fuzz, most programs can be fuzzed as fast<br>as, or even faster than, is possible with traditional tools.
The correct way to recompile the target program may vary depending on the<br>specifics of the build process, but a nearly-universal approach would be:
$ CC=/path/to/afl/afl-gcc ./configure<br>$ make clean all
For C++ programs, you'd also want to set CXX=/path/to/afl/afl-g++.
The clang wrappers (afl-clang and afl-clang++) can be used in the same way;<br>clang users may also opt to leverage a higher-performance instrumentation mode,<br>as described in llvm_mode/README.llvm.
When testing libraries, you need to find or write a simple program that reads<br>data from stdin or from a file and passes it to the tested library. In such a<br>case, it is essential to link this executable against a static version of the<br>instrumented library, or to make sure that the correct .so file is loaded at<br>runtime (usually by setting LD_LIBRARY_PATH). The simplest option is a static<br>build, usually possible via:
$ CC=/path/to/afl/afl-gcc ./configure --disable-shared
Setting AFL_HARDEN=1 when calling 'make' will cause the CC wrapper to<br>automatically enable code hardening options that make it easier to detect<br>simple memory bugs. Libdislocator, a helper library...