Faster fuzzing ferrets out 42 fresh zero-day flaws

A group of researchers has found 42 zero-day flaws in a range of software tools using a new take on an old concept. The team, from Singapore, Australia and Romania, worked out a better approach to a decades-old testing technique called fuzzing.

A standard part of software testing involves developers choosing inputs that they think might cause trouble, then using scripts or tools to run the program against those inputs automatically. For a web form that takes a first name as input, for example, they might check that it doesn’t allow a blank entry, or an entry that includes a command to manipulate a database.
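As a rough illustration of that kind of hand-written test, here is a minimal Python sketch. The validator, its rules and the list of inputs are all hypothetical and are not taken from any real application; they only show the idea of checking a form field against inputs a developer thought of in advance.

```python
import re

# Hypothetical rule for illustration: 1 to 50 characters, starting with a
# letter, then only letters, spaces, apostrophes or hyphens.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z' -]{0,49}")


def is_valid_first_name(value: str) -> bool:
    """Return True only if the trimmed value matches the whole pattern."""
    return _NAME_PATTERN.fullmatch(value.strip()) is not None


# Inputs the developer expects to cause trouble: a blank entry, a
# whitespace-only entry, and an entry carrying a database command.
TROUBLESOME_INPUTS = ["", "   ", "Robert'); DROP TABLE students;--"]


def run_checks() -> list:
    """Run the validator over every hand-picked input."""
    return [is_valid_first_name(value) for value in TROUBLESOME_INPUTS]
```

Every entry in `TROUBLESOME_INPUTS` should be rejected, while an ordinary name such as `Anne-Marie` should pass. The weakness is visible in the list itself: it only contains what the developer thought to put there.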

This can be useful in ferreting out flaws, but it is difficult to make such testing comprehensive. Developers may not think of everything. And it gets even more complicated if the input is a sound file or a photograph. It’s far more difficult to produce testing data that might break the program, or even to know what that data might look like.

Fuzzing programs fill that gap by automatically changing files and other inputs in many unpredictable ways. They can run thousands of different inputs against the program, often changing individual bits in each file that they present to it, to see if anything breaks.
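The bit-changing step can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are made up for this example, and real fuzzers combine many more kinds of mutation.

```python
import random


def flip_random_bit(data: bytes, rng: random.Random) -> bytes:
    """Return a copy of data with exactly one randomly chosen bit inverted."""
    if not data:
        return data
    mutated = bytearray(data)
    bit_index = rng.randrange(len(mutated) * 8)
    # XOR with a single-bit mask inverts just that bit.
    mutated[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(mutated)


def generate_variants(seed: bytes, count: int, rng_seed: int = 0) -> list:
    """Produce count single-bit variants of seed, reproducibly."""
    rng = random.Random(rng_seed)
    return [flip_random_bit(seed, rng) for _ in range(count)]
```

Each variant would then be fed to the program under test, with the fuzzer watching for crashes or hangs.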

There are three broad kinds of fuzzing.

Black box fuzzing knows nothing about the target program and just throws as many combinations as possible at it indiscriminately. This is fast, but it isn’t good at exposing bugs buried deep inside a program.

White box fuzzing is at the other end of the spectrum, analysing the structure of the program in depth to understand how it functions. This lets it tailor its tests to particular logic flows in the program code, increasing the proportion of the program’s code that its tests exercise, which testers call ‘coverage’. It can uncover some deep and meaningful bugs, but it can be slow.

Grey box fuzzing looks for a happy medium. Instead of analysing a program’s structure in depth, it starts from ‘seed’ files that are valid inputs and mutates them, for example by flipping bits in those files. When a mutated input produces a result that the fuzzer considers interesting, such as exercising code that earlier inputs did not reach, it adds that input to the list of seed files and then iterates on it.

Grey box fuzzers effectively feel their way around a program like a person feeling their way around a dark room. They are faster than white box fuzzers but more effective in increasing coverage than black box ones.
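The difference between the black box and grey box approaches can be sketched with a toy example in Python. Everything here is an assumption made for illustration: `toy_target` stands in for an instrumented program and reports which branches it reached, the "bug" only fires on the input `FUZZ`, and the mutation overwrites one random byte rather than using the richer mutation set of a real fuzzer such as AFL.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical instrumented program: returns the branch ids it reached
    and raises RuntimeError when the deeply nested 'bug' is hit."""
    covered = {0}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            covered.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add(3)
                if len(data) > 3 and data[3] == ord("Z"):
                    raise RuntimeError("simulated crash")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte (data must be non-empty)."""
    mutated = bytearray(data)
    mutated[rng.randrange(len(mutated))] = rng.randrange(256)
    return bytes(mutated)


def grey_box_fuzz(seed: bytes, max_iterations: int, rng_seed: int = 0):
    """Keep any input that reaches a new branch and mutate from it later."""
    rng = random.Random(rng_seed)
    queue = [seed]
    seen = set(toy_target(seed))
    for _ in range(max_iterations):
        candidate = mutate(rng.choice(queue), rng)
        try:
            coverage = toy_target(candidate)
        except RuntimeError:
            return candidate  # crashing input found
        if not coverage <= seen:  # reached at least one new branch
            seen |= coverage
            queue.append(candidate)
    return None


def black_box_fuzz(length: int, max_iterations: int, rng_seed: int = 0):
    """Throw fully random inputs at the target with no feedback at all."""
    rng = random.Random(rng_seed)
    for _ in range(max_iterations):
        candidate = bytes(rng.randrange(256) for _ in range(length))
        try:
            toy_target(candidate)
        except RuntimeError:
            return candidate
    return None
```

Calling `grey_box_fuzz(b"AAAA", 200_000)` climbs towards the crash one branch at a time, because each partial match is saved as a new seed. The black box version has to guess all four bytes at once, which is a one in roughly four billion chance per attempt.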

American Fuzzy Lop (AFL) is a good example of a grey box fuzzer. However, the researchers wanted to make it even better. They decided that just feeling its way through a program by flipping bits in an input file would only get the fuzzer so far. Bit flips alone are unlikely to produce the major structural changes in a file that could expose deeper bugs. To change that, they decided to create a map of the input file structure. This map, known as a virtual structure, describes the file format and shows where different parts (chunks) of the file begin and end, along with how each chunk differs from others. In their case, they developed a virtual structure for media formats like WAV audio files.
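To give a feel for what such a map records, here is a minimal Python sketch that walks the chunks of a WAV file and notes where each one begins and ends. It relies only on the standard RIFF layout (a four-character id and a little-endian 32-bit size before each chunk, with payloads padded to an even length). The `Chunk` class and function names are invented for this example, and this is not how AFLSmart itself represents a virtual structure.

```python
import struct
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: bytes  # four-character code such as b"fmt " or b"data"
    start: int       # offset of the chunk header within the file
    end: int         # offset one past the last payload byte (padding excluded)
    payload: bytes


def parse_wav_chunks(data: bytes) -> list:
    """List the top-level chunks of a RIFF/WAVE file with their boundaries."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    offset = 12  # skip b"RIFF", the overall size field and b"WAVE"
    while offset + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, offset)
        body_start = offset + 8
        body_end = min(body_start + size, len(data))  # tolerate truncation
        chunks.append(Chunk(chunk_id, offset, body_end, data[body_start:body_end]))
        offset = body_end + (size & 1)  # odd-sized payloads carry a pad byte
    return chunks


def build_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Serialise one chunk, adding the pad byte when the payload is odd."""
    pad = b"\x00" if len(payload) % 2 else b""
    return struct.pack("<4sI", chunk_id, len(payload)) + payload + pad


def build_wav(serialised_chunks: list) -> bytes:
    """Wrap already serialised chunks in a RIFF/WAVE container."""
    body = b"WAVE" + b"".join(serialised_chunks)
    return b"RIFF" + struct.pack("<I", len(body)) + body
```

For a tiny file built with `build_wav` from an `fmt ` chunk and a `data` chunk, `parse_wav_chunks` returns two `Chunk` records whose `start` and `end` offsets mark the boundaries that a structure-aware fuzzer could work with.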

This approach lets the researchers keep the bit-flipping approach that traditional grey box fuzzers use, while also mutating seeds at the level of whole file chunks. The tool can add, delete and splice different chunks when fuzzing at this level, producing more meaningful variations in its files. It then uses these to feel its way through the program, increasing its coverage by exploring variants on interesting files that can expose more bugs.
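A simplified Python sketch of chunk-level mutation follows. It treats a file as a list of `(chunk_id, payload)` pairs and implements naive versions of the add, delete and splice operations. The function names and the way chunks are chosen are assumptions made for illustration; a real tool would likely be more selective about which chunks it swaps.

```python
import random
import struct


def delete_chunk(chunks: list, rng: random.Random) -> list:
    """Drop one randomly chosen chunk, keeping at least one."""
    if len(chunks) <= 1:
        return list(chunks)
    i = rng.randrange(len(chunks))
    return chunks[:i] + chunks[i + 1:]


def add_chunk(chunks: list, donor: list, rng: random.Random) -> list:
    """Insert a chunk taken from a donor file at a random position."""
    if not donor:
        return list(chunks)
    i = rng.randrange(len(chunks) + 1)
    return chunks[:i] + [rng.choice(donor)] + chunks[i:]


def splice_chunk(chunks: list, donor: list, rng: random.Random) -> list:
    """Replace one chunk with a chunk taken from a donor file."""
    if not chunks or not donor:
        return list(chunks)
    i = rng.randrange(len(chunks))
    return chunks[:i] + [rng.choice(donor)] + chunks[i + 1:]


def serialize(chunks: list) -> bytes:
    """Write the chunk list back out as a RIFF/WAVE byte string."""
    body = b"WAVE"
    for chunk_id, payload in chunks:
        body += struct.pack("<4sI", chunk_id, len(payload)) + payload
        if len(payload) % 2:
            body += b"\x00"  # pad odd-sized payloads to an even length
    return b"RIFF" + struct.pack("<I", len(body)) + body
```

Because the size fields are recomputed in `serialize`, the mutated file keeps a consistent container even after chunks are added or removed, which is what lets it get past a parser's early sanity checks and into deeper code.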

The research team used this concept to enhance AFL, creating a tool called AFLSmart. By using this virtual structure, the researchers have managed to improve upon AFL’s already impressive record. According to the research paper, AFLSmart discovered 42 zero-day vulnerabilities in software tools that are already widely used and well tested. At the time of publication, the tool had led to 17 CVEs.

This tool promises to refine the already highly effective grey box approach to fuzzing. What would be really interesting is to analyse its performance compared to fuzzing that has the considerable resources of a large cloud infrastructure behind it, like Google’s OSS-Fuzz service.

Is it better to try and compromise with efficient approaches that balance coverage and speed, or to use a tedious but highly productive approach and then throw lots of cheap computing power at it?

It’s an important question because both software engineers and malicious hackers are increasingly relying on fuzzing to ferret out zero-days.

Whichever way works best, one thing is clear: the venerable old fuzzing process is improving thanks to advancing techniques.