Hardening LLVM With Random Testing [PDF Slides]

llvm.org

34 points by achew22 15 years ago · 8 comments

bigmac 15 years ago

Their idea to checksum the global variables is really clever. Many benchmarks and testsuites simply rely on verifying program output. They're able to verify a greater surface area of the compiler by ensuring that all the intermediate global values used in their random programs have the same values at the end of the computation.
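To make the checksum idea concrete, here is a minimal Python sketch. It is not the tool from the slides: the tiny statement format, the `run` interpreter, and the `buggy` flag that simulates a miscompiled subtraction are all assumptions made up for illustration. It shows how a wrong value in a global that never reaches the program's visible output can still be caught by checksumming every global at the end.

```python
import random
import zlib

MASK = 0xFFFFFFFF  # model 32-bit unsigned wraparound


def random_program(rng, n_globals=4, n_stmts=20):
    """Generate straight-line statements (dst, op, src_a, src_b) over global indices."""
    ops = ["add", "sub", "mul", "xor"]
    return [
        (rng.randrange(n_globals), rng.choice(ops),
         rng.randrange(n_globals), rng.randrange(n_globals))
        for _ in range(n_stmts)
    ]


def run(program, n_globals=4, buggy=False):
    """Interpret the program; buggy=True simulates a miscompiled 'sub' (hypothetical bug)."""
    g = [i + 1 for i in range(n_globals)]  # globals start as 1, 2, 3, ...
    for dst, op, a, b in program:
        if op == "add":
            v = g[a] + g[b]
        elif op == "sub":
            # The simulated miscompilation swaps the operands.
            v = g[b] - g[a] if buggy else g[a] - g[b]
        elif op == "mul":
            v = g[a] * g[b]
        else:  # "xor"
            v = g[a] ^ g[b]
        g[dst] = v & MASK
    return g


def checksum(globals_):
    """CRC32 over the final value of every global, not just the printed one."""
    data = b"".join(v.to_bytes(4, "little") for v in globals_)
    return zlib.crc32(data)


def find_mismatch(seed_count=100):
    """Return the first seed whose random program exposes the simulated bug, if any."""
    for seed in range(seed_count):
        prog = random_program(random.Random(seed))
        if checksum(run(prog)) != checksum(run(prog, buggy=True)):
            return seed
    return None
```

For example, with `prog = [(1, "sub", 2, 3)]` the correct and buggy runs agree on global 0 (the only value an output-checking test might print), but their checksums differ because global 1 ends up with a different value.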

It's not in the slides, but during the Q&A they revealed that far fewer bugs were found in gcc than in LLVM. With compilers, as with most software, battle-tested dinosaurs win the day when it comes to code quality.

It's also interesting to note that the greatest number of bugs were found in the InstCombine pass, which has since been completely refactored. In LLVM 2.6 it was one monolithic source file (13000 lines) with a zillion different peephole optimizations. Now it's broken up into 15 files.

  • regehr 15 years ago

    Another reason we reported more bugs to LLVM is that on average, they fixed bugs faster.

smcl 15 years ago

I first ran into the tool used in the article when I was working in the compiler team of a company that produced DSPs. I figured I'd fire up a set of 100 randomly generated tests to see how we coped.

I was astonished: these few tests yielded something like 5 serious bugs (crashes, bad optimisations) in the development branch. That was only when built with -O (full optimisation, rarely reveals bugs) on a single architecture; if I'd spent a bit more time, I reckon I could've uncovered a few more just by adding or changing the build switches for the same tests.
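The "same tests, different build switches" idea can be sketched roughly as below. This is only an illustration: the compiler name `cc`, the test file names, and the output naming scheme are hypothetical, and the flags are ordinary GCC/Clang switches picked as examples, not the ones used on the DSP toolchain. The function only builds the command lines; it does not run anything.

```python
import itertools

# Common GCC/Clang optimisation levels, chosen as examples.
OPT_LEVELS = ["-O0", "-O1", "-O2", "-O3", "-Os"]
# Extra switch sets to combine with each level (illustrative picks).
EXTRA_FLAGS = [[], ["-funroll-loops"], ["-fno-strict-aliasing"]]


def build_matrix(tests, compiler="cc"):
    """Return one compile command per (test, optimisation level, extra flag set)."""
    return [
        [compiler, opt, *extra, test, "-o", f"{test}.{opt[1:]}.{i}.out"]
        for test, opt, (i, extra) in itertools.product(
            tests, OPT_LEVELS, enumerate(EXTRA_FLAGS))
    ]
```

Each resulting list could be handed to something like `subprocess.run`, and the binaries' results compared against each other, so that 100 generated tests become 1500 builds without generating any new programs.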

Unfortunately I wasn't able to do this, as I was let go shortly afterwards and couldn't convince anyone to include this in regular testing before I left. I wasn't even able to submit bug reports to John Regehr and his team at the University of Utah, who were curious about what kinds of bugs their tool was uncovering, even though I promised I would.

pohl 15 years ago

"Found deep optimization bugs unlikely to be uncovered by other means"

Very cool.

The slides don't make it clear whether the various bugs discovered ended up being covered by unit tests in the regular LLVM suite.

Does anybody remember that in the mid-90s there was a 'crashme' program that could be used to fuzz test the Linux kernel? I recall looking for it again about 5 years ago, and I couldn't find any references to it. Did that technique fall out of use?

  • achew22 (OP) 15 years ago

    I don't know specifically about the `crashme` tool, but I can say that fuzzing is not a technique that has gone out of vogue. In fact it is standard security practice for finding ill-defined behavior in programs, such as buffer overflows and other nasties. When you read the exasperated cries of the security researchers who have been sitting on a critical IE/Firefox/Whatever bug, they almost always scream something to the effect of "Why didn't they just use a fuzzer, it's easy to find these problems that way -- that's how I did it." I would like to give props to Google: their security teams have been diligent in running static analysis and fuzzing tools against their code (white box[1][2]/black box testing[3]).

    As always, Wikipedia is a great source of information on this one[4], and I can personally vouch for OWASP's fuzzer if you're going after webpages (the last local OWASP meeting I went to was on fuzzing and was REALLY interesting).

    [1] http://en.wikipedia.org/wiki/White-box_testing

    [2] http://en.wikipedia.org/wiki/Static_code_analysis

    [3] http://en.wikipedia.org/wiki/Black-box_testing

    [4] http://en.wikipedia.org/wiki/Fuzz_testing

    EDIT: Fixing formatting

  • eklitzke 15 years ago

    Dave Jones has been fuzzing Linux system calls recently; here's one LWN article about the topic: http://lwn.net/Articles/414273/ . There are also some posts about more bugs he's discovered this way on his blog.
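For anyone who hasn't seen the fuzzing technique discussed in this subthread, a minimal mutation fuzzer can be sketched in a few lines of Python. Everything here is an assumption for illustration: `parse_record` is a hypothetical target with a planted bug, and the single-byte `mutate` strategy is far cruder than what crashme or a system-call fuzzer does.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: the first byte is a length, followed by a payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    # Planted bug: trusts the length byte instead of checking the payload size.
    return bytes(payload[i] for i in range(length))


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte of the seed with a random value."""
    buf = bytearray(seed)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(target, seed: bytes, iterations=1000, rng_seed=0):
    """Return (input, exception name) pairs for unexpected failures of `target`."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ValueError:
            pass  # documented rejection of bad input, not a bug
        except Exception as exc:  # anything else counts as a finding
            crashes.append((candidate, type(exc).__name__))
    return crashes
```

Calling `fuzz(parse_record, b"\x03abc")` collects the mutated inputs whose length byte claims more payload than is present. The design choice worth noting is the split between expected rejections and everything else: only the latter is reported, which keeps the findings list free of inputs the target handled correctly.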

tdj 15 years ago

LLVM is pretty cool. I've started using it in development simply because the compile times are much faster than GCC and the clang error messages actually make sense.

I also wish that this testing methodology would be adopted by other projects; it makes QA much easier.

barrkel 15 years ago

If you prefer the Google viewer (seems more efficient and pleasant to use than scribd):

http://docs.google.com/viewer?url=http%3A%2F%2Fwww.llvm.org%...
