Intro
• Psychological factors in language design…
• … and compiler error messages…
• … and static analysis tools…
• … and funny pictures of cats.
Who is this guy?
• Compiler developer / language designer at Microsoft from 1996 through 2012
• Visual Basic, VBScript, JScript, VS Tools for Office, C# / Roslyn
• Static analysis architect for C# at Coverity since January
• I will use “we” totally inconsistently
• I have no formal background in static analysis
• I take an engineering rather than academic approach
The business case for C#
• Productive, successful professional developers who target Microsoft platforms make those platforms more attractive to Microsoft’s customers
• Original design goal was “a simple, modern, general-purpose language”
• Any language with an 800-page specification is no longer simple, but modern and general-purpose still apply
• Understanding developer psychology is key to achieving wide adoption of any developer tool
Target C# Developer Characteristics
• Professionals, not amateurs
• Engineers, not hackers
• Programming experts, not line-of-business experts
• Pragmatists, not academics
• Skeptics, not true believers
• Conservatives, not radicals
Conservatism
• C# developers hate breaking changes imposed by tools
• Even trivial breaking changes are agonized over
• In 11 years and 6 releases C# has never added a new reserved keyword
• New keywords are contextual so as to not be breaking
• This imposes considerable restrictions on new syntaxes
• For example, consider iterator blocks:
    double yield = 123.4;
    yield return yield;
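The iterator-block example above can be turned into a small compilable sketch. The class and method names (`ContextualYieldDemo`, `Values`) are hypothetical and only there to make the fragment self-contained; the point is that `yield` is treated as a keyword only when it is immediately followed by `return` or `break`.

```csharp
using System.Collections.Generic;

public static class ContextualYieldDemo
{
    // Hypothetical iterator, for illustration only.
    public static IEnumerable<double> Values()
    {
        double yield = 123.4;   // 'yield' is an ordinary local variable here
        yield return yield;     // contextual keyword, then the local variable
        yield = yield * 2;      // plain assignment: no 'return' or 'break' follows
        yield return yield;
    }
}
```

Because existing code that used `yield` as an identifier keeps compiling, adding iterator blocks did not require a breaking change.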
Conservatism
• C# app developers also hate breaking their users
• Facilitating versionable components was a priority 1 design goal
• Numerous seemingly-counterintuitive features actually mitigate brittle-base-class failures:
    class Base { public void M(int x) { } }
    class Derived : Base { public void M(double x) { } }
    ...
    derived.M(123); // Base.M or Derived.M?
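The question on the slide above can be answered by running it. This is a minimal sketch: the methods return strings instead of `void` so the choice is observable, and `OverloadDemo` is a hypothetical wrapper. In C#, if any method declared in the more derived class is applicable, methods from base classes are not considered, even when a base method is a better match for the argument type.

```csharp
public class Base
{
    public string M(int x) { return "Base.M(int)"; }
}

public class Derived : Base
{
    public string M(double x) { return "Derived.M(double)"; }
}

public static class OverloadDemo
{
    // Hypothetical helper that reports which overload the compiler picked.
    public static string Resolve()
    {
        var derived = new Derived();
        return derived.M(123);   // binds to Derived.M(double), not Base.M(int)
    }
}
```

One way to read the design choice: if `Base.M(int)` were added in a later version of a base-class library, callers of `Derived` would not silently start calling the new method.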
Conservatism
• C# 4.0 added dynamic dispatch to facilitate interoperability with dynamic languages and “legacy” object models
• Enormous MVP community pushback
    • “I will use this feature correctly but my coworkers are going to abuse it and then I’m going to have to fix their god-awful hacked-up code”
• Anything that makes the compiler less capable of finding bugs is met with skepticism and resistance
• Completely redesigned based on early feedback
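A small sketch of why `dynamic` worries developers who rely on the compiler to find bugs. `DynamicDemo` and the deliberately nonexistent `NoSuchMethod` are hypothetical names; the sketch assumes the project references the `Microsoft.CSharp` runtime binder, as typical project templates do.

```csharp
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicDemo
{
    public static string Probe()
    {
        dynamic value = "hello";
        int length = value.Length;      // member lookup is deferred to runtime

        try
        {
            value.NoSuchMethod();       // compiles, even though string has no such member
            return "no error";
        }
        catch (RuntimeBinderException)
        {
            // The mistake surfaces only when the line executes.
            return "runtime binder error, length " + length;
        }
    }
}
```

With a statically typed `string`, the same call would have been rejected at compile time.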
Error reporting psychology
• Dealing with correct code is literally the smallest problem
• “Roslyn” does syntactic analysis of broken code in the time between keystrokes; semantic analysis takes a little longer
• Error messages need to be understandable, accurate, polite and diagnostic rather than prescriptive
• Let’s take a look at some examples
Error reporting psychology
    A params parameter must be the last parameter in a formal parameter list
Is this saying:
• If there is a params parameter, it must be the last one? Or
• The last parameter and only the last parameter must always be a params parameter? Or
• The last parameter must be a params parameter; if others are as well, that’s fine too?
The error is only clear if the feature is already understood
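The first reading is the intended one. A short sketch with hypothetical method names (`Sum`, `Twice`) shows what is and is not accepted:

```csharp
public static class ParamsDemo
{
    // Legal: the params parameter is the last parameter.
    public static int Sum(int first, params int[] rest)
    {
        int total = first;
        foreach (int value in rest)
        {
            total += value;
        }
        return total;
    }

    // Legal: a method does not need a params parameter at all.
    public static int Twice(int x) { return 2 * x; }

    // Rejected with the message quoted above, because the params
    // parameter is not last:
    // public static int Bad(params int[] rest, int last) { return last; }
}
```

`ParamsDemo.Sum(1, 2, 3)` passes `2` and `3` as the `rest` array; `ParamsDemo.Sum(1)` passes an empty array.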
Error reporting psychology
Error messages must read the mind of a developer who wrote broken code and figure out what they meant.
    class C { public virtual static void M(){} }
Error reporting psychology
    Complex operator +(Complex x, Complex y) { ...
    User-defined operator must be declared static and public
• This is an example of a prescriptive error done right
• The user absolutely positively has to do this to overload an operator
• Odds that they were not trying to overload an operator are low
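Following the error's prescription gives a declaration the compiler accepts. The slide elides the body, so the fields and the addition logic in this sketch are assumptions made only to keep it self-contained.

```csharp
public struct Complex
{
    // Hypothetical fields; the original slide does not show the type's members.
    public readonly double Re;
    public readonly double Im;

    public Complex(double re, double im)
    {
        Re = re;
        Im = im;
    }

    // Declared 'public static', as the error message prescribes.
    public static Complex operator +(Complex x, Complex y)
    {
        return new Complex(x.Re + y.Re, x.Im + y.Im);
    }
}
```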
Warnings are harder than errors
• Must infer the developer’s erroneous thoughts
• Compiler must be fast
    • This makes an opportunity for third-party tools
• Must be plausibly wrong
    • A warning for code that no one would reasonably type is unhelpful
• Must be able to eliminate warning
    • And ideally the warning should tell you how
• Must have low false positive rate
    • Encouraging developers to change correct code is harmful
    • We will return to this point later
What do C# developers want?
Rigidly defined areas of doubt and uncertainty
• Static type checking, type safety, memory safety…
• … that can be disabled if necessary.
• A compiler that infers developer intent…
• … with predictable behavior and understandable rules
• Actionable errors when inference fails…
• … rather than muddling on through and getting it wrong
C# was originally called SafeC
C# throws developers into the “Pit of Success”:
• Eliminate unimportant dangerous features entirely
    • switch fall through
• Restrict dangerous features to clearly-marked unsafe code regions
• Eliminate implementation-defined behaviours
    • x = ++x + x++; is well-defined in C# …
    • … but still a bad idea.
• Define common undefined behaviours
    • Accessing an array out of bounds causes an exception
• Mandate compiler warnings

There are numerous defects that the Coverity C/C++ analysis checkers detect which are impossible, unlikely, or already warnings in C#. Let’s look at a few dozen. Quickly.

These are all defects found by Coverity in C/C++ that are not worth checking in C#…
C/C++ defects inapplicable to C#:
• Local read before assignment
    • C# rejects programs that use uninitialized locals
• Uninitialized fields / arrays
    • Fields and arrays are automatically zeroed out
• Treating a pointer to a variable as a pointer to an array
    • Rare, must be marked as unsafe
• Buffer length arithmetic errors
    • Strings and arrays know their lengths; checked at runtime
• Pointer/integer/char/bool/enum type errors
    • Not inter-assignable in C# without explicit cast operators
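The first two items above can be observed in a few lines. `Counters` and `InitDemo` are hypothetical names used only for this sketch.

```csharp
public class Counters
{
    public int Hits;        // never assigned: reads as 0
    public string Label;    // never assigned: reads as null
}

public static class InitDemo
{
    public static bool DefaultsAreZeroed()
    {
        var counters = new Counters();
        int[] buffer = new int[4];   // every element starts at 0

        // The following would not compile, because the local is read
        // before it is definitely assigned:
        //   int local;
        //   return local == 0;

        return counters.Hits == 0
            && counters.Label == null
            && buffer[0] == 0
            && buffer[3] == 0;
    }
}
```

A compiler will typically also warn that the two fields are never assigned, which is itself one of the warnings this section mentions.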
C/C++ defects inapplicable to C#:
• Failure to consistently check error return codes
    • C# uses exceptions
• Accidental sign extension
    • Either error or warning
• Implementation-defined side effect order
    • Side effect order is well-defined
• Statement with no effect
    • Is actually a parse time error in C#
• Accidental use of ambiguous names
    • C# requires that a simple name have a unique meaning in a block
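The well-defined side-effect order and the bounds-checked array access mentioned earlier can both be checked directly. `SafetyDemo` and its method names are hypothetical.

```csharp
using System;

public static class SafetyDemo
{
    public static int SideEffectOrder()
    {
        int x = 1;
        // Operands are evaluated left to right:
        //   ++x increments x to 2 and yields 2,
        //   x++ yields 2 and increments x to 3,
        //   then the sum, 4, is assigned to x.
        x = ++x + x++;
        return x;
    }

    public static bool OutOfBoundsThrows()
    {
        int[] items = new int[3];
        int index = 3;               // one past the last valid index
        try
        {
            items[index] = 1;
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;             // the runtime reports the bad index
        }
    }
}
```

`SideEffectOrder()` returns 4 on every conforming implementation, which is the sense in which the expression is well-defined, if still a bad idea.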
C/C++ defects inapplicable to C#:
• sizeof mistakes
    • C#’s sizeof operator only takes types
• Unintentional switch fall-through
    • Is an error
• Unreachable code
    • Is a warning
• Accidental assignment or comparison of variable to itself
    • Yep, that’s a warning too
• Field never written or never read
    • Man that’s a lot of warnings
• Missing return statement
    • Is illegal
• malloc without free / free without malloc / allocator/deallocator mismatch / use after free
    • Not needed in a garbage-collected language
• Dereferencing an address that lived longer than the storage it refers to
    • References to variables may not be stored in long-term storage
• Accidental use of function pointer
    • Method group expressions can only be used in strictly limited locations
• Overriding errors
    • The language was designed to mitigate brittle base class failures by default
Defects common to C/C++ and C#
• Copy paste mistakes
• Expression contains variables but always has the same result
• You checked for null here, you dereferenced without checking there.
• Some infinite loops
• Dangling else and other indentation issues
• Array index out of bounds
• Integer overflow
    • checked arithmetic is off by default
• Non-memory resource leaks
    • Such as forgetting to close a file
• Stray semicolons
• Swapped arguments
• Unused return value
• Uncaught exception
• Missing or misordered critical sections
    • Including non-atomic operations inconsistently inside critical sections
• And many more!

And these are just a few that are common to C and C#; there are a whole host of defects specific to C# programs that we could find statically. Let’s consider the psychological aspects of static analysis tools beyond the compiler.
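The integer overflow item above is easy to demonstrate. In this sketch (`OverflowDemo` is a hypothetical name) the `unchecked` block is written out explicitly so the behaviour does not depend on compiler options; it matches what happens by default when checked arithmetic is off.

```csharp
using System;

public static class OverflowDemo
{
    // Mirrors the default setting: overflow wraps around silently.
    public static int AddOneUnchecked(int value)
    {
        unchecked
        {
            return value + 1;
        }
    }

    // Opting in to checked arithmetic turns the overflow into an exception.
    public static bool AddOneCheckedThrows(int value)
    {
        try
        {
            checked
            {
                int result = value + 1;
                return result < value;   // not reached when the addition overflows
            }
        }
        catch (OverflowException)
        {
            return true;
        }
    }
}
```

`AddOneUnchecked(int.MaxValue)` yields `int.MinValue`, which is exactly the kind of silent wraparound a static analysis tool can flag.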
Developer Adoption is Key
• Soundness is explicitly a non-goal
• We don’t want to find all defects or even most defects
• We want every defect reported to be a customer-affecting bug
• Developers won’t adopt a product that they perceive as making their jobs harder for no customer benefit
• Our business model requires adoption to drive renewals
• How do developers (who, remember, are using C# because they like a statically-typed language) react to static analysis tools?
Developer psychology WRT analysis tools
• Egotistical
    • I don’t need this tool for my code
    • But my coworkers on the other hand…
    • Clever management uses this trait to advantage
Developer psychology WRT analysis tools
• Skeptical, conservative, dismissive
    • Resistant to change
    • Quick to criticize “stupid” false positives
    • The first five defects they see had better be true positives
Developer psychology WRT analysis tools
• “Busy” with, you know, “real work”
    • Code annotations are unacceptable
    • Analysis tool must adapt to customer’s build process
    • Overnight analysis runs are acceptable, barely
Developer psychology WRT analysis tools
• Any change in what defects are reported on the same code over time (a.k.a. “churn”) is the enemy
• Randomized analysis is right out, unfortunately
• Any improvement to our analysis heuristics can cause unwanted churn
• We try to keep churn below 5% on every release
Developer psychology WRT analysis tools
• Responds well to perverse incentives
    • Hard-to-understand defect reports are easy to ignore
    • No downside to incorrectly triaging true positives as false positives
• Finding defects is hard; presenting evidence that prevents incorrect classification as a false positive is harder
• Deep analysis with theorem provers can be worse than shallow analysis with cheap heuristics.
• Presenting the result is insufficient; the developer must understand the proof to fix the defect.
Displaying good defect messages
    public void GetThing(Type type, bool includeFrobs)
    {
        bool isFrob = (type != null) && typeof(IFrob).IsAssignableFrom(type);
        object instance = this.objects[this.name];
        if (instance is IFrob && includeFrobs)
        { [...] }
        else if (type.IsAssignableFrom(instance.GetType()))
        { [...] }
    }
Displaying good defect messages
    public void GetThing(Type type, bool includeFrobs)
    {
        // Assuming type is null. type != null evaluated to false.
        bool isFrob = (type != null) && typeof(IFrob).IsAssignableFrom(type);
        object instance = this.objects[this.name];
        // instance is IFrob evaluated to true. includeFrobs evaluated to false.
        if (instance is IFrob && includeFrobs)
        { [...] }
        // Dereference after null check: dereferencing type while it is null.
        else if (type.IsAssignableFrom(instance.GetType()))
        { [...] }
    }
Management psychology
• The first time static analysis runs there may be thousands of errors; typical rate is one defect per thousand LOC
• Academic answer: rank heuristics
• Pragmatic answer: ignore them all
    • Simply ignore all defects in existing code
    • Triage and fix defects in new code
    • “Someday” get around to fixing defects in old code
• Why is this so popular?
    • Old code is in the field. It works well enough. Risk is low.
    • New code is unproven. It might work, or it might not. Risk is high.
Management psychology
• Management actually pays for the developer tools
    • And typically has no idea how to use them effectively
• Middle management has perverse incentives too
    • Time, cost and complexity are easily measured; quality is not
    • “Never upgrade the static analysis tool before release”
    • Worse tools are better; better tools are worse
Worse is better; better is worse
[Figure: two graphs of known defects against time. First graph: no tool improvements == management gets bonus. Second graph: tool upgrades find more defects == management gets no bonus.]
The fix rate is the same in these two graphs, but if the tool improves faster than the fix rate, no bonus.
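The arithmetic behind the two graphs can be sketched with a toy model. Everything here is an assumption made for illustration: the class name, the parameters, and the numbers in the usage note are hypothetical and are not Coverity data.

```csharp
using System;

public static class BacklogDemo
{
    // Toy model: each release fixes a constant number of known defects,
    // while a tool upgrade may surface a constant number of new ones.
    public static int KnownDefects(
        int initial, int fixedPerRelease, int newlyFoundPerRelease, int releases)
    {
        int known = initial;
        for (int i = 0; i < releases; i++)
        {
            known = Math.Max(0, known - fixedPerRelease + newlyFoundPerRelease);
        }
        return known;
    }
}
```

With the same fix rate of 100 per release over five releases, `KnownDefects(1000, 100, 0, 5)` falls to 500, while `KnownDefects(1000, 100, 150, 5)` rises to 1250. A manager measured on the known-defect count looks better with the tool that never improves.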
Good news
If you have a well-engineered product that:
• makes good use of theoretical and pragmatic approaches,
• finds real-world, user-affecting defects, and
• takes developer and management psychology into account
Then you can make a positive difference
Conclusion
• Theoretical static analysis techniques are awesome; we can and do use them in industry…
• … but doing all that math is actually only one small part of shipping a static analysis product
• Understanding developer and management psychology is necessary to ensure adoption of any developer tools
• C# was carefully designed to match a target developer mindset
• Coverity thinks about developer and manager psychology at every stage in the analysis and overall product design
• Research into better ways to present defects would be awesome
More information
• Learn about Coverity at www.Coverity.com
• Read “A Few Billion Lines Of Code Later”
• Find me on Twitter at @ericlippert
• Or read my C# blog at www.EricLippert.com
• Or ask me about C# at www.StackOverflow.com