Walter Tichy recently interviewed Westley Weimer and Martin Monperrus on the subject of automated bug fixing; the interesting interview appears in the March 2015 issue of ACM Ubiquity and is available here. In a follow-up message, he invited the AutoFix team — which includes yours truly — to comment on the interview. In this post I take up his suggestion and briefly discuss the approach to automated bug fixing that we developed in the AutoFix project. Even though I’ll refer to the whole AutoFix team (which also includes Yu “Max” Pei, Yi “Jason” Wei, Martin Nordio, Bertrand Meyer, and Andreas Zeller), these comments reflect my own views — this is my blog, after all 😉 Nonetheless I believe my co-authors’ opinions are in line with mine; and everybody’s welcome to leave comments on this post if there’s anything I missed that they want to point out.

The term “automated bug fixing” — or, as it is more commonly referred to in software engineering research, “automated program repair” — denotes techniques and tools capable of suggesting changes that remove faulty behavior of programs. The fixes are often in the form of source-code patches but can also target binary executables or other representations. The most widespread approaches are driven by tests: failing tests identify incorrect behavior that must be corrected; a fix is a change that turns failing tests into passing tests (that is, it corrects some faulty behavior) without affecting the behavior on originally passing tests (that is, it preserves correct behavior).
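To make the test-driven setup concrete, here is a deliberately trivial, hypothetical example of mine (not taken from the output of any actual repair tool). Consider an Eiffel class with a sign error:

    class ACCOUNT

    feature -- Access

        balance: INTEGER
                -- Current balance.

    feature -- Element change

        deposit (amount: INTEGER)
                -- Add `amount' to `balance'.
            do
                balance := balance + amount
            end

        withdraw (amount: INTEGER)
                -- Subtract `amount' from `balance'.
            do
                balance := balance + amount  -- Bug: should be `balance - amount'.
            end

    end

A test that deposits 100, withdraws 30, and then checks balance = 70 fails; tests that only exercise deposit pass. A repair tool then searches for an edit that turns the failing test into a passing one without breaking the others; in this case, replacing “+” with “-” in withdraw.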

AutoFix is our technique for automated program repair. Its main distinguishing feature — apparent from the titles of the main publications describing it and listed below — is that it takes advantage of contracts (assertions such as pre- and postconditions) available in the source code. Contracts, as they are normally used in Eiffel or in other contract-equipped languages, are simple specification elements embedded in the program text. The key word here is simple: don’t think quantifiers, first-order logic, and formal verification; think list /= Void, set.count > 0, and runtime checking.
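To give a flavor of what such simple contracts look like, here is a hypothetical Eiffel routine of my own (the identifiers are made up, but the style matches the expressions just quoted):

    remove_first (list: LINKED_LIST [INTEGER])
            -- Remove the first element of `list'.
        require
            list_exists: list /= Void
            list_not_empty: list.count > 0
        do
            list.start
            list.remove
        ensure
            one_fewer: list.count = old list.count - 1
        end

Every clause is a plain Boolean expression that the runtime can evaluate before and after each call; no quantifiers or theorem provers are involved.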

How does AutoFix use contracts? The main usage is actually independent of the details of the program repair technique: AutoFix generates random unit tests (using another tool called AutoTest) and uses contracts as oracles to classify them into passing and failing. Contracts also remain available for runtime checking throughout the various steps of the AutoFix algorithm. For example, AutoFix uses that information to sharpen the precision of its fault localization algorithms. (As Weimer explains in the interview, fault localization answers the question: “where in the program should we edit?”, before fix generation decides “what should we edit it to?”.) Contract checking also helps regression testing of candidate fixes, and the Boolean expression of a violated assertion may suggest cases that require special handling, such as an empty list or an index on an array’s boundary.
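As an illustration of contracts serving as oracles, consider another hypothetical routine of mine with an off-by-one error:

    sum_up_to (a: ARRAY [INTEGER]; n: INTEGER): INTEGER
            -- Sum of the first `n' elements of `a'.
        require
            a_exists: a /= Void
            n_in_range: 0 <= n and n <= a.count
        local
            i: INTEGER
        do
            from
                i := a.lower
            until
                i > a.lower + n  -- Bug: should be `i >= a.lower + n'.
            loop
                Result := Result + a [i]
                i := i + 1
            end
        end

A random test generated by AutoTest with n = a.count drives the access a [i] past the last element, violating ARRAY’s index precondition. The violated assertion plays two roles: it classifies the test as failing (the oracle), and it points squarely at an index on the array’s boundary as the case that needs attention.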

In terms of final output, the greater precision in fault localization and fix generation enabled by having contracts helps build good-quality fixes with limited search. Indeed, emphasis on the quality and acceptability of fixes is an aspect that we stressed early on in our research on AutoFix, predating the growing concern with similar aspects that characterizes some recent related work (such as PAR, Monperrus’s, and Rinard’s). Since the fixes produced by automated program repair are validated against a finite set of tests, they come with no formal guarantees of correctness or even adequacy. The work with AutoFix is no exception here because it deals with simple, and hence glaringly incomplete, contracts that cannot be used for static verification. In our experiments, however, we have manually inspected the fixes produced by AutoFix to assess their quality from a programmer’s perspective. It turns out that over half of the valid fixes produced by AutoFix are what we call proper; that is, they not only pass the available tests (a requirement that may be trivial when the tests are few or of poor quality) but are ostensibly consistent with the program’s implicit specification and can be applied to the codebase without introducing undesired behavior.
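Continuing the hypothetical sum_up_to example from above, here are two candidate fixes a repair tool might propose:

    -- Valid but not proper: guard the access, which suppresses the
    -- contract violation but leaves the off-by-one in place, so the
    -- routine still sums one element too many whenever it can.
    if a.valid_index (i) then
        Result := Result + a [i]
    end

    -- Proper: correct the loop's exit condition, consistent with the
    -- routine's implicit specification.
    until
        i >= a.lower + n

Both candidates turn the failing tests into passing ones, since there is no postcondition that checks the computed sum; but only the second can be merged into the codebase without introducing undesired behavior.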

Performance is another aspect indirectly impacted by improving precision through using contracts: since AutoFix’s repair process explores a search space that is better structured by means of the annotations, it needs only limited computational resources. Running our experiments with AutoFix in a cloud computing environment would cost, on average, only a few cents per produced fix.

In the quickly growing landscape of automated program repair techniques, AutoFix offers an interesting trade-off, different from that of other approaches. On the one hand, relying on contracts and related techniques restricts AutoFix’s applicability to programs annotated with some contracts; on the other hand, in exchange for this additional burden (which is not unreasonable!), AutoFix can produce, with limited computational resources, fixes that are often of high quality.

References