During the spring semester, I supervised three groups of students working on master’s thesis projects that I proposed. Tackling quite different problems, they all did an excellent job, and it was fun and interesting to supervise their work and to follow their results. In this post, I’m giving a brief account of their work now that they are all done. I will add links to the official thesis reports in Chalmers’s Library as soon as they become available (probably in the fall). While we’re at it: if you’re a current student looking for a master’s thesis project, make sure to check out the projects I currently offer or drop me a line.

The spec is out there — Extracting contracts from code

Pontus Doverstav and Christoffer Medin developed a tool that extracts partial JML contracts (a kind of formal specification) from Java code. Their approach differs from the most common techniques for specification inference, which ultimately reason about the semantics of code (for example, abstract interpretation techniques abstract the concrete semantics of a program on simpler domains). In contrast, Pontus and Christoffer’s work is syntax-driven, as it is based on a collection of patterns that capture how certain code implicitly suggests simple contracts. For example, a qualified call t.foo() implicitly requires that t != null; this assertion is propagated to the precondition of the method where the call appears.

The whole approach is deliberately kept simple and best effort. For instance, whenever there is a method call, we try to determine its effects based on the callee’s (partial) contract; if no callee contract is available, we just assume a worst-case behavior to preserve soundness. In the worst case, an analyzed method may be marked as “failure” by the tool, denoting that no meaningful specification could be reliably extracted. An advantage of keeping the analysis so simple is that we can apply it extensively and retain any result that is useful. When they applied their tool to thousands of methods in four open-source projects, Christoffer and Pontus found that it could extract more than one JML specification case per method on average. Many extracted contracts are simple, but a few are quite complex — such as one capturing in detail the behavior of a large switch statement covering 256 cases. The thesis also reports on some preliminary evidence that suggests the contracts extracted this way can be complementary to those a programmer would write, thus helping the analysis of programs and other applications driven by specifications.

Comparing programming languages in Google Code Jam

Alexandra Back and Emma Westman analyzed empirically the solutions submitted to Google’s Code Jam programming contest, with the overall goal of comparing features of different programming languages. The main motivation behind targeting Code Jam was finding a collection of programs that all implement the same functionality (whatever is required by each round in GCJ) in different programming languages (contestants can use any programming language of their choice), while being able to control for quality — the intuition being that a solution’s rank in the contest is strongly indicative of its quality in terms of correctness and efficiency.

The study focused on the five programming languages most widely used in the contest — C, C++, C#, Java, and Python — and analyzed over 210’000 solution programs. Since a contestant’s performance is based only on his or her ability to produce the correct output for a given, large input to the program, there is no guarantee that the program that has been submitted is the same as the one that was used to generate the correct solution. This complicated the study, and ultimately required to drop 38% of the solutions from some analyses, as they could not be compiled and executed. Nonetheless, a large number of usable solutions remained that could be fully analyzed, which gives confidence in the study’s results. Among them, Alexandra and Emma found that the highest-ranked contestants use C++ predominantly but by far not exclusively; that Python programs tend to be the most concise ones, and C# programs the most verbose; that C and C++ dominate the comparison of running time performance, even though the other languages do not fall that far behind; that C and C++ also have the smallest memory footprint, while Java has the largest (probably due to how the JVM manages memory). Check out the thesis for many more details and results.

Incremental deductive verification for a subset of the Boogie language

Leo Anttila and Mattias Åkesson developed a tool that verifies Boogie programs incrementally as they are modified in each new iteration. Boogie is a popular intermediate verification language, combining a simple imperative language and an expressive program logic. Boogie is also the name of a tool that verifies Boogie programs. Whenever a procedure changes implementation or specification, Boogie verifies it from scratch — like most other verifiers; in contrast, Leo and Mattias’s tool is able to reuse the results of previous verification runs. Suppose, for example, we have verified a procedure foo; later, we introduce a small change to foo that is still compatible with the procedure’s specification. The incremental verifier is aware of what has changed, and only verifies that, intuitively, the difference between the current version of foo and the previous one does not affect correctness.

The details of how incremental verification works are quite technical, and rely on a combination of attribute grammars and incremental parsing. My experience working with Boogie (or program verifiers that use it as back-end) suggests that such an incrementality matches well the way so-called auto-active verification tools are used in practice — namely in a stream of small changes to the source code each immediately followed by a call to the verifier, whose responsiveness is of the essence to support a smooth user experience. The thesis describes experiments simulating this mode of interaction, showing that incrementality can often reduce the running time of the verifier by 30% or more.


  1. Christoffer Medin and Pontus Doverstav: The spec is out there — Extracting contracts from code, Master’s thesis in computer science. Department of Computer Science and Engineering, Chalmers University of Technology, Sweden, June 2017.
  2. Alexandra Back and Emma Westman: Comparing programming languages in Google Code Jam, Master’s thesis in computer science, algorithms, languages and logic. Department of Computer Science and Engineering, Chalmers University of Technology, Sweden, June 2017.
  3. Leo Anttila and Mattias Åkesson: Incremental deductive verification for a subset of the Boogie language, Master’s thesis in computer science, algorithms, languages and logic. Department of Computer Science and Engineering, Chalmers University of Technology, Sweden, June 2017.

These are good times for formal verification. Amazon Web Services is using formal specifications and model checking to support rigorous design at a very large scale. Facebook, of all companies, is getting serious about static analysis, with Monoidics’s separation logic gurus and Francesco Logozzo of Clousot fame among its recent hires. John Carmack, as a programmer, is aggressively pursuing static analysis. Only thanks to formal verification has a remote but pernicious bug in Java’s standard library sort been discovered. It’s clear that — slowly, one step at a time — the offerings of formal methods are becoming practically helpful.

Still, the transition from pure research to practical applications requires time, in formal verification as in other areas. Among the intermediate steps that have to be covered there’s dealing with instances that are realistic: not necessarily full-blown industrial-strength, but not discounting any features that are expected of a fully usable implementation. Realistic software also moves past the challenges of benchmarks, which typically focus on one or few complicated aspects while eliding features that are considered commonplace and hence not interestingly problematic.

Nadia Polikarpova‘s presentation at the Formal Methods Symposium in Oslo a few weeks ago was precisely about dealing with realistic software in verification: our paper describes the verification of EiffelBase2, a realistic general-purpose data-structure library, against its complete functional specification.

Before you roll your eyes thinking, “Seriously?! Verification of data structures in 2015? Wasn’t the problem solved in the 1970s? Or maybe earlier! And they gave them a best paper award for this stuff? Unbelievable!”, let me explain what’s new and, hopefully, interesting about this work. It is true that the amount of work on verifying data structures is overwhelming. Not only because the effort poses interesting challenges but also because data structures are a domain that is naturally amenable to formalization and analysis: everybody agrees on the expected behavior of, say, a linked stack, and it can be formalized succinctly. In other domains (reactive software for example) even specification is a challenge in terms of both “what” and “how”. Still and all, most data-structure verification work focuses on demonstrating certain aspects of a technique and is applicable to implementations written in a certain way, possibly with some restrictions, and not in others. In contrast, Nadia implemented and specified EiffelBase2 before verification (even if, naturally, some later adjustment were necessary to complete the verification effort) with the goal of providing a usable realistic container collection without compromising on any aspect (object-oriented design, API usability, implementation performance, …) that is to be expected in practice. In fact, EiffelBase2 now ships with the EiffelStudio compiler as an alternative to the old EiffelBase library. (It did not fully replace it only for backward compatibility reasons.)

Here are a few examples of EiffelBase2 features that you would expect in a usable data-structure library (think java.util or .NET Collections, or C++ STL), but which are often simplified or omitted altogether in verification research:

  • External iterators, with the possibility of statically checking that multiple iterators operating on the same structure are safe, that is modify it consistently.
  • Hash tables with arbitrary user-defined objects as keys and object-based key comparison, with the possibility of statically checking that modifying objects used as keys does not affect table consistency.
  • Rich, convenient APIs offering operations for searching, replacing, inserting, and removing elements, merging containers, converting between different containers, object-based comparison (i.e., comparing objects according to their content rather than by references pointing to it), and copy constructors.
  • Abstract specifications of data structures, consistent with the inheritance hierarchy, and provably complete for functional correctness.
  • Implementations that do not trade performance off for verifiability: array-based lists use ring buffers, hash tables automatically resize their bucket arrays to avoid performance degradation, and so on.

If some of these features do not sound impressive it is because we are blasé, used as we are to having them in any realistic implementations, but also happy to ignore or abstract them away when performing verification because they do not seem to introduce complex, as opposed to complicated, novel challenges. However the devil’s in the details, and some sources of complexity are emergent from the combination of seemingly plain and innocuous features. Even verifying something as workaday as copy constructors brought its own set of challenges, in ensuring that the copy is consistent with the abstract class model and with its relatives by means of inheritance.

Part of the challenge of verifying EiffelBase2 was achieving all this with a good level of automation and a reasonable amount of annotations (besides specifications) — in other words, with an effort proportional to the magnitude of the challenge. To this end, we developed a verification methodology and supporting tool (AutoProof, which I presented in a previous post) with the goals of providing flexibility and proportionality: verifying simple stuff should be straightforward, verifying complex stuff (such as EiffelBase2) should still be attainable but require a bigger annotation overhead. While I personally think that being able to interact directly with the prover for the most challenging proofs may support even greater levels of flexibility, the auto-active interaction mode of AutoProof may have hit on a sweet spot in the landscape of annotations vs. automation — at least for full functional correctness in the domain of data structure libraries.

You’re welcome to add comments about features of realistic software that are often overlooked in practical verification, as well as about other success stories of formal methods that narrowed the gap between lab and field practice.

Walter Tichy recently interviewed Westley Weimer and Martin Monperrus on the subject of Automated Bug Fixing; the interesting interview appears on the March 2015 issue of ACM Ubiquity and is available here. In a follow-up message, he invited the AutoFix team — which includes yours truly — to comment on the interview. In this post I take up his suggestion and briefly discuss the approach to automated bug fixing that we developed in the AutoFix project. Even if I’ll refer to the whole AutoFix team (which also includes Yu “Max” Pei, Yi “Jason” Wei, Martin Nordio, Bertrand Meyer, and Andreas Zeller), these comments reflect my own views — this is my blog, after all 😉 Nonetheless I believe my co-authors’ opinions are in line with mine; and everybody’s welcome to leave comments to this post if there’s anything I missed that they want to point out.

The term “automated bug fixing” — or, as it is more commonly referred to in software engineering research, “automated program repair” — denotes techniques and tools capable of suggesting changes that remove faulty behavior of programs. The fixes are often in the form of source-code patches but can also target binary executables or other representations. The most widespread approaches are driven by tests: failing tests identify incorrect behavior that must be corrected; a fix is a change that turns failing tests into passing tests (that is, it corrects some faulty behavior) without affecting the behavior on originally passing tests (that is, it preserves correct behavior).

AutoFix is our technique for automated program repair. Its main distinguishing feature — apparent from the titles of the main publications describing it and listed below — is that it takes advantage of contracts (assertions such as pre- and postconditions) available in the source code. Contracts, as they are normally used in Eiffel or in other contract-equipped languages, are simple specification elements embedded in the program text. The key word here is simple: don’t think quantifiers, first-order logic, and formal verification; think list /= Void, set.count > 0, and runtime checking.

How does AutoFix use contracts? The main usage is actually independent of the details of the program repair technique: AutoFix generates random unit tests (using another tool called AutoTest) and uses contracts as oracles to classify them into passing and failing. Contracts also remain available for runtime checking throughout the various steps of the AutoFix algorithm. For example, AutoFix uses that information to sharpen the precision of its fault localization algorithms. (As Weimer explains in the interview, fault localization answers the question: “where in the program should we edit?”, before fix generation decides “what should we edit it to?”.) Contract checking also helps regression testing of candidate fixes, and the Boolean expression of a violated assertion may suggest cases that require special handling, such as an empty list or an index on an array’s boundary.

In terms of final output, the greater precision in fault localization and fix generation enabled by having contracts helps build good-quality fixes with limited search. Indeed, emphasis on the quality and acceptability of fixes is an aspect that we stressed early on in our research on AutoFix, predating the growing concern with similar aspects that characterizes some recent related work (such as PAR, Monperrus’s, and Rinard’s). Since the fixes produced by automated program repair are validated against a finite set of tests, they come with no formal guarantees of correctness or even adequacy. The work with AutoFix is no exception here because it deals with simple, and hence glaringly incomplete, contracts that cannot be used for static verification. In our experiments, however, we have manually inspected the fixes produced by AutoFix to assess their quality from a programmer’s perspective. It turns out that over half of the valid fixes produced by AutoFix are what we call proper; that is, they not only pass the available tests (a requirement that may be trivial when the tests are few or of poor quality) but are ostensibly consistent with the program’s implicit specification and can be applied to the codebase without introducing undesired behavior.

Performance is another aspect indirectly impacted by improving precision through using contracts: since AutoFix’s repair process explores a search space that is better structured by means of the annotations, it needs limited computational resources. Running our experiments with AutoFix in a cloud computing environment would cost, on average, only few cents per produced fix.

In the quickly growing landscape of automated program repair techniques, AutoFix offers an interesting, particular trade-off that is different than other approaches. On the one hand, relying on contracts and related techniques restricts AutoFix’s applicability to programs annotated with some contracts; in exchange for this additional burden (which is not unreasonable!) AutoFix can produce, requiring limited computational resources, fixes that often are of high quality.


“Welcome to the first class of our course on formal verification. I’d like to start with some motivation: why verification is important in the real world, and what the consequences of lack of verification in software development are.” Lights go off. You hear a countdown recited in French, “Trois, deux, unité, feu!“. On the screen, a rocket blasts off its platform into the sky. Thirty-something seconds into its ascension, it awkwardly topples off and quickly disintegrates. “That was the Ariane 5 rocket’s first launch, which failed due to a software bug. The failure resulted in an estimated 300 million Euros loss. Software errors can even cost human lives, such as in the case of the Therac-25 machine for radiation therapy, whose malfunctioning resulted in several patients dying from radiation burns. Formal verification can ensure that software is free from defects before it’s deployed, thus avoiding catastrophic disasters such as those just mentioned. Now here’s the course syllabus…”

OK, I admit to slightly exaggerating the above for dramatic purposes. But, if you ever took a course on formal verification, chances are that it started with some graphic pictures of exploding rockets, unresponsive Mars rovers, or defective medical equipment, followed by stark tallies of the enormous losses, in money or human lives, resulting from the disasters. The bottom line: software quality is generally abysmal; and the systematic all-out application of formal verification is our only hope to avert further similar disasters from occurring in the future.

What’s the problem with this angle, to which I may have also been guilty of paying lip service in the past? I think this way of pitching formal verification points to motivations that are not fairly representative of the varied landscape in the research and practice of formal methods, and risks alienating software development practitioners, who have different concerns in their everyday’s work — concerns which conflict with the oversimplified picture painted above.

The first problem I see is that pointing to a handful of major disasters due to software failures may be construed as suggesting that formal verification is something only justified for mission critical applications. What about the myriad software projects that don’t cost hundreds of million dollars and whose failure does not result in ruinous losses? Is there a place for formal methods within such projects’ modest budgets? Related to these points is the misled vision that formal verification is an all-or-nothing business: either it enters every step of development or you might as well do things as you’ve always done them — but then disaster will impend. The flip side is that the only failures that matter are the catastrophic ones; everything else is not a big deal. In reality, everyday software developers generally remain genuinely concerned about software quality even when working on non mission-critical development; but also have to cope with all sorts of project constraints pulling in different directions.

A better way to frame the usage of formal verification is in terms of a continuum of trade-offs. Each trade-off balances some of the additional costs incurred by applying verification techniques against some gains resulting from higher reliability or confidence in the software’s correctness. This way, each project can pick the trade-off that best fits its requirements and budget. It’s easy to guess that the overwhelming majority of development projects occupy some intermediate position in the trade-off continuum: their failures won’t result in apocalyptic losses, nor are they of no consequence whatsoever. These are the projects that most practitioner developers are familiar with and can relate to. To each their own: Paris Métro Line 14 deployed the B method to build correct-by-construction modules through refinement; for my students learning Java and C#, writing a decent collection of unit tests and running a lint-like tool on their projects is good enough.

The second problem with the above pitching of formal methods is that it may suggest that they are a silver bullet. You want high reliability: just use formal methods (but it will cost you)! It doesn’t matter what your development process is, what programming language you’re using, whether your software is interactive, if you’re deploying in a legacy system, what your amount of technical debt is. Start using [your favorite verification technique] and your problems will evaporate!

But reliability, and practically every desirable quality property, is never the result of a single technique, tool, design choice, or individual talent. It can only follow from a sensible, well-organized process; one that is transparent, traceable, technologically up to date, based on a clear understanding of the application domain, and so on. The investigations following the failures of the examples mentioned above (Ariane 5, Therac-25, etc.) invariably pointed to communication and process shortcoming as well as, often more egregious than, technical ones. Correctness cannot be built as an afterthought; it requires planning and practice. Based on his experience at Microsoft, James Larus claimed that a root cause of software bugs is “missing, outdated, and incorrect knowledge“, which easily result from poorly organized processes. Conversely, the practical adoption of formal verification is very often slowed down by unintended process constraints; progress in this sense doesn’t necessarily require more powerful techniques but mainly integrating known techniques in a way that is not disruptive of existing practices — perhaps except for unfixable noxious practices whose disruption is desirable anyway.

The third problem I see with the “disaster-oriented” motivation for formal methods is that it seems to suggest that verification research is worthy only to the extent that it’s directly applicable to relevant software development issues and leads concrete gains, for example averting failure. This view encourages a view of science as a provider of cost-effective technology. To be sure, verification technology has had enough success stories that even long-standing skeptics must concede it can be useful in practice. (If you find the previous sentence controversial, I have blog posts planned to convince you!) But practical applicability need not be — should not be — the only goal of researchers. There must be room for speculative inquiries, also because long term progress invariably and crucially depends on them. Think, for example, of how the spectacular progress of SAT solvers has fueled advances in automatic verification in the last decade or so. The performance of SAT solvers hasn’t mushroomed overnight, but is the result of decades of research on a topic that seemed, at the time of its inception, quite removed from practicality. As Ed Clarke often recalls when presenting SAT-based model checking, if he had suggested using an NP-complete problem such as SAT as the basis for a practical verification technique at the time he was working on his Ph.D. thesis, his colleagues would have called his suggestion implausible because clearly clueless about the well-understood notion of intractable problem.

In spite of not always being pitched and motivated in the most convincing way, formal verification remains an exciting, quickly growing research topic that has made spectacular progress, which has been winning, in one form or another, the favors of more and more serious practitioners. One example among several: John Carmack‘s recent opinion that

The most important thing I have done as a programmer in recent years is to aggressively pursue static code analysis. Even more valuable than the hundreds of serious bugs I have prevented with it is the change in mindset about the way I view software reliability and code quality.

This is quite an encouraging endorsement, even if you work on verification techniques other than static analysis. Or perhaps all the “motivation” that verification research needs comes from Leibniz. In the next formal methods course I teach, I’ll try to open with his famous quote instead of discussing rockets and radiation machines:

The only way to rectify our reasonings is to make them as tangible as those of the mathematicians, so that we can find our error at a glance, and when there are disputes among persons, we can simply say: let us calculate without further ado to see who is right.

If you’re familiar with computer science, you’re also familiar with the notion of abstraction. Abstraction is ubiquitous in computing. Whereas it belongs to all of science in one way or another, it has a pivotal role in computing where everything abstracts something more concrete and concrete computations are made of abstractions.

If you’re familiar with music, you’re also familiar with the notion of abstraction. Abstraction is ubiquitous in music. Whereas it belongs to all of the arts in one way or another, it has a pivotal role in music where everything abstracts something more concrete and concrete music is made of abstractions.

The similarity between computing and music with respect to abstraction runs deeper. Abstraction is introduced to solve similar problems that originate in a tension between flexibility and stability, generality and concreteness, effectiveness and understandability. In this post, I illustrate the analogy with one concrete example in music and one concrete example in computing. To make sure that the discussion does not become too — ahem — abstract, I recommend you first read the part you’re more familiar with, so that the analogies in describing its counterpart will be easier to follow.

Abstraction in music: the sonata form

A music composer has to reconcile his aspiration of creativity with the listener’s expectation of a structure familiar to her. Ideally, the composer’s be unencumbered, free to express his ideas by whatever musical items he deems appropriate; but such an approach has a high risk of alienating the listener, who may not be able to follow the ideas expressed in music without any structure for orientation. This is why music (like other arts) has developed forms, which are conventions that help achieve trade-offs between constraining the composer and convincing the listener. The sonata form is one of the most popular of such musical forms as proved by its malleability and longevity.

The sonata form is a prescription to organize the structure of a musical piece in a way that is regular but also leaves plenty of room for the outcomes of artistic creativity. A piece in this form consists of three main sequential parts: exposition, development, and recapitulation. Let’s illustrate them on the first movement of Beethoven’s Piano Sonata #9 (Op. 14 No. 1); you can listen to it played by Claudio Arrau in this YouTube video (I’ll link to specific segments as I introduce the sonata’s parts) and follow the music on one of these scores in the public domain.

The exposition (bars 1–60) introduces a number of themes (or subjects), which are melodies expressing the ground content of the whole piece. By opening with a presentation of unadulterated themes, the composer offers a convenient entry point to the listener: by becoming familiar with the primary ingredients of the sonata, she becomes capable of following the composer’s creation into unexpected territory without losing a general sense of direction. Beethoven’s sonata introduces three themes: the first theme (bars 1–21) begins with a first melodic line in the tonic (E major), which is then varied to construct a modulation into the dominant (B major); the second theme (bars 22–38) takes over in this key with a cantabile character, contrasted by the ensuing third theme (bars 39–56) which is more rhythmic in nature. A codetta (bars 57–60) takes up the last bars of the exposition and connects the third theme to a repetition of the exposition and then, after the repetition, to the second main sonata part: the development.

The development (bars 61–90) is a sonata’s central part, where the composer has substantial freedom to explore his ideas without major structural constraints. By varying and recombining the thematic material previously presented, he can lay out a compelling and imaginative musical argument while building atop the foundations laid in the exposition. The development in Beethoven’s sonata is fairly short, but it manages to explore some interesting fluid tonal contrasts between major and minor keys, and to expand the scant material of the codetta in a more dignified context (from bar 83).

The development is followed by the third sonata part: the recapitulation (bars 91–147). As the name suggests, the recapitulation is a concluding summary where the by-now familiar themes make their last appearance. The composer still has room for expressing his creative ideas in the recapitulation, as themes often change attributes such as presentation order, key, rhythm, and other melodic or harmonic details. Beethoven, for example, ventures into a minor key while transitioning from the first to the second theme in the recapitulation, and introduces other ingenious surprises that reward the attentive listener. The movement closes with a coda (from bar 148), which expands the material of the exposition’s codetta while restoring the awaited tonic key for an irenic conclusion.

The imperfectly symmetric exposition and recapitulation constitute an interface between the deep concrete musical ideas stirring in the composer’s mind and the ears and brains of the listener, who can so appropriate, in a generalized form, some of those ideas and enjoy them.

Abstraction in computing: the modular routine

A computer programmer has to reconcile her creative aspiration of producing an efficient implementation with the user’s expectation of an interface easy to understand and exporting predictable behavior. Ideally, the programmer’s be free to resort to any technical means she deems appropriate to build an implementation that is fast, uses little memory, or is idiomatic in the chosen programming language; but too much freedom has a high risk of producing a program that is hard to use because its exact requirements and guarantees are arduous to fathom. This is why computing (like other scientific and engineering disciplines) has developed techniques for modularization, which are approaches that discipline how program code is structured in such a way that the effects of modular components can be understood in isolation as well as in combination. The modular routine is one of the most popular elements of modular design as proved by its ubiquity across programming languages.

The modular routine combines a piece of code (the body or implementation) with a well-defined interface. The interface consists of a signature, a precondition, and a postcondition. Let’s illustrate these parts on the algorithm for binary search in sorted arrays; here’s an implementation taken from Tim Bray’s, which I adapted to pseudo-Eiffel and annotated with a suitable complete functional specification.

The program text begins (line 1) with the signature, which declares the data the routine operates on. The input data consists of the integer array a, which includes a.count elements from a[0] to a[a.count - 1], and of the integer scalar target, whose value is searched among a‘s elements. The output data is also an integer scalar, referred to as Result in the rest of search_binary.

Routine search_binary works correctly under certain assumptions about properties of the input data passed to it by callers. These assumptions are encoded as the precondition (lines 2–4) which includes two clauses. First, a must refer to a valid array object (line 3). Second, a‘s content must be sorted in nondecreasing order (line 4). The second precondition clause is specific to binary search, whose algorithm leads meaningless results if applied to sequences that are not ordered.

The body (lines 5–27) is a routine’s central part, where the programmer has substantial freedom to work out a suitable implementation without major restrictions beyond those given by the programming language’s features. The body of search_binary begins with a declaration of three integer local variables (line 6), which will be used to keep track of intermediate states in the computation. A loop occupies the major part of the body and does the heavy lifting of searching for the target. As we understand from the loop invariant (lines 10–13), the interval that goes from low to high (excluded) marks, at any point during the computation, the range of values in a that have not been searched for yet. Conversely, the elements at indexes from 0 to low are not greater than target; and the elements at indexes from high to a.count - 1 are greater than it. The body of search_binary exits the loop only when the whole array a has been searched. At that point, the final conditional (lines 23–27) can conclude whether a value equal to target exists at all in a: if low is still -1 or points to an element not equal to — and hence less than because of the invariant — target, the routine returns the invalid index -1 to denote “element not found”; otherwise, a[low] is a valid array element equal to target, and the routine returns its index. The choice of decoupling searching from determining whether target is in the array is an example of the programmer’s freedom to design an implementation that achieves specific, advantageous trade-offs. Bray points out why this approach is preferable to performing two tests in the loop (one for less than, and one for equal to) in the hope of exiting the loop earlier if target is found: in a search where the search interval shrinks exponentially, “nearly all” values are found in the last iteration; hence minimizing the operations per iteration is the most efficient solution. Other programmers, given different constraints, may however follow other approaches; for example, Joshua Bloch’s implementation of binary search in Java’s OpenJDK uses a loop with three-way comparison.

Understanding all these details of the implementation would be burdensome to users who just want to call search_binary to avail of its functionality. This is why the body is followed by the last routine part: the postcondition (lines 28–31), which summarizes the routine’s output in relation to the value of the input arguments. The postcondition of search_binary includes three clauses, defining the output whether the value searched for has been found or not. If Result denotes a valid index of a (line 29), then it points to an element with value target; otherwise Result is -1 (line 30), and hence a has no such element. The last postcondition clause (line 31) specifies that these are the only possible values Result may take; that is, the postcondition is complete. The functional postcondition nicely abstracts not only the details of the specific implementation of binary search, but even the fact that a binary search algorithm has been executed. Any search routine, be it binary, sequential, or following any other technique, could export the same postcondition in its interface to communicate to the user its input/output behavior.

The mirroring precondition and postcondition constitute a rigorous interface between the ingenious ephemeral details of the efficient implementation engineered by the programmer and the general terse information that is available to the user, who can so reuse, in different contexts, the programmer’s code and take advantage of its reliable behavior.


Although the analogy between sonatas and routines is imperfect, it unveils deep conceptual connections. As the key to managing complexity, abstraction underpins any efforts to further knowledge; and helps push to new horizons music and programs alike.