Contracts: first-order interlopers in a higher-order world

Reading Aseem Rastogi, Avik Chaudhuri, and Basil Hosmer‘s POPL 2012 paper The Ins and Outs of Gradual Type Inference, I ran across a quote that could well appear directly in my POPL 2015 paper, Space-Efficient Manifest Contracts:

The key insight is that … we must recursively deconstruct higher-order types down to their first-order parts, solve for those …, and then reconstruct the higher-order parts … . [Emphasis theirs]

Now, they’re deconstructing “flows” in their type inference and I’m deconstructing types themselves. They have to be careful about what’s known in the program and what isn’t, and I have to be careful about blame labels. But in both cases, a proper treatment of errors creates some asymmetries. And in both cases, the solution is to break everything down to the first-order checks, reconstructing a higher-order solution afterwards.

The “make it all first order” approach contrasts with subtyping approaches (like in Well Typed Programs Can’t Be Blamed and Threesomes, with and without blame). I think it’s worth pointing out that as we begin to consider blame, contract composition operators look less and less like meet operations and more like… something entirely different. Should contracts with blame inhabit some kind of skew lattice? Something else?

I highly recommend the Rastogi et al. paper, with one note: when they say kind, I think they mean “type shape” or “type skeleton”—not “kind” in the sense of classifying types and type constructors. Edited to add: also, how often does a type inference paper include a performance evaluation? Just delightful!

New and improved: Space-Efficient Manifest Contracts

I have a new and much improved draft of my work on Space-Efficient Manifest Contracts. Here’s the abstract:

The standard algorithm for higher-order contract checking can lead to unbounded space consumption and can destroy tail recursion, altering a program’s asymptotic space complexity. While space efficiency for gradual types—contracts mediating untyped and typed code—is well studied, sound space efficiency for manifest contracts—contracts that check stronger properties than simple types, e.g., “is a natural” instead of “is an integer”—remains an open problem.

We show how to achieve sound space efficiency for manifest contracts with strong predicate contracts. We define a framework for space efficiency, traversing the design space with three different space-efficient manifest calculi. Along the way, we examine the diverse correctness criteria for contract semantics; we conclude with a language whose contracts enjoy (galactically) bounded, sound space consumption—they are observationally equivalent to the standard, space-inefficient semantics.

Update: it was accepted to POPL’15!

Concurrent NetCore: From Policies to Pipelines

Cole Schlesinger, Dave Walker, and I submitted a paper to ICFP 2014. It’s called Concurrent NetCore: From Policies to Pipelines. Here’s the abstract:

In a Software-Defined Network (SDN), a central, computationally powerful controller manages a set of distributed, computationally simple switches. The controller computes a policy describing how each switch should route packets and populates packet-processing tables on each switch with rules to enact the routing policy. As network conditions change, the controller continues to add and remove rules from switches to adjust the policy as needed.

Recently, the SDN landscape has begun to change as several proposals for new, reconfigurable switching architectures, such as RMT and FlexPipe have emerged. These platforms provide switch programmers with many, flexible tables for storing packet-processing rules, and they offer programmers control over the packet fields that each table can analyze and act on. These reconfigurable switch architectures support a richer SDN model in which a switch configuration phase precedes the rule population phase. In the configuration phase, the controller sends the switch a graph describing the layout and capabilities of the packet processing tables it will require during the population phase. Armed with this foreknowledge, the switch can allocate its hardware (or software) resources more efficiently.

We present a new, typed language, called Concurrent NetCore, for specifying routing policies and graphs of packet-processing tables. Concurrent NetCore includes features for specifying sequential, conditional and concurrent control-flow between packet- processing tables. We develop a fine-grained operational model for the language and prove this model coincides with a higher level denotational model when programs are well typed. We also prove several additional properties of well typed programs, including strong normalization and determinism. To illustrate the utility of the language, we develop linguistic models of both the RMT and FlexPipe architectures and we give a multi-pass compilation algorithm that translates graphs and routing policies to the RMT model.

A Balance of Power: Expressive, Analyzable Controller Programming

I just finished reading A Balance of Power: Expressive, Analyzable Controller Programming. It’s an interesting proposal, but I’m writing just to express my satisfaction with the following sentence:

When we hit expressive limits, however, our goal is not to keep growing this language—down that path lies and other sulphurous designs—but to call out to full-language code.

‘Sulphurous’ indeed. Come for the nonmonotonic interpretation of learning, stay for the colorful prose.

Bug in “Polymorphic Contracts”

The third chapter of my dissertation is effectively a longer version of an ESOP 2011 paper, Polymorphic Contracts. We define FH, a polymorphic calculus with manifest contracts. Atsushi Igarashi, with whom I did the original FH work that appeared in ESOP 2011, and his student Taro Sekiyama have been working on continuing some of the FH work. They discovered—after my defense!—a bug in FH’s metatheory.

Short version: FH used parallel reduction as a conversion relation. A key property of this relation is substitutivity. We phrased it as “if e1 ⇒ e1′ and e2 ⇒ e2′ then e1{e2/x} ⇒ e1′{e2’/x}”. Unfortunately, this doesn’t hold for FH, due to subtleties in FH’s reduction rules for casts. The cast reduction rules are implicitly performing equality checks on types, and these equality checks can be affected by substitutions to change which reduction rule applies. The (tentative) solution in my thesis is to use a simpler type (and term) conversion relation which we call common subexpression reduction (CSR). In CSR, we relate types and terms that are closed by closing substitutions σ1* σ2. That is, the CSR conversion is the smallest congruence which is substitutive for →*, i.e., where if e →* e’ then T{e/x} ≡ T{e’/x}.

Long version: I’ve excerpted Section 3.5 of my thesis which discusses the System FH type conversion bug.

PhD thesis: Manifest Contracts

I defended my PhD thesis, Manifest Contracts on November 7th, 2013, with the final document submitted on December 6th. Since the doctoral degree shows up on my Penn transcript, I feel comfortable telling the world: I got a PhD! My thesis committee, comprising Stephanie Weirich (the chair), Rajeev Alur, Greg Morrisett, and Steve Zdancewic. Here’s the abstract:

Eiffel popularized design by contract, a software design philosophy where programmers specify the requirements and guarantees of functions via executable pre- and post-conditions written in code. Findler and Felleisen brought contracts to higher-order programming, inspiring the PLT Racket implementation of contracts. Existing approaches for runtime checking lack reasoning principles and stop short of their full potential—most Racket contracts check only simple types. Moreover, the standard algorithm for higher-order contract checking can lead to unbounded space consumption and can destroy tail recursion. In this dissertation, I develop so-called manifest contract systems which integrate more coherently in the type system, and relate them to Findler-and-Felleisen-style latent contracts. I extend a manifest system with type abstraction and relational parametricity, and also show how to integrate dynamic types and contracts in a space efficient way, i.e., in a way that doesn’t destroy tail recursion. I put manifest contracts on a firm type-theoretic footing, showing that they support extensions necessary for real programming. Developing these principles is the first step in designing and implementing higher-order languages with contracts and refinement types.

I’ll be starting as a post-doc with Dave Walker on Monday.

Axiomatizing manifest contracts

I was speaking with Phil Wadler at ICFP about full-spectrum programming languages, and it came up that Phil wasn’t familiar with the inconsistency in the Sage theorem proving axioms from Knowles et al. ’06. He prodded me to talk about it here.

Here’s the axioms from Knowles et al.’s ’06 Scheme Workshop paper.
Sage axioms
The inconsistency is between Faithfulness and Hypothesis. Hypothesis demands that x:{y:Int|false} ⊨ false[y:=x] = false, but Faithfulness requires that x:{y:Int|false} ⊭ false.

Talking with Kenn Knowles, he agreed—Faithfulness should instead say E ⊨ not false—the soundness of the logic comes from not being able to prove false from an empty environment. Kenn had an interesting analysis, which I’m including wholesale:

In practice, of course, the soundness of e.g. SMT is not in question. There are really two important and separable questions that axioms such as those of Ou et al ’04, our Scheme ’06, and our PLPV ’09 conflate somewhat. Neither is answered to my satisfaction.

  1. How shall we give semantics to contracts and executable refinement types?

    And the meta-question: Why do so many authors create new and rather different semantics? Perhaps your POPL ’10 and our TOPLAS ’10 may represent a convergence (knock on wood) towards techniques that are both generally effective and simple enough that future work will build upon them rather than just trying to make something simpler, as has happened a few times in the contract space.

    For this question, I feel an axiomatization yields little insight, while a lightweight denotation has a nice explanatory flavor. Thus my perspective is that critiquing axiomatizations does not substantially contribute to this question.

  2. How effectively can specifications / proof obligations written in Type Theory or subsets thereof be translated to other logics such as SMT?

    While an axiomatization seems like a way to characterize such translations, the question itself was not the focus of Ou et al ’04, our Scheme ’06, or our PLPV ’09. Without such results the direct and concrete approach of Liquid Types (e.g. PLDI ’08 and ESOP ’13) that simply illustrates a particular mapping may be clearer.

    For this question, I feel an axiomatization falls short unless it has proofs of interesting properties which illuminate why it is a minimal or otherwise “good” set of axioms for characterizing the space of translations/logics. Thus my perspective is that critiquing axiomatizations that already lack such properties/proofs does not substantially contribute to this question.

With regard to Kenn’s first point, Belo et al.’s ’11 syntactic semantics gives one answer for how to give semantics to contracts, though there’s plenty of room for less syntactic analyses.

Space-Efficient Manifest Contracts

I just submitted a paper to POPL 2014; it’s called Space-Efficient Manifest Contracts. Here’s the abstract:

Gradual types mediate the interaction between dynamic and simple types, offering an easy transition from scripts to programs; gradual types allow programmers to evolve prototype scripts into fully fledged, deployable programs. Similarly, contracts and refinement types mediate the interaction between simple types and more precise types, offering an easy transition from programs to robust, verified programs. A full-spectrum language with both gradual and refinement types offers low-level support for the development of programs throughout their lifecycle, from prototype script to verified program.

One attractive formulation of languages with gradual or refinement types uses casts to represent the runtime checks necessary for type changes (from dynamic to simple types, and from simple types to refinement types). Briefly, a cast ⟨T1⇒T2⟩ e takes a term e from type T1 to T2, possibly wrapping or tagging e in the process—or even failing, if e doesn’t meet the criteria of the type T2. Casts are attractive because they offer a unified view of changes in type, have straightforward operational semantics, and enjoy an interesting and fruitful relationship with subtyping.

One longstanding problem with casts is space efficiency: casts in their naive formulation can consume unbounded amounts of space at runtime both through excessive wrapping as well as through tail-recursion-breaking stack growth. Prior work offers space-efficient solutions exclusively in the domain of gradual types. In this paper, we define a new full-spectrum language that is (a) more expressive than prior languages, and (b) space efficient. We are the first to obtain space-efficient refinement types. Our approach to space efficiency is based on the coercion calculi of Herman et al. and Henglein’s work, though our explicitly enumerated canonical coercions and our straightforward merge operator are a novel approach to coercions with a simpler theory. We show that space efficiency avoids some checks, failing and diverging less often than naive calculi—but the two are otherwise operationally equivalent.

(Yes, I know that reviewing is double blind. But as the FAQ clearly states, double-blind review isn’t meant to hinder communication of results.)

NJPLS Fall 2012

Penn hosted NJPLS today. It’s been a while, but these local meetings are always fun. I presented on the status of my thesis work, Compiling Dynamic Information Flow Control. I’m building a compiler for λNaV (see our POPL submission).

One thing I omitted from the presentation—which came up more than once in conversations afterwards—is that my compiler uses LLVM as a backend. I parse a program, translate it to my own IR, optimize, and then send it off to LLVM. I’m not writing LLVM passes, or adding any kinds of AST nodes. Having my own IR probably isn’t the best architecture, but it’s much more agile than trying to modify LLVM itself. It also seems like a natural choice: my IR is tailored to reasoning about DIFC label lattices, while LLIR is particularly well suited for lower level languages, like C.

Exceptionally Available Dynamic IFC

I’m happy to report that Cătălin Hriţcu, Ben Karel, Benjamin Pierce, Greg Morrisett, and I submitted a paper to POPL 2013: the paper is called Exceptionally Available Dynamic IFC. Here’s the abstract:

Existing designs for fine-grained, dynamic information-flow control assume that it is acceptable to terminate the entire system when an incorrect flow is detected—i.e, they give up availability for the sake of confidentiality and integrity. This is an unrealistic limitation for systems such as long-running servers.

We identify public labels and delayed exceptions as crucial ingredients for making information-flow errors recoverable while retaining the fundamental soundness property of non-interference, and we propose two new error-handling mechanisms that make all errors recoverable. The first mechanism builds directly on these basic ingredients, using not-a-values (NaVs) and data flow to propagate errors. The second mechanism adapts the standard exception model to satisfy the extra constraints arising from information flow control, converting thrown exceptions to delayed ones at certain points. We prove both mechanisms sound. Finally, we describe a prototype implementation of a full-scale language with NaVs and report on our experience building high-availability software components in this setting.