OBT served a particular purpose: a venue for testing out new, weird ideas at a familiar setting with familiar faces. It was, as Lindsey Kuper explains, very successful! We all felt that there are now many venues for initial attempts with weird ideas; with that need met elsewhere, OBT feels less important.
OBT may not be gone forever. We’ll reconsider next year. If you have a strong opinion on the subject, please let me know!
We can tease apart the research on gradual types into two ‘lineages’: a pragmatic, implementation-oriented dynamic-first lineage and a formal, type-theoretic, static-first lineage. The dynamic-first lineage’s focus is on taming particular idioms—‘pre-existing conditions’ in untyped programming languages. The static-first lineage’s focus is on interoperation and individual type system features, rather than the collection of features found in any particular language. Both appear in programming languages research under the name “gradual typing”, and they are in active conversation with each other.
What are these two lineages? What challenges and opportunities await the static-first lineage? What progress has been made so far?
Their collapsible contracts are an implementation of the theory in my papers on space-efficient contracts (Space-Efficient Manifest Contracts from POPL 2015 and Space-Efficient Latent Contracts from TFP 2016). They use my merge algorithm to ‘collapse’ contracts and reduce some pathologically bad overheads. I’m delighted that my theory works with only a few bits of engineering cleverness:
Racket’s contracts are first-class values, which means subtle implementation details can impede detecting duplicates. Racket’s contract-stronger? seems to do a good enough job—though it helps that many contracts in Racket are just checking simple types.
There’s an overhead to using the merge strategy in both space and time. You don’t want to pay the price on every contract, but only for those that would consume unbounded space. Their implementation waits until something has been wrapped ten times before using the space-efficient algorithms.
Implication queries can be expensive; they memoize the results of merges.
I am particularly pleased to see the theory/engineering–model/implementation cycle work on such a tight schedule. Very nice!
For some time I’ve had the conviction that word expansion—the process that includes globbing with * but also things like command substitution with backticks—is somehow central to the shell’s interactivity. I’m pleased to have finally expressed my conviction in more detail: Word expansion supports POSIX shell interactivity will appear at PX 2018. Here’s the abstract:
The POSIX shell is the standard tool to deploy, control, and maintain systems of all kinds; the shell is used on a sliding scale from one-off commands in an interactive mode all the way to complex scripts managing, e.g., system boot sequences. For all of its utility, the POSIX shell is feared and maligned as a programming language: the shell is feared because of its incredible power, where a single command can destroy not just local but also remote systems; the shell is maligned because its semantics are non-standard, using word expansion where other languages would use evaluation.
I conjecture that word expansion is in fact an essential piece of the POSIX shell’s interactivity; word expansion is well adapted to the shell’s use cases and contributes critically to the shell’s interactive feel.
It’s my fifth year as a tenure-track assistant professor at Pomona College, a small liberal arts college (SLAC) in Claremont, CA. You may have read about why mathematicians might find the liberal arts setting appealing. Here’s why I think computer scientists might, too. I’ll start with what I love about working at Pomona and then I’ll talk about what I don’t.
Research freedom
First and foremost, I get to do the research I want to do. Neither the department nor the institution rely on overheads to operate, so there’s no grant pressure. Without graduate students, who require both financial support and through-lines of work for a dissertation, I can choose my focus freely.
The “without graduate students” part isn’t for everyone. If your research needs a lab of ten students to build enormous systems, then a SLAC may not be the right place. But if you want to build smaller systems, like my work on Smoosh, then a SLAC could be great; if you’re theoretically inclined, I think a SLAC is an ideal home.
Research expectations vary at liberal arts schools, and I’d count Pomona among the more robust, research-wise—the tenure evaluation split between research and teaching is somewhere between 50/50 and 40/60. If you’re sick of research and want a teaching-only position, then SLACs may not be the right fit.
I’ve been particularly enjoying the freedom to pursue side projects. I spent some of this two summers ago working with two undergraduates to write software for controlling the art department’s CNC router, marrying two of my interests: programming and woodworking. It was a good experience for the students, and it might even lead to a publication, though I never expected it to. While I might have had the time and resources to do such a thing at a larger institution… would I have bothered to lay out so much time on a project unlikely to produce anything for my tenure case?
Finally, Pomona has a generous junior leave policy. I just spent a year with Steve Chong (thanks, Steve!). I spent a year doing nothing but research, and my productivity itch has been pretty well scratched. You can make it work.
Students
Pomona has an 8-to-1 student-to-faculty ratio, with a 2:2 teaching load; some SLACs might have more teaching, like 3:2 or even 3:3. I couldn’t handle a heavier load—I wouldn’t get enough research done to be happy with myself—but I enjoy teaching well rounded students in moderately sized classes (my elective next semester will have no more than 20 students). The high end of students is astonishingly high—juniors and seniors working at an early graduate student level—and it’s rewarding to work with them on things that actually advance my research. My field (programming languages) requires a lot of background to do much of anything; when I came to Pomona, I didn’t expect that I’d be able to do any of my actual research with students. But I’ve been pleasantly surprised—I’ve been able to have one or two students at a time working on real PL research; they go on and do cool things.
Community
Our department of nine tenure-track professors is convivial, genial, and collaborative. It’s not that we’re magically free of disagreement, but rather that we can disagree productively. I’ve spent significant time in CS departments since high school, and many of them are lovely places to be—but I’m charmed by my department’s mellow, friendly relations.
On a larger scale, Pomona College has roughly 200 professors. I’m pleased to have friends and colleagues from many different departments and disciplines across Pomona’s campus (and beyond—there are 400 more faculty at the other Claremont Colleges—the Consortium here is an idiosyncratic collection of institutions). As someone who majored in both computer science and Egyptology, colleagues with diverse interests enliven and enrich my days.
What’s more, Pomona has a culture of service; service is a contribution, a lending of myself that makes me feel like I’m part of the college. Starting from their second year, all professors are expected to sit on college-wide committees—the sort of thing that one might not do until a much higher rank elsewhere. In my committee work, I help steer Pomona’s course; in so doing, I invest in the school as an enterprise. Sure, that investment takes the form of extra meetings and work that doesn’t directly serve me or my research or my teaching or my department. I can imagine a SLAC would be a frustrating place to be for someone who wasn’t community-minded or who was unwilling to take part in bureaucracy.
Values
Many of my values align with institutional goals and practices. Pomona’s diversity plan is a strong, aspirational one—the school is very dedicated to equity, with a focus on students who are the first in their family to attend college, students from low-income families, and students from other underrepresented minorities. I’ve been particularly pleased to see the way the school has rallied around DACA-mented students. My department does a good job bringing in and retaining women, who constitute 30-40% of our majors and just about half of our CS faculty.
To be clear: while Pomona is already doing a lot of things right, there’s of course more to do. I don’t expect an institution to mirror all of my values—and I think it’s healthy to separate my identity, my work, and my employer. But the service culture prevailing at liberal arts colleges leaves the door open for my values to help guide the institution.
Of course, it’s not all roses
Let’s be honest: it’s still a job; the arrangement is that I exchange my time and energy for a nice salary. There are things I wish were different. I often wonder what it would be like to train a grad student or two; I suspect I would be much more productive and have a greater influence on my field. When I submitted an NSF CAREER proposal this two years ago, I had to write everything myself: there’s no big grant office with templates for my data management plan; there’s nobody to look things over for boneheaded mistakes. A larger institution with a bigger department would offer more room for research collaboration. I could happily work on research full time, like last year. Like every other computer science department, we have more majors than we can handle—our classes aren’t as small as we’d like. The liberal arts context is enriching, but students tend to go for more breadth than depth; not enough of our students go on to graduate school and too many get sucked into Silicon Valley.
On balance, though, I’m very happy. I’m less productive than I could be, but I think my best papers are my most recent ones. I have to hunt for students who want depth, but they’re there—and hungry for it! And most importantly: I can work at Pomona without being consumed. The work is tiring, to be sure, but there’s room left for the rest of myself… and, given my own peculiar, personal limits and abilities and aspirations, I’m not so sure there would be elsewhere.
P.S. we’re hiring
haha you saw this coming
If you’re interested… we’re hiring! Feel free to get in touch if you have any questions.
I recently gave a talk to some Pomona College undergrads about how to secure their various devices (computers and phones in particular). I tried to focus on clear and concrete advice. I got tons of help on twitter, and I ultimately came up with a slides for a talk and a security checklist.
First: I’d like to encourage other people to give this talk. Let me know how it goes, and I’ll update the slides or checklist. I’m probably not the best person to do this sort of “lay” outreach. (It was hardly “lay” in any case—I think most of the audience were CS majors.) Maybe you are?
Second: I’d appreciate your feedback. I’d like to keep the slides and checklist up to date. I unfortunately don’t have the time right now to build up more permanent advice à la Decent Security.
Third: I am fundamentally on board with Taylor Swift’s attitude to security. There is so much fear and condescension around security that many people—even CS majors!—have turned off to it. While academic security research can be really interesting, it’s at least as important to have direct, concrete advice for people without significant expertise. In many ways we’re further along than we used to be—Signal, HDD encryption, SSL/TLS—but in some ways we’re worse off (JS drive-by attacks). Whose responsibility is it to educate and protect computer users, if not us? And if not now—when?
Per a request of sk19er on the WordPress forums, I’ve implemented a new feature for PHPEnkoder: you can use the [noenkode] shortcode to turn off PHPEnkoder on a specific page. As always, you can get PHPEnkoder from WordPress Plugin Directory.
In hybrid type checking, a subtyping relationship between subset types {x:T|e} determines when it’s safe to omit a cast. The structural extension of subtyping to, e.g., function types, gives us a straightforward way to achieve verification by optimization: if we can prove that a cast is from a subtype to a supertype, there’s no need to pay the runtime cost of checking anything.
When should we reject a hybrid typed program? In Flanagan’s seminal paper, he offers a straightforward plan:
Flanagan enacts this plan by rejecting casts which are provably not from a subtype to a supertype: if the SMT solver finds that a cast might fail, then it rejects the program. In his model of an SMT solver, returning a checkmark (✓) when checking proves subtyping, a question mark (?) when checking times out, or an x-mark (×) when a counterexample of subtyping is found.
The compilation and checking judgment (a/k/a cast insertion) will insert a cast on successful or indeterminate checking, but has no rules for (and therefore rejects) programs where there is a counterexample.
Unfortunately, such a policy is too restrictive. For example, an SMT solver can easily prove that {x:Int|true} doesn’t imply {x:Int|even x}… but such a program shouldn’t be rejected! We should only reject programs that must fail.
When must a cast fail? When the two types are disjoint, i.e., when they don’t share any values. Since our language with casts has when the only normal form typeable at both types is an error. Cătălin HriÅ£cu got a good start on this in his thesis (see sections 2.4.6 and the end of 3.4.4), where he defines a notion of non-disjointness. He doesn’t define the notion for function types, but that’s actually very easy: so long as two function types are compatible (i.e., equal after erasure of refinements), they are non-disjoint. Why? Well, there’s at least one value that {x:T|e11}→{x:T|e12} and {x:T|e21}→{x:T|e22} have in common: λx. blame.
I can imagine a slightly stronger version: if the domain type is a refinement of base type, then we could exclude functions which could never even be called, i.e., with disjoint domains. At higher orders, though, there’s no way to tell (maybe the function in question never calls its argument), so I think the more liberal interpretation is the right starting point.
NB I briefly discuss when to reject hybrid-typed programs in my dissertation (Section 6.1.4).
But it doesn’t quite work. For one, you also need to install ctypes-foreign. For two, the brew installation of libffi doesn’t automatically install itself globally, so ctypes can’t find it.
Now, this is merely a good start. Once I had this much working, I still couldn’t get OCaml to call C functions! It came down to a bear of a linking issue… no combination of LD_PATH and LD_LIBRARY_PATH and LD_PRELOAD would help, nor would -lflags -cclib,-L. I just couldn’t get ctypes to see my shared library! I finally found an option to ld that does the trick: -force_load. Here’s my build line:
Frank Pfenning originated the idea of refinement types in his seminal PLDI 1991 paper with Tim Freeman. Freeman and Pfenning’s refinement types allow programmers to work with refined datatypes, that is, sub-datatypes induced by refining the set of available constructors. For example, here’s what that looks like for lists, with a single refinement type, ? singleton:
datatype ? list = nil | cons of ? * ? list
rectype ? singleton = cons ? nil
That is, a programmer defines a datatype ? list, but can identify refined types like ? singleton—lists with just one element. We can imagine a lattice of type refinements where ? list is at the top, but below it is the refinement of lists of length 0 or 1—written ? singleton ? ? nil. This type is itself refined by its constituent refinements, which are refined by the empty type. Here’s such a lattice, courtesy of a remarkably nice 1991-era TeX drawing:
Another way of phrasing all of this is that refinement types identify subsets of types. Back in 1983, Bengt Nordström and Kent Petersson introduced—as far as I know—the idea of subset types in a paper called Types and Specifications at the IFIP Congress. Unfortunately, I couldn’t find a copy of the paper, so it’s not clear where the notation {x:A|B(x)} set-builder-esque notation first came from, but it shows up in Bengt Nordström, Kent Petersson, and Jan M. Smith’s Programming in Martin-Löf’s Type Theory in 1990. Any earlier references would be appreciated. Update (2015-03-18):Colin Gordonpointed out that Robert Constable‘s Mathematics as programming from 1984 uses the subset type notation, as does the NUPRL tech report from 1983. The NUPRL TR came out in January ’83 while IFIP ’83 happened in September. Nate Foster, who works with Bob Constable, suspects that Constable has priority. Alright: subset types go to Robert Constable in January 1983 with the Nearly Ultimate Pearl. Going once…
My question is: when did we start calling {x:A | B(x)} and other similar subset types a “refinement type”? Any advice or pointers would be appreciated—I’ll update the post.
Susumu Hayashi in Logic of refinement types describes “ATTT”, which, according to the abstract, “has refinement types which are intended to be subsets of ordinary types or specifications of programs”, where he builds up these refinements out of some set theoretic operators on singletons. By rights, this paper is probably the first to use “refinement type” to mean “subset type”… though I have some trouble pinpointing where the paper lives up to that claim in the abstract.
Besides the built-in type families int, bool, and array, any user-defined data type may be refined by explicit declarations. …
typeref ? list of nat
with nil <| ? list(0)
| :: <| {n:nat} ? * ? list(n) -> ? list(n+1)
Later on, in Section 3.1, they have a similar use of the term:
In the standard basis we have refined the types of many common functions on integers such as addition, subtraction, multiplication, division, and the modulo operation. For instance,
+ <| {m:int} {n:int} int(m) * int(n) -> int(m+n)
is declared in the system. The code in Figure 3 is an implementation of binary search through an array. As before, we assume:
sub <| {n:nat} {i:nat | i < n} ? array(n) * int(i) -> ?
So indices allow users to refine types, though they aren’t quite refinement types. In 1999, Xi and Pfenning make a strong distinction in Dependent Types in Practical Programming; from Section 9:
…while refinement types incorporate intersection and can thus ascribe multiple types to terms in a uniform way, dependent types can express properties such as “these two argument lists have the same length” which are not recognizable by tree automata (the basis for type refinements).
Now, throughout the paper they do things like “refine the datatype with type index objects” and “refine the built-in types: (a) for every integer n, int(n) is a singleton type which contains only n, and (b) for every natural number n, 0 a array(n) is the type of arrays of size n”. So here there’s a distinction between “refinement types”—the Freeman and Pfenning discipline—and a “refined type”, which is a subset of a type indicated by some kind of predicate and curly braces.
… the datasort refinements (often called refinement types) of Freeman, Davies, and Pfenning, and the index refinements of Xi and Pfenning. Both systems refine the simple types of Hindley-Milner type systems.
In her 2004 paper with Frank, Tridirectional Typechecking, she maintains the distinction between refinements, but uses a term I quite like—“property types”, i.e., types that guarantee certain properties.
Yitzhak Mandelbaum, my current supervisor David Walker, and Bob Harper wrote An Effective Theory of Type Refinements in 2003, but they didn’t quite have subset types. Their discussion of related work makes it seem that they interpret refinement types as just about any device that allows programmers to use the existing types of a language more precisely:
Our initial inspiration for this project was derived from work on refinement types by Davies and Pfenning and Denney and the practical dependent types proposed by Xi and Pfenning. Each of these authors proposed to sophisticated type systems that are able to specify many program properties well beyond the range of conventional type systems such as those for Java or ML.
Cormac Flanagan‘s Hybrid Type Checking in 2006 is probably the final blow for any distinction between datasort refinements and index refinements: right there on page 3, giving the syntax for types, he writes “{x:B|t} refinement type“. He says on the same page, at the beginning of Section 2, “Our refinement types are inspired by prior work on decidable refinement type systems”, citing quite a bit of the literature: Mandelbaum, Walker, and Harper; Freeman and Pfenning; Davies and Pfenning ICFP 2000; Xi and Pfenning 1999; Xi LICS 2000; and Ou, Tan, Mandelbaum, and Walker. After Cormac, everyone just seems to call them refinement types: Ranjit Jhala‘s Liquid Types, Robby Findler and Phil Wadler in Well typed programs can’t be blame, my own work, Andy Gordon in Semantic Subtyping with an SMT Solver. This isn’t a bad thing, but perhaps we can be more careful with names. Now that we’re all in the habit of calling them refinements, I quite like “indexed refinements” as a distinction. Alternatively, “subset types” are a very clear term with solid grounding in the literature.
Edited to add: thanks to Ben Greenman for some fixes to broken links and to Lindsey Kuper and Ron Garcia for helping me clarify what refines what.
2020-04-27 update:Shriram Krishnamurthi suggests that Robert (Corky) Cartwright had a notion of “refinement type” in “User-Defined Data Types as an Aid to Verifying LISP Programs” from ICALP 1976 and with John McCarthy in First order programming logic in POPL 1979. I haven’t been able to get a PDF copy of the ICALP paper (please send me one if you can find it!). The POPL paper is clearly related:
The key idea underlying our formal systems is that recursive definitions of partial functions can be interpreted as equations extending a first order theory of the program domain.
Their model is typed, and the paper is about how Corky and John independently discovered ways of addressing recursion/fixed points. They translate programs to logic, treating checks in negative positions as ?‚ like Blume and McAllester’s “A sound (and complete) model of contracts”, but they don’t seem to think of themselves as actually refining types per se. This paper is an interesting early use of an SMT-like logic to prove properties of programs… though they do the proofs by hand!
The auxiliary function ATOMLIST [a program predicate] serves as a clumsy mechanism for specifying the implicit data type atom-list [which he defined by hand]. If we included atom-list as a distinct, explicit data type in our programming language and expanded our first-order theory to include atom-lists as well as S-expressions, the informal proof using induction on atom-lists [given earlier] could be formalized directly in our first order system. However, since LISP programs typically involve a wide variety of abstract data types, simply adding a few extra data types such as atom-list to LISP will not eliminate the confusion caused by dealing with abstract data type representations rather than the abstract types themselves. In fact, the more complex that an abstract type is, the more confusing that proofs involving its representations are likely to be. Consequently, I decided that the best solution to this problem is to include a comprehensive data type definition facility in LISP and to formally define the semantics of a program P by creating a first-order theory for the particular data types defined in P. The resulting language TYPED LISP is described in the next chapter.