smoosh v0.1

I’ve been building an executable formalization of the POSIX shell semantics, which I’ve been calling smoosh (the Symbolic, Mechanized, Observable, Operational SHell).

I’m pleased to announce an important early milestone: smoosh passes the POSIX test suite (modulo locales, which smoosh doesn’t currently support). I’ve accordingly tagged this ‘morally correct’ verison as smoosh v0.1. You can play around with the latest version via the shtepper, a web-based frontend to a symbolic stepper. I think the shtepper will be particularly useful for those trying to learn the shell or to debug shell scripts.

There’s still plenty to do, of course: there are bugs to fix, quirks to normalize, features to implement, and tests to run. But please: play around with the smoosh shell and the shtepper, file bug reports, and submit patches!

OBT on hiatus

The recent organizers of the Off the Beaten Track (OBT) workshop (Luke Church, Bob Atkey, Lindsey Kuper, Swarat Chaudhuri, Ranjit Jhala, Shriram Krishnamurthi, David Walker, and me) have decided not to hold OBT at POPL 2020.

OBT served a particular purpose: a venue for testing out new, weird ideas at a familiar setting with familiar faces. It was, as Lindsey Kuper explains, very successful! We all felt that there are now many venues for initial attempts with weird ideas; with that need met elsewhere, OBT feels less important.

OBT may not be gone forever. We’ll reconsider next year. If you have a strong opinion on the subject, please let me know!

The Dynamic Practice and Static Theory of Gradual Typing

I’ll be presenting my thoughts on the state of gradual typing research—along with some goals and challenges—at SNAPL 2019. Here’s the abstract of my paper, The Dynamic Practice and Static Theory of Gradual Typing:

We can tease apart the research on gradual types into two ‘lineages’: a pragmatic, implementation-oriented dynamic-first lineage and a formal, type-theoretic, static-first lineage. The dynamic-first lineage’s focus is on taming particular idioms—‘pre-existing conditions’ in untyped programming languages. The static-first lineage’s focus is on interoperation and individual type system features, rather than the collection of features found in any particular language. Both appear in programming languages research under the name “gradual typing”, and they are in active conversation with each other.

What are these two lineages? What challenges and opportunities await the static-first lineage? What progress has been made so far?

See you in Providence?

Collapsible Contracts: Space-Efficient Contracts in Racket

While on sabbatical in Cambridge, MA (thanks, Steve!), I had the good fortune to attend my first SPLASH.

I was particularly excited by one paper: Collapsible Contracts: Fixing a Pathology of Gradual Typing by Daniel Feltey, Ben Greenman, Christophe Scholliers, Robby Findler, and Vincent St-Amour. (You can get the PDF from the ACM DL or from Vincent’s website.)

Their collapsible contracts are an implementation of the theory in my papers on space-efficient contracts (Space-Efficient Manifest Contracts from POPL 2015 and Space-Efficient Latent Contracts from TFP 2016). They use my merge algorithm to ‘collapse’ contracts and reduce some pathologically bad overheads. I’m delighted that my theory works with only a few bits of engineering cleverness:

  • Racket’s contracts are first-class values, which means subtle implementation details can impede detecting duplicates. Racket’s contract-stronger? seems to do a good enough job—though it helps that many contracts in Racket are just checking simple types.
  • There’s an overhead to using the merge strategy in both space and time. You don’t want to pay the price on every contract, but only for those that would consume unbounded space. Their implementation waits until something has been wrapped ten times before using the space-efficient algorithms.
  • Implication queries can be expensive; they memoize the results of merges.

I am particularly pleased to see the theory/engineering–model/implementation cycle work on such a tight schedule. Very nice!

New paper: Word expansion supports POSIX shell interactivity

I’ve been thinking about and working on the POSIX shell for a little bit over a year now. I wrote a paper for OBT 2017, titled Understanding the POSIX Shell as a Programming Language, outlining why I think the shell is worthy of study.

For some time I’ve had the conviction that word expansion—the process that includes globbing with * but also things like command substitution with backticks—is somehow central to the shell’s interactivity. I’m pleased to have finally expressed my conviction in more detail: Word expansion supports POSIX shell interactivity will appear at PX 2018. Here’s the abstract:

The POSIX shell is the standard tool to deploy, control, and maintain systems of all kinds; the shell is used on a sliding scale from one-off commands in an interactive mode all the way to complex scripts managing, e.g., system boot sequences. For all of its utility, the POSIX shell is feared and maligned as a programming language: the shell is feared because of its incredible power, where a single command can destroy not just local but also remote systems; the shell is maligned because its semantics are non-standard, using word expansion where other languages would use evaluation.

I conjecture that word expansion is in fact an essential piece of the POSIX shell’s interactivity; word expansion is well adapted to the shell’s use cases and contributes critically to the shell’s interactive feel.

See you in Nice?

Computer science at a small liberal arts college

NB I left Pomona college in 2021. But they’re hiring again, and I recommend you consider it!

It’s my fifth year as a tenure-track assistant professor at Pomona College, a small liberal arts college (SLAC) in Claremont, CA. You may have read about why mathematicians might find the liberal arts setting appealing. Here’s why I think computer scientists might, too. I’ll start with what I love about working at Pomona and then I’ll talk about what I don’t.

Research freedom

First and foremost, I get to do the research I want to do. Neither the department nor the institution rely on overheads to operate, so there’s no grant pressure. Without graduate students, who require both financial support and through-lines of work for a dissertation, I can choose my focus freely.

The “without graduate students” part isn’t for everyone. If your research needs a lab of ten students to build enormous systems, then a SLAC may not be the right place. But if you want to build smaller systems, like my work on Smoosh, then a SLAC could be great; if you’re theoretically inclined, I think a SLAC is an ideal home.

Research expectations vary at liberal arts schools, and I’d count Pomona among the more robust, research-wise—the tenure evaluation split between research and teaching is somewhere between 50/50 and 40/60. If you’re sick of research and want a teaching-only position, then SLACs may not be the right fit.

I’ve been particularly enjoying the freedom to pursue side projects. I spent some of this two summers ago working with two undergraduates to write software for controlling the art department’s CNC router, marrying two of my interests: programming and woodworking. It was a good experience for the students, and it might even lead to a publication, though I never expected it to. While I might have had the time and resources to do such a thing at a larger institution… would I have bothered to lay out so much time on a project unlikely to produce anything for my tenure case?

Finally, Pomona has a generous junior leave policy. I just spent a year with Steve Chong (thanks, Steve!). I spent a year doing nothing but research, and my productivity itch has been pretty well scratched. You can make it work.

Students

Pomona has an 8-to-1 student-to-faculty ratio, with a 2:2 teaching load; some SLACs might have more teaching, like 3:2 or even 3:3. I couldn’t handle a heavier load—I wouldn’t get enough research done to be happy with myself—but I enjoy teaching well rounded students in moderately sized classes (my elective next semester will have no more than 20 students). The high end of students is astonishingly high—juniors and seniors working at an early graduate student level—and it’s rewarding to work with them on things that actually advance my research. My field (programming languages) requires a lot of background to do much of anything; when I came to Pomona, I didn’t expect that I’d be able to do any of my actual research with students. But I’ve been pleasantly surprised—I’ve been able to have one or two students at a time working on real PL research; they go on and do cool things.

Community

Our department of nine tenure-track professors is convivial, genial, and collaborative. It’s not that we’re magically free of disagreement, but rather that we can disagree productively. I’ve spent significant time in CS departments since high school, and many of them are lovely places to be—but I’m charmed by my department’s mellow, friendly relations.

On a larger scale, Pomona College has roughly 200 professors. I’m pleased to have friends and colleagues from many different departments and disciplines across Pomona’s campus (and beyond—there are 400 more faculty at the other Claremont Colleges—the Consortium here is an idiosyncratic collection of institutions). As someone who majored in both computer science and Egyptology, colleagues with diverse interests enliven and enrich my days.

What’s more, Pomona has a culture of service; service is a contribution, a lending of myself that makes me feel like I’m part of the college. Starting from their second year, all professors are expected to sit on college-wide committees—the sort of thing that one might not do until a much higher rank elsewhere. In my committee work, I help steer Pomona’s course; in so doing, I invest in the school as an enterprise. Sure, that investment takes the form of extra meetings and work that doesn’t directly serve me or my research or my teaching or my department. I can imagine a SLAC would be a frustrating place to be for someone who wasn’t community-minded or who was unwilling to take part in bureaucracy.

Values

Many of my values align with institutional goals and practices. Pomona’s diversity plan is a strong, aspirational one—the school is very dedicated to equity, with a focus on students who are the first in their family to attend college, students from low-income families, and students from other underrepresented minorities. I’ve been particularly pleased to see the way the school has rallied around DACA-mented students. My department does a good job bringing in and retaining women, who constitute 30-40% of our majors and just about half of our CS faculty.

To be clear: while Pomona is already doing a lot of things right, there’s of course more to do. I don’t expect an institution to mirror all of my values—and I think it’s healthy to separate my identity, my work, and my employer. But the service culture prevailing at liberal arts colleges leaves the door open for my values to help guide the institution.

Of course, it’s not all roses

Let’s be honest: it’s still a job; the arrangement is that I exchange my time and energy for a nice salary. There are things I wish were different. I often wonder what it would be like to train a grad student or two; I suspect I would be much more productive and have a greater influence on my field. When I submitted an NSF CAREER proposal this two years ago, I had to write everything myself: there’s no big grant office with templates for my data management plan; there’s nobody to look things over for boneheaded mistakes. A larger institution with a bigger department would offer more room for research collaboration. I could happily work on research full time, like last year. Like every other computer science department, we have more majors than we can handle—our classes aren’t as small as we’d like. The liberal arts context is enriching, but students tend to go for more breadth than depth; not enough of our students go on to graduate school and too many get sucked into Silicon Valley.

On balance, though, I’m very happy. I’m less productive than I could be, but I think my best papers are my most recent ones. I have to hunt for students who want depth, but they’re there—and hungry for it! And most importantly: I can work at Pomona without being consumed. The work is tiring, to be sure, but there’s room left for the rest of myself… and, given my own peculiar, personal limits and abilities and aspirations, I’m not so sure there would be elsewhere.

P.S. we’re hiring

haha you saw this coming

If you’re interested… we’re hiring! Feel free to get in touch if you have any questions.

Computer Security in the Face of a Hostile Government

I recently gave a talk to some Pomona College undergrads about how to secure their various devices (computers and phones in particular). I tried to focus on clear and concrete advice. I got tons of help on twitter, and I ultimately came up with a slides for a talk and a security checklist.

First: I’d like to encourage other people to give this talk. Let me know how it goes, and I’ll update the slides or checklist. I’m probably not the best person to do this sort of “lay” outreach. (It was hardly “lay” in any case—I think most of the audience were CS majors.) Maybe you are?

Second: I’d appreciate your feedback. I’d like to keep the slides and checklist up to date. I unfortunately don’t have the time right now to build up more permanent advice à la Decent Security.

Third: I am fundamentally on board with Taylor Swift’s attitude to security. There is so much fear and condescension around security that many people—even CS majors!—have turned off to it. While academic security research can be really interesting, it’s at least as important to have direct, concrete advice for people without significant expertise. In many ways we’re further along than we used to be—Signal, HDD encryption, SSL/TLS—but in some ways we’re worse off (JS drive-by attacks). Whose responsibility is it to educate and protect computer users, if not us? And if not now—when?

Disjointness of subset types

In hybrid type checking, a subtyping relationship between subset types {x:T|e} determines when it’s safe to omit a cast. The structural extension of subtyping to, e.g., function types, gives us a straightforward way to achieve verification by optimization: if we can prove that a cast is from a subtype to a supertype, there’s no need to pay the runtime cost of checking anything.

When should we reject a hybrid typed program? In Flanagan’s seminal paper, he offers a straightforward plan:


Flanagan enacts this plan by rejecting casts which are provably not from a subtype to a supertype: if the SMT solver finds that a cast might fail, then it rejects the program. In his model of an SMT solver, returning a checkmark (✓) when checking proves subtyping, a question mark (?) when checking times out, or an x-mark (×) when a counterexample of subtyping is found.



The compilation and checking judgment (a/k/a cast insertion) will insert a cast on successful or indeterminate checking, but has no rules for (and therefore rejects) programs where there is a counterexample.



Unfortunately, such a policy is too restrictive. For example, an SMT solver can easily prove that {x:Int|true} doesn’t imply {x:Int|even x}… but such a program shouldn’t be rejected! We should only reject programs that must fail.

When must a cast fail? When the two types are disjoint, i.e., when they don’t share any values. Since our language with casts has when the only normal form typeable at both types is an error. Cătălin HriÅ£cu got a good start on this in his thesis (see sections 2.4.6 and the end of 3.4.4), where he defines a notion of non-disjointness. He doesn’t define the notion for function types, but that’s actually very easy: so long as two function types are compatible (i.e., equal after erasure of refinements), they are non-disjoint. Why? Well, there’s at least one value that {x:T|e11}→{x:T|e12} and {x:T|e21}→{x:T|e22} have in common: λx. blame.

I can imagine a slightly stronger version: if the domain type is a refinement of base type, then we could exclude functions which could never even be called, i.e., with disjoint domains. At higher orders, though, there’s no way to tell (maybe the function in question never calls its argument), so I think the more liberal interpretation is the right starting point.

NB I briefly discuss when to reject hybrid-typed programs in my dissertation (Section 6.1.4).

Installing ctypes and ctypes-foreign on OS X with brew and OPAM

I recently had some trouble getting ctypes working, so I thought I’d share my solution.

I found the Real World OCaml chapter on FFIs, and I tried following their advice first. They suggest:

brew install libffi
opam install ctypes

But it doesn’t quite work. For one, you also need to install ctypes-foreign. For two, the brew installation of libffi doesn’t automatically install itself globally, so ctypes can’t find it.

Here’s the right incantation:

brew install libffi
cd /usr/lib/
sudo mv libffi.dylib libffi.dylib.orig
sudo ln -s /usr/local/opt/libffi/lib/libffi.dylib
LIBFFI_CFLAGS=-I/usr/include/ffi CFLAGS=-I/usr/include/ffi opam install ctypes ctypes-foreign

Now, this is merely a good start. Once I had this much working, I still couldn’t get OCaml to call C functions! It came down to a bear of a linking issue… no combination of LD_PATH and LD_LIBRARY_PATH and LD_PRELOAD would help, nor would -lflags -cclib,-L. I just couldn’t get ctypes to see my shared library! I finally found an option to ld that does the trick: -force_load. Here’s my build line:

ocamlbuild -no-hygiene -pkg ctypes.foreign -lflags "-cclib -force_load /path/to/libfoo.a" bar.native

I imagine you can get away without -no-hygiene; I couldn’t get corebuild to work at all.

In any case, special thanks to Arjun Narayan for helping me debug this, and to Spiros Eliopolous for mockery absolutely nothing.