Cast notation: a case study

I recently wrote on the SIGPLAN blog about how PL notation is a barrier to entry. While the arguments are mine, I acknowledge the many folks who helped me write it in the post. Ideas from an interesting conversation with Neel Krishnaswami came up again in a comment from Jeremy Gibbons. The universe has spoken: here’s what I think about cast notation.

First, here’s what Neel said:

I think that the argument in this note is a bit under theorized. As I see it, there are three fundamental forces at work:

1. Infix operators are hard to parse without extra side information (eg, precedence).

2. Center-embedding is hard to parse for psycholinguistic reasons.

3. Semantics is about relating things, and it is easier for people to see connections when the notational differences between the two things are small.

So when you compare cast(e, S, T) to e <S => T>, the functional notation wins in point 1 and loses on point 2. This is because cast(cast(e, S, T), T, U) has nested structure in a way that e <S => T> <T => U> does not—the second one can parse as exp cast*.

I don’t know the work on gradual typing well enough to say with any confidence, but I would have expected the second notation to be a bit better for 3. The semantics of the term e is a map Γ -> S, and if the meaning of a cast is an embedding function S -> T, then [[ e <S => T> ]] = [[e]]; [[<S => T>]] — i.e., the parts of the term can be interpreted using composition in a diagrammatic order without changing the relative position of the subterms.

My guess in this last part is also an example of the dynamic that leads to bad notation — we pick our notation at the outset, based on a guess about which semantic properties are important, and we can get stuck with bad notation if the properties that actually are important differ from the ones we guessed when designing the notation. (Well, people can also just have bad taste, I guess.)

—Neel Krishnaswami, personal communication.

Both Neel and Jeremy zeroed in on a feature I really like about the e <S => T> or e :: S => T notations: they’re neatly diagrammatic. Jeremy goes further, noting that these notation suggess that cast might compose, as in e <S=>T> <T=>U> ~= e <S=>U>.

If I were forced to choose a notation, I do think these two are the best… with a slight preference for e :: S => T. (I think Phil Wadler introduced this notation, but I’m too tired to run it down just now. Let me know in the comments? Edit: thanks to James McKinna for pointing out the source—“Blame for all” by Ahmed et al.—in a comment, below!)

So why do I prefer a function? In short: the notations suggest identities which don’t actually hold.

Casts don’t compose

Whether you’re casting between simple gradual types or dependent refinements… casts don’t actually compose! Consider a term f : int -> ?, i.e., a function f that takes an int and returns… something.

We can cast f to be purely dynamic, writing f' = f :: (int -> ?) => (? -> ?). These types are compatible, i.e., they differ only by having or not having a ?. Now eventually f' may flow out of the dynamic part of our code and arrive at some context that thinks f' ought to be a bool -> bool function, casting it. So we get:

f' :: (? -> ?) => (bool -> bool) =
(f :: (int -> ?) => (? -> ?)) :: (? -> ?) => (bool -> bool) =
f :: (int -> ?) => (? -> ?) => (bool -> bool) 

Now, composition would say that we ought to be able to convert the above to f :: (int -> ?) => (bool -> bool), but such a cast is statically forbidden—these types aren’t compatible because their domains are incompatible, i.e., you can never cast from int to bool! (If you’re surprised that compatibility isn’t transitive in this way, a whole literature awaits. I recommend Abstracting Gradual Typing by Garcia, Clark, and Tanter as a starting point: transitivity can fail.)

In the contracts world, e :: S => T => U ~= e :: S => U is just as bad. What if S = U = {x:Int|true}, but T = {x:Int|x>0}. By eliminating the cast in the middle, we’ve forgotten to check that e is positive! Such a forgetful semantics comes up as a possible space-efficient semantics, though you can do better.

Cast congruence

Where Jeremy talked about composition of casts, Neel talked about compositional semantics: that is, the postfix notation directly suggested a diagrammatic denotation, as in [[ e :: S => T ]] = [[ e ]] ; [[ S => T]]. My experience with casts suggests that this intuition isn’t a helpful one, for two reasons.

First: the key ingredient for space-efficient evaluation is not using conventional composition for casts, but rather treating casts in your continuation specially. That’s no ordinary semi-colon! A “cast congruence” lemma lets you recover conventional reasoning, but it takes quite some work to get there.

Second, treating casts as first class (i.e., utterable without being directly applied) forces you to think about very strange terms, like <S1->S2 => T1->T2> <S1 => S2>. (Just… don’t. For why?) Just as for primitive operations, it’s simplest to force casts to be fully applied.

Use coercions

I don’t like these notations for casts because they offer bad suggestions. A textual notation is modestly more cumbersome here, but it’s worth it for clarity to newcomers. It’s particularly worth skipping fancy notation in this setting, because casts are merely a technical device for a core calculus, not a part of the surface language.

But the real truth is: if you’re interested in higher-order runtime checks, you really should be using Henglein’s coercions anyway. (And check out Henglein and Rehof’s Scheme implemented on those principles while you’re at it.) Coercions are much clearer than casts, compose naturally, and are the source of space efficiency and fast implementations. What’s more, blame safety for coercions is straightforward… it’s syntactic!

Postscript: the beam in my eye

You might say, “Well, Michael, it seems like all your papers use the <S => T> e notation. Who are you to judge?”

I used that notation first for POPL 2010, when we (me, Benjamin Pierce, and Stephanie Weirich) tried to figure out whether or not contracts and hybrid types were the same. (They are at simple types. They aren’t if you have dependency—they treat “abusive”, self-contradictory contracts differently.) I just stole Cormac Flanagan’s notation from Hybrid Type Checking, changing an \rhd to a \Rightarrow. (Ugh, why change it at all? My guess: I could draw \Rightarrow better on the board.)

I’ve stuck with that notation… and that’s the heart of the problem. People get used to a board notation, and then they decide that their preferred shorthand is what should go in the paper. What’s good for you when you’re doing the work may not be helpful or even clear to new folks.

I like Neel’s final diagnosis. Don’t invent a notation in the first paper. Use whatever you want on the board, but publish with text first. Like a good stew, your notation will be better on the second day, once the flavors have had time to marry. Later on, if you decide you do need a notation, you’ll know exactly which identities you want your notation to suggest… and which it should not suggest!

Racist Bullshit in Mathematics

Robin Gandy’s “On the Axiom of Extensionality–Part 1”, Journal of Symbolic Logic, Vol. 21, No. 1 (Mar., 1956) quotes Alan Turing using a racist phrase.

A screengrab if the bottommatter of the first page of Gandy's paper.

Received July 24, 1955.
1 Indeed A. M. Turing once told me that he had done this, and that the proof was fairly difficult. I have found among his manuscripts two versions of the proof: one is rather short and contains a fallacy which could not, I think, easily be put right; the other (perhaps a second draft) is unfinished and only a beginning. He may therefore have discovered and surmounted the fallacy. On the other hand, he always spoke of the axiom of extensionality as being 'the nigger in the woodpile', which suggests that he did not think his consistency was transcendental enough to accord with Gödel's theorem; but, by the results of this paper, it would have to be just that.
[Turing] always spoke of the axiom of extensionality as being ‘the nigger in the woodpile’, which suggests that he did not think his consistency proof was transcendental enough to accord Gödel’s theorem.

Yikes. Those unfamiliar with this particular racist phrase will be disappointed to learn that it’s still current enough in the UK to be used “totally unintentional[ly]”… whatever that means.

Gandy’s paper isn’t the first time I’ve been pulled out of my mathematical/logical/philosophical reverie by racist bullshit. When I was reading Ronald Clark’s The Life of Bertrand Russell, I posted a thread on Twitter of Russell’s many racist utterances, with a selection of three racist Bertrand Russell quotes; two anti-racist quotes repudiate his earlier statements, offering some modest redemption.

I found these episodes of casual, by-the-way racism jarring: they pulled me out of my investment with the material and my ability or even desire to identify with the author.

What’s galling is that Turing “always” spoke of the axiom of extensionality this way; Gandy thought the phrase worth repeating verbatim; the reviewers and editors and publishers thought the phrase acceptable; and those who cite the paper don’t seem to find this footnote worth remarking on. Gandy’s paper is important and widely cited—a foundational resource on extensionality in general, and functional extensionality in particular—but if I refer someone to it, I’m going to let them know to expect a disappointingly racist quote from Turing.