August | 2020 | weaselhat

So you want to define a gradual type system, like all the cool kids? My SNAPL 2019 paper imagines three possible motivations:

Expressiveness. You have nothing to lose but your static chains!
Interoperation. Gradual typing seamlessly weaves the dynamic and static worlds into a single fabric.
Typing itself. Static typing offers myriad benefits: enjoy them today!

You don’t have to pick just one. Or maybe you have a different motivation—I’d love to hear it. If you’re motivated by one of these goals but aren’t sure what to do, the paper offers a variety of challenge problems in Section 3.

Now, what do you have to do to define your gradual type system? You have to come up with a type system that has a question mark in it, of course. (You can also write any or dyn or Dynamic or *—whatever works for you.) But what else?

A surface language. Since Siek and Taha’s seminal 2006 paper, gradual types have commonly been expressed via elaboration: a source language (with nonexistent or optional or partial type annotations) is translated to a core language that makes all dynamism explicit. What is your source language? Even if you don’t define your source language formally, give an intuition about how programmers will experience it. Can programmers control what’s dynamic and what’s static? Do you ever reject source programs? Which? (GTLC rejects true 5—even in dead code—but different source languages could do different things.) Why is your line the right one to draw?
Concrete examples. Ideally drawing from real-world examples, what might be good about gradual types in your context? What new programs do you allow? What problems do you avoid? What guarantees do you gain? Make your example programs good! As Alan Perlis said, “A program without a loop and a structured variable isn’t worth writing”. Examples from the SNAPL paper include: the flatten function, JSON processing, or the “attach-to-the-request” idiom in middleware.
Operations. What can we do with base types? Having real operations around will force you to answer hard questions about your source and core languages. How does equality work, i.e., what can be compared and what are the answers? Does your dynamic language reject 5 + "hit"? What about 5 + ((Î»x.x) "hit")? If you truly have a dynamic type, what operations can you do on it? Which can fail? Is there a way to check at runtime whether casting to a static type will succeed before you commit to such reckless behavior?
Control. Include conditionals or some other nontrivial notion of control flow. The first published rules for gradual typing that used a notion of ‘meet’ came in 2012! The way you treat join points in control says a lot about the ergonomics of your system. Church encodings do not cut the mustard.
Type semantics. Are your types worth the pixels they’re written on? What do they mean? If I have a value of a given type, what guarantees do I have? You don’t need to give a formal type semantics, but it’s important to know what to expect. If I write a function Î»x:T. e, what can I actually assume about x in e? If T is int, do I know x is an int, or could it blow up? What about ref int… can reading fail? Writing? What about list int? Does pattern matching on it cause conversions, or possible failure? What about…

The SNAPL 2019 paper argues that there are two ‘lineages’ of gradual typing: one which starts from statically typed languages and relaxes or modifies the type system to include dynamic features, and one which starts from dynamic languages and tries to develop a static type system that can accommodate your ‘preexisting conditions’—legacy code. Whichever lineage you’re working in, each item above is worth carefully considering.

I want to conclude by calling out a paper that more people ought to know about; it does a good job on most of these points. It came out the same year as Alanis Morisette’s acclaimed international debut album, Jagged Little Pill.

Henglein and Rehof, “Safe polymorphic type inference for a Dynamically Typed Language: Translating Scheme to ML”, FPCA 1995 (technical report preprint).Download

(The official ACM version is less complete than the technical report—alas.) They are clear about their surface language (Scheme—with argument lists and call/cc, but not arbitrary set!). They have an entire section of concrete examples, with good demonstrations of how conditionals work with their coercion parameters. They even draw on examples from the literature, citing Mike Fagan’s thesis (which is a goldmine of examples). They don’t give a formal type semantics, but they do explain (with positive and negative examples) how type coercion parameters and polymorphism interact to achieve in their elaborated ML the ad hoc polymorphism necessary to implement their source Scheme.

I also want to highlight this paper because it’s one that I’ve never heard folks actually talk about, though it seems to be cited well enough. I urge anyone who is interested in gradual types to read it. Just like Alanis’s cri de coeur against the shallow world of pop, some things from 1995 are worth revisiting.

Ron Garcia gave helpful feedback on a draft of this post. Thanks, Ron!

(* subtyping *) output sub(ctx, typ, typ) (* bidirectional typing rules *) output synth(ctx, exp, typ) output check(ctx, exp, typ) (* subtyping between refinement types is implication *) sub(G, typ_ref(X, B, E1), typ_ref(Y, B, E2)) :- wf_ctx(G), exp_subst(Y, exp_var(X), E2) = E2prime, encode_ctx(G, PhiG), encode_exp(E1, Phi1), encode_exp(E2prime, Phi2), is_valid(`PhiG /\ Phi1 ==> Phi2`). (* lambda and application synth rules *) synth(G, exp_lam(X, T1, E), T) :- wf_typ(G, T1), synth(ctx_var(G, X, T1), E, T2), typ_fun(X, T1, T2) = T. synth(G, exp_app(E1, E2), T) :- synth(G, E1, typ_fun(X, T1, T2)), check(G, E2, T1), typ_subst(X, E2, T2) = T. (* the only checking rule *) check(G, E, T) :- synth(G, E, Tprime), sub(G, Tprime, T).

fun exp_subst(X: var, E : exp, Etgt : exp) : exp = match Etgt with | exp_var(Y) => if X = Y then E else Etgt | exp_bool(_) => Etgt | exp_op(_) => Etgt | exp_lam(Y, Tlam, Elam) => let Yfresh = fresh_for(Y, X::append(typ_freevars(Tlam), exp_freevars(Elam))) in let Elamfresh = if Y = Yfresh then Elam else exp_subst(Y, exp_var(Yfresh), Elam) in exp_lam(Yfresh, typ_subst(X, E, Tlam), Elamfresh) | exp_tlam(A, Etlam) => exp_tlam(A, exp_subst(X, E, Etlam)) | exp_app(E1, E2) => exp_app(exp_subst(X, E, E1), exp_subst(X, E, E2)) | exp_tapp(Etapp, T) => exp_tapp(exp_subst(X, E, Etapp), typ_subst(X, E, T)) end

weaselhat

Month: August 2020

What to Define When You’re Defining Gradual Type Systems

Formulog: ML + Datalog + SMT