Hutton and Wright have a paper What is the Meaning of These Constant Interruptions? in which they prove correct a semantics for exceptions and interrupts in a contrived language relative to a virtual machine by means of a compilation function.
A well-shaven Jim Miller gave a talk at Brown today on introducing dynamic languages into the CLR. I went because (a) they were giving away a free camera (which I didn’t get), and (b) the talk seemed actually interesting. Much more interesting than my last Microsoft talk, which was some interminably boring thing about webservices.
I tried to ask him the following question: “You’ve talked a lot about getting basic interoperability between programs written in different languages — object-oriented, procedural, functional — but you’ve said nothing about heterogeneity of styles. How does the CLR cope with that?” I asked because he said that an interface of an object included a way to create it, modify its insides, and so on — not necessarily how I define my datatypes. But then he turned it around, and asked me for an example — the best I could come up with was Erlang’s processes and single-assignment (a red herring) or Haskell’s purity (an academic joke).
Not what I was looking for. The mismatch is more serious, but I couldn’t adequately get at it on the spot. I asked him a follow-up, about GUI programming style. Would a GUI in IronPython need to be written in some bogus Windows Forms style, all callbacks? I didn’t want to compare the callback-GUI system to Flapjax‘s, since that’d be a bit wankerish of me. I didn’t handle it well at all.
So: what did I mean to ask? On the one hand, my problem is this: the mismatch between the CLR and some paradigms is pretty clear. If I wanted to work with Windows Forms in, say, F#, I would have to write in an all-ref style. In other words, the CLR lets me program C# in the language of my choice.
On the other hand, how else would I write a GUI? Flapjax is the most viable FRP implementation I’ve seen, but it’s still extremely nascent. When interfacing with the user, what is better than direct, mutative style? Python uses callback-based Tkinter and WxPython; DrScheme uses the MrEd callback-based framework. So what gives?
I may be more comfortable in the FP world, and happier programming Scheme in Scheme, rather than C# in Scheme, or even worse, Scheme in C# — but it’s becoming increasingly clear to me that my preference may be more related to my own experience than actual advantages.
The response to the template language in Flapjax has been mixed, to say the least. The most common complaint is that they mix content with style. True — this can be done with our templates. But nothing stops a developer from putting CSS in the HTML directly. Nothing except good sense, that is.
In response to a recent post on our discussion group for Flapjax, I wrote about this briefly:
This can become a huge mess, so the templates simplify it. We wouldn't want to encourage doing all of your computation in the display layer! I'd compare it to Smarty or JSP -- while you can include complex computations in the tags, it's meant only to ease the interface between models in code and presentations on screen.
I think that soundly characterizes the template language. Shriram wants that in the tutorial, so that'll show up there eventually. (Some people take classes during the semester, but the Flapjax team has much more important things to do. Or so we're told.) But it would be a big mistake to see Flapjax as providing just that -- the functional reactivity is the interesting bit.
Tonight we're presenting the language to the undergraduates in the DUG; there's an exciting contest announcement coming up, as well. Who has time for midterms?
Since I got back from Israel, I’ve been working on a top secret project: a programming language for the Web. Well, for the Web 4.0 — we gave 3.0 a miss. The language is Flapjax. As you’ll note on the homepage linked above and on the Flapjax development blog, it’s multifaceted. I’ll mention the salient major features here.
The main feature is functional reactivity, found in flapjax.js. Functional reactive programming (FRP) has been around for a while in the Haskell community. The PhD lead on our project, Greg Cooper, wrote FrTime (read Father Time), which embeds FRP in Scheme.
To learn more about FRP, you might as well walk through our tutorial. It’s a callbackless world in which values vary over time and whole expressions are appropriately re-evaluated. For example, the text in a textbox can be computed with over time — no need to register callbacks for onfocus, onblur, onthis, or onthat.
In essence, FRP is a monad. But in practice, this means that FRP is a driver/callback manipulator and CPS-ed code. In FrTime, CPS-ing isn’t done directly, but instead all of the language’s primitives (+, cons, etc.) are lifted. In Flapjax, either the programmer does it manually or the compiler (my work) translates the code to CPS. There are arguments in either direction — the compiler’s aggressiveness can make it hard to use.
If AJAX without the callback mess wasn’t enough, we also wrote a generalized object store. It’s accessed via AJAX (really AJAJ, since we use JSON extensively), and was built to allow quick and easy sharing. We don’t have a demo as cool as Yaggle yet, but it’s certainly in the works.
So that’s it. Future blogging topics are: the templating syntax, compiler internals, client API internals, basic howtos. The whole project was immensely fun. Shriram got me on board by asking me what would happen if PL people actually got together and wrote something for real — that is, fully implemented an idea and sent it out at the world in a language the world can use. We both chuckled for a moment, thinking how funny it would be to actually apply PL. And then he pointed out that there’s nothing funny about that at all.
- Proposes a small syntactic extension to the lambda calculus.
- Gives an intuition for how it works and some equivalent, lambda-only constructions.
- Gives an operational semantics.
- Proves the soundness and completeness of a simple type system.
- Proves the soundness and completeness of a parametrically polymorphic type system.
- Designs and proves correct several optimizations for this construct.
- Lists compelling use cases for the new extension.
- Provides an evaluation of the extension as applied to one of those use cases.
I’ve seen a lot of papers which are only the first three, or perhaps four or five. But the last three items are the real kicker. They show that Shivers is really grounded, that he’s not just rambling about Foozles, but is interested in making programming languages better.
If there’s any problem, it’s his writing and way of explaining. Shivers has worked alone a lot, for some unsurprising reasons, so his prose clearly reflects a single person’s thoughts. On the one hand, this makes reading his work a little difficult; on the other, it’s probably the source of a lot of his creativity and incision.
Taking a break from my super top-secret August stealth project, I came across a Joel Spolksy piece on FP, “Can Your Programming Language Do This?”. It misses the point, showing “functions as values” as the inspiration for the map/reduce paradigm. He doesn’t even show an interesting second-order function — input only. At least he admits that he’s doing nothing you couldn’t do in C.
The good thing about Spolsky’s article is that it links to Steve Yegge‘s “Execution in the Kingdom of Nouns”. It’s a fairy-tale-style explanation of some of the problems Java has: the monomaniacal focus on objects and the absence of “verbs”. Clever and very nicely written; the comments are the icing on the cake.
But I still feel like he’s missing something. The reason Java is so noun focused is that, as several comments point out, nouns have accountability where verbs do not. I assume that somewhere, someone is very interested in every step their program makes, and they naturally cannot abide an anonymous function. When asked “who is responsible for mapping over that list?”, the program can answer this or thatObject. This aptly named “architecture” is necessary for that level of accountability. (That or drastically improved stack tracing, profiling, and live debugging support. When does MzTake get integrated with DrScheme?)
Yegge is right for accusing Java for being in love with architecture, and I think that’s the underlying problem: not everyone needs that sort of insanely detailed accountability, but everyone likes to think that they do. Just as I would scoff if people told me that anonymous functions aren’t accountable — I can see them in my IDE during stack traces, what more do I need? — an ‘enterprise’ Java programmer must be fairly irritated when told that all that structure is baroque.
The NeoSmart files has a brief commentary on the feasability of encoding schemes like PHPEnkoder.
First, he mentions that
What’s so special about that? Well, we can build a tower as high as we like. We can make it arbitrarily computationally intensive to decode the e-mail. Half a second? Easy, give it forty or fifty encodings. Five seconds? Sure. These “computational micropayments” can be worthwhile for a user to pay, but a spammer? Decode one, sure. Decode fifty? That’s nearly five minutes to get fifty e-mail addresses. How many of those people really need a bigger penis?
I don’t much like that future, though. Even if it’s a link a user can click to wait a minute for the e-mail address, that’s not ideal. NeoSmart is right, much of the problem can and should be solved server side. The only client side solution that will ever work will require human language: posting e-mail addresses as puns, jokes, tricks, songs. The only way to escape our symbol-processing machines is to abuse symbols: I’m Mike, and I hang out with hatted weasels! My plugin, PHPEnkoder, also spends a lot of time at the weasels’ place.
But I haven’t seemed to need either of those solutions, as I still haven’t received any spam at the addresses I’ve posted here — encoded, of course.
The course was very interesting, and I’ve come to a conclusion about the expressivity and usefulness of specification languages.
The course itself is a basic overview of specification techniques. We started with a simplified Hoare logic, using it to “specify” programs. The idea being that complete Hoare specification can be used to infer code. This is of course insanely intractable — one could write the code and the Hoare invariants in the time it would take to just write the invariants. This material was also covered in Orna’s Introduction to Software Verification.
Next we approached a more useful system, Zed, a transition-based algebraic specification language. Zed’s mathematicality was fun, but Zed is also pretty useless — it fully contains higher-order logics, so the full language is undecidable.
After Zed, we came to probably the most fun section of the course: statecharts. Statecharts are hierarchical state machines with variables and a choice of composition operators (with and without history, sequential or parallel). It was clear from the beginning that statecharts were invented for a real-world use (in-flight controls, I believe). First, they are decidable when variables have finite domains. Second, this decidability comes from a compeltely intuitive translation into a finite-state machine. Third, they are visually intuitive and appealing. The only downside was the tool we were using (Rhapsody, an IBM tool), which had odd restrictions and an even odder interpretation system — instrumented compilation into C++. I’ve had brief fantasies of writing a nice Scheme or Python environment for direct interpretation of statecharts, but I have more fantasies than time.
Following the foray into imperative specifications, we drifted towards reactive systems, first by looking at Lamport, a system of temporal requirements on “operations” (essentially transitions). From there we began a multi-week foray into temporal logic. (Needless to say, these four or so weeks were dutifully skipped.)
Only a few weeks ago we emerged from the temporal forest into the process specification language Lotos. It’s essentially π-calculus with output as synchronization: when two processes are synchronized on a given channel and one wants to output, the other must be either (a) willing to receive, or (b) outputting the exact same value.
After Lotos, we looked at Larch, a slimmed-down algebraic specification system. It’s often decidable, unlike Zed, and deals with axiomatic theories as opposed to transitions. (Zed has syntax for axiomatic reasoning, too, but it is not the default mode of expression.) Larch is nice, particularly since it offers a very intuitive, functional way of reasoning — the axioms tend to make great use of lazy evaluation.
So, out of all of the languages I was exposed to, I liked three: statecharts, Lotos, and Larch. What makes these three special? Each of them implies an execution model. These models are sequential-reactive, parallel-reactive, and functional.
Statecharts are “sequentially reactive” in that the focus is on the sequential steps taken in a reactive program. It is the lowest level of all of the languages, but is still quite expressive due to its hierarchical nature. It’s also the easiest to prove things about, since statecharts are completely decidable for variables in finite domains. The only barrier is the size of the generated model.
Lotos is “reactive in parallel”, as it focuses on process synchronization and messages sent. This is an interesting idiom, and the one I have the least experience in. I had a few problems with it, as the synchronization mechanics were frequently abused in unrealistic ways. For example, one of our sample models was a system of two elevators and a controller. Both of the elevators shared a floor counter, which worked in a very strange way (continuously counting up and down, trying to synchronize with elevators at each floor). This happened to be easily and naturally expressed in Lotos, but no working implementation could ever be reified from the specification.
Lastly, Larch is “functional” — we reasoned about various evaluation axioms over a set of tags. We essentially give a reduction semantics for these tags, indicating that some construct and others destruct, while still others observe structure. All of the Larch specifications I’ve seen so far could be translated easily into Haskell; some of them would even be fully operational.
What makes these three so good? They’re intuitive, they’re like programming languages. Statecharts resemble the ALGOL family, Lotos resembles Erlang and othe process languages, and Larch is like FP, of course. Writing specifications in these languages is like higher-level programming, a few steps beyond pseudocode. An FP programmer working in Larch will be able to write specifications in a way that discovers bugs and elucidates requirements quickly — more quickly than if writing in an abstract logic like Lamport or Zed. It’s not just playing with symbols and logic — what I felt with Zed — the end product is computationally intuitive.
I’m reminded of Nancy Leveson’s keynote at FSE-12. She talked about how engineers are just confused and irritated by formal logics. Instead, she gave them a “case chart”: each row represents a boolean state (e.g., “engines are on”, “wings are adjusting”, etc.); each column was a set of checkboxes. For a given case chart, engineers would check off the valid sets of states, each column representating a different sub-case of the given case. This, she pointed out, was just DNF. But if you call it DNF they’ll scream.
The checkbox chart was in the engineers’ domain, but logics weren’t. Statecharts, Lotos, and Larch are so good because they each rest solidly in the implementation domain: programming languages.
The paper is interesting. On the one hand, it’s fairly light in terms of original contribution — it’s an extension of Pnueli and Rosner’s 1989 paper On the Synthesis of a Reactive Module. On the other hand, the journal format makes the K&V paper much clearer than P&R. I know this chiefly because I was supposed to present the latter, but got bogged down and switched.
The paper contributes two things. First, they extend P&R’s 2-EXPTIME algorithm for LTL synthesis to a similarly complex algorithm for CTL*. Second, they add the concept of “incomplete information”. During synthesis with incomplete information, input signals (propositions in the specification ψ) are split into two sets: I, “input” — the known or “readable” signals, and E, “environment” — the unknown or “unreadable” signals. They aim to generate a program that is correct in terms of ψ, even though the program sees signals only in I and not in E.
For example, ψ = G((m ∧ ¬b) ⇒ Xr) ∧ (¬m ∨ b) ⇒ X¬r), where m represents a message, r a response, and b whether or not a message is “bogus”. The system should respond only to nonbogus messages. Given m, b ∈ I and r ∈ O (“output”), the specification is realizable. But if b ∈ E, ψ is unrealizable.
Someone asked a very astute question during the talk: when is a specification with E ≠ ∅ realizable? Only when all propositions in E form a tautology. So why is E useful/meaningful at all? I came up with two cases: iterative checks to determine what signals are necessary; and the synthesis of systems of modules with some shared signals.
No matter how interesting the paper is, my presentation was terrible. Too formal, bad examples. It didn’t help that I was speaking English to a Hebrew-speaking audience, but even Orna mentioned that my examples were bad. The problems with my examples include:
- Hand-drawn on the board. This meant that I spoke into the board half the time.
- Minimal. I drew the smallest trees possible, sticking to the binary domain and the self-product of the binary domain. This of course ties into problem number 1.
- Unmotivated. The tree examples all came before any mention of the need for them. The paper presents the tree preliminaries first, but perhaps a “woven” presentation would be better.
- Too formal. After about five minutes of explaining hide, someone said “Wait, so it’s just a projection?” I was stuck in the notation, not giving a clear picture.
We’ll see if I can improve, perhaps giving the talk again back at Brown. Then again, the audience at Brown is much less versed in verification in general — increasing the need for good, clear examples. I’ll certainly drop the emptiness check material in the slides (slides 32 to 39), as I ended up skipping it here in Israel.
My mediocre presentation skills aside, the paper is both very good and very frustrating. It presents difficult material clearly, making the technique completely understandable. But every step of the work — automata generation, processing, emptiness checks — is theoretical. So far as I can tell, there isn’t a single implementation for a single step of their algorithm — after nearly a decade. I couldn’t even find a tool to translate from CTL* to μ-calculus! Without the CTL construction in section 4.4, it would have been much harder to present a nontrivial example.
But new in the synthesis world is a 2006 paper by Kupferman, Vardi, and Nir Peterman, Safraless Compositional Synthesis. It proposes an EXPTIME algorithm for LTL synthesis (as opposed to P&R’s 2-EXPTIME algorithm); even better, the algorithm is incremental with regard to conjunction! Best still, they claim that it’s amenable to symbolic implementation, since it doesn’t use Safra’s determinization technique (from the 1988 paper On the Complexity of ω-Automata). I still need to give the 2006 paper a full reading; real implementability would be very exciting. After I finish it and a few others, I’ll write up and post a bibliographic review.