FP: Fairy-tale Programming?

Taking a break from my super top-secret August stealth project, I came across a Joel Spolksy piece on FP, “Can Your Programming Language Do This?”. It misses the point, showing “functions as values” as the inspiration for the map/reduce paradigm. He doesn’t even show an interesting second-order function — input only. At least he admits that he’s doing nothing you couldn’t do in C.

The good thing about Spolsky’s article is that it links to Steve Yegge‘s “Execution in the Kingdom of Nouns”. It’s a fairy-tale-style explanation of some of the problems Java has: the monomaniacal focus on objects and the absence of “verbs”. Clever and very nicely written; the comments are the icing on the cake.

But I still feel like he’s missing something. The reason Java is so noun focused is that, as several comments point out, nouns have accountability where verbs do not. I assume that somewhere, someone is very interested in every step their program makes, and they naturally cannot abide an anonymous function. When asked “who is responsible for mapping over that list?”, the program can answer this or thatObject. This aptly named “architecture” is necessary for that level of accountability. (That or drastically improved stack tracing, profiling, and live debugging support. When does MzTake get integrated with DrScheme?)

Yegge is right for accusing Java for being in love with architecture, and I think that’s the underlying problem: not everyone needs that sort of insanely detailed accountability, but everyone likes to think that they do. Just as I would scoff if people told me that anonymous functions aren’t accountable — I can see them in my IDE during stack traces, what more do I need? — an ‘enterprise’ Java programmer must be fairly irritated when told that all that structure is baroque.

JavaScript “Protection”

The NeoSmart files has a brief commentary on the feasability of encoding schemes like PHPEnkoder.

On one side, his argument is pretty strong. Any spammer could use Greasemonkey to drive harvesting — complete DOM, complete JavaScript. But there are two points I disagree with him about.

First, he mentions that

JavaScript was never meant to be used as a heavy cavalry, a knight in shining armor, or else a bit of code that can may be used to do anything – because its not.

On the one hand, this has historical precedence. For example, much of the damage done by COBOL was due to its abuse. It was a small scripting language that exploded. But it’s also a bit senslesss. COBOL was crippled on day one. Syntactically and semantically, I mean. JavaScript, with its anonymous functions and prototype object system, remains state-of-the-art today. Just because JavaScript was a glue-script language when it was born doesn’t mean it can’t be useful now as a general-purpose language. Or should ML only be used to write theorem provers? Should Smalltalk only be used to teach children? JavaScript is not Tcl, and it’s not Perl. JavaScript has some great features, a light syntax, and a huge user base. While it’s true that browser incompatabilities are a problem, toolkits seem to have dealt with this well.

But I’m not going to argue languages with someone who prefers VBScript to JavaScript. The important thing is that I think he missed an important capability of the Hivelogic algorithm, something that makes it much more powerful than it seems.

PHPEnkoder is just a port of a very creative piece of software, the Hivelogic Enkoder. The original Enkoder takes a string and encodes it by, say, swapping every other letter. It then tacks on some JavaScript to swap them back. This is already a pretty strong system. But then it goes on and encodes that JavaScript, building up a tower of encoded scripts. An evaluation loop calls eval, iteratively decoding until the bottom document.write is reached and the text is displayed.

What’s so special about that? Well, we can build a tower as high as we like. We can make it arbitrarily computationally intensive to decode the e-mail. Half a second? Easy, give it forty or fifty encodings. Five seconds? Sure. These “computational micropayments” can be worthwhile for a user to pay, but a spammer? Decode one, sure. Decode fifty? That’s nearly five minutes to get fifty e-mail addresses. How many of those people really need a bigger penis?

I don’t much like that future, though. Even if it’s a link a user can click to wait a minute for the e-mail address, that’s not ideal. NeoSmart is right, much of the problem can and should be solved server side. The only client side solution that will ever work will require human language: posting e-mail addresses as puns, jokes, tricks, songs. The only way to escape our symbol-processing machines is to abuse symbols: I’m Mike, and I hang out with hatted weasels! My plugin, PHPEnkoder, also spends a lot of time at the weasels’ place.

But I haven’t seemed to need either of those solutions, as I still haven’t received any spam at the addresses I’ve posted here — encoded, of course.

PHPEnkoder 1.1

Yaniv encouraged me to add some basic <noscript> functionality to PHPEnkoder, so I threw it in — along with a few bugfixes. The message for non-JavaScript-capable clients is configurable, and it optionally applies to RSS.

Get it while it’s hot!


Harking to Mark Liberman of Language Log‘s call for science blogs, I’ve started up this bit of silliness. I’m a computer science student, focusing on verification, temporal logic, and programming language theory. You know, the usual.

The name is random, left over from software I used to write. I have my own image, but try to free associate the two: weasel, hat; hat, weasel. Yes, yes, that’s good.