Installing ctypes and ctypes-foreign on OS X with brew and OPAM

I recently had some trouble getting ctypes working, so I thought I’d share my solution.

I found the Real World OCaml chapter on FFIs, and I tried following their advice first. They suggest:

brew install libffi
opam install ctypes

But it doesn’t quite work. For one, you also need to install ctypes-foreign. For two, the brew installation of libffi doesn’t automatically install itself globally, so ctypes can’t find it.

Here’s the right incantation:

brew install libffi
cd /usr/lib/
sudo mv libffi.dylib libffi.dylib.orig
sudo ln -s /usr/local/opt/libffi/lib/libffi.dylib
LIBFFI_CFLAGS=-I/usr/include/ffi CFLAGS=-I/usr/include/ffi opam install ctypes ctypes-foreign

Now, this is merely a good start. Once I had this much working, I still couldn’t get OCaml to call C functions! It came down to a bear of a linking issue… no combination of LD_PATH and LD_LIBRARY_PATH and LD_PRELOAD would help, nor would -lflags -cclib,-L. I just couldn’t get ctypes to see my shared library! I finally found an option to ld that does the trick: -force_load. Here’s my build line:

ocamlbuild -no-hygiene -pkg ctypes.foreign -lflags "-cclib -force_load /path/to/libfoo.a" bar.native

I imagine you can get away without -no-hygiene; I couldn’t get corebuild to work at all.

In any case, special thanks to Arjun Narayan for helping me debug this, and to Spiros Eliopolous for mockery absolutely nothing.

Twitter bots and OAuth

I’m working on a Twitter bot, and I ran up against something very annoying: apps need to be on an account with a mobile phone number. I have just one mobile phone, and it’s already tied to my real Twitter account. Rather than finding a way to get another mobile number, I had the bot authorize my app using Twitter’s OAuth API. Here’s how to do it.

Step one: collect your API keys. There’s a consumer key and a consumer secret, both available from your app’s page at apps.twitter.com. Never commit these to a repository or post them anywhere. I put them in a special file that I tell git to ignore, keys.py.

Step two: collect OAuth tokens. The key trick here is to collect tokens that are permanently good, using the out-of-band (OOB) PIN method. To actually send the requests, I used the requests-oauthlib library for Python. Here’s how to do it:

from requests_oauthlib import OAuth1Session
from keys import *

twitter = OAuth1Session(consumer_key, consumer_secret) # loaded from keys.py!
twitter.params['oauth_callback'] = 'oob' # make sure we're in PIN mode
r = twitter.post('https://api.twitter.com/oauth/request_token')

At this point, r.text will tell you, HTTP request parameter style, your oauth_token and oauth_secret. Add these to your keys.py file and copy the oauth_token to the clipboard.

Step three: authorize the app. Fire up a web browser and log in to the bot’s account. Go to https://api.twitter.com/oauth/authorize?oauth_token=[whatever your oauth_token was]. You’ll get a PIN number. Add it to your keys.py file.

Step four: finalize the authorization.

from requests_oauthlib import OAuth1Session
from keys import *
twitter = OAuth1Session(consumer_key, consumer_secret)
twitter.params['oauth_verifier'] = pin
r = twitter.get('https://api.twitter.com/oauth/access_token?oauth_token=' + oauth_token)

Now r.text will give you a new oauth_token and oauth_token_secret. Save these—they’re OAuth access tokens, which are how you can actually make API calls.

Step five: check that it worked. Log in to the bot account, go to settings, and check the apps subheading—your app should appear. You can tweet programmatically, now:

from requests_oauthlib import OAuth1Session
from keys import *

twitter = OAuth1Session(consumer_key, consumer_secret, oauth_access_token, oauth_token_secret)
twitter.post('https://api.twitter.com/1.1/statuses/update.json?status=Testing.')

I would be lying if I said I was 100% confident that everything from here on out is easy, but it seems like these access tokens are all you need after PIN based authorization.

PHPEnkoder 1.13

I’ve resolved some E_NOTICE-level messages that were showing up when people set WP_DEBUG to true. Thanks to Rootside for pointing out this problem on the WordPress forums. As always, please let me know on the forums or email hidden; JavaScript is required if you run into any problems.

Scheduling the discussion order at PC meetings

I recently wrote a bit of code for scheduling the discussion order of PC meetings so as to minimize traffic in and out of the room due to conflicts of interest. Given some information that HotCRP happily generates, the code generates a schedule, which can be further turned into a handout and slides showing the current paper’s conflicts and the two upcoming papers.

If you’re interested in using or contributing, I’ve put the code up at mgree/conflict on github.

PHPEnkoder 1.12.1: “PHP is laughably bad” edition

I’ve released a bugfix for PHPEnkoder version 1.12. Get this: before version 5.5, PHP didn’t support arbitrary array index expressions. The problem was a line:

$ord = unpack("N",$c)[1];

Which I changed to:

$bs = unpack("N",$c);
$ord = $bs[1];

This is really ridiculous. Like, serious amateur hour ridiculous. Like, if your final project in a compiler’s class had syntactic limitation, you would not get a good grade. PHP is a fractal of bad design. Ugh. Just: ugh.

And while I’m complaining about these sorts of things, where are the structured content management systems? PHPEnkoder’s email and mailto: detection is a giant, horrible kludge of regular expressions. Where’s the CMS where the filters get passed a structured, syntactic representation of the page to be rendered? This is a serious question, and if you know of one, email hidden; JavaScript is required.

PHPEnkoder 1.12: unicode support!

I’m really delighted to have resolved a longstanding problem with PHPEnkoder and Unicode: PHPEnkoder should no longer choke on the various multi-byte characters, such as é and ü and ß. (No really, look: email hidden; JavaScript is required!)

As usual, updates are available from the WordPress plugin directory or from your dashboard.

LLVM 3.1, Haskell 7.4.1, and OS X

Haskell on OS X can be a little frustrating, what with the 32-bit/64-bit divide. I spent a little bit trying to get the latest 32-bit Haskell platform to work with LLVM 3.1, via the existing bindings. There were a few tricks, which I reproduce here for posterity.

First, here’s how I configured LLVM:

./configure --enable-optimized --enable-jit --with-ocaml-libdir=$GODI_PATH/lib/ocaml/std-lib/
make UNIVERSAL=yes UNIVERSAL_ARCH="i386 x86_64"
sudo make UNIVERSAL=yes UNIVERSAL_ARCH="i386 x86_64" install

Then, get clone the Git HEAD of the bindings. The llvm-base package is in the base/ subdirectory, and you need to build it first. If the configure script fails because it can’t find LLVMModuleCreateWithName (even though it’s obviously there in the library), the problem is that LLVM didn’t build with 32-bit bindings. Go back and build LLVM with the UNIVERSAL and UNIVERSAL_ARCH flags. Beyond this, there is a tiny wrinkle. Open up base/cbits/extra.cpp, and go to line 390; change error.Print to error.print. Now you should be able to run cabal install from the base directory; when that’s successful, go up one level and cabal install will give you the LLVM bindings.

I should warn you: something isn’t perfect here. The examples using the JIT didn’t work for me (I get a bus error when I try to call the Haskell-ized, JITed functions), but I was able to generate real code:

module Hello (main) where

import Data.Word

import LLVM.Core

llvmModule :: TFunction (IO Word32)
llvmModule =
  withStringNul "Hello, world!" $ \s -> do
    puts <- newNamedFunction ExternalLinkage "puts" :: TFunction (Ptr Word8 -> IO Word32)
    main <- newNamedFunction ExternalLinkage "main" :: TFunction (IO Word32)
    defineFunction main $ do
      tmp <- getElementPtr0 s (0::Word32, ())
      _ <- call puts tmp
      ret (0::Word32)
    return main


main :: IO ()
main = do
  m <- newNamedModule "hello"
  hello <- defineModule m llvmModule
  dumpValue hello
  writeBitcodeToFile "hello.bc" m

Then you can compile or interpret the bitcode, as usual:

$ ghc -o hello Hello.hs -main-is Hello.main
[1 of 1] Compiling Hello            ( Hello.hs, Hello.o )
Linking hello ...

define i32 @main() {
_L1:
  %0 = call i32 @puts(i8* getelementptr inbounds ([14 x i8]* @_str1, i32 0, i32 0))
  ret i32 0
}

$ ./hello
$ lli hello.bc
Hello, world!

I must admit---it took me some time to grok the LLVM bindings. Typeclass fanciness is just dandy when you're the one who did it, but Haskell's error messages aren't an easy way to figure out how something wants to be used. Then again, they call it the bleeding edge for a reason.

QuickRedex

Following Robby Findler’s excellent presentation on PLT Redex at POPL (check out the paper and presentation!), I hacked up something similar in Haskell. Naturally, it’s all manual, and there’s none of the publication or visualization support, but the essence is there. Here’s the code for the untyped lambda calculus:

module Untyped where

import Control.Monad
import Data.Maybe
import Test.QuickCheck

data Expr =
    Var Int
  | Lambda Expr
  | App Expr Expr

showExpr :: Int -> Expr -> String
showExpr _b (Var n) = "var" ++ show n
showExpr b (Lambda e) = "lambda. " ++ showExpr (b+1) e
showExpr b (App e1 e2) = "(" ++ showExpr b e1 ++ " " ++ showExpr b e2 ++ ")"

instance Show Expr where
  show e = showExpr 0 e

size :: Expr -> Int
size (Var _) = 1
size (Lambda e) = 1 + size e
size (App e1 e2) = size e1 + size e2

wellLexed :: Expr -> Bool
wellLexed = wellLexedAux 0
  where wellLexedAux :: Int -> Expr -> Bool
        wellLexedAux b (Var n) = 0 <= n && n < b
        wellLexedAux b (Lambda e) = wellLexedAux (b+1) e
        wellLexedAux b (App e1 e2) = wellLexedAux b e1 && wellLexedAux b e2

arbitraryExpr :: Int -> Int -> Gen Expr
arbitraryExpr n 0 = oneof [return $ Lambda $ Var 0] -- base case
arbitraryExpr n max = do
  oneof $ 
    (if n < max 
     then [arbitraryExpr (n+1) (max-1) >>= return . Lambda]
     else []) ++ 
    (if n > 0
     then [choose (0,n-1) >>= return . Var]
     else []) ++
    [do { e1 <- arbitraryExpr n (max `div` 2);
          e2 <- arbitraryExpr n (max `div` 2);
          return $ App e1 e2 }]

instance Arbitrary Expr where
  arbitrary = sized $ \max -> arbitraryExpr 0 max

-- All the terms we generate should be well-lexed
prop_WellLexed e = collect (size e) $ wellLexed e

subst :: Expr -> Int -> Expr -> Expr
subst e n (Var n') 
  | n == n' = e
  | otherwise = (Var n')
subst e n (Lambda e') = Lambda $ subst e (n+1) e'
subst e n (App e1 e2) = App (subst e n e1) (subst e n e2)

shift :: Int -> Expr -> Expr
shift i (Var n) = if n < i then Var n else Var (n-1)
shift i (Lambda e) = Lambda $ shift (i+1) e
shift i (App e1 e2) = App (shift i e1) (shift i e2)

value :: Expr -> Bool
value (Lambda e) = True
value _ = False

-- we can run this with m = Maybe or m = List
step :: MonadPlus m => Expr -> m Expr
step (Var _) = mzero -- unbound variable
step (Lambda e) = mzero -- found a term
step (App e1@(Lambda e11) e2) 
  | value e2 = return $ shift 0 $ subst e2 0 e11
  | otherwise = do
    e2' <- step e2
    return $ App e1 e2'
step (App e1 e2) = do
  e1' <- step e1
  return $ App e1' e2

-- if we can step, we'd better preserve scope
prop_StepWellLexed e = isJust next ==> wellLexed (fromJust next)
  where next = step e
        
-- verify progress
prop_Progress e = (classify isValue "value") $ (classify (not isValue) "step") $ value e || isJust (step e)
  where isValue = value e

-- use the List monad to ensure determinacy
prop_Deterministic e = nextStates > 0 ==> nextStates == 1
  where nextStates = length $ step e

One of the trickiest things here was making sure I was generating well lexed lambda terms that were small enough to be tractable. It would have been even harder, I think, with a more explicit representation of variables or binding. Thoughts, variations? Or, perhaps, complaints that I should have done this with GADTS or in Coq?