LLVM 3.1, Haskell 7.4.1, and OS X

Haskell on OS X can be a little frustrating, what with the 32-bit/64-bit divide. I spent a little while trying to get the latest 32-bit Haskell Platform to work with LLVM 3.1 via the existing bindings. There were a few tricks, which I reproduce here for posterity.

First, here’s how I configured LLVM:

./configure --enable-optimized --enable-jit --with-ocaml-libdir=$GODI_PATH/lib/ocaml/std-lib/
make UNIVERSAL=yes UNIVERSAL_ARCH="i386 x86_64"
sudo make UNIVERSAL=yes UNIVERSAL_ARCH="i386 x86_64" install

Then, clone the Git HEAD of the bindings. The llvm-base package is in the base/ subdirectory, and you need to build it first. If the configure script fails because it can’t find LLVMModuleCreateWithName (even though it’s obviously there in the library), the problem is that LLVM wasn’t built with 32-bit support. Go back and build LLVM with the UNIVERSAL and UNIVERSAL_ARCH flags shown above. Beyond this, there is one tiny wrinkle: open up base/cbits/extra.cpp, go to line 390, and change error.Print to error.print. Now you should be able to run cabal install from the base/ directory; when that succeeds, go up one level, and running cabal install there will give you the LLVM bindings.
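The one-line fix to extra.cpp can be sketched as a small shell helper. This is only a sketch: the name patch_print is made up for illustration, the line number is the one quoted above and may differ in other revisions of the bindings, and the library path in the comment assumes LLVM's default /usr/local install prefix.

```shell
# patch_print FILE LINE (hypothetical helper): rename error.Print to
# error.print on the given line only, leaving the rest of FILE untouched.
patch_print() {
  file="$1"
  line="$2"
  # Write to a temporary file and move it into place, which avoids the
  # differences between BSD and GNU "sed -i".
  sed "${line}s/error\.Print/error.print/" "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"
}

# Usage, from the root of the bindings checkout:
#   patch_print base/cbits/extra.cpp 390
#
# To see which architectures an installed LLVM library contains
# (path assumes the default /usr/local prefix):
#   lipo -info /usr/local/lib/libLLVMCore.a
```

If i386 is missing from the lipo output, that would be consistent with the configure failure described above.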

I should warn you: something isn’t perfect here. The examples using the JIT didn’t work for me (I get a bus error when I try to call the Haskell-ized, JITed functions), but I was able to generate real code:

module Hello (main) where

import Data.Word

import LLVM.Core

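llvmModule :: TFunction (IO Word32)
llvmModule =
  -- Emit "Hello, world!" as a nul-terminated global string constant.
  withStringNul "Hello, world!" $ \s -> do
    -- Declared but never defined, so puts stays an external symbol.
    puts <- newNamedFunction ExternalLinkage "puts" :: TFunction (Ptr Word8 -> IO Word32)
    main <- newNamedFunction ExternalLinkage "main" :: TFunction (IO Word32)
    defineFunction main $ do
      -- Pointer to the first character of the string.
      tmp <- getElementPtr0 s (0::Word32, ())
      _ <- call puts tmp
      ret (0::Word32)
    return main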

main :: IO ()
main = do
  m <- newNamedModule "hello"
  hello <- defineModule m llvmModule
  dumpValue hello
  writeBitcodeToFile "hello.bc" m

Then you can compile or interpret the bitcode, as usual:

$ ghc -o hello Hello.hs -main-is Hello.main
[1 of 1] Compiling Hello            ( Hello.hs, Hello.o )
Linking hello ...
$ ./hello

define i32 @main() {
  %0 = call i32 @puts(i8* getelementptr inbounds ([14 x i8]* @_str1, i32 0, i32 0))
  ret i32 0
}

$ lli hello.bc
Hello, world!
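
The transcript above runs the bitcode only through the interpreter, lli. For the compiling route, one possible sketch is to lower the bitcode to assembly with llc and let the system C compiler assemble and link it. The function name compile_bitcode and the output names are made up for illustration, and this assumes llc and cc are on your PATH and agree on the target architecture (with a universal build you may need to pass matching architecture flags to both).

```shell
# compile_bitcode BITCODE OUTPUT (hypothetical helper): turn an LLVM bitcode
# file into a native executable.
compile_bitcode() {
  bc="$1"
  out="$2"
  # Bail out early if either tool is missing.
  command -v llc >/dev/null 2>&1 || { echo "llc not found" >&2; return 1; }
  command -v cc >/dev/null 2>&1 || { echo "cc not found" >&2; return 1; }
  # llc lowers the bitcode to native assembly.
  llc "$bc" -o "$out.s" || return 1
  # cc assembles it and links against libc, which provides puts.
  cc "$out.s" -o "$out"
}

# Usage:
#   compile_bitcode hello.bc hello-native && ./hello-native
```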

I must admit, it took me some time to grok the LLVM bindings. Typeclass fanciness is just dandy when you're the one who did it, but Haskell's error messages aren't an easy way to figure out how something wants to be used. Then again, they call it the bleeding edge for a reason.


  1. Hi, I’m currently trying to get Haskell and LLVM playing nicely as well. I assume you compiled LLVM from source? And what version of OS X are you using?

    1. Hi Tom, I’m running OS X 10.6.8, GHC 7.4.1, and LLVM 3.1. (I haven’t tried using LLVM 3.2 yet.) I’ve heard 10.7 has issues with GHC in particular and the UNIX toolchain in general, but I don’t know much more…

  2. Hello, nice post. I don’t understand how newNamedFunction works. On one hand, I do “understand” that it works when it is followed by defineFunction. That makes sense to me. But when you don’t define it, my guess is that it is looked up from somewhere. Do you know where? Is there a list of functions that can be used, or can I write a config file and specify the function and where to find it?

  3. So puts, in this example, is going to have its definition filled in by the linker because it’s part of standard libc. I haven’t tried this experiment, but I wouldn’t be surprised if changing the linkage type to PrivateLinkage broke the program.

    I haven’t played around with linking in extra things when building programs with LLVM, but I’m pretty sure the way to do it is with llvm-ld.

  4. Ahh. I figured it out mostly. LLVM has an associated LLVM-Base — one of the modules is called FFI.Core, and this defines addFunction — which uses Haskell’s FFI to pull in a C function by name. When you call newNamedFunction, the default is to assume that the name of the function corresponds to a C function which is on the FFI interface. Unless the definition is redirected through e.g. defineFunction, this is what will happen. So if you give it nonsense, you will get a runtime error on lli. I guess this leaves as a question what possible functions are available through the FFI, and so on.

  5. Thank you for the reply. Do you by chance know why the example

    bldGreet :: CodeGenModule (Function (IO ()))
    bldGreet = do
    puts IO Word32)
    greetz <- createStringNul "Hello, World!"
    func <- createFunction ExternalLinkage $ do
    tmp <- getElementPtr greetz (0::Word32, (0::Word32, ()))
    call puts tmp -- Throw away return value.
    ret ()
    return func

    from the blog does not work, or what the ambiguity is?

  6. No idea without the error message.

    The code as pasted isn’t correctly formatted: there’s a stray paren at the end of your puts call. The typeclasses in the Haskell binding for GEP calls may be funny—you may have more luck with getElementPtr0. You may need a thunk after your call to createFunction.

  7. I copied the code literally from http://augustss.blogspot.ca/2009/01/llvm-llvm-low-level-virtual-machine-is.html
    I tried formatting in this textbox, but it didn’t work. I’ll try again.

    bldGreet :: CodeGenModule (Function (IO ()))
    bldGreet = do
    puts IO Word32)
    greetz <- createStringNul "Hello, World!"
    func <- createFunction ExternalLinkage $ do
    tmp <- getElementPtr0 greetz (0::Word32, ())
    call puts tmp -- Throw away return value.
    ret ()
    return func

    The error message is
    "Ambiguous type variable `n0' in the constraint: (type-level-0.2.4:Data.TypeLevel.Num.Sets.NatI n0) arising from a use of `getElementPtr0' Probable fix: add a type signature that fixes these type variable(s)"

    If you can help at all — I will be very grateful. The documentation for the Haskell binding is not quite there, so I am really struggling through this.

  8. I think pre tags should work. I’m not surprised that blog entry’s code doesn’t work: it’s 4 years old. I’m not sure what’s wrong with it. I remember having some trouble with createStringNul—I use withStringNul. That, the unnamed function, and the missing ignore on the call are the only differences I can see with my code. Maybe start with mine and translate into theirs?

    In general, my code uses the FFI bindings as much as possible: the library isn’t completely finished, so not everything hangs together, and things are often brittle. Frankly, the typeclasses just get in the way—and are a huge negative when you’re writing a compiler. To make LLVM calls that meet the GADT constraints, your IR has to be at least as constrained: no thanks.

  9. Thanks for the post. I tried this with LLVM 3.3 on OS X 10.7.5 with GHC 7.4.2, but didn’t succeed :(. It successfully finds LLVMModuleCreateWithName, so installing llvm-base and llvm was no problem. However, when I try to compile Hello.hs, I get a lot of linker errors: http://pastebin.com/PrBdp6xA.

    I guess I’ll try LLVM 3.1, hopefully it is not a problem in GHC/OSX 10.7…

    1. Have you successfully compiled other C++ code on your machine? It looks like g++ is having trouble finding the STL, so I think you may need to install the Xcode command line tools. Just a guess, though…

  10. My default compiler for c++ is clang++, but g++ also works fine.

    I compiled llvm/clang with clang++ and used this new clang to compile llvm-base; maybe that is the problem. To use the new clang, I also needed libc++ (there were some errors, which were only found by the newer compiler). Maybe I should compile llvm-base with g++…
