Processing Semi-Structured Data in the Unix Shell

The Unix shell is incredibly powerful. I use it routinely for simple tasks (moving files around), routine work (grading scripts), and in my development process (building, deploying, etc.). When I’m working with text, the shell and its ecosystem is excellent: patching together cat, find, grep, sed, tr, and cut with shell pipelines and redirections is a convenient, expressive, and fast way to inspect and edit files.

But my shell toolchain is much less helpful when working with semi-structured data, like JSON and YAML. Folks have made wonderful contributions to the shell ecosystem to help—tools like jq and gron. These two tools provide new languages for manipulating JSON. It may be embarrassing to admit for a programming languages researcher, but… I’m kind of maxed out on new languages.

So I built a new tool that lets you use your usual shell tools to work with modern file formats: ffs, the file filesystem.

A GIF showing the following shell interaction, editing JSON in place.

~/ffs/demo $ echo '{}' >demo.json
~/ffs/demo $ ffs -i demo.json &
[1] 56827
~/ffs/demo $ cd demo
~/ffs/demo/demo $ echo 47 >favorite_number
~/ffs/demo/demo $ mkdir likes
~/ffs/demo/demo $ echo true >likes/dogs
~/ffs/demo/demo $ echo false >likes/cats
~/ffs/demo/demo $ touch mistakes
~/ffs/demo/demo $ echo Michael Greenberg >name
~/ffs/demo/demo $ echo >website
~/ffs/demo/demo $ cd ..
~/ffs/demo $ umount demo
~/ffs/demo $ 
[1]+  Done                    ffs -i demo.json
~/ffs/demo $ cat demo.json 
{"favorite_number":47,"likes":{"cats":false,"dogs":true},"mistakes":null,"name":"Michael Greenberg","website":""}~/ffs/demo $ 
~/ffs/demo $
Editing JSON in place using ffs.

ffs lets you mount semi-structured data as a filesystem: objects and lists correspond to directories, while other types correspond to regular files. You can mount a file in one format, edit the filesystem, and write it back in another.

All you need to run ffs is FUSE, a kernel module that supports userspace filesystem. You’ll want libfuse on Linux, or macFUSE on macOS. Download a binary and play around!

New paper: Word expansion supports POSIX shell interactivity

I’ve been thinking about and working on the POSIX shell for a little bit over a year now. I wrote a paper for OBT 2017, titled Understanding the POSIX Shell as a Programming Language, outlining why I think the shell is worthy of study.

For some time I’ve had the conviction that word expansion—the process that includes globbing with * but also things like command substitution with backticks—is somehow central to the shell’s interactivity. I’m pleased to have finally expressed my conviction in more detail: Word expansion supports POSIX shell interactivity will appear at PX 2018. Here’s the abstract:

The POSIX shell is the standard tool to deploy, control, and maintain systems of all kinds; the shell is used on a sliding scale from one-off commands in an interactive mode all the way to complex scripts managing, e.g., system boot sequences. For all of its utility, the POSIX shell is feared and maligned as a programming language: the shell is feared because of its incredible power, where a single command can destroy not just local but also remote systems; the shell is maligned because its semantics are non-standard, using word expansion where other languages would use evaluation.

I conjecture that word expansion is in fact an essential piece of the POSIX shell’s interactivity; word expansion is well adapted to the shell’s use cases and contributes critically to the shell’s interactive feel.

See you in Nice?