Renato Athaydes Personal Website

Sharing knowledge for a better world

Revenge of Lisp (Part 12)

Learning Common Lisp to beat Java and Rust on a phone encoding problem
Written on 30 Sep 2021, 09:21 PM
Alien World

Background vector created by upklyak - www.freepik.com

Lisp Alien Mascot

This may surprise you if you know me, but I’ve been learning Common Lisp for a few weeks now.

It all started when I was reading, funnily enough, a blog post about another, much more hyped, language called Julia. The post was titled Julia and the reincarnation of Lisp, and in it the author lamented that despite his love for Common Lisp, it’d never become popular, so he was now trying Julia instead as it may become mainstream soon.

Anyway, what called my attention the most was not Julia (I’ve seen this promise of a new language that is as fast as C and as easy as Python a few too many times to believe it), but all the Lisp prasing, including by taking a jab at poor Java, of all things:

"The industry has a lot of code in Java even when it takes much less time to write code in Lisp."

I wondered how they could be so certain of that (and why they thought they needed to create something else anyway if Lisp was so great). As someone who’s written a lot of Java code, I would love to use a language that takes much less time write things in… so, I followed the linked article, which led me to discovering a few papers written in the late 1990’s and early 2000’s comparing Java with Common Lisp and other popular languages of the time. That made me think that, 20 years later, it would be a good idea to revisit the topic and compare Lisp with modern Java (those papers used Java 1 and 1.2!) and other newer languages like Rust to see whether the results of the old papers (which generally lauded Lisp while slamming Java) still held today.

I ended up slightly obsessed with doing that for the last few months and wrote a few articles about what I found out, including Revising Prechelt’s paper… and How to write really slow Rust (a result of my initially awkward attempt at porting the Lisp code to Rust), followed by Part 2, where I analysed in more detail why the Rust implementation was so bad, and finally, How to write fast Rust code, a summary of what I learned optimising the Rust code (pushed by a crowd of Rust people who didn’t like seeing Rust running slow on my benchmarks).

All the while, one thing had stuck in the back of my mind: the Common Lisp code was the shortest by a good margin, and its performance was just excellent, even with the clean, completely non-optimised code which was written by Peter Norvig in a couple of hours!

For this reason, I just couldn’t keep myself from learning more about the language and trying to find out just how far we can actually take it in terms of performance.

Because I got a bit over-excited with all the fascinating things I came across, I decided to break up this post into two.

In this first part, I quickly introduce Common Lisp and the things that I found interesting or weird about it, then describe the phone encoding problem used in the the benchmarks, as well as the Common Lisp solution written by Norvig.

In Part 2, I will describe the benchmark improvements I made for this blog post and then show the effect of each optimisation that was either suggested to me by readers of my previous blog posts or that I came up with myself based on both my experience with optimising Rust and Java, and by reading books and article about Lisp.

Can Common Lisp beat, or at least approach, Java and Rust in terms of performance? Is it still an attractive language to use even in 2021, nearly 40 years after it was first described, and 27 years after it was finally standardized? Read on to find out!

Common Lisp - a brief history and introduction for non-Lispers

Common Lisp, henceforth CL, is actually an ANSI standard. It does not have only a single compiler implementation, far from it, there are lots of implementations, both free and commercial.

In this post, I am going to use the most prominent and performant (I believe) free implementation today, SBCL.

Lisp is a big family of languages with a long tradition. The first Lisp was described already in 1958 by John McCarthy. Today, there are arguably three main languages in the Lisp family tree: Common Lisp, the result of an unification effort in 1994 which resulted in the Common Lisp’s ANSI standard… Scheme, which today has spawned many dialects such as Racket and GNU Guile (most notably used in GNU Guix), and Clojure, which was created by Rich Hickey based on his frustrations with both Java and the existing Lisp implementations at the time.

Even though Clojure might be a more obvious choice for someone, like me, who is used to working with the JVM, I actually wanted to try using something else with a lighter runtime and great performance.

Choosing an editor

My first challenge before getting started with CL was to choose an IDE to use! Unfortunately, there’s no good JetBrains IDE for CL (I couldn’t find even a basic plugin for IntelliJ) so my first choice was not feasible. After asking on the Lisp Reddit about that, I got an unanimous answer: just use Emacs or one of its more complete distributions like Spacemacs.

I decided to try simple Emacs because I like to learn things from the very basics and it seemed that other distributions are not much more than amalgamations of several Emacs packages.

If you like VS Code, VIM, Sublime or Atom instead of Emacs, they also have support for Common Lisp! Check all of the available editors at the Lisp Cookbook - Editor Support page.

Learning Emacs is something that deserves another blog post one day, but suffice it to say that I spent at least a few weeks in the rabbit hole that the world of Emacs turned out to be… the whole application, while sometimes compared to a complete Operating System because of the huge variety of packages it supports to do virtually anything, is basically a Lisp environment with a user interface on top. It uses a dialect of Lisp referred to as elisp - or Emacs Lisp - which is fairly close to CL. This has the advantage that by learning Emacs, you also have to learn some Lisp!

With basic Emacs knowledge at hand and a nice configuration for Lisp coding, I fired up SLIME (one of the packages that turn Emacs into a CL IDE) and was finally ready to play with CL code the way it’s meant to be done.

First impressions

Common Lisp code is initially pretty hard to read, in my opinion, for several reasons. One of which is that the names of the functions are not what you would expect if you know more modern languages. Another is that macros can introduce variables (actually, custom syntax) seemingly out of nowhere, something we’ll get back to later.

For example, here is a short snippet from Norvig’s solution to the phone-encoding problem:

(when (and (not found-word) (not (numberp (first words))))
    (print-translations num digits (+ start 1)
        (cons (nth-digit digits start) words))))))

numberp and cons didn’t really look obvious to me. when might be easier to guess, but it’s unclear what happens in case the condition is false (or as we’ll see, nil in CL).

After some digging, numberp is named as such because it’s a predicate and CL convention is to end the names of predicates with p (other examples: alpha-char-p, stringp, numberp). But not all predicates follow the convention: to check if something is an atom you use atom, to check for nil (more about that soon), you use the function null, confusingly for a Java developer.

cons is a strangely named function, but should be familiar to anyone who has ever read anything about Lisp: it builds, or _cons_tructs, lists, the basic data structure on which all Lisps are built, including Lisp source code itself, giving Lisp its fabled homoiconicity property.

The other basic list functions are car, cdr and cadr which return the first, rest and second items of a list, respectively. In fact the latter set of functions are more intuitively named aliases for the former.

Here’s a basic example of how cons works from a short SLIME REPL section:

As in most Lisps, CL allows us to quote a list with the ' symbol, so above, '(2 3) is a quoted list, meaning the list is not evaluated. Normally, lists (which form s-expressions in Lisp source code) would be evaluated with the first item being either a function or a macro. So, (+ 1 2) is evaluated immediately by calling the function +, and results in the value 3, but '(+ 1 2) is simply a list containing the symbol + followed by numbers 1 and 2.

The curious usage of nil in CL

Moving on, something I found odd was the role of nil in CL. Instead of being just a null pointer as with most languages, in CL nil plays the role of false and of the empty list. It may sound like a very bad choice, but the whole language is actually built around lists, remember, so that has some interesting benefits!

You rarely get in trouble by passing or returning nil to/from functions as most other functions expect you will do just that!

For example, (length nil) evaluates to 0. (first nil) is the same as (first '()) and evaluates to nil. (alpha-char-p #\$), where #\$ is the char $, returns nil because $ is not alphabetic.

Now, it should be obvious what when evaluates to when its condition is false (I mean, nil): nil, of course.

format and CL truth

What you may expect to be called true is just t in CL, by the way. But while nil is always used to represent false, anything that’s not nil is actually true when it comes to conditionals… t, or T (yes, CL is mostly case-insensitive, in case you haven’t noticed) is just a convenient symbol to use for that.

For example, when using cond, it’s handy to have a condition that always matches anything, and t is perfect for that:

(let ((n (random 10)))
  (cond
    ((= n 1) 'one)
    ((= n 2) 'two)
    ((= n 3) 'three)
    (t 'other-number)))

Sometimes, T is also used as a special config value. For example, format can be used to format a String (similar to printf), but if the first argument is T, it prints the string to stdout instead.

Knowing that, we can look at the common CL idiom to print something, say "Hello " + name (where name is an existing variable):

(format t "Hello ~a~%" name)

Yeah, I know… it’s starting to get really weird!

Let’s look at the docs (as displayed in Emacs) to understand how format works:

There’s a lot to unpack above, but the important things to notice are the function signature (called Declared type), the example directives and the values expected for DESTINATION.

Notice how ~a is a placeholder for an argument and ~% for new-line, somewhat analogous to %s, %n patterns from printf in C-like languages.

Seeing that type declaration for the first time surprised me because I’d assumed that CL is a intrinsically dynamic language, but as you can see above, it seems to have an intricate type system capable of representing union types! As we’ll see later, we can declare types for everything in our own code too, and the compiler can check them at compile time, where possible, or at runtime when necessary. It can also emit much more efficient native code when you do that!

Notice also that, interestingly, functions can return many values, not just one, hence the VALUES in the return type.

Another little REPL section may clarify a few things:

Multiple return values are used a lot in CL. For example, when we try to retrieve a value from a hash-table by its key, we get two values: the actual value associated with the key, and a boolean that’s T if the table contained the key, nil otherwise.

To learn more about multiple values, have a look at Constructs for handling multiple values.

Data structures

That brings us to data structures in CL.

Besides the ubiquitous list, CL also has many other built-in data structures, including n-dimensional, resizable array (incl. vector, string, bit-vector), hash-table, stream and struct.

Instead of having different functions for reading and writing into data structures, CL has a nice trick instead: it works by combining the function that reads with setf for writing.

setf can be used to mutate almost anything, though for many common operations there are other forms that may be more convenient to use (e.g. push to add an item to a list).

Check this out:

It’s worth mentioning here the CLOS (Common Lisp Object System), which is (according to the linked Lisp Cookbook), arguably one of the most powerful object systems available in any language. I highly recommend learning about it as it might completely change your view of OOP. Also, CLOS concepts show up in the CL documentation, so it helps knowing a little bit about it just to understand the docs better.

CL equality

In CL, equality falls on a continuum. There’s very strong equality (the identity of the “things” is the same) with eq, and lax equality (the “things” look the same) with equalp.

In between, there are eql and equal. There are also a few other type-specific predicates like = (only works for numbers), string=, char-equal, tree-equal and probably more!

This article by Eli Bendersky explains well the main equality functions.

For the purposes of this article, the important thing is that a data structure like hash-table, which needs to know about the concept of equality to do its job, lets the user specify which predicate to use for equality.

The programmer must be careful to select the most appropriate function for each particular use case (from a performance’s point-of-view, the stricter the equality function, the better).

The function make-hash-table accepts a :TEST keyword (we’ll see in the next section what keywords mean in CL) which determines how keys are compared. While the default, eql, works well for numbers and characters, for strings one must use equal, or even equalp for case-insensitive comparison.

Hence, to create a string-keyed, case-insensitive hash-table, for example, one can do this:

(make-hash-table :test 'equalp)

Macros and functions

Finally, as I mentioned earlier, CL, like most Lisps, has very powerful macros. As explained by Paul Graham in On Lisp:

... a macro is essentially a function that generates Lisp code β€” a program that writes programs.

But because macros are not marked specially in CL, they make code difficult to read sometimes. As an example, some macros introduce new bindings, and it’s impossible to know that if you don’t know the syntax of a macro (macros are so powerful they can become full-on DSLs, as we’ll see when we discuss loop). Below, what’s the in argument?

(defun load-dictionary (file size)
    ...
    (with-open-file (in file)
        ... )

Well, it’s NOT an argument, it’s a new binding. This is basically saying, open the file at path file and bind in to the file stream.

This is kind of analogous to the let special operator (besides functions and macros, CL also has a few special forms from which all other constructs are built) whose purpose is to introduce one or more local bindings:

(let ((a 1)
      (b 2))
  (+ a b))

Functions are usually defined using a macro, defun:

(defun f (x)
    (* x 2))

Above, defun is introducing a function called f, and (x) is the arguments list. What comes after is the body of the function, which in this case multiplies the argument by 2.

Another way to create a function is with lambda. That’s usually done when an anynoymous function is required, as when using the mapping functions:

Note that to pass a reference to a function my-fun, the notation #'my-fun is used. A reference to a variable my-var would be just 'my-var, which means CL has separate namespaces for variables and functions. This is what differentiates a Lisp-1, like Scheme (i.e. 1 namespace for everything), from a Lisp-n, like CL. Which kind of Lisp is “best” is the topic of a somewhat contentious debate in Lisp communities.

We can create our own, weirder macro to define functions too! Below, we define a macro that just calls defun, but with the argument list reversed, because why not!?

The special ` backquote character creates templates for Lisp expressions, whereas , comma is used to make a slot within that template. In other words, ` quotes, , unquotes (so can only be used within a quoted expression).

Functions can have variadic and optional parameters, as well as keyword parameters, which are like named parameters. Parameters can even have default values (I am genuinely surprised CL has had this for literally decades!).

With this, we have everything we need to understand Peter Norvig’s code for the Prechelt’s phone encoding algorithm.

If you want to learn more Lisp, check these excellent (and free!) resources:

The phone encoding problem

The problem that was discussed in the papers that motivated this series of blog posts is a phone encoding problem where long phone numbers are encoded into words so that people can remember them more easily.

It uses a very simple scheme where characters (a-z, A-Z allowed) are translated into digits. We can seee the mapping in the following Lisp function, from Norvig’s solution:

(defun char->digit (ch)
  "Convert a character to a digit according to the phone number rules."
  (ecase (char-downcase ch)
    ((#\e) 0)
    ((#\j #\n #\q) 1)
    ((#\r #\w #\x) 2)
    ((#\d #\s #\y) 3)
    ((#\f #\t) 4)
    ((#\a #\m) 5)
    ((#\c #\i #\v) 6)
    ((#\b #\k #\u) 7)
    ((#\l #\o #\p) 8)
    ((#\g #\h #\z) 9)))

For example, the word Hello can be used to represent 90888, and world can represent 28283, hence phone number 90888-28283 can be encoded as Hello world!

Before/after each word, a digit can be used without being encoded, but only in case there’s no word that could be used at that position.

If you would like to try to implement this algorithm yourself, feel free to follow the more comprehensive instructions from the Lisp study.

My GitHub repository renatoathaydes/prechelt-phone-number-encoding contains solutions in CL, Java, Rust and a few other languages. It also has a benchmark runner you can use to try your own implementation. Check the README page for details, and the many branches for alternative implementations for each language.

Understanding Norvig’s implementation

The initial implementation of the phone encoding algorithm by Peter Norvig is written in a very high level, straightforward manner.

The valid words (the dictionary) are loaded into a global variable named *dict* as follows:

It seems that using names surrouned by *s is a CL convention for global variables. Constants, defined with defconstant, use +s instead, so might be named, for example, +my-const+.

(defvar *dict* nil
  "A hash table mapping a phone number (integer) to a list of words from the
  input dictionary that produce that number.")

(defun load-dictionary (file size)
  "Create a hashtable from the file of words (one per line).  Takes a hint
  for the initial hashtable size.  Each key is the phone number for a word;
  each value is a list of words with that phone number."
  (let ((table (make-hash-table :test #'eql :size size)))
    (with-open-file (in file)
      (loop for word = (read-line in nil) while word do
        (push word (gethash (word->number word) table))))
    table))

Hopefully, the previous section was instructive enough for the above code to look understandable.

The inner loop of the algorithm (inside the print-translations function), where it goes through each character in the current words being analysed, looks like this:

(loop for i from start below (length digits) do
  (setf n (+ (* 10 n) (nth-digit digits i)))
  (loop for word in (gethash n *dict*) do
    (setf found-word t)
    (print-translations num digits (+ 1 i) (cons word words))))

The loop macro, used above, is an amazing Common Lisp DSL for iterating over collections in more ways than you probably ever thought possible! I recommend having a read of Loop for Black Belts to understand how crazy (or beautiful?) it is. There are even people writing research papers and giving conference talks about it.

The word->number function, which encodes a word into a number, duplicates the same algorithm shown above to compute a number n for the current word.

Notice that n starts with value 1 because, as Norvig mentions in the comments:

... the obvious way of mapping strings to integers would map R to 2 and ER to 02, which of course is the same integer as 2. Therefore we prepend a 1 to every number, and R becomes 12 and ER becomes 102.

The nth-digit function is defined as follows:

(defun nth-digit (digits i)
  "The i-th element of a character string of digits, as an integer 0 to 9."
  (- (char-code (char digits i)) #.(char-code #\0)))

#. is a reader macro which evaluates its expression at read-time (i.e. before the expression is even compiled!), so that #.(char-code #\0) expression gets replaced with 48 in the code that actually executes - but using 48 wouldn’t be nearly as readable.

And that’s all!

Pretty simple solution, I would say, but perhaps wasteful?! The key n is computed by using arithmetic. As each phone number has an upper limit of 50 characters (that’s part of the problem description), that requires big-numbers, which is what killed the performance of the Rust program in my previous blog posts.

Also, it uses cons to append a new word and recurse instead of mutating the list in-place, an optimisation that sped up the Rust code by over 15%.

Could that make the Lisp code faster? We’ll find out later, but let’s see first how this implementation fared against my own Java implementation and a Rust port (admitedly implemented without much concern for performance) on my initial blog post:

comparison between Lisp, Java, Rust implementations

That’s right, it was the fastest implementation for all input sizes except the largest one, where it performed just slightly worse than my best Java implementation. I write mostly Java professionally, so it’s reasonable to assume the Java code is fairly representative of a proper Java implementation, but the Rust one was a beginner’s implementation.

After lots of Rust programmers insisted that I make changes to the Rust program, as I mentioned before, I spent weeks, and 3 blog posts, getting Rust to run faster! A little effort was also spent on making similar changes to the Java program as I changed Rust… but CL was left completely unchanged, and unsurprisingly CL fell behind.

In the next post, I will finally fix that by throwing every weapon CL (SBCL in particular) makes available to programmers to try and make the CL program run as fast as possible and, hopefully, as fast or faster than highly optimised Java and, one can dream, Rust!

Continue to Part 2!

πŸ‘ΎπŸ‘½πŸ›Έ