Renato Athaydes Personal Website

I share, therefore I am

Running WebAssembly on the JVM

... and a look at what's going on around WASM!
Written on 24 Nov 2019, 06:40 PM


WebAssembly (or WASM) is the hot, new technology that aims to allow more efficient programming languages to run inside a browser at near-native speeds.

In summary, it is a low-level instruction format that’s both fast to load and to execute, perfect for handling demanding content in a web browser that JavaScript would have more difficulty handling.

WebAssembly has two formats: binary (the default) and a human-readable textual representation.

In fact, it’s possible to write WASM by hand using the textual format.

Even though the language is stack-based (like Forth), it has some “syntax sugar” that allows for nesting instructions, making code look much more familiar to developers used to more conventional languages.

For example, the following code performs the equivalent of 2 + 3:

i32.const 2
i32.const 3
i32.add

This can also be written as:

(i32.add
    (i32.const 2)
    (i32.const 3)
)

Not very different from Lisp (except for the explicit types everywhere):

(add 2 3)

The similarity between WAT (WASM Text) and Lisp is not just a coincidence! Like Lisp, WAT is based on S-expressions, which are very easy to parse, something that’s a goal of WASM.
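To give an idea of just how easy S-expressions are to parse, here’s a minimal sketch in Java that unfolds the nested form into the flat, stack-based form and then runs it. The FoldedWat class and its methods are hypothetical names made up for this illustration, and it only understands i32.const (with decimal literals) and i32.add, so it’s nowhere near a real WAT parser.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of any WASM toolchain: it understands only
// i32.const and i32.add written in the folded (nested) WAT form.
final class FoldedWat {

    // Split the source into "(", ")" and atom tokens.
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        String spaced = src.replace("(", " ( ").replace(")", " ) ").trim();
        for (String token : spaced.split("\\s+")) {
            tokens.add(token);
        }
        return tokens;
    }

    // Unfold one nested expression into stack order:
    // operands first, then the instruction that consumes them.
    static List<String> flatten(Deque<String> tokens) {
        List<String> out = new ArrayList<>();
        expect(tokens, "(");
        String op = tokens.pop();
        if (op.equals("i32.const")) {
            out.add(op + " " + tokens.pop());
        } else {
            while ("(".equals(tokens.peek())) {
                out.addAll(flatten(tokens));
            }
            out.add(op);
        }
        expect(tokens, ")");
        return out;
    }

    // Run the flat instruction list on an operand stack.
    static int run(List<String> instructions) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String ins : instructions) {
            if (ins.startsWith("i32.const ")) {
                stack.push(Integer.parseInt(ins.substring("i32.const ".length())));
            } else if (ins.equals("i32.add")) {
                int b = stack.pop();
                int a = stack.pop();
                stack.push(a + b);
            } else {
                throw new IllegalArgumentException("unsupported instruction: " + ins);
            }
        }
        return stack.pop();
    }

    static int eval(String src) {
        return run(flatten(new ArrayDeque<>(tokenize(src))));
    }

    private static void expect(Deque<String> tokens, String want) {
        String got = tokens.poll();
        if (!want.equals(got)) {
            throw new IllegalArgumentException("expected " + want + " but got " + got);
        }
    }
}
```

Calling FoldedWat.eval("(i32.add (i32.const 2) (i32.const 3))") first produces the three flat instructions from the first example, then evaluates them to 5.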

It’s really fun to try to write WASM like this! And more than that, it’s a very good way to learn WASM, which is important for anyone writing compilers targeting it.

But the current compiler toolchain to interact with WASM is quite focused, understandably, on compiling C, C++ and Rust to WASM.

Besides, the “normal” way of running the compiled WASM is by embedding it in an HTML page and loading it in the browser.

So, if you just want to play with the textual format and quickly run the code, things are harder than they should be.

That’s why I decided to find something that could run WASM outside of the browser, and being a JVM developer, if it ran on the JVM, even better!

Doing a little research, I came across the cretz/asmble project, which can run WASM and WAT directly on the JVM by compiling to Java bytecode!

In this blog post, I will show how you can use that, as well as the WASM-on-JVM Gradle Plugin I wrote, to easily play with WASM and interact with it from standard Java and Kotlin code.

But before that, let’s have a quick look at how the WASM toolchain normally works, so we understand the differences.

The usual way to do WASM

The most common way to compile to WASM and run it in the browser is, currently, by using Emscripten.

Emscripten implements the C standard library inside the browser! Even syscalls are emulated, so that many C applications can work (at the cost of generating very large binaries)!

Reminds me of CheerpJ, a full JVM implementation on the browser. It’s not the lightest library in the world, so beware! See my blog post comparing JVM alternatives to JS on the browser to learn more.

If you install it on your machine (make sure you have CMake, git and Python installed first - and get a coffee… or a beer, it takes a long, loooong time), you should be able to compile an existing C or C++ application to WASM.

I say should because I was not able to… after fixing a few minor errors I could make sense of, when I finally tried to compile C code, I got an error I just couldn’t work around (didn’t try very hard, admittedly):

$ emcc hello.c -o hello.html
cache:INFO: generating system asset: is_vanilla.txt... (this will be cached in "/home/renato/.emscripten_cache/is_vanilla.txt" for subsequent builds)
.... lots of things ...

shared:ERROR: '/home/renato/programming/experiments/emsdk/binaryen/master_64bit_binaryen/bin/asm2wasm hello.temp.asm.js 
    --total-memory=16777216 --trap-mode=allow --mem-init=hello.html.mem --mem-base=1024 --wasm-only 
    -o hello.wasm --mvp-features' failed: [Errno 2] No such file or directory

After giving up on Emscripten, I noticed that LLVM can emit WASM directly, so I decided to do just that.

The LLVM C compiler is called clang, and you can compile/link a C file to WASM like this:

clang --target=wasm32 -nostdlib -Wl,--no-entry -Wl,--export-all -o add.wasm add.c

Here’s my simple C code:

// Filename: add.c
int add(int a, int b) {
  return a + b;
}

To run that in the browser, you need to embed it in an HTML page. I use this one (index.html):

<!DOCTYPE html>

<script type="module">
  async function init() {
    const { instance } = await WebAssembly.instantiateStreaming(
      fetch("/add.wasm"));
    console.log(instance.exports.add(4, 1));
  }
  init();
</script>
<h2>Look in the console</h2>

Serve this using a web server (a file URL won’t work due to CORS), and you should see 5 printed in the console.

And this, my friends, is your usual WASM hello world!

Now, if you intended to start with WAT rather than C, you’ll need to first convert WAT to WASM.

We’ll see how to do this easily with Gradle later… but the standard way is to use wabt, the WebAssembly Binary Toolkit.

With the wat2wasm tool that comes with it, you can run:

wat2wasm add.wat -o add.wasm

Finally, you can embed the wasm file in an HTML page and it should work.
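If you want a quick sanity check that a file really is a WASM binary before embedding it, the binary format makes that simple: every module starts with the four bytes \0asm followed by the version number (currently 1) as a little-endian 32-bit integer. Here’s a small sketch of such a check; the WasmHeader class is a hypothetical name for this example only, and it says nothing about whether the rest of the module is valid.

```java
import java.util.Arrays;

// Hypothetical helper for illustration: checks only the 8-byte preamble.
final class WasmHeader {
    // "\0asm" magic number that opens every binary module.
    private static final byte[] MAGIC = {0x00, 0x61, 0x73, 0x6D};
    // Binary format version 1, stored as a little-endian u32.
    private static final byte[] VERSION_1 = {0x01, 0x00, 0x00, 0x00};

    static boolean looksLikeWasm(byte[] bytes) {
        return bytes.length >= 8
            && Arrays.equals(Arrays.copyOfRange(bytes, 0, 4), MAGIC)
            && Arrays.equals(Arrays.copyOfRange(bytes, 4, 8), VERSION_1);
    }
}
```

You could feed it the output of wat2wasm with something like WasmHeader.looksLikeWasm(Files.readAllBytes(Paths.get("add.wasm"))). A wat file passed by mistake would fail the check, since it starts with plain text.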

WASM VS Java bytecode

If you are familiar with Java bytecode, you might have noticed that WASM is quite similar to it.

For example, let’s look at the simple WASM code we wrote earlier:

i32.const 2
i32.const 3
i32.add

In Java, we could represent this as follows (with local variables just to avoid constant folding):

class Ints {
    public int getInt() {
        int a = 2;
        int b = 3;
        return a + b;
    }
}

If you compile this with javac Ints.java, then see the bytecode with javap -c -v Ints, you’ll see this (metadata omitted for brevity):

iconst_2
istore_1
iconst_3
istore_2
iload_1
iload_2
iadd
ireturn

It’s more verbose because I had to assign the ints to local variables, otherwise the compiler would just emit iconst_5!
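To make the constant folding point concrete, here’s a small sketch with three variants of the same addition (the Folding class and method names are made up for this example). According to the Java Language Specification’s rules for constant expressions, javac should fold the addition in the first and third methods, while the second keeps the store/load/add sequence shown above. Running javap -c on the compiled class is the way to confirm what your compiler actually emits.

```java
// Hypothetical class for illustration; compile it and inspect with javap -c.
final class Folding {
    // Both operands are literals, so "2 + 3" is a compile-time constant expression.
    static int literals() {
        return 2 + 3;
    }

    // Non-final locals: "a + b" is not a constant expression, so javac
    // emits the loads and the iadd instead of computing the result.
    static int locals() {
        int a = 2;
        int b = 3;
        return a + b;
    }

    // Final locals initialised with constants are "constant variables",
    // which makes "a + b" a constant expression again.
    static int finalLocals() {
        final int a = 2;
        final int b = 3;
        return a + b;
    }
}
```

All three methods return the same value, of course; the difference is only visible in the bytecode.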

If we could write it by hand (and we can! You see, I’ve been interested in these weird bytecode formats for some time, so I actually wrote a Groovy AST transformation that lets you write bytecode directly from within Groovy), it would look more like this:

.limit stack 2
iconst_2
iconst_3
iadd
ireturn

As you can see, quite similar to WASM:

WASM          JVM
i32.const 2   iconst_2
i32.const 3   iconst_3
i32.add       iadd

Clearly, mapping WASM to JVM bytecode should not be that difficult of a task. There’s still a large number of things to take into consideration to make things work smoothly, but in principle, at least, the two formats are pretty close.
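As a rough illustration of that mapping, here’s a minimal sketch that translates a handful of WASM instructions into JVM mnemonics. This is a toy of my own making (WasmToJvm is a hypothetical name), not how asmble or any real compiler does it, and it covers only a few i32 instructions whose semantics match on both sides.

```java
// Hypothetical toy translator: textual WASM instruction in, JVM mnemonic out.
final class WasmToJvm {

    static String translate(String wasm) {
        String ins = wasm.trim();
        // 32-bit wrapping arithmetic behaves the same on both sides.
        if (ins.equals("i32.add")) return "iadd";
        if (ins.equals("i32.sub")) return "isub";
        if (ins.equals("i32.mul")) return "imul";
        if (ins.startsWith("i32.const ")) {
            return pushInt(Integer.parseInt(ins.substring("i32.const ".length()).trim()));
        }
        throw new IllegalArgumentException("unsupported instruction: " + ins);
    }

    // The JVM has several ways to push an int constant, depending on its size.
    static String pushInt(int n) {
        if (n == -1) return "iconst_m1";
        if (n >= 0 && n <= 5) return "iconst_" + n;
        if (n >= Byte.MIN_VALUE && n <= Byte.MAX_VALUE) return "bipush " + n;
        if (n >= Short.MIN_VALUE && n <= Short.MAX_VALUE) return "sipush " + n;
        return "ldc " + n; // larger values come from the constant pool
    }
}
```

Even this toy hints at the details that need care: a single i32.const becomes one of several JVM instructions. Division is a good example of where a one-to-one mapping stops working, because i32.div_s traps when dividing the minimum int by -1, while the JVM’s idiv silently wraps around.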

Running WASM on the JVM

Finally, we’re ready to see how WASM code can be run in the JVM.

The asmble compiler can convert directly from both wasm and wast (notice the compiler expects wast, not the more common wat, for some reason) to JVM bytecode.

Once you’ve installed asmble, create a wast file like this:

(module
  (import "spectest" "print" (func $print (param i32)))
  (func $print70 (call $print (i32.const 70)))
  (start $print70)
)

Then, run it:

./asmble/bin/asmble run -testharness print-70.wast

This example requires the -testharness option because WebAssembly code cannot print things by itself. It can’t do much of anything other than number crunching and direct memory operations, really… but it can import things from the Web APIs, and in this case, the test harness serves the purpose of providing some of the Web APIs to the WASM code.
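To picture what importing means from the host’s side, here’s a hand-written Java stand-in for the print-70 module. To be clear, this is not what asmble generates; Print70Module is a hypothetical class I’m using only to model the idea that the host hands the module its imports, and that the start function runs when the module is instantiated.

```java
import java.util.function.IntConsumer;

// Hand-written model of the print-70 module; NOT asmble's generated code.
final class Print70Module {
    // Plays the role of the imported $print function, supplied by the host.
    private final IntConsumer print;

    Print70Module(IntConsumer print) {
        this.print = print;
        // A (start ...) function runs as part of instantiating the module.
        print70();
    }

    // Corresponds to (func $print70 (call $print (i32.const 70))).
    private void print70() {
        print.accept(70);
    }
}
```

With this model, new Print70Module(System.out::println) prints 70, and a different host could pass any other function in its place, which is essentially the job the test harness is doing here.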

Including WASM code in JVM-based Gradle projects

Using asmble like this is not very convenient because there’s no host language to interact with it, so all your WASM code can do is, currently, calculate and print things!

To be interesting, it would be nice to be able to use WASM code from Java or other JVM languages!

And that’s what the WASM-on-JVM Gradle Plugin brings to the table!

Here’s how it works.

Create a simple Gradle project as you normally would, applying the com.athaydes.wasm plugin.

The build file will look like this:

plugins {
    id 'java'
    id 'com.athaydes.wasm'
}

version '1.0-SNAPSHOT'

repositories {
    jcenter()
}

Add wast files to src/main/wasm, and Java files to src/main/java as usual.

Call gradle compileWasm to compile the wast files. Now, from the Java code, you’ll be able to see the generated Java classes that correspond to the WASM code!

Here’s the full hello-world example:

(module
  (func (result i32)
    (i32.const 42)
  )
  (export "hello" (func 0))
)

src/main/wasm/HelloWasm.wast

public final class Hello {
    public static void main( String[] args ) {
        System.out.println( "WASM says " + new HelloWasm().hello() );
    }
}

src/main/java/Hello.java

Compile everything:

gradle compileJava

Run the Java class as you would normally do:

java -cp build/classes/java/main:build/classes/wasm/main Hello

Which prints:

WASM says 42

Really easy!

The Compile C to WASM to JVM Bytecode Example shows more advanced functionality, like how to add a C compilation-to-wasm step to the build, then use that from Java and create a fat jar with all dependencies for easily running the application.

But why?!

I started looking into running WASM on the JVM because I wanted to learn WASM and see if I could come up with a way to compile a higher level language to it. I’d expected there would be some kind of interpreter or even REPL I could use to quickly try some stuff, but that is absolutely not the case.

The only other way I found to run WASM code without having to use a browser was wabt.js, but having to deal with the JS tools is something I always prefer to avoid.

Being able to easily compile WAT to JVM bytecode is a lot nicer.

Is this production ready?!

Absolutely not! I would say WASM itself is not production ready, given just how unpolished all the tools related to it currently are, and that it can’t even outperform JS yet on the very few things it can already do (so all the hard work you have to do to use it as things stand is pretty much for no benefit)…

Unfortunately, the asmble project maintainer shows very little interest in the project, from what I can see on the GitHub repository, and it’s currently far from ready… anything more complex you throw at the compiler will break. The code is written in Kotlin, which could’ve been a nice thing if it wasn’t so dense, which makes contributing to the project difficult, unfortunately.

On the other hand, given how extremely simple WASM actually is, pretty much every language under the Sun already has an option to compile to WASM!

The question is just how much of it actually works, and how heavy the language runtimes that must ship with any code compiled to WASM will have to be (WASM currently has no GC, so the language runtime has to take care of that, and a lot more).

WASI (WebAssembly System Interface)

I thought I should also mention that there’s an effort currently underway to bring WASM outside the browser: WASI. This would allow WASM to not only run natively on any architecture, but to interact with other languages as a kind of glue that brings everything together (not very different from GraalVM in the Java world, with its polyglot environment).

However, there’s not much more than a blog post and wasmtime, which supposedly can run WASM files outside the browser, but has no documentation whatsoever showing how (so I couldn’t try it).

There’s also a business trying to leverage WASM while it’s still early days, Wasmer.

It promises to “Run any code on any client” and lets you embed WASM in applications written in Go, Rust, Python, Ruby, PHP, C, C++ and C#! It even has a package manager for code written in WASM.

If you’re looking at running WASM that uses WASI, then this seems to be the best bet.

Final thoughts

To end on a positive note: WASM has the potential to become a low-level compilation target for many languages, which may turn out to be very useful not only on the browser, but also on other environments.

For now, though, it’s nice to be able to experiment with WASM on the JVM, not just on the browser! Who knows, this may even become something no one expected (after all, Java was supposed to be big on the browser just like WASM now, but things turned out quite differently)!