ActuallyUsingWasm

Last updated: March 2020

What is wasm?

WebAssembly, aka wasm, is a portable bytecode intended for fast and secure execution of programs across different systems.

Let’s be 100% clear: this is not a new idea. Most notably Java and .NET both have done this idea before, pretty successfully, and build on a long history of other tech before them. Why bother doing this again? Well, the JVM is about 25 years old, from an age where, for instance, it really wasn’t a sure thing whether garbage collection even could get widespread acceptance in modestly performance-oriented code. It was also designed before the Internet became what it is today, and is full of ideas about security and distribution of programs that are at odds with current methods. As for the .NET CLI, it just never seems to have caught on for realsies much outside of the Microsoft world. I’m really not sure why; it’s a technological improvement to the JVM in basically every way that I am aware of. Maybe it’s just the Stigma Of Microsoft, the until-recently closed-source nature and threat of patents being able to crush Mono whenever MS really wanted to that kept people from buying into it. Sad but reasonable.

So, why reinvent the wheel? Well, to take another shot and see if we can make something better, something that doesn’t have a single company pushing it and which uses a bit of the knowledge gained from a few decades of people trying to write really fast bytecode systems and JIT compilers. Do we really need to do this? Up to you. But in my opinion, wasm looks like enough of an improvement to be worth it. Webassembly has its own share of warts, of course: I dislike that it starts with a 32-bit version instead of 64-bit, some kinda fundamental things like pointer and interface types are still WIP, and there’s no versioning system attached since apparently web browser devs think “if you try it and it doesn’t work, then you know it’s not supported” is perfectly fine. Despite that sort of stuff, the tech in general seems reasonable, and people are actually Using It In The Real World by now.

One of the interesting prospects of WebAssembly is that it can be used for safe and portable sandboxing on systems that aren’t web browsers. This is somewhat undersold by at least some parts of the wasm ecosystem, in my opinion, but has some real value. Being able to distribute executables and libraries in a portable bytecode that gets locally compiled seems to me like a strict improvement on distributing platform-specific binaries, even if the programs are open source and you can just compile them from source if you really want to. Given that we’re increasingly living less in an x86_64 monoculture, with diverse good non-x86 hardware widely available, this idea seems more and more useful. So, as someone who mostly wants to write portable programs that run on desktop-style hardware, let’s try to actually use webassembly for this and see how it goes.

What does the ecosystem look like?

Well, it’s a bunch of acronyms, and they all start with W. Let’s try to look at how the parts fit together:

  • wasm – “machine code”. A low-level bytecode designed for portable, fast and easy execution. An open standard being built by the WebAssembly WG of the W3C.
  • wasi – “system calls”. An API for doing basic system stuff, mainly I/O. emscripten does this too but I dislike using emscripten’s tooling. Open standard being built by the WebAssembly WG of the W3C.
  • Compiler – Bring your own. rustc supports wasm32-unknown-emscripten and wasm32-wasi. It also supports wasm32-unknown-unknown, which is basically “as if you were writing for a microcontroller”, with no OS, system interface or such; you have to write your own, or use an existing one like the web-sys crate. You can also use clang, emscripten and probably others, but that would involve using a language less cool than Rust so I’m not interested right now. That said, it looks like Zig and some other languages are starting to support webassembly as well, so we’ll probably see more of it as time goes on.

Webassembly implementations and tools:

  • wasmer – Interpreter/JIT made by the folks at wasmer.io. Based on Cranelift but has other compiler backends too (LLVM, standalone)
  • wapm – A package manager similar to npm. Whyyyyyy do we have Yet Another One Of These. For programming languages it’s useful for it to be part of the build system, but wasm modules are more like DLL’s than anything else, so I don’t see what this is supposed to gain me. We already have ways to distribute executable and library binaries. Maybe wapm will end up being a better way, but I’m not convinced yet.
  • wasmtime – Interpreter/JIT. Does the same job as wasmer but made by different people. Based on Cranelift. Doesn’t seem to be involved with wapm at all.
  • Many others are listed at Awesome Wasm Runtimes, which seems more or less up to date at the moment. However it is not my goal here to compare every possible implementation, just the ones that seem (to me) to have the most push behind them. Other reasonably mature implementations appear to be Perlin Networks’ life (in Go), Parity Tech’s wasmi (in Rust), and probably others I’m missing. Both are interpreters and neither supports WASI, so they are less interesting to me, though life’s experimental JIT looks pretty sweet.

Misc other terminology:

  • WASI – The WebAssembly System Interface, a somewhat POSIX-y API intended for giving wasm programs on non-web platforms a useful system interface.
  • Bytecode Alliance – A loose association for developing WASI that seems to mainly consist of Mozilla, Fastly and optimism. Intel and Red Hat are listed as members, but from rummaging through github group owners (as of March 2020) there’s only one person who lists themselves as an Intel employee and no Red Hat ones, so. I really hope it grows a lot!
  • Cranelift – General-purpose compiler and JIT backend written in Rust. Similar to LLVM in concept.

The delta between wasmer and wasmtime is interesting. Who’s paying for wasmer? From the Github org the most likely candidate is Syrus Akbary, the lead dev of the Graphene-Python GraphQL lib and owner of Wasmer Inc. Looks like officially Wasmer Inc maintains wasmer as well as wapm. Wasmer Inc was started in 2018 and the Bytecode Alliance in 2019, so that’s an easy explanation for the question of “why do both these exist”. Wasmer Inc started first, as a for-profit company, then Bytecode Alliance formed later as a non-profit for various companies to cooperate through.

Both wasmer and wasmtime are written in Rust, which is part of why I know about them. Why is Rust often attached to Webassembly stuff? As far as I can tell, three reasons: rustc, Cranelift, and wasm itself. In reverse order:

  • Unlike the JVM or .NET CLI, wasm doesn’t require a garbage collector. This means it’s an attractive compilation target for languages like Rust, C and C++, which also don’t require a garbage collector. wasm is also, in my limited-but-real experience, smaller and lower level and thus easier to target with a new compiler or runtime. It has no concept of classes or even structures, very simple namespaces and linking/loading model, no complex types such as generics, etc. It is much more like machine code than JVM or CLI bytecode is. So if you’re starting a green-fields low-level programming project, and your lowest-level possible languages are C, C++ and Rust, then Rust is a pretty strong contender.
  • rustc treats cross-compilation as a first class citizen. rustc is always a cross-compiler, and cargo and rustup let you target a new architecture with just a command or two. Far cry from gcc, where cross-compiling starts with “install these specific versions of binutils etc, then build gcc from scratch with these magic options and install it into a particular place and then you can call it with the right magic command line incantation.” clang apparently is better, being more rustc-like in always being a cross compiler, but building and installing a cross-compiled libc and such is still wacky and fragile. Between them, rustup, cargo and rustc control pretty much the whole stack, including the system lib interface, linker and all library dependencies, and so can easily cooperate to make cross-compilation easy. rustc treats Webassembly like any other target and has good support for generating it, so, creating Webassembly code from Rust is extremely easy.
  • And lastly, you have Cranelift just sitting there, being a still-in-development-but-nice compiler backend library just waiting to be used. LLVM is better, but LLVM is also a lot more work to deal with. If you’re looking at language+compiler backend combinations, the obvious first one is C++ and LLVM, and if you then ask “can we use something like LLVM in Rust easily” the most mature thing you find is Cranelift, and Cranelift can target Webassembly. Binding to LLVM from Rust is certainly possible, rustc itself does it, but it’s not much fun.

So part of Rust’s prevalence in this space is coincidence, part of it is good design, and part of it is people recognizing that these two technologies that are growing at the same time should be able to go well together. It’s also probably not much coincidence that Mozilla is a player in both wasm and Rust, though far from the largest or only player. That said, wasm == Rust is hardly a law. There’s plenty of wasm runtimes written in other languages, it’s just surprising to find so many different projects in the same space using Rust.

Using wasmer

wasmer’s official tutorial is here and there’s nothing I can really show you that it doesn’t do better. Long story short, writing a program to call a wasm function using wasmer looks like this:

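use wasmer_runtime::{error, imports, instantiate, Func};

// Returning a Result lets us use `?` on each fallible wasmer call.
fn main() -> error::Result<()> {
    // Embed the module's bytes in the native executable at compile time.
    let wasm_bytes = include_bytes!("add.wasm");
    // The module needs nothing from the host, so the import object is empty.
    let import_object = imports! {};
    let instance = instantiate(wasm_bytes, &import_object)?;
    // Look up the exported function and give it a typed signature.
    let add_one: Func<u32, u32> = instance.func("add_one")?;
    let result = add_one.call(42)?;
    println!("Result: {}", result);
    Ok(())
}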

To make it clear, this produces a native program that loads and executes wasm bytecode, essentially as if it were calling out to a DLL. (A wasm module looks and functions a lot like a DLL.) add.wasm is a program from the wasmer tutorial that they provide for you to download. …Also it’s a Rust program, and a kinda jank one, ’cause it only exports one function, add_one(), but includes all of the Rust executable fluff like panic handler, memory allocation functions, etc. It’s 1.7 megabytes FFS. Let’s fix that, hmm?

# Install wasm tools if necessary: on Debian 10, this is `apt install wabt`
wasm2wat add.wasm > add_hacked.wat
# edit the assembly and remove everything but the `add_one` function, the export decl for it, and the type definition it needs
wat2wasm add_hacked.wat
wasm-validate add_hacked.wasm
# Huzzah, we managed not to break anything

Much better, my add_hacked.wasm is 143 bytes and frankly is still oversized, and loading it with the same hello-world program still works and produces the same result. Wonder how much work that would take with a native code DLL? It’s totally doable but I feel like it probably would have taken more than ten minutes to get right. Now, it would be nicer to make the Rust program that the wasm came from produce something similarly minimal in the first place, but that’s beyond the scope of this for now. IIRC all you have to do is treat it like you’re making an embedded program: use the abort panic handler, possibly compile it with no_std, and strip out debugging symbols. Doing that is left as an exercise to the reader.

(Heck, I can’t resist…. In practice, when I used crate-type = ["cdylib"] and lto = "thin" in my Cargo.toml it produced a 49 KB wasm file… consisting of 99.7% debugging symbols for backtraces, even with panic = "abort" and debug = false. Irritating, I’m sure there’s a way to get it to only output bare code, just gotta find it. There’s a wasm-strip program that takes out debugging stuff though, if you don’t want it, and leaves just the 119 bytes of actual code.)

The wasmer page has more tutorials about various other embedding use cases, which don’t touch advanced stuff in much depth but do seem to do a good job of getting the basics down.

Using wasmtime

The wasmtime docs are not oriented towards embedding it as a wasm runtime in a native program, and example code for doing so is present but less sophisticated. But, figuring it out for basic things is still pretty easy. The wasmtime equivalent of the above program is this:

#[wasmtime_rust::wasmtime]
trait WasmAdd {
    fn add_one(&mut self, input: i32) -> i32;
}

fn main() {
    let wasm_bytes = include_bytes!("add.wasm");
    let mut add = WasmAdd::load_bytes(wasm_bytes.as_ref()).unwrap();
    let result = add.add_one(42);
    println!("Result: {}", result);
}

There is actually an interesting subtle difference here: the arg and return type of add_one() must be i32 here, not u32. Webassembly itself does not specify whether integers are signed or unsigned, just whether operations treat them as one or the other. Looks like wasmer and wasmtime have different opinions about how to interpret that. Our actual WebAssembly program uses the i32.add instruction, which is identical for signed and unsigned numbers, so either is a valid interpretation and either will produce correct results. There is a WebAssembly proposal for being able to more strictly define these things in the wasm module itself. In the meantime, Rust’s stubbornness about signed and unsigned integers saves us yet again, or at least alerts us to a potential edge case. Implementers beware!
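To convince yourself that the signed and unsigned readings of i32.add really agree, here is a small native Rust sketch. It does not touch either runtime; `add_as_signed` and `add_as_unsigned` are names I made up for illustration, and `wrapping_add` stands in for wasm's wrap-on-overflow addition.

```rust
/// Add two 32-bit values by reinterpreting the bits as signed,
/// adding with wraparound, and reinterpreting the result back.
fn add_as_signed(a: u32, b: u32) -> u32 {
    (a as i32).wrapping_add(b as i32) as u32
}

/// Add the same two values as plain unsigned numbers, with wraparound.
fn add_as_unsigned(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}

fn main() {
    // A few cases around the interesting boundaries.
    let cases = [
        (42u32, 1u32),
        (u32::MAX, 1),
        (0x7fff_ffff, 1),
        (0x8000_0000, 0x8000_0000),
    ];
    for &(a, b) in cases.iter() {
        let s = add_as_signed(a, b);
        let u = add_as_unsigned(a, b);
        // The resulting bit patterns are identical either way.
        assert_eq!(s, u);
        println!(
            "{:#010x} + {:#010x} = {:#010x} (read as i32: {}, read as u32: {})",
            a, b, u, u as i32, u
        );
    }
}
```

The bits coming out are always the same; only the way the host chooses to print or compare them differs, which is exactly where the i32 vs u32 disagreement between the two runtimes can bite.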

Standalone programs in Webassembly

What I don’t seem to find is any documentation on making standalone programs for execution on wasmer. That seems a little disingenuous. Well, wasmer claims it supports the WASI API, and rustc offers WASI as a target, so let’s just try that I guess?

cargo init hello
cd hello
cargo run
# prints 'Hello world!'
cargo build --target wasm32-wasi
wasmer target/wasm32-wasi/debug/hello.wasm
# prints 'Hello world!'

…well that was easy. What about wasmtime?

wasmtime target/wasm32-wasi/debug/hello.wasm
# prints 'Hello world!'

Not much to complain about here, I suppose! wasmer doesn’t have much documentation on writing programs using the WASI API, while wasmtime has enough of a tutorial to get you started. Just based on what they demonstrate, wasmer is more focused on embedding wasm in your native program, while wasmtime is more focused on executing standalone wasm programs using WASI. Both are capable of both, it just seems a matter of emphasis.

So, for basic stuff, to use WASI from Rust you don’t actually need to do anything special. Rust’s standard library compiles to WASI just like it compiles to POSIX and Windows API’s, and handles all the stuff necessary to present a reasonable API using it. Some things like threads that WASI doesn’t support yet are presumably unimplemented and will either refuse to compile or panic at runtime. (This is not a great way of doing it, but nobody in the Rust libs team has been able to come up with a consistently better way, and it’s not because they’re not trying.) The main difficulty with targeting WASI is system-specific functionality; for example I tried to build termion for it but it choked on ioctl interface stuff. Pure computation, memory allocation and I/O seems to mostly Just Work.
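As a rough illustration of "nothing special", here is a sketch of a program that sticks to plain std APIs (stdout, command line args, environment variables) and should build unchanged for both a native target and wasm32-wasi. The `platform_name` helper is my own invention for the example, and I have only reasoned about the WASI side rather than measured anything with it.

```rust
use std::env;
use std::io::{self, Write};

/// Name the kind of platform this binary was compiled for.
fn platform_name() -> &'static str {
    if cfg!(target_os = "wasi") {
        "WASI"
    } else if cfg!(windows) {
        "Windows"
    } else if cfg!(unix) {
        "a Unix-like OS"
    } else {
        "something else"
    }
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    writeln!(out, "Hello from {}", platform_name())?;

    // Args and environment variables are both on WASI's feature list,
    // so the usual std calls are all that is needed.
    for (i, arg) in env::args().enumerate() {
        writeln!(out, "arg {}: {}", i, arg)?;
    }
    writeln!(out, "{} environment variables visible", env::vars_os().count())?;
    Ok(())
}
```

Under a sandboxing runtime the environment variable count may well be zero unless you pass some in explicitly, which is the whole point.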

Performance

This is a complicated topic, so I’m going to simplify it way too much. Part of the promise of wasm is that it can be JIT compiled and executed Fast, at least broadly on par with JVM and CLI code. To a first approximation, those commonly come well within an order of magnitude of native code, more or less, so I’d hope for wasm programs to perform somewhere in the same range: no more than about 10x slower than the same program in native code.

I’m going to use my favorite stupid benchmark, Fibonacci The Dumb Way. To me it’s a not-bad, easy way to take a look at the performance of a language implementation in terms of function calls, branches and integer math.

fn fib(x: u32) -> u32 {
    if x < 2 {
        1
    } else {
        fib(x - 1) + fib(x - 2)
    }
}

fn main() {
    println!("{}", fib(40));
}

fib(40) is traditional, as it’s the largest round number a slowish computer can calculate without me getting too impatient. So, let’s try it out:
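Before timing anything it is worth knowing what answer to expect. The sketch below is not part of the benchmark: `fib_iter` is a helper I added that computes the same recurrence iteratively, so the slow recursive version can be cross-checked in microseconds.

```rust
/// The benchmark's recurrence, with fib(0) = fib(1) = 1.
fn fib(x: u32) -> u32 {
    if x < 2 {
        1
    } else {
        fib(x - 1) + fib(x - 2)
    }
}

/// Same sequence computed with a loop. It looks one step ahead,
/// so it is only safe for n up to 45 before u32 overflows.
fn fib_iter(n: u32) -> u32 {
    let (mut a, mut b) = (1u32, 1u32); // fib(0), fib(1)
    for _ in 0..n {
        let next = a + b;
        a = b;
        b = next;
    }
    a
}

fn main() {
    // Cross-check the two versions where the recursive one is still cheap.
    for n in 0..20 {
        assert_eq!(fib(n), fib_iter(n));
    }
    // The value the benchmark runs should print.
    println!("{}", fib_iter(40)); // prints 165580141
}
```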

# Build native code
cargo build --release
time ./target/release/fib
# 165580141
# 0.35user 0.00system 0:00.35elapsed 100%CPU (0avgtext+0avgdata 1836maxresident)k
# 0inputs+0outputs (0major+100minor)pagefaults 0swaps

# Run a few more times, each time we get 0.35 seconds +/- 0.01ish

# Great, now in wasm
cargo build --release --target wasm32-wasi
time wasmer target/wasm32-wasi/release/fib.wasm
# 165580141
# 3.67user 0.00system 0:03.67elapsed 99%CPU (0avgtext+0avgdata 16832maxresident)k
# 0inputs+0outputs (0major+2092minor)pagefaults 0swaps

# Run a few more times, each time we get about 3.7 seconds +/- 0.1ish

time wasmtime target/wasm32-wasi/release/fib.wasm
# 165580141
# 2.40user 0.00system 0:02.40elapsed 100%CPU (0avgtext+0avgdata 11392maxresident)k
# 0inputs+8outputs (0major+1709minor)pagefaults 0swaps

# Run a few more times, each time we get about 2.4 seconds +/- 0.1ish

Considering they both use Cranelift for a backend, I’m actually surprised the performance for wasmtime and wasmer is noticeably different. wasmtime looks a little faster, but frankly this is so crude I’m willing to call them the same. They both fall pretty much at the high end of my “JIT’ed static language” mental field of <= 10x slower than native code, and considering wasm and Cranelift are both only a couple years old this performance seems really pretty decent. Also note memory use: native code, 1.8 MB max resident memory. The wasm versions have a memory overhead of 10-15 megabytes. Not trivial, but also not giant. Java runs the same “benchmark” in 0.66 seconds, for example, but maxes out at 35 MB of memory. Meanwhile the black magic that is LuaJIT does the same calculation in about 1 second on that machine, using 2.2 MB of max resident memory.

Start-up overhead is also potentially a thing! JIT’s traditionally start up slower than native code because they need extra time to, you know, load and compile the code before running it. This is honestly the main reason I never want to use bloody Clojure. Sad, but true. I omitted the first runs from the previous time and memory calculations to try to remove any cache warm-up cost. wasmer and wasmtime both cache the results of the JIT compilation automatically. For both runtimes the cached files are opaque blobs, and they’re named only by hash, so there’s not (currently) much you can do with them. For wasmer you can clear the cache with wasmer cache clean. wasmtime doesn’t seem to have a way to tell it to clean the cache, but does have some settings for how the cache is managed, and deleting the files by hand just makes it shrug and re-create them. So, let’s run this program a few more times and clear the cache between each run and check if we can even see any overhead at all on something this small:

  • wasmer, about 3.85 seconds and 26 MB max resident size, so 0.2 sec and 10 MB JIT overhead
  • wasmtime, about 2.5 seconds and 15 MB max resident size, so 0.1 sec and 4 MB JIT overhead

So yes, there is measurable overhead to JIT compilation, but it doesn’t seem huge for this tiny program.
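For anyone who wants the arithmetic spelled out, this little sketch turns the timings quoted above into slowdown factors and cold-cache overhead. The numbers are copied straight from my runs; `slowdown` is just a helper name for the example.

```rust
/// How many times slower `t` is than `baseline`.
fn slowdown(t: f64, baseline: f64) -> f64 {
    t / baseline
}

fn main() {
    // Native release build, seconds.
    let native = 0.35;
    // (runtime, warm-cache seconds, cold-cache seconds)
    let runs = [("wasmer", 3.67, 3.85), ("wasmtime", 2.40, 2.50)];

    for &(name, warm, cold) in runs.iter() {
        println!(
            "{}: {:.1}x native, {:.2}s JIT overhead",
            name,
            slowdown(warm, native),
            cold - warm
        );
    }
}
```

That works out to roughly 10.5x for wasmer and 6.9x for wasmtime, with 0.1 to 0.2 seconds of compile time on top when the cache is cold.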

One last time though, these are literally the stupidest performance benchmarks possible, so don’t actually make any decisions based off of them. It’s purely to get a vague feel for scale. The only real conclusion worth taking away at the end is “slower than C” and “faster than Python”.

Security

A non-obvious but, in my mind, fundamental part of webassembly is sandboxing. If you execute a program, what is that program allowed to do? Originally on non-mainframe computers the answer was generally “anything”, which was fine when the internet wasn’t a thing or your system was run by a dedicated sysadmin. If you ran something nasty, the worst thing you could do is hose that one system, and then you either get a young child to fix it or get kicked off the machine by an irate sysadmin and not get let back on. As shared computing systems became more common and sophisticated, various permissions got put in place to limit the amount of damage people could do to others, and these mostly worked okay. Then desktops and the internet happened and suddenly running an unknown program, or even a known-but-compromised program, can and will delete your cat photos, launder money for the Russian mob, and steal credit card numbers from little old ladies in Iowa. Sometimes all at once. And, unless you have more money than time, you do not have a dedicated sysadmin to help you out.

The permissions that modern OS’s use to try to deal with this are all based on the minicomputer-style shared computer systems these OS’s grew up on: users, groups, access control lists, and blacklists. User adam cannot delete the files for user eve unless explicitly allowed, and you can set it so that he can’t even see what those files are. But these are still fundamentally lists of things that a particular program can not do, and are generally tied to particular users. Nothing is protecting adam from running a malicious program that steals his own credit card number, snoops through his private info, or talks to other people in his name. By now it’s obvious that we need more compartmentalized control and that blacklisting is not the answer; as someone who has worked desktop tech support before, I think the only feasible solution is to permit nothing and whitelist the required capabilities. Make it so you must always specify all things a program is allowed to do, including “execute code” and “allocate memory”, per program, per user. This can still have problems, and is more work to manage, but is far easier to control and far harder to subvert. Android and iOS do this, crudely, by asking for user permissions to do specific things on a per-program basis and hiding a lot of the details of the system from random programs. It’s not perfect, but it’s totally doable.
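To make "permit nothing and whitelist" concrete, here is a toy sketch of a deny-by-default check for file access. `Sandbox`, `allow_dir` and `may_open` are hypothetical names for illustration only. A real runtime enforces this through the file descriptors it hands out rather than by comparing path strings, and this sketch ignores symlinks entirely.

```rust
use std::path::{Component, Path, PathBuf};

/// A deny-by-default set of directories a program may touch.
struct Sandbox {
    allowed_dirs: Vec<PathBuf>,
}

impl Sandbox {
    /// A fresh sandbox permits nothing at all.
    fn new() -> Self {
        Sandbox { allowed_dirs: Vec::new() }
    }

    /// Explicitly grant access to one directory tree.
    fn allow_dir<P: Into<PathBuf>>(mut self, dir: P) -> Self {
        self.allowed_dirs.push(dir.into());
        self
    }

    /// A path is permitted only if it sits under a granted directory
    /// and has no `..` component that could climb back out of it.
    fn may_open(&self, path: &Path) -> bool {
        let climbs_out = path.components().any(|c| c == Component::ParentDir);
        !climbs_out && self.allowed_dirs.iter().any(|dir| path.starts_with(dir))
    }
}

fn main() {
    let sandbox = Sandbox::new().allow_dir("/home/adam/project");
    let attempts = [
        "/home/adam/project/src/main.rs",
        "/home/adam/.ssh/id_rsa",
        "/home/adam/project/../.ssh/id_rsa",
    ];
    for p in attempts.iter() {
        println!("{} -> {}", p, sandbox.may_open(Path::new(p)));
    }
}
```

The important property is the starting state: an empty list means every request fails, and each capability has to be added on purpose.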

More broadly, if we’re not building our machines to expect humans to fail then we’re doing it wrong. This is not just a matter of computing, this is a system design problem for any machine that can hurt people. High gantries have handrails, and are made out of materials that are hard to slip on. Internal combustion engines with moving parts are packed away into steel boxes where they can’t casually mangle fingers. Doors that lock themselves must be unlocked from the inside, so that people can escape the building in case of a fire. And programs that run on the multi-billion-transistor supercomputers we all carry around in our pockets every day must only have access to the data and capabilities that they absolutely require, and that access must be auditable and revocable. However, even OpenBSD’s security systems are fundamentally opt-in, because the basic API was designed in the 70’s and isn’t going to change, and throwing away 50 years of software development to change it all is Not Easy. Wasm and WASI have the potential to fix this by making a clean break that doesn’t require OS support, that works on any OS, and can be adopted incrementally.

This is useful for far more than a layer of protection against malicious programs though. It’s a way to make programs that literally do not know or care what computer they are running on. If you explicitly have to provide a program with a list of all resources it can access, from memory to files to network sockets, then that changes how you run multiple programs on a system. The current paradigm is that each program has random assumptions and dependencies built into it that a sysadmin or devops engineer needs to understand to make it cooperate with other programs, even in just such simple ways as “where are config files” and “what TCP port do I use”. If you have every program explicitly list these requirements and have them fulfilled from an outside source, it becomes something that can be automated. You can and must write programs that don’t CARE what TCP port they use, they use the port the system hands them. They don’t care whether their config file is /etc/foo.conf or /local/etc/foo.conf, they use the file handle the system hands them. You then don’t have to configure individual programs, you configure systems, and these systems can be inspected and managed via code. This is a capability that has only actually become common the last few years, with “serverless” computing systems like AWS Lambda. Docker tries to emulate this sort of functionality using existing API’s, and Kubernetes tries to manage it, but this SHOULD be the operating system’s job. OS functionality like Linux’s capabilities and OpenBSD’s pledge are steps in the right direction that make this possible, and are the primitives upon which things like Docker are built. But you still have to opt in to this sort of sandboxing. It is not the default.
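Here is a small sketch of what "use the handle the system hands you" can look like in code. The `run` function and the `greeting` config key are invented for the example. The point is only that the function never opens a path or a port itself, so the caller decides whether it gets a file, a socket, or an in-memory buffer.

```rust
use std::io::{self, BufRead, BufReader, Cursor, Read, Write};

/// Greet every line of `input`, using the `greeting` found in `config`.
/// All three resources are handed in from outside.
fn run<C: Read, I: BufRead, O: Write>(config: C, input: I, mut output: O) -> io::Result<()> {
    // Parse a single `greeting = ...` line from whatever config handle we got.
    let mut greeting = String::from("hello");
    for line in BufReader::new(config).lines() {
        let line = line?;
        let mut parts = line.splitn(2, '=');
        if let (Some(key), Some(value)) = (parts.next(), parts.next()) {
            if key.trim() == "greeting" {
                greeting = value.trim().to_string();
            }
        }
    }

    for line in input.lines() {
        let line = line?;
        writeln!(output, "{}, {}", greeting, line)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Here the "system" is just main(): it hands over in-memory buffers,
    // but it could equally pass an open file or an accepted TCP stream.
    let config = Cursor::new("greeting = howdy\n");
    let input = Cursor::new("adam\neve\n");
    let mut output = Vec::new();
    run(config, input, &mut output)?;
    print!("{}", String::from_utf8_lossy(&output));
    Ok(())
}
```

Written this way, the same function can be tested with buffers, run against real files, or wired to a socket without changing a line of it.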

Anyway! Lecture over. So how do we tell these runtimes to sandbox things? Can’t find any docs on it for wasmer, so let’s look at wasmtime.

wasmtime has a minimal but functional guide to this here. Basically, we have a program that copies a file:

./copy src/main.rs copy1.rs
diff src/main.rs copy1.rs
# No output, files are identical

Now we try to run it with wasmtime:

wasmtime copy.wasm src/main.rs copy2.rs
# error opening input: failed to find a preopened file descriptor through which "src/main.rs" could be opened

It says ‘no u’ because we haven’t explicitly told it that it is allowed to access files anywhere. We can do that using the --dir flag:

wasmtime --dir=. copy.wasm src/main.rs copy2.rs
diff src/main.rs copy2.rs
# No output, files are identical

So, that works. Now if we try to read a file outside the given dir, say using an absolute path, then it tells us to get lost again. There’s similar functionality for environment variables, and you can also rename directories. This capability is… not much, to be honest. Upon digging more, it appears that wasmer has the exact same capabilities, and… in fact, produces the exact same error messages when it tells you permissions are denied. They probably use the same WASI library under the hood.
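For reference, a copy program of the sort used above needs nothing beyond std. This is my own minimal sketch, not the exact source from the wasmtime guide, and `copy_file` is a name I picked. The same file builds natively or with --target wasm32-wasi, and it is the File::open call that fails when the runtime has not been granted the directory.

```rust
use std::env;
use std::fs::File;
use std::io;
use std::process;

/// Copy `src` to `dst`, returning the number of bytes copied.
fn copy_file(src: &str, dst: &str) -> Result<u64, String> {
    let mut input = File::open(src).map_err(|e| format!("error opening input: {}", e))?;
    let mut output = File::create(dst).map_err(|e| format!("error opening output: {}", e))?;
    io::copy(&mut input, &mut output).map_err(|e| format!("error copying: {}", e))
}

fn main() {
    let args: Vec<String> = env::args().collect();
    if args.len() != 3 {
        // Not enough to do anything useful; just explain and leave.
        let program = args.get(0).map(String::as_str).unwrap_or("copy");
        eprintln!("usage: {} <from> <to>", program);
        return;
    }
    if let Err(msg) = copy_file(&args[1], &args[2]) {
        eprintln!("{}", msg);
        process::exit(1);
    }
}
```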

What functionality does WASI provide at all, then? What does this high-level, sandboxed pseudo-OS look like? A high-level overview is here, and the API reference is here. Seems like the general list is:

  • Command line args
  • Environment variables
  • Basic clock time stuff
  • File stuff
  • Secure RNG
  • Very simple send-data-through-socket functions
  • Very simple process control like yield() and exit()

So yeah, not a lot. Looks much like very basic POSIX. Work in progress, I suppose. It’s totally enough to implement, say, a serverless HTTP API service though.

Can we run wasmer in wasmer?

…or wasmer in wasmtime or vice versa?

Nope, both require system interface stuff that the WASI API doesn’t provide. JIT compilers be like that, I guess. Someday, hopefully!

Conclusions

You can probably compile and run nethack for wasm using WASI, if you work at it a bit and are willing to get your hands dirty with terminal nonsense. You could definitely run a web service backend at least partially on it, though some stuff like threading and database interface might get a little Exciting. More desktop-y capabilities, media libs, and hardware interfaces are not there yet, but hopefully will be someday.

Both wasmer and wasmtime are perfectly good runtimes that are both easy to embed (in Rust programs) or easy to use standalone. They use broadly similar codebases written in Rust. wasmer is run by a corporation that also runs the package manager wapm, while wasmtime is run by a non-profit backed mainly by Mozilla and Fastly. If I had to choose one, I’d choose wasmtime; the presentation is much rougher, but I’d rather have it run by a non-profit than a company, and it’s closer to the root of the technology with some Notable Names from the Rust compiler and std lib team on it. From a technical point of view, I’d prefer a compiler with one backend and some experienced compiler developers on the team to one that has three backends (while from a business view, it might make more sense to hedge your bets more). While it works just fine, from my view wapm is mostly worth mentioning in how useful it isn’t, given I never needed or wanted to actually touch it while writing this, though I’m mainly thinking about writing standalone programs in Rust so it might have use cases for other types of systems. Embedding either of these runtimes to have a program that loads wasm, or expose native functions to a wasm program, both work fine but might have some Interesting edge cases just as much as any other FFI system. Wasm has various other implementations with various goals, and maybe I’ll investigate them more someday, but those don’t seem to support WASI (yet?).

Now that I think of it, Google seems conspicuously absent in non-web-browser uses of wasm: they have no standalone WASI runtime, Go’s wasm support targets JavaScript hosts (GOOS=js) rather than WASI, there is no serverless hosting platform for it, and I haven’t heard anyone talking about using it for Android even though it might be an interesting replacement for Dalvik, even if only as a research project.

WASI is usable but basic, and needs more momentum behind it. It’s an idea with a ton of potential, but for now it’s still mostly potential. On the other hand, if you’re looking for a good may-become-big-someday project to put your name on, it seems like a great candidate to me.

So, yeah. Webassembly is still a somewhat crude tool, but it has some applications I’m really interested in, and it’s officially usable outside a web browser.