Renato Athaydes

Testing and building C projects with Zig

Written on Sun, 19 Jun 2022 16:35:00 +0000

C has been around for a long time, but even today it is at the root of much of the critical infrastructure in the software world.

A large amount of the lowest level software that is widely used today, from the Linux kernel to curl, from git to grep, and in fact, all of the GNU Core Utils, stuff you use every day like kill, cat, cp, mkdir, rm, du, date… all of this is written in C.

Check the source code for some of these utilities here. There’s something magic about being able to read the actual C code of these little programs we use in our day-to-day without thinking how it’s probably been there for over 40 years.

While C is a simple language, actually using C properly is really, really hard. There’s very little in the way of abstractions and “safety” when compared to higher-level languages, or even more modern lower-level ones. It’s about as close to the metal, as some people like to say, as you can get without writing machine code (with the probable exception of Forth).

“C is a pretty basic language with not a lot of features. It can do anything, but it can make you work for it.” - Beej’s Guide to C Programming

For this reason, there has been a wave of languages trying to replace C with something better. C++ was probably the first one, but it was soon followed by others, like D, Rust, Nim and more recently, Odin, Vale and the one that we’ll be looking at for this blog post, Zig.

There are many more modern lower-level languages competing with C; see robertmuth/awesome-low-level-programming-languages for more.

The problem is that C is still alive and well, and it’s probably not going anywhere any time soon. There’s just too much of it out there, and some of the effort to replace all that legacy with something new (which I am sure would bring its own different set of issues with it) could be better spent elsewhere.

That’s why I think that rather than trying to replace C completely, we should be doing more to actually ensure the existing C code we rely on is wrapped into saner APIs and as well tested as possible, allowing us to integrate with all the legacy C gives us without fear.

It’s in this context that I found out that Zig embraces C instead of treating it like dirty garbage. It makes it very easy to interop with C, and the Zig compiler itself can compile C code. It might be using MSFT tactics, of course, but even if that were the case, it still gives us a nifty tool to improve test coverage and usability for the mountains of C code we will probably have lying around for a long time!

From Zig In-depth Overview:

Zig is better at using C libraries than C is at using C libraries.
One of the primary use cases for Zig is exporting a library with the C ABI for other programming languages to call into.
In some ways, Zig is a better C compiler than C compilers!
Zig is faster than C.

I know, a lot of grandiose claims that are hard to believe… but even though there have been quite a few similarly outrageous claims from other sources in the past that turned out to be nothing other than vaporware, I don’t believe that to be the case with Zig.

I’ve tried it, and from what I can see so far, it delivers on nearly all fronts.

Including the part where it claims to be a better C compiler than any other C compiler!

In this post, I will be building a C project with the Zig compiler and build system, testing the C code with Zig’s native testing system, and finally, giving a taste of what brand new, fresh Zig code looks like.

All sample code in this blog post can be found in this GitHub repo. See the commits.

Quick Zig Introduction

To create a Zig project, you can run either zig init-exe (create an app) or zig init-lib (create a lib), depending on what you’re building.

▶ zig init-lib
info: Created build.zig
info: Created src/main.zig
info: Next, try `zig build --help` or `zig build test`

It will generate the following files (as of Zig 0.10.0):

const std = @import("std");
const testing = std.testing;

export fn add(a: i32, b: i32) i32 {
    return a + b;
}

test "basic add functionality" {
    try testing.expect(add(3, 7) == 10);
}

src/main.zig

const std = @import("std");

pub fn build(b: *std.build.Builder) void {
    // Standard release options allow the person running `zig build` to select
    // between Debug, ReleaseSafe, ReleaseFast, and ReleaseSmall.
    const mode = b.standardReleaseOptions();

    const lib = b.addStaticLibrary("zig", "src/main.zig");
    lib.setBuildMode(mode);
    lib.install();

    const main_tests = b.addTest("src/main.zig");
    main_tests.setBuildMode(mode);

    const test_step = b.step("test", "Run library tests");
    test_step.dependOn(&main_tests.step);
}

build.zig

Just like that, one can compile and run tests 10 seconds after installing Zig:

▶ zig build test
All 1 tests passed.

The Zig Standard Library’s testing namespace contains several functions to help write tests and perform assertions, including expect as shown in the generated test above, but also expectEqual, expectFmt, expectError and more. We should use expectEqual to improve the test. After all, even more frustrating than a failing test is a failing test that won’t tell you exactly why it failed.

test "basic add functionality" {
    try testing.expectEqual(10, add(3, 7));
}

Running it, a little surprise:

▶ zig build test
./src/main.zig:9:36: error: expected type 'comptime_int', found 'i32'
    try testing.expectEqual(10, add(3, 7));
                                   ^

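Interestingly, Zig has special types that represent compile-time values… in this case, comptime_int is the type of the constant 10. In this Zig version, expectEqual expects its second argument to have the same type as the first, so the i32 returned by the add function (a runtime value) would have to become a comptime_int, which is not possible.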

In fact, Zig has lots of interesting features related to code that runs at compile-time (which is used for implementing important things, like generics and reflection).

Fixing that issue is easy with a type cast:

test "basic add functionality" {
    try testing.expectEqual(@as(i32, 10), add(3, 7));
}

Running it again, we get the expected result:

▶ zig build test
All 1 tests passed.

Obviously, when you build the project for release, the test code is not included in the binary. We can confirm that by asking Zig to print the emitted assembly:

▶ zig build-lib -O ReleaseFast -femit-asm=main.asm --strip src/main.zig

▶ cat main.asm
	.section	__TEXT,__text,regular,pure_instructions
	.intel_syntax noprefix
	.globl	_add
	.p2align	4, 0x90
_add:
	lea	eax, [rdi + rsi]
	ret

	.section	__DATA,__data
	.globl	__mh_execute_header
	.weak_definition	__mh_execute_header
	.p2align	2
__mh_execute_header:
	.space	32

.subsections_via_symbols

A single lea instruction (plus the ret) is all you get, as it should be.

Next, let’s try to test a C function instead.

Calling C Code from Zig

To get started, let’s assume we have an add function in C already, so we don’t need the Zig one.

int add(int a, int b) {
  return a + b;
}

c/add.c

Now, we can call this from Zig extremely easily by including the C file directly with @cInclude("add.c"):

const adder = @cImport({
    @cInclude("add.c");
});
const std = @import("std");
const testing = std.testing;

test "basic add functionality" {
    try testing.expectEqual(@as(i32, 10), adder.add(3, 7));
}

src/main.zig

For the Zig compiler to find the C source file, we just need to let it know the include path with -I (we’ll use the Zig CLI for now, but we’ll get back to the Zig build system later):

▶ zig test src/main.zig -I c/
All 1 tests passed.

There’s no need for header files or bindings declarations of any sort. The Zig compiler is able to figure it all out by itself.

We can change the C code to make the test fail just to be sure this really worked… it can’t be that easy, right?

int add(int a, int b) {
  return a - b;
}

Running again:

▶ zig test src/main.zig -I c
Test [1/1] test "basic add functionality"... expected 10, found -4
Test [1/1] test "basic add functionality"... FAIL (TestExpectedEqual)
/Users/renato/programming/apps/zig-macos-x86_64-0.10.0-dev.2577+5816d3eae/lib/zig/std/testing.zig:79:17: 0x108749b6b in std.testing.expectEqual (test)
                return error.TestExpectedEqual;
                ^
/Users/renato/programming/experiments/zig/src/main.zig:8:5: 0x108748a9c in test "basic add functionality" (test)
    try testing.expectEqual(@as(i32, 10), adder.add(3, 7));
    ^
0 passed; 0 skipped; 1 failed.
error: the following test command failed with exit code 1:
src/zig-cache/o/b6efeca5c30e2cea2e628ec3183c2f63/test /Users/renato/programming/apps/zig-macos-x86_64-0.10.0-dev.2577+5816d3eae/zig

Yeah, it is definitely running the test!

For something a bit more interesting, let’s suppose we had some old C code we wanted to verify works as we think it should… but it’s a bit sloppy:

int count_bytes(char *str) {
  int count = 0;
  while(str[count]) {
    count++;
  }
  return count;
}

We may try to call this from Zig like this:

test "counting bytes in C" {
    try testing.expectEqual(@as(i32, 3), clib.count_bytes("ABC"));
}

But that won’t work:

▶ zig test src/main.zig -I c
./src/main.zig:12:59: error: expected type '[*c]u8', found '*const [3:0]u8'
    try testing.expectEqual(@as(i32, 3), clib.count_bytes("ABC"));
                                                          ^
./src/main.zig:12:59: note: cast discards const qualifier
    try testing.expectEqual(@as(i32, 3), clib.count_bytes("ABC"));
                                                          ^

You see, that C code has the right to modify our beautiful string literal, but literals should be immutable, of course! So Zig won’t let us call that function like that.

If we really must, we need to get a mutable copy of the string first, then pass that on to C and let it do what it wishes to the poor thing:

test "counting bytes in C" {
    var cstr = "ABC".*;
    try testing.expectEqual(@as(i32, 3), clib.count_bytes(&cstr));
}

Trying to run it again:

▶ zig test src/main.zig -I c/
All 2 tests passed.

All is good again.

Before we continue, let’s go through what’s just happened in more detail.

First, we needed to assign the de-referenced string’s bytes (.* de-references) to a mutable local variable (using var, as using const would make it immutable)… the string literal "ABC" has type *const [3:0]u8 (a pointer to a const array of 3 bytes, terminated with a 0 as a C string), which as the Zig error message said, cannot be used as a [*c]u8 (a non-const C-string). The cstr variable gets the type [3:0]u8, which when turned into a pointer with & is compatible with [*c]u8, so Zig lets us call the C function with &cstr.

Of course, none of this would be needed if the signature of the C function had been int count_bytes(const char *str) (notice the const parameter):

test "counting bytes in C" {
    try testing.expectEqual(@as(i32, 3), clib.count_bytes("ABC"));
}

Which works without issues.
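
For reference, here is a minimal sketch of what the const-correct C function might look like. The only intended change from the sloppy version above is the const qualifier on the parameter; the NULL check is an extra precondition added for this sketch and is not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the bytes before the terminating NUL.
 * The const qualifier promises callers that the string is only read,
 * never written, which is why a string literal can be passed directly. */
int count_bytes(const char *str) {
  assert(str != NULL); /* precondition added for this sketch */
  int count = 0;
  while (str[count]) {
    count++;
  }
  return count;
}
```

Existing C callers that pass a mutable char * are unaffected, since a char * converts implicitly to const char *.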

Untyped C Pointers

Sometimes, it can get harder to use C code from Zig. For example, suppose you had some C code for a hash table that kept its values as untyped pointers (welcome to C).

Here are the function signatures in C:

void store(const char *key, const uintptr_t value)

int fetch(const char *key, uintptr_t *value)

This turned out to be difficult to use in Zig because the standard pointer conversion built-ins don’t seem to work out-of-the-box in a simple manner.

To figure this one out, I wrote code that uses this in C, then asked Zig to translate it for me into Zig.

Here’s the sample C code:

#include <inttypes.h>
#include <stdio.h>

// trivial impl that can keep a single value in memory!
uintptr_t _p;

void store(const char *key, const uintptr_t value) {
  _p = value;
}

int fetch(const char *key, uintptr_t *value) {
  *value = _p;
  return 1;
}

int main() {
  static char *s;
  store("foo", (const uintptr_t) "goodbye");
  fetch("foo", (uintptr_t *) &s);
  printf("s is %s\n", s);
}

c/storage.c

To translate this to Zig, I used the handy zig translate-c command:

zig translate-c c/storage.c > storage.c.zig

This generates a lot of boilerplate, but buried within it I was able to find the implementation of my actual code translated to Zig:

pub export var _p: usize = @import("std").mem.zeroes(usize);
pub export fn store(arg_key: [*c]const u8, value: usize) void {
    var key = arg_key;
    _ = key;
    _p = value;
}
pub export fn fetch(arg_key: [*c]const u8, arg_value: [*c]usize) c_int {
    var key = arg_key;
    _ = key;
    var value = arg_value;
    value.* = _p;
    return 1;
}
pub export fn main() c_int {
    const s = struct {
        var static: [*c]u8 = @import("std").mem.zeroes([*c]u8);
    };
    store("foo", @intCast(usize, @ptrToInt("goodbye")));
    _ = fetch("foo", @ptrCast(
        [*c]usize, @alignCast(@import("std").meta.alignment(usize), &s.static)));
    _ = printf("s is %s\n", s.static);
    return 0;
}

What a beast. But that gave me the necessary tools to make it work. Clearly, it wouldn’t be convenient to keep (un)wrapping pointers between Zig and C, so sometimes, the best way to go seems to be to write a Zig wrapper for unwieldy C APIs.

Here’s what I ended up with in this case:

const std = @import("std");
const alignment = std.meta.alignment;
const expectEqual = std.testing.expectEqual;

const storage = @cImport({
    @cInclude("storage.c");
});

fn zigStore(key: [:0]const u8, value: [:0]const u8) void {
    storage.store(key, @intCast(usize, @ptrToInt(value.ptr)));
}

fn zigFetch(key: [:0]const u8) ?[*]u8 {
    const Result = struct {
        var value: [*c]u8 = undefined;
    };
    const found = storage.fetch(key, @ptrCast([*c]usize, @alignCast(alignment(usize), &Result.value)));
    return if (found == 0) null else Result.value;
}

src/storage.zig

Not pretty, but this is just a slightly cleaned up version of the auto-translated C code.

And now, the test can be written much more conveniently:

// continues from previous sample

test "can store data in hash table" {
    zigStore("foo", "foo-value");
    try expectEqual(@as(?[*]const u8, "foo-value"), zigFetch("foo"));
}

src/storage.zig
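
The same wrapping idea can also be sketched on the C side. The block below repeats the single-slot store and fetch from c/storage.c (with the global renamed to slot for this sketch) and adds two hypothetical helpers, store_str and fetch_str, which are not part of the original code. They keep the uintptr_t casts in one place and read the value through a uintptr_t local instead of casting a char ** to a uintptr_t *.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trivial implementation that keeps a single value in memory,
 * mirroring c/storage.c (the global is renamed for this sketch). */
static uintptr_t slot;

void store(const char *key, const uintptr_t value) {
  (void)key; /* the key is ignored by this toy implementation */
  slot = value;
}

int fetch(const char *key, uintptr_t *value) {
  (void)key;
  *value = slot;
  return 1; /* always "found" in this toy implementation */
}

/* Hypothetical typed helper: hides the pointer-to-integer cast. */
void store_str(const char *key, const char *value) {
  assert(key != NULL);
  store(key, (uintptr_t)value);
}

/* Hypothetical typed helper: returns NULL when nothing was found. */
const char *fetch_str(const char *key) {
  uintptr_t raw = 0;
  if (!fetch(key, &raw)) {
    return NULL;
  }
  return (const char *)raw; /* integer-to-pointer cast, still unchecked */
}
```

The helpers do not make the API any safer, since the stored integer is still reinterpreted blindly, but callers no longer need to write the casts themselves.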

Opaque C Pointers (void *)

Zig has a type that corresponds to C’s void: anyopaque. A C void * (an opaque pointer) therefore maps to a pointer to anyopaque, such as *anyopaque or *const anyopaque.

Hence, if the function we need to call had the following signature:

void store_void(const char *key, const void *value)

Then, calling it from Zig would need to look something like this:

test "can store data using void pointer" {
    const value = @ptrCast(*const anyopaque, "hi-value-void");

    c.store_void("hi", value);

    try tst.expectEqual(@as(?[*]const u8, "hi-value-void"), zigFetch("hi"));
}

Going from a C opaque pointer to a Zig typed value is a little bit harder. I had a lot of trouble trying to figure this one out, but the good fellas at Zig’s GitHub repo helped me out with this definition:

fn ptrToStr(ptr: *const anyopaque) [:0]const u8 {
    // in order to reconstruct the slice we use `std.mem.span` which uses
    // the fact it is zero terminated to basically `strlen` it
    return std.mem.span(@ptrCast([*:0]const u8, ptr));
}

Notice how Zig requires us to manually reconstruct the opaque data we receive from C. This is obviously unsafe, but when integrating with C directly like this, nothing would save us from that.
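
For comparison, a rough C counterpart of ptrToStr might look like the sketch below. The str_slice struct and the ptr_to_str name are made up for this illustration. As in the Zig version, the cast is a blind reinterpretation, and the length is recovered by scanning for the terminating zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer-plus-length pair, loosely similar to a Zig slice. */
typedef struct {
  const char *ptr;
  size_t len;
} str_slice;

/* Reinterprets an opaque pointer as a NUL-terminated string.
 * Nothing checks that the pointer really refers to a string:
 * passing anything else is undefined behaviour. */
str_slice ptr_to_str(const void *ptr) {
  assert(ptr != NULL); /* precondition added for this sketch */
  const char *s = (const char *)ptr;
  str_slice out = { s, strlen(s) };
  return out;
}
```

The C compiler accepts the cast without complaint, which is exactly the kind of trust Zig makes you spell out with @ptrCast.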

Revisiting the Zig build

I have been using the Zig CLI directly in the previous examples because using the Zig build system for them would be overkill.

However, on real-world projects, the Zig Build System can be extremely useful and maybe even help finally put to rest the messy, unergonomic build tools the C world has always been plagued with.

The build file generated by zig init-lib needs only some minor modifications to be able to compile the Zig and C code together… the project is currently looking like this:

▶ tree . -I zig-cache
.
├── build.zig
├── c
│   ├── add.c
│   └── storage.c
└── src
    ├── main.zig
    └── storage.zig

2 directories, 5 files

The Zig build file should look like this now:

const std = @import("std");

pub fn build(b: *std.build.Builder) void {
    // Standard release options allow the person running `zig build` to select
    // between Debug, ReleaseSafe, ReleaseFast, and ReleaseSmall.
    const mode = b.standardReleaseOptions();

    const lib = b.addStaticLibrary("zig", "src/main.zig");
    lib.setBuildMode(mode);
    lib.addIncludeDir("c");
    lib.install();

    const main_tests = b.addTest("src/main.zig");
    main_tests.addIncludeDir("c");

    const storage_tests = b.addTest("src/storage.zig");
    storage_tests.addIncludeDir("c");

    const test_step = b.step("test", "Run library tests");
    test_step.dependOn(&main_tests.step);
    test_step.dependOn(&storage_tests.step);
}

To run all tests:

▶ zig build test
All 2 tests passed.
All 1 tests passed.

To build the library:

▶ zig build

▶ tree -s zig-out
zig-out
└── [         96]  lib
    └── [     735856]  libzig.a

1 directory, 1 file

This seems to build a debug lib. If you look at the mode variable in the build file, you can see the comments about letting the person building the project choose the release option… that can be done like this:

▶ zig build -Drelease-small

▶ tree -s zig-out
zig-out
└── [         96]  lib
    └── [      23600]  libzig.a

1 directory, 1 file

You can also change the target platform to whatever you want, say, WASM!

Zig supports cross-compilation to a lot of different targets, and even to any version of the C standard lib you choose. There’s no need to install toolchains either: the Zig compiler is all you need. This is definitely one of the greatest advantages of using Zig even if you just want to stick with C for production code.

But first, let’s add a little exported function back to the Zig code at src/main.zig, otherwise the library would be empty:

pub export fn add(a: i32, b: i32) i32 {
    return a + b;
}

Zig requires exported functions to use the C calling convention so that they can be used from non-Zig code, so beware that the set of types that can be used in exported functions is more limited…

Now, for this to work, we also need to create a shared library rather than a static one, so change this line:

const lib = b.addStaticLibrary("zig", "src/main.zig");

To:

const lib = b.addSharedLibrary("zig", "src/main.zig", std.build.LibExeObjStep.SharedLibKind.unversioned);

Finally, run the build and inspect the generated WASM binary:

▶ zig build -Drelease-small -Dtarget=wasm32-wasi

▶ tree -s zig-out
zig-out
└── [         96]  lib
    └── [      12266]  zig.wasm

1 directory, 1 file

▶ wasm2wat zig-out/lib/zig.wasm
(module
  (type (;0;) (func (param i32 i32) (result i32)))
  (func $add (type 0) (param i32 i32) (result i32)
    local.get 1
    local.get 0
    i32.add)
  (memory (;0;) 1)
  (global (;0;) (mut i32) (i32.const 65536))
  (export "memory" (memory 0))
  (export "add" (func $add)))

Works perfectly!

You can learn more about the Zig Build System from zig build explained, a series of posts written by Felix “xq” Queißner.

There’s also Loris Cro’s Extend a C/C++ Project with Zig post which takes Redis’s massive code base and puts it into a Zig build!

Writing new code in Zig

At this point, I hope that I’ve convinced you that Zig has a lot to offer and that you’re tempted to use Zig for building and testing your legacy projects written in C.

But what if you want to go further and start using it for all your new production code as well?

Well, Zig is an easy-to-learn language, has pretty decent tooling thanks to its super compiler and the Zig Language Server (which powers IDE support on VSCode, Emacs and any editor supporting LSP), has pretty good ergonomics and comes with lots of safety checks (which you can disable for the final release, of course), all of which make it a very compelling choice for writing new code.

Just to give you a taste of it, have a look at what using Zig’s StringHashMap - to re-implement that terrible C API for data storage (which was based on C’s hsearch which is as bad as an API can get) - looks like.

test "can store data more easily in Zig HashMap" {
    const allocator = std.heap.c_allocator;
    var map = std.StringHashMap(i32).init(allocator);

    try expectEqual(@as(u32, 0), map.count());
    try map.put("one", 1);

    try expectEqual(@as(?i32, 1), map.get("one"));
    try expectEqual(@as(?i32, null), map.get("two"));
    try std.testing.expect(map.count() > 0);
}

This is pretty close to code you would write in a higher level language like JavaScript or Java, except for the explicit use of an allocator.

For more information about choosing which Map type to use, and how to work with Maps in general, have a look at Hexops’ devlog - Zig hashmaps explained.

But keep in mind that this is just because we’re using Zig’s comptime goodness to emulate generics, which in C would look like a mess of macros and untyped pointers. The generated machine code from Zig’s compiler would still look very similar to what the C compiler would’ve generated, so there’s no runtime penalty.
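
To make the comparison concrete, here is a rough sketch of the same "store an integer under a string key" exercise using POSIX hsearch directly. The table_init, table_put and table_get wrappers are hypothetical names invented for this example; only hcreate, hsearch and the ENTRY type come from search.h.

```c
#include <search.h>
#include <stddef.h>
#include <stdint.h>

/* hsearch manages one hidden, process-wide table with a fixed capacity. */
int table_init(size_t capacity) {
  return hcreate(capacity); /* nonzero on success */
}

/* Values are untyped: the int is smuggled through the void * data field.
 * The key pointer is stored, not copied, so it must outlive the table. */
int table_put(const char *key, int value) {
  ENTRY item;
  item.key = (char *)key; /* ENTRY wants a non-const key */
  item.data = (void *)(intptr_t)value;
  return hsearch(item, ENTER) != NULL;
}

/* Returns 1 and writes the value to *out if the key is present, else 0. */
int table_get(const char *key, int *out) {
  ENTRY query;
  query.key = (char *)key;
  query.data = NULL;
  ENTRY *found = hsearch(query, FIND);
  if (found == NULL) {
    return 0;
  }
  *out = (int)(intptr_t)found->data;
  return 1;
}
```

Note that there is only one table per process, no way to remove a single entry, and no type information about the stored values, all of which the Zig StringHashMap version avoids.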

About that allocator, here’s a simple example from the Zig docs about how you might use one:

var global_file_frame: anyframe = undefined;
fn readFile(allocator: Allocator, filename: []const u8) ![]u8 {
    _ = filename; // this is just an example, we don't actually do it!
    const result = try allocator.dupe(u8, "this is the file contents");
    errdefer allocator.free(result);
    suspend {
        global_file_frame = @frame();
    }
    std.debug.print("readFile returning\n", .{});
    return result;
}

This example also gives a taste of Zig’s async-await feature.
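
If you are coming from C, the explicit allocator parameter can be approximated with a struct of function pointers. The sketch below is only an analogy: the allocator struct, libc_allocator and dupe_str are names invented for this illustration and are not how Zig implements its Allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator interface: callers decide where memory comes from. */
typedef struct allocator {
  void *(*alloc)(void *ctx, size_t size);
  void (*release)(void *ctx, void *ptr);
  void *ctx; /* allocator-specific state, unused by the libc version */
} allocator;

static void *libc_alloc(void *ctx, size_t size) {
  (void)ctx;
  return malloc(size);
}

static void libc_release(void *ctx, void *ptr) {
  (void)ctx;
  free(ptr);
}

/* An allocator backed by malloc/free. */
allocator libc_allocator(void) {
  allocator a = { libc_alloc, libc_release, NULL };
  return a;
}

/* Copies src into memory obtained from the given allocator.
 * Returns NULL if the allocation fails; the caller releases the copy. */
char *dupe_str(const allocator *a, const char *src) {
  assert(a != NULL && src != NULL);
  size_t n = strlen(src) + 1;
  char *copy = a->alloc(a->ctx, n);
  if (copy != NULL) {
    memcpy(copy, src, n);
  }
  return copy;
}
```

Because the allocator is passed in, a test could swap in a counting or failing allocator without touching dupe_str, which is the main benefit of making allocation explicit.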

Conclusion

Using Zig has been a great experience overall. I was impressed with the fact that the compiler seems to be very fast and that it produces very fast, small binaries just like C, and with how well the cross-compilation works without any effort.

The integration with C is one of the best I’ve seen (though there are even nicer ones out there, but that’s another story), and the fact that it can compile C code alone or together with Zig code into virtually any target desired is a killer feature.

My main goal with this blog post was to learn whether I could use Zig to compile and test C code, and the answer is undoubtedly yes! I feel confident now to implement certain things in C as I know I can write good, comprehensive tests for it, in Zig… even though I may even question myself now whether I should be writing any C at all given the option to use Zig. The truth is that I was mostly writing C to learn low-level APIs like sockets, and to get a better understanding of what the system is really doing under all the abstractions other languages put on top of it.

I must add that it’s not just roses, however. Zig is very young, there are a lot of things that are not entirely finished yet and you might even run into some hairy compiler bugs if you’re unlucky. The documentation is definitely lacking. Finding out how to convert between different Zig/C types, for example, was a tough challenge (thanks to all the people on the GitHub repository and the #zig IRC channel for helping out!). The CLI, while very powerful, is barely documented at all. The comptime feature, while very nice, may not be as magical as it seems at first - I tried writing a test assertion library for Zig, for example, but it quickly became apparent that comptime is not enough for the task and the result would look too awkward to be useful.

There’s also the obvious fact that Zig is not yet stable and it seems a few existing features are likely to change in the near future.

But even taking all of that into consideration, it’s still hard to find anything better if the goal is to be as close to C as possible.

As I said in the introduction, Zig promises a lot, and it delivers.