#What’s FFI and/or C interoperability?
Before the real blogpost starts, we should get to know what FFI is and why it is important, and what better example to use than (bear with me for a second) machine learning Python libraries.
Have you ever wondered why the majority of AI applications are written in Python? It is definitely easy to use, but one of the world’s slowest mainstream languages might seem like a poor fit for resource-intensive mathematical calculations. That’s why most popular libraries geared towards data analysis use a little trick called “not actually being written in Python, but in some other, more performant language”.
FFI, or Foreign Function Interface, allows code written in language A to be called from language B. In practice, language A is almost always something that can compile to a C-compatible shared library, for example C, C++, Rust or Zig.
Whether it’s for performance gains or because you simply wish to use a library that doesn’t have an alternative in your language of choice, you may find yourself figuring out things like “how the fuck do I load this DLL into my Lua script”.
#Zig and C
Most other languages also have some form of interop with C, but it usually involves writing glue code: re-declaring functions in the other language, making them match the declarations in the foreign library, and then of course somehow linking it all together. Zig, on the other hand, can compile, link and also include C code and headers. Using imported C functions is as simple as using any other Zig function.
It is as easy as[1]:
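A minimal sketch of the idea (not the post's original listing), assuming libc is linked and using `abs` from `stdlib.h` purely as an illustration:

```zig
const std = @import("std");

// Translate the C header at compile time; its declarations become
// available under the `c` namespace.
const c = @cImport({
    @cInclude("stdlib.h");
});

pub fn main() void {
    // `abs` is the plain C function from <stdlib.h>, called like any Zig function.
    const result = c.abs(-42);
    std.debug.print("abs(-42) = {d}\n", .{result});
}
```

Something like `zig run example.zig -lc` should build and run it; the file name is a placeholder.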
Armed with this newfound knowledge you rush out, get yourself a copy of your favorite library, and begin to hack on some new cool project, no longer constrained by pesky language boundaries and… oh no…
#Void pointers, type conversions and callbacks
Zig understands C, but void pointers and callbacks are not fun to deal with. Let’s say you want to use libuv; it may look something like this:
The example isn’t representative of real world use of libuv, but it gives us something simple with a callback and nullable pointers. Inside the code we create a libuv event loop, initialize a timer and let it run a callback with a given context every 0.1 seconds. As far as FFI goes this is amazing: you get to call C functions as if they were native and it just works.
However, if you look closer at the counter method, there are a few things I dislike.
Any pointer passed from C will be optional. Even though the library promises to always pass a valid pointer to our callback, the type system cannot know that.
Callbacks usually require context, and this context comes in the form of void pointers; in this example it is the data field on uv_timer_t. Using @ptrCast(@alignCast(...)) throws away any type safety that Zig offers us.
In a simple example like this it hardly matters, but the more complex a codebase becomes, the easier it is to introduce undefined behaviour as a result of misusing a void pointer, or unwrapping a null value whilst assuming it cannot be null.
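Both complaints can be reproduced without libuv. The sketch below is not the post's listing: `RawTimer`, `RawCallback` and `fire` are hypothetical stand-ins for a C handle, its callback type and the library invoking that callback.

```zig
const std = @import("std");

// Hypothetical stand-in for a C handle such as uv_timer_t: all that matters
// here is the untyped `data` slot.
const RawTimer = extern struct {
    data: ?*anyopaque = null,
};

// Shape of a typical C callback: as far as the type system is concerned,
// the handle pointer may be null.
const RawCallback = *const fn (?*RawTimer) void;

// Stand-in for the C library invoking our callback.
fn fire(timer: *RawTimer, cb: RawCallback) void {
    cb(timer);
}

fn counter(handle: ?*RawTimer) void {
    // Pain point 1: unwrap an optional that we "know" is never null.
    const timer = handle.?;
    // Pain point 2: nothing checks that `data` really points at a u32.
    const count: *u32 = @ptrCast(@alignCast(timer.data.?));
    count.* += 1;
}

pub fn main() void {
    var count: u32 = 0;
    var timer = RawTimer{ .data = &count };
    fire(&timer, counter);
    fire(&timer, counter);
    std.debug.print("count = {d}\n", .{count});
}
```

If `data` had been set to a pointer of any other type, this would still compile, which is exactly the problem.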
#Making a type-safe wrapper
What if, instead of the previous example, we had something similar, but with Zig types and Zig callbacks?
This might come as a surprise, but the types here are bit for bit identical with their C counterparts and even the compiled binary is very similar to the one created from the first example.
The definition of uv.TimerTyped starts like this:
All structures representing libuv handles look similar: they contain only a single field of the type they are meant to represent and are extern. What this gives us is a structure with the exact same size and alignment as its C counterpart.
usingnamespace is used here to allow structs to access common functions, such as close for all handles.
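The layout claim is easy to check at compile time. This is a sketch rather than the post's definition: `RawTimer` is a hypothetical stand-in for uv_timer_t, and the shared-function mixin is left out so the example only covers the single-field extern struct.

```zig
const std = @import("std");

// Hypothetical stand-in for the C type uv_timer_t.
const RawTimer = extern struct {
    data: ?*anyopaque = null,
    timeout: u64 = 0,
};

// Sketch of a typed handle: a single field of the C type, marked extern.
// `T` records what the `data` slot is supposed to point at.
fn TimerTyped(comptime T: type) type {
    return extern struct {
        handle: RawTimer = .{},

        pub const Data = T;
    };
}

comptime {
    // Wrapping the C type changes neither its size nor its alignment.
    std.debug.assert(@sizeOf(TimerTyped(u32)) == @sizeOf(RawTimer));
    std.debug.assert(@alignOf(TimerTyped(u32)) == @alignOf(RawTimer));
}

pub fn main() void {
    std.debug.print("raw: {d} bytes, typed: {d} bytes\n", .{
        @sizeOf(RawTimer),
        @sizeOf(TimerTyped(u32)),
    });
}
```

Because the checks live in a `comptime` block, a layout mismatch would fail the build instead of surfacing at runtime.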
#Casting
Now for the Cast function.
Each of these calls exists to eliminate pointer casts on the user’s side. The functions all do the same thing; they only serve as a barrier that prevents the wrong types from being passed to or returned by casts. Since they are inlined, they also don’t add any size or performance cost to the program.[2]
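As a rough illustration of that barrier, here is a pair of cast helpers for one concrete handle type. The names `toRaw` and `fromRaw` are hypothetical and this is not the post's generic Cast function; `RawTimer` again stands in for the C type.

```zig
const std = @import("std");

// Hypothetical stand-in for a C handle type such as uv_timer_t.
const RawTimer = extern struct {
    data: ?*anyopaque = null,
};

// Typed wrapper with the same layout as RawTimer.
const Timer = extern struct {
    handle: RawTimer = .{},
};

// The signatures are the whole point: only the matching pointer types are
// accepted and returned, so a wrong cast becomes a compile error.
inline fn toRaw(timer: *Timer) *RawTimer {
    return @ptrCast(timer);
}

inline fn fromRaw(raw: *RawTimer) *Timer {
    return @ptrCast(raw);
}

pub fn main() void {
    var timer = Timer{};
    const raw = toRaw(&timer);
    const back = fromRaw(raw);
    // The round trip yields the original address; no data is copied.
    std.debug.print("round trip ok: {}\n", .{back == &timer});
}
```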
The notable thing here is the close method, which takes a pointer of type Self and a callback to invoke after closing. uv_close, however, takes a generic *uv_handle_t and not a specific handle type. That’s why we use @ptrCast(self) to turn our pointer into a generic handle. As the second argument we create an anonymous function[3] that takes a handle, casts it to our Zig type and passes it to the callback.
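The trampoline pattern described above might look roughly like this. It is a self-contained sketch under assumptions: `RawHandle` and `rawClose` are hypothetical stand-ins for uv_handle_t and uv_close, and the stand-in calls the callback immediately instead of going through an event loop.

```zig
const std = @import("std");

// Hypothetical stand-ins for uv_handle_t and the uv_close callback type.
const RawHandle = extern struct {
    data: ?*anyopaque = null,
};
const RawCloseCb = *const fn (?*RawHandle) void;

// Stand-in for uv_close: "closes" the handle, then invokes the callback.
fn rawClose(handle: *RawHandle, cb: RawCloseCb) void {
    cb(handle);
}

const Timer = extern struct {
    handle: RawHandle = .{},

    const Self = @This();

    pub fn close(self: *Self, comptime callback: fn (*Self) void) void {
        // A function inside an anonymous struct serves as the "anonymous
        // function": it receives the generic handle, casts it back to the
        // Zig type and forwards it to the typed callback.
        rawClose(@ptrCast(self), struct {
            fn trampoline(raw: ?*RawHandle) void {
                const typed: *Self = @ptrCast(raw.?);
                callback(typed);
            }
        }.trampoline);
    }
};

var closed_count: u32 = 0;

fn onClose(_: *Timer) void {
    closed_count += 1;
}

pub fn main() void {
    var timer = Timer{};
    timer.close(onClose);
    std.debug.print("closed {d} time(s)\n", .{closed_count});
}
```

Taking the callback as a `comptime` parameter is what lets the inner function refer to it, since Zig functions cannot capture runtime values.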
#“Implementing” the functions
I use the word “implement” very loosely here, as most of the functions are very simple wrappers that are fairly easy to write. Let’s go ahead and define all the functions used in the example with the timer.
All the methods are very simple and probably compile down to no-ops, since they mostly just perform casts. Most importantly, all pointer shenanigans happen inside them, directly inside the struct, which leaves very little room for error.
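To show how thin such methods can be, here is a self-contained sketch of a typed timer with setData, getData and start. It is not the post's implementation: `RawTimer` and `rawTimerStart` are hypothetical stand-ins for uv_timer_t and uv_timer_start, and the stand-in simply fires the callback a fixed number of times instead of running an event loop.

```zig
const std = @import("std");

// Hypothetical stand-ins for uv_timer_t and its callback type.
const RawTimer = extern struct {
    data: ?*anyopaque = null,
};
const RawTimerCb = *const fn (?*RawTimer) void;

// Stand-in for uv_timer_start plus the event loop: fires `ticks` times.
fn rawTimerStart(timer: *RawTimer, cb: RawTimerCb, ticks: u32) void {
    var i: u32 = 0;
    while (i < ticks) : (i += 1) cb(timer);
}

fn TimerTyped(comptime T: type) type {
    return extern struct {
        handle: RawTimer = .{},

        const Self = @This();

        pub fn setData(self: *Self, data: *T) void {
            self.handle.data = data;
        }

        // The only place where the void pointer is cast back to *T.
        pub fn getData(self: *Self) *T {
            return @ptrCast(@alignCast(self.handle.data.?));
        }

        pub fn start(self: *Self, comptime callback: fn (*Self) void, ticks: u32) void {
            rawTimerStart(&self.handle, struct {
                fn trampoline(raw: ?*RawTimer) void {
                    const typed: *Self = @ptrCast(raw.?);
                    callback(typed);
                }
            }.trampoline, ticks);
        }
    };
}

const CounterTimer = TimerTyped(u32);

// The user-facing callback: no optionals, no void pointers.
fn counter(timer: *CounterTimer) void {
    timer.getData().* += 1;
}

pub fn main() void {
    var count: u32 = 0;
    var timer = CounterTimer{};
    timer.setData(&count);
    timer.start(counter, 3);
    std.debug.print("count = {d}\n", .{count});
}
```

With this shape, passing a pointer of the wrong type to setData, or a callback expecting a different handle type to start, is rejected by the compiler.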
#Conclusion
I find this the most elegant solution to dealing with C types in a foreign language. No complex wrappers, no allocations and most importantly no void pointers. All at the small cost of writing a few very straightforward wrappers.
You can find the code used in this post here.
The two examples are in counter1.zig and counter2.zig respectively. Both of them can be compiled via zig build-exe -lc -luv <FILE>.
[1] You also need to link libc.
[2] The real implementation is a bit more complex and uses some comptime type resolution to work for both const and non-const pointers at the same time.
[3] Zig doesn’t truly have anonymous functions, but does have anonymous structs which can have functions, see this.