What Is FFI? How Do Languages Communicate?
Programming language design has taken many different paths over the years. We have C-like languages that are compiled all the way down to machine code, but we also have interpreted languages, lazy languages, languages that compile to a portable bytecode, garbage collected languages, and so much more.
As software engineers we do not want to duplicate work unnecessarily. So what happens when work exists in a language other than the one you’re using?
When developing in Python, for example, we don’t want to port all the code we need to Python just to use it. Instead, we rely on infrastructure that lets us call functions and use types from other programming languages.
This infrastructure, which bridges different type systems (or lack of type systems), execution models, and code-organization concepts, is called a Foreign Function Interface (abbreviated FFI from here on out). Most modern languages have some way to interoperate with others through FFI. The exact syntax and options vary by language, but they almost always have one thing in common: they represent functions as if they were C functions.
This representation is called the “C ABI” (C Application Binary Interface) or “C calling convention”. As the name suggests, it is a convention for how to call functions: which CPU registers hold which arguments, which register(s) hold return values, when to spill to the stack, and so on [1]. Note that this is just a convention the industry settled on over time. The C ABI is predictable, simple, and has been stable for decades, which is why it became the de facto standard for language interoperability.
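To make the idea of a calling convention a little more tangible, here is a small sketch in Rust. The names `add_c` and `add_rust` are made up for illustration. It only shows that the calling convention is part of a function's type. It does not show which registers are used, since that depends on the target platform.

```rust
// A Rust function that uses the platform's C calling convention.
extern "C" fn add_c(a: i32, b: i32) -> i32 {
    a + b
}

// The same logic using Rust's default calling convention,
// which is unspecified and may change between compiler versions.
fn add_rust(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // The ABI is part of the function pointer type: these two types are
    // distinct and cannot be assigned to one another.
    let c_abi: extern "C" fn(i32, i32) -> i32 = add_c;
    let rust_abi: fn(i32, i32) -> i32 = add_rust;

    println!("{} {}", c_abi(2, 3), rust_abi(2, 3));
}
```

Only the `extern "C"` variant is something another language can reliably call, because only its argument and return value passing follows a documented convention.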
Calling an FFI Function
Let’s say, for example, that we have a Rust program that needs to call the time
function from libc (a C static library) [2]. We would use the following
construct:
unsafe extern "C" {
    // `time_t` is assumed to be in scope, e.g. via `use libc::time_t;`
    fn time(time: *mut time_t) -> time_t;
}
Before we dissect the syntax though: you may have already noticed that nowhere
in this snippet do we ask for libc. Why is that?
This is because object file formats (ELF on Linux, Mach-O on macOS, PE on
Windows) are all quite old and therefore quite simplistic. They have one global
namespace called a Symbol Table that all functions (and statics) share. So
you cannot say “call time from libc”, you can only say “call a function
named time”.
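The fact that a binding refers to a bare symbol name, and nothing else, can be sketched as follows. This example assumes a platform whose C library exports `abs` (true for the usual Unix-like targets). The Rust-side names `c_abs` and `absolute` are made up for illustration.

```rust
use std::ffi::c_int;

unsafe extern "C" {
    // The Rust-side name is ours to choose. `link_name` is the string the
    // linker will look up in the global symbol table. Nothing here says
    // which library the symbol comes from.
    #[link_name = "abs"]
    fn c_abs(value: c_int) -> c_int;
}

fn absolute(value: i32) -> i32 {
    // SAFETY: `abs` takes and returns a C `int`. Callers should avoid
    // `i32::MIN`, because `abs(INT_MIN)` is undefined behavior in C.
    unsafe { c_abs(value) }
}

fn main() {
    println!("{}", absolute(-7));
}
```

If two libraries passed to the linker both defined a symbol named `abs`, this declaration would have no way to choose between them.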
When a compiler builds a program, all function calls reference the function by
symbol (“call function named X”). To make this actually executable, we need
to replace every symbol reference with the actual address of the function. This
happens after compilation during the linking step, where a separate program
called the Linker will aggregate all object files that make up your final
program, lay them out on disk and then resolve these symbol names.
To call time from libc we therefore need Rust to emit a reference to the
time symbol and make sure the libc file is also passed to the linker.
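In Rust, one way to make sure a library is passed to the linker is the `#[link]` attribute on the extern block. The sketch below uses the C math library instead of libc, because libc is typically already linked by the Rust standard library. It assumes a Unix-like target where a library named `m` exists. The wrapper name `cosine` is made up for illustration.

```rust
use std::ffi::c_double;

// Ask rustc to pass `-lm` (or the platform equivalent) to the linker,
// so that the `cos` symbol referenced below can be resolved.
#[link(name = "m")]
unsafe extern "C" {
    fn cos(x: c_double) -> c_double;
}

fn cosine(x: f64) -> f64 {
    // SAFETY: `cos` takes a `double` by value and has no other preconditions.
    unsafe { cos(x) }
}

fn main() {
    println!("cos(0) = {}", cosine(0.0));
}
```

Build scripts and linker flags are other common ways to achieve the same result; which one fits best depends on the project.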
This solves the problem of what to call, but we still need to figure out how to call that function: How many parameters does the function accept? What are the types of these parameters? What, if anything, does it return? Remember that the symbol references above are plain string names. They do not carry any information about the function’s argument types or return type, so rustc has no way of figuring out the signature by itself.
This is why we need to tell it about time’s signature through a so-called
“extern block” (or sometimes an “extern C block”) above. This block declares
items that are not defined in the current crate. Each item we declare is a
promise to the compiler: “this is the correct signature of this symbol, trust
me”. We commonly refer to it as a binding.
Get the signature wrong and your Rust program will pass garbage to the FFI
function without any way to check this at compile-time. The exact implementation
won’t be known until link-time, much later than the compiler’s type-checking
pass. This is why bindings are marked unsafe: you as the programmer have to
ensure signatures are correct.
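Putting the pieces together, here is a minimal runnable sketch of calling time through such a binding. It makes one loud assumption: `time_t` is declared here as a 64-bit signed integer, which matches common 64-bit Linux and macOS targets but is not guaranteed everywhere. Real projects usually take the type from the `libc` crate instead. The wrapper name `unix_time_now` is made up for illustration.

```rust
use std::ptr;

// ASSUMPTION: `time_t` is a 64-bit signed integer on this target.
// If this is wrong, the binding below is wrong, and the compiler
// cannot detect it.
#[allow(non_camel_case_types)]
type time_t = i64;

unsafe extern "C" {
    // Our promise to the compiler about the signature of the `time` symbol.
    fn time(time: *mut time_t) -> time_t;
}

fn unix_time_now() -> time_t {
    // SAFETY: `time` accepts a null pointer, in which case it only
    // returns the current time and writes nothing.
    unsafe { time(ptr::null_mut()) }
}

fn main() {
    println!("seconds since the Unix epoch: {}", unix_time_now());
}
```

Note that the call itself needs an `unsafe` block: the compiler is taking our word for the signature, so the responsibility for getting it right is ours.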
This is the foundation of all Foreign Function Interfaces in Rust. In later exercises we will see how to make this much safer and more ergonomic.
Head to the exercise, where you’ll write this block for a bm_add function
implemented in C.
Exercise
The exercise for this section is located in 01_intro/01_what_is_ffi
[1] Technically, there is no “single” calling convention. Every architecture defines its own “C calling convention”. For example, the RISC-V C calling convention has its own specification, which lays out the sizes of C primitives and how arguments and return values are passed to and from functions. x86 architectures have many different calling conventions: the Microsoft x64 calling convention and the System V ABI are the most common, but others exist, e.g. to improve calling performance (fastcall, regcall, and more). When we say “the C calling convention” we usually mean “the C calling convention commonly used on this OS+architecture combination”.
[2] Yes, generally speaking libc is distributed not as a static but as a dynamic library, which is a completely different way of linking and calling functions. We’ll cover this in detail in a later chapter.