The D programming language has a bunch of built-in attributes like pure and nothrow. I was wondering how things like libraries might break if function attributes changed between versions, so I gave it a try.
The Problem
Let’s say I have this library code:
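The original listing is not reproduced here. As a minimal sketch of what version 0.1 could look like, assuming only what the rest of the article states (a module called thing, a foo that takes no arguments and returns a string, and a bar that returns auto), the function bodies and bar's parameter below are hypothetical placeholders:

```d
module thing;

// Takes no arguments and returns a string, which is what the symbol
// _D5thing3fooFZAya quoted later in the article encodes.
string foo()
{
    return "foo"; // placeholder body, not from the article
}

// Only the use of `auto` comes from the article; the parameter and
// the body are made up for illustration.
auto bar(int x)
{
    return x * 2;
}
```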
And I make a shared library out of it:
Then I write a program using this useful library:
And compile it, dynamically linking to the thing library:
So far, so good. But now I write a new version of the library that takes advantage of D’s function attributes:
I compile this to make a new version of the shared library:
What Went Wrong?
This is an example of a problem often called ABI instability (“ABI” referring to the Application Binary Interface – the interface one binary file, such as the library, uses to interact with another, such as the executable). It’s a problem for all languages that use dynamic linking or loading (here’s an in-depth guide for C++). It’s well known that changing a library’s API can break third-party code, and this is just the same problem from a lower-level perspective. Just to be clear, changing things that are only used internally by the library wouldn’t cause problems at either level.
Most D projects in 2016 are compiled all at once and use static linking, so this kind of problem doesn’t happen much. However, like some other D programmers, I think dynamic linking and binary compatibility are going to matter. I want to install D libraries as system libraries and use D in very large projects. That’s why I tried this experiment.
Running app again failed with a symbol lookup error.
Because app is dynamically linked, it doesn’t contain the definitions of functions like foo and bar that it uses from the thing library. The compiler inserts the names (called “symbols”) of the missing things into the executable, so they can be looked up when needed. The symbol name _D5thing3fooFZAya is a little cryptic, but you can probably see that app failed to find foo from the thing library.
Let’s look at what symbols are available from version 0.2 of the thing library:
Most of these symbols are part of things like the D and C runtimes, but the fifth and sixth entries are symbols for the two functions exported by the thing library. It looks like the foo function now has the symbol _D5thing3fooFNaNbNiZAya, which doesn’t match the symbol app was looking for: _D5thing3fooFZAya.
By the way, this weird naming is what’s called “mangling”. In the simpler world of C, a function called foo would get exported as the symbol foo. But in more complex languages like D and C++, it’s possible to have multiple functions called foo as long as they’re in different modules or namespaces, or have different argument types. Mangling is the process of encoding information into the symbol name so that the right implementation can be found.
If you look at the spec for D’s mangling scheme, you might notice that it includes function attributes. The new symbol for foo includes NaNbNi, which means pure, nothrow and @nogc. app was compiled against the old version of the library with the old mangled name for foo, but the new library has a different name.
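The effect of attributes on the symbol name can be checked from inside D with the built-in .mangleof property. The following is a small sketch rather than the library's actual source: the module name and the signature of foo follow the article, while the function body is a placeholder.

```d
module thing;

import std.stdio : writeln;

// Version 0.2 style declaration; the body is a placeholder.
string foo() pure nothrow @nogc
{
    return "foo";
}

void main()
{
    // .mangleof yields the symbol name the linker will see.
    // Prints _D5thing3fooFNaNbNiZAya, the version 0.2 symbol quoted above.
    writeln(foo.mangleof);
}
```

Removing the three attributes from the declaration and running it again should print the old name, _D5thing3fooFZAya.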
This is a nuisance, but it’s better than the situation in C/C++. For C++ this stuff isn’t standardised, and getting a linker error like the one from app is the lucky case; otherwise your program will try to run and may be buggy or unstable because of incompatibilities. C has no mangling of type information, so changing the types of exported symbols is just plain unsafe.
How Can This be Fixed?
So, the new version of the library broke binary compatibility. The simplest fix is to just recompile app. If everything’s compiled together against the same source code, there shouldn’t be any compatibility problems. Obviously this might defeat the purpose of having separate binaries, though.
Another option is to add the missing symbol as an alias of the new symbol when building the shared library:
The symbol aliases can be put into a separate file:
It would be nice to have a way to solve the problem inside the library itself, though. D doesn’t have an equivalent to GCC’s alias attribute, but it’s possible to override the name mangling of a symbol:
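The original listing is missing here. A minimal sketch of the idea, assuming it was done with pragma(mangle): the new foo keeps its attributes, and a small wrapper is exported under the old symbol name. The wrapper's name fooCompatV01 and both function bodies are hypothetical.

```d
module thing;

// Version 0.2 foo, exported as _D5thing3fooFNaNbNiZAya.
string foo() pure nothrow @nogc
{
    return "foo"; // placeholder body
}

// Compatibility shim for executables built against version 0.1.
// pragma(mangle) overrides the symbol name, so this function is
// exported as the old, attribute-free name and forwards to foo.
pragma(mangle, "_D5thing3fooFZAya")
string fooCompatV01()
{
    return foo();
}
```

The wrapper carries no attributes, matching what old callers were compiled to expect, and calling the stricter foo from it is accepted by the compiler.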
Now the new library version works with old executables even when compiled normally:
The standard library has mangling functions that can help with getting the right mangled names.
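One such helper is mangleFunc in core.demangle (part of druntime); whether it is the function meant here is an assumption. As a sketch, it takes a function pointer type and a fully qualified name and returns the mangled symbol, so the strings do not have to be written by hand:

```d
import core.demangle : mangleFunc;
import std.stdio : writeln;

void main()
{
    // Old signature: no attributes.
    // Prints _D5thing3fooFZAya, the symbol app was looking for.
    writeln(mangleFunc!(string function())("thing.foo"));

    // New signature: pure nothrow @nogc.
    // Prints _D5thing3fooFNaNbNiZAya, the version 0.2 symbol.
    writeln(mangleFunc!(string function() pure nothrow @nogc)("thing.foo"));
}
```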
Doing it this way adds an extra hop in calling foo for old executables. In theory, a compliant D compiler is allowed to automatically implement the alias, first by inlining the function, then by deduping the identical function bodies. I’m not expecting to rely on that happening any time soon.
A major advantage over the linker hackery is that it maintains D’s type safety. The above code compiles because it’s safe to call a pure function even if you don’t expect it to be pure. If version 0.2 took away attributes, the same trick wouldn’t compile, and that’s a good thing because it wouldn’t be safe.
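The asymmetry can be seen in a small, self-contained sketch. All names here are hypothetical; the point is only that a wrapper with weaker guarantees may call a function with stronger ones, while the reverse is rejected at compile time.

```d
int counter; // mutable module-level state (thread-local by default in D)

int pureFn() pure { return 1; }
int impureFn() { return ++counter; }

// Accepted: a wrapper that promises nothing may call a pure function.
int weakWrapper() { return pureFn(); }

// Rejected: a wrapper that promises purity cannot call an impure
// function, so this function literal does not compile.
static assert(!__traits(compiles, () pure { return impureFn(); }));

void main()
{
    assert(weakWrapper() == 1);
}
```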
auto can be Hazardous
On that note, what if I refactor bar in version 0.3? The answer is that I break binary compatibility of bar:
The gotcha is that functions that return auto have function attributes inferred automatically. The original function was pure, but my (dubious) refactoring to use mutable global state took that away. This time it isn’t just a naming problem. Code that was compiled under the promise of a pure bar isn’t guaranteed to work with the new version of bar, even if the symbols are forced to match. This is why attributes are a part of the mangling spec in the first place.
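The inference can be observed directly. In this sketch, barOld and barNew are hypothetical stand-ins for the two versions of bar (the real bodies are not shown in this article); the only difference between them is that the second touches a mutable global.

```d
import std.stdio : writeln;

int callCount; // hypothetical mutable global state

// Stand-in for the earlier bar: nothing impure, so pure is inferred.
auto barOld(int x) { return x * 2; }

// Stand-in for the refactored bar: touching the global loses pure.
auto barNew(int x) { ++callCount; return x * 2; }

// Only the first can be called from a pure context.
static assert( __traits(compiles, () pure { return barOld(1); }));
static assert(!__traits(compiles, () pure { return barNew(1); }));

void main()
{
    // Print the inferred function types, attributes included.
    writeln(typeof(&barOld).stringof);
    writeln(typeof(&barNew).stringof);
}
```

Nothing in either signature mentions pure, yet the two functions have different types, and therefore different mangled names.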
Summary
- Like with all libraries in all languages, functions in a public API need extra care and attention
- Adding attributes to these functions is safe if you do something about binary compatibility
- Removing attributes isn’t safe. Creating a new function with a new name might be the only safe option
- As a corollary, being conservative with attributes in early versions makes sense
- Using auto in a public API isn’t a good idea (it never was because it makes the function body a part of the API)