What is the D Runtime, Anyway? — The Art of Machinery

D’s runtime is a recurring hot topic, but there’s obviously a lot of confusion about what the D runtime even is. I gave a quick explanation during my talk at DConf 2017, but I decided to write a blog post because I’ve seen confusion since then, and because I think blog posts are just a much better format for technical stuff, anyway.

What the D Runtime isn’t

First, let’s get some possible misconceptions out of the way really quickly.

It’s not a virtual machine like the JVM, or anything like the interpreters for Python or Lua.
It’s nothing like the event loop or scheduler in systems like Node or Go.
It’s not code running in a separate thread or process, and doesn’t pre-empt your D code.
It’s not fundamentally different from the runtime in C or C++ (it just has more stuff).

The Short Explanation of What it is

It’s a library. You can see it in all its glory in the official git repo. It’s primarily used by the compiler, though sometimes accessing it from application code (i.e., by importing things from core) can be useful, too. This library binary is called something like druntime.so, with the exact name and file extension depending on your compiler, your OS and whether you’re linking statically or dynamically. (DMD, for example, bundles the runtime into the same binary as the Phobos standard library.) Note that some D runtime code might also get inlined into your main program.

What’s it for?

The compiler’s job is to compile high-level code into low-level machine code. A simple compiler could just generate the same machine code for each D feature every time it’s used, in every program, but it makes more sense to put some of this code into a library, in exactly the same way that human programmers use libraries to avoid copy-pasting.

In my talk I gave this (non-exhaustive) list of D features that are supported by the D runtime library:

Garbage collection (GC)
Object (base class of all D classes)
Initialisation/cleanup of modules and static data
Associative arrays
Operations like struct equality and array copying
Threads and thread-local storage (TLS)
Runtime type information (TypeInfo)
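
As a rough illustration, here is a small D program that touches a few of the features in the list. Exactly which runtime functions the compiler calls for each line is an implementation detail that varies between compilers and versions, so treat the comments as a sketch rather than a specification. The names Point and runtimeFeaturesDemo are made up for this example.

```d
import std.stdio : writeln;

struct Point
{
    int x, y;
}

// Hypothetical helper that exercises a few runtime-backed features.
bool runtimeFeaturesDemo()
{
    // Associative arrays are implemented with help from the runtime library.
    int[string] counts;
    counts["a"] = 1;
    counts["a"] += 1;

    // Runtime type information: typeid yields a TypeInfo object.
    auto ti = typeid(int);

    // GC-backed allocation, followed by an array copy.
    int[] src = [1, 2, 3];
    int[] dst = new int[](3);
    dst[] = src[];

    // Struct equality compares the two values field by field.
    auto p = Point(1, 2);
    auto q = Point(1, 2);

    return counts["a"] == 2 && ti.toString() == "int" && dst == src && p == q;
}

void main()
{
    writeln(runtimeFeaturesDemo());
}
```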

A Library? Is that Really all it is?

Many garbage-collected languages have a concurrent GC that runs alongside the main code, so some people assume that D must have something running in the background to support its high-level features. This isn’t the case: D’s runtime really is just a library.

The standard D GC (and currently the only supported one) is like a malloc() on steroids. When you need a GC-backed memory allocation, the GC implementation either just gives it to you, or (if it can’t give it to you) it runs a collection in the hope that it can free up some memory to give you. The GC code doesn’t run otherwise (unless, of course, you explicitly run it using the interface in core.memory).
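
Here is a minimal sketch of driving the GC by hand through core.memory. GC.collect, GC.disable, GC.enable and GC.stats are part of that module; the helper name collectKeepsLiveData is made up for this example, and the amount of memory actually reclaimed will vary from run to run.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical helper: live data survives an explicit collection.
bool collectKeepsLiveData()
{
    auto live = new int[](4);   // GC-backed allocation that stays reachable
    live[] = 7;

    foreach (i; 0 .. 1_000)
    {
        auto garbage = new ubyte[](1024); // unreachable after each iteration
    }

    GC.collect();   // runs synchronously, on the calling thread
    return live == [7, 7, 7, 7];
}

void main()
{
    GC.disable();                    // ask the GC not to collect automatically
    auto buffer = new int[](100);    // allocation still works while disabled
    GC.enable();

    writeln(collectKeepsLiveData());
    writeln("bytes in use: ", GC.stats().usedSize, ", buffer length: ", buffer.length);
}
```

The point of the sketch is that the collection happens inside the GC.collect() call, on the thread that made the call, just like any other library function.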

The runtime is called the runtime because it supports the program while it’s running. It’s not a thing running automatically. If you want to see for yourself, you can test it with this example code:

long factorial(int n)
{
        long f = 1;
        int j;
        for (j = 1; j <= n; j++)
        {
                f *= j;
        }
        return f;
}

This code happens to be both valid C and D, as-is, so you can compile it in a C compiler and a D compiler and compare the outputs. Here’s a sample from DMD on x64 (with basic optimisations for cleaner code):

_D1t9factorialFiZl:
        push   rbp
        mov    rbp,rsp
        mov    rsi,rdi
        mov    edx,0x1
        mov    rcx,rdx
        cmp    esi,edx
        jl     .l0x23
.l0x13: movsxd rax,ecx
        imul   rax,rdx
        mov    rdx,rax
        inc    ecx
        cmp    ecx,esi
        jle    .l0x13
.l0x23: mov    rax,rdx
        pop    rbp
        ret

There’s no GC implementation buried in there, and there’s no call instruction to any GC implementation in a library. If you run this code and check your OS’s process/thread listing, you can see for yourself that there’s nothing else running. C-like D code compiles to much the same kind of thing as the C equivalent (although, as with C++, the D binary might have some extra stuff like runtime type information, depending on the compiler).
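
If you would rather not read assembly, one way to have the compiler check the "no GC" part for you is D’s @nogc attribute, which makes the compiler reject a function whose body could trigger a GC allocation. The sketch below is the same factorial function with that attribute (and a few others) added. Note that @nogc only checks GC use, not every possible dependency on the runtime.

```d
// The factorial function from above, annotated so that the compiler
// rejects it if the body could trigger a GC allocation.
long factorial(int n) @nogc nothrow pure @safe
{
    long f = 1;
    int j;
    for (j = 1; j <= n; j++)
    {
        f *= j;
    }
    return f;
}

// Uncommenting this function produces a compile error, because `new`
// allocates from the GC:
// int* leaky() @nogc { return new int(42); }

void main()
{
    assert(factorial(0) == 1);
    assert(factorial(5) == 120);
}
```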

Here’s a super-simple example that uses the D runtime by allocating GC-backed memory. (This time it only works in D, of course.)

int* foo()
{
        return new int(42);
}

Here’s some sample compiled output, this time from x64 ldc2:

_D1t3fooFZPi:
       push   rax
       mov    rdi,[_D10TypeInfo_i6__initZ]
       call   _d_allocmemoryT
       mov    [rax],0x2a  ; 42
       pop    rcx
       ret

Note the call to _d_allocmemoryT and the use of runtime type information.
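
You can also confirm from inside the program that the memory came from the GC. The following sketch uses GC.addrOf from core.memory, which returns null for pointers that don’t refer to GC-managed memory. The helper name isGcOwned is made up for this example.

```d
import core.memory : GC;

int* foo()
{
    return new int(42);
}

// Hypothetical helper: true if the pointer refers to GC-managed memory.
bool isGcOwned(void* p)
{
    return GC.addrOf(p) !is null;
}

void main()
{
    int onStack = 1;
    int* p = foo();

    assert(*p == 42);
    assert(isGcOwned(p));          // allocated by the runtime's GC
    assert(!isGcOwned(&onStack));  // stack memory is not GC-managed
}
```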

What about C++? Does C Really have a Runtime, too?

C and C++ implementations use a runtime in normal “hosted” programming environments (as opposed to runtime-less, “freestanding” environments like some embedded systems). Unlike the standard libraries, the runtime libraries are considered an implementation detail and aren’t standardised. The typical C++ runtime is a lot like the D runtime, but smaller, of course, because it doesn’t include things like GC or associative arrays.

C’s runtime is a little harder to explain because it’s mostly low-level stuff that’s taken for granted in other languages (usually because they’re based on C, themselves). One feature that’s easier to explain from a high level is program start/exit.

Anyone who’s written “hello world” in C knows that a C program starts in the main() function, and exits simply by returning from that function, but that’s not the whole story at the binary level. The exact details depend on the OS, but I’ll explain it for GNU/Linux — other platforms are similar. The true entry point to the program is _start, which is C runtime code that does various low-level initialisation, like preparing the stack and environment variables, before calling main() just like any other C function. At the end of the program, main() returns an integral status code (0 for okay), which the runtime stores. Now the runtime has the job of doing low-level cleanup, which includes running any application cleanup handlers that were registered using the atexit() function. When that’s all done, the runtime sends the stored return value from main() to the OS kernel using the exit system call, which causes the program process to be shut down.
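
To watch the exit sequence in action, here is a small sketch that registers a cleanup handler with atexit(). It’s written in D rather than C, using the C standard library bindings in core.stdc, which works because D programs link against the C runtime. The names cleanup and registerCleanup are made up for this example.

```d
import core.stdc.stdio : puts;
import core.stdc.stdlib : atexit;

// Cleanup handler; the C runtime calls it during exit, after main() returns.
extern (C) void cleanup() nothrow @nogc
{
    puts("cleanup handler running");
}

// Hypothetical helper: registers the handler, returns true on success.
bool registerCleanup()
{
    return atexit(&cleanup) == 0;
}

void main()
{
    registerCleanup();
    puts("leaving main");
}
```

The handler’s message should appear after "leaving main", because the handler runs as part of the runtime’s cleanup rather than as part of main().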

Actually, the current implementation of D piggybacks on C’s runtime. The main() function in D code gets renamed to _Dmain() in the compiled binary, and the D compiler injects a C-style main() that does D’s initialisation before calling _Dmain().
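
One visible effect of that initialisation step is that module constructors have already run by the time your D main() starts. Here is a minimal sketch; the flag runtimeInitialised is made up for this example.

```d
import std.stdio : writeln;

// Hypothetical flag for this example only.
__gshared bool runtimeInitialised = false;

// Module constructor: run by D's runtime initialisation, before D's main().
shared static this()
{
    runtimeInitialised = true;
}

// Module destructor: run by the runtime's cleanup, after main() returns.
shared static ~this()
{
    writeln("module destructor");
}

void main()
{
    assert(runtimeInitialised);
    writeln("in main");
}
```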

By the way, don’t confuse the exit system call with the exit() C standard library function. (System calls and standard library functions are totally different things.) The exit() function is normal code that ultimately leads to the cleanup/exit code in the C runtime. The exit system call is a direct “kill me now” request to the OS kernel (not to be confused with a Unix signal).

Finding out More

I hope this post was a useful overview of what the D runtime is. If you want to know more, you can browse the source code, or take a look at anything in core in the standard library docs. Most D code doesn’t need to care about the details of the runtime, but some stuff in core is useful if you’re doing something very low-level, or need to squeeze a bit more performance out of a program.