D for Bare Metal Programming — The Art of Machinery

This post is somewhat outdated. It’s still relevant, but some of the problems I discussed have already been fixed.

Previously I talked about booting a PC directly to bare metal D and said that Hello World is never a strong test of a programming environment. To get a better feel for what D is really like on bare metal, I wrote Xanthe, a simple, classic-style vertical scrolling shooter game with no dependencies on either the D or C runtime.

Some years back I used to write firmware in C for industrial metering systems based on microcontrollers like the MSP430. These computing environments are extremely restricted, and a lot of stuff we take for granted on modern desktop machines just doesn’t work. For example, a lot of idioms of modern high-level programming depend on dynamic allocation, so a different programming style is needed when you don’t even have malloc().

There’s no real need to program like this when developing on a modern PC with gigabytes of RAM, but I decided that if I’m going to write something like Xanthe, I might as well test some embedded programming techniques in D. The TL;DR is that although some special D features don’t work (even though they should), it’s almost always possible to fall back to using embedded C idioms instead. The one exception I found was with C preprocessor macros. I think it’s fixable, but more on that later in the article.

The Development Environment

Xanthe works pretty much like the Hello World example in that previous post. ~~In particular, it uses linker hacking to remove runtime dependencies. I’m hoping that one day this won’t be necessary.~~ That day has come; Xanthe now works without any linker hacking.

For normal embedded development, I’d want a cross-compilation toolchain that links in ported versions of the standard libraries. Of course, it wasn’t necessary in this case because I’m not linking in the standard libraries at all, and I’m writing for x86 on x86. Still, porting the D standard libraries to bare metal using a libc like Newlib sounds like an interesting project for another day.

I wrote a simple freelist-based allocator for game entities, but other than that everything’s allocated statically (or on the stack).
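For readers who haven’t seen the technique, here’s a minimal sketch of a freelist pool over a statically sized array. It isn’t Xanthe’s actual allocator: the names `Entity`, `EntityPool`, `initialize`, `allocate` and `release` are made up for illustration, and the entity fields are placeholders.

```d
// Hypothetical entity type; `next` links free slots together.
struct Entity
{
    int x, y;
    Entity* next;
}

// Fixed-capacity pool: all storage lives inside the struct itself,
// so a global instance needs no heap and no runtime support.
struct EntityPool(size_t capacity)
{
    private Entity[capacity] slots;
    private Entity* freeHead;

    // Thread every slot onto the free list. Must be called once before use,
    // and the pool must not be moved or copied afterwards (it points into itself).
    void initialize() @nogc nothrow
    {
        freeHead = null;
        foreach_reverse (ref slot; slots)
        {
            slot.next = freeHead;
            freeHead = &slot;
        }
    }

    // Pop a slot off the free list; returns null when the pool is exhausted.
    Entity* allocate() @nogc nothrow
    {
        if (freeHead is null)
            return null;
        Entity* e = freeHead;
        freeHead = e.next;
        *e = Entity.init; // hand back a cleanly reset entity
        return e;
    }

    // Push a slot back onto the free list.
    void release(Entity* e) @nogc nothrow
    {
        e.next = freeHead;
        freeHead = e;
    }
}

// A statically allocated pool, as one might declare it for a game.
__gshared EntityPool!64 entityPool;
```

Allocation and release are both constant time, and running out of slots is an ordinary `null` return rather than an exception, which suits code that has to be `nothrow`.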

Xanthe actually has three backends. At one time I thought it would be a neat experiment to port Xanthe to more platforms. However, writing portable code doesn’t mix with writing like I’m programming a microcontroller, so I gave up on that. Xanthe runs on the BIOS bootloader from the previous article, as well as normal desktops using libSDL (handy for developing the main code). I wanted to see Xanthe running on real hardware (and the BIOS bootloader doesn’t work on the laptop I tried it on), so I also made it bootable using the GRUB bootloader.

`import`

An interesting feature of D is the ability to import arbitrary files directly into the compiled binary. This is a common need with small embedded systems that don’t have a filesystem for loading things at runtime. With embedded C, the solution is typically to either convert the file to a hexadecimal C array literal and compile it in, or just get the linker to link in the file directly.

I used import to include the game sprites and audio files in the game binary. On the one hand it was very convenient, and nice that I could use CTFE to validate the files. On the other hand, the amount of processing you can do with CTFE is practically limited because it’s illegal to typecast arbitrary data into structs at compile time. ~~Also, I ended up with two copies of each file in my binary, which I suspect is just because of a simplistic internal implementation of import and CTFE type casts.~~ This seems to be fixed in the latest dmd.
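As a rough illustration of validating an embedded file with CTFE, here’s a sketch that reads header fields byte by byte instead of casting to a struct. The file format (a `SPR1` magic, then little-endian width and height) is invented for this example, and so are the names `spriteData`, `readLE16` and `isValidSprite`. To keep the sketch self-contained, a literal array stands in for the imported file; the comment shows what the import expression would look like, assuming a file on the `-J` string import path.

```d
// With a real file this would be something like:
//   static immutable ubyte[] spriteData =
//       cast(immutable(ubyte)[]) import("ship.spr");   // needs -J<dir>
// Stand-in data: "SPR1", width = 2, height = 2, then 4 pixel bytes.
static immutable ubyte[] spriteData = [
    0x53, 0x50, 0x52, 0x31, // magic "SPR1" (hypothetical format)
    0x02, 0x00,             // width, little-endian
    0x02, 0x00,             // height, little-endian
    1, 2, 3, 4,             // pixels
];

// Read a 16-bit little-endian value without any struct casts.
ushort readLE16(const(ubyte)[] d, size_t off) pure nothrow @nogc @safe
{
    return cast(ushort)(d[off] | (d[off + 1] << 8));
}

// Check the magic number and that the payload size matches the header.
bool isValidSprite(const(ubyte)[] d) pure nothrow @nogc @safe
{
    if (d.length < 8)
        return false;
    if (d[0] != 'S' || d[1] != 'P' || d[2] != 'R' || d[3] != '1')
        return false;
    immutable w = readLE16(d, 4);
    immutable h = readLE16(d, 6);
    return d.length == 8 + cast(size_t) w * h;
}

// Evaluated by CTFE: a malformed file becomes a compile error.
static assert(isValidSprite(spriteData), "sprite file is malformed");
```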

It’s nice to have this feature, but I think it’s most useful for simple jobs and quick testing. The classic C approaches still allow more control.

`pure`

I’ve written before about the practical benefits of (generalised) purity. I wondered how well so-called weak purity would work in systems programming, and I’m happy to report it’s a win. This kind of development requires statically allocating a lot of global variables, and pure helped keep that sane.

I like to use a refactor-heavy programming style. First I’ll focus on getting code written; then when I see the system taking shape, I’ll rewrite a lot of it. Paradoxically, writing code twice is often faster than trying to write it once, and the result is always better quality. Because (unlike Haskell) D doesn’t push for purity up front, it was easy to add purity this way, and actually I was surprised at how often “refactor for weak purity” was the answer to problems. For example, when I made the game replayable after a win or loss, there was one bug caused by game state not being properly reset between plays. Naturally, this was in one of the places in the code that hadn’t been refactored for weak purity yet, and the simplest fix was to go ahead and do that.
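Here’s a small sketch of what “refactor for weak purity” can look like for that kind of reset bug. The names (`GameState`, `freshState`, `addScore`, `resetGame`) and the fields are hypothetical, not taken from Xanthe. The idea is that the `pure` functions may only touch state they’re explicitly given, so the single non-pure function that writes the global is the only place a reset can go wrong.

```d
// Hypothetical game state, statically allocated as a global.
struct GameState
{
    int score;
    int lives;
    int level;
}

__gshared GameState state;

// Strongly pure: builds the complete starting state from nothing.
GameState freshState() pure nothrow @nogc @safe
{
    return GameState(0, 3, 1);
}

// Weakly pure: mutates only the state passed in by reference.
// The compiler rejects any attempt to touch `state` from in here.
void addScore(ref GameState s, int points) pure nothrow @nogc @safe
{
    s.score += points;
}

// The one impure entry point: the whole state is replaced in one
// assignment, so no field can be forgotten between plays.
void resetGame() nothrow @nogc
{
    state = freshState();
}
```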

I didn’t make the entire codebase weakly pure, but I suspect that if I kept working on it, I’d keep adding the pure keyword.

Unittests

There’s not much to say; I’ll just confirm that D’s built-in unittest feature is useful for faster development. Even though unittests don’t work in the bare metal build, it’s easy to make a test build that runs them.
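For anyone who hasn’t used the feature, a minimal example looks like this. The helper `clampInt` is made up for illustration; the point is just that the test sits right next to the code it checks.

```d
// Hypothetical helper: keep a coordinate inside an inclusive range.
int clampInt(int v, int lo, int hi) pure nothrow @nogc @safe
{
    return v < lo ? lo : (v > hi ? hi : v);
}

// Compiled and run only in builds made with -unittest.
unittest
{
    assert(clampInt(-5, 0, 319) == 0);
    assert(clampInt(400, 0, 319) == 319);
    assert(clampInt(100, 0, 319) == 100);
}
```

On a desktop, a test build of a single file like this can be made with something like `dmd -unittest -main file.d`, where `-main` supplies an empty `main` function.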

Replacing the Preprocessor

Embedded C code makes heavy use of the preprocessor. One reason is to avoid writing runtime initialisation code, and D’s CTFE is a much more powerful tool for that. D’s other compile-time features cover a lot of the other use cases.
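A typical case is a lookup table that C code would either generate with macros or fill in at startup. As a sketch (using the standard CRC-32 polynomial as the example, with the names `makeCrcTable`, `crcTable` and `crc32` chosen for illustration), CTFE lets an ordinary function compute the table, and the result is baked into the binary as constant data:

```d
// Ordinary function; nothing about it is CTFE-specific.
uint[256] makeCrcTable() pure nothrow @nogc @safe
{
    uint[256] table;
    foreach (uint i; 0 .. 256)
    {
        uint c = i;
        foreach (_; 0 .. 8)
            c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        table[i] = c;
    }
    return table;
}

// Initialising a static forces compile-time evaluation:
// no runtime initialisation code is needed.
static immutable uint[256] crcTable = makeCrcTable();

// The table can be sanity-checked at compile time too.
static assert(crcTable[1] == 0x77073096);

// Table-driven CRC-32 over a byte slice.
uint crc32(const(ubyte)[] data) pure nothrow @nogc @safe
{
    uint crc = 0xFFFFFFFF;
    foreach (b; data)
        crc = crcTable[(crc ^ b) & 0xFF] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFF;
}
```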

There was one use case I didn’t find an elegant solution for. I had some assembly code in the sound driver that I wanted to work on both 64-bit and 32-bit x86. The only difference between the two is that the 64-bit code needs to use the RSI and RDI registers, and the 32-bit version needs to use ESI and EDI in the same places. This is trivial to implement cleanly with the C preprocessor. Normally it’s not hard with D, either, because you can emulate the preprocessor locally using CTFE, string functions and the mixin keyword. However, trying to use that trick in Xanthe pulled in a lot of Phobos as runtime dependencies, even though the code is only used at compile time. I don’t think any D compilers today can make that distinction. On the bright side, I think Stefan Koch’s new CTFE engine will encourage heavier usage of CTFE, which might drive extra demand for “CTFE-only” support, even from people who aren’t doing systems programming.
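To show the general shape of the mixin trick, here’s a sketch that picks register names with `version` blocks and splices them into an inline assembly statement using plain `~` concatenation. This is not the sound driver code: `copyBytes` and the `SI`/`DI`/`CX` constants are hypothetical, it assumes a compiler with DMD-style inline assembly, and I haven’t checked whether this form avoids the runtime dependencies described above in a build like Xanthe’s.

```d
// Choose register names per target; these enums exist only at compile time.
version (D_InlineAsm_X86_64)
{
    enum haveAsm = true;
    enum SI = "RSI";
    enum DI = "RDI";
    enum CX = "RCX";
}
else version (D_InlineAsm_X86)
{
    enum haveAsm = true;
    enum SI = "ESI";
    enum DI = "EDI";
    enum CX = "ECX";
}
else
{
    enum haveAsm = false; // no DMD-style inline asm: use the plain D fallback
}

// Copy n bytes from src to dst.
void copyBytes(void* dst, const(void)* src, size_t n) nothrow @nogc
{
    static if (haveAsm)
    {
        // The asm statement is assembled as a string at compile time,
        // with the register names substituted in, then mixed in as code.
        mixin("asm nothrow @nogc {"
            ~ "mov " ~ SI ~ ", src;"
            ~ "mov " ~ DI ~ ", dst;"
            ~ "mov " ~ CX ~ ", n;"
            ~ "cld;"
            ~ "rep; movsb;"
            ~ "}");
    }
    else
    {
        auto d = cast(ubyte*) dst;
        auto s = cast(const(ubyte)*) src;
        foreach (i; 0 .. n)
            d[i] = s[i];
    }
}
```

It’s noticeably clunkier than a `#define`, but the substitution is at least local to the one statement that needs it.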

Attribute Creep

I think this is more of a nuisance than a real problem, but it needs mentioning: there are a lot of @nogc and nothrow attribute annotations in the Xanthe source. Ultimately, this is because there’s no way to undo an attribute. If I could put a big, fat, global @nogc and nothrow on each module and just mark the small amount of testing code that needs to be allowed to throw/allocate, there’d be much less attribute clutter.

Some D developers will point out that it is possible to put global attributes at the top of modules. Xanthe uses that feature, but unfortunately these attributes aren’t totally global. Why not? It’s theoretically possible, for example, for a @nogc function to return a pointer to a nested function that’s not @nogc. Because there’s no way to selectively undo an attribute, it would be impossible to implement this if attributes on functions were transitive (like immutable data types are). This isn’t a common case, but it has to be supported, so @nogc and nothrow don’t carry inside function (and struct) scope. A really common (and slightly annoying) example in Xanthe is all the inline assembly snippets marked @nogc and nothrow.

It would be possible to avoid some @nogc and nothrow annotations by grouping functions together and wrapping with annotated braces. However, “avoid typing @nogc and nothrow a few times” is really low on my priorities when I’m deciding how to organise my code.

Polymorphism

D classes depend directly on the D runtime. C++-style classes don’t, but instantiating them seems to require run-time type information. (I have found a hack that works around this, but it’s horrible and I’d never use it in production. I’ll talk more about D classes and betterC in a later post.)

Typical embedded code (at least the control system code I’ve developed) doesn’t need classes or polymorphism much, so this isn’t a big loss. For Xanthe, though, I decided to make the game entities polymorphic, and I was able to do it using D structs and alias this. It worked well, especially with a little metaprogramming help. The one thing missing was support for the protected access specifier (which is probably just an oversight). This meant a lot of stuff ended up being public, but that’s how it would be done in C anyway.
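As a sketch of the general approach (not Xanthe’s actual entity code, and without the metaprogramming helpers), a “derived” struct can embed its base as the first field and forward to it with alias this, while a function pointer in the base provides the dynamic dispatch. The names `Entity`, `Enemy`, `makeEnemy` and `updateAll` are all hypothetical.

```d
// "Base class": common fields plus a function pointer acting as a
// one-entry vtable.
struct Entity
{
    int x, y;
    void function(Entity* self) @nogc nothrow update;
}

// "Derived class": embeds the base as its first field and forwards
// member access to it with alias this.
struct Enemy
{
    Entity base;
    alias base this;
    int speed;

    static void updateImpl(Entity* self) @nogc nothrow
    {
        // Downcast is valid only because `base` is the first field.
        auto me = cast(Enemy*) self;
        me.y += me.speed; // `y` resolves through alias this
    }
}

Enemy makeEnemy(int x, int y, int speed) @nogc nothrow
{
    Enemy e;
    e.x = x; // base fields are reachable directly
    e.y = y;
    e.speed = speed;
    e.update = &Enemy.updateImpl;
    return e;
}

// Generic code sees only Entity pointers and dispatches dynamically.
void updateAll(Entity*[] entities) @nogc nothrow
{
    foreach (e; entities)
        e.update(e);
}
```

As in C, nothing stops outside code from poking at `base` or `update` directly, which is the missing `protected` mentioned above.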

Inline Assembly

D supports inline assembly just fine. It’s not as expressive and flexible as GCC’s inline assembly, but it’s a little nicer to use.

A plus for D is the naked feature. This stops the usual function entry/exit code being generated by the compiler, making it easy to implement special functions like interrupt handlers. For comparison, GCC has __attribute__((interrupt)) for this specific job, and it has to be implemented for each architecture. It’s currently not implemented for X86, so there doesn’t seem to be a clean way to write X86 interrupt handlers using GCC.

Summary

I could keep writing more, but I think this is enough for now. D as a language still isn’t totally “polished” for this kind of programming (to be fair, typical embedded C toolchains — proprietary or free — aren’t exactly polished, either), but generally if you need to fall back to C-style programming it works as intended.

~~The big caveat is the lack of “pay for what you use”, which is why (for now) some hacking is needed to get rid of D runtime stuff.~~ Update: things are much better in newer versions of D.