stow: Your Package Manager When You Can't Use Your Package Manager

Published 08 August 2021

Tags: Tools and Ops

GNU stow is an underrated tool. Generically, it helps maintain a unified tree of files that come from different sources. More concretely, I use a bunch of software (D compilers, various tools) that I install manually instead of through my system’s package manager (for various reasons). stow makes that maintainable by letting me cleanly add/remove packages and switch between versions. Here’s how it’s done.

How Real-World Apps Lose Data

Published 06 June 2021

A great thing about modern app development is that there are cloud providers to worry about things like hardware failures or how to set up RAID. Decent cloud providers are extremely unlikely to lose your app’s data, so sometimes I get asked what backups are really for these days. Here are some real-world stories that show exactly what.

Reverse Engineering a Docker Image

Published 18 March 2021

Tags: Tools , Ops and Low Level

This started with a consulting snafu: Government organisation A got government organisation B to develop a web application. Government organisation B subcontracted part of the work to somebody. Hosting and maintenance of the project was later contracted out to a private-sector company C. Company C discovered that the subcontracted somebody (who was long gone) had built a custom Docker image and made it a dependency of the build system, but without committing the original Dockerfile. That left company C with a contractual obligation to manage a Docker image they had no source code for. Company C calls me in once in a while to do various things, so doing something about this mystery meat Docker image became my job.

Fortunately, the Docker image format is a lot more transparent than it could be. A little detective work is needed, but a lot can be figured out just by pulling apart an image file. As an example, here’s a quick walkthrough of an image for the Prettier code formatter. (In fact, it’s so easy, there’s a tool for it. Thanks Ezequiel Gonzalez Rial.)

Robust and Race-free Server Logging using Named Pipes

Published 10 October 2020

Tags: Ops , Reliability , Servers , Tools and Systems Design

If you do any server administration work, you’ll have worked with log files. And if your servers need to be reliable, you’ll know that log files are common source of problems, especially when you need to rotate or ship them (which is practically always). In particular, moving files around causes race conditions.

Thankfully, there are better ways. With named pipes, you can have a simple and robust logging stack, with no race conditions, and without patching your servers to support some network logging protocol.

Debugging Software Deployments with strace

Published 14 November 2019

Tags: Ops , Tools , Reliability , Servers and Low Level

Translations:русский

Most of my paid work involves deploying software systems, which means I spend a lot of time trying to answer the following questions:

This software works on the original developer’s machine, so why doesn’t it work on mine?
This software worked on my machine yesterday, so why doesn’t it work today?

That’s a kind of debugging, but it’s a different kind of debugging from normal software debugging. Normal debugging is usually about the logic of the code, but deployment debugging is usually about the interaction between the code and its environment. Even when the root cause is a logic bug, the fact that the software apparently worked on another machine means that the environment is usually involved somehow.

So, instead of using normal debugging tools like gdb, I have another toolset for debugging deployments. My favourite tool for “Why isn’t this software working on this machine?” is strace.

Some Presentation Slides

Published 16 February 2019

Tags: Careers , Business , Ops , Tools , Reliability and Me

Here are the slide decks to a couple of talks I’ve given recently.

Being Self-Employed in Australia (at JAIT)

Because this talk is based on my own experiences, it’s particularly relevant to service businesses in Australia. But if you’re interested in being your own boss, anywhere or anyhow, you could find it useful. As I said in the talk, there’s a lot of stuff that feels obvious to me now, but I ended up learning the hard way.

Introduction to Infrastructure as Code (at RORO Sydney)

Here’s a common story: Devs write an app, and do all the right things like using source control and writing automated test suites. Then it comes to deploy the code, and they have to figure out all these things like DNS and server infrastructure. They hack something together using web UIs, but six months later no one can remember the deployment process any more.

This presentation was a really quick introduction to the tools you can use to get more app dependencies into source control.

Unfortunately, Garbage Collection isn't Enough

Published 05 December 2018

Tags: Software Engineering , Programming Languages , Systems Design , Reliability , Ops and Anecdotes

Translations:русский

Here’s a little story of some mysterious server failures I had to debug a year ago. The servers would run okay for a while, then eventually start crashing. After that, trying to run practically anything on the machines failed with “No space left on device” errors, but the filesystem only reported a few gigabytes of files on the ~20GB disks.

Understanding a *nix Shell by Writing One

Published 07 November 2018

Tags: Shell , Low Level , Tools , Ops and dlang

A typical *nix shell has a lot of programming-like features, but works quite differently from languages like Python or C++. This can make a lot of shell features — like process management, argument quoting and the export keyword — seem like mysterious voodoo.

But a shell is just a program, so a good way to learn how a shell works is to write one. I’ve written a simple shell that fits in a few hundred lines of commented D source. Here’s a post that walks through how it works and how you could write one yourself.

Using the Bourne Shell as a Cheap and Nasty Templating Language

Published 09 March 2018

Tags: Shell , Tools and Ops

Sometimes you need a really simple way to generate parameterised text without pulling in a full-blown templating language as a dependency — for example, when writing an install script that needs to generate a simple configuration file. Using the classic *nix Bourne shell that’s installed on practically every *nix system is one option. To be honest, it can be a terrible option, but it often gets simple jobs done, so I think it’s a trick worth remembering.

Popular Web Servers Compared

Published 07 January 2018

Tags: Ops , Servers and Tools

Here’s a comparison of the web servers I’ve used the most.