# Debugging Software Deployments with strace

Published

Tags: , , , and

Most of my paid work involves deploying software systems, which means I spend a lot of time trying to answer the following questions:

• This software works on the original developer’s machine, so why doesn’t it work on mine?
• This software worked on my machine yesterday, so why doesn’t it work today?

That’s a kind of debugging, but it’s a different kind of debugging from normal software debugging. Normal debugging is usually about the logic of the code, but deployment debugging is usually about the interaction between the code and its environment. Even when the root cause is a logic bug, the fact that the software apparently worked on another machine means that the environment is usually involved somehow.

So, instead of using normal debugging tools like gdb, I have another toolset for debugging deployments. My favourite tool for “Why isn’t this software working on this machine?” is strace.

# Hello World Marketing (or, How I Find Good, Boring Software)

Published

Back in 2001 Joel Spolsky wrote his classic essay “Good Software Takes Ten Years. Get Used To it”. Nothing much has changed since then: software is still taking around a decade of development to get good, and the industry is still getting used to that fact. Unfortunately, the industry has investors who want to see hockey stick growth rates on software that’s a year old or less. The result is an antipattern I like to call “Hello World Marketing”. Once you start to notice it, you see it everywhere, and it’s a huge red flag when choosing software tools.

# Popular Web Servers Compared

Published

Tags: , and

Here’s a comparison of the web servers I’ve used the most.

# Relational Databases Considered Incredibly Useful

Published

Tags: , , and

A year ago I worked on a web service that had Postgres and Elasticsearch as backends. Postgres was doing most of the work and was the primary source of truth about all data, but some documents were replicated in Elasticsearch for querying. Elasticsearch was easy to get started with, but had an ongoing maintenance cost: it was one more moving part to break down, it occasionally went out of sync with the main database, it was another thing for new developers to install, and it added complexity to the deployment, as well as the integration tests. But most of the features of Elasticsearch weren’t needed because the documents were semi-structured, and the search queries were heavily keyword-based. Dropping Elasticsearch and just using Postgres turned out to work okay. No, I’m not talking about brute-force string matching using LIKE expressions (as implemented in certain popular CMSs); I’m talking about using the featureful text search indexes in good modern databases. Text search with Postgres took more work to implement, and couldn’t do all the things Elasticsearch could, but it was easier to deploy, and since then it’s been zero maintenance. Overall, it’s considered a net win (I talked to some of the developers again just recently).

# Terraform is Best for Configuring Hashicorp Vault

Published

Tags: , and

Hashicorp Vault is a handy tool for scalable secrets management in a distributed system or team-based project. Unfortunately, the only out-of-the-box way to configure it is through its API (or a UI), but most projects that need Vault will need to manage the configuration in source control.

There’s a workaround explained on the Hashicorp blog. It’s a neat hack, but here’s a quick note about why using Terraform’s Vault integration is a better idea for production use.

# A Tale of Three Server Caching Architectures

Published

Exactly where you put caching in a distributed system has a significant impact on its effectiveness, in ways that aren’t always obvious during the design phase of development.

# Offline Compression with Nginx

Published

Tags: , and

There’s a clear tradeoff with compressing HTTP responses on the fly: compress “harder” and you’ll (hopefully) get a smaller file that takes less time to send over the network – but the net benefit might be negative if the extra work takes too much time, or (when under heavy load) too much CPU. A lot of work has been done analysing this tradeoff, but for static content there’s a neat and simple way to avoid the tradeoff completely: compress offline before serving. Nginx supports this using the gzip_static module.

Published

Story time.