Data Still Dominates

Published

Tags: and

Translations: русский

Here’s a quote from Linus Torvalds in 2006:

I’m a huge proponent of designing your code around the data, rather than the other way around, and I think it’s one of the reasons git has been fairly successful… I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships.

Which sounds a lot like Eric Raymond’s “Rule of Representation” from 2003:

Fold knowledge into data, so program logic can be stupid and robust.

Which was just his summary of ideas like this one from Rob Pike in 1989:

Data dominates. If you’ve chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

Which cites Fred Brooks from 1975:

Representation is the Essence of Programming

Beyond craftmanship lies invention, and it is here that lean, spare, fast programs are born. Almost always these are the result of strategic breakthrough rather than tactical cleverness. Sometimes the strategic breakthrough will be a new algorithm, such as the Cooley-Tukey Fast Fourier Transform or the substitution of an n log n sort for an n2 set of comparisons.

Much more often, strategic breakthrough will come from redoing the representation of the data or tables. This is where the heart of your program lies. Show me your flowcharts and conceal your tables, and I shall be continued to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.

So, smart people have been saying this again and again for nearly half a century: focus on the data first. But sometimes it feels like the most famous piece of smart programming advice that everyone forgets.

Let me give some real examples.

Unfortunately, Garbage Collection isn't Enough

Published

Tags: , , , and

Here’s a little story of some mysterious server failures I had to debug a year ago. The servers would run okay for a while, then eventually start crashing. After that, trying to run practically anything on the machines failed with “No space left on device” errors, but the filesystem only reported a few gigabytes of files on the ~20GB disks.

The Catch-22 of Risk-Averse Organisations

Published

Tags: and

Markets are supposed to make corporations efficient, so if you do consulting, you have to wonder why so many organisations (both private and government) are so absurdly dysfunctional. The way I see it, there’s no real paradox: organisations are pushed by both forces of efficiency (like market forces) and forces of dysfunction (like political drama). Corporations are only efficient if the forces of efficiency are stronger.

There are many forces of dysfunction that can affect an organisation, but there’s one that’s particularly important for risk-averse organisations (like banks and large government departments). Whenever I see people in an organisation doing something that doesn’t make any sense, I always ask if this catch-22 can explain it:

Every risk-averse organisation needs someone to take the initiative to eliminate risks. But in a risk-averse organisation, that’s exactly what no one does.

Busywork

Published

Tags: , , , and

My first small business wasn’t actually in the software industry. Back when I was a student, I did contract office jobs in the summer holidays to pay for things while I was studying. I got a feeling that I’d be happier self-employed than someone else’s employee, so after graduation I experimented with registering an Australian Business Number (ABN) and using it to do maths and sciences tuition. I went back to working for other companies eventually, but I learned a lot from the experience, and that know-how was extremely valuable later when I quit my full-time job to start my own little consulting business.

I might write more about that experience some other time, but for now I want to write about what’s been hardest for me to get used to: when you’re self-employed, no one cares how much work you do.

The Enterprise Pushbutton

Published

Tags: , and

Let’s talk about a hardware driver for a pushbutton. A pushbutton driver isn’t as completely trivial as it might sound because you need debouncing logic to ensure a crisp on/off signal, but it’s hard to imagine how it might need more than about 100 lines of C code.

After working on this particular embedded system, I didn’t need to stress my imagination any more. This pushbutton driver was modelled as an explicit finite state machine, and all the possible states and transitions were specified in a spreadsheet. Then there was a python script that processed this spreadsheet and generated state table data as C code. This was linked to an FSM evaluator in C. The FSM was controlled by a bare-metal driver and triggered callbacks on each state transition.

Most of the callbacks were marked “not yet implemented”. In fact, only two states were even reachable: BUTTON_UP and BUTTON_DOWN. Eventually the entire project was canned, but not because of missing support for BUTTON_TIMEOUT or any of the other states.

Oh, yeah, the FSM didn’t do any debouncing, so the low-level driver had to do that before passing button up/down events to the FSM.

Why Defensive Coding Matters - A War Story

Published

Tags: , , and

Story time.