Here’s a quote from Linus Torvalds in 2006:
I’m a huge proponent of designing your code around the data, rather than the other way around, and I think it’s one
of the reasons git has been fairly successful… I will, in fact, claim that the difference between a bad programmer and
a good one is whether he considers his code or his data structures more important. Bad programmers worry about the
code. Good programmers worry about data structures and their relationships.
Which sounds a lot like Eric Raymond’s “Rule of Representation” from 2003:
Fold knowledge into data, so program logic can be stupid and robust.
Which was just his summary of ideas like this one from Rob Pike:
Data dominates. If you’ve chosen the right data structures and organized things well, the algorithms will almost
always be self-evident. Data structures, not algorithms, are central to programming.
Which cites Fred Brooks:
Representation is the Essence of Programming
Beyond craftsmanship lies invention, and it is here that lean, spare, fast programs are born. Almost always these are
the result of strategic breakthrough rather than tactical cleverness. Sometimes the strategic breakthrough will be a
new algorithm, such as the Cooley-Tukey Fast Fourier Transform or the substitution of an n log n sort for an
n² set of comparisons.
Much more often, strategic breakthrough will come from redoing the representation of the data or tables. This is
where the heart of your program lies. Show me your flowcharts and conceal your tables, and I shall continue to be
mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.
So, smart people have been saying this again and again for nearly half a century: focus on the data first. But
sometimes it feels like the most famous piece of smart programming advice that everyone forgets.
Let me give some real examples.