On Not Optimising for Last Century's Hardware
Published
Once upon a time I wrote a super-optimised algorithm for rotating data in an array. At least, it was meant to be super-optimised, but its real-world performance turned out to be terrible. That’s because my intuition about performance was stuck in the 20th century:
- Break a program down into basic operations like multiplication and assignment
- Give each operation a cost (or just slap on an O(1) if you’re lazy)
- Add up all the numbers
- Try to make the total in step #3 small
A lot of textbooks still teach this “classic” thinking, but except for some highly constrained embedded systems, it just doesn’t work that well on modern hardware.