Why Textbook Statistical Methods aren't as Effective in IT

Published 01 December 2021

If you work with tech, there’s a good chance you’ve come across some of the following statistical tools:

Averages
Standard deviations
t-tests
Least-squares line of best fit

These are the most common tools in a kit that’s typically taught in undergraduate statistics classes and widely used in the outside world. However, this toolkit just isn’t that effective in most IT applications (such as analysing performance benchmarks). Fortunately, there are other tools that do work well. They’re normally taught in “advanced” statistics classes, but I think some of them should become the standard toolkit for tech work (and possibly elsewhere).

In this post I want to talk a bit about why the usual toolkit doesn’t work well. First, let me give an example.

Extending Looped Music for Fun, Relaxation and Productivity

Published 12 March 2021

Some work (like programming) takes a lot of concentration, and I use noise-cancelling headphones to help me work productively in silence. But for other work (like doing business paperwork), I prefer to have quiet music in the background to help me stay focussed. Quiet background music is good for meditation or dozing, too. If you can’t fall asleep or completely clear your mind, zoning out to some music is the next best thing.

The best music for that is simple and repetitive: something nice enough to listen to, but not distracting, and okay to tune out of when needed. Computer game music is like that, by design, so there’s plenty of good background music out there. The harder problem is finding samples that play for more than a few minutes.

So I made loopx, a tool that takes a sample of music that loops a few times, and repeats the loop to make a long piece of music.

When you’re listening to the same music loop for a long time, even slight distortion becomes distracting. Making quality extended music audio out of real-world samples (and doing it fast enough) takes a bit of maths and computer science. About ten years ago I was doing digital signal processing (DSP) programming for industrial metering equipment, so this side project got me digging up some old theory again.

Pi from High School Maths

Published 26 October 2020

Tags: Mathematics and Computer Science

Warning: I don’t think the stuff in this post has any direct practical application by itself (unless you’re a nuclear war survivor and need to reconstruct maths from scratch or something). Sometimes I like to go back to basics, though. Here’s a look at $\pi$ and areas of curved shapes without any calculus or transcendental functions.

Glico (Weighted Rock Paper Scissors)

Published 21 May 2020

Tags: Mathematics, Computer Science, Julia, Translation and Japanese

This still isn’t the blog post I said I was going to write next, but I figured some game theory would make a good post at the moment, especially when a lot of people I know are working at home with kids who need entertaining. Here’s some stuff about a traditional Japanese kids’ game called Glico, a form of weighted Rock Paper Scissors (RPS).

Some Useful Probability Facts for Systems Programming

Published 27 January 2020

Tags: Systems Design, Mathematics and Computer Science

Probability problems come up a lot in systems programming, and I’m using that term loosely to mean everything from operating systems programming and networking, to building large online services, to creating virtual worlds like in games. Here’s a bunch of rough-and-ready probability rules of thumb that are deeply related and have many practical applications when designing systems.

Euler's Identity Really is a Miracle, Too

Published 20 September 2019

Tags: Mathematics

A post about the exponential function being a miracle did the rounds recently, and the Hacker News comment thread brought up some debate about the miracle of Euler’s famous identity:

$$e^{\pi i} + 1 = 0$$

A while back I used to make a living teaching this stuff to high school students and university undergrads. Let me give my personal take on what’s so special about Euler’s identity.

Why Sorting is O(N log N)

Published 05 January 2019

Tags: Mathematics, Computer Science and Performance

Any decent algorithms textbook will explain how fast sorting algorithms like quicksort and heapsort are, but it doesn’t take crazy maths to prove that they’re as asymptotically fast as you can possibly get.
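To give a flavour of the kind of argument involved, here is a minimal sketch of the standard counting bound for comparison sorts. It is not taken from the post, and the function name `min_comparisons` is made up for illustration. The idea: a sort has to distinguish all N! possible orderings of its input, and each yes/no comparison can at best halve the set of orderings still in play, so at least log2(N!) comparisons are needed in the worst case. That quantity grows like N log N.

```python
import math

def min_comparisons(n):
    """Lower bound on worst-case comparisons to sort n distinct items.

    Hypothetical helper for illustration: computes ceil(log2(n!)) by
    summing log2(k), which avoids building a huge factorial.
    """
    bits = sum(math.log2(k) for k in range(2, n + 1))
    return math.ceil(bits)

if __name__ == "__main__":
    # Compare the bound with N * log2(N) for a few sizes.
    for n in (4, 16, 256, 4096):
        print(n, min_comparisons(n), round(n * math.log2(n)))
```

The bound only applies to sorts that work by comparing pairs of items. Algorithms that inspect the keys themselves, such as radix sort, are not covered by it.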

Counting Sudoku Solution Grids using Monte Carlo

Published 14 August 2017

Tags: Mathematics and Computer Science

How many ways can you fill a 9x9 grid, obeying all the rules of the sudoku puzzle? The answer is too big to just calculate directly on a computer, so an exact answer takes careful analysis. But if an absolutely exact answer isn’t required, we can get a good statistical approximation using a Monte Carlo algorithm. As a bonus, the algorithm doesn’t need any application-specific analysis and works on many other problems, too. It’s a handy “stupid things that work” approach to solving problems.
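As a rough illustration of the idea (not necessarily the algorithm the post uses), here is a sketch of one textbook Monte Carlo counting technique, Knuth's random-probe estimator for the size of a search tree. It is applied to the much smaller 4x4 variant of sudoku, which has 288 solution grids. Each probe fills the grid cell by cell with a random legal digit and multiplies together the number of choices it had at each step. A probe that hits a dead end scores zero. The average score over many probes is an unbiased estimate of the number of complete grids. The names `candidates`, `probe` and `estimate_grid_count` are made up for this sketch.

```python
import random

def candidates(grid, r, c, n):
    """Digits that can legally go in cell (r, c); 0 marks an empty cell."""
    size = n * n
    used = set(grid[r])                                 # same row
    used |= {grid[i][c] for i in range(size)}           # same column
    br, bc = r - r % n, c - c % n                       # top-left of the box
    used |= {grid[i][j] for i in range(br, br + n) for j in range(bc, bc + n)}
    return [d for d in range(1, size + 1) if d not in used]

def probe(rng, n):
    """One random descent; returns the product of branching factors, or 0."""
    size = n * n
    grid = [[0] * size for _ in range(size)]
    weight = 1
    for r in range(size):
        for c in range(size):
            options = candidates(grid, r, c, n)
            if not options:
                return 0                                # dead end
            weight *= len(options)
            grid[r][c] = rng.choice(options)
    return weight

def estimate_grid_count(trials, n=2, seed=0):
    """Average many probes to estimate the number of valid n^2 x n^2 grids."""
    rng = random.Random(seed)
    return sum(probe(rng, n) for _ in range(trials)) / trials

if __name__ == "__main__":
    print(estimate_grid_count(20000))   # should land near 288 for 4x4 grids
```

Nothing in `probe` is specific to sudoku beyond the `candidates` rule, which is what makes this style of estimator easy to reuse on other counting problems. For the full 9x9 grid (`n=3`) the same code runs, but most probes hit dead ends, so many more trials are needed for a stable estimate.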