What Difference Can Order Make When Hashing?

I saw this thread about password hashing on the D language forums. The original post had a good question that didn’t get answered at the time: if you’re hashing a bunch of things, can the order you hash them in make any difference to security?

The answer turns out to be yes, and it’s a neat example of the difference between theoretical ideals and real-world systems. Because I think this stuff is worth knowing if you’re using cryptographic hash functions for, you know, actual crypto, I thought I’d write up a blog post about why it can matter.

The Ideal

The spherical cow version of the cryptographic hash function is the random oracle. This is a function that simply takes your cleartext input and returns an unbiased, purely random response that’s used as the hash. A random function might sound pointless or trivial, but the catch is that it must be consistent — if it ever gets the same input again, it must return the same hash.
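
To make the idea concrete, here’s a toy sketch of a random oracle in D. The ToyRandomOracle name and the 16-byte output size are my own choices for illustration, and std.random isn’t a cryptographic generator, so treat this as a thought experiment rather than something to use.

```d
import std.random : uniform;
import std.stdio : writeln;

/// Toy random oracle: every new input gets fresh random bytes, and the
/// answer is remembered so the same input always gets the same hash.
struct ToyRandomOracle
{
    private ubyte[16][string] answers; // every query answered so far

    ubyte[16] query(string input)
    {
        if (auto known = input in answers)
            return *known; // seen before: the answer must be consistent

        ubyte[16] fresh;
        foreach (ref b; fresh)
            b = cast(ubyte) uniform(0, 256); // not a crypto-grade RNG
        answers[input] = fresh;
        return fresh;
    }
}

void main()
{
    ToyRandomOracle oracle;
    auto first = oracle.query("A Descent into the Maelstrom");
    auto again = oracle.query("A Descent into the Maelstrom");
    assert(first == again); // repeated input, same hash
    writeln(first);
}
```

The table has to remember every input anyone has ever asked about, and everyone has to share the same table, which is where the feasibility problem comes from.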

Random oracles are nice. They’re simple to explain, and their pure randomness makes a lot of attacks outright impossible. Unfortunately, they just aren’t feasible to actually implement.

The Compromise

So, we need an alternative standard for judging the real-world hash functions we can have. The usual benchmarks measure how much work is needed to pull off the following attacks:

Collision Attacks

A collision is any pair of inputs that have the same hash. Collisions have to exist because there are infinitely many possible inputs and only finitely many values they can hash to, but actually attacking a cryptosystem by finding a collision should be hard work — effectively trial and error.

Note that the inputs can be totally arbitrary, even gibberish. (Any collision might be enough to trigger bugs in some crypto software.)
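
Here’s a rough sketch of what “trial and error” looks like. Real MD5 output is far too long for this to finish, so I’m assuming a deliberately weakened toy hash (just the first 24 bits of MD5). The toyHash and findCollision names are made up for this example.

```d
import std.conv : to;
import std.digest.md : md5Of;
import std.stdio : writefln;

/// Toy hash: only the first 3 bytes (24 bits) of MD5, packed into a uint.
uint toyHash(string input)
{
    auto full = md5Of(input);
    return (cast(uint) full[0] << 16) | (cast(uint) full[1] << 8) | full[2];
}

/// Hashes "0", "1", "2", ... and remembers each result until two inputs
/// land on the same toy hash value.
string[2] findCollision()
{
    string[uint] seen; // toy hash -> input that produced it
    for (uint i = 0;; ++i)
    {
        auto candidate = i.to!string;
        auto h = toyHash(candidate);
        if (auto earlier = h in seen)
            return [*earlier, candidate];
        seen[h] = candidate;
    }
}

void main()
{
    auto pair = findCollision();
    writefln("%s and %s both hash to %06X", pair[0], pair[1], toyHash(pair[0]));
}
```

With 24 bits of output you’d expect this to succeed after a few thousand guesses, because any two inputs matching is enough. Every extra bit of hash output makes the search longer, which is the whole point of using long hashes.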

Second Preimage Attacks

This is a harder version of the previous attack. This time one input is fixed and the attacker tries to generate another input with the same hash. A real-world scenario is an attacker trying to replace a signed smartphone app with malware that validates with the same signature (although in general there are no restrictions on what the forgery is).

By the way, you’ll often hear the word “collision” used when strictly, technically “second preimage” is what’s meant.

Preimage Attacks

This is basically reversing a hash: given a hash value, try to figure out an input that hashes to it.
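
As a sketch of the brute force version, here’s a preimage search against a toy hash that I’m assuming for the example (only the first 16 bits of MD5). The toyHash and findPreimage names and the "hunter2" input are hypothetical.

```d
import std.conv : to;
import std.digest.md : md5Of;
import std.stdio : writefln;

/// Toy hash: only the first 2 bytes (16 bits) of MD5.
ushort toyHash(string input)
{
    auto full = md5Of(input);
    return cast(ushort)((full[0] << 8) | full[1]);
}

/// Tries "0", "1", "2", ... until one of them hashes to `target`.
string findPreimage(ushort target)
{
    for (uint i = 0;; ++i)
    {
        auto candidate = i.to!string;
        if (toyHash(candidate) == target)
            return candidate;
    }
}

void main()
{
    auto target = toyHash("hunter2"); // all the attacker gets to see
    auto found = findPreimage(target);
    writefln("%s also hashes to %04X", found, target);
}
```

Note that the input found doesn’t have to be the original one; anything with the right hash counts. Generically, hitting one specific n-bit value takes on the order of 2^n guesses, compared with roughly 2^(n/2) for finding any collision.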

The Real World

This set of benchmarks is very practical, but unfortunately it’s like technical fine-print compared to the simpler random oracle model. And that means real-world hash functions can have a few surprises. Flickr is a big-name site that famously got caught out this way (along with many others) in 2009. Flickr’s API used a keyed hash for authentication, using a secret key shared between the API user and Flickr. Users had to send the value of md5Of(key ~ request) with each request. Because only the API user and Flickr knew the key for generating the hash, third parties couldn’t forge requests — or at least not if a random oracle had been used instead of MD5.

The Hack

Let’s take a step back and look at an important feature of practical hash functions: they are designed to process input incrementally because we might not have all the data in memory at once. Here’s some example D code, expanded out to make the process clear:

import std.digest.md;
import io = std.stdio;

void main()
{
    auto digest = makeDigest!MD5();
    // Pretend these are large chunks of text being read one at a time from a file or network link
    digest.put(cast(ubyte[])"A Descent ");
    digest.put(cast(ubyte[])"into the ");
    digest.put(cast(ubyte[])"Maelstrom");
    auto result = digest.finish();
    // Prints "2D9BBE8AF25B4BF58E98A2620C15BA00"
    io.writeln(result.toHexString());
}

Notice how you can keep feeding more data into the digest, and then call finish() any time you like to get the hash value. Apart from the fact that MD5 has to buffer data into blocks of 64 bytes for processing, each put() is doing real work in calculating the hash. In other words, the hash of this paragraph is pretty much based on the hash of “Notice how you can keep feeding more data into the digest, and t”, and then that plus the next 64 bytes, and so on. The only difference is that finish() first pads any leftover data out to a whole number of 64-byte blocks (the padding includes the message length) and processes that too before returning the hash.
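
A quick way to convince yourself of this is to check that the way the input is sliced up doesn’t matter. This is a small sketch; hashInChunks is a helper name I made up, and it assumes chunkSize is at least 1.

```d
import std.algorithm.comparison : min;
import std.digest.md : MD5, md5Of;
import std.string : representation;

/// Hashes `text` by feeding it to MD5 in pieces of `chunkSize` bytes.
ubyte[16] hashInChunks(string text, size_t chunkSize)
{
    MD5 digest;
    digest.start();
    for (size_t pos = 0; pos < text.length; pos += chunkSize)
    {
        auto end = min(pos + chunkSize, text.length);
        digest.put(text[pos .. end].representation);
    }
    return digest.finish();
}

void main()
{
    auto text = "A Descent into the Maelstrom";
    // However the input is sliced up, the result matches the one-shot hash.
    assert(hashInChunks(text, 1) == md5Of(text));
    assert(hashInChunks(text, 7) == md5Of(text));
    assert(hashInChunks(text, 64) == md5Of(text));
}
```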

In fact, for MD5 (and SHA1 and SHA2, which all use what’s called the Merkle–Damgård construction) the hash result gives you the entire internal state of the digest algorithm. (The exceptions are the truncated SHA2 variants like SHA-384, which hold back part of the state.) So, given md5Of(key ~ request), it’s trivial to pick up where the digest left off and generate md5Of(key ~ request ~ padding ~ arbitrary_forged_extension), even without knowing the value of key. This is called a length extension attack. If it shocks you that this is possible, take another look at the “fine print” above. Nothing there says anything about preventing anyone from creating new hash values.
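
Here’s a sketch of the attack in D. Phobos doesn’t expose MD5’s internal state, so this uses my own compact version of the MD5 compression function, written from RFC 1321. The key, the request strings and all the function names are hypothetical, and the attacker is assumed to know (or guess) the key’s length.

```d
import std.digest.md : md5Of;
import std.math : abs, sin;
import std.string : representation;

/// Reads a 32-bit little-endian word starting at bytes[at].
uint loadLE(const(ubyte)[] bytes, size_t at)
{
    return bytes[at] | (bytes[at + 1] << 8) | (bytes[at + 2] << 16)
        | (cast(uint) bytes[at + 3] << 24);
}

/// MD5 padding for a message of `messageLength` bytes: 0x80, zero bytes up
/// to 56 mod 64, then the length in bits as a 64-bit little-endian value.
ubyte[] md5Padding(ulong messageLength)
{
    ubyte[] pad = [0x80];
    while ((messageLength + pad.length) % 64 != 56)
        pad ~= ubyte(0);
    foreach (i; 0 .. 8)
        pad ~= cast(ubyte)((messageLength * 8) >> (8 * i));
    return pad;
}

/// Applies the MD5 compression function to one 64-byte block.
void md5Compress(ref uint[4] state, const(ubyte)[] block)
{
    static immutable uint[4][4] shifts =
        [[7, 12, 17, 22], [5, 9, 14, 20], [4, 11, 16, 23], [6, 10, 15, 21]];
    uint a = state[0], b = state[1], c = state[2], d = state[3];
    foreach (uint i; 0 .. 64)
    {
        uint f, g;
        if (i < 16)      { f = (b & c) | (~b & d); g = i; }
        else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) % 16; }
        else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) % 16; }
        else             { f = c ^ (b | ~d);       g = (7 * i) % 16; }
        // Round constant: floor(2^32 * |sin(i + 1)|)
        uint k = cast(uint) cast(ulong)(abs(sin(i + 1.0)) * 4294967296.0);
        uint sum = a + f + k + loadLE(block, 4 * g);
        uint s = shifts[i / 16][i % 4];
        a = d; d = c; c = b;
        b = b + ((sum << s) | (sum >> (32 - s)));
    }
    state[0] += a; state[1] += b; state[2] += c; state[3] += d;
}

/// Forges md5Of(key ~ request ~ padding ~ extension) from a known signature.
/// `hiddenLength` is key.length + request.length; the key itself isn't needed.
ubyte[16] extend(const ubyte[16] signature, size_t hiddenLength,
                 const(ubyte)[] extension)
{
    // The hash value *is* the internal state: four little-endian words.
    uint[4] state;
    foreach (i; 0 .. 4)
        state[i] = loadLE(signature[], 4 * i);

    // Bytes the server has already processed when it reaches the extension.
    ulong processed = hiddenLength + md5Padding(hiddenLength).length;

    // Carry on hashing: the extension, then padding for the new total length.
    const(ubyte)[] tail = extension;
    tail ~= md5Padding(processed + extension.length);
    for (size_t pos = 0; pos < tail.length; pos += 64)
        md5Compress(state, tail[pos .. pos + 64]);

    ubyte[16] forged;
    foreach (i; 0 .. 16)
        forged[i] = cast(ubyte)(state[i / 4] >> (8 * (i % 4)));
    return forged;
}

void main()
{
    string key = "s3cret-api-key";             // never seen by the attacker
    string request = "user=alice&action=view"; // sent in the clear
    auto signature = md5Of(key ~ request);     // sent in the clear

    auto extension = "&action=delete".representation;
    auto forged = extend(signature, key.length + request.length, extension);

    // What the server computes when it receives the forged request:
    const(ubyte)[] received = (key ~ request).representation;
    received ~= md5Padding(key.length + request.length);
    received ~= extension;
    assert(forged == md5Of(received));
}
```

The forged request is the original request, followed by the padding bytes, followed by the extension. Whether the extra padding bytes in the middle cause trouble depends on how the server parses requests.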

More technical details about the practical Flickr exploit are in this paper. The relevant point is that this specific hack wouldn’t have worked if the secret had been appended instead of prepended. The real solution is to use HMAC for keyed hashing, or to carefully choose an algorithm like SHA3 that isn’t vulnerable to this length extension attack (because the hash value doesn’t reveal the entire digest state).
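
For comparison, here’s roughly what the HMAC version of a request signature could look like using std.digest.hmac. The sign and verify names, and the choice of SHA256, are my assumptions for the sketch.

```d
import std.digest.hmac : HMAC;
import std.digest.sha : SHA256;
import std.string : representation;

/// Keyed hash of a request using HMAC-SHA256.
ubyte[32] sign(string key, string request)
{
    auto mac = HMAC!SHA256(key.representation);
    mac.put(request.representation);
    return mac.finish();
}

/// Server-side check. (A production version would want a constant-time
/// comparison; plain == keeps the sketch short.)
bool verify(string key, string request, const ubyte[32] signature)
{
    return sign(key, request) == signature;
}

void main()
{
    auto signature = sign("s3cret-api-key", "user=alice&action=view");
    assert(verify("s3cret-api-key", "user=alice&action=view", signature));
    assert(!verify("s3cret-api-key", "user=alice&action=delete", signature));
}
```

HMAC hashes twice, with the key mixed into both the inner and the outer hash, so the value an attacker sees is no longer a digest state they can simply continue from.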

This kind of breakage caused by non-standard use of hash functions isn’t exactly rare — a similar vulnerability was one of multiple flaws in SSL 2.0.

Back to Password Hashing

With password hashing we’re using a non-secret salt, not a secret key. Putting the salt first does make the hashing weaker (in theory) because some amount of precomputation can be done based on hashOf(salt), which can be reused many times when calculating hashOf(salt ~ password_guess). Note that this is true even for hash algorithms that aren’t vulnerable to extension attacks: those stop you continuing from a finished hash value, but the salt isn’t secret anyway, and the hash is still calculated incrementally.
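
In D the precomputation is almost embarrassingly easy to sketch, because the Phobos digests are structs and copying one copies its state. The crack function name, the salt and the guesses here are all made up.

```d
import std.digest.sha : SHA256, sha256Of;
import std.string : representation;

/// Tries each guess against hashOf(salt ~ password), hashing the salt once.
string crack(string salt, const ubyte[32] target, string[] guesses)
{
    SHA256 afterSalt;
    afterSalt.start();
    afterSalt.put(salt.representation); // shared work, done a single time

    foreach (guess; guesses)
    {
        SHA256 attempt = afterSalt; // struct copy = saved digest state
        attempt.put(guess.representation);
        if (attempt.finish() == target)
            return guess;
    }
    return null; // no guess matched
}

void main()
{
    auto salt = "NaCl-for-user-42";
    auto target = sha256Of(salt ~ "hunter2"); // leaked hashOf(salt ~ password)
    assert(crack(salt, target, ["123456", "password", "hunter2"]) == "hunter2");
}
```

The real saving only kicks in for each complete 64-byte block of salt, since anything shorter just sits in the digest’s buffer until more data arrives.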

I said “in theory” because traditional password hashing using a single iteration of a hash function like SHA1 is just so fast to crack these days that it really doesn’t matter. It’s too easy to parallelise brute force guessing with GPUs, or even specialised hardware (ASICs). The more resistant options for password hashing today are:

bcrypt — an old and respected algorithm that’s designed to be slow and is also resistant to GPU/ASIC acceleration
scrypt — a shiny, new algorithm that’s designed to be resistant to GPU/ASIC acceleration
PBKDF2 — a standard system for iterating a hash many times. At least it’s slow, but prefer the other two because it’s basically a losing battle against custom hardware
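
To show what “iterating a hash many times” means, here’s a minimal sketch of PBKDF2 built on HMAC-SHA256 from std.digest.hmac. It only produces the first 32-byte output block, pbkdf2Block is a name I made up, and the iteration count in main is purely illustrative. For real systems, use a vetted library rather than this.

```d
import std.digest.hmac : HMAC;
import std.digest.sha : SHA256;
import std.stdio : writefln;
import std.string : representation;

/// First 32-byte output block of PBKDF2 with HMAC-SHA256.
ubyte[32] pbkdf2Block(string password, const(ubyte)[] salt, uint iterations)
{
    assert(iterations >= 1);
    auto key = password.representation;
    static immutable ubyte[4] blockIndex = [0, 0, 0, 1]; // block 1, big-endian

    // U1 = HMAC(password, salt ~ blockIndex)
    auto first = HMAC!SHA256(key);
    first.put(salt);
    first.put(blockIndex[]);
    ubyte[32] u = first.finish();

    ubyte[32] result = u;
    foreach (_; 1 .. iterations)
    {
        // U(n) = HMAC(password, U(n-1)); the output is the XOR of every U
        auto next = HMAC!SHA256(key);
        next.put(u[]);
        u = next.finish();
        foreach (i, b; u)
            result[i] ^= b;
    }
    return result;
}

void main()
{
    auto salt = "per-user salt".representation;
    auto slow = pbkdf2Block("hunter2", salt, 100_000);
    writefln("%(%02x%)", slow[]);
}
```

Each iteration depends on the previous one, so an attacker has to pay the full cost for every guess. The trouble is that each step is still just a fast hash, which is exactly what custom hardware is good at.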

By the way, although MD5 is generally considered a broken hash function in practice today, you’ll sometimes still see it used for password hashing and in HMAC. Arguably this is okay (ignoring the fact that MD5ed passwords are easy to brute-force anyway) because the practical MD5 breaks are collision attacks, and there are no publicly known, practical preimage attacks that could be used against these systems. RFC 6151 has some analysis for HMAC-MD5 from 2011. If you have the choice, though, avoiding MD5 (and SHA1) completely is a sound option.

Update: About a year after I wrote this, Google announced the first public SHA1 collision.