In a post a few months back I said it’s a popular
myth that const
is helpful for enabling compiler
optimisations in C and C++. I figured I should explain that one, especially because I used to believe it was
obviously true, myself. I’ll start off with some theory and artificial examples, then I’ll do some experiments and
benchmarks on a real codebase: Sqlite.
A simple test
Let’s start with what I used to think was the simplest and most obvious example of how const
can make C code faster. First, let’s say we have these two function
declarations:
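Something like this minimal pair will do (the names func() and constFunc() are just placeholders; the point is that one takes a plain pointer and the other takes a const one):

```c
void func(int *x);
void constFunc(const int *x);
```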
And suppose we have these two versions of some code:
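Here’s a sketch of the kind of pair I mean, assuming the declarations above (the printf() calls are only there to force the compiler to actually load the value):

```c
#include <stdio.h>

void func(int *x);              /* declarations from above */
void constFunc(const int *x);

void byArg(int *x)
{
  printf("%d\n", *x);
  func(x);
  printf("%d\n", *x);
}

void constByArg(const int *x)
{
  printf("%d\n", *x);
  constFunc(x);
  printf("%d\n", *x);
}
```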
To do the printf()
, the CPU has to fetch the value of
*x
from RAM through the pointer. Obviously, constByArg()
can be made slightly faster because the compiler knows that
*x
is constant, so there’s no need to load its value a second
time after constFunc()
does its thing. It’s just printing the
same thing. Right? Let’s check by compiling both functions with GCC with optimisations cranked up and comparing the generated assembly.
The only difference between the generated assembly code for byArg()
and constByArg()
is that constByArg()
has a call constFunc@PLT
, just like the source code asked. The const
itself has literally made zero difference.
Okay, that’s GCC. Maybe we just need a sufficiently smart compiler. Is Clang any better?
Its LLVM IR is more compact than assembly, so it’s easier to compare the two functions directly, and the comparison shows the same thing: literally zero difference except for the call.
Something that (sort of) works
Here’s some code where const
actually does make a
difference:
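Roughly like this sketch: the same constFunc() call as before, but operating on a local variable, with the const version declaring the local itself const:

```c
#include <stdio.h>

void constFunc(const int *x);   /* same declaration as before */

void localVar()
{
  int x = 42;
  printf("%d\n", x);
  constFunc(&x);
  printf("%d\n", x);
}

void constLocalVar()
{
  const int x = 42;   /* this time the variable itself is const */
  printf("%d\n", x);
  constFunc(&x);
  printf("%d\n", x);
}
```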
This time, GCC’s assembly for localVar() contains two instructions that have been optimised out of constLocalVar(). The LLVM IR is a little clearer: the load just before the second printf() call has been optimised out of constLocalVar().
Okay, so, constLocalVar()
has successfully elided the
reloading of *x
, but maybe you’ve noticed something a bit
confusing: it’s the same constFunc()
call in the bodies of
localVar()
and constLocalVar()
. If the compiler can deduce that constFunc()
didn’t modify *x
in constLocalVar()
, why can’t it deduce that the exact same function call
didn’t modify *x
in localVar()
?
The explanation gets closer to the heart of why C const
is
impractical as an optimisation aid. C const
effectively has
two meanings: it can mean the variable is a read-only alias to some data that may or may not be constant, or it can
mean the variable is actually constant. If you cast away const
from a pointer to a constant value and then write to it, the result
is undefined behaviour. On the other hand, it’s okay if it’s just a const
pointer to a value that’s not constant.
This possible implementation of constFunc()
shows what
that means:
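Something along these lines (the zero is arbitrary; the cast is the point):

```c
void constFunc(const int *x)
{
  /* "I know what I'm doing!" -- cast away const and write anyway.
     Legal if the pointed-to object wasn't defined const;
     undefined behaviour if it was. */
  *(int *)x = 0;
}
```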
localVar()
gave constFunc()
a const
pointer to a non-const
variable. Because the variable wasn’t originally const
, constFunc()
can be a liar and forcibly modify it without triggering UB.
So the compiler can’t assume the variable has the same value after constFunc()
returns. The variable in constLocalVar()
really is const
, though, so the compiler can assume it won’t change — because this
time it would be UB for constFunc()
to cast
const
away and write to it.
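Boiled down to a bare sketch (writeThroughConstPointer() is a made-up name for illustration), the rule looks like this:

```c
void writeThroughConstPointer(const int *p)
{
  *(int *)p = 99;   /* cast away const and write */
}

int main(void)
{
  int plain = 1;
  const int really_const = 2;

  writeThroughConstPointer(&plain);        /* fine: the pointed-to object isn't const */
  writeThroughConstPointer(&really_const); /* undefined behaviour: it is const */
  return 0;
}
```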
The byArg()
and constByArg()
functions in the first example are hopeless because the
compiler has no way of knowing if *x
really is const
.
Update (and digression): Quite a few readers have correctly pointed out that with const int *x
, the pointer itself isn’t qualified const
, just the data being aliased, and that const int * const extra_const
is a pointer that’s qualified const
“both ways”. But because the constness of the pointer itself is
independent of the constness of the data being aliased, the result is the same. *(int*const)extra_const = 0
is still UB only if extra_const
points to an object that’s defined with const
. (In fact, *(int*)extra_const = 0
wouldn’t be any worse.) Because it’s a mouthful to
keep distinguishing between a fully const
pointer and a
pointer that may or may not be itself constant but is a read-only alias to an object that may or may not be constant, I’ll
just keep referring loosely to “const
pointers”. (End of
digression.)
But why the inconsistency? If the compiler can assume that constFunc()
doesn’t modify its argument when called in constLocalVar()
, surely it can go ahead and apply the same optimisations
to other constFunc()
calls, right? Nope. The compiler can’t
assume constLocalVar()
is ever run at all. If it isn’t (say,
because it’s just some unused extra output of a code generator or macro), constFunc()
can sneakily modify data without ever triggering UB.
You might want to read the above explanation and examples a few times, but don’t worry if it sounds absurd: it is.
Unfortunately, writing to const
variables is the worst kind
of UB: most of the time the compiler can’t know if it even would be UB. So most of the time the compiler sees
const
, it has to assume that someone, somewhere could cast it
away, which means the compiler can’t use it for optimisation. This is true in practice because enough real-world C code
has “I know what I’m doing” casting away of const
.
In short, a whole lot of things can prevent the compiler from using const
for optimisation, including receiving data from another scope using
a pointer, or allocating data on the heap. Even worse, in most cases where const
can be used by the compiler, it’s not even necessary. For example,
any decent compiler can figure out that x
is constant in the
following code, even without const
:
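Something like this, say: x is never written after initialisation and its address never escapes, so the compiler can treat it as the constant 42 with or without the keyword:

```c
#include <stdio.h>

int main(void)
{
  int x = 42;   /* not declared const */
  int y = 0;
  printf("%d %d\n", x, y);
  y += x;
  printf("%d %d\n", x, y);   /* the compiler knows x is still 42 here */
  return 0;
}
```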
TL;DR: const
is almost useless for optimisation
because
- Except for special cases, the compiler has to ignore it because other code might legally cast it away
- In most of the exceptions to #1, the compiler can figure out a variable is constant, anyway
C++
There’s another way const
can affect code generation if
you’re using C++: function overloads. You can have const
and
non-const
overloads of the same function, and maybe the
non-const
can be optimised (by the programmer, not the
compiler) to do less copying or something.
On the one hand, I don’t think this is exploited much in practical C++ code. On the other hand, to make a real difference, the programmer has to make assumptions that the compiler can’t make because they’re not guaranteed by the language.
An experiment with Sqlite3
That’s enough theory and contrived examples. How much effect does const
have on a real codebase? I thought I’d do a test on the Sqlite
database (version 3.30.0) because
- It actually uses const
- It’s a non-trivial codebase (over 200KLOC)
- As a database, it includes a range of things from string processing to arithmetic to date handling
- It can be tested with CPU-bound loads
Also, the author and contributors have put years of effort into performance optimisation already, so I can assume they haven’t missed anything obvious.
The setup
I made two copies of the source code and compiled one
normally. For the other copy, I used this hacky preprocessor snippet to turn const
into a no-op:
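It’s a one-liner that simply defines const away to nothing:

```c
#define const
```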
(GNU) sed
can add that to the top of each file with
something like sed -i '1i#define const' *.c *.h
.
Sqlite makes things slightly more complicated by generating code using scripts at build time. Fortunately, compilers
make a lot of noise when const
and non-const
code are mixed, so it was easy to detect when this happened, and
tweak the scripts to include my anti-const
snippet.
Directly diffing the compiled results is a bit pointless because a tiny change can affect the whole memory layout,
which can change pointers and function calls throughout the code. Instead I took a fingerprint of the disassembly
(objdump -d libsqlite3.so.0.8.6
), using the binary size and
mnemonic for each instruction. That reduces each function to an ordered list of instruction sizes and mnemonics, which stays the same even when the linker shuffles addresses and offsets around.
I left all the Sqlite build settings as-is when compiling anything.
Analysing the compiled code
The const
version of libsqlite3.so was 4,740,704 bytes,
about 0.1% larger than the 4,736,712 bytes of the non-const
version. Both had 1374 exported functions (not including low-level helpers like stuff in the PLT), and a total of 13
had any difference in fingerprint.
A few of the changes were because of the dumb preprocessor hack. For example, some of the changed functions declared local constants as static const.
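Here’s a reduced sketch of the pattern (the function name, values and logic are made up for illustration, not actual Sqlite code):

```c
#include <stdint.h>

/* Hypothetical stand-in for one of the affected functions. The relevant
   part is the locals declared "static const". */
static int64_t clampDoubleToInt64(double r){
  static const int64_t maxInt = INT64_MAX;  /* plain statics once const is gone */
  static const int64_t minInt = INT64_MIN;
  if( r<=(double)minInt ) return minInt;
  if( r>=(double)maxInt ) return maxInt;
  return (int64_t)r;
}
```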
Removing const
makes those constants into static
variables. I don’t see why anyone who didn’t care about
const
would make those variables static
. Removing both static
and const
makes GCC recognise them as constants again, and we get the same
output. Three of the 13 functions had spurious changes because of local static const
variables like this, but I didn’t bother fixing any of
them.
Sqlite uses a lot of global variables, and that’s where most of the real const
optimisations came from. Typically they were things like a
comparison with a variable being replaced with a constant comparison, or a loop being partially unrolled a step. (The
Radare toolkit was handy for figuring out what the optimisations did.) A few changes
were underwhelming. sqlite3ParseUri()
is 487 instructions,
but the only difference const
made was swapping the order of one pair of comparisons.
Benchmarking
Sqlite comes with a performance regression test, so I tried running it a hundred times for each version of the code, still using the default Sqlite build settings. Here are the timing results in seconds:
| | const | No const |
|---|---|---|
| Minimum | 10.658s | 10.803s |
| Median | 11.571s | 11.519s |
| Maximum | 11.832s | 11.658s |
| Mean | 11.531s | 11.492s |
Personally, I’m not seeing enough evidence of a difference worth caring about. I mean, I removed const
from the entire program, so if it made a significant difference,
I’d expect it to be easy to see. But maybe you care about any tiny difference because you’re doing something absolutely
performance critical. Let’s try some statistical analysis.
I like using the Mann-Whitney U test for stuff like this. It’s similar to the more-famous t test for detecting differences in groups, but it’s more robust to the kind of complex random variation you get when timing things on computers (thanks to unpredictable context switches, page faults, etc). Here’s the result:
| | const | No const |
|---|---|---|
| N | 100 | 100 |
| Mean rank | 121.38 | 79.62 |

| Mann-Whitney U | 2912 |
|---|---|
| Z | -5.10 |
| 2-sided p value | < 10⁻⁶ |
| HL median difference | -0.056s |
| 95% confidence interval | -0.077s to -0.038s |
The U test has detected a statistically significant difference in performance. But, surprise, it’s actually the
non-const
version that’s faster — by about 60ms, or 0.5%. It
seems like the small number of “optimisations” that const
enabled weren’t worth the cost of extra code. It’s not like const
enabled any major optimisations like auto-vectorisation. Of course,
your mileage may vary with different compiler flags, or compiler versions, or codebases, or whatever, but I think it’s
fair to say that if const
were effective at improving C
performance, we’d have seen it by now.
So, what’s const
for?
For all its flaws, C/C++ const
is still useful for type
safety. In particular, combined with C++ move semantics and std::unique_ptr, const
can make pointer ownership explicit. Pointer ownership ambiguity
was a huge pain in old C++ codebases over ~100KLOC, so personally I’m grateful for that alone.
However, I used to go beyond using const
for meaningful
type safety. I’d heard it was best practice to use const
literally as much as possible for performance reasons. I’d heard that when performance really mattered, it was
important to refactor code to add more const
, even in ways
that made it less readable. That made sense at the time, but I’ve since learned that it’s just not true.