When LICM fails us

xania.org, 14 December 2025, 12:00

Written by me, proof-read by an LLM. Details at end.

Yesterday's LICM post ended with the compiler pulling invariants like size() and get_range out of our loop - clean assembly, great performance. Job done, right?

Not quite. Let's see how that optimisation can disappear.

Let's say you had a const char *, and wanted to write a function to return if there was an exclamation mark or not [1]:
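A minimal sketch of the kind of function meant here. The name has_exclamation is hypothetical; the important detail is that strlen sits in the loop condition:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical name: returns true if the NUL-terminated string contains '!'.
bool has_exclamation(const char *string) {
  // As written, strlen is part of the loop condition, so the source asks for
  // it to be evaluated on every iteration. We rely on the optimiser to notice
  // that the result cannot change and to compute it once.
  for (std::size_t index = 0; index < std::strlen(string); ++index) {
    if (string[index] == '!') return true;
  }
  return false;
}
```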

Here we're relying on loop-invariant code motion (LICM) to move the strlen out of the loop, so that it's called once before we loop, and then the loop is just an increment, a compare against '!', and a check for the end of the string.

All pretty good. We'll learn in later posts that we can improve on this simple loop, but for now this seems ok [2].

Now let's assume we were curious how many characters in total we were comparing across our entire program, and add some basic instrumentation [3]:
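A sketch of what that instrumentation might look like, assuming a global std::size_t counter called num_compares as described in the text. The function name is hypothetical:

```cpp
#include <cstddef>
#include <cstring>

// Global counter of character comparisons made across the whole program.
std::size_t num_compares = 0;

// Hypothetical name: same search as before, plus one increment per comparison.
bool has_exclamation_counted(const char *string) {
  for (std::size_t index = 0; index < std::strlen(string); ++index) {
    ++num_compares;  // a store to global memory inside the loop
    if (string[index] == '!') return true;
  }
  return false;
}
```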

We might reasonably expect a single increment to be added to our loop. But a closer look at the generated assembly shows something else: strlen is now called every time round the loop.

Suddenly we've lost our lovely LICM! By simply incrementing a global variable [4], we lost the ability to call strlen once and keep the result. Why is this?

It comes down to aliasing: the compiler can't prove that the string we're getting the length of doesn't share memory with the num_compares variable! Every time we modify num_compares the compiler has to assume that maybe the string changed as a result!
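To make the worry concrete, here is a sketch of the situation the compiler has to allow for: a const char * that really does point at the bytes of num_compares. The helper names are hypothetical and nothing like this appears in the function itself; the point is only that a caller could do it.

```cpp
#include <cstddef>
#include <cstring>

std::size_t num_compares = 0;

// Hypothetical helper: a char pointer to the counter's own storage.
// Inspecting an object's bytes through a char pointer is well-defined.
const char *bytes_of_counter() {
  return reinterpret_cast<const char *>(&num_compares);
}

// Hypothetical helper: returns true if incrementing the counter changed
// what is visible through the char pointer.
bool increment_changes_bytes() {
  const char *alias = bytes_of_counter();
  char before[sizeof(num_compares)];
  std::memcpy(before, alias, sizeof(before));  // snapshot the bytes
  ++num_compares;                              // modify the counter
  return std::memcmp(before, alias, sizeof(before)) != 0;
}
```

If a pointer like alias were passed in as the string, every ++num_compares would rewrite the "string", and strlen would have to be called again to get the right answer.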

This seems pretty odd: why on earth would a string overlap with a std::size_t? Don't we have rules about this? Aren't we disallowed from doing such things - type punning is barred - at least in C++?

Unfortunately, char* has a special status in the standard: it's allowed to alias with anything [5]. That's why the compiler can't assume our string and num_compares occupy different memory! If we had any other type than char * then the compiler could use LICM.

At least, that's what I hoped. In practice neither GCC, clang nor MSVC were able to LICM the code I tried. I could easily be missing something here, so do let me know if you have any ideas [6][7].
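Whatever the optimiser decides, we can always do the hoisting by hand. The following is a sketch rather than anything taken from the post: it assumes we know the string isn't modified while we search it, and the function name is hypothetical.

```cpp
#include <cstddef>
#include <cstring>

std::size_t num_compares = 0;

// Hypothetical name: the same instrumented search, with strlen hoisted by hand.
bool has_exclamation_hoisted(const char *string) {
  // Called exactly once. The result lives in a local variable whose address
  // is never taken, so stores to num_compares cannot affect it.
  const std::size_t length = std::strlen(string);
  for (std::size_t index = 0; index < length; ++index) {
    ++num_compares;
    if (string[index] == '!') return true;
  }
  return false;
}
```

This changes what the source asks for: the length is fixed at entry, so it only matches the original when nothing alters the string during the loop.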

Today's post seems a bit of a downer: the compiler wasn't able to do an optimisation I had otherwise hoped it might. Tomorrow we'll look at some other examples where the compiler struggles with aliasing, and we'll see how to help it out.

See the video that accompanies this post.

This post is day 14 of Advent of Compiler Optimisations 2025, a 25-day series exploring how compilers transform our code.


This post was written by a human (Matt Godbolt) and reviewed and proof-read by LLMs and humans.


The loop with strlen hoisted out by LICM:

    .L4:
      add rdi, 1                          ; ++index
      cmp BYTE PTR [rdi-1], 33            ; is string[index-1] a '!' ?
      je .L5                              ; if so, jump to "return true"
    .L2:
      cmp rdi, rax                        ; else...are we at the end?
      jne .L4                             ; loop if not

The loop after adding the num_compares increment:

    .L4:
      add QWORD PTR num_compares[rip], 1  ; ++num_compares;
      cmp BYTE PTR [rbp+0+rbx], 33        ; is char a '!' ?
      je .L5                              ; if so, jump to "return true"
      add rbx, 1                          ; ++index
    .L2:
      mov rdi, rbp                        ; er...
      call strlen                         ; what the heck? strlen?
      cmp rbx, rax                        ; oh no! what happened?
      jb .L4                              ; loop if index != strlen(...)

  1. Think of this as more a C-style implementation: in C++ we'd probably use a string_view here, which avoids the specific issue we're about to run into. And of course strchr and friends are more optimal, but... bear with me!

  2. Specifically, either by using strchr or other settings, we can get the compiler's help to vectorize this. We'll talk more of that later.

  3. We're going to ignore any threading issues here. 

  4. Making the variable static helps for clang, but not gcc. Clang sees that we don't use the variable and gets rid of it, but if we were to use it elsewhere it's possible it wouldn't be able to do this optimisation.

  5. See [basic.lval] paragraph 11 in the C++ draft standard, which lists char, unsigned char, and std::byte as exceptions to the strict aliasing rule.

  6. I filed a missed optimisation opportunity for GCC, and the thought is that the heuristic that handles inlining is perhaps making a poor choice. We'll talk about inlining later.

  7. Clang's optimiser had slightly different characteristics, and I filed this issue to track.