Loop unrolling

Code size:   increased
Performance: better
Done by:     developer or compiler

Loop unrolling, also known as loop unwinding, is an optimization which can reduce overhead of running a loop — number of instructions of checking the loop termination condition and loop counter modification. It requires fixed and known loop count.

For loop executes conditional statement e.g. i < 0 and modifies the loop counter e.g i++ once per iteration. It is shown in example below — a simple loop with very small body. To call this body 100 times the loop also executes conditional statement and loop counter incrementation, both 100 times. It means 200 of 300 instructions of this loop are overhead.

for(unsigned int i = 0; i < 100; i++)
   x[i] = i;

With a little modification the number of these overhead executions can be reduced to 25 each:

for(unsigned int i = 0; i < 100; i += 4)
   x[i]   = i;
   x[i+1] = i + 1;
   x[i+2] = i + 2;
   x[i+3] = i + 3;

The number of copies inside loop body is called the loop unrolling factor. Be careful while choosing unrolling factor to not exceed the array bounds.

To ensure your loop is optimized use unsigned type for loop counter instead of signed type. On some compilers it is also better to make loop counter decrement and make termination condition as comparison to zero (ARM).

You can find loop unrolling useful while dealing with performance-critical parts of code which are executed very often. But… should you really do optimization by hand?

The answer is: it depends on your compiler abilities and of course your needs (it is really necessary to optimize this loop? You should check it with profiler before starting optimizations!). Such optimization done by developer has several disadvantages with the less readable code on the top. It is also easy to make mistake while manually unrolling bigger loop, and such mistake can be hard to spot. In case your compiler can do it for you, leave it to compiler — it will be done better this way. On these terms your optimization may in fact decrease the performance and make your code less portable (Intel).

On the other hand there are compilers that do not unroll loops and in that case you could do it manually; especially if the loop body is small. But first do some research with profiler if it’s worth it.

Felix von Leitner summarized it shortly:

Learn what your compiler does, Then let the compiler do it.

GCC provides loop unrolling when flag -floop-optimize is enabled (which is true at levels -O, -O2, -O3, -Os).


2 thoughts on “Loop unrolling

  1. mminich says:

    Loop unrolling does sometimes make sense on platforms without an instruction cache, as long as there’s plenty of code space, or if performance of that loop’s body is critical. But on platforms with an instruction cache, manually unrolling loops may *reduce* performance (and waste code space!). If your loop body (including any function calls in it) is small in comparison to the loop overhead, and if it will loop 100,000 times and you unroll it 10 times so it only has to loop 10,000 times, that may very well make a measurable improvement. But if the same loop would only run 20 times and you unroll it 10 times so it only has to loop twice, or if you unroll it a full 20 times, you’re defeating the purpose of the cache, and the result will probably be slower and take more code space. Defeating the cache is a double-whammy: You have to load the loop body’s instructions from main memory more than once (instead of benefiting from the cache on every loop except the first), and in addition to that, you evict other (older) instructions from the cache which might have been useful in the near future, and will now need to be re-loaded since they were evicted from the cache.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s