Cross jumping/tail merging

Code size:   decreased or the same
Performance: worse or the same
Done by:     developer or compiler

Cross jumping or tail merging is an optimization technique used by compilers and humans. This approach is very easy to understand; first detect identical code sequences that can be shared, and then modify the program flow to work the same with the only one unique instance of this sequence.

Look at following example:

if (result > 0)
{
   doSomething();
   result = 0;
}
else
{
   doSomethingDifferent();
   result = 0;
}

The duplication of result assignment is unnecessary; it can be moved behind the if-else body.

if (result > 0)
{
   doSomething();
}
else
{
   doSomethingDifferent();
}

result = 0;

This is the easiest example of tail merging and it should be spotted by developer at first glance. Anyway, in real life the code is more complicated and it’s not always possible to optimize code by hand. Then we can make advantage of compiler optimization. Gcc performs cross jumping optimization when -fcrossjumping flag is enabled (which is enabled at -O2, -O3, -Os levels). Let’s take a look on how it’s done in practice.

I created a function that has some duplicated code inside if-else body, then I compile it with and without gcc cross jumping optimization. The resulted assembly is shown in table 1.

int fun(int& a, int& b)
{
   int c = 0;
   if (a > 100)
   {
      a += 100;
      b += 200;
      c = a + b;
   }
   else
   {
      a += 300;
      b += 200;
      c = a + b;
   }
   return c;
}

Table 1. Gcc cross jumping

without -fcrossjumping with -fcrossjumping
_Z3funRiS_:
.LFB0:
.cfi_startproc
pushl    %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl    %esp, %ebp
.cfi_def_cfa_register 5
subl    $16, %esp
movl    $0, -4(%ebp)
movl    8(%ebp), %eax
movl    (%eax), %eax
cmpl    $100, %eax
jle    .L2
movl    8(%ebp), %eax
movl    (%eax), %eax
leal    100(%eax), %edx
movl    8(%ebp), %eax
movl    %edx, (%eax)
movl    12(%ebp), %eax
movl    (%eax), %eax
leal    200(%eax), %edx
movl    12(%ebp), %eax
movl    %edx, (%eax)
movl    8(%ebp), %eax
movl    (%eax), %edx
movl    12(%ebp), %eax
movl    (%eax), %eax
addl    %edx, %eax
movl    %eax, -4(%ebp)

jmp    .L3
.L2:
movl    8(%ebp), %eax
movl    (%eax), %eax
leal    300(%eax), %edx
movl    8(%ebp), %eax
movl    %edx, (%eax)
movl    12(%ebp), %eax
movl    (%eax), %eax
leal    200(%eax), %edx
movl    12(%ebp), %eax
movl    %edx, (%eax)
movl    8(%ebp), %eax
movl    (%eax), %edx
movl    12(%ebp), %eax
movl    (%eax), %eax
addl    %edx, %eax
movl    %eax, -4(%ebp)
.L3:
movl    -4(%ebp), %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
_Z3funRiS_:
.LFB0:
.cfi_startproc
pushl    %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl    %esp, %ebp
.cfi_def_cfa_register 5
subl    $16, %esp
movl    $0, -4(%ebp)
movl    8(%ebp), %eax
movl    (%eax), %eax
cmpl    $100, %eax
movl    8(%ebp), %eax
movl    (%eax), %eax
jle    .L2
leal    100(%eax), %edx
jmp    .L5
.L2:
leal    300(%eax), %edx
.L5:
movl    8(%ebp), %eax
movl    %edx, (%eax)
movl    12(%ebp), %eax
movl    (%eax), %eax
leal    200(%eax), %edx
movl    12(%ebp), %eax
movl    %edx, (%eax)
movl    8(%ebp), %eax
movl    (%eax), %edx
movl    12(%ebp), %eax
movl    (%eax), %eax
addl    %edx, %eax
movl    %eax, -4(%ebp)
movl    -4(%ebp), %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc

The duplicated fragment is replaced with the jump to .L5. The code size is reduced, but the additional instruction jmp .L5 is added. Such optimization can play significant role when dealing with strict code size constraints in embedded software.

Results of this optimization will be the better, the worse — more duplicated — was the initial code. Remember that the abilities of compiler are much wider than the examples presented here; it can make improvements where the developer’s eye do not see any flaws. It can also happen that the optimization improves nothing. Regardless of result, the compilation time is extended.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s