Code Replication to Eliminate Branches

In addition to loop unrolling and other optimizations, the number of branches is reduced by replicating code in a way that eliminates branches. Code replication decreases the number of basic blocks (a basic block is a stream of instructions entered only at the beginning and exited only at the end) and increases instruction-scheduling opportunities.

Code replication normally occurs when a branch is at the end of a flow of control, such as a routine with multiple, short exit sequences. The code at the exit sequence gets replicated at the various places where a branch to it might occur.

For example, consider the following unoptimized routine and its optimized equivalent that uses code replication, where R0 (EAX on IA-32 systems) is register 0:

Unoptimized Instructions     Optimized (Replicated) Instructions
      .                            .
      .                            .
      .                            .
      branch to exit1              move 1 into R0
      .                            return
      .                            .
      .                            .
      branch to exit1              move 1 into R0
      .                            return
      .                            .
      .                            .
exit1: move 1 into R0              move 1 into R0
       return                      return
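
The same transformation can be sketched by hand at the source level. The following C fragment is only an illustration, not compiler output: the function names, the path parameter, and the tests on x are hypothetical and stand in for the elided code in the listing above. Each goto in the first function plays the role of "branch to exit1", and the second function copies the exit sequence to every place that branched to it.

```c
/* Hypothetical routine: two paths branch to one shared exit sequence. */
int check_shared_exit(int x, int *path)
{
    if (x < 0) {
        *path = -1;
        goto exit1;              /* branch to exit1 */
    }
    if (x == 0) {
        *path = 0;
        goto exit1;              /* branch to exit1 */
    }
    *path = 1;                   /* falls through into the exit sequence */
exit1:
    return 1;                    /* move 1 into R0, then return */
}

/* Same routine with the exit sequence replicated at each branch site. */
int check_replicated_exit(int x, int *path)
{
    if (x < 0) {
        *path = -1;
        return 1;                /* replicated exit sequence */
    }
    if (x == 0) {
        *path = 0;
        return 1;                /* replicated exit sequence */
    }
    *path = 1;
    return 1;                    /* original exit sequence */
}
```

Both functions produce the same results for every input. The replicated version has no jumps to a shared label, at the cost of repeating the short exit sequence. An optimizing compiler may already generate the second form from the first, so writing it by hand is rarely necessary.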

Similarly, code replication can occur within a loop that contains a case-type dispatch followed by a small amount of shared code at the bottom of the loop. The loop-end test-and-branch code might be replicated at the end of each case to create efficient instruction pipelining within the code for each case.
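
The loop case can be sketched in C in the same hedged way. The function names, the ops array, and the three operations are assumptions made up for this illustration. In the first function every case leaves the switch and reaches one shared loop-end test. In the second, the increment, test, and branch are written out at the end of each case, which is the shape the replicated code would take.

```c
/* Hypothetical dispatch loop: all cases share one loop-end test. */
int run_shared(const int *ops, int n)
{
    int acc = 0;
    int i = 0;
    while (i < n) {
        switch (ops[i]) {
        case 0:  acc += 1; break;    /* branch to the shared loop end */
        case 1:  acc *= 2; break;    /* branch to the shared loop end */
        default: acc -= 1; break;    /* branch to the shared loop end */
        }
        i++;                         /* shared loop-end code */
    }
    return acc;
}

/* Same loop with the loop-end test-and-branch replicated in each case. */
int run_replicated(const int *ops, int n)
{
    int acc = 0;
    int i = 0;
    if (n <= 0)
        return acc;                  /* empty input: loop body never runs */
top:
    switch (ops[i]) {
    case 0:
        acc += 1;
        if (++i < n) goto top;       /* replicated test-and-branch */
        return acc;
    case 1:
        acc *= 2;
        if (++i < n) goto top;       /* replicated test-and-branch */
        return acc;
    default:
        acc -= 1;
        if (++i < n) goto top;       /* replicated test-and-branch */
        return acc;
    }
}
```

For the input sequence 0, 1, 1, 2, 0 both functions return 4. The second form removes the jump from each case to the common loop bottom, so the instructions for each case and its loop-end test form one straight-line sequence that can be scheduled together.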