Loop splitting

Summary

Loop splitting is a compiler optimization technique. It attempts to simplify a loop or eliminate dependencies by breaking it into multiple loops which have the same bodies but iterate over different contiguous portions of the index range.

Loop peeling edit

Loop peeling is a special case of loop splitting which splits any problematic first (or last) few iterations from the loop and performs them outside of the loop body.

Suppose a loop was written like this:

 int p = 10;
 for (int i=0; i<10; ++i)
 {
   y[i] = x[i] + x[p];
   p = i;
 }

Notice that p = 10 only for the first iteration, and for all other iterations, p = i - 1. A compiler can take advantage of this by unwinding (or "peeling") the first iteration from the loop.

After peeling the first iteration, the code would look like this:

 y[0] = x[0] + x[10];
 for (int i=1; i<10; ++i)
 {
   y[i] = x[i] + x[i-1];
 }

This equivalent form eliminates the need for the variable p inside the loop body.

Loop peeling was introduced in gcc in version 3.4. More generalised loop splitting was added in GCC 7.[1]

Brief history of the term edit

Apparently the term was for the first time used by Cannings, Thompson and Skolnick[2] in their 1976 paper on computational models for (human) inheritance. There the term was used to denote a method for collapsing phenotypic information onto parents. From there the term was used again in their papers, including their seminal paper on probability functions on complex pedigrees.[3]

In compiler technology, the term first turned up in late 1980s papers on VLIW and superscalar compilation, including [4] and.[5]

References edit

  1. ^ GCC 7 Release Series — Changes, New Features, and Fixes - GNU Project
  2. ^ Cannings, C.; Thompson, E. A.; Skolnick, H. H. (1976). "The recursive derivation of likelihoods on complex pedigrees". Advances in Applied Probability. 8 (4): 622–625. doi:10.2307/1425918. JSTOR 1425918.
  3. ^ Cannings, C.; Thompson, E. A.; Skolnick, H. H. (1978). "Probability functions on complex pedigrees". Advances in Applied Probability. 10 (1): 26–61. doi:10.2307/1426718. JSTOR 1426718.
  4. ^ Callahan, D.; Kennedy, Ken (1988). "Compiling Programs for Distributed-memory Multiprocessors". The Journal of Supercomputing. 2 (2): 151–169. doi:10.1007/BF00128175. S2CID 10214341.
  5. ^ Mahlke, S. A.; Lin, D. C.; Chen, W. Y.; Hank, R. E.; Bringman, R. A. (1992). Effective compiler support for predicated execution using the hyperblock. 25th Annual International Symposium on Microarchitecture. pp. 45–54.

Further reading edit