[SimpleLoopUnswitch] Skip non-trivial unswitching of cold loop nests
This fixes a compile time issue due to guarding loop unswitching based on whether the enclosing function is cold. That approach is very inefficient in the case of large cold functions that contain numerous loops, since the loop pass calls isFunctionColdInCallGraph once per loop, and that function walks all BBs in the function (twice for Sample PGO) looking for any non-cold blocks.

Originally, this code only checked if the current Loop's header was cold (D129599). However, that apparently caused a slowdown on a SPEC benchmark, and the example given was that of a cold inner loop nested in a non-cold outer loop (see comments in D129599). The fix was to check if the whole function is cold, done in D133275.

This is overkill, and we can simply check if the header of any loop in the current loop's loop nest is non-cold (looking at both outer and inner loops). This patch drops the compile time for a large module by 40% with this approach.

I also updated PGO-nontrivial-unswitch2.ll since it only had one cold loop in a non-cold function, so that it instead had IR based off the example given in the comments relating to the SPEC degradation in D129599. I confirmed that the new version of the test fails with the original check done in D129599 of only the current loop's header coldness.

Similarly updated test PGO-nontrivial-unswitch.ll to contain a cold loop in a cold loop nest, and created PGO-nontrivial-unswitch3.ll to contain a non-cold loop in a non-cold loop nest.

Differential Revision: https://reviews.llvm.org/D146383
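For illustration only, the loop-nest check described above can be sketched on a toy model. This is not the code from the patch and does not use LLVM's Loop, ProfileSummaryInfo, or BlockFrequencyInfo classes: ToyLoop, headerIsCold, addSubLoop, and isLoopNestCold are hypothetical names, and header coldness is a plain flag rather than a profile query. The sketch assumes "loop nest" means the current loop, its enclosing loops, and the loops nested within it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a loop in a loop nest (not LLVM's Loop class).
struct ToyLoop {
  bool headerIsCold;            // assumed to come from profile data in practice
  ToyLoop *parent = nullptr;    // enclosing loop, or nullptr if outermost
  std::vector<std::unique_ptr<ToyLoop>> subLoops;

  explicit ToyLoop(bool cold) : headerIsCold(cold) {}

  // Create a loop nested directly inside this one and return it.
  ToyLoop *addSubLoop(bool cold) {
    subLoops.push_back(std::make_unique<ToyLoop>(cold));
    subLoops.back()->parent = this;
    return subLoops.back().get();
  }
};

// Returns true only if the header of L, of every enclosing loop, and of
// every loop nested within L is cold. The cost is proportional to the size
// of the loop nest, not to the number of basic blocks in the function.
bool isLoopNestCold(const ToyLoop *L) {
  assert(L && "expected a loop");

  // Check L and all of its enclosing loops.
  for (const ToyLoop *P = L; P; P = P->parent)
    if (!P->headerIsCold)
      return false;

  // Check all loops nested within L, at any depth.
  std::vector<const ToyLoop *> worklist;
  for (const auto &sub : L->subLoops)
    worklist.push_back(sub.get());
  while (!worklist.empty()) {
    const ToyLoop *cur = worklist.back();
    worklist.pop_back();
    if (!cur->headerIsCold)
      return false;
    for (const auto &sub : cur->subLoops)
      worklist.push_back(sub.get());
  }
  return true;
}
```

Under this model, a cold inner loop inside a non-cold outer loop (the SPEC case from D129599) is not treated as a cold loop nest, so non-trivial unswitching would not be skipped for it.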