Commits · 2b4907e73a11794d6f3cc8ca778d1a73a9fbcd04 · Roger Ferrer / llvm-epi-0.8

Oct 24, 2011

Implement comparison operators for BranchProbability in a way that can't overflow INT64_MAX. · 812da491
Benjamin Kramer authored Oct 24, 2011
```
Add a test case for the edge case that triggers this. Thanks to Chandler for bringing this to my attention.

llvm-svn: 142794
```
812da491
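
A minimal sketch of the overflow-safe comparison idea, not the code from this commit. It assumes a probability is stored as a 32-bit numerator and denominator; the `Prob` struct and the `lessThan` name are hypothetical. Two fractions are compared by cross-multiplying in unsigned 64-bit arithmetic, where the product of two 32-bit values always fits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a branch probability: N / D with D != 0.
struct Prob {
  uint32_t N;
  uint32_t D;
};

// Compare A < B without dividing. Each product is at most (2^32 - 1)^2,
// which fits in uint64_t but can exceed INT64_MAX, so the arithmetic
// has to be unsigned.
inline bool lessThan(Prob A, Prob B) {
  assert(A.D != 0 && B.D != 0 && "denominator must be nonzero");
  return static_cast<uint64_t>(A.N) * B.D < static_cast<uint64_t>(B.N) * A.D;
}
```

The edge case worth testing is the one where both operands are near `UINT32_MAX`, since that is where a signed 64-bit product would wrap.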

Remove return heuristics from the static branch probabilities, and · 7111f456

Chandler Carruth authored Oct 24, 2011

introduce no-return or unreachable heuristics.

The return heuristics from the Ball and Larus paper don't work well in
practice as they pessimize early return paths. The only return
heuristics with a good hit rate are those for:
 - NULL return
 - Constant return
 - negative integer return

Only the last of these three can possibly require significant code for
the returning block, and even the last is fairly rare and usually also
a constant. As a consequence, even for the cold return paths, there is
little code on that return path, and so little code density to be gained
by sinking it. The places where sinking these blocks is valuable (inner
loops) will already be weighted appropriately as the edge is a loop-exit
branch.

All of this aside, early returns are nearly as common as all three of
these return categories, and should actually be predicted as taken!
Rather than muddy the waters of the static predictions, just remain
silent on returns and let the CFG itself dictate any layout or other
issues.

However, the return heuristic was flagging one very important case:
unreachable. Unfortunately it still gave a 1/4 chance of the
branch-to-unreachable occurring. It also didn't do a rigorous job of
finding those blocks which post-dominate an unreachable block.

This patch builds a more powerful analysis that should flag all branches
to blocks known to then reach unreachable. It also has better worst-case
runtime complexity by not looping through successors for each block. The
previous code would perform an N^2 walk in the event of a single entry
block branching to N successors with a switch where each successor falls
through to the next and they finally fall through to a return.

Test case added for noreturn heuristics. Also doxygen comments improved
along the way.

llvm-svn: 142793

7111f456
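
The unreachable analysis described above can be sketched on a toy CFG. This is an illustration under stated assumptions, not the BranchProbabilityInfo implementation: `ToyCFG` and `findBlocksReachingUnreachable` are hypothetical names, and blocks are plain indices. Blocks are finished in post-order, so each block is decided once and each edge is examined a constant number of times.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy CFG (hypothetical representation): block I has successors Succs[I],
// and EndsInUnreachable[I] is true if its terminator is `unreachable`.
struct ToyCFG {
  std::vector<std::vector<std::size_t>> Succs;
  std::vector<bool> EndsInUnreachable;
};

// Post-order DFS from Entry. A block is flagged if it ends in unreachable,
// or if it has successors and every one of them is already flagged.
// A successor reached through a back edge is not yet flagged when it is
// examined, so loops are treated conservatively.
inline std::vector<bool> findBlocksReachingUnreachable(const ToyCFG &G,
                                                       std::size_t Entry) {
  const std::size_t N = G.Succs.size();
  assert(G.EndsInUnreachable.size() == N && Entry < N);
  std::vector<bool> Visited(N, false), Flagged(N, false);
  // Explicit stack of (block, index of the next successor to visit).
  std::vector<std::pair<std::size_t, std::size_t>> Stack;
  Visited[Entry] = true;
  Stack.push_back({Entry, 0});
  while (!Stack.empty()) {
    const std::size_t BB = Stack.back().first;
    std::size_t &Next = Stack.back().second;
    if (Next < G.Succs[BB].size()) {
      const std::size_t S = G.Succs[BB][Next];
      ++Next;
      if (!Visited[S]) {
        Visited[S] = true;
        Stack.push_back({S, 0}); // invalidates Next; not used again below
      }
      continue;
    }
    // All successors are finished: decide this block.
    bool AllSuccsFlagged = !G.Succs[BB].empty();
    for (std::size_t S : G.Succs[BB])
      AllSuccsFlagged = AllSuccsFlagged && Flagged[S];
    Flagged[BB] = G.EndsInUnreachable[BB] || AllSuccsFlagged;
    Stack.pop_back();
  }
  return Flagged;
}
```

A branch whose target is flagged could then be given a very low weight, while branches to ordinary return blocks are left alone, in line with the commit's choice to stay silent on returns.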

Revert "Test commit" · d1175cf7
NAKAMURA Takumi authored Oct 24, 2011
```
llvm-svn: 142792
```
d1175cf7
Test commit · 6ff417a1
NAKAMURA Takumi authored Oct 24, 2011
```
llvm-svn: 142791
```
6ff417a1

Reapply r142781 with fix. Original message: · 9be7f277

Nick Lewycky authored Oct 24, 2011

  Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
  loop header when computing the trip count.

  With this, we now constant evaluate:
    struct ListNode { const struct ListNode *next; int i; };
    static const struct ListNode node1 = {0, 1};
    static const struct ListNode node2 = {&node1, 2};
    static const struct ListNode node3 = {&node2, 3};
    int test() {
      int sum = 0;
      for (const struct ListNode *n = &node3; n != 0; n = n->next)
        sum += n->i;
      return sum;
    }

llvm-svn: 142790

9be7f277
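
To make "brute force" concrete, here is a rough C++17 sketch of the idea, not SCEV's actual code. It treats `n` and `sum` from the example as the two loop-header PHI values and steps them together until the exit condition holds or an iteration budget runs out. `LoopState`, `evaluateSum`, and the `MaxIterations` parameter are names invented for this illustration.

```cpp
#include <cassert>
#include <optional>

struct ListNode {
  const ListNode *next;
  int i;
};

// Values of the two loop-header PHIs at the top of an iteration.
struct LoopState {
  const ListNode *n;
  int sum;
};

// Step the header state one iteration at a time until the exit condition
// (n == nullptr) holds. Gives up and returns nullopt if the loop has not
// exited after MaxIterations executions of the body.
inline std::optional<int> evaluateSum(const ListNode *Head,
                                      unsigned MaxIterations) {
  LoopState S{Head, 0};
  for (unsigned Iter = 0; Iter < MaxIterations; ++Iter) {
    if (S.n == nullptr)
      return S.sum;
    // Both PHIs advance together, each computed from the old state.
    S = LoopState{S.n->next, S.sum + S.n->i};
  }
  if (S.n == nullptr)
    return S.sum;
  return std::nullopt;
}
```

The iteration budget is what keeps this kind of evaluation bounded: a loop that does not exit within the budget is simply left unanalyzed.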

Doxygen-ify the comments on the public interface for BPI. Also, move the · f5394bcf

Chandler Carruth authored Oct 24, 2011

two more subtle routines to the bottom and expand on their cautionary
comments a bit. No functionality or actual interface change here.

llvm-svn: 142789

f5394bcf

PHI nodes not in the loop header aren't part of the loop iteration initial · 8e904dee

Nick Lewycky authored Oct 24, 2011

state. Furthermore, they might not have two operands. This fixes the underlying
issue behind the crashes introduced in r142781.

llvm-svn: 142788

8e904dee

A dead malloc, a free(NULL) and a free(undef) are all trivially dead · dd1d3df5

Nick Lewycky authored Oct 24, 2011

instructions.

This doesn't introduce any optimizations we weren't doing before (except
potentially due to pass ordering issues), but passes will now eliminate them sooner
as part of their own cleanups.

llvm-svn: 142787

dd1d3df5

Speculatively revert r142781. Bots are showing · 9d28c26d

Nick Lewycky authored Oct 24, 2011

  Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed.
coming out of indvars.

llvm-svn: 142786

9d28c26d

Windows/Path.inc: [PR8460] Get rid of ScopedNullTerminator. Thanks to Zvi Rackover! · 66c9e4ff
NAKAMURA Takumi authored Oct 24, 2011
```
llvm-svn: 142785
```
66c9e4ff

Simplify the design of BranchProbabilityInfo by collapsing it into · 7a0094a6

Chandler Carruth authored Oct 24, 2011

a single class. Previously it was split between two classes, one
internal and one external. The concern seemed to center around exposing
the weights used, but those can remain confined to the implementation
file.

Having a single class to maintain the state and analyses in use will
also simplify several of the enhancements I want to make to our static
heuristics.

llvm-svn: 142783

7a0094a6

Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the · 1700007e

Nick Lewycky authored Oct 23, 2011

loop header when computing the trip count.

With this, we now constant evaluate:
  struct ListNode { const struct ListNode *next; int i; };
  static const struct ListNode node1 = {0, 1};
  static const struct ListNode node2 = {&node1, 2};
  static const struct ListNode node3 = {&node2, 3};
  int test() {
    int sum = 0;
    for (const struct ListNode *n = &node3; n != 0; n = n->next)
      sum += n->i;
    return sum;
  }

llvm-svn: 142781

1700007e

Tidy up a loop to be more idiomatic for LLVM's codebase, and remove some · 24cee10f

Chandler Carruth authored Oct 23, 2011

extraneous whitespace. Trying to clean up this pass as much as I can
before I start making functional changes.

llvm-svn: 142780

24cee10f

Add X86 SARX, SHRX, and SHLX instructions. · b05d9e9b
Craig Topper authored Oct 23, 2011
```
llvm-svn: 142779
```
b05d9e9b

Oct 23, 2011

Teach the BranchProbabilityInfo pass to print its results, and use that · 1c8ace0e

Chandler Carruth authored Oct 23, 2011

to bring it under direct test instead of merely indirectly testing it in
the BlockFrequencyInfo pass.

The next step is to start adding tests for the various heuristics
employed, and to start fixing those heuristics once they're under test.

llvm-svn: 142778

1c8ace0e

Rename the script to indicate that this is for the TEST=simple tests. · addcfcac
Bill Wendling authored Oct 23, 2011
```
llvm-svn: 142764
```
addcfcac
Resurrect the 'find regressions for the TEST=nightly tests' script. · e10675a3
Bill Wendling authored Oct 23, 2011
```
llvm-svn: 142763
```
e10675a3

Now that we have comparison on probabilities, add some static functions · fd7475e9

Chandler Carruth authored Oct 23, 2011

to get important constant branch probabilities and use them for finding
the best branch out of a set of possibilities.

llvm-svn: 142762

fd7475e9
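
A small sketch of how constant probabilities and a comparison operator combine to pick the best branch. The `Prob` type, the `getZero`/`getOne` names, and `pickBestEdge` are assumptions made for this example and are not taken from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical probability type: N / D with D != 0.
struct Prob {
  uint32_t N, D;
  static Prob getZero() { return {0, 1}; }
  static Prob getOne() { return {1, 1}; }
  friend bool operator<(Prob A, Prob B) {
    return static_cast<uint64_t>(A.N) * B.D < static_cast<uint64_t>(B.N) * A.D;
  }
  friend bool operator>(Prob A, Prob B) { return B < A; }
};

// Return the index of the most probable edge. The search is seeded with
// getZero(), so only an edge with nonzero probability can win; if there is
// none, Edges.size() is returned.
inline std::size_t pickBestEdge(const std::vector<Prob> &Edges) {
  std::size_t Best = Edges.size();
  Prob BestProb = Prob::getZero();
  for (std::size_t I = 0; I < Edges.size(); ++I) {
    if (Edges[I] > BestProb) {
      Best = I;
      BestProb = Edges[I];
    }
  }
  return Best;
}
```

Seeding with a named constant instead of a magic pair of integers is the design point here: the loop reads as "better than nothing so far".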

Remove a commented out line of code that snuck by my auditing. · 446210b6
Chandler Carruth authored Oct 23, 2011
```
llvm-svn: 142761
```
446210b6
Print branch probabilities as percentages. · cc0ed6ba
Benjamin Kramer authored Oct 23, 2011
```
50% is much more readable than 5.000000e-01.

llvm-svn: 142752
```
cc0ed6ba
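
As an illustration only, a numerator/denominator pair can be rendered as a percentage like this. `formatAsPercent` is a hypothetical helper, and the two-decimal format is a choice made for this sketch; the commit does not state the exact output format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Format N / D as a percentage with two decimal places.
inline std::string formatAsPercent(uint32_t N, uint32_t D) {
  assert(D != 0 && "denominator must be nonzero");
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "%.2f%%", 100.0 * N / D);
  return Buf;
}
```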
Add compare operators to BranchProbability and use it to determine if an edge is hot. · 929f53f6
Benjamin Kramer authored Oct 23, 2011
```
llvm-svn: 142751
```
929f53f6

Completely re-write the algorithm behind MachineBlockPlacement based on · bd1be4d0

Chandler Carruth authored Oct 23, 2011

discussions with Andy. Fundamentally, the previous algorithm is both
counterproductive on several fronts and prioritizes things which
aren't necessarily the most important: static branch prediction.

The new algorithm uses the existing loop CFG structure information to
walk through the CFG itself to layout blocks. It coalesces adjacent
blocks within the loop where the CFG allows based on the most likely
path taken. Finally, it topologically orders the block chains that have
been formed. This allows it to choose a (mostly) topologically valid
ordering which still prioritizes fallthrough within the structural
constraints.

As a final twist in the algorithm, it does violate the CFG when it
discovers a "hot" edge, that is an edge that is more than 4x hotter than
the competing edges in the CFG. These are forcibly merged into
a fallthrough chain.

Future transformations that need to be added are rotation of loop exit
conditions to be fallthrough, and better isolation of cold block chains.
I'm also planning on adding statistics to model how well the algorithm
does at laying out blocks based on the probabilities it receives.

The old tests mostly still pass, and I have some new tests to add, but
the nested loops are still behaving very strangely. This almost seems
like working-as-intended as it rotated the exit branch to be
fallthrough, but I'm not convinced this is actually the best layout. It
is well supported by the probabilities for loops we currently get, but
those are pretty broken for nested loops, so this may change later.

llvm-svn: 142743

bd1be4d0
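
The "hot edge" rule from the message above, an edge more than 4x hotter than its competitors, can be sketched as follows. This is not the MachineBlockPlacement code: it assumes each outgoing edge carries a plain integer weight, and `findHotSuccessor` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Return the index of the successor whose weight is more than 4x the weight
// of every competing successor, if there is one. With a single successor
// there are no competitors, so it is returned.
inline std::optional<std::size_t>
findHotSuccessor(const std::vector<uint32_t> &Weights) {
  if (Weights.empty())
    return std::nullopt;
  // Only the heaviest edge can possibly qualify.
  std::size_t Best = 0;
  for (std::size_t I = 1; I < Weights.size(); ++I)
    if (Weights[I] > Weights[Best])
      Best = I;
  for (std::size_t I = 0; I < Weights.size(); ++I) {
    if (I == Best)
      continue;
    // Widen to 64 bits so 4 * weight cannot wrap.
    if (static_cast<uint64_t>(Weights[Best]) <=
        4 * static_cast<uint64_t>(Weights[I]))
      return std::nullopt;
  }
  return Best;
}
```

Under this rule an 80/20 split is not hot (80 is exactly 4x, not more), while a 90/10 split is.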

Add X86 RORX instruction · 980d5983
Craig Topper authored Oct 23, 2011
```
llvm-svn: 142741
```
980d5983

The element insertion code in scalar replacement doesn't handle incorrect · 057fbb1a

Cameron Zwarich authored Oct 23, 2011

element types, even though the element extraction code does. It is surprising
that this bug has been here for so long. Fixes <rdar://problem/10318778>.

llvm-svn: 142740

057fbb1a

Add X86 MULX instruction for disassembler. · e94d277d
Craig Topper authored Oct 23, 2011
```
llvm-svn: 142738
```
e94d277d
Remove some duplicate specifying of neverHasSideEffects and mayLoad from X86 multiply instructions. · 7412aa98
Craig Topper authored Oct 22, 2011
```
llvm-svn: 142737
```
7412aa98
Oops! Fix test I forgot to submit as part of r142735. · 52340ac5
Nick Lewycky authored Oct 22, 2011
```
llvm-svn: 142736
```
52340ac5

Oct 22, 2011
- A non-escaping malloc in the entry block is not unlike an alloca. Do dead-store · 32f8051d
  Nick Lewycky authored Oct 22, 2011
```
elimination on them too.

llvm-svn: 142735
```
  32f8051d
- Make SCEV's brute force analysis stronger in two ways. Firstly, we should be · a6674c7f
  Nick Lewycky authored Oct 22, 2011
```
able to constant fold load instructions where the argument is a constant.
Second, we should be able to watch multiple PHI nodes through the loop; this
patch only supports PHIs in loop headers, more can be done here.

With this patch, we now constant evaluate:
  static const int arr[] = {1, 2, 3, 4, 5};
  int test() {
    int sum = 0;
    for (int i = 0; i < 5; ++i) sum += arr[i];
    return sum;
  }

llvm-svn: 142731
```
  a6674c7f
- Fix a typo. · aa6fab24
  Nadav Rotem authored Oct 22, 2011
```
llvm-svn: 142729
```
  aa6fab24
- Minor updates. · dfc072d4
  Jim Grosbach authored Oct 22, 2011
```
llvm-svn: 142728
```
  dfc072d4
- Added my name to CREDITS.TXT · 9f5ca0ba
  Nadav Rotem authored Oct 22, 2011
```
llvm-svn: 142727
```
  9f5ca0ba
- Move various generated tables into read-only memory, fixing up const correctness along the way. · 0d6d0988
  Benjamin Kramer authored Oct 22, 2011
```
llvm-svn: 142726
```
  0d6d0988
- Fix pr11193. · e649d665
  Nadav Rotem authored Oct 22, 2011
```
SHL inserts zeros from the right, thus even when the original
sign_extend_inreg value was only 1 bit wide, we still need the sra.

llvm-svn: 142724
```
  e649d665
- The different flavors of ARM have different valid subsets of registers. Check · 94e6643f
  Bill Wendling authored Oct 22, 2011
```
that the set of callee-saved registers is correct for the specific platform.
<rdar://problem/10313708> & ctor_dtor_count & ctor_dtor_count-2

llvm-svn: 142706
```
  94e6643f
- Assembly parsing for 4-register sequential variant of VLD2. · 11c0b347
  Jim Grosbach authored Oct 21, 2011
```
llvm-svn: 142704
```
  11c0b347
- Assembly parsing for 2-register sequential variant of VLD2. · 118b38cb
  Jim Grosbach authored Oct 21, 2011
```
llvm-svn: 142691
```
  118b38cb
- Make sure that the landing pads themselves have no PHI instructions in them. · b1c43088
  Bill Wendling authored Oct 21, 2011
```
The assumption in the back-end is that PHIs are not allowed at the start of the
landing pad block for SjLj exceptions.
<rdar://problem/10313708>

llvm-svn: 142689
```
  b1c43088
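
The pr11193 fix (e649d665) in the list above turns on why a shift left must be followed by an arithmetic shift right. The general shl-then-sra idiom can be sketched in scalar code as below. This is not the DAG combine that was patched, and `signExtendInReg` is a hypothetical helper for 32-bit values.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low Bits bits of X to 32 bits.
inline int32_t signExtendInReg(uint32_t X, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 32 && "field width out of range");
  const unsigned Shift = 32 - Bits;
  // The left shift moves the field's top bit into bit 31 and fills the low
  // bits with zeros, so on its own it never replicates the sign bit.
  // (The unsigned-to-signed conversion is modular on mainstream compilers
  // and guaranteed to be so since C++20.)
  const int32_t Shifted = static_cast<int32_t>(X << Shift);
  // The arithmetic shift right copies bit 31 back down across the upper bits.
  // (Right-shifting a negative value is arithmetic on mainstream compilers
  // and guaranteed to be so since C++20.)
  return Shifted >> Shift;
}
```

Even for a 1-bit field the second shift matters: the input 1 becomes 0x80000000 after the left shift and only turns into -1 once the arithmetic shift right is applied.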
Oct 21, 2011

Extend the floating point heuristic to consider NaN checks unlikely. · 606a50a9
Benjamin Kramer authored Oct 21, 2011
```
llvm-svn: 142687
```
606a50a9

Revert r141657 for now. This has broken css and changed links on llvm.org. I'd... · 8a8d6466

Tanya Lattner authored Oct 21, 2011

Revert r141657 for now. This has broken css and changed links on llvm.org. I'd like to understand exactly why the links have changed and if a newer doxygen is required. This may be reapplied once we upgrade on llvm.org and it is fully tested.

llvm-svn: 142686

8a8d6466