Commits · b5b2a1e19a4ee8c49b73837e7247b403a74be8da · Roger Ferrer / llvm-epi-0.8

Jan 08, 2011

When loop rotation happens, it is *very* common for the duplicated condbr · 59c82f85

Chris Lattner authored Jan 08, 2011

to be foldable into an uncond branch.  When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.

Handle this case more aggressively.  For example, previously on
phi-duplicate.ll we would get this:


define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  %cmp1 = icmp slt i64 1, 1000
  br i1 %cmp1, label %bb.nph, label %for.end

bb.nph:                                           ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %bb.nph, %for.cond
  %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.02
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.02, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.02, 1
  br label %for.cond

for.cond:                                         ; preds = %for.body
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

for.cond.for.end_crit_edge:                       ; preds = %for.cond
  br label %for.end

for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
  ret void
}

Now we get the much nicer:

define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.01
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.01, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.01, 1
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}
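
As a rough source-level illustration of the two IR listings above, the sketch below shows the same loop in three shapes: unrotated, rotated with a guard, and rotated with the guard folded away. The function names and the use of `std::vector` are assumptions made for this example; they do not come from the test file.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical source-level view of the @test loop: G[j] = G[j] + G[j-1].

// Unrotated form: the exit test runs at the top of every iteration.
void runUnrotated(std::vector<double> &G, std::size_t n) {
  for (std::size_t j = 1; j < n; ++j)
    G[j] = G[j] + G[j - 1];
}

// Rotated form: the exit test is duplicated as a guard in front of the
// loop, and the loop itself tests at the bottom.
void runRotated(std::vector<double> &G, std::size_t n) {
  if (1 < n) {
    std::size_t j = 1;
    do {
      G[j] = G[j] + G[j - 1];
      ++j;
    } while (j < n);
  }
}

// With a constant bound the guard "1 < 1000" is always true, so it can be
// dropped together with the blocks that only existed to serve it.
void runRotatedConstant(std::vector<double> &G) {
  std::size_t j = 1;
  do {
    G[j] = G[j] + G[j - 1];
    ++j;
  } while (j < 1000);
}
```

All three functions compute the same result for a 1000-element array; only the control-flow shape differs.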

With all of these recent changes, we are now able to compile:

void foo(char *X) {
 for (int i = 0; i != 100; ++i) 
   for (int j = 0; j != 100; ++j)
     X[j+i*100] = 0;
}

into a single memset of 10000 bytes.  This series of changes
should also help other nested loop scenarios.
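
To make the claim concrete, here is a small sketch that places the nested loop next to the single `memset` it is equivalent to. The names `fooLoops` and `fooMemset` are invented for this example; the loop body is the one from the message.

```cpp
#include <cstring>

// The nested loop from the commit message, unchanged apart from the name.
void fooLoops(char *X) {
  for (int i = 0; i != 100; ++i)
    for (int j = 0; j != 100; ++j)
      X[j + i * 100] = 0;
}

// What the optimized code amounts to: one memset over all 100 * 100 bytes.
void fooMemset(char *X) {
  std::memset(X, 0, 10000);
}
```

The index `j + i * 100` visits every byte in `[0, 10000)` exactly once, which is why the two loops can collapse into one contiguous store.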

llvm-svn: 123079

make domtree verification print something useful on failure. · 5f7734c4
Chris Lattner authored Jan 08, 2011
```
llvm-svn: 123078
```
split ssa updating code out to its own helper function. Don't bother · 30f318e5
Chris Lattner authored Jan 08, 2011
```
moving the OrigHeader block anymore: we just merge it away anyway so
its code layout doesn't matter.

llvm-svn: 123077
```

Implement a TODO: Enhance loopinfo to merge away the unconditional branch · 2615130e

Chris Lattner authored Jan 08, 2011

that it was leaving in loops after rotation (between the original latch
block and the original header).

With this change, it is possible for rotated loops to have just a single
basic block, which is useful.
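
The merge described above can be sketched on a toy control-flow graph. This is not LLVM's data structure or its `MergeBlockIntoPredecessor` implementation; the `CFG` alias, the helper names, and the merge conditions are simplifying assumptions made for this example.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Toy CFG (hypothetical): block name -> list of successor names.
using CFG = std::map<std::string, std::vector<std::string>>;

// Count the edges in the CFG that target `block`.
static int countPreds(const CFG &cfg, const std::string &block) {
  int n = 0;
  for (const auto &entry : cfg)
    n += static_cast<int>(
        std::count(entry.second.begin(), entry.second.end(), block));
  return n;
}

// Merge `block` into `pred` when the only edge out of `pred` goes to
// `block` (an unconditional branch) and `block` has no other predecessor.
// Returns true if the merge happened.
bool mergeIntoPredecessor(CFG &cfg, const std::string &pred,
                          const std::string &block) {
  auto predIt = cfg.find(pred);
  auto blockIt = cfg.find(block);
  if (predIt == cfg.end() || blockIt == cfg.end() || pred == block)
    return false;
  if (predIt->second != std::vector<std::string>{block})
    return false;
  if (countPreds(cfg, block) != 1)
    return false;
  // The predecessor inherits the merged block's successors.
  std::vector<std::string> succs = blockIt->second;
  predIt->second = succs;
  cfg.erase(blockIt);
  return true;
}
```

Applied to a loop whose body branches unconditionally to a block holding the exit test, the merge leaves one block that branches back to itself, which is the single-block loop shape mentioned above.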

llvm-svn: 123075

various code cleanups, enhance MergeBlockIntoPredecessor to preserve · 930b716e
Chris Lattner authored Jan 08, 2011
```
loop info.

llvm-svn: 123074
```
inline preserveCanonicalLoopForm now that it is simple. · fee37c5f
Chris Lattner authored Jan 08, 2011
```
llvm-svn: 123073
```

Three major changes: · 063dca0f

Chris Lattner authored Jan 08, 2011

1. Rip out LoopRotate's domfrontier updating code.  It isn't
   needed now that LICM doesn't use DF and it is super complex
   and gross.
2. Make DomTree updating code a lot simpler and faster.  The 
   old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
   SplitCriticalEdge instead of doing an overcomplex 
   reimplementation of it.

No behavior change, except for the name of the inserted preheader.
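
For readers unfamiliar with critical-edge splitting (item 3), the sketch below shows the idea on a toy graph. It is not the LLVM `SplitCriticalEdge` API: the `CFG` alias and both helpers are hypothetical. The generated block name follows the `<from>.<to>_crit_edge` pattern seen in IR printed by LLVM, used here only for illustration.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Toy CFG (hypothetical): block name -> list of successor names.
using CFG = std::map<std::string, std::vector<std::string>>;

// An edge is critical when its source has several successors and its
// destination has several predecessors.
bool isCriticalEdge(const CFG &cfg, const std::string &from,
                    const std::string &to) {
  auto it = cfg.find(from);
  if (it == cfg.end() || it->second.size() < 2)
    return false;
  if (std::find(it->second.begin(), it->second.end(), to) == it->second.end())
    return false; // no such edge
  int preds = 0;
  for (const auto &entry : cfg)
    for (const auto &succ : entry.second)
      if (succ == to)
        ++preds;
  return preds >= 2;
}

// Split the edge by routing it through a fresh block that only branches
// to the old destination. Returns the new block's name.
std::string splitEdge(CFG &cfg, std::string from, std::string to) {
  std::string mid = from + "." + to + "_crit_edge";
  for (auto &succ : cfg[from])
    if (succ == to)
      succ = mid;
  cfg[mid] = {to};
  return mid;
}
```

After the split, the new block has a single predecessor and a single successor, so neither of the two edges that replace the original one is critical.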

llvm-svn: 123072

reduce nesting. · 30d95f9f
Chris Lattner authored Jan 08, 2011
```
llvm-svn: 123071
```
LoopRotate requires canonical loop form, so it always has preheaders · 7fab23bc
Chris Lattner authored Jan 08, 2011
```
and latch blocks.  Reorder entry conditions to make the pass faster
and more logical.

llvm-svn: 123069
```
use the LI ivar. · d62691f4
Chris Lattner authored Jan 08, 2011
```
llvm-svn: 123068
```
some cleanups: remove dead arguments and eliminate ivars · 385f2ec6
Chris Lattner authored Jan 08, 2011
```
that are just passed to one function.

llvm-svn: 123067
```
fix an issue duncan pointed out, which could cause loop rotate · 25ba40a0
Chris Lattner authored Jan 08, 2011
```
to violate LCSSA form

llvm-svn: 123066
```
Fix coding style issues. · b4ab257b
Cameron Zwarich authored Jan 08, 2011
```
llvm-svn: 123065
```

Make more passes preserve dominators (or state that they preserve dominators if · 84986b29

Cameron Zwarich authored Jan 08, 2011

they already do). This removes two dominator recomputations prior to isel,
which is a 1% improvement in total llc time for 403.gcc.

The only potentially suspect thing is making GCStrategy recompute dominators if
it used a custom lowering strategy.

llvm-svn: 123064

First step in fixing PR8927: · 45e6c195

Rafael Espindola authored Jan 08, 2011

Add an unnamed_addr bit to global variables and functions. This will be used
to indicate that the address is not significant and therefore the constant
or function can be merged with others.

If an optimization pass can show that an address is not used, it can set this.

Examples of things that can have this set by the FE are globals created to
hold string literals and C++ constructors.

Adding unnamed_addr to a non-const global should have no effect unless
an optimization can transform that global into a constant.
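
The rule described so far can be summarized as a small predicate. The `Global` struct and `canMerge` below are hypothetical and greatly simplified; in particular, requiring the bit on both globals is a conservative assumption of this sketch, not something the message states.

```cpp
#include <string>

// Hypothetical, simplified model of a global for illustration only.
struct Global {
  std::string initializer; // stand-in for the constant initializer
  bool isConstant;         // true if the global is never written
  bool unnamedAddr;        // true if its address is not significant
};

// Two globals are merge candidates when both are constants with identical
// contents and neither address is significant. A non-constant global is
// never merged, so the bit has no effect on it.
bool canMerge(const Global &a, const Global &b) {
  return a.isConstant && b.isConstant && a.unnamedAddr && b.unnamedAddr &&
         a.initializer == b.initializer;
}
```

Two string-literal globals with the same contents would qualify under this predicate, while a mutable global with the bit set would not.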

Aliases are not allowed to have unnamed_addr since I couldn't figure
out any use for it.

llvm-svn: 123063

Contract subloop bodies. However, it is still important to visit the phis at the · 80bd9af7
Cameron Zwarich authored Jan 08, 2011
```
top of subloop headers, as the phi uses logically occur outside of the subloop.

llvm-svn: 123062
```
Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little. · 6a1fb8f2
Frits van Bommel authored Jan 08, 2011
```
llvm-svn: 123061
```

Have loop-rotate simplify instructions (yay instsimplify!) as it clones · 8c5defd0

Chris Lattner authored Jan 08, 2011

them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops. This also better exposes the
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.

Not aggressively handling this is pessimizing later loop optimizations
somethin' fierce by making "dominates all exit blocks" checks fail.

llvm-svn: 123060

Revamp the ValueMapper interfaces in a couple ways: · 43f8d164

Chris Lattner authored Jan 08, 2011

1. Take a flags argument instead of a bool.  This makes
   it more clear to the reader what it is used for.
2. Add a flag that says that "remapping a value not in the
   map is ok".
3. Reimplement MapValue to share a bunch of code and be a lot
   more efficient.  For lookup failures, don't drop null values
   into the map.
4. Using the new flag a bunch of code can vaporize in LinkModules
   and LoopUnswitch, kill it.

No functionality change.
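
A minimal sketch of the interface shape described in items 1 to 3 follows: a flags argument instead of a bool, a flag that tolerates values missing from the map, and a lookup that never inserts on failure. The enum, its enumerator names, and the string-based `ValueMap` are assumptions for this example, not the ValueMapper declarations.

```cpp
#include <map>
#include <string>

// Hypothetical flag set; a named flag documents intent at the call site
// better than a bare `true` or `false` would.
enum RemapFlags : unsigned {
  Remap_None = 0,
  Remap_IgnoreMissing = 1u << 0, // a value not in the map is left as-is
};

using ValueMap = std::map<std::string, std::string>;

// Looks up `v`. On a hit, stores the mapped value in `out` and returns true.
// On a miss, either passes `v` through (if Remap_IgnoreMissing is set) or
// reports failure. The map is never modified, so failed lookups leave no
// placeholder entries behind.
bool mapValue(const ValueMap &vm, const std::string &v, unsigned flags,
              std::string &out) {
  auto it = vm.find(v);
  if (it != vm.end()) {
    out = it->second;
    return true;
  }
  if (flags & Remap_IgnoreMissing) {
    out = v;
    return true;
  }
  return false;
}
```

A caller that would previously have checked the map by hand before remapping can pass the ignore-missing flag instead, which is the kind of code removal item 4 refers to.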

llvm-svn: 123058

two minor changes: switch to the standard ValueToValueMapTy · 2b3f20e6

Chris Lattner authored Jan 08, 2011

map from ValueMapper.h (giving us access to its utilities)
and add a fastpath in the loop rotation code, avoiding expensive
ssa updator manipulation for values with nothing to update.

llvm-svn: 123057

Recognize inline asm 'rev $0, $1' as a bswap intrinsic call. · 078b0b09
Evan Cheng authored Jan 08, 2011
```
llvm-svn: 123048
```

Do not model all INLINEASM instructions as having unmodelled side effects. · 6eb516db

Evan Cheng authored Jan 07, 2011

Instead, encode the LLVM IR level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstr::hasUnmodeledSideEffects() to check
the operand when the instruction is an INLINEASM.

This allows memory instructions to be moved around INLINEASM instructions.
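
The encoding described above, two boolean properties sharing one operand, can be sketched as plain bit flags. The bit positions and the names below are assumptions chosen for this example and may not match the actual encoding.

```cpp
#include <cstdint>

// Hypothetical bit assignments inside the shared immediate operand.
enum : std::uint64_t {
  kHasSideEffectsBit = 1u << 0,
  kIsAlignStackBit = 1u << 1,
};

// Pack both properties into one operand value.
std::uint64_t encodeExtraInfo(bool hasSideEffects, bool isAlignStack) {
  std::uint64_t bits = 0;
  if (hasSideEffects)
    bits |= kHasSideEffectsBit;
  if (isAlignStack)
    bits |= kIsAlignStackBit;
  return bits;
}

// Query each property independently of the other.
bool hasSideEffects(std::uint64_t extraInfo) {
  return (extraInfo & kHasSideEffectsBit) != 0;
}

bool isAlignStack(std::uint64_t extraInfo) {
  return (extraInfo & kIsAlignStackBit) != 0;
}
```

With the property stored per instruction, an inline asm without the side-effect bit no longer has to act as a barrier for surrounding memory operations.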

llvm-svn: 123044

Add an explanatory message for an assertion. · 3fa9c064
Bob Wilson authored Jan 07, 2011
```
llvm-svn: 123042
```

Jan 07, 2011

Eliminate variable only used in debug builds. · 5cc7a1fc
Matt Beaumont-Gay authored Jan 07, 2011
```
llvm-svn: 123040
```
Speculatively revert r123032. · acbee0b0
Devang Patel authored Jan 07, 2011
```
llvm-svn: 123039
```
Lower some BUILD_VECTORS using VEXT+shuffle. · 6f2b8966
Bob Wilson authored Jan 07, 2011
```
Patch by Tim Northover.

llvm-svn: 123035
```

InstCombine: Match min/max hidden by sext/zext · fc3d7f66

Tobias Grosser authored Jan 07, 2011

X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X
X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X
X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X
X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X
X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X
X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X

Instead of calculating this with mixed types, promote everything to the
larger type. This enables scalar evolution to analyze this
expression. PR8866
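
As a sanity check on the first and third patterns, the sketch below compares the mixed-type form with the promoted form for every 8-bit `x` and `c`, widened to 32 bits. The choice of widths and the function names are assumptions for this example; the transform itself is not tied to these types.

```cpp
#include <cstdint>

// Pattern: X = sext x; x >s c ? X : C+1  -->  X <s C+1 ? C+1 : X
// Checked for all i8 values of x and c, widened to i32.
bool signedPatternHolds() {
  for (int xi = -128; xi <= 127; ++xi) {
    for (int ci = -128; ci <= 127; ++ci) {
      std::int8_t x = static_cast<std::int8_t>(xi);
      std::int8_t c = static_cast<std::int8_t>(ci);
      std::int32_t X = x; // sign extension
      std::int32_t C = c;
      std::int32_t before = (x > c) ? X : C + 1;    // compare in narrow type
      std::int32_t after = (X < C + 1) ? C + 1 : X; // compare in wide type
      if (before != after)
        return false;
    }
  }
  return true;
}

// Pattern: X = zext x; x >u c ? X : C+1  -->  X <u C+1 ? C+1 : X
bool unsignedPatternHolds() {
  for (unsigned xi = 0; xi <= 255; ++xi) {
    for (unsigned ci = 0; ci <= 255; ++ci) {
      std::uint8_t x = static_cast<std::uint8_t>(xi);
      std::uint8_t c = static_cast<std::uint8_t>(ci);
      std::uint32_t X = x; // zero extension
      std::uint32_t C = c;
      std::uint32_t before = (x > c) ? X : C + 1;
      std::uint32_t after = (X < C + 1) ? C + 1 : X;
      if (before != after)
        return false;
    }
  }
  return true;
}
```

The promoted form is a max of `X` and `C+1` computed entirely in the wide type, with no narrow-type comparison left in the expression.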

llvm-svn: 123034

Some whitespace fixes · 411e6eed
Tobias Grosser authored Jan 07, 2011
```
llvm-svn: 123033
```
Appropriately truncate debug info range in dwarf output. · 6381e158
Devang Patel authored Jan 07, 2011
```
Enable live debug variables pass.

llvm-svn: 123032
```
DBG_VALUE does not have any side effects; it also makes no sense to mark it cheap as a copy. · 0638c20e
Evan Cheng authored Jan 07, 2011
```
llvm-svn: 123031
```
Revert 122959, it needs more thought. Add it back to README.txt with additional notes. · 134cde91
Benjamin Kramer authored Jan 07, 2011
```
llvm-svn: 123030
```
Simplify the allocation and freeing of Users' operand lists, now that · d81f3c96
Jay Foad authored Jan 07, 2011
```
every BranchInst has a fixed number of operands.

llvm-svn: 123027
```
Remove all uses of the "ugly" method BranchInst::setUnconditionalDest(). · 89afb43b
Jay Foad authored Jan 07, 2011
```
llvm-svn: 123025
```

Revert r122955. It seems using movups to lower memcpy can cause massive... · a048c83f

Evan Cheng authored Jan 07, 2011

Revert r122955. It seems using movups to lower memcpy can cause massive regressions (even on Nehalem) in edge cases. I also didn't see any real performance benefit.

llvm-svn: 123015

Add ARM patterns to match EXTRACT_SUBVECTOR nodes. · 8265d566

Bob Wilson authored Jan 07, 2011

Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.

The test changes are needed to keep those spill-q tests from testing aligned
spills and restores.  If the only aligned stack objects are spill slots, we
no longer realign the stack frame.  Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.

llvm-svn: 122995

Fix a comment typo. · d23b3d2d
Bob Wilson authored Jan 07, 2011
```
llvm-svn: 122994
```

Change EXTRACT_SUBVECTOR to require a constant index. · f291cb26

Bob Wilson authored Jan 07, 2011

We were never generating any of these nodes with variable indices, and there
was one legalizer function asserting on a non-constant index.  If we ever have
a need to support variable indices, we can add this back again.

llvm-svn: 122993

Early exit if we don't have invokes. The 'Unwinds' vector isn't modified unless · 34e2bc0f
Bill Wendling authored Jan 07, 2011
```
we have invokes, so there is no functionality change here.

llvm-svn: 122990
```
Fix the other problem reported in PR8582. Testcase and patch by · 61c5708b
Duncan Sands authored Jan 06, 2011
```
Nadav Rotem.

llvm-svn: 122983
```

Jan 06, 2011
Add some fairly duplicated code to let type legalization split illegal · e516af75
Eric Christopher authored Jan 06, 2011
```
typed atomics. This will lower exclusively to libcalls at the moment.

llvm-svn: 122979
```