Commits · dd5f60b7a7c99f33d63fc2ce660850ba1c0f22b8 · Roger Ferrer / llvm-epi-0.8

Jan 12, 2011
- revert 123144, reenabling the rest of memset formation. · dd5f60b7
  Chris Lattner authored Jan 12, 2011
```
llvm-svn: 123302
```
  dd5f60b7
- revert r123146 which disabled code that wasn't the root cause · 654098f4
  Chris Lattner authored Jan 12, 2011
```
of the bootstrap miscompare issue.

llvm-svn: 123299
```
  654098f4
- revert r123149, reenabling an improvement to memcpyopt that wasn't · fa7c29d2
  Chris Lattner authored Jan 12, 2011
```
the source of the bootstrap problem.

llvm-svn: 123298
```
  fa7c29d2
Jan 11, 2011
- Remove the PR8954 workaround. · 12cc296b
  Jakob Stoklund Olesen authored Jan 11, 2011
```
llvm-svn: 123288
```
  12cc296b
- Dial back the speculative fix for PR8954 a bit, so that we only recompute dominators · cb9c4f85
  Cameron Zwarich authored Jan 11, 2011
```
once at the beginning of GVN instead of once per iteration.

llvm-svn: 123278
```
  cb9c4f85
- Attempt to fix the bootstrap buildbot. Rafael says this works for him on x86-64 Linux. · 51eb4039
  Cameron Zwarich authored Jan 11, 2011
```
llvm-svn: 123270
```
  51eb4039
- update memdep when an instruction is deleted. This code isn't · 193ce7c4
  Chris Lattner authored Jan 11, 2011
```
actually reached in the testcase in PR8954, but it's safe and good
practice.

llvm-svn: 123224
```
  193ce7c4
- Fix FoldSingleEntryPHINodes to update memdep and AA when it deletes · f6ae904e
  Chris Lattner authored Jan 11, 2011
```
phi nodes.  It is called from MergeBlockIntoPredecessor which is 
called from GVN, which claims to preserve these.

I'm skeptical that this is the actual problem behind PR8954, but
this is a stab in the right direction.

llvm-svn: 123222
```
  f6ae904e
- random cleanups · dfcfcb49
  Chris Lattner authored Jan 11, 2011
```
llvm-svn: 123221
```
  dfcfcb49
- remove a bogus assertion: the latch block of a loop is not · 63fe78de
  Chris Lattner authored Jan 11, 2011
```
necessarily an uncond branch to the header.  This fixes
PR8955 (the assertion tripping).

llvm-svn: 123219
```
  63fe78de
Jan 10, 2011
- another random stab in the dark trying to fix llvm-gcc-i386-linux-selfhost · 88bc848a
  Chris Lattner authored Jan 10, 2011
```
llvm-svn: 123149
```
  88bc848a
- another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost · 4662bd4b
  Chris Lattner authored Jan 10, 2011
```
back to life.

llvm-svn: 123146
```
  4662bd4b
- temporarily disable memset formation from memsets in an effort to restore buildbot stability. · 1017fa67
  Chris Lattner authored Jan 09, 2011
```
llvm-svn: 123144
```
  1017fa67
Jan 09, 2011
- fix a few old bugs (found by inspection) where we would zap instructions · caf5c0d0
  Chris Lattner authored Jan 09, 2011
```
without informing memdep.  This could cause nondeterministic weirdness
based on where instructions happen to get allocated, and will hopefully
breathe some life into some broken testers.

llvm-svn: 123124
```
  caf5c0d0
- LoopInstSimplify preserves LoopSimplify. · a42e5915
  Cameron Zwarich authored Jan 09, 2011
```
llvm-svn: 123117
```
  a42e5915
- reduce indentation. Print <nuw> and <nsw> when dumping SCEV AddRec's · a337f5ec
  Chris Lattner authored Jan 09, 2011
```
that have the bit set.

llvm-svn: 123104
```
  a337f5ec
Jan 08, 2011

fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't · 7d6433ae
Chris Lattner authored Jan 08, 2011
```
updating memdep when fusing stores together.  This fixes the crash optimizing
the bullet benchmark.

llvm-svn: 123091
```
7d6433ae
tryMergingIntoMemset can only handle constant length memsets. · ff6ed2ac
Chris Lattner authored Jan 08, 2011
```
llvm-svn: 123090
```
ff6ed2ac

Merge memsets followed by neighboring memsets and other stores into · 9a1d63ba
Chris Lattner authored Jan 08, 2011
```
larger memsets.  Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:

_mad_synth_mute:                        ## @mad_synth_mute
## BB#0:                                ## %entry
	pushq	%rax
	movl	$4096, %esi             ## imm = 0x1000
	callq	___bzero
	popq	%rax
	ret

llvm-svn: 123089
```
9a1d63ba
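A minimal C sketch of the kind of source pattern this merging targets, assuming a hypothetical 16-byte `struct record` (not taken from the commit): a zero store followed by memsets over neighboring byte ranges has the same effect as one larger memset.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 16-byte record, used only for illustration. */
struct record {
    char head[4];
    char body[8];
    char tail[4];
};

/* Before: one byte store, then memsets over neighboring ranges. */
void clear_piecewise(struct record *r) {
    r->head[0] = 0;                       /* single store of zero, byte 0 */
    memset(r->head + 1, 0, 3);            /* bytes 1..3 */
    memset(r->body, 0, sizeof r->body);   /* bytes 4..11 */
    memset(r->tail, 0, sizeof r->tail);   /* bytes 12..15 */
}

/* After: the single memset that the pieces above add up to. */
void clear_merged(struct record *r) {
    memset(r, 0, sizeof *r);
}
```

Both functions leave the record in the same state; whether a given compiler version actually performs the merge on `clear_piecewise` is not something this sketch asserts.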

fix an issue in IsPointerOffset that prevented us from recognizing that · 5120ebf1
Chris Lattner authored Jan 08, 2011
```
P and P+1 are relative to the same base pointer.

llvm-svn: 123087
```
5120ebf1
enhance memcpyopt to merge a store and a subsequent · 4dc1fd93
Chris Lattner authored Jan 08, 2011
```
memset into a single larger memset.

llvm-svn: 123086
```
4dc1fd93

constify TargetData references. · c638147e
Chris Lattner authored Jan 08, 2011
```
Split memset formation logic out into its own
"tryMergingIntoMemset" helper function.

llvm-svn: 123081
```
c638147e

When loop rotation happens, it is *very* common for the duplicated condbr · 59c82f85
Chris Lattner authored Jan 08, 2011
```
to be foldable into an uncond branch.  When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.

Handle this case more aggressively.  For example, previously on
phi-duplicate.ll we would get this:

define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  %cmp1 = icmp slt i64 1, 1000
  br i1 %cmp1, label %bb.nph, label %for.end

bb.nph:                                           ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %bb.nph, %for.cond
  %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.02
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.02, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.02, 1
  br label %for.cond

for.cond:                                         ; preds = %for.body
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

for.cond.for.end_crit_edge:                       ; preds = %for.cond
  br label %for.end

for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
  ret void
}

Now we get the much nicer:

define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.01
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.01, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.01, 1
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

With all of these recent changes, we are now able to compile:

void foo(char *X) {
 for (int i = 0; i != 100; ++i) 
   for (int j = 0; j != 100; ++j)
     X[j+i*100] = 0;
}

into a single memset of 10000 bytes.  This series of changes
should also be helpful for other nested loop scenarios as well.

llvm-svn: 123079
```
59c82f85
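A source-level sketch of the same transformation, written in C as an assumption about what phi-duplicate.ll computes (the function names are hypothetical): the top-tested loop, its rotated form with the duplicated guard, and the form left once the guard `1 < 1000` is folded to true.

```c
#include <assert.h>

enum { N = 1000 };  /* trip-count bound taken from the IR above */

/* Top-tested loop: the exit check sits in the loop header. */
void prefix_top_tested(double *g) {
    for (long j = 1; j < N; ++j)
        g[j] = g[j] + g[j - 1];
}

/* Rotated: the check is duplicated as a guard, the loop tests at the bottom. */
void prefix_rotated(double *g) {
    long j = 1;
    if (j < N) {                 /* duplicated condition: 1 < 1000 */
        do {
            g[j] = g[j] + g[j - 1];
            ++j;
        } while (j < N);
    }
}

/* Guard folded away: it is a constant true, so only the loop body remains. */
void prefix_rotated_folded(double *g) {
    long j = 1;
    do {
        g[j] = g[j] + g[j - 1];
        ++j;
    } while (j < N);
}
```

All three compute the same result; the last one corresponds to the single-block loop in the "much nicer" IR.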

split ssa updating code out to its own helper function. Don't bother · 30f318e5
Chris Lattner authored Jan 08, 2011
```
moving the OrigHeader block anymore: we just merge it away anyway so
its code layout doesn't matter.

llvm-svn: 123077
```
30f318e5

Implement a TODO: Enhance loopinfo to merge away the unconditional branch · 2615130e
Chris Lattner authored Jan 08, 2011
```
that it was leaving in loops after rotation (between the original latch
block and the original header).

With this change, it is possible for rotated loops to have just a single
basic block, which is useful.

llvm-svn: 123075
```
2615130e

inline preserveCanonicalLoopForm now that it is simple. · fee37c5f
Chris Lattner authored Jan 08, 2011
```
llvm-svn: 123073
```
fee37c5f

Three major changes: · 063dca0f
Chris Lattner authored Jan 08, 2011
```
1. Rip out LoopRotate's domfrontier updating code.  It isn't
   needed now that LICM doesn't use DF and it is super complex
   and gross.
2. Make DomTree updating code a lot simpler and faster.  The
   old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
   SplitCriticalEdge instead of doing an overcomplex
   reimplementation of it.

No behavior change, except for the name of the inserted preheader.

llvm-svn: 123072
```
063dca0f

LoopRotate requires canonical loop form, so it always has preheaders · 7fab23bc
Chris Lattner authored Jan 08, 2011
```
and latch blocks.  Reorder entry conditions to make the pass faster
and more logical.

llvm-svn: 123069
```
7fab23bc
use the LI ivar. · d62691f4
Chris Lattner authored Jan 08, 2011
```
llvm-svn: 123068
```
d62691f4
some cleanups: remove dead arguments and eliminate ivars · 385f2ec6
Chris Lattner authored Jan 08, 2011
```
that are just passed to one function.

llvm-svn: 123067
```
385f2ec6
fix an issue duncan pointed out, which could cause loop rotate · 25ba40a0
Chris Lattner authored Jan 08, 2011
```
to violate LCSSA form

llvm-svn: 123066
```
25ba40a0
Fix coding style issues. · b4ab257b
Cameron Zwarich authored Jan 08, 2011
```
llvm-svn: 123065
```
b4ab257b

Make more passes preserve dominators (or state that they preserve dominators if · 84986b29
Cameron Zwarich authored Jan 08, 2011
```
they already do). This removes two dominator recomputations prior to isel,
which is a 1% improvement in total llc time for 403.gcc.

The only potentially suspect thing is making GCStrategy recompute dominators if
it used a custom lowering strategy.

llvm-svn: 123064
```
84986b29

Contract subloop bodies. However, it is still important to visit the phis at the · 80bd9af7
Cameron Zwarich authored Jan 08, 2011
```
top of subloop headers, as the phi uses logically occur outside of the subloop.

llvm-svn: 123062
```
80bd9af7

Have loop-rotate simplify instructions (yay instsimplify!) as it clones · 8c5defd0
Chris Lattner authored Jan 08, 2011
```
them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops. This also better exposes the
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.

Not aggressively handling this is pessimizing later loop optimizations
somethin' fierce by making "dominates all exit blocks" checks fail.

llvm-svn: 123060
```
8c5defd0

Revamp the ValueMapper interfaces in a couple ways: · 43f8d164
Chris Lattner authored Jan 08, 2011
```
1. Take a flags argument instead of a bool.  This makes
   it more clear to the reader what it is used for.
2. Add a flag that says that "remapping a value not in the
   map is ok".
3. Reimplement MapValue to share a bunch of code and be a lot
   more efficient.  For lookup failures, don't drop null values
   into the map.
4. Using the new flag a bunch of code can vaporize in LinkModules
   and LoopUnswitch, kill it.

No functionality change.

llvm-svn: 123058
```
43f8d164
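The flags-instead-of-bool idea can be sketched in plain C. Everything here is hypothetical (`remap_flags`, `REMAP_IGNORE_MISSING`, `map_value`, and the integer-keyed table are illustration only, not the ValueMapper API): a named flag documents intent at the call site, and a failed lookup leaves the table untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag names; the commit message does not list the real ones. */
enum remap_flags {
    REMAP_NONE = 0,
    REMAP_IGNORE_MISSING = 1 << 0  /* "remapping a value not in the map is ok" */
};

struct map_entry {
    int from;
    int to;
};

/*
 * Look up `key`. On a hit, store the mapped value in *out and return 1.
 * On a miss, either pass the key through unchanged (REMAP_IGNORE_MISSING)
 * or return 0. The table is const, so a failed lookup never adds an entry.
 */
int map_value(const struct map_entry *map, size_t n, int key,
              unsigned flags, int *out) {
    assert(out != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (map[i].from == key) {
            *out = map[i].to;
            return 1;
        }
    }
    if (flags & REMAP_IGNORE_MISSING) {
        *out = key;
        return 1;
    }
    return 0;
}
```

A call such as `map_value(m, n, v, REMAP_IGNORE_MISSING, &out)` says what it tolerates, where a bare `true` argument would not.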

two minor changes: switch to the standard ValueToValueMapTy · 2b3f20e6
Chris Lattner authored Jan 08, 2011
```
map from ValueMapper.h (giving us access to its utilities)
and add a fastpath in the loop rotation code, avoiding expensive
ssa updator manipulation for values with nothing to update.

llvm-svn: 123057
```
2b3f20e6

Jan 06, 2011
- Add the CallInst optimizations that don't involve expanding inline assembly to · 9ec19ea0
  Cameron Zwarich authored Jan 06, 2011
```
OptimizeInst() so that they can be used on a worklist instruction.

llvm-svn: 122945
```
  9ec19ea0
- Move the GEP handling in CodeGenPrepare to OptimizeInst(). · d28c78eb
  Cameron Zwarich authored Jan 06, 2011
```
llvm-svn: 122944
```
  d28c78eb
- Split the optimizations in CodeGenPrepare that don't manipulate the iterators · 14ac865c
  Cameron Zwarich authored Jan 06, 2011
```
into a separate function, so that it can be called from a loop using a worklist
rather than a loop traversing a whole basic block.

llvm-svn: 122943
```
  14ac865c
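  A rough C sketch of why the split matters, under stated assumptions: instructions are modeled as plain integers, and `optimize_inst`, `optimize_block`, and `optimize_worklist` are hypothetical stand-ins rather than the CodeGenPrepare code. Once the per-instruction step does not touch the caller's iterators, the same step can be driven by a single block walk or by a worklist that revisits only what changed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 64  /* capacity of the illustrative worklist */

/* Per-instruction step, independent of how the caller iterates.
   Toy rule: halve a nonzero even value. Returns 1 if it changed anything. */
static int optimize_inst(int *inst) {
    if (*inst != 0 && *inst % 2 == 0) {
        *inst /= 2;
        return 1;
    }
    return 0;
}

/* Driver A: walk the whole block once, front to back. */
int optimize_block(int *insts, size_t n) {
    int changes = 0;
    for (size_t i = 0; i < n; ++i)
        changes += optimize_inst(&insts[i]);
    return changes;
}

/* Driver B: worklist of indices; a changed item is pushed back for another
   visit, so work continues until nothing changes. */
int optimize_worklist(int *insts, size_t n) {
    size_t work[MAX_WORK];
    size_t top = 0;
    int changes = 0;
    assert(n <= MAX_WORK);  /* at most n indices are pending at any time */
    for (size_t i = 0; i < n; ++i)
        work[top++] = i;
    while (top > 0) {
        size_t i = work[--top];
        if (optimize_inst(&insts[i])) {
            ++changes;
            work[top++] = i;  /* changed: revisit this item */
        }
    }
    return changes;
}
```

  On `{8, 3, 12}` the single walk stops at `{4, 3, 6}`, while the worklist driver keeps going until it reaches `{1, 3, 3}`.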