  1. Jan 14, 2011
    • Move some shift transforms out of instcombine and into InstructionSimplify. · 7f60dc1e
      Duncan Sands authored
      While there, I noticed that the transform "undef >>a X -> undef" was wrong.
      For example, if X is 2 then the top two bits of the result must be equal
      (both are copies of the sign bit), so the result cannot be an arbitrary
      value.  I fixed this in the constant folder as well.  Also, I made the
      transform for "X << undef" stronger: it now always folds to undef, even
      though X might be zero.  This is in accordance with the LangRef, but I
      must admit that it is fairly aggressive.  Finally, following the LangRef
      and the constant folder, I added the likewise fairly aggressive
      "i32 X << 32 -> undef".
      
      llvm-svn: 123417
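      A minimal IR sketch of the folds described (hypothetical function
      names, era-appropriate syntax); the last case shows why the old
      ashr fold was unsound:
      
      define i32 @shl_undef_amount(i32 %X) {
        %r = shl i32 %X, undef    ; "X << undef" -> undef, even if %X is 0
        ret i32 %r
      }
      
      define i32 @shl_too_wide(i32 %X) {
        %r = shl i32 %X, 32       ; "i32 X << 32" -> undef, per the LangRef
        ret i32 %r
      }
      
      define i32 @ashr_undef() {
        ; "undef >>a 2" must not fold to undef: the top two bits of the
        ; result are both copies of the sign bit, so they must be equal,
        ; and an arbitrary value cannot be substituted here.
        %r = ashr i32 undef, 2
        ret i32 %r
      }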
  2. Jan 13, 2011
    • Fix whitespace. · 328e91bb
      Bob Wilson authored
      llvm-svn: 123396
    • Check for empty structs, and for consistency, zero-element arrays. · c8056a95
      Bob Wilson authored
      llvm-svn: 123383
    • Extend SROA to handle arrays accessed as homogeneous structs and vice versa. · 08713d3c
      Bob Wilson authored
      This is a minor extension of SROA to handle a special case that is
      important for some ARM NEON operations.  Some of the NEON intrinsics
      return multiple values, which are handled as struct types containing
      multiple elements of the same vector type.  The corresponding return
      types declared in the arm_neon.h header have equivalent arrays.  We
      need SROA to recognize that it can split up those arrays and structs
      into separate vectors, even though they are not always accessed with
      the same type.  SROA already handles loads and stores of an entire
      alloca by using insertvalue/extractvalue to access the individual
      pieces, and that code works the same regardless of whether the type
      is a struct or an array.  So, all that needs to be done is to check
      for compatible arrays and homogeneous structs.
      
      llvm-svn: 123381
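      A hypothetical sketch (2011-era typed-pointer IR) of the pattern this
      lets SROA split: an alloca declared as an array of vectors but stored
      through a bitcast as a homogeneous struct of the same vector type:
      
      define <4 x i32> @array_as_struct({ <4 x i32>, <4 x i32> } %agg) {
      entry:
        %a = alloca [2 x <4 x i32>]
        %p = bitcast [2 x <4 x i32>]* %a to { <4 x i32>, <4 x i32> }*
        store { <4 x i32>, <4 x i32> } %agg, { <4 x i32>, <4 x i32> }* %p
        %q = getelementptr inbounds [2 x <4 x i32>]* %a, i32 0, i32 0
        %v = load <4 x i32>* %q
        ret <4 x i32> %v
      }
      
      Here SROA can replace %a with two separate <4 x i32> values, taking
      %agg apart with extractvalue instead of going through memory.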
    • Make SROA more aggressive with allocas containing padding. · 12eec40c
      Bob Wilson authored
      SROA only splits up structs and arrays one level at a time, so padding
      can only cause trouble if it is located between the struct or array
      elements.
      
      llvm-svn: 123380
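      A hypothetical layout sketch, assuming a typical target where i32 is
      4-byte aligned:
      
      %tail.pad  = type { i32, i8 }   ; 3 pad bytes after the last element
      %inner.pad = type { i8, i32 }   ; 3 pad bytes between the elements
      
      Splitting %tail.pad one level simply drops dead tail padding, while the
      bytes between the elements of %inner.pad are the case that needs care.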
  3. Jan 08, 2011
    • fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't · 7d6433ae
      Chris Lattner authored
      updating memdep when fusing stores together.  This fixes the crash when
      optimizing the bullet benchmark.
      
      llvm-svn: 123091
    • tryMergingIntoMemset can only handle constant length memsets. · ff6ed2ac
      Chris Lattner authored
      llvm-svn: 123090
    • Merge memsets followed by neighboring memsets and other stores into · 9a1d63ba
      Chris Lattner authored
      larger memsets.  Among other things, this fixes rdar://8760394 and
      allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
      compiling it into a single 4096-byte memset:
      
      _mad_synth_mute:                        ## @mad_synth_mute
      ## BB#0:                                ## %entry
      	pushq	%rax
      	movl	$4096, %esi             ## imm = 0x1000
      	callq	___bzero
      	popq	%rax
      	ret
      
      llvm-svn: 123089
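      A hypothetical IR sketch of the simplest input this handles (using the
      era's memset intrinsic, which still took an alignment argument): two
      adjacent memsets of the same value collapse into one:
      
      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i32, i1)
      
      define void @adjacent_memsets(i8* %P) {
        call void @llvm.memset.p0i8.i64(i8* %P, i8 0, i64 100, i32 1, i1 false)
        %P2 = getelementptr i8* %P, i64 100
        call void @llvm.memset.p0i8.i64(i8* %P2, i8 0, i64 100, i32 1, i1 false)
        ret void
      }
      
      which the pass can rewrite as a single 200-byte memset of %P.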
    • fix an issue in IsPointerOffset that prevented us from recognizing that · 5120ebf1
      Chris Lattner authored
      P and P+1 are relative to the same base pointer.
      
      llvm-svn: 123087
    • enhance memcpyopt to merge a store and a subsequent · 4dc1fd93
      Chris Lattner authored
      memset into a single larger memset.
      
      llvm-svn: 123086
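      A minimal hypothetical example of the store-plus-memset case:
      
      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i32, i1)
      
      define void @store_then_memset(i8* %P) {
        store i8 0, i8* %P
        %P1 = getelementptr i8* %P, i64 1
        call void @llvm.memset.p0i8.i64(i8* %P1, i8 0, i64 15, i32 1, i1 false)
        ret void
      }
      
      where the byte store and the adjacent 15-byte memset become a single
      16-byte memset starting at %P.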
    • constify TargetData references. · c638147e
      Chris Lattner authored
      Split memset formation logic out into its own
      "tryMergingIntoMemset" helper function.
      
      llvm-svn: 123081
    • When loop rotation happens, it is *very* common for the duplicated condbr · 59c82f85
      Chris Lattner authored
      to be foldable into an uncond branch.  When this happens, we can make a
      much simpler CFG for the loop, which is important for nested loop cases
      where we want the outer loop to be aggressively optimized.
      
      Handle this case more aggressively.  For example, previously on
      phi-duplicate.ll we would get this:
      
      
      define void @test(i32 %N, double* %G) nounwind ssp {
      entry:
        %cmp1 = icmp slt i64 1, 1000
        br i1 %cmp1, label %bb.nph, label %for.end
      
      bb.nph:                                           ; preds = %entry
        br label %for.body
      
      for.body:                                         ; preds = %bb.nph, %for.cond
        %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
        %arrayidx = getelementptr inbounds double* %G, i64 %j.02
        %tmp3 = load double* %arrayidx
        %sub = sub i64 %j.02, 1
        %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
        %tmp7 = load double* %arrayidx6
        %add = fadd double %tmp3, %tmp7
        %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
        store double %add, double* %arrayidx10
        %inc = add nsw i64 %j.02, 1
        br label %for.cond
      
      for.cond:                                         ; preds = %for.body
        %cmp = icmp slt i64 %inc, 1000
        br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
      
      for.cond.for.end_crit_edge:                       ; preds = %for.cond
        br label %for.end
      
      for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
        ret void
      }
      
      Now we get the much nicer:
      
      define void @test(i32 %N, double* %G) nounwind ssp {
      entry:
        br label %for.body
      
      for.body:                                         ; preds = %entry, %for.body
        %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
        %arrayidx = getelementptr inbounds double* %G, i64 %j.01
        %tmp3 = load double* %arrayidx
        %sub = sub i64 %j.01, 1
        %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
        %tmp7 = load double* %arrayidx6
        %add = fadd double %tmp3, %tmp7
        %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
        store double %add, double* %arrayidx10
        %inc = add nsw i64 %j.01, 1
        %cmp = icmp slt i64 %inc, 1000
        br i1 %cmp, label %for.body, label %for.end
      
      for.end:                                          ; preds = %for.body
        ret void
      }
      
      With all of these recent changes, we are now able to compile:
      
      void foo(char *X) {
       for (int i = 0; i != 100; ++i) 
         for (int j = 0; j != 100; ++j)
           X[j+i*100] = 0;
      }
      
      into a single memset of 10000 bytes.  This series of changes should
      be helpful for other nested loop scenarios as well.
      
      llvm-svn: 123079
    • split ssa updating code out to its own helper function. Don't bother · 30f318e5
      Chris Lattner authored
      moving the OrigHeader block anymore: we just merge it away anyway so
      its code layout doesn't matter.
      
      llvm-svn: 123077
    • Implement a TODO: Enhance loopinfo to merge away the unconditional branch · 2615130e
      Chris Lattner authored
      that it was leaving in loops after rotation (between the original latch
      block and the original header).
      
      With this change, it is possible for rotated loops to have just a single
      basic block, which is useful.
      
      llvm-svn: 123075
    • various code cleanups, enhance MergeBlockIntoPredecessor to preserve · 930b716e
      Chris Lattner authored
      loop info.
      
      llvm-svn: 123074