Commits · 654098f4116f6f6018c1e18a3462c2bb20012559 · Roger Ferrer / llvm-epi-0.8

Jan 12, 2011
- revert r123146 which disabled code that wasn't the root cause · 654098f4
  Chris Lattner authored Jan 12, 2011
```
of the bootstrap miscompare issue.

llvm-svn: 123299
```
  654098f4
Jan 11, 2011
- merge tests into one crash.ll test. · 054d2a85
  Chris Lattner authored Jan 11, 2011
```
llvm-svn: 123220
```
  054d2a85
- remove a bogus assertion: the latch block of a loop is not · 63fe78de
  Chris Lattner authored Jan 11, 2011
```
necessarily an uncond branch to the header.  This fixes
PR8955 (the assertion tripping).

llvm-svn: 123219
```
  63fe78de
- Teach constant folding to perform conversions from constant floating · b1e7f557
  Chandler Carruth authored Jan 11, 2011
```
point values to their integer representation through the SSE intrinsic
calls. This is the last part of a README.txt entry for which I have real
world examples.

llvm-svn: 123206
```
  b1e7f557
- FileCheck-ize a test, and move a no-longer calling test case to another · fdf49691
  Chandler Carruth authored Jan 11, 2011
```
file and make it actually test something...

llvm-svn: 123205
```
  fdf49691
- Fix a random missed optimization by making InstCombine more aggressive when... · d490c2d2
  Owen Anderson authored Jan 11, 2011
```
Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by
a comparison against a constant.

llvm-svn: 123203
```
  d490c2d2
Jan 10, 2011
- Teach instcombine about the rest of the SSE and SSE2 conversion · cf414cf0
  Chandler Carruth authored Jan 10, 2011
```
intrinsics element dependencies. Reviewed by Nick.

llvm-svn: 123161
```
  cf414cf0
- Fold two related tests into the newly FileCheck-ized test, migrating · 7bb282eb
  Chandler Carruth authored Jan 10, 2011
```
them to FileCheck as well.

llvm-svn: 123154
```
  7bb282eb
- Clean up and FileCheck-ize a test. · ef7aac59
  Chandler Carruth authored Jan 10, 2011
```
llvm-svn: 123153
```
  ef7aac59
- fix typo · ec1387cf
  Chris Lattner authored Jan 10, 2011
```
llvm-svn: 123148
```
  ec1387cf
- another (more) aggressive attempt to bring llvm-gcc-i386-linux-selfhost · 4662bd4b
  Chris Lattner authored Jan 10, 2011
```
back to life.

llvm-svn: 123146
```
  4662bd4b
- temporarily disable memset formation from memsets in an effort to restore buildbot stability. · 1017fa67
  Chris Lattner authored Jan 09, 2011
```
llvm-svn: 123144
```
  1017fa67
Jan 09, 2011
- Instcombine: Fix pattern where the sext did not dominate the icmp using it · cc21c4aa
  Tobias Grosser authored Jan 09, 2011
```
llvm-svn: 123121
```
  cc21c4aa
Jan 08, 2011

Merge memsets followed by neighboring memsets and other stores into · 9a1d63ba

Chris Lattner authored Jan 08, 2011

larger memsets.  Among other things, this fixes rdar://8760394 and
allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
compiling it into a single 4096-byte memset:

_mad_synth_mute:                        ## @mad_synth_mute
## BB#0:                                ## %entry
	pushq	%rax
	movl	$4096, %esi             ## imm = 0x1000
	callq	___bzero
	popq	%rax
	ret

llvm-svn: 123089

9a1d63ba
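As a rough source-level illustration of the merge described in this commit, the sketch below compares a memset followed by neighboring stores against one larger memset. The buffer size and function names are hypothetical; they are not taken from the commit or from the linked blog post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 16 };  /* hypothetical buffer size for this sketch */

/* Before: a memset followed by stores and a second memset that all
   write the same value to neighboring bytes. */
void clear_piecewise(unsigned char *buf) {
    assert(buf != NULL);
    memset(buf, 0, 12);        /* bytes 0..11 */
    buf[12] = 0;               /* neighboring single-byte stores */
    buf[13] = 0;
    memset(buf + 14, 0, 2);    /* neighboring memset, bytes 14..15 */
}

/* After: the single larger memset the pieces can be merged into. */
void clear_merged(unsigned char *buf) {
    assert(buf != NULL);
    memset(buf, 0, BUF_SIZE);
}

/* Returns 1 if both versions leave a buffer in the same state. */
int clear_versions_agree(void) {
    unsigned char a[BUF_SIZE], b[BUF_SIZE];
    memset(a, 0xAA, sizeof a);
    memset(b, 0xAA, sizeof b);
    clear_piecewise(a);
    clear_merged(b);
    return memcmp(a, b, BUF_SIZE) == 0;
}
```

The merge is only valid because every piece writes the same byte value to a contiguous range; a store of a different value in the middle would have to stay separate.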

- fix an issue in IsPointerOffset that prevented us from recognizing that · 5120ebf1
  Chris Lattner authored Jan 08, 2011
```
P and P+1 are relative to the same base pointer.

llvm-svn: 123087
```
  5120ebf1
- enhance memcpyopt to merge a store and a subsequent · 4dc1fd93
  Chris Lattner authored Jan 08, 2011
```
memset into a single larger memset.

llvm-svn: 123086
```
  4dc1fd93
- merge two tests and filecheckify · 9dbbc49f
  Chris Lattner authored Jan 08, 2011
```
llvm-svn: 123082
```
  9dbbc49f

When loop rotation happens, it is *very* common for the duplicated condbr · 59c82f85

Chris Lattner authored Jan 08, 2011

to be foldable into an uncond branch.  When this happens, we can make a
much simpler CFG for the loop, which is important for nested loop cases
where we want the outer loop to be aggressively optimized.

Handle this case more aggressively.  For example, previously on
phi-duplicate.ll we would get this:


define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  %cmp1 = icmp slt i64 1, 1000
  br i1 %cmp1, label %bb.nph, label %for.end

bb.nph:                                           ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %bb.nph, %for.cond
  %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.02
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.02, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.02, 1
  br label %for.cond

for.cond:                                         ; preds = %for.body
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

for.cond.for.end_crit_edge:                       ; preds = %for.cond
  br label %for.end

for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
  ret void
}

Now we get the much nicer:

define void @test(i32 %N, double* %G) nounwind ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
  %arrayidx = getelementptr inbounds double* %G, i64 %j.01
  %tmp3 = load double* %arrayidx
  %sub = sub i64 %j.01, 1
  %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
  %tmp7 = load double* %arrayidx6
  %add = fadd double %tmp3, %tmp7
  %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
  store double %add, double* %arrayidx10
  %inc = add nsw i64 %j.01, 1
  %cmp = icmp slt i64 %inc, 1000
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

With all of these recent changes, we are now able to compile:

void foo(char *X) {
 for (int i = 0; i != 100; ++i) 
   for (int j = 0; j != 100; ++j)
     X[j+i*100] = 0;
}

into a single memset of 10000 bytes.  This series of changes
should also be helpful for other nested loop scenarios.

llvm-svn: 123079

59c82f85
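The rotation in this commit can be pictured at the source level. The sketch below is written in C rather than IR, and all function names are made up for illustration. Rotation turns a top-tested loop into a guarded bottom-tested loop; when the guard compares constants (here `1 < 1000`), it folds away and leaves a single-block loop.

```c
#include <assert.h>
#include <stddef.h>

enum { TRIP_LIMIT = 1000 };

/* The loop body from the IR in this commit: G[j] = G[j] + G[j-1]. */
static void add_prev(double *G, long j) { G[j] = G[j] + G[j - 1]; }

/* Top-tested form: the condition is checked before every iteration. */
void sum_top_tested(double *G) {
    assert(G != NULL);
    for (long j = 1; j < TRIP_LIMIT; ++j)
        add_prev(G, j);
}

/* Rotated form: the test is duplicated as a guard in front of a
   bottom-tested loop. */
void sum_rotated(double *G) {
    assert(G != NULL);
    long j = 1;
    if (j < TRIP_LIMIT) {          /* duplicated test: 1 < 1000 */
        do {
            add_prev(G, j);
            ++j;
        } while (j < TRIP_LIMIT);
    }
}

/* Rotated form after the always-true guard has been folded away. */
void sum_rotated_folded(double *G) {
    assert(G != NULL);
    long j = 1;
    do {
        add_prev(G, j);
        ++j;
    } while (j < TRIP_LIMIT);
}

/* Returns 1 if all three forms compute the same array contents. */
int rotation_forms_agree(void) {
    static double a[TRIP_LIMIT], b[TRIP_LIMIT], c[TRIP_LIMIT];
    for (int i = 0; i < TRIP_LIMIT; ++i)
        a[i] = b[i] = c[i] = (double)(i % 7);
    sum_top_tested(a);
    sum_rotated(b);
    sum_rotated_folded(c);
    for (int i = 0; i < TRIP_LIMIT; ++i)
        if (a[i] != b[i] || a[i] != c[i])
            return 0;
    return 1;
}
```

The folded form has no guard block and no separate condition block, which is the shape that makes the inner loop easier for later passes to analyze.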

Three major changes: · 063dca0f

Chris Lattner authored Jan 08, 2011

1. Rip out LoopRotate's domfrontier updating code.  It isn't
   needed now that LICM doesn't use DF and it is super complex
   and gross.
2. Make DomTree updating code a lot simpler and faster.  The 
   old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use
   SplitCriticalEdge instead of doing an overcomplex 
   reimplementation of it.

No behavior change, except for the name of the inserted preheader.

llvm-svn: 123072

063dca0f

- Fix a bug in r123034 (trying to sext/zext non-integers) and clean up a little. · 6a1fb8f2
  Frits van Bommel authored Jan 08, 2011
```
llvm-svn: 123061
```
  6a1fb8f2

Have loop-rotate simplify instructions (yay instsimplify!) as it clones · 8c5defd0

Chris Lattner authored Jan 08, 2011

them into the loop preheader, eliminating silly instructions like
"icmp i32 0, 100" in fixed tripcount loops. This also better exposes the
bigger problem with loop rotate that I'd like to fix: once this has been
folded, the duplicated conditional branch *often* turns into an uncond branch.

Not aggressively handling this is pessimizing later loop optimizations
somethin' fierce by making "dominates all exit blocks" checks fail.

llvm-svn: 123060

8c5defd0

Jan 07, 2011

InstCombine: Match min/max hidden by sext/zext · fc3d7f66

Tobias Grosser authored Jan 07, 2011

X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X
X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X
X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X
X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X
X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X
X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X

Instead of calculating this with mixed types, promote all to the
larger type. This enables scalar evolution to analyze this
expression. PR8866

llvm-svn: 123034

fc3d7f66
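The first two rewrite rules in this commit can be checked exhaustively for a small type. The sketch below assumes an 8-bit narrow type and a 32-bit wide type, takes `C` to be the sign extension of `c`, and uses hypothetical function names; it illustrates the identity only and is not the InstCombine implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Rule 1, left side: compare in the narrow type, select in the wide type. */
int32_t select_mixed_gt(int8_t x, int8_t c) {
    int32_t X = x;                        /* X = sext x */
    return x > c ? X : (int32_t)c + 1;    /* x >s c ? X : C+1 */
}

/* Rule 1, right side: everything in the wide type, i.e. max(X, C+1). */
int32_t select_wide_max(int8_t x, int8_t c) {
    int32_t X = x, C1 = (int32_t)c + 1;
    return X < C1 ? C1 : X;               /* X <s C+1 ? C+1 : X */
}

/* Rule 2, left side. */
int32_t select_mixed_lt(int8_t x, int8_t c) {
    int32_t X = x;
    return x < c ? X : (int32_t)c - 1;    /* x <s c ? X : C-1 */
}

/* Rule 2, right side: min(X, C-1). */
int32_t select_wide_min(int8_t x, int8_t c) {
    int32_t X = x, C1 = (int32_t)c - 1;
    return X > C1 ? C1 : X;               /* X >s C-1 ? C-1 : X */
}

/* Asserts both rules for every pair of 8-bit signed values. */
void check_select_rewrites(void) {
    for (int x = -128; x <= 127; ++x) {
        for (int c = -128; c <= 127; ++c) {
            int8_t x8 = (int8_t)x, c8 = (int8_t)c;
            assert(select_mixed_gt(x8, c8) == select_wide_max(x8, c8));
            assert(select_mixed_lt(x8, c8) == select_wide_min(x8, c8));
        }
    }
}
```

Because `C+1` and `C-1` are computed in the wider type here, they cannot overflow for any 8-bit `c`.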

- Revert 122959, it needs more thought. Add it back to README.txt with additional notes. · 134cde91
  Benjamin Kramer authored Jan 07, 2011
```
llvm-svn: 123030
```
  134cde91

Jan 06, 2011

- InstCombine: Turn _chk functions into the "unsafe" variant if length and max length are equal. · ae67cc13
  Benjamin Kramer authored Jan 06, 2011
```
This happens when we take the (non-constant) length from a malloc.

llvm-svn: 122961
```
  ae67cc13
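A minimal sketch of why the check is redundant when the two sizes are the same value. `copy_checked` is a hypothetical stand-in for a `_chk` routine, written out in plain C so the example does not depend on any particular libc; the real functions and the InstCombine logic may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a _chk routine: aborts if the copy would
   overflow the destination object. */
void *copy_checked(void *dst, const void *src, size_t len, size_t dst_size) {
    if (len > dst_size)
        abort();
    return memcpy(dst, src, len);
}

/* The length and the object size are the same value n (taken from the
   malloc call), so the len > dst_size test can never be true. */
char *dup_checked(const char *src, size_t n) {
    assert(src != NULL && n > 0);
    char *p = malloc(n);
    if (p == NULL)
        return NULL;
    copy_checked(p, src, n, n);
    return p;
}

/* The same function with the redundant check removed. */
char *dup_unchecked(const char *src, size_t n) {
    assert(src != NULL && n > 0);
    char *p = malloc(n);
    if (p == NULL)
        return NULL;
    memcpy(p, src, n);
    return p;
}

/* Returns 1 if both versions produce the same copy. */
int dup_versions_agree(void) {
    const char msg[] = "llvm";
    char *a = dup_checked(msg, sizeof msg);
    char *b = dup_unchecked(msg, sizeof msg);
    int same = a != NULL && b != NULL && memcmp(a, b, sizeof msg) == 0;
    free(a);
    free(b);
    return same;
}
```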
- InstCombine: If we call llvm.objectsize on a malloc call we can replace it... · 799b0112
  Benjamin Kramer authored Jan 06, 2011
```
InstCombine: If we call llvm.objectsize on a malloc call we can replace it with the size passed to malloc.

llvm-svn: 122959
```
  799b0112
- InstCombine: Teach llvm.objectsize folding to look through GEPs. · a76cc117
  Benjamin Kramer authored Jan 06, 2011
```
llvm-svn: 122958
```
  a76cc117

implement constant folding support for an exotic constant expr: · 5858e091

Chris Lattner authored Jan 06, 2011

ret i64 ptrtoint (i8* getelementptr ([1000 x i8]* @X, i64 1, i64 sub (i64 0, i64 ptrtoint ([1000 x i8]* @X to i64))) to i64)

to "ret i64 1000". This allows us to correctly compute the trip count
on a loop in PR8883, which occurs with std::fill on a char array. This
allows us to transform it into a memset with a constant size.

llvm-svn: 122950

5858e091
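The arithmetic behind this fold can be modeled with plain 64-bit integers. The sketch below assumes the usual address computation for a two-index GEP over `[1000 x i8]` (first index scaled by the array size, second by the element size); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models ptrtoint(gep [1000 x i8]* @X, i64 1, i64 (0 - ptrtoint @X))
   for an arbitrary address of @X. */
uint64_t exotic_gep_as_integer(uint64_t x_addr) {
    const uint64_t array_size = 1000;   /* bytes in [1000 x i8] */
    const uint64_t elem_size = 1;       /* bytes in i8 */
    uint64_t idx0 = 1;
    uint64_t idx1 = 0 - x_addr;         /* sub (i64 0, ptrtoint @X) */

    /* The two address terms cancel modulo 2^64. */
    assert((uint64_t)(x_addr + idx1) == 0);

    return x_addr + idx0 * array_size + idx1 * elem_size;
}
```

Whatever address `@X` ends up at, the base and the negated base cancel, so only the `1 * 1000` term survives.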

Jan 04, 2011

- fix an off-by-one bug that caused a crash analyzing · c86e67e1
  Chris Lattner authored Jan 04, 2011
```
ashr's with huge shift amounts, PR8896

llvm-svn: 122814
```
  c86e67e1

Teach loop-idiom to turn a loop containing a memset into a larger memset · 8643810e

Chris Lattner authored Jan 04, 2011

when safe.

The testcase is basically this nested loop:
void foo(char *X) {
  for (int i = 0; i != 100; ++i) 
    for (int j = 0; j != 100; ++j)
      X[j+i*100] = 0;
}

which gets turned into a single memset now.  clang -O3 doesn't optimize
this yet though due to a phase ordering issue I haven't analyzed yet.

llvm-svn: 122806

8643810e
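A quick way to see that the nested loop and a single memset write the same bytes. The comparison harness and function names below are illustrative assumptions, not part of the commit's test case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The nested loop from the commit message. */
void zero_nested(char *X) {
    assert(X != NULL);
    for (int i = 0; i != 100; ++i)
        for (int j = 0; j != 100; ++j)
            X[j + i * 100] = 0;
}

/* The form it is turned into: the inner loop is a 100-byte memset at
   X + i*100, and 100 such adjacent memsets cover 10000 bytes. */
void zero_memset(char *X) {
    assert(X != NULL);
    memset(X, 0, 10000);
}

/* Returns 1 if both versions leave a 10000-byte buffer in the same state. */
int zero_versions_agree(void) {
    static char a[10000], b[10000];
    memset(a, 0x5A, sizeof a);
    memset(b, 0x5A, sizeof b);
    zero_nested(a);
    zero_memset(b);
    return memcmp(a, b, sizeof a) == 0;
}
```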

- Duncan deftly points out that readnone functions aren't · bde6ec1d
  Chris Lattner authored Jan 03, 2011
```
invalidated by stores, so they can be handled as 'simple'
operations.

llvm-svn: 122785
```
  bde6ec1d

Jan 03, 2011
- earlycse can do trivial within-a-block dead store · 9e5e9ed7
  Chris Lattner authored Jan 03, 2011
```
elimination as well.  This deletes 60 stores in 176.gcc
that largely come from bitfield code.

llvm-svn: 122736
```
  9e5e9ed7
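A sketch of the kind of dead store that bitfield code tends to produce, with the bitfield updates written out as explicit mask operations. The field layout and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two bitfield-style updates to the same 32-bit word, each followed by
   its own store. */
void set_fields_two_stores(uint32_t *word) {
    assert(word != NULL);
    uint32_t w = *word;
    w = (w & ~0xFu) | 0x1u;            /* low 4-bit field = 1 */
    *word = w;                         /* dead: overwritten below with no
                                          read of *word in between */
    w = (w & ~0xF0u) | (0x2u << 4);    /* next 4-bit field = 2 */
    *word = w;
}

/* The same updates with the dead store removed. */
void set_fields_one_store(uint32_t *word) {
    assert(word != NULL);
    uint32_t w = *word;
    w = (w & ~0xFu) | 0x1u;
    w = (w & ~0xF0u) | (0x2u << 4);
    *word = w;
}

/* Returns 1 if both versions produce the same word from `initial`. */
int field_versions_agree(uint32_t initial) {
    uint32_t a = initial, b = initial;
    set_fields_two_stores(&a);
    set_fields_one_store(&b);
    return a == b;
}
```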
- now that loads are in their own table, we can implement · e0e32a9e
  Chris Lattner authored Jan 03, 2011
```
store->load forwarding.  This allows EarlyCSE to zap 600 more
loads from 176.gcc.

llvm-svn: 122732
```
  e0e32a9e
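Store-to-load forwarding in its simplest source-level form: a load that immediately follows a store to the same address can reuse the stored value. The function names are invented for this sketch, and it says nothing about how EarlyCSE tracks available values internally.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the value just stored through p is loaded back from memory. */
int bump_reload(int *p, int v) {
    assert(p != NULL);
    *p = v;
    return *p + 1;    /* load from the address that was just stored to */
}

/* After forwarding: the load is replaced by the stored value. */
int bump_forwarded(int *p, int v) {
    assert(p != NULL);
    *p = v;
    return v + 1;
}

/* Returns 1 if both versions return the same value and leave the same
   value in memory. */
int forwarding_agrees(int v) {
    int a = 0, b = 0;
    return bump_reload(&a, v) == bump_forwarded(&b, v) && a == b;
}
```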
- add a testcase for readonly call CSE · 0446bb23
  Chris Lattner authored Jan 03, 2011
```
llvm-svn: 122730
```
  0446bb23
- Teach EarlyCSE to do trivial CSE of loads and read-only calls. · b9a8efc9
  Chris Lattner authored Jan 03, 2011
```
On 176.gcc, this catches 13090 loads and calls, and increases the
number of simple instructions CSE'd from 29658 to 36208.

llvm-svn: 122727
```
  b9a8efc9
- add DEBUG and -stats output to earlycse. · 8fac5db2
  Chris Lattner authored Jan 02, 2011
```
Teach it to CSE the rest of the non-side-effecting instructions.

llvm-svn: 122716
```
  8fac5db2
- Enhance earlycse to do CSE of casts, instsimplify and die. · 18ae5436
  Chris Lattner authored Jan 02, 2011
```
Add a testcase.

llvm-svn: 122715
```
  18ae5436
Jan 02, 2011

fix a miscompilation of tramp3d-v4: when forming a memcpy, we have to make · 9c69406f

Chris Lattner authored Jan 02, 2011

sure that the loop we're promoting into a memcpy doesn't mutate the input
of the memcpy.  Before we were just checking that the dest of the memcpy
wasn't mod/ref'd by the loop.

llvm-svn: 122712

9c69406f
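One way a loop can mutate the input of the copy it appears to perform is sketched below. This is an illustrative case, not the tramp3d-v4 code: each iteration reads a byte the previous iteration wrote, so replacing the loop with a bulk copy changes the result. `memmove` is used for the bulk copy only so that the overlapping copy is well-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks like a copy from a[0..n-2] to a[1..n-1], but every iteration
   reads a byte that the previous iteration just overwrote. */
void smear_forward(char *a, size_t n) {
    assert(a != NULL);
    for (size_t i = 0; i + 1 < n; ++i)
        a[i + 1] = a[i];
}

/* What a bulk copy of the same ranges computes instead. */
void shift_by_bulk_copy(char *a, size_t n) {
    assert(a != NULL);
    if (n > 1)
        memmove(a + 1, a, n - 1);
}

/* Returns 1 if the loop yields "aaaa" and the bulk copy yields "aabc"
   when both start from "abcd". */
int bulk_copy_changes_result(void) {
    char loop_buf[4] = { 'a', 'b', 'c', 'd' };
    char copy_buf[4] = { 'a', 'b', 'c', 'd' };
    smear_forward(loop_buf, 4);
    shift_by_bulk_copy(copy_buf, 4);
    return memcmp(loop_buf, "aaaa", 4) == 0 &&
           memcmp(copy_buf, "aabc", 4) == 0;
}
```

Checking only that nothing else touches the destination would miss this case, since the problem is a write to memory that the copy reads from.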

- If a loop iterates exactly once (has backedge count = 0) then don't · 5702a43c
  Chris Lattner authored Jan 02, 2011
```
mess with it.  We'd rather peel/unroll it than convert all of its 
stores into memsets.

llvm-svn: 122711
```
  5702a43c

enhance loop idiom recognition to scan *all* unconditionally executed · 8455b6e4

Chris Lattner authored Jan 02, 2011

blocks in a loop, instead of just the header block.  This makes it more
aggressive, able to handle Duncan's Ada examples.

llvm-svn: 122704

8455b6e4

Fix PR8702 by not having LoopSimplify claim to preserve LCSSA form. As described · 64f1c0dc

Duncan Sands authored Jan 02, 2011

in the PR, the pass could break LCSSA form when inserting preheaders.  It probably
would be easy enough to fix this, but since currently we always go into LCSSA form
after running this pass, doing so is not urgent.

llvm-svn: 122695

64f1c0dc