Commits · afbc00bc381d60d041d716e0f9b8a04aadb42d23 · Roger Ferrer / llvm-epi-0.8

Nov 26, 2006
- Removed #include <iostream> and replaced with llvm_* streams. · 5dbf43c9
  Bill Wendling authored Nov 26, 2006
```
llvm-svn: 31923
```
  5dbf43c9
Nov 17, 2006

If an indvar with a variable stride is used by the exit condition, go ahead · 21eba2da

Chris Lattner authored Nov 17, 2006

and handle it like constant stride vars.  This fixes some bad codegen in
variable stride cases.  For example, it compiles this:

void foo(int k, int i) {
  for (k=i+i; k <= 8192; k+=i)
    flags2[k] = 0;
}

to:

LBB1_1: #bb.preheader
        movl %eax, %ecx
        addl %ecx, %ecx
        movl L_flags2$non_lazy_ptr, %edx
LBB1_2: #bb
        movb $0, (%edx,%ecx)
        addl %eax, %ecx
        cmpl $8192, %ecx
        jle LBB1_2      #bb
LBB1_5: #return
        ret

or (if the array is local and we are in dynamic-nonpic or static mode):

LBB3_2: #bb
        movb $0, _flags2(%ecx)
        addl %eax, %ecx
        cmpl $8192, %ecx
        jle LBB3_2      #bb

and:

        lis r2, ha16(L_flags2$non_lazy_ptr)
        lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
        slwi r3, r4, 1
LBB1_2: ;bb
        li r5, 0
        add r6, r4, r3
        stbx r5, r2, r3
        cmpwi cr0, r6, 8192
        bgt cr0, LBB1_5 ;return

instead of:

        leal (%eax,%eax,2), %ecx
        movl %eax, %edx
        addl %edx, %edx
        addl L_flags2$non_lazy_ptr, %edx
        xorl %esi, %esi
LBB1_2: #bb
        movb $0, (%edx,%esi)
        movl %eax, %edi
        addl %esi, %edi
        addl %ecx, %esi
        cmpl $8192, %esi
        jg LBB1_5       #return

and:

        lis r2, ha16(L_flags2$non_lazy_ptr)
        lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
        mulli r3, r4, 3
        slwi r5, r4, 1
        li r6, 0
        add r2, r2, r5
LBB1_2: ;bb
        li r5, 0
        add r7, r3, r6
        stbx r5, r2, r6
        add r6, r4, r6
        cmpwi cr0, r7, 8192
        ble cr0, LBB1_2 ;bb

This speeds up Benchmarks/Shootout/sieve from 8.533s to 6.464s and
implements LoopStrengthReduce/var_stride_used_by_compare.ll

llvm-svn: 31809

21eba2da
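As a rough illustration of the improved listings above, the sketch below writes the same loop two ways in C. `foo` is the function from the commit message; the array definition, `FLAGS_SIZE`, and `foo_single_iv` are assumptions added for illustration, and the sketch assumes `0 < i <= 8192`. It is a hand-written analogy of the loop shape, not output of the pass.

```c
#define FLAGS_SIZE 8193

static char flags2[FLAGS_SIZE];

/* The loop from the commit message, unchanged apart from the array above. */
void foo(int k, int i) {
    for (k = i + i; k <= 8192; k += i)
        flags2[k] = 0;
}

/* Hand-written equivalent with a single induction variable that serves as
 * both the array index and the exit-test operand (store, add the stride,
 * compare, branch). Assumes 0 < i <= 8192. */
void foo_single_iv(int i) {
    int k = i + i;            /* preheader: k = 2*i */
    if (k > 8192)
        return;               /* the loop body would never run */
    do {
        flags2[k] = 0;        /* store through base + k */
        k += i;               /* add the variable stride */
    } while (k <= 8192);      /* compare against the bound, loop back */
}
```

Both functions clear the same elements for any stride in the assumed range; the second simply makes explicit that one variable can carry the index and the exit test.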

Nov 02, 2006

For PR786: · de46e484

Reid Spencer authored Nov 02, 2006

Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fallout by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380

de46e484

Oct 28, 2006
- break edges more intelligently · a6eb7e08
  Chris Lattner authored Oct 28, 2006
```
llvm-svn: 31257
```
  a6eb7e08
- prepare for a change I'm about to make · 5191c654
  Chris Lattner authored Oct 28, 2006
```
llvm-svn: 31248
```
  5191c654
Oct 20, 2006

For PR950: · e0fc4dfc

Reid Spencer authored Oct 20, 2006

This patch implements the first increment for the Signless Types feature.
All changes pertain to removing the ConstantSInt and ConstantUInt classes
in favor of just using ConstantInt.

llvm-svn: 31063

e0fc4dfc

Aug 28, 2006
- eliminate RegisterOpt. It does the same thing as RegisterPass. · c2d3d311
  Chris Lattner authored Aug 27, 2006
```
llvm-svn: 29925
```
  c2d3d311
Aug 27, 2006
- s|llvm/Support/Visibility.h|llvm/Support/Compiler.h| · 3d27be13
  Chris Lattner authored Aug 27, 2006
```
llvm-svn: 29911
```
  3d27be13
Aug 03, 2006

· 3ff62017

Chris Lattner authored Aug 03, 2006

Changes:
  1. Update an obsolete comment.
  2. Make the sorting by base an explicit (though still N^2) step, so
     that the code is clearer about what it is doing.
  3. Partition uses so that uses inside the loop are handled before uses
     outside the loop.

Note that none of these changes currently changes the code inserted by LSR,
but they are a stepping stone to getting there.

This code is the result of some crazy pair programming with Nate. :)

llvm-svn: 29493

3ff62017

Jul 18, 2006
- Only reuse a previous IV if it would not require a type conversion. · e9c68f52
  Evan Cheng authored Jul 18, 2006
```
llvm-svn: 29186
```
  e9c68f52
Jun 29, 2006
- Use hidden visibility to make symbols in an anonymous namespace get · 996795b0
  Chris Lattner authored Jun 28, 2006
```
dropped.  This shrinks libllvmgcc.dylib another 67K

llvm-svn: 28975
```
  996795b0
Jun 09, 2006

RewriteExpr, either the new PHI node of induction variable or the · 398f7029

Evan Cheng authored Jun 09, 2006

post-increment value, should be first cast to the appropriate type (to the
type of the common expr). Otherwise, the rewrite of a use based on (common +
iv) may end up with an incorrect type.

llvm-svn: 28735

398f7029

Apr 12, 2006
- Get rid of a signed/unsigned compare warning. · 13a1a7a4
  Reid Spencer authored Apr 12, 2006
```
llvm-svn: 27625
```
  13a1a7a4
Mar 24, 2006
- Fix spello · f365f5f0
  Chris Lattner authored Mar 24, 2006
```
llvm-svn: 27052
```
  f365f5f0
Mar 22, 2006
- silence a bogus gcc warning · 7d80b4f3
  Chris Lattner authored Mar 22, 2006
```
llvm-svn: 26953
```
  7d80b4f3
Mar 18, 2006
- - Fixed a bogus if condition. · c28282bd
  Evan Cheng authored Mar 18, 2006
```
- Added more debugging info.
- Allow reuse of IV of negative stride. e.g. -4 stride == 2 * iv of -2 stride.

llvm-svn: 26841
```
  c28282bd
- Sort StrideOrder so we can process the smallest strides first. This allows · f09f0ebd
  Evan Cheng authored Mar 18, 2006
```
for more IV reuses.

llvm-svn: 26837
```
  f09f0ebd
Mar 17, 2006
- Allow users of iv / stride to be rewritten with an expression that is a multiple · 45206988
  Evan Cheng authored Mar 17, 2006
```
of a smaller stride even if they have a common loop invariant expression part.

llvm-svn: 26828
```
  45206988
Mar 16, 2006

For each loop, keep track of all the IV expressions inserted indexed by · 3df447d3

Evan Cheng authored Mar 16, 2006

stride. For a set of uses of the IV of a stride which is a multiple
of another stride, do not insert a new IV expression. Rather, reuse the
previous IV and rewrite the uses as uses of IV expression multiplied by
the factor.

e.g.
x = 0 ...; x ++
y = 0 ...; y += 4
then use of y can be rewritten as use of 4*x for x86.

llvm-svn: 26803

3df447d3
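The `x`/`y` example above can be sketched in C as follows. The function names, the byte-array parameter, and the loop body are assumptions made up for illustration; only the two strides (1 and 4) and the `y == 4*x` rewrite come from the commit message.

```c
/* Two induction variables: x with stride 1 and y with stride 4. */
long sum_two_ivs(const unsigned char *data, int n) {
    long total = 0;
    int x = 0, y = 0;
    for (; x < n; x++, y += 4)
        total += data[y] + x;
    return total;
}

/* Same loop with y eliminated: every use of y becomes 4*x. With byte-sized
 * elements, x86 can typically fold the factor of 4 into a scaled
 * addressing mode instead of keeping a second counter. */
long sum_one_iv(const unsigned char *data, int n) {
    long total = 0;
    for (int x = 0; x < n; x++)
        total += data[4 * x] + x;
    return total;
}
```

The two functions return the same value for any `n` whose accesses stay in bounds; the second needs one fewer loop-carried variable.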

Mar 14, 2006
- Added target lowering hooks which LSR consults to make more intelligent · c567c4ef
  Evan Cheng authored Mar 13, 2006
```
transformation decisions.

llvm-svn: 26738
```
  c567c4ef
Feb 04, 2006

Use SCEVExpander::InsertCastOfTo instead of our own code. This reduces · d30c4991
Chris Lattner authored Feb 04, 2006
```
#LLVM LOC, and auto-cse's cast instructions.

llvm-svn: 25974
```
d30c4991

Fix two significant bugs in LSR: · 2959f000

Chris Lattner authored Feb 04, 2006

1. When rewriting code in outer loops, sometimes we would insert code into
   inner loops that is invariant in that loop.
2. Notice that 4*(2+x) is 8+4*x and use that to simplify expressions.

This is a performance neutral change.

llvm-svn: 25964

2959f000
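The identity in item 2 can be illustrated with a small C sketch. The function names and the byte-pointer setting are hypothetical; the point is only that distributing the multiply separates a constant part, which does not depend on `x`, from the part that does.

```c
/* Address computed as written: the scale applies to the whole sum. */
const unsigned char *addr_unsimplified(const unsigned char *base, long x) {
    return base + 4 * (2 + x);
}

/* Distributed form, 8 + 4*x: the constant part (base + 8) no longer
 * depends on x, so in a loop over x it could be computed once outside
 * the loop, leaving only 4*x to vary. */
const unsigned char *addr_distributed(const unsigned char *base, long x) {
    const unsigned char *invariant = base + 8;
    return invariant + 4 * x;
}
```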

Jan 23, 2006
- Make iostream #inclusion explicit · c597b8a5
  Chris Lattner authored Jan 22, 2006
```
llvm-svn: 25514
```
  c597b8a5
Jan 11, 2006
- Switch these to using ETForest instead of DominatorSet to compute itself. · cb36710f
  Chris Lattner authored Jan 11, 2006
```
Patch written by Daniel Berlin!

llvm-svn: 25202
```
  cb36710f
Dec 05, 2005
- getRawValue zero extends for unsigned values, use getsextvalue so that we · 07720073
  Chris Lattner authored Dec 05, 2005
```
know that small negative values fit into the immediate field of addressing
modes.

llvm-svn: 24608
```
  07720073
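A minimal C sketch of why the extension matters. The 32-bit constant width, the 16-bit signed immediate field, and all function names here are assumptions chosen for illustration; they are not taken from the LLVM sources.

```c
#include <stdint.h>

/* Hypothetical target check: does v fit a signed 16-bit immediate field? */
int fits_simm16(int64_t v) {
    return v >= INT16_MIN && v <= INT16_MAX;
}

/* Widen a 32-bit constant by zero extension (the bits are read as unsigned). */
int64_t widen_zext(int32_t c) {
    return (int64_t)(uint32_t)c;
}

/* Widen a 32-bit constant by sign extension (negative values stay negative). */
int64_t widen_sext(int32_t c) {
    return (int64_t)c;
}
```

With these definitions, `-4` zero-extends to 4294967292, which fails the range check, while its sign-extended form is still `-4` and passes.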
Oct 21, 2005
- My previous patch was too conservative. Reject FP and void types, but do · 5df0e36e
  Chris Lattner authored Oct 21, 2005
```
allow pointer types.

llvm-svn: 23859
```
  5df0e36e
Oct 20, 2005

Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me from an · 0c0b38bb

Chris Lattner authored Oct 20, 2005

inner loop like this:

LBB_RateConvertMono8AltiVec_2:  ; no_exit
        lis r2, ha16(.CPI_RateConvertMono8AltiVec_0)
        lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2)
        fmr f3, f3
        fadd f0, f2, f0
        fadd f3, f0, f3
        fcmpu cr0, f3, f1
        bge cr0, LBB_RateConvertMono8AltiVec_2  ; no_exit

to an inner loop like this:

LBB_RateConvertMono8AltiVec_1:  ; no_exit
        fsub f2, f2, f1
        fcmpu cr0, f2, f1
        fmr f0, f2
        bge cr0, LBB_RateConvertMono8AltiVec_1  ; no_exit

Doh! good catch!

llvm-svn: 23838

0c0b38bb

Oct 11, 2005
- Fix (hopefully the last) issue where LSR is nondeterministic. When pulling · 192cd18f
  Chris Lattner authored Oct 11, 2005
```
out CSE's of base expressions it could build a result whose order was
nondet.

llvm-svn: 23698
```
  192cd18f
- Fix another problem where LSR was being nondeterministic. Also remove elements · 5c9d63da
  Chris Lattner authored Oct 11, 2005
```
from the end of a vector instead of the beginning

llvm-svn: 23697
```
  5c9d63da
- Fix another lsr-is-nondeterministic case · b7a3894e
  Chris Lattner authored Oct 11, 2005
```
llvm-svn: 23695
```
  b7a3894e
Oct 09, 2005
- Hrm, you didn't see this. · eb4be8b9
  Chris Lattner authored Oct 09, 2005
```
llvm-svn: 23673
```
  eb4be8b9
- Fix a source of non-determinism in the backend: the order of processing · 4ea0a3ea
  Chris Lattner authored Oct 09, 2005
```
IV strides depended on the pointer order of the strides in memory.
Non-determinism is bad.

llvm-svn: 23672
```
  4ea0a3ea
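One common way to avoid depending on pointer order is to record items in the order they are first seen and iterate over that record. The commit message does not say how the fix was done, so the C sketch below is only an assumption-laden illustration; `struct stride`, `struct stride_order`, `MAX_STRIDES`, and `stride_order_add` are hypothetical names.

```c
#include <stddef.h>

#define MAX_STRIDES 16

/* Hypothetical stand-in for a stride expression object. */
struct stride { int step; };

/* Strides in the order they were first seen. */
struct stride_order {
    const struct stride *items[MAX_STRIDES];
    size_t count;
};

/* Record s if it is new. Returns 1 if appended, 0 if it was already
 * present or the table is full. Pointers are compared only for equality,
 * never for ordering, so the result does not depend on where the objects
 * happen to live in memory. */
int stride_order_add(struct stride_order *o, const struct stride *s) {
    for (size_t i = 0; i < o->count; i++)
        if (o->items[i] == s)
            return 0;
    if (o->count == MAX_STRIDES)
        return 0;
    o->items[o->count++] = s;
    return 1;
}
```

Iterating `items[0..count)` then visits strides in a fixed, input-determined order on every run.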
Oct 03, 2005

Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In · f07a587c

Chris Lattner authored Oct 03, 2005

particular, it should realize that phi's use their values in the pred block
not the phi block itself.  This change turns our em3d loop from this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_6    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; endif.loopexit.loopexit_crit_edge
        addi r3, r2, 1
        blr
LBB_test_6:     ; loopexit
        or r3, r2, r2
        blr

into:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r6, r6
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        or r2, r6, r6
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r2, r2
        blr


Unfortunately, this is actually worse code, because the register coalescer
is getting confused somehow.  If it were doing its job right, it could turn the
code into this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r6, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r6, r6
        blr

... which I'll work on next. :)

llvm-svn: 23604

f07a587c

Refactor some code into a function · e4ed42a4
Chris Lattner authored Oct 03, 2005
```
llvm-svn: 23603
```
e4ed42a4

This break is bogus and I have no idea why it was there. Basically it prevents · 360928db

Chris Lattner authored Oct 03, 2005

memoizing code when IV's are used by phinodes outside of loops.  In a simple
example, we were getting this code before (note that r6 and r7 are isomorphic
IV's):

        li r6, 0
        or r7, r6, r6
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r7, r7
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r2, r7, 1
        addi r7, r7, 1
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

Now we get:

        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

this was noticed in em3d.

llvm-svn: 23602

360928db

when checking if we should move a split edge block outside of a loop, · 8fcce170

Chris Lattner authored Oct 03, 2005

check the presplit pred, not the post-split pred.  This was causing us
to make the wrong decision in some cases, leaving the critical edge block
in the loop.

llvm-svn: 23601

8fcce170

Sep 27, 2005
- Make the pass name simpler · 92233d21
  Chris Lattner authored Sep 27, 2005
```
llvm-svn: 23476
```
  92233d21
Sep 13, 2005

Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI. · fd018c8d

Chris Lattner authored Sep 13, 2005

This fixes up a dot-product loop in galgel, speeding it up from 18.47s to
16.13s.

llvm-svn: 23327

fd018c8d

Sep 12, 2005

Fix a regression from last night, which caused this pass to create invalid · 8048b85e

Chris Lattner authored Sep 12, 2005

code for IV uses outside of loops that are not dominated by the latch block.
We should only convert these uses to use the post-inc value if they ARE
dominated by the latch block.

Also use a new LoopInfo method to simplify some code.

This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll

llvm-svn: 23318

8048b85e

_test: · a6764839

Chris Lattner authored Sep 12, 2005

        li r2, 0
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r2, 1
        stw r2, 0(r4)
        blr
Uses of IV's outside of the loop should use the post-incremented version
of the IV, not the preincremented version.  This helps many loops (e.g. in sixtrack)
which used to generate code like this (this is the code from the
dont-hoist-simple-loop-constants.ll testcase):

_test:
        li r2, 0                 **** IV starts at 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2            **** Copy for loop exit
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2           **** IV+2
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2       ****  IV+2
        stw r2, 0(r4)
        blr

And now generate code like this:

_test:
        li r2, 1               *** IV starts at 1
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701     *** IV.postinc + 0
        blt cr0, LBB_test_1
LBB_test_2:     ; loopexit.2.loopexit
        stw r2, 0(r4)          *** IV.postinc + 0
        blr

llvm-svn: 23313

a6764839
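The difference between the two listings above can be sketched in C. The commit does not show the C source of the test case, so the functions below are a reconstruction from the assembly and should be read as an assumption; `fill_preinc`, `fill_postinc`, and `LIMIT` are hypothetical names.

```c
#define LIMIT 701

/* Pre-increment form: both the exit test and the use after the loop are
 * written in terms of the value the IV had at the top of the iteration. */
void fill_preinc(int *a, int *out) {
    int iv = 0;                 /* IV starts at 0 */
    int cur;
    do {
        cur = iv;               /* copy kept live for the exit test and exit use */
        a[cur] = 0;
        iv = cur + 1;
    } while (cur + 2 < LIMIT);  /* IV + 2 */
    *out = cur + 2;             /* IV + 2 again, recomputed after the loop */
}

/* Post-increment form: the IV starts at 1, so its incremented value is
 * exactly what the exit test and the use outside the loop need. */
void fill_postinc(int *a, int *out) {
    int iv = 1;                 /* IV starts at 1 */
    int *p = a;
    do {
        *p++ = 0;
        iv = iv + 1;            /* post-incremented IV */
    } while (iv < LIMIT);       /* IV.postinc + 0 */
    *out = iv;                  /* IV.postinc + 0 */
}
```

Both versions clear the same 700 elements and store the same exit value; the second needs no extra copy and no add after the loop.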