Commits · 5c91c8f18b69038000257cabab2b7b00de20b929 · Roger Ferrer / llvm-epi-0.8

Oct 09, 2004
- Use DEBUG instead of DebugFlag directly, as DebugFlag does not respect · 5c91c8f1
  Chris Lattner authored Oct 09, 2004
```
-debug-only!

llvm-svn: 16868
```
  5c91c8f1
- Fix infinite loop due to iteration · f369b38d
  Chris Lattner authored Oct 09, 2004
```
llvm-svn: 16864
```
  f369b38d
- Implement sub.ll:test17, -X/C -> X/-C · 4ad08352
  Chris Lattner authored Oct 09, 2004
```
llvm-svn: 16863
```
  4ad08352
- If we found a dead global, we should at least delete it... · 1b8d2957
  Chris Lattner authored Oct 08, 2004
```
llvm-svn: 16858
```
  1b8d2957
Oct 08, 2004

* Pull out the meat of runOnModule into another function for clarity. · 1c4bddc5

Chris Lattner authored Oct 08, 2004

* Do not let dangling dead constants prevent optimization
* Iterate global optimization while we're making progress.

These changes allow us to be more aggressive, handling cases like
GlobalOpt/iterate.llx without a problem (turning it into 'ret int 0').

llvm-svn: 16857

1c4bddc5

We might as well delete the known-dead global sooner rather than later since · 73ad73e2
Chris Lattner authored Oct 08, 2004
```
we know it is dead.

llvm-svn: 16855
```
73ad73e2
Temporarily disable a buggy transformation until it can be fixed. This fixes · 0b41e861
Chris Lattner authored Oct 08, 2004
```
254.gap.

llvm-svn: 16853
```
0b41e861

Implement SRA for global variables. This allows the other global variable · abab0719

Chris Lattner authored Oct 08, 2004

optimizations to trigger much more often.  This allows the elimination of
several dozen more global variables in Programs/External.  Note that we only
do this for non-constant globals: constant globals will already be optimized
out if the accesses to them permit it.

This implements Transforms/GlobalOpt/globalsra.llx

llvm-svn: 16842

abab0719
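
As a rough illustration of what scalar replacement of aggregates (SRA) means for a global, the C sketch below puts a struct global next to a hand-split version. The names `stats`, `stats_hits`, and `stats_misses` are invented for this example; they do not come from the commit or from globalsra.llx.

```c
#include <assert.h>

/* Before: one non-constant aggregate global; every access names a field. */
static struct { int hits; int misses; } stats;

static int record_before(int hit) {
    if (hit) stats.hits++; else stats.misses++;
    return stats.hits;
}

/* After (hand-written equivalent): one scalar global per field. */
static int stats_hits;
static int stats_misses;

static int record_after(int hit) {
    if (hit) stats_hits++; else stats_misses++;
    return stats_hits;
}

/* Both versions must report the same hit count for the same inputs. */
static void check_global_sra(void) {
    static const int inputs[] = {1, 0, 1, 1, 0};
    for (unsigned i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
        assert(record_before(inputs[i]) == record_after(inputs[i]));
}
```

Once the fields are separate scalars, each one can be judged on its own by the other global optimizations, which is presumably why they trigger more often.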

Instcombine (X & FF00) + xx00 -> (X+xx00) & FF00, implementing and.ll:test27 · bff91d9a
Chris Lattner authored Oct 08, 2004
```
This comes up when doing adds to bitfield elements.

llvm-svn: 16836
```
bff91d9a
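
A minimal C check of this identity, assuming a 16-bit value so that the FF00 mask covers the top byte (in a wider type the carry out of the masked field would survive and the two forms would differ). The constant 0x1200 stands in for "xx00" and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Before: mask out the high byte, then add a constant whose low byte is 0. */
static uint16_t mask_then_add(uint16_t x) {
    return (uint16_t)((x & 0xFF00u) + 0x1200u);
}

/* After: add first, mask second. */
static uint16_t add_then_mask(uint16_t x) {
    return (uint16_t)((x + 0x1200u) & 0xFF00u);
}

/* Exhaustively compare the two forms over every 16-bit value. */
static void check_and_add_fold(void) {
    for (uint32_t x = 0; x <= 0xFFFFu; x++)
        assert(mask_then_add((uint16_t)x) == add_then_mask((uint16_t)x));
}
```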
Little patch to turn (shl (add X, 123), 4) -> (add (shl X, 4), 123 << 4) · 44bd392c
Chris Lattner authored Oct 08, 2004
```
This triggers in cases of bitfield additions, opening opportunities for
future improvements.

llvm-svn: 16834
```
44bd392c
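
The rewrite relies on a left shift distributing over addition in wrapping arithmetic. A small C sketch, assuming 32-bit unsigned values and hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Before: shift the sum. */
static uint32_t shl_of_add(uint32_t x) { return (x + 123u) << 4; }

/* After: shift each term; 123 << 4 is a compile-time constant. */
static uint32_t add_of_shl(uint32_t x) { return (x << 4) + (123u << 4); }

/* Compare the two forms on a few values, including ones that wrap. */
static void check_shl_add_fold(void) {
    const uint32_t samples[] = {0u, 1u, 5u, 0x0FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(shl_of_add(samples[i]) == add_of_shl(samples[i]));
}
```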

Oct 07, 2004

Improve comments, no functionality changes · 617f1a34
Chris Lattner authored Oct 07, 2004
```
llvm-svn: 16814
```
617f1a34
Fix a bug in the safety analysis routine · 02b6c918
Chris Lattner authored Oct 07, 2004
```
llvm-svn: 16804
```
02b6c918
Comment cleanups · f6479968
Chris Lattner authored Oct 07, 2004
```
llvm-svn: 16803
```
f6479968

* Rename pass to globalopt, since we do more than just constify · 25db5803

Chris Lattner authored Oct 07, 2004

* Instead of handling dead functions specially, just nuke them.
* Be more aggressive about cleaning up after constification, in
  particular, handle getelementptr instructions and constantexprs.
* Be a little bit more structured about how we process globals.

*** Delete globals that are only stored to, and never read.  These are
    clearly not useful, so they should go.  This implements deadglobal.llx

This last one triggers quite a few times.  In particular, 2208 in the
external tests, 1865 of which are in 252.eon.  This shrinks eon from
1995094 to 1732341 bytes of bytecode.

llvm-svn: 16802

25db5803
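
In source terms, a global that is only stored to looks roughly like the C sketch below. The names `last_value`, `bump_before`, and `bump_after` are made up for illustration and are not taken from deadglobal.llx.

```c
#include <assert.h>

static int last_value;            /* written below, never read anywhere */

/* Before: the function stores to a global that nothing ever loads. */
static int bump_before(int v) {
    last_value = v;
    return v + 1;
}

/* After: the store and the global itself are gone; results are unchanged. */
static int bump_after(int v) {
    return v + 1;
}

static void check_dead_store_global(void) {
    for (int v = -3; v <= 3; v++)
        assert(bump_before(v) == bump_after(v));
}
```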

Oct 06, 2004

Implement GlobalConstifier/trivialstore.llx, and also do some · 1f849a08

Chris Lattner authored Oct 06, 2004

simplifications of the resultant program to avoid making later passes
do it all.

This allows us to constify globals that just have the same constant that
they are initialized stored into them.

Surprisingly this comes up ALL of the freaking time, dozens of times in
SPEC, 30 times in vortex alone.

For example, on 256.bzip2, it allows us to constify these two globals:

%smallMode = internal global ubyte 0             ; <ubyte*> [#uses=8]
%verbosity = internal global int 0               ; <int*> [#uses=49]

Which (with later optimizations) results in the bytecode file shrinking
from 82286 to 69686 bytes!  Let's hear it for IPO :)

For the record, it's nuking lots of "if (verbosity > 2) { do lots of stuff }"
code.

llvm-svn: 16793

1f849a08
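
A hedged C sketch of the pattern described above: a global whose only stores write the value it was initialized with, guarding code that can then never run. The variable name `verbosity` comes from the commit message; the functions are hypothetical stand-ins, not code from 256.bzip2.

```c
#include <assert.h>

static int verbosity = 0;          /* the initializer is 0 ... */

static void reset_options(void) {
    verbosity = 0;                 /* ... and the only store writes 0 again */
}

/* Before: the branch depends on a global that can only ever hold 0. */
static int work_before(int n) {
    if (verbosity > 2)
        n += 1000;                 /* stands in for "do lots of stuff" */
    return n * 2;
}

/* After: with verbosity treated as the constant 0, the branch folds away. */
static int work_after(int n) {
    return n * 2;
}

static void check_constified(void) {
    reset_options();
    for (int n = 0; n < 8; n++)
        assert(work_before(n) == work_after(n));
}
```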

Instcombine: -(X sdiv C) -> (X sdiv -C), tested by sub.ll:test16 · 0aee4b79
Chris Lattner authored Oct 06, 2004
```
llvm-svn: 16769
```
0aee4b79
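
A small C check of the identity, assuming truncating signed division (as in C99) and staying away from the overflow corner cases such as INT_MIN; the function names are hypothetical.

```c
#include <assert.h>

/* Before: divide, then negate the quotient. */
static int neg_of_div(int x, int c) { return -(x / c); }

/* After: divide by the negated constant, saving the separate negation. */
static int div_by_neg(int x, int c) { return x / -c; }

static void check_sdiv_neg_fold(void) {
    for (int x = -50; x <= 50; x++)
        for (int c = -7; c <= 7; c++)
            if (c != 0)
                assert(neg_of_div(x, c) == div_by_neg(x, c));
}
```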

Reduce code growth implied by the tail duplication pass by not duplicating · 2ce32df8

Chris Lattner authored Oct 06, 2004

an instruction if it can be hoisted to a common dominator of the block.
This implements: test/Regression/Transforms/TailDup/MergeTest.ll

llvm-svn: 16758

2ce32df8

Sep 30, 2004
- Add accessor function. · 33e834eb
  Brian Gaeke authored Sep 30, 2004
```
llvm-svn: 16622
```
  33e834eb
- Correct type of accessor functions. · 5a89bde5
  Brian Gaeke authored Sep 30, 2004
```
llvm-svn: 16621
```
  5a89bde5
- Namespacify. Add accessor function. · e80d4cd6
  Brian Gaeke authored Sep 30, 2004
```
llvm-svn: 16620
```
  e80d4cd6
- Disable the 'WARNING: Found global types that are not compatible' warning · 9af8efdd
  Chris Lattner authored Sep 30, 2004
```
that always prints when linking programs to libstdc++ :(

llvm-svn: 16603
```
  9af8efdd
Sep 29, 2004

Hrm, debugging printouts do not need to be in here · abae776b
Chris Lattner authored Sep 29, 2004
```
llvm-svn: 16598
```
abae776b

* Pull range optimization code out into new InsertRangeTest function. · 6862fbd2

Chris Lattner authored Sep 29, 2004

* SubOne/AddOne functions always return ConstantInt, declare them as such
* Pull code for handling setcc X, cst, where cst is at the end of the range,
  or cc is LE or GE up earlier in visitSetCondInst.  This reduces #iterations
  in some cases.
* Fold: (div X, C1) op C2 -> range check, implementing div.ll:test6 - test9.

llvm-svn: 16588

6862fbd2
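
To illustrate the last bullet, here is a C sketch of how a comparison of a quotient against a constant becomes a range check. The constants 10 and 3 and the function names are chosen for this example only; the actual cases in div.ll:test6 - test9 may differ.

```c
#include <assert.h>

/* x / 10 == 3 holds exactly when 30 <= x < 40. */
static int div_eq_before(unsigned x) { return x / 10u == 3u; }
static int div_eq_after(unsigned x)  { return x - 30u < 10u; }

/* x / 10 < 3 holds exactly when x < 30. */
static int div_lt_before(unsigned x) { return x / 10u < 3u; }
static int div_lt_after(unsigned x)  { return x < 30u; }

static void check_div_range_fold(void) {
    for (unsigned x = 0; x <= 1000u; x++) {
        assert(div_eq_before(x) == div_eq_after(x));
        assert(div_lt_before(x) == div_lt_after(x));
    }
}
```

The `x - 30u < 10u` form leans on unsigned wraparound: values below 30 wrap to very large numbers and fail the comparison.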

Do not insert trivially dead select instructions, which allows us to · 879ce789
Chris Lattner authored Sep 29, 2004
```
potentially fold more in one pass.

llvm-svn: 16583
```
879ce789

Fold binary expressions and casts into PHI nodes that have all constant inputs. · 6a4adcda

Chris Lattner authored Sep 29, 2004

This takes something like this:

%A = phi int [ 3, %cond_false.0 ], [ 2, %endif.0.i ], [ 2, %endif.1.i ]
%B = div int %tmp.243, 4

and turns it into:

%A = phi int [ 3/4, %cond_false.0 ], [ 2/4, %endif.0.i ], [ 2/4, %endif.1.i ]

which is later simplified (in this case) into %A = 0.

This triggers thousands of times in spec, for example, 269 times in 176.gcc.

This is tested by InstCombine/add.ll:test23 and set.ll:test18.

llvm-svn: 16582

6a4adcda
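
A loose C analogue of the example above, assuming the divide consumes the PHI result. The ternary plays the role of the PHI node and the names are hypothetical.

```c
#include <assert.h>

/* Before: pick a constant depending on the incoming path, then divide. */
static int phi_then_div(int from_cond_false) {
    int a = from_cond_false ? 3 : 2;   /* "phi" with all-constant inputs */
    return a / 4;
}

/* After: the divide is pushed into each incoming value and constant folds. */
static int div_in_phi(int from_cond_false) {
    return from_cond_false ? 3 / 4 : 2 / 4;   /* both arms are 0 */
}

static void check_phi_fold(void) {
    for (int c = 0; c <= 1; c++)
        assert(phi_then_div(c) == div_in_phi(c));
}
```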

Hrm, really, all tests passed without this, but it is scary to think how... · c949128b
Chris Lattner authored Sep 29, 2004
```
llvm-svn: 16568
```
c949128b

Remove debugging printout · be7a69eb

Chris Lattner authored Sep 29, 2004

Instcombine (setcc (truncate X), C1).

This occurs THOUSANDS of times in many benchmarks.  Particularly common
seem to be things like (seteq (cast bool X to int), int 0)

This turns it into (seteq bool %X, false), which then becomes (not %X).

llvm-svn: 16567

be7a69eb

Fold (X setcc C1) | (X setcc C2) · dcf756ec
Chris Lattner authored Sep 28, 2004
```
This implements or.ll:test1[89]

llvm-svn: 16561
```
dcf756ec

Sep 28, 2004

Fold (and (setcc X, C1), (setcc X, C2)) · 623826c8

Chris Lattner authored Sep 28, 2004

This is important for several reasons:

1. Benchmarks have lots of code that looks like this (perlbmk in particular):

  %tmp.2.i = setne int %tmp.0.i, 128              ; <bool> [#uses=1]
  %tmp.6343 = seteq int %tmp.0.i, 1               ; <bool> [#uses=1]
  %tmp.63 = and bool %tmp.2.i, %tmp.6343          ; <bool> [#uses=1]

   we now fold away the setne, a clear improvement.

2. In the more important cases, such as (X >= 10) & (X < 20), we now produce
   smaller code: (X-10) < 10.

3. Perhaps the nicest effect of this patch is that it really helps out the
   code generators.  In particular, for a 'range test' like the above,
   instead of generating this on X86 (the difference on PPC is even more
   pronounced):

        cmp %EAX, 50
        setge %CL
        cmp %EAX, 100
        setl %AL
        and %CL, %AL
        cmp %CL, 0

   we now generate this:

        add %EAX, -50
        cmp %EAX, 50

   Furthermore, this causes setcc's to be folded into branches more often.

These combinations trigger dozens of times in the spec benchmarks, particularly
in 176.gcc, 186.crafty, 253.perlbmk, 254.gap, & 099.go.

llvm-svn: 16559

623826c8
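
Points 1 and 2 can be checked with a short C sketch. It assumes 32-bit `int` and uses the unsigned-subtract trick for the range test; the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Point 2: (X >= 10) & (X < 20) becomes a single unsigned comparison. */
static int range_before(int x) { return (x >= 10) & (x < 20); }
static int range_after(int x)  { return (unsigned)x - 10u < 10u; }

/* Point 1: (X != 128) & (X == 1) is implied by X == 1 alone. */
static int and_before(int x) { return (x != 128) & (x == 1); }
static int and_after(int x)  { return x == 1; }

static void check_setcc_and_fold(void) {
    for (int x = -200; x <= 200; x++) {
        assert(range_before(x) == range_after(x));
        assert(and_before(x) == and_after(x));
    }
    /* Extremes: subtracting 10 wraps to a large unsigned value, never below 10. */
    assert(range_before(INT_MIN) == range_after(INT_MIN));
    assert(range_before(INT_MAX) == range_after(INT_MAX));
}
```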

Implement X / C1 / C2 folding · 272d5ca9

Chris Lattner authored Sep 28, 2004

Implement (setcc (shl X, C1), C2) folding.

The second one occurs several dozen times in spec.  The first was added
just in case.  :)

These are tested by shift.ll:test2[12], and div.ll:test5

llvm-svn: 16549

272d5ca9
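
Both folds can be sanity-checked in C. The sketch assumes unsigned 32-bit values, constants whose product does not overflow, and a comparison constant whose low bits are zero; the specific constants and names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* X / C1 / C2 becomes X / (C1 * C2). */
static uint32_t div_twice(uint32_t x) { return x / 6u / 7u; }
static uint32_t div_once(uint32_t x)  { return x / 42u; }

/* (X << 3) == 40 becomes a mask and a compare against 40 >> 3. */
static int shl_cmp_before(uint32_t x) { return (x << 3) == 40u; }
static int shl_cmp_after(uint32_t x)  { return (x & 0x1FFFFFFFu) == 5u; }

static void check_div_and_shl_folds(void) {
    for (uint32_t x = 0; x <= 5000u; x++) {
        assert(div_twice(x) == div_once(x));
        assert(shl_cmp_before(x) == shl_cmp_after(x));
        /* The top three bits are shifted out, so they must not matter. */
        assert(shl_cmp_before(x | 0xE0000000u) == shl_cmp_after(x | 0xE0000000u));
    }
}
```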

shl is always zero extending, so always use a zero extending shift right. · 6afc02f8

Chris Lattner authored Sep 28, 2004

This latent bug was exposed by recent changes, and is tested as:
llvm/test/Regression/Transforms/InstCombine/2004-09-28-BadShiftAndSetCC.llx

llvm-svn: 16546

6afc02f8

Add includes and use std:: for standard library calls to make code · 20f1b0ba
Alkis Evlogimenos authored Sep 28, 2004
```
compile on windows. This patch was contributed by Paolo Invernizzi.

llvm-svn: 16539
```
20f1b0ba
Pull assignment out of for loop conditional in order for this to · 3ce42ec7
Alkis Evlogimenos authored Sep 28, 2004
```
compile under windows. Patch contributed by Paolo Invernizzi!

llvm-svn: 16534
```
3ce42ec7

Sep 27, 2004

Fix two bugs: one where a condition was mistakenly swapped, and another · bfff18a8

Chris Lattner authored Sep 27, 2004

where we folded (X & 254) -> X < 1 instead of X < 2.  These problems were
latent problems exposed by the latest patch.

llvm-svn: 16528

bfff18a8
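
Reading "(X & 254)" as a comparison of the masked value against zero on an unsigned byte (an assumption; the commit message does not spell out the comparison), a short C check shows why X < 2 is the right result and X < 1 is not:

```c
#include <assert.h>
#include <stdint.h>

static int masked_is_zero(uint8_t x) { return (x & 254) == 0; }
static int below_two(uint8_t x)      { return x < 2; }   /* the corrected fold */
static int below_one(uint8_t x)      { return x < 1; }   /* the buggy fold */

static void check_mask_fold(void) {
    for (unsigned x = 0; x <= 255u; x++)
        assert(masked_is_zero((uint8_t)x) == below_two((uint8_t)x));
    /* X == 1 is the value the buggy fold got wrong. */
    assert(masked_is_zero(1) != below_one(1));
}
```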

Fold: (setcc (shr X, ShAmt), CI), where 'cc' is eq or ne. This xform · 1023b872

Chris Lattner authored Sep 27, 2004

triggers often, for example:

6x in povray, 1x in gzip, 279x in gcc, 1x in crafty, 8x in eon, 11x in perlbmk,
362x in gap, 4x in vortex, 14x in m88ksim, 211x in 126.gcc, 1x in compress,
11x in ijpeg, and 4x in 147.vortex.

llvm-svn: 16521

1023b872
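
One way to picture this fold in C, assuming a logical (unsigned) shift and a constant that still fits after being shifted back; the shift amount 4, the constant 3, and the names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Before: shift right, then compare for equality. */
static int shr_cmp_before(uint32_t x) { return (x >> 4) == 3u; }

/* After: mask off the bits the shift would discard and compare in place. */
static int shr_cmp_after(uint32_t x)  { return (x & 0xFFFFFFF0u) == (3u << 4); }

static void check_shr_cmp_fold(void) {
    for (uint32_t x = 0; x <= 1000u; x++)
        assert(shr_cmp_before(x) == shr_cmp_after(x));
    assert(shr_cmp_before(0xFFFFFFFFu) == shr_cmp_after(0xFFFFFFFFu));
    assert(shr_cmp_before(0x80000030u) == shr_cmp_after(0x80000030u));
}
```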

Sep 24, 2004
- Implement shift-and combinations, implementing InstCombine/and.ll:test19-21 · 7e794273
  Chris Lattner authored Sep 24, 2004
```
These combinations trigger 4 times in povray, 7x in gcc, 4x in gap, and 2x in bzip2.

llvm-svn: 16508
```
  7e794273
Sep 23, 2004
- Move LHSI->hasOneUse() into the arms of the conditional, reindenting code. · e1b4d2a4
  Chris Lattner authored Sep 23, 2004
```
No functionality changes here.

llvm-svn: 16505
```
  e1b4d2a4
- Implement Transforms/InstCombine/and.ll:test18, a case that occurs 20 times · 8fc5af4d
  Chris Lattner authored Sep 23, 2004
```
in perlbmk

llvm-svn: 16504
```
  8fc5af4d
- Implement select.ll:test16: fold load (select C, X, null) -> load X · bdcf41a8
  Chris Lattner authored Sep 23, 2004
```
llvm-svn: 16499
```
  bdcf41a8
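
The reasoning is that a load through null is undefined, so the select may be assumed to have picked X. A C analogue with hypothetical names, exercised only on the defined path:

```c
#include <assert.h>
#include <stddef.h>

/* Before: the pointer is either x or null; the load is only defined for x. */
static int load_of_select(int c, const int *x) {
    const int *p = c ? x : NULL;
    return *p;
}

/* After: load x directly; the condition no longer matters for the load. */
static int load_direct(int c, const int *x) {
    (void)c;
    return *x;
}

static void check_load_select_fold(void) {
    int value = 17;
    /* Only c != 0 is tested: with c == 0 the original is undefined behavior. */
    assert(load_of_select(1, &value) == load_direct(1, &value));
}
```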
Sep 21, 2004

Do not fold (X + C1 != C2) if there are other users of the add. Doing · b121ae1c

Chris Lattner authored Sep 21, 2004

this transformation used to take a loop like this:

int Array[1000];
void test(int X) {
  int i;
  for (i = 0; i < 1000; ++i)
    Array[i] += X;
}

Compiled to LLVM is:

no_exit:                ; preds = %entry, %no_exit
        %indvar = phi uint [ 0, %entry ], [ %indvar.next, %no_exit ]            ; <uint> [#uses=2]
        %tmp.4 = getelementptr [1000 x int]* %Array, int 0, uint %indvar                ; <int*> [#uses=2]
        %tmp.7 = load int* %tmp.4               ; <int> [#uses=1]
        %tmp.9 = add int %tmp.7, %X             ; <int> [#uses=1]
        store int %tmp.9, int* %tmp.4
***     %indvar.next = add uint %indvar, 1              ; <uint> [#uses=2]
***     %exitcond = seteq uint %indvar.next, 1000               ; <bool> [#uses=1]
        br bool %exitcond, label %return, label %no_exit

and turn it into a loop like this:

no_exit:                ; preds = %entry, %no_exit
        %indvar = phi uint [ 0, %entry ], [ %indvar.next, %no_exit ]            ; <uint> [#uses=3]
        %tmp.4 = getelementptr [1000 x int]* %Array, int 0, uint %indvar                ; <int*> [#uses=2]
        %tmp.7 = load int* %tmp.4               ; <int> [#uses=1]
        %tmp.9 = add int %tmp.7, %X             ; <int> [#uses=1]
        store int %tmp.9, int* %tmp.4
***     %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
***     %exitcond = seteq uint %indvar, 999             ; <bool> [#uses=1]
        br bool %exitcond, label %return, label %no_exit

Note that indvar.next and indvar can no longer be coalesced.  In machine
code terms, this patch changes this code:

.LBBtest_1:     # no_exit
        mov %EDX, OFFSET Array
        mov %ESI, %EAX
        add %ESI, DWORD PTR [%EDX + 4*%ECX]
        mov %EDX, OFFSET Array
        mov DWORD PTR [%EDX + 4*%ECX], %ESI
        mov %EDX, %ECX
        inc %EDX
        cmp %ECX, 999
        mov %ECX, %EDX
        jne .LBBtest_1  # no_exit

into this:

.LBBtest_1:     # no_exit
        mov %EDX, OFFSET Array
        mov %ESI, %EAX
        add %ESI, DWORD PTR [%EDX + 4*%ECX]
        mov %EDX, OFFSET Array
        mov DWORD PTR [%EDX + 4*%ECX], %ESI
        inc %ECX
        cmp %ECX, 1000
        jne .LBBtest_1  # no_exit

We need better instruction selection to get this:

.LBBtest_1:     # no_exit
        add DWORD PTR [Array + 4*%ECX], %EAX
        inc %ECX
        cmp %ECX, 1000
        jne .LBBtest_1  # no_exit

... but at least there is less register juggling

llvm-svn: 16473

b121ae1c