- Apr 18, 2004
-
Chris Lattner authored
llvm-svn: 13036
-
Chris Lattner authored
block. The primary motivation for doing this is that we can now unroll nested loops. This makes a pretty big difference in some cases. For example, in 183.equake, we are now beating the native compiler with the CBE, and we are a lot closer with LLC. I'm now going to play around a bit with the unroll factor and see what effect it really has. llvm-svn: 13034
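To make the nesting point concrete, here is a hand-worked sketch (mine, not output from the pass; the function is hypothetical) of why merging the unrolled body into the parent basic block enables unrolling the outer loop too:

    // Hypothetical input: the inner loop is a single basic block that
    // executes a constant 3 times.
    int sum(int A[4][3]) {
      int S = 0;
      for (int i = 0; i != 4; ++i)
        for (int j = 0; j != 3; ++j)
          S += A[i][j];
      return S;
    }

    // Once the inner loop is fully unrolled and its body merged into the
    // enclosing block, the outer loop is itself a single-basic-block loop
    // with a constant trip count, so it can be unrolled in turn.
    int sum_after_inner_unroll(int A[4][3]) {
      int S = 0;
      for (int i = 0; i != 4; ++i) {
        S += A[i][0];
        S += A[i][1];
        S += A[i][2];
      }
      return S;
    }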
-
Chris Lattner authored
While we're at it, add support for updating loop information correctly. llvm-svn: 13033
-
Chris Lattner authored
have a canonical indvar llvm-svn: 13032
-
Chris Lattner authored
limited. Even in its extremely simple state (it can only *fully* unroll single basic block loops that execute a constant number of times), it already helps improve performance a LOT on some benchmarks, particularly with the native code generators. llvm-svn: 13028
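For a concrete (hypothetical) picture of what fully unrolling a single basic block loop with a constant trip count means, a sketch:

    // Before: one-block loop body, trip count known to be 4.
    int scale(int X) {
      int R = X;
      for (int i = 0; i != 4; ++i)
        R = R * 3 + 1;
      return R;
    }

    // After full unrolling: the loop, its induction variable, and its
    // branch are gone, and each copy of the body is now exposed to
    // constant folding and the other scalar optimizations.
    int scale_unrolled(int X) {
      int R = X;
      R = R * 3 + 1;
      R = R * 3 + 1;
      R = R * 3 + 1;
      R = R * 3 + 1;
      return R;
    }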
-
Chris Lattner authored
llvm-svn: 13025
-
Chris Lattner authored
llvm-svn: 13023
-
Chris Lattner authored
operations. This allows us to compile this testcase:

    int main() {
      int h = 1;
      do h = 3 * h + 1; while (h <= 256);
      printf("%d\n", h);
      return 0;
    }

into this:

    int %main() {
    entry:
            call void %__main( )
            %tmp.6 = call int (sbyte*, ...)* %printf( sbyte* getelementptr ([4 x sbyte]* %.str_1, long 0, long 0), int 364 )            ; <int> [#uses=0]
            ret int 0
    }

This testcase was taken directly from 256.bzip2, believe it or not. This code is not as general as I would like. Next up is to refactor it a bit to handle more cases.
llvm-svn: 13019
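(Checking the folded constant by hand: starting from h = 1, the recurrence h = 3*h + 1 yields 4, 13, 40, 121, 364; 364 is the first value greater than 256, so the loop exits after five iterations and the printf argument is exactly the 364 shown above.)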
-
- Apr 17, 2004
-
Chris Lattner authored
exit values. llvm-svn: 13018
-
Chris Lattner authored
even if the loop is using expressions that we can't compute as a closed-form. This allows us to calculate that this function always returns 55:

    int test() {
      double X;
      int Count = 0;
      for (X = 100; X > 1; X = sqrt(X), ++Count)
        /*empty*/;
      return Count;
    }

And allows us to compute trip counts for loops like:

    int h = 1;
    do h = 3 * h + 1; while (h <= 256);

(which occurs in bzip2), and for this function, which occurs after inlining and other optimizations:

    int popcount() {
      int x = 666;
      int result = 0;
      while (x != 0) {
        result = result + (x & 0x1);
        x = x >> 1;
      }
      return result;
    }

We still cannot compute the exit values of result or h in the two loops above, which means we cannot delete the loop, but we are getting closer. Being able to compute a constant trip count for these two loops will allow us to unroll them completely though.
llvm-svn: 13017
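The brute-force idea is simple enough to sketch outside the compiler (this is an illustration, not the actual LLVM code; the iteration cap is an assumption): start from the known constant initial values, fold the loop's updates one iteration at a time, and count iterations until the exit test folds to a constant true.

    #include <cstdio>

    // Brute-force trip-count evaluation for:
    //   int h = 1; do h = 3 * h + 1; while (h <= 256);
    int main() {
      long h = 1;                            // known constant initial value
      unsigned TripCount = 0;
      const unsigned MaxIterations = 10000;  // assumed give-up threshold
      do {
        h = 3 * h + 1;                       // fold the body on constants
        ++TripCount;
      } while (h <= 256 && TripCount < MaxIterations);
      std::printf("trip count = %u, exit value of h = %ld\n", TripCount, h);
      // prints: trip count = 5, exit value of h = 364
      return 0;
    }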
-
Chris Lattner authored
(familiar) function:

    int _strlen(const char *str) {
      int len = 0;
      while (*str++) len++;
      return len;
    }

And transforming it to use a ulong induction variable, because the type of the pointer index was left as a constant long. This is obviously very bad. The fix is to shrink long constants in getelementptr instructions to intptr_t, making the indvars pass insert a uint induction variable, which is much more efficient. Here's the before code for this function:

    int %_strlen(sbyte* %str) {
    entry:
            %tmp.13 = load sbyte* %str              ; <sbyte> [#uses=1]
            %tmp.24 = seteq sbyte %tmp.13, 0                ; <bool> [#uses=1]
            br bool %tmp.24, label %loopexit, label %no_exit
    no_exit:                ; preds = %entry, %no_exit
    ***     %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ]           ; <uint> [#uses=2]
    ***     %indvar = phi ulong [ %indvar.next, %no_exit ], [ 0, %entry ]          ; <ulong> [#uses=2]
            %indvar1 = cast ulong %indvar to uint           ; <uint> [#uses=1]
            %inc.02.sum = add uint %indvar1, 1              ; <uint> [#uses=1]
            %inc.0.0 = getelementptr sbyte* %str, uint %inc.02.sum          ; <sbyte*> [#uses=1]
            %tmp.1 = load sbyte* %inc.0.0           ; <sbyte> [#uses=1]
            %tmp.2 = seteq sbyte %tmp.1, 0          ; <bool> [#uses=1]
            %indvar.next = add ulong %indvar, 1             ; <ulong> [#uses=1]
            %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
            br bool %tmp.2, label %loopexit.loopexit, label %no_exit
    loopexit.loopexit:              ; preds = %no_exit
            %indvar = cast uint %indvar to int              ; <int> [#uses=1]
            %inc.1 = add int %indvar, 1             ; <int> [#uses=1]
            ret int %inc.1
    loopexit:               ; preds = %entry
            ret int 0
    }

Here's the after code:

    int %_strlen(sbyte* %str) {
    entry:
            %inc.02 = getelementptr sbyte* %str, uint 1             ; <sbyte*> [#uses=1]
            %tmp.13 = load sbyte* %str              ; <sbyte> [#uses=1]
            %tmp.24 = seteq sbyte %tmp.13, 0                ; <bool> [#uses=1]
            br bool %tmp.24, label %loopexit, label %no_exit
    no_exit:                ; preds = %entry, %no_exit
    ***     %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ]           ; <uint> [#uses=3]
            %indvar = cast uint %indvar to int              ; <int> [#uses=1]
            %inc.0.0 = getelementptr sbyte* %inc.02, uint %indvar           ; <sbyte*> [#uses=1]
            %inc.1 = add int %indvar, 1             ; <int> [#uses=1]
            %tmp.1 = load sbyte* %inc.0.0           ; <sbyte> [#uses=1]
            %tmp.2 = seteq sbyte %tmp.1, 0          ; <bool> [#uses=1]
            %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
            br bool %tmp.2, label %loopexit, label %no_exit
    loopexit:               ; preds = %entry, %no_exit
            %len.0.1 = phi int [ 0, %entry ], [ %inc.1, %no_exit ]          ; <int> [#uses=1]
            ret int %len.0.1
    }
llvm-svn: 13016
-
Chris Lattner authored
the trip count for the loop, insert one so that we can canonicalize the exit condition. llvm-svn: 13015
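A sketch of what inserting a canonical induction variable buys (hypothetical loop and names, not from the commit): once a 0, 1, 2, ... counter exists alongside the loop's own induction variable, the exit test can be rewritten as a comparison of that counter against the trip count.

    // Before: the only induction variable halves on every iteration,
    // which makes the exit condition hard to canonicalize.
    void before(int *A, int N) {
      for (int P = N; P > 1; P /= 2)
        ++A[P];
    }

    // After inserting a canonical induction variable IV (assuming the
    // trip count is computable by the pass):
    void after(int *A, int N, int TripCount) {
      int P = N;
      for (int IV = 0; IV != TripCount; ++IV) {
        ++A[P];
        P /= 2;
      }
    }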
-
Chris Lattner authored
llvm-svn: 13011
-
- Apr 16, 2004
-
Chris Lattner authored
make the verifier more strict. This fixes building zlib. llvm-svn: 13002
-
Misha Brukman authored
llvm-svn: 13001
-
Brian Gaeke authored
Debian.) llvm-svn: 12986
-
Misha Brukman authored
llvm-svn: 12983
-
Chris Lattner authored
llvm-svn: 12980
-
Chris Lattner authored
that does not dominate all of its users, but is in the same basic block as its users. This class of error is what caused the mysterious CBE-only failures last night. llvm-svn: 12979
-
Chris Lattner authored
that didn't exist, missing the ones that do :( llvm-svn: 12978
-
Chris Lattner authored
Basically we were using SimplifyCFG as a huge sledgehammer for a simple optimization. Because SimplifyCFG does so many things, we can't use it for this purpose. llvm-svn: 12977
-
- Apr 15, 2004
-
Chris Lattner authored
the back-edge block, we must check the preincremented value. llvm-svn: 12968
-
Brian Gaeke authored
llvm-svn: 12967
-
Chris Lattner authored
Instead of producing code like this:

    Loop:
            X = phi 0, X2
            ...
            X2 = X + 1
            if (X != N-1) goto Loop

We now generate code that looks like this:

    Loop:
            X = phi 0, X2
            ...
            X2 = X + 1
            if (X2 != N) goto Loop

This has two big advantages:
  1. The trip count of the loop is now explicit in the code, allowing the direct implementation of Loop::getTripCount()
  2. This reduces register pressure in the loop, and allows X and X2 to be put into the same register.

As a consequence of the second point, the code we generate for loops went from:

    .LBB2:  # no_exit.1
            ...
            mov %EDI, %ESI
            inc %EDI
            cmp %ESI, 2
            mov %ESI, %EDI
            jne .LBB2 # PC rel: no_exit.1

To:

    .LBB2:  # no_exit.1
            ...
            inc %ESI
            cmp %ESI, 3
            jne .LBB2 # PC rel: no_exit.1

...which has two fewer moves, and uses one less register.
llvm-svn: 12961
-
Chris Lattner authored
llvm-svn: 12960
-
Chris Lattner authored
llvm-svn: 12958
-
Chris Lattner authored
llvm-svn: 12956
-
Chris Lattner authored
insert it once! llvm-svn: 12955
-
- Apr 14, 2004
-
John Criswell authored
The iterator is pointing at the next instruction, which should not disappear when doing the load/store replacement. llvm-svn: 12954
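The pattern at issue can be sketched with a std::list standing in for a basic block's instruction list (types and names here are hypothetical, not the actual LLVM code): the iterator is advanced past the current instruction before the replacement runs, so the replacement must not delete that next instruction.

    #include <list>
    #include <string>

    // Walk the "instructions", replacing loads and stores. The erase is
    // safe only because I was advanced first and now points at the *next*
    // instruction -- which is exactly the instruction that must not
    // disappear while the replacement is performed.
    void replaceLoadsAndStores(std::list<std::string> &Insts) {
      for (auto I = Insts.begin(); I != Insts.end(); ) {
        auto Cur = I++;                    // advance before mutating
        if (*Cur == "load" || *Cur == "store")
          Insts.erase(Cur);                // stand-in for the replacement
      }
    }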
-
Brian Gaeke authored
llvm-svn: 12953
-
Chris Lattner authored
at the bottom of the loop instead of the top. This reduces the number of overlapping live ranges a lot. For example, it eliminates a spill in an important loop in 183.equake with linear scan. I still need to make the exit comparison of the loop use the post-incremented version of this variable, but this is an easy first step. llvm-svn: 12952
-
Brian Gaeke authored
This should unbreak the Sparc JIT again. llvm-svn: 12949
-
John Criswell authored
functions and is not needed here. Simplify the pointer type check per Chris's suggestions. llvm-svn: 12945
-
John Criswell authored
that matches its return type. llvm-svn: 12944
-
John Criswell authored
Sorry these didn't get in yesterday. llvm-svn: 12942
-
Chris Lattner authored
llvm-svn: 12940
-
Chris Lattner authored
even when the "optimization" I added before is turned off. It generates this extremely pointless code:

    test:
            fld QWORD PTR [%ESP + 4]
            mov %AL, 0
            test %AL, %AL
            fcmove %ST(0), %ST(0)
            ret

Good thing the optimizer will have removed this before code generation anyway. :)
llvm-svn: 12939
-
John Criswell authored
On x86, memory operations occur in-order, so these are just lowered into volatile loads and stores. llvm-svn: 12936
-
- Apr 13, 2004
-
Chris Lattner authored
X86/2004-04-13-FPCMOV-Crash.llx. A more robust fix is to follow. llvm-svn: 12935
-
Chris Lattner authored
test/Regression/Transforms/SCCP/calltest.ll llvm-svn: 12921
-