Commits · 0b011ec8e26372877d04c6e9eb9662df024c0484 · Roger Ferrer / llvm-epi-0.8

Sep 26, 2005
- Factor the GetGEPGlobalInitializer out of this pass and into Transforms/Utils · 0b011ec8
  Chris Lattner authored Sep 26, 2005
```
as ConstantFoldLoadThroughGEPConstantExpr.

llvm-svn: 23445
```
- Move the ConstantFoldLoadThroughGEPConstantExpr function out of the InstCombine · c13c7b93
  Chris Lattner authored Sep 26, 2005
```
pass.

llvm-svn: 23444
```
- add a comment · b009663e
  Chris Lattner authored Sep 26, 2005
```
llvm-svn: 23442
```
- Add support for getelementptr, load, and correctly reject volatile stores. · 4b05c322
  Chris Lattner authored Sep 26, 2005
```
llvm-svn: 23441
```
- Add support for br/brcond/switch and phi · 3e9ea5ff
  Chris Lattner authored Sep 26, 2005
```
llvm-svn: 23439
```
- Add a simple interpreter to this code, allowing us to statically evaluate · 99e23fa7
  Chris Lattner authored Sep 26, 2005
```
global ctors that are simple enough.  This implements ctor-list-opt.ll:CTOR2.

llvm-svn: 23437
```
- factor some code into an InstallGlobalCtors method, add comments. No functionality change. · 696beefa
  Chris Lattner authored Sep 26, 2005
```
llvm-svn: 23435
```
- Make the global opt optimizer work on modules with a null terminator, by · 838bdc18
  Chris Lattner authored Sep 26, 2005
```
accepting the null even with a non-65535 init prio

llvm-svn: 23434
```
- Factor this code out into a few methods. · 41b6a5a6
  Chris Lattner authored Sep 26, 2005
```
Implement the start of global ctor optimization.  It is currently smart
enough to remove the global ctor for cases like this:

struct foo {
  foo() {}
} x;

... saving a bit of startup time for the program.

llvm-svn: 23433
```
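The static evaluation of global constructors described in the entries above can be pictured with a small sketch. Everything in it (`Pair`, `dynamic_init`, `static_init`, `same_result`, `check_static_ctor`) is a hypothetical illustration rather than code from the pass, and the log does not say exactly which constructors count as "simple enough"; the assumption here is a constructor that only stores constants into its own object.

```cpp
#include <cassert>

// Hypothetical type: its constructor only stores constants into the object,
// so its effect is fully known at compile time.
struct Pair {
  int a, b;
  Pair() : a(1), b(2) {}
};

// Conceptually, this global gets an entry in the global ctor list and is
// initialized by a call that runs at program startup.
Pair dynamic_init;

// What a static evaluator could leave behind instead: a plain constant
// initializer, with no startup code at all.
struct PairData {
  int a, b;
};
PairData static_init = {1, 2};

// Both forms must be indistinguishable to the rest of the program.
bool same_result() {
  return dynamic_init.a == static_init.a && dynamic_init.b == static_init.b;
}

void check_static_ctor() { assert(same_result()); }
```

The empty `foo() {}` case from the log is the degenerate version of this: there is nothing to store, so the ctor list entry can simply be dropped.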
Sep 25, 2005
- Fix some logic I broke that caused a regression on · f4877680
  Chris Lattner authored Sep 25, 2005
```
SimplifyLibCalls/2005-05-20-sprintf-crash.ll

llvm-svn: 23430
```
- Move MaskedValueIsZero up. · 0b3557f5
  Chris Lattner authored Sep 24, 2005
```
Match a bunch of idioms for sign extensions, implementing InstCombine/signext.ll

llvm-svn: 23428
```
- Simplify this code a bit by relying on recursive simplification. Support · 175463a1
  Chris Lattner authored Sep 24, 2005
```
sprintf("%s", P)'s that have uses.

s/hasNUses(0)/use_empty()/

llvm-svn: 23425
```
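For the `sprintf("%s", P)` entry above, the log does not spell out what the call is rewritten to. A minimal sketch, assuming it becomes a length computation plus a copy so that the return value stays available to its uses (the function names are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Original form: formatted copy whose return value (characters written) is used.
int copy_with_sprintf(char *dst, const char *p) {
  return std::sprintf(dst, "%s", p);
}

// Assumed replacement: copy the string including its terminator and
// return its length, which is what sprintf would have returned.
int copy_with_memcpy(char *dst, const char *p) {
  std::size_t len = std::strlen(p);
  std::memcpy(dst, p, len + 1);
  return static_cast<int>(len);
}

void check_sprintf_rewrite() {
  char a[16], b[16];
  int ra = copy_with_sprintf(a, "hello");
  int rb = copy_with_memcpy(b, "hello");
  assert(ra == rb && rb == 5);
  assert(std::strcmp(a, b) == 0);
}
```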
Sep 23, 2005
- remove some debugging code · 499e3364
  Chris Lattner authored Sep 23, 2005
```
llvm-svn: 23411
```
- Fold two consecutive branches that share a common destination between them. · c59a371d
  Chris Lattner authored Sep 23, 2005
```
This implements SimplifyCFG/branch-fold.ll, and is useful on ?:/min/max heavy
code

llvm-svn: 23410
```
- simplify some logic further · 3a978bf6
  Chris Lattner authored Sep 23, 2005
```
llvm-svn: 23408
```
- pull a bunch of logic out of SimplifyCFG into a helper fn · cc14ebc1
  Chris Lattner authored Sep 23, 2005
```
llvm-svn: 23407
```
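The branch folding from c59a371d above can be sketched at the source level. This is an illustration with invented function names, assuming the second condition is cheap and safe to evaluate unconditionally; the actual pass works on the CFG, not on C++ source.

```cpp
#include <cassert>

// Two consecutive conditional branches that share the destination "return 1".
int two_branches(bool a, bool b) {
  if (a) return 1;  // first branch to the common destination
  if (b) return 1;  // second branch to the same destination
  return 0;
}

// Folded form: one branch on the combined condition.
int one_branch(bool a, bool b) {
  if (a | b) return 1;  // non-short-circuit OR, so only one branch remains
  return 0;
}

void check_branch_fold() {
  for (int a = 0; a < 2; ++a)
    for (int b = 0; b < 2; ++b)
      assert(two_branches(a != 0, b != 0) == one_branch(a != 0, b != 0));
}
```

Code written with `?:`, `min`, and `max` tends to produce this shape, which is why the log calls the transform useful there.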
Sep 20, 2005
- Start threading across blocks with code in them, so long as the code does · 6c701060
  Chris Lattner authored Sep 20, 2005
```
not define a value that is used outside of its block.  This catches many
more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc.

This implements branch-phi-thread.ll:test3.ll

llvm-svn: 23397
```
- Implement merging of blocks with the same condition if the block has multiple · f0bd8d01
  Chris Lattner authored Sep 20, 2005
```
predecessors.  This implements branch-phi-thread.ll::test1

llvm-svn: 23395
```
- Reject a case we don't handle yet · 049cb448
  Chris Lattner authored Sep 19, 2005
```
llvm-svn: 23393
```
- remove debugging code :-/ · a160924d
  Chris Lattner authored Sep 19, 2005
```
llvm-svn: 23392
```
- Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading · 748f9030
  Chris Lattner authored Sep 19, 2005
```
control across branches with determined outcomes.  More generality to follow.
This triggers a couple thousand times in specint.

llvm-svn: 23391
```
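"Threading control across branches with determined outcomes" can be pictured with the following sketch. The functions are hypothetical and written in C++ only for illustration; the assumption is the trivial case where the branch condition is a PHI of constants, so each incoming edge already decides the branch.

```cpp
#include <cassert>

// Before: flag is a merge (PHI) of two constants, then branched on again.
int unthreaded(bool c, int x) {
  bool flag;
  if (c)
    flag = true;
  else
    flag = false;
  if (flag) return x + 1;  // outcome is already fixed by the edge we came in on
  return x - 1;
}

// After: each predecessor jumps straight to the destination it would reach.
int threaded(bool c, int x) {
  if (c) return x + 1;
  return x - 1;
}

void check_threading() {
  for (int x = -2; x <= 2; ++x) {
    assert(unthreaded(true, x) == threaded(true, x));
    assert(unthreaded(false, x) == threaded(false, x));
  }
}
```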
Sep 18, 2005

Refactor this code a bit and make it more general. This now compiles: · b4b2530a

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) { b.j += x; }

To:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        slwi r3, r3, 6
        add r3, r4, r3
        rlwimi r3, r4, 0, 26, 14
        stw r3, 0(r2)
        blr


instead of:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 26, 21, 31
        add r3, r5, r3
        rlwimi r4, r3, 6, 15, 25
        stw r4, 0(r2)
        blr

by eliminating an 'and'.

I'm pretty sure this is as small as we can go :)

llvm-svn: 23386


Compile · 797dee77

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) {
  b.j += x;
}

to:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        and %ECX, 131008
        mov %EDX, DWORD PTR [%ESP + 4]
        shl %EDX, 6
        add %EDX, %ECX
        and %EDX, 131008
        and %EAX, -131009
        or %EDX, %EAX
        mov DWORD PTR [b], %EDX
        ret

instead of:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        shr %ECX, 6
        and %ECX, 2047
        add %ECX, DWORD PTR [%ESP + 4]
        shl %ECX, 6
        and %ECX, 131008
        and %EAX, -131009
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23385

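The two x86 sequences above compute the same word. A minimal sketch of why, written on a raw 32-bit word and assuming the layout implied by the masks (field `j` in bits 6 to 16, so its mask is 2047 << 6 = 131008); the function names are invented for illustration:

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kMaskJ = 131008u;  // 2047 << 6, the bits holding field j

// Old sequence: extract j, add, shift back into place, mask, merge.
uint32_t plus2_old(uint32_t w, uint32_t x) {
  uint32_t j = (w >> 6) & 2047u;
  uint32_t sum = ((j + x) << 6) & kMaskJ;
  return sum | (w & ~kMaskJ);
}

// New sequence: add in place by shifting x instead of shifting the field.
uint32_t plus2_new(uint32_t w, uint32_t x) {
  uint32_t sum = ((w & kMaskJ) + (x << 6)) & kMaskJ;
  return sum | (w & ~kMaskJ);
}

void check_plus2() {
  const uint32_t words[] = {0u, 0xFFFFFFFFu, 0x12345678u, 0x0001FFC0u};
  const uint32_t addends[] = {0u, 1u, 2047u, 2048u, 0xFFFFFFFFu};
  for (uint32_t w : words)
    for (uint32_t x : addends)
      assert(plus2_old(w, x) == plus2_new(w, x));
}
```

The equivalence holds because `(j + x) << 6` equals `(j << 6) + (x << 6)` modulo 2^32, and `j << 6` is just `w & kMaskJ`.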

Generalize this transform, using MaskedValueIsZero, allowing us to compile: · 01f56c68

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) { b.k += x; }

To:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        add DWORD PTR [b], %EAX
        ret

instead of:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        mov %ECX, DWORD PTR [b]
        add %EAX, %ECX
        and %EAX, -131072
        and %ECX, 131071
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23384

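The reason the mask-and-merge can disappear entirely here is a known-zero-bits argument of the kind `MaskedValueIsZero` answers. A small sketch on a raw 32-bit word, assuming field `k` occupies bits 17 to 31 as the constants 131071 and -131072 imply (function names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kLow17 = 131071u;  // bits 0..16, i.e. fields i and j

// Old sequence: add, keep only k from the sum, merge the untouched low bits.
uint32_t plus3_old(uint32_t w, uint32_t x) {
  uint32_t sum = (w + (x << 17)) & ~kLow17;  // "and -131072"
  return sum | (w & kLow17);                 // "and 131071", then "or"
}

// New sequence: x << 17 has its low 17 bits clear, so the add cannot
// change or carry out of the low 17 bits of w; a plain add is enough.
uint32_t plus3_new(uint32_t w, uint32_t x) { return w + (x << 17); }

void check_plus3() {
  const uint32_t words[] = {0u, 1u, 0x0001FFFFu, 0xFFFFFFFFu, 0x12345678u};
  const uint32_t addends[] = {0u, 1u, 32767u, 32768u, 0xFFFFFFFFu};
  for (uint32_t w : words)
    for (uint32_t x : addends)
      assert(plus3_old(w, x) == plus3_new(w, x));
}
```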

fix typo · 4ebc8ab4
Chris Lattner authored Sep 18, 2005
```
llvm-svn: 23383
```
Remove unintentionally committed code · e5b23a6d
Chris Lattner authored Sep 18, 2005
```
llvm-svn: 23382
```

implement shift.ll:test25. This compiles: · 27cb9dbd

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) {
  b.k += x;
}

to:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r3, 0(r2)
        rlwinm r4, r3, 0, 0, 14
        add r4, r4, r3
        rlwimi r4, r3, 0, 15, 31
        stw r4, 0(r2)
        blr

instead of:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        srwi r5, r4, 17
        add r3, r5, r3
        slwi r3, r3, 17
        rlwimi r3, r4, 0, 15, 31
        stw r3, 0(r2)
        blr

llvm-svn: 23381


Implement add.ll:test29. Codegening: · af517574

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus1 (unsigned int x) {
  b.i += x;
}

as:
_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        add r3, r4, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

instead of:

_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 0, 26, 31
        add r3, r5, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

llvm-svn: 23379

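The instruction that disappears above is the `rlwinm` that masks the field before the add. A sketch of why that mask is redundant, assuming field `i` is the low 6 bits of the word (which is what the `rlwinm ... 26, 31` mask selects); the function names are invented:

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kMaskI = 63u;  // low 6 bits, field i

// Old: mask the field, add, then re-insert the other 26 bits from w.
uint32_t plus1_old(uint32_t w, uint32_t x) {
  uint32_t sum = (w & kMaskI) + x;
  return (sum & kMaskI) | (w & ~kMaskI);
}

// New: add to the whole word. The low 6 bits of a sum depend only on the
// low 6 bits of its operands, and the upper bits are overwritten anyway.
uint32_t plus1_new(uint32_t w, uint32_t x) {
  uint32_t sum = w + x;
  return (sum & kMaskI) | (w & ~kMaskI);
}

void check_plus1() {
  const uint32_t words[] = {0u, 63u, 127u, 0xFFFFFFFFu, 0x12345678u};
  const uint32_t addends[] = {0u, 1u, 63u, 64u, 0xFFFFFFFFu};
  for (uint32_t w : words)
    for (uint32_t x : addends)
      assert(plus1_old(w, x) == plus1_new(w, x));
}
```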

remove debug output · 027eaf01
Chris Lattner authored Sep 18, 2005
```
llvm-svn: 23377
```

Implement or.ll:test21. This teaches instcombine to be able to turn this: · 15212989

Chris Lattner authored Sep 18, 2005

struct {
   unsigned int bit0:1;
   unsigned int ubyte:31;
} sdata;

void foo() {
  sdata.ubyte++;
}

into this:

foo:
        add DWORD PTR [sdata], 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [sdata]
        mov %ECX, %EAX
        add %ECX, 2
        and %ECX, -2
        and %EAX, 1
        or %EAX, %ECX
        mov DWORD PTR [sdata], %EAX
        ret

llvm-svn: 23376

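The single `add ..., 2` is valid because incrementing a field that starts at bit 1 adds a constant whose bit 0 is clear. A minimal sketch on the raw word, assuming `bit0` is bit 0 and `ubyte` is bits 1 to 31 as the constants in the old sequence imply (function names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Old sequence: add 2, keep bits 1..31 of the sum, merge back the old bit 0.
uint32_t inc_ubyte_old(uint32_t w) {
  uint32_t sum = (w + 2u) & ~1u;  // "and -2"
  return sum | (w & 1u);          // "and 1", then "or"
}

// New sequence: adding 2 never changes bit 0, so no masking is needed.
uint32_t inc_ubyte_new(uint32_t w) { return w + 2u; }

void check_inc_ubyte() {
  const uint32_t words[] = {0u, 1u, 2u, 0xFFFFFFFEu, 0xFFFFFFFFu, 0x12345679u};
  for (uint32_t w : words)
    assert(inc_ubyte_old(w) == inc_ubyte_new(w));
}
```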

Sep 14, 2005
- Fix the regression last night compiling povray · a393e4d4
  Chris Lattner authored Sep 14, 2005
```
llvm-svn: 23348
```
Sep 13, 2005

Add a simple xform to simplify array accesses with casts in the way. · 2a893296

Chris Lattner authored Sep 13, 2005

This is useful for 178.galgel where resolution of dope vectors (by the
optimizer) causes the scales to become apparent.

llvm-svn: 23328


Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI. · fd018c8d

Chris Lattner authored Sep 13, 2005

This fixes up a dot-product loop in galgel, speeding it up from 18.47s to
16.13s.

llvm-svn: 23327


Add a helper function, allowing us to simplify some code a bit, changing · 567b81f0
Chris Lattner authored Sep 13, 2005
```
indentation, no functionality change

llvm-svn: 23325
```

Implement a simple xform to turn code like this: · 219175c8

Chris Lattner authored Sep 12, 2005

  if () { store A -> P; } else { store B -> P; }

into a PHI node with one store, in the most trivial case.  This implements
load.ll:test10.

llvm-svn: 23324

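At the source level the store merging above looks roughly like this. It is only a sketch with invented names; in the IR the merged value is a PHI node, which the `?:` expression stands in for here.

```cpp
#include <cassert>

// Before: one store on each side of the branch, both to the same address.
void two_stores(bool c, int a, int b, int *p) {
  if (c) {
    *p = a;
  } else {
    *p = b;
  }
}

// After: the branch only selects the value; a single store follows the merge.
void one_store(bool c, int a, int b, int *p) {
  int merged = c ? a : b;  // plays the role of the PHI node
  *p = merged;
}

void check_store_merge() {
  for (int c = 0; c < 2; ++c) {
    int x = 0, y = 0;
    two_stores(c != 0, 10, 20, &x);
    one_store(c != 0, 10, 20, &y);
    assert(x == y);
  }
}
```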

Another load-peephole optimization: do gcse when two loads are next to · e0bfdf14
Chris Lattner authored Sep 12, 2005
```
each other.  This implements InstCombine/load.ll:test9

llvm-svn: 23322
```

Implement a trivial form of store->load forwarding where the store and the · b990f7d8

Chris Lattner authored Sep 12, 2005

load are exactly consecutive.  This is picked up by other passes, but this
triggers thousands of times in fortran programs that use static locals
(and is thus a compile-time speedup).

llvm-svn: 23320

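A source-level picture of this store-to-load forwarding, as a sketch with hypothetical function names. It assumes an ordinary (non-volatile) location and a load that immediately follows the store to the same address.

```cpp
#include <cassert>

// Before: the value is stored and then immediately loaded back.
int store_then_reload(int *p, int v) {
  *p = v;
  return *p;  // load directly after the store to the same address
}

// After: the load is replaced by the value that was just stored.
int store_then_forward(int *p, int v) {
  *p = v;
  return v;
}

void check_forwarding() {
  int a = 0, b = 0;
  assert(store_then_reload(&a, 7) == store_then_forward(&b, 7));
  assert(a == b);
}
```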

Sep 12, 2005

Fix a regression from last night, which caused this pass to create invalid · 8048b85e

Chris Lattner authored Sep 12, 2005

code for IV uses outside of loops that are not dominated by the latch block.
We should only convert these uses to use the post-inc value if they ARE
dominated by the latch block.

Also use a new LoopInfo method to simplify some code.

This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll

llvm-svn: 23318


_test: · a6764839

Chris Lattner authored Sep 12, 2005

        li r2, 0
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r2, 1
        stw r2, 0(r4)
        blr
[zion ~/llvm]$ cat > ~/xx
Uses of IVs outside of the loop should use the post-incremented version
of the IV, not the preincremented version.  This helps many loops (e.g. in sixtrack)
which used to generate code like this (this is the code from the
dont-hoist-simple-loop-constants.ll testcase):

_test:
        li r2, 0                 **** IV starts at 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2            **** Copy for loop exit
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2           **** IV+2
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2       ****  IV+2
        stw r2, 0(r4)
        blr

And now generates code like this:

_test:
        li r2, 1               *** IV starts at 1
LBB_test_1:     ; no_exit.2
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmpwi cr0, r2, 701     *** IV.postinc + 0
        blt cr0, LBB_test_1
LBB_test_2:     ; loopexit.2.loopexit
        stw r2, 0(r4)          *** IV.postinc + 0
        blr

llvm-svn: 23313

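The difference between the two loops above can be sketched in C++. This is a simplified illustration with invented names and a different exit expression from the testcase; it assumes `n >= 1` and only shows why using the post-incremented induction variable after the loop removes the extra copy and add.

```cpp
#include <cassert>

// Exit value derived from the pre-incremented IV: the old value must be
// kept alive across the increment and adjusted again after the loop.
int exit_from_preinc(int *a, int n) {
  int i = 0;
  int old;
  do {
    old = i;     // copy kept only for the loop exit
    a[old] = 0;
    i = old + 1;
  } while (i < n);
  return old + 1;  // recomputed from the pre-incremented value
}

// Exit value taken from the post-incremented IV: no copy, no extra add.
int exit_from_postinc(int *a, int n) {
  int i = 0;
  do {
    a[i] = 0;
    i = i + 1;
  } while (i < n);
  return i;  // the post-incremented value is reused directly
}

void check_postinc_exit() {
  int a[4] = {9, 9, 9, 9};
  int b[4] = {9, 9, 9, 9};
  assert(exit_from_preinc(a, 4) == exit_from_postinc(b, 4));
  for (int k = 0; k < 4; ++k) assert(a[k] == 0 && b[k] == 0);
}
```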

Sep 10, 2005

implement Transforms/LoopStrengthReduce/dont-hoist-simple-loop-constants.ll. · 530fe6ab

Chris Lattner authored Sep 10, 2005

We used to emit this code for it:

_test:
        li r2, 1     ;; Value tying up a register for the whole loop
        li r5, 0
LBB_test_1:     ; no_exit.2
        or r6, r5, r5
        li r5, 0
        stw r5, 0(r3)
        addi r5, r6, 1
        addi r3, r3, 4
        add r7, r2, r5  ;; should be addi r7, r5, 1
        cmpwi cr0, r7, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r6, 2
        stw r2, 0(r4)
        blr

now we emit this:

_test:
        li r2, 0
LBB_test_1:     ; no_exit.2
        or r5, r2, r2
        li r2, 0
        stw r2, 0(r3)
        addi r3, r3, 4
        addi r2, r5, 1
        addi r6, r5, 2   ;; whoa, fold those adds!
        cmpwi cr0, r6, 701
        blt cr0, LBB_test_1     ; no_exit.2
LBB_test_2:     ; loopexit.2.loopexit
        addi r2, r5, 2
        stw r2, 0(r4)
        blr

more improvement coming.

llvm-svn: 23306
