Commits · bc6f0d296cd1b4ef2204a4347037b88924441e11 · Roger Ferrer / llvm-epi-0.8

Sep 20, 2005
- Start threading across blocks with code in them, so long as the code does · 6c701060
  Chris Lattner authored Sep 20, 2005
```
not define a value that is used outside of its block.  This catches many
more simplifications, e.g. 854 in 176.gcc, 137 in vpr, etc.

This implements branch-phi-thread.ll:test3

llvm-svn: 23397
```
  6c701060
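A source-level sketch of what threading across a block that contains code might look like. This is an illustration only, not the pass itself: the function names and the `visits` counter are hypothetical, and the real transform works on LLVM IR basic blocks and PHI nodes rather than C statements.

```c
/* Hypothetical counter standing in for "code in the block" with a side effect. */
static int visits;

/* Before: the block holding the second branch also contains real code. */
int route_before(int x) {
    int flag;
    if (x > 0)
        flag = 1;
    else
        flag = 0;
    visits++;          /* defines no value that is used outside this block */
    if (flag)          /* outcome is already known on each incoming path */
        return 10;
    return 20;
}

/* After: each incoming path gets its own copy of the code and goes
   straight to the target that its known flag value selects. */
int route_after(int x) {
    if (x > 0) {
        visits++;
        return 10;
    }
    visits++;
    return 20;
}
```

Both versions return the same value and bump the counter once per call. If the duplicated code defined a value used in a later block, the copies would need to be merged again, which is presumably why the commit excludes that case.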
- Implement merging of blocks with the same condition if the block has multiple · f0bd8d01
  Chris Lattner authored Sep 20, 2005
```
predecessors.  This implements branch-phi-thread.ll::test1

llvm-svn: 23395
```
  f0bd8d01
- Reject a case we don't handle yet · 049cb448
  Chris Lattner authored Sep 19, 2005
```
llvm-svn: 23393
```
  049cb448
- remove debugging code :-/ · a160924d
  Chris Lattner authored Sep 19, 2005
```
llvm-svn: 23392
```
  a160924d
- Implement SimplifyCFG/branch-phi-thread.ll, the most trivial case of threading · 748f9030
  Chris Lattner authored Sep 19, 2005
```
control across branches with determined outcomes.  More generality to follow.
This triggers a couple thousand times in specint.

llvm-svn: 23391
```
  748f9030
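The "most trivial case" can be pictured in C as a branch on a flag whose value is fixed by the path taken to reach it. The sketch below is an assumed source-level analogue with hypothetical names; the actual pass operates on a conditional branch whose condition is a PHI of constants.

```c
#include <stdbool.h>

/* Before: the second test re-checks a flag whose value is already
   determined by which arm of the first test was taken. */
int classify_before(int x) {
    bool flag;
    if (x > 0)
        flag = true;
    else
        flag = false;
    if (flag)
        return 1;
    return 2;
}

/* After threading: each arm jumps directly to its known destination,
   so the flag and the second branch disappear. */
int classify_after(int x) {
    if (x > 0)
        return 1;
    return 2;
}
```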
- Stub out the rest of the DAG Combiner. Just need to fill in the · c760f80f
  Nate Begeman authored Sep 19, 2005
```
select_cc bits and then wrap it in a convenience function for use with
regular select.

llvm-svn: 23389
```
  c760f80f
Sep 19, 2005

Teach the local spiller to turn stack slot loads into register-register copies · 2f838f21

Chris Lattner authored Sep 19, 2005

when possible, avoiding the load (and avoiding the copy if the value is already
in the right register).

This patch came about when I noticed code like the following being generated:

  store R17 -> [SS1]
  ...blah...
  R4 = load [SS1]

This was causing an LSU reject on the G5.  This problem was due to the register
allocator folding spill code into a reg-reg copy (producing the load), which
prevented the spiller from being able to rewrite the load into a copy, despite
the fact that the value was already available in a register.  In the case
above, we now rip out the R4 load and replace it with a R4 = R17 copy.

This speeds up several programs on X86 (which spills a lot :) ), e.g.
smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from
68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc.  This may have a larger
impact in some cases on the G5 (by avoiding LSU rejects), though it probably
won't trigger as often (less spilling in general).

Targets that implement folding of loads/stores into copies should implement
the isLoadFromStackSlot hook to get this.

llvm-svn: 23388

2f838f21
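The rewrite described above can be modelled with a toy pass over a straight-line instruction list. Everything in this sketch is an assumption made for illustration: the `Inst` encoding, the `forward_spills` name, and the fixed slot limit are hypothetical and do not mirror LLVM's spiller or the `isLoadFromStackSlot` hook.

```c
#include <stddef.h>

enum Op { OP_STORE, OP_LOAD, OP_COPY, OP_DEF, OP_NOP };

struct Inst {
    enum Op op;
    int reg;  /* STORE: source register; LOAD, COPY, DEF: destination register */
    int arg;  /* STORE, LOAD: stack slot; COPY: source register; otherwise unused */
};

enum { NUM_SLOTS = 8 };  /* toy limit; slot numbers must stay below this */

/* Forget every slot whose cached copy lives in a register about to be overwritten. */
static void clobber(int holder[NUM_SLOTS], int reg) {
    for (int s = 0; s < NUM_SLOTS; s++)
        if (holder[s] == reg)
            holder[s] = -1;
}

/* Rewrite stack-slot loads in one straight-line block when the value
   is still available in a register (register numbers are >= 0). */
void forward_spills(struct Inst *code, size_t n) {
    int holder[NUM_SLOTS];  /* register known to hold each slot's value, or -1 */
    for (int s = 0; s < NUM_SLOTS; s++)
        holder[s] = -1;

    for (size_t i = 0; i < n; i++) {
        struct Inst *in = &code[i];
        switch (in->op) {
        case OP_STORE:
            holder[in->arg] = in->reg;  /* slot now mirrors this register */
            break;
        case OP_LOAD: {
            int slot = in->arg;
            int src = holder[slot];
            if (src == in->reg) {       /* value already in the right register */
                in->op = OP_NOP;
                break;
            }
            if (src >= 0) {             /* load becomes a register-register copy */
                in->op = OP_COPY;
                in->arg = src;
            }
            clobber(holder, in->reg);   /* destination register is overwritten */
            if (src < 0)
                holder[slot] = in->reg; /* a real load also makes the value available */
            break;
        }
        case OP_COPY:
        case OP_DEF:
            clobber(holder, in->reg);
            break;
        case OP_NOP:
            break;
        }
    }
}
```

Running it on the sequence from the message (`store R17 -> [SS1]`, an unrelated definition, `R4 = load [SS1]`) turns the load into `R4 = R17`. If R17 is redefined between the store and the load, the load is left alone.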

- Implement the isLoadFromStackSlot interface · de3c87a2
  Chris Lattner authored Sep 19, 2005
```
llvm-svn: 23387
```
  de3c87a2

Sep 18, 2005

Refactor this code a bit and make it more general. This now compiles: · b4b2530a

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) { b.j += x; }

To:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        slwi r3, r3, 6
        add r3, r4, r3
        rlwimi r3, r4, 0, 26, 14
        stw r3, 0(r2)
        blr


instead of:

_plus2:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 26, 21, 31
        add r3, r5, r3
        rlwimi r4, r3, 6, 15, 25
        stw r4, 0(r2)
        blr

by eliminating an 'and'.

I'm pretty sure this is as small as we can go :)

llvm-svn: 23386

b4b2530a
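The arithmetic behind the `b.j += x` sequences can be checked on a plain 32-bit word. The sketch assumes the layout implied by the constants in these messages (`i` in bits 0..5, `j` in bits 6..16, `k` in bits 17..31); real bit-field layout is implementation-defined, and the function names are hypothetical.

```c
#include <stdint.h>

#define J_MASK 0x0001FFC0u  /* bits 6..16, i.e. 131008 */

/* Extract j, add, shift back, reinsert (shape of the oldest sequence). */
uint32_t add_j_extract(uint32_t w, uint32_t x) {
    uint32_t j = (w >> 6) & 0x7FFu;
    return (w & ~J_MASK) | (((j + x) << 6) & J_MASK);
}

/* Shift x instead of the field: mask the field in place, then add. */
uint32_t add_j_shifted(uint32_t w, uint32_t x) {
    uint32_t sum = (w & J_MASK) + (x << 6);
    return (w & ~J_MASK) | (sum & J_MASK);
}

/* Drop the 'and' on w: x << 6 has zero low bits, so nothing carries
   into the field from below, and bits above the field are masked off. */
uint32_t add_j_inplace(uint32_t w, uint32_t x) {
    uint32_t sum = w + (x << 6);
    return (w & ~J_MASK) | (sum & J_MASK);
}
```

All three agree for every `w` and `x`, because left shift distributes over addition modulo 2^32 and the final mask discards whatever the unmasked bits of `w` contributed.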

Compile · 797dee77

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus2 (unsigned int x) {
  b.j += x;
}

to:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        and %ECX, 131008
        mov %EDX, DWORD PTR [%ESP + 4]
        shl %EDX, 6
        add %EDX, %ECX
        and %EDX, 131008
        and %EAX, -131009
        or %EDX, %EAX
        mov DWORD PTR [b], %EDX
        ret

instead of:

plus2:
        mov %EAX, DWORD PTR [b]
        mov %ECX, %EAX
        shr %ECX, 6
        and %ECX, 2047
        add %ECX, DWORD PTR [%ESP + 4]
        shl %ECX, 6
        and %ECX, 131008
        and %EAX, -131009
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23385

797dee77

Generalize this transform, using MaskedValueIsZero, allowing us to compile: · 01f56c68

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) { b.k += x; }

To:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        add DWORD PTR [b], %EAX
        ret

instead of:

plus3:
        mov %EAX, DWORD PTR [%ESP + 4]
        shl %EAX, 17
        mov %ECX, DWORD PTR [b]
        add %EAX, %ECX
        and %EAX, -131072
        and %ECX, 131071
        or %ECX, %EAX
        mov DWORD PTR [b], %ECX
        ret

llvm-svn: 23384

01f56c68
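A rough model of why the masks vanish for the top field: `x << 17` is known to have its low 17 bits clear, so adding it to the word cannot change those bits, and the and/or pair that restores them is redundant. The helper below is a toy stand-in for a known-zero-bits query; its name and signature are assumptions, not LLVM's `MaskedValueIsZero` interface.

```c
#include <stdbool.h>
#include <stdint.h>

#define K_SHIFT  17
#define LOW_MASK 0x0001FFFFu  /* bits 0..16, i.e. 131071 */

/* Toy query: are all bits selected by mask guaranteed zero in (v << shift)?
   Assumes shift < 32. */
bool shl_masked_value_is_zero(unsigned shift, uint32_t mask) {
    uint32_t possibly_set = 0xFFFFFFFFu << shift;
    return (possibly_set & mask) == 0;
}

/* Shape of the longer sequence: add, then splice the low bits of w back in. */
uint32_t add_k_masked(uint32_t w, uint32_t x) {
    uint32_t sum = (x << K_SHIFT) + w;
    return (sum & ~LOW_MASK) | (w & LOW_MASK);
}

/* Shape of the shorter sequence: a single add on the whole word. */
uint32_t add_k_direct(uint32_t w, uint32_t x) {
    return w + (x << K_SHIFT);
}
```

Because the field sits at the top of the word, overflow out of `k` simply falls off bit 31, so no mask is needed on the high side either.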

- fix typeo · 4ebc8ab4
  Chris Lattner authored Sep 18, 2005
```
llvm-svn: 23383
```
  4ebc8ab4
- Remove unintentionally committed code · e5b23a6d
  Chris Lattner authored Sep 18, 2005
```
llvm-svn: 23382
```
  e5b23a6d

implement shift.ll:test25. This compiles: · 27cb9dbd

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus3 (unsigned int x) {
  b.k += x;
}

to:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r3, 0(r2)
        rlwinm r4, r3, 0, 0, 14
        add r4, r4, r3
        rlwimi r4, r3, 0, 15, 31
        stw r4, 0(r2)
        blr

instead of:

_plus3:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        srwi r5, r4, 17
        add r3, r5, r3
        slwi r3, r3, 17
        rlwimi r3, r4, 0, 15, 31
        stw r3, 0(r2)
        blr

llvm-svn: 23381

27cb9dbd

Implement add.ll:test29. Codegening: · af517574

Chris Lattner authored Sep 18, 2005

struct S { unsigned int i : 6, j : 11, k : 15; } b;
void plus1 (unsigned int x) {
  b.i += x;
}

as:
_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        add r3, r4, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

instead of:

_plus1:
        lis r2, ha16(L_b$non_lazy_ptr)
        lwz r2, lo16(L_b$non_lazy_ptr)(r2)
        lwz r4, 0(r2)
        rlwinm r5, r4, 0, 26, 31
        add r3, r5, r3
        rlwimi r3, r4, 0, 0, 25
        stw r3, 0(r2)
        blr

llvm-svn: 23379

af517574

- remove debug output · 027eaf01
  Chris Lattner authored Sep 18, 2005
```
llvm-svn: 23377
```
  027eaf01

Implement or.ll:test21. This teaches instcombine to be able to turn this: · 15212989

Chris Lattner authored Sep 18, 2005

struct {
   unsigned int bit0:1;
   unsigned int ubyte:31;
} sdata;

void foo() {
  sdata.ubyte++;
}

into this:

foo:
        add DWORD PTR [sdata], 2
        ret

instead of this:

foo:
        mov %EAX, DWORD PTR [sdata]
        mov %ECX, %EAX
        add %ECX, 2
        and %ECX, -2
        and %EAX, 1
        or %EAX, %ECX
        mov DWORD PTR [sdata], %EAX
        ret

llvm-svn: 23376

15212989
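The equivalence behind `add DWORD PTR [sdata], 2` can be written out on a raw 32-bit word, assuming `bit0` occupies bit 0 and `ubyte` the remaining 31 bits, as the constants in the message suggest. The function names are hypothetical.

```c
#include <stdint.h>

/* Shape of the longer sequence: add 2, keep the upper 31 bits,
   then splice the original bit 0 back in. */
uint32_t inc_ubyte_masked(uint32_t w) {
    return ((w + 2u) & ~1u) | (w & 1u);
}

/* Shape of the shorter sequence: adding 2 never touches bit 0. */
uint32_t inc_ubyte_direct(uint32_t w) {
    return w + 2u;
}
```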

Sep 17, 2005
- Implement hook for ppc · 4d9cf680
  Chris Lattner authored Sep 17, 2005
```
llvm-svn: 23374
```
  4d9cf680
Sep 16, 2005
- More DAG combining. Still need the branch instructions, and select_cc · 24a7eca2
  Nate Begeman authored Sep 16, 2005
```
llvm-svn: 23371
```
  24a7eca2
Sep 15, 2005
- disable this for now · 0ebec066
  Chris Lattner authored Sep 15, 2005
```
llvm-svn: 23366
```
  0ebec066
Sep 14, 2005
- Give all operands names · 9e4a4ee3
  Chris Lattner authored Sep 14, 2005
```
llvm-svn: 23357
```
  9e4a4ee3
- give all operands names · 2e84be22
  Chris Lattner authored Sep 14, 2005
```
llvm-svn: 23356
```
  2e84be22
- Fix some issues exposed by more testing. XORIS had the wrong operands · f006d15e
  Chris Lattner authored Sep 14, 2005
```
specified.  The various *imm operands defined by PPC are really all i32,
even though the actual immediate is restricted to a smaller range.

llvm-svn: 23352
```
  f006d15e
- Fix some bugs noticed by new checking code · 6b013fc9
  Chris Lattner authored Sep 14, 2005
```
llvm-svn: 23350
```
  6b013fc9
- Fix the regression last night compiling povray · a393e4d4
  Chris Lattner authored Sep 14, 2005
```
llvm-svn: 23348
```
  a393e4d4
- fix a major regression from my patch this afternoon · b42e962d
  Chris Lattner authored Sep 14, 2005
```
llvm-svn: 23347
```
  b42e962d
- we don't need this proto any longer · b011cb27
  Chris Lattner authored Sep 13, 2005
```
llvm-svn: 23342
```
  b011cb27
- move the #include for the generated code into the isel class body so we · 03e08eef
  Chris Lattner authored Sep 13, 2005
```
can use/define class methods

llvm-svn: 23339
```
  03e08eef
Sep 13, 2005
- Change the arg lowering code to use copyfromreg from vregs associated · 0f965a61
  Chris Lattner authored Sep 13, 2005
```
with incoming arguments instead of the pregs themselves.  This keeps
the scheduler from causing problems when it moves a copyfromreg for an argument
to after a select_cc node (now it can, and bad things won't happen).

llvm-svn: 23334
```
  0f965a61
- This has been moved to the target-indep code · ee811329
  Chris Lattner authored Sep 13, 2005
```
llvm-svn: 23333
```
  ee811329
- This code is no longer needed, it is moved to the target-indep code · fb96e50b
  Chris Lattner authored Sep 13, 2005
```
llvm-svn: 23332
```
  fb96e50b
- If a function has liveins, and if the target requested that they be plopped · d4382f0a
  Chris Lattner authored Sep 13, 2005
```
into particular vregs, emit copies into the entry MBB.

llvm-svn: 23331
```
  d4382f0a
- Majik numbers are bad · 64685b4c
  Chris Lattner authored Sep 13, 2005
```
llvm-svn: 23330
```
  64685b4c
- Remove some dead vectors · aa6cbd90
  Chris Lattner authored Sep 13, 2005
```
llvm-svn: 23329
```
  aa6cbd90
- Add a simple xform to simplify array accesses with casts in the way. · 2a893296
  Chris Lattner authored Sep 13, 2005
```
This is useful for 178.galgel where resolution of dope vectors (by the
optimizer) causes the scales to become apparent.

llvm-svn: 23328
```
  2a893296
- Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI... · fd018c8d
  Chris Lattner authored Sep 13, 2005
```
Fix an issue where LSR would miss rewriting a use of an IV expression by a PHI node that is not the original PHI.

This fixes up a dot-product loop in galgel, speeding it up from 18.47s to
16.13s.

llvm-svn: 23327
```
  fd018c8d
- Add a helper function, allowing us to simplify some code a bit, changing · 567b81f0
  Chris Lattner authored Sep 13, 2005
```
indentation, no functionality change

llvm-svn: 23325
```
  567b81f0
- Implement a simple xform to turn code like this: · 219175c8
  Chris Lattner authored Sep 12, 2005
```
  if () { store A -> P; } else { store B -> P; }

into a PHI node with one store, in the most trivial case.  This implements
load.ll:test10.

llvm-svn: 23324
```
  219175c8
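In C terms, the store-merging transform corresponds roughly to the rewrite below. The function names are hypothetical, and the conditional expression plays the role of the PHI node.

```c
/* Before: one store on each side of the branch, both to the same address. */
void store_branchy(int *p, int cond, int a, int b) {
    if (cond)
        *p = a;
    else
        *p = b;
}

/* After: the branch only selects the value; a single store follows. */
void store_merged(int *p, int cond, int a, int b) {
    int v = cond ? a : b;  /* stands in for the PHI of A and B */
    *p = v;
}
```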
- Another load-peephole optimization: do gcse when two loads are next to · e0bfdf14
  Chris Lattner authored Sep 12, 2005
```
each other.  This implements InstCombine/load.ll:test9

llvm-svn: 23322
```
  e0bfdf14
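A minimal C analogue of the adjacent-load case, with hypothetical function names. It assumes the pointer is not volatile and that nothing can write to the location between the two loads.

```c
/* Before: the same address is loaded twice in a row. */
int twice_before(const int *p) {
    int a = *p;
    int b = *p;   /* nothing can have changed *p since the previous line */
    return a + b;
}

/* After: the second load reuses the first result. */
int twice_after(const int *p) {
    int a = *p;
    return a + a;
}
```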
- Implement a trivial form of store->load forwarding where the store and the · b990f7d8
  Chris Lattner authored Sep 12, 2005
```
load are exactly consecutive.  This is picked up by other passes, but this
triggers thousands of times in fortran programs that use static locals
(and is thus a compile-time speedup).

llvm-svn: 23320
```
  b990f7d8
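Store-to-load forwarding for back-to-back instructions looks like this at the source level. The names are hypothetical, and the sketch assumes a non-volatile location.

```c
/* Before: a value is stored and immediately loaded back. */
int bump_before(int *p, int v) {
    *p = v;
    int r = *p;   /* reloads the value that was just stored */
    return r + 1;
}

/* After: the load is replaced by the stored value; the store remains. */
int bump_after(int *p, int v) {
    *p = v;
    return v + 1;
}
```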