Commits · c8a68d08c38c771d3d22534044c275cf4fa68b72 · Roger Ferrer / llvm-epi-0.8

Nov 02, 2006
- Rename · 93cdd149
  Evan Cheng authored Nov 01, 2006
```
llvm-svn: 31364
```
- Two-address instructions no longer have to be A := A op C. Now any pair of... · d8697dec
  Evan Cheng authored Nov 01, 2006
```
Two-address instructions no longer have to be A := A op C. Now any pair of dest / src operands can be tied together.

llvm-svn: 31363
```
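As a rough illustration of what "any pair of dest / src operands can be tied together" means, the sketch below checks a register assignment against per-operand tie constraints. `OperandDesc`, `TiedTo`, and `tiesSatisfied` are hypothetical names made up for this example; they are not taken from the patch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical operand descriptor (not LLVM's real data structure).
// TiedTo is the index of the def operand this operand must share a register
// with, or -1 if the operand is unconstrained.
struct OperandDesc {
  bool IsDef;
  int TiedTo;
};

// Returns true if the register chosen for every tied operand matches the
// register chosen for the def it is tied to.
bool tiesSatisfied(const std::vector<OperandDesc> &Ops,
                   const std::vector<unsigned> &Regs) {
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    int Def = Ops[I].TiedTo;
    if (Def >= 0 && Regs[I] != Regs[static_cast<std::size_t>(Def)])
      return false;
  }
  return true;
}
```

With operands `{def, use, use tied to 0}`, the assignment `{5, 6, 5}` passes and `{5, 5, 6}` fails: the tie is on the third operand, not on the first source as the old A := A op C form would require.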
Oct 12, 2006
- restore my previous patch, now that the X86 backend bug has been fixed: · c040e533
  Chris Lattner authored Oct 12, 2006
```
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20061009/038518.html

llvm-svn: 30906
```
- Backing out Chris' last commit. It's breaking llvm-gcc bootstrapping. · c935741b
  Evan Cheng authored Oct 12, 2006
```
It's turning:
        movl -24(%ebp), %esp
        subl $16, %esp
        movl -24(%ebp), %ecx
into
        movl -24(%ebp), %esp
        subl $16, %esp
        movl %esp, (%esp)

llvm-svn: 30902
```
- If we see a load from a stack slot into a physreg, consider it as providing · 86a012ab
  Chris Lattner authored Oct 12, 2006
```
the stack slot.  This fixes PR943.

llvm-svn: 30898
```
Sep 05, 2006

Fix a long-standing wart in the code generator: two-address instruction lowering · 13a5dcdd

Chris Lattner authored Sep 05, 2006

actually *removes* one of the operands, instead of just assigning both operands
the same register.  This makes reasoning about instructions unnecessarily complex,
because you need to know if you are before or after register allocation to match
up operand #'s with the target description file.

Changing this also gets rid of a bunch of hacky code in various places.

This patch also includes changes to fold loads into cmp/test instructions in
the X86 backend, along with a significant simplification to the X86 spill
folding code.

llvm-svn: 30108


Aug 27, 2006
- s|llvm/Support/Visibility.h|llvm/Support/Compiler.h| · 3d27be13
  Chris Lattner authored Aug 27, 2006
```
llvm-svn: 29911
```
Aug 25, 2006

Take advantage of the recent improvements to the liveintervals set (tracking · bdf12106

Chris Lattner authored Aug 24, 2006

instructions which define each value#) to simplify and improve the coalescer.
In particular, this patch:

1. Implements iterative coalescing.
2. Reverts an unsafe hack from handlePhysRegDef, superseding it with a
   better solution.
3. Implements PR865, "coalescing" away the second copy in code like:

   A = B
   ...
   B = A

This also includes changes to symbolically print registers in intervals
when possible.

llvm-svn: 29862


Aug 21, 2006
- Added a check so that if we have two machine instructions in this form · 04f22464
  Bill Wendling authored Aug 21, 2006
```
    MOV R0, R1
    MOV R1, R0

the second machine instruction is removed. Added a regression test.

llvm-svn: 29792
```
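A minimal sketch of the check described in the commit above, assuming a toy `Move` record rather than LLVM's machine instructions (`Move` and `removeBackCopies` are hypothetical names). It only looks at moves that are directly adjacent, as in the commit message.

```cpp
#include <vector>

// Hypothetical stand-in for a register-to-register move (not LLVM's MachineInstr).
struct Move {
  unsigned Dst;
  unsigned Src;
};

// Drops a move that copies a value straight back to where it just came from,
// i.e. the second instruction of "MOV R0, R1; MOV R1, R0".
std::vector<Move> removeBackCopies(const std::vector<Move> &In) {
  std::vector<Move> Out;
  for (const Move &M : In) {
    // After the previous move both registers hold the same value, so the
    // reverse move changes nothing and can be skipped.
    if (!Out.empty() && Out.back().Dst == M.Src && Out.back().Src == M.Dst)
      continue;
    Out.push_back(M);
  }
  return Out;
}
```

The removal does not depend on which operand is the destination: once the first move has executed, the two registers are equal, so the reverse move is a no-op in either operand order.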
Jul 21, 2006
- Eliminate data relocations by using NULL instead of global empty list. · 4b49c235
  Jim Laskey authored Jul 21, 2006
```
llvm-svn: 29250
```
Jul 20, 2006
- Reduce number of exported symbols · c496b418
  Andrew Lenharth authored Jul 20, 2006
```
llvm-svn: 29220
```
Jun 29, 2006
- Shave another 27K off libllvmgcc.dylib with visibility hidden · e097e6f7
  Chris Lattner authored Jun 28, 2006
```
llvm-svn: 28973
```
May 04, 2006
- Move some methods out of MachineInstr into MachineOperand · 10d63416
  Chris Lattner authored May 04, 2006
```
llvm-svn: 28102
```
May 02, 2006

Fix a latent bug that my spiller patch last week exposed: we were leaving · fd0a5478

Chris Lattner authored May 01, 2006

instructions in the virtregfolded map that were deleted.  Because they
were deleted, newly allocated instructions could end up at the same address,
magically finding themselves in the map.  The solution is to remove entries
from the map when we delete the instructions.

llvm-svn: 28041

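The failure mode above is a general hazard of maps keyed by object address. The sketch below shows the fix in miniature, assuming a toy `Instr` type and a hypothetical `FoldedInstrMap` wrapper; neither name comes from the patch.

```cpp
#include <cstddef>
#include <map>

// Hypothetical instruction type; only its address matters here.
struct Instr {
  int Opcode;
};

// Hypothetical side table: instruction -> virtual registers folded into it.
class FoldedInstrMap {
  std::multimap<const Instr *, unsigned> Folded;

public:
  void recordFold(const Instr *MI, unsigned VirtReg) {
    Folded.insert({MI, VirtReg});
  }
  // Removes every entry for MI. Must run before MI's storage is released,
  // otherwise a later allocation at the same address would inherit the entries.
  void forget(const Instr *MI) { Folded.erase(MI); }
  std::size_t size() const { return Folded.size(); }
};

// Deleting an instruction and purging its map entries happen together.
void deleteInstr(FoldedInstrMap &Map, Instr *MI) {
  Map.forget(MI);
  delete MI;
}
```

Routing every deletion through one helper like `deleteInstr` is one way to keep the two steps from drifting apart; whether the original code is structured this way is not stated in the commit.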

May 01, 2006
- When promoting a load to a reg-reg copy, where the load was a previous · ab7dbe0c
  Chris Lattner authored May 01, 2006
```
instruction folded with spill code, make sure to remove the load from
the virt reg folded map.

llvm-svn: 28040
```
- Remove previous patch, which wasn't quite right. · 4dee67c2
  Chris Lattner authored May 01, 2006
```
llvm-svn: 28039
```
- Remove temp. option -spiller-check-liveout, it didn't cause any failure nor... · a6562426
  Evan Cheng authored May 01, 2006
```
Remove temp. option -spiller-check-liveout, it didn't cause any failure nor performance regressions.

llvm-svn: 28029
```
Apr 30, 2006

Local spiller kills a store if the folded restore is turned into a copy. · f71f0f2e

Evan Cheng authored Apr 30, 2006

But this is incorrect if the spilled value's live range extends beyond the
current BB.
It is currently controlled by a temporary option -spiller-check-liveout.

llvm-svn: 28024


Apr 28, 2006

Mapping of physregs can make it so that the designated and input physregs are · 79c50d96
Chris Lattner authored Apr 28, 2006
```
the same.  In this case, don't emit a noop copy.

llvm-svn: 28008
```

When we have a two-address instruction where the input cannot be clobbered · 84e95d00

Chris Lattner authored Apr 28, 2006

and is already available, instead of falling back to emitting a load, fall
back to emitting a reg-reg copy.  This generates significantly better code
for some SSE testcases, as SSE has lots of two-address instructions and
none of them are read/modify/write.  As one example, this change does:

        pshufd %XMM5, XMMWORD PTR [%ESP + 84], 255
        xorps %XMM2, %XMM5
        cmpltps %XMM1, %XMM0
-       movaps XMMWORD PTR [%ESP + 52], %XMM0
-       movapd %XMM6, XMMWORD PTR [%ESP + 52]
+       movaps %XMM6, %XMM0
        cmpltps %XMM6, XMMWORD PTR [%ESP + 68]
        movapd XMMWORD PTR [%ESP + 52], %XMM6
        movaps %XMM6, %XMM0
        cmpltps %XMM6, XMMWORD PTR [%ESP + 36]
        cmpltps %XMM3, %XMM0
-       movaps XMMWORD PTR [%ESP + 20], %XMM0
-       movapd %XMM7, XMMWORD PTR [%ESP + 20]
+       movaps %XMM7, %XMM0
        cmpltps %XMM7, XMMWORD PTR [%ESP + 4]
        movapd XMMWORD PTR [%ESP + 20], %XMM7
        cmpltps %XMM4, %XMM0

... which is far better than a store followed by a load!

llvm-svn: 28001


Feb 25, 2006

Fix a bug that Evan exposed with some changes he's making, and that was · 7d01f95a

Chris Lattner authored Feb 25, 2006

exposed with a fastcc problem (breaking pcompress2 on x86 with -enable-x86-fastcc).

When reloading a reused reg, make sure to invalidate the reloaded reg, and
check to see if there are any other pending uses of the same register.

llvm-svn: 26369


Remove debugging printout :) · 28a0b8be
Chris Lattner authored Feb 25, 2006
```
Add a minor compile time win, no codegen change.

llvm-svn: 26368
```

Refactor some code from being inline to being out in a new class with methods. · 525522e4

Chris Lattner authored Feb 25, 2006

This gets rid of two gotos, which is always nice, and also adds some comments.

No functionality change, this is just a refactor.

llvm-svn: 26367


Feb 04, 2006

Fix VC++ warning. · 57a004ab
Jeff Cohen authored Feb 04, 2006
```
llvm-svn: 25957
```
Handle another case exposed on X86. · c93403a7
Chris Lattner authored Feb 03, 2006
```
llvm-svn: 25949
```

Fix a nasty problem on two-address machines in the following situation: · 71d20c4e

Chris Lattner authored Feb 03, 2006

store EAX -> [ss#0]
[ss#0] += 1
...
use(EAX)

In this case, it is not valid to rewrite this as:


store EAX -> [ss#0]
EAX += 1
store EAX -> [ss#0]  ;;; this would also delete the store above
...
use(EAX)

... because EAX is not dead at that point.  Keep track of which registers
we are allowed to clobber, and which ones we aren't, and don't clobber the
ones we're not supposed to.  :)

This should resolve the issues on X86 last night.

llvm-svn: 25948

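One way to picture "keep track of which registers we are allowed to clobber" is to attach a flag to each available value. The sketch below is an assumption about the shape of such bookkeeping, not the code from the patch; `SlotValue`, `ClobberAwareSlots`, and the convention that register 0 means "none" are all hypothetical.

```cpp
#include <map>

// Hypothetical record of a physical register that mirrors a stack slot.
struct SlotValue {
  unsigned Reg;
  bool CanClobber; // false while the register's own value is still needed
};

class ClobberAwareSlots {
  std::map<int, SlotValue> Avail; // stack slot -> register holding the same value

public:
  // A store of Reg to Slot makes Slot's value available in Reg. The register
  // may only be overwritten later if it is dead after the store.
  void noteStore(int Slot, unsigned Reg, bool RegDeadAfterStore) {
    Avail[Slot] = SlotValue{Reg, RegDeadAfterStore};
  }

  // Register a read-only use of Slot may be rewritten to (0 means none).
  unsigned regForRead(int Slot) const {
    auto It = Avail.find(Slot);
    return It == Avail.end() ? 0 : It->second.Reg;
  }

  // Register a read-modify-write use of Slot may be rewritten to (0 means none).
  unsigned regForModify(int Slot) const {
    auto It = Avail.find(Slot);
    return (It != Avail.end() && It->second.CanClobber) ? It->second.Reg : 0;
  }
};
```

In the `[ss#0] += 1` example, EAX would be recorded with `CanClobber == false` because `use(EAX)` follows, so `regForModify` refuses to hand it out while `regForRead` still can.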

significantly simplify the VirtRegMap code by pulling the SpillSlotsAvailable · 507a3a7b

Chris Lattner authored Feb 03, 2006

and PhysRegsAvailable maps out into a new AvailableSpills struct.  No
functionality change.

This paves the way for a bugfix, coming up next.

llvm-svn: 25947


Feb 03, 2006

Fix VC++ compilation error caused by using a std::map iterator variable to receive · 3276ff7a
Jeff Cohen authored Feb 03, 2006
```
a std::multimap iterator value.  For some reason, GCC doesn't have a problem with this.

llvm-svn: 25927
```
Remove more copies and dead stuff by not clobbering the result reg of a noop copy. · e18ef0d4
Chris Lattner authored Feb 03, 2006
```
llvm-svn: 25926
```
Simplify some code · 774d4a19
Chris Lattner authored Feb 03, 2006
```
llvm-svn: 25924
```

Add code that checks for noop copies, which triggers when either: · 1ef239af

Chris Lattner authored Feb 03, 2006

1. a target doesn't know how to fold load/stores into copies, or
2. the spiller rewrites the input to a copy to the same register as the dest
   instead of to the reloaded reg.

This will be moved/improved in the near future, but allows elimination of
some ancient x86 hacks.  This eliminates 92 copies from SMG2000 on X86 and
163 copies from 252.eon.

llvm-svn: 25922


Physregs may hold multiple stack slot values at the same time. Keep track · b7f24de4

Chris Lattner authored Feb 03, 2006

of this, and use it to our advantage (bwahahah).  This allows us to eliminate another
60 instructions from smg2000 on PPC (probably significantly more on X86).  A common
old-new diff looks like this:

        stw r2, 3304(r1)
-       lwz r2, 3192(r1)
        stw r2, 3300(r1)
-       lwz r2, 3192(r1)
        stw r2, 3296(r1)
-       lwz r2, 3192(r1)
        stw r2, 3200(r1)
-       lwz r2, 3192(r1)
        stw r2, 3196(r1)
-       lwz r2, 3192(r1)
+       or r2, r2, r2
        stw r2, 3188(r1)

and

-       lwz r31, 604(r1)
-       lwz r13, 604(r1)
-       lwz r14, 604(r1)
-       lwz r15, 604(r1)
-       lwz r16, 604(r1)
-       lwz r30, 604(r1)
+       or r31, r30, r30
+       or r13, r30, r30
+       or r14, r30, r30
+       or r15, r30, r30
+       or r16, r30, r30
+       or r30, r30, r30

Removal of the R = R copies is coming next...

llvm-svn: 25919

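The bookkeeping this implies (one register mirroring several stack slots at once) can be sketched with a map in each direction. This is an illustrative assumption, not the patch's data structure; `AvailableValues` and its methods are hypothetical names, and register 0 stands for "no register".

```cpp
#include <map>

class AvailableValues {
  std::map<int, unsigned> SlotToReg;       // stack slot -> register holding its value
  std::multimap<unsigned, int> RegToSlots; // register -> every slot it mirrors

public:
  // Slot's value is now also held in Reg (after a load from or a store to Slot).
  void setAvailable(int Slot, unsigned Reg) {
    forgetSlot(Slot); // the slot may have been mirrored by another register
    SlotToReg[Slot] = Reg;
    RegToSlots.insert({Reg, Slot});
  }

  // Register holding Slot's value, or 0 if it is only in memory.
  unsigned getReg(int Slot) const {
    auto It = SlotToReg.find(Slot);
    return It == SlotToReg.end() ? 0 : It->second;
  }

  // Reg was overwritten: none of the slots it mirrored are available in it now.
  void clobberReg(unsigned Reg) {
    auto Range = RegToSlots.equal_range(Reg);
    for (auto It = Range.first; It != Range.second; ++It)
      SlotToReg.erase(It->second);
    RegToSlots.erase(Range.first, Range.second);
  }

  // Slot's contents changed or are no longer tracked.
  void forgetSlot(int Slot) {
    auto It = SlotToReg.find(Slot);
    if (It == SlotToReg.end())
      return;
    auto Range = RegToSlots.equal_range(It->second);
    for (auto R = Range.first; R != Range.second; ++R) {
      if (R->second == Slot) {
        RegToSlots.erase(R);
        break;
      }
    }
    SlotToReg.erase(It);
  }
};
```

In the first diff above, r2 is loaded from offset 3192 and then stored to 3304, 3300, and so on. Recording each store with `setAvailable` keeps r2 associated with all of those slots, which is what lets the repeated `lwz r2, 3192(r1)` reloads be dropped.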

Fix a deficiency in the spiller that Evan noticed. In particular, consider · f3aef1b0

Chris Lattner authored Feb 02, 2006

this code:

  store [stack slot #0],  R10
    = add R14, [stack slot #0]

The spiller didn't know that the store made the value of [stack slot #0] available
in R10 *IF* the store came from a copy instruction with the store folded into it.

This patch teaches VirtRegMap to look at these stores and recognize the values
they make available.  In one case Evan provided, this code:

        divsd %XMM0, %XMM1
        movsd %XMM1, QWORD PTR [%ESP + 40]
1)      movsd QWORD PTR [%ESP + 48], %XMM1
2)      movsd %XMM1, QWORD PTR [%ESP + 48]
        addsd %XMM1, %XMM0
3)      movsd QWORD PTR [%ESP + 48], %XMM1
        movsd QWORD PTR [%ESP + 4], %XMM0

turns into:

        divsd %XMM0, %XMM1
        movsd %XMM1, QWORD PTR [%ESP + 40]
        addsd %XMM1, %XMM0
3)      movsd QWORD PTR [%ESP + 48], %XMM1
        movsd QWORD PTR [%ESP + 4], %XMM0

In this case, instruction #2 was removed because of the value made
available by #1, and inst #1 was later deleted because it is now
never used before the stack slot is redefined by #3.

This occurs here and there in a lot of code with high spilling.  On PPC,
most of the removed loads/stores are LSU-reject-causing loads, which is
nice.

On X86, things are much better (because it spills more), where we nuke
about 1% of the instructions from SMG2000 and several hundred from eon.

More improvements to come...

llvm-svn: 25917

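The second half of the example (store #1 becoming dead once load #2 is gone) can be sketched as a single forward scan over the stack-slot accesses of one basic block. `SlotAccess` and `markDeadStores` are hypothetical names, and the scan deliberately ignores everything except loads and stores of slots.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical record of one stack-slot access inside a single basic block.
struct SlotAccess {
  bool IsStore;
  int Slot;
};

// Returns one flag per access: false means the store can be deleted because
// the same slot is stored again before any load reads it.
std::vector<bool> markDeadStores(const std::vector<SlotAccess> &Block) {
  std::vector<bool> Keep(Block.size(), true);
  std::map<int, std::size_t> UnreadStore; // slot -> index of latest unread store
  for (std::size_t I = 0; I < Block.size(); ++I) {
    const SlotAccess &A = Block[I];
    if (!A.IsStore) {
      UnreadStore.erase(A.Slot); // the pending store is read, so it is needed
      continue;
    }
    auto It = UnreadStore.find(A.Slot);
    if (It != UnreadStore.end())
      Keep[It->second] = false; // overwritten before being read
    UnreadStore[A.Slot] = I;
  }
  return Keep;
}
```

The last store to each slot is always kept, since the value may still be needed after the block ends; the sketch makes no attempt to reason across blocks.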

Feb 02, 2006

Move isLoadFrom/StoreToStackSlot from MRegisterInfo to TargetInstrInfo, a far... · bb53acd0

Chris Lattner authored Feb 02, 2006

Move isLoadFrom/StoreToStackSlot from MRegisterInfo to TargetInstrInfo, a far more logical place.  Other methods should also be moved if anyone is interested. :)

llvm-svn: 25913


Jan 23, 2006
- Add explicit #includes of <iostream> · de02d772
  Chris Lattner authored Jan 22, 2006
```
llvm-svn: 25515
```
Jan 04, 2006
- Add an assertion, update DefInst even though no one uses it (dangling pointers · 05110552
  Chris Lattner authored Jan 04, 2006
```
don't help anyone)

llvm-svn: 25081
```
Oct 06, 2005

Fix the LLC regressions on X86 last night. In particular, when undoing · fabe55f1

Chris Lattner authored Oct 06, 2005

previous copy elisions and we discover we need to reload a register, make
sure to use the regclass of the original register for the reload, not the
class of the current register.  This avoids using 16-bit loads to reload 32-bit
values.

llvm-svn: 23645


Oct 05, 2005

Fix a bug in the local spiller, where we could take code like this: · 55149d78

Chris Lattner authored Oct 05, 2005

  store R12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = load [ss#2]
  R4 = load [ss#1]

and turn it into this code:

  store R12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = R12
  R4 = R3    <- oops!

The problem was that promoting R3 = load[ss#2] to a copy missed the fact that
the instruction invalidated R3 at that point.

llvm-svn: 23638


Sep 30, 2005
- Change this code to pass register classes into the stack slot spiller/reloader · 5a6199f3
  Chris Lattner authored Sep 30, 2005
```
code.  PrologEpilogInserter hasn't been updated yet though, so targets cannot
use this info.

llvm-svn: 23536
```
Sep 19, 2005

Teach the local spiller to turn stack slot loads into register-register copies · 2f838f21

Chris Lattner authored Sep 19, 2005

when possible, avoiding the load (and avoiding the copy if the value is already
in the right register).

This patch came about when I noticed code like the following being generated:

  store R17 -> [SS1]
  ...blah...
  R4 = load [SS1]

This was causing an LSU reject on the G5.  This problem was due to the register
allocator folding spill code into a reg-reg copy (producing the load), which
prevented the spiller from being able to rewrite the load into a copy, despite
the fact that the value was already available in a register.  In the case
above, we now rip out the R4 load and replace it with a R4 = R17 copy.

This speeds up several programs on X86 (which spills a lot :) ), e.g.
smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from
68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc.  This may have a larger
impact in some cases on the G5 (by avoiding LSU rejects), though it probably
won't trigger as often (less spilling in general).

Targets that implement folding of loads/stores into copies should implement
the isLoadFromStackSlot hook to get this.

llvm-svn: 23388

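The rewrite described above can be sketched on a toy instruction list. This is a simplified model under stated assumptions, not the spiller's code: `Kind`, `Inst`, and `rewriteReloads` are hypothetical, each slot is tracked in at most one register, and only a single basic block is considered.

```cpp
#include <iterator>
#include <map>
#include <vector>

enum class Kind { Store, Load, Copy, Def };

// Hypothetical instruction: Store writes Reg to Slot, Load reads Slot into Reg,
// Copy moves Src into Reg, Def is any other instruction that overwrites Reg.
struct Inst {
  Kind K;
  unsigned Reg;
  int Slot;
  unsigned Src;
};

std::vector<Inst> rewriteReloads(const std::vector<Inst> &Block) {
  std::map<int, unsigned> Avail; // stack slot -> register known to hold its value

  // Overwriting Reg invalidates every slot value it was holding.
  auto Clobber = [&Avail](unsigned Reg) {
    for (auto It = Avail.begin(); It != Avail.end();)
      It = (It->second == Reg) ? Avail.erase(It) : std::next(It);
  };

  std::vector<Inst> Out;
  for (Inst I : Block) {
    if (I.K == Kind::Store) {
      Avail[I.Slot] = I.Reg; // the stored register now mirrors the slot
    } else if (I.K == Kind::Load) {
      auto It = Avail.find(I.Slot);
      if (It != Avail.end() && It->second == I.Reg)
        continue; // value is already in the right register: drop the load
      if (It != Avail.end()) {
        I = Inst{Kind::Copy, I.Reg, 0, It->second}; // load becomes reg-reg copy
        Clobber(I.Reg);
      } else {
        Clobber(I.Reg);
        Avail[I.Slot] = I.Reg; // a real load also makes the value available
      }
    } else {
      Clobber(I.Reg);
    }
    Out.push_back(I);
  }
  return Out;
}
```

For `store R17 -> [SS1]; ...; R4 = load [SS1]` this produces `R4 = R17`. The `Clobber` call on the destination register matters even when the load becomes a copy: the register's previous contents are gone either way, so any slot it used to mirror must be forgotten.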