Commits · e18ef0d4a629ed60d96372c553484e87a37f9d86 · Roger Ferrer / llvm-epi-0.8

Feb 03, 2006

Remove move copies and dead stuff by not clobbering the result reg of a noop copy. · e18ef0d4
Chris Lattner authored Feb 03, 2006
```
llvm-svn: 25926
```
e18ef0d4
Simplify some code · 774d4a19
Chris Lattner authored Feb 03, 2006
```
llvm-svn: 25924
```
774d4a19

Add code that checks for noop copies, which triggers when either: · 1ef239af

Chris Lattner authored Feb 03, 2006

1. a target doesn't know how to fold load/stores into copies, or
2. the spiller rewrites the input to a copy to the same register as the dest
   instead of to the reloaded reg.

This will be moved/improved in the near future, but allows elimination of
some ancient x86 hacks.  This eliminates 92 copies from SMG2000 on X86 and
163 copies from 252.eon.

llvm-svn: 25922

1ef239af
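
The noop-copy check in r25922 can be illustrated with a minimal Python sketch. This is not the LLVM code: the `(opcode, dest, sources)` tuple form and the name `remove_noop_copies` are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical instruction form: (opcode, destination register, source registers).
Instr = Tuple[str, str, Tuple[str, ...]]

def remove_noop_copies(instrs: List[Instr]) -> List[Instr]:
    """Drop copies whose single source is the same register as the destination."""
    kept = []
    for op, dest, srcs in instrs:
        if op == "copy" and srcs == (dest,):
            continue  # R = R changes nothing, so the instruction can be deleted
        kept.append((op, dest, srcs))
    return kept

code = [
    ("copy", "r30", ("r30",)),        # noop copy, e.g. "or r30, r30, r30" on PPC
    ("copy", "r31", ("r30",)),        # real copy, kept
    ("add",  "r3",  ("r31", "r30")),  # unrelated instruction, kept
]
cleaned = remove_noop_copies(code)
```

Here `cleaned` keeps only the second and third instructions.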

Added case HANDLENODE to getOperationName(). · 02b5b9cd
Evan Cheng authored Feb 03, 2006
```
llvm-svn: 25920
```
02b5b9cd

Physregs may hold multiple stack slot values at the same time. Keep track · b7f24de4

Chris Lattner authored Feb 03, 2006

of this, and use it to our advantage (bwahahah).  This allows us to eliminate another
60 instructions from smg2000 on PPC (probably significantly more on X86).  A common
old-new diff looks like this:

        stw r2, 3304(r1)
-       lwz r2, 3192(r1)
        stw r2, 3300(r1)
-       lwz r2, 3192(r1)
        stw r2, 3296(r1)
-       lwz r2, 3192(r1)
        stw r2, 3200(r1)
-       lwz r2, 3192(r1)
        stw r2, 3196(r1)
-       lwz r2, 3192(r1)
+       or r2, r2, r2
        stw r2, 3188(r1)

and

-       lwz r31, 604(r1)
-       lwz r13, 604(r1)
-       lwz r14, 604(r1)
-       lwz r15, 604(r1)
-       lwz r16, 604(r1)
-       lwz r30, 604(r1)
+       or r31, r30, r30
+       or r13, r30, r30
+       or r14, r30, r30
+       or r15, r30, r30
+       or r16, r30, r30
+       or r30, r30, r30

Removal of the R = R copies is coming next...

llvm-svn: 25919

b7f24de4
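
The idea in r25919, that one physical register can hold the values of several stack slots at once, might be sketched roughly as follows. This is a toy Python model, not the spiller itself: it assumes a single basic block containing only loads and stores, and the tuple forms and the name `reuse_spilled_values` are hypothetical.

```python
def reuse_spilled_values(instrs):
    """Rewrite reloads as register copies when a register already holds the slot.

    Input instructions are ("load", reg, slot) or ("store", reg, slot).
    Rewritten reloads are emitted as ("copy", dest_reg, src_reg).
    """
    slot_to_reg = {}  # slot -> register known to hold that slot's value
    out = []
    for op, reg, slot in instrs:
        if op == "store":
            out.append(("store", reg, slot))
            slot_to_reg[slot] = reg  # reg now holds this slot too, besides any others
            continue
        holder = slot_to_reg.get(slot)
        if holder == reg:
            out.append(("copy", reg, reg))  # noop copy; reg keeps all of its slots
            continue
        # reg is about to be overwritten: forget every slot it used to hold.
        slot_to_reg = {s: r for s, r in slot_to_reg.items() if r != reg}
        if holder is not None:
            out.append(("copy", reg, holder))  # value is live in another register
        else:
            out.append(("load", reg, slot))    # a real reload is still needed
            slot_to_reg[slot] = reg
    return out

prog = [
    ("load", "r30", 604),
    ("load", "r31", 604),
    ("load", "r13", 604),
    ("store", "r30", 608),
    ("load", "r30", 604),
]
rewritten = reuse_spilled_values(prog)
```

In this run only the first load survives. The two following loads become copies from `r30`, and the last one becomes the kind of `R = R` copy that the follow-up commit removes.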

Fix a deficiency in the spiller that Evan noticed. In particular, consider · f3aef1b0

Chris Lattner authored Feb 02, 2006

this code:

  store [stack slot #0],  R10
    = add R14, [stack slot #0]

The spiller didn't know that the store made the value of [stack slot #0] available
in R10 *IF* the store came from a copy instruction with the store folded into it.

This patch teaches VirtRegMap to look at these stores and recognize the values
they make available.  In one case Evan provided, this code:

        divsd %XMM0, %XMM1
        movsd %XMM1, QWORD PTR [%ESP + 40]
1)      movsd QWORD PTR [%ESP + 48], %XMM1
2)      movsd %XMM1, QWORD PTR [%ESP + 48]
        addsd %XMM1, %XMM0
3)      movsd QWORD PTR [%ESP + 48], %XMM1
        movsd QWORD PTR [%ESP + 4], %XMM0

turns into:

        divsd %XMM0, %XMM1
        movsd %XMM1, QWORD PTR [%ESP + 40]
        addsd %XMM1, %XMM0
3)      movsd QWORD PTR [%ESP + 48], %XMM1
        movsd QWORD PTR [%ESP + 4], %XMM0

In this case, instruction #2 was removed because of the value made
available by #1, and inst #1 was later deleted because it is now
never used before the stack slot is redefined by #3.

This occurs here and there in a lot of code with high spilling. On PPC,
most of the removed loads/stores are LSU-reject-causing loads, which is
nice.

On X86, things are much better (because it spills more), where we nuke
about 1% of the instructions from SMG2000 and several hundred from eon.

More improvements to come...

llvm-svn: 25917

f3aef1b0
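
The two effects described in r25917 (a reload removed because an earlier store already left the value in the register, then the store itself removed because the slot is redefined before any use) can be sketched in Python. The sketch assumes a single basic block in which no other instruction reads a stack slot directly. The tuple forms and the name `remove_redundant_spill_code` are hypothetical and do not reflect the VirtRegMap interface.

```python
def remove_redundant_spill_code(instrs):
    """Instructions: ("store", slot, reg), ("load", reg, slot), ("def", reg, text)."""
    # Pass 1: drop a reload when the destination register already holds the slot.
    available = {}  # slot -> register known to hold the slot's current value
    kept = []
    for ins in instrs:
        if ins[0] == "store":
            _, slot, reg = ins
            available[slot] = reg
            kept.append(ins)
        elif ins[0] == "load":
            _, reg, slot = ins
            if available.get(slot) == reg:
                continue  # the value is already in reg
            available = {s: r for s, r in available.items() if r != reg}
            available[slot] = reg
            kept.append(ins)
        else:  # any other instruction that overwrites reg
            reg = ins[1]
            available = {s: r for s, r in available.items() if r != reg}
            kept.append(ins)

    # Pass 2: walking backwards, drop a store whose slot is stored again
    # before any load reads it.
    overwritten = set()
    result = []
    for ins in reversed(kept):
        if ins[0] == "store":
            if ins[1] in overwritten:
                continue
            overwritten.add(ins[1])
        elif ins[0] == "load":
            overwritten.discard(ins[2])
        result.append(ins)
    result.reverse()
    return result

prog = [
    ("def", "xmm0", "divsd xmm0, xmm1"),
    ("load", "xmm1", 40),
    ("store", 48, "xmm1"),                # 1)
    ("load", "xmm1", 48),                 # 2)
    ("def", "xmm1", "addsd xmm1, xmm0"),
    ("store", 48, "xmm1"),                # 3)
    ("store", 4, "xmm0"),
]
optimized = remove_redundant_spill_code(prog)
```

On this input the sketch drops instructions 1) and 2) and keeps 3), matching the shape of the before/after listing in the commit message.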

Feb 02, 2006

Move isLoadFrom/StoreToStackSlot from MRegisterInfo to TargetInstrInfo, a far... · bb53acd0

Chris Lattner authored Feb 02, 2006

Move isLoadFrom/StoreToStackSlot from MRegisterInfo to TargetInstrInfo, a far more logical place.  Other methods should also be moved if anyone is interested. :)

llvm-svn: 25913

bb53acd0

Turn any_extend nodes into zero_extend nodes when it allows us to remove an · 49beaf40

Chris Lattner authored Feb 02, 2006

and instruction.  This allows us to compile stuff like this:

bool %X(int %X) {
        %Y = add int %X, 14
        %Z = setne int %Y, 12345
        ret bool %Z
}

to this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        ret

instead of this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

This occurs quite a bit with the X86 backend.  For example, 25 times in
lambda, 30 times in 177.mesa, 14 times in galgel,  70 times in fma3d,
25 times in vpr, several hundred times in gcc, ~45 times in crafty,
~60 times in parser, ~140 times in eon, 110 times in perlbmk, 55 on gap,
16 times on bzip2, 14 times on twolf, and 1-2 times in many other SPEC2K
programs.

llvm-svn: 25901

49beaf40

add two dag combines: · 49ce3554

Chris Lattner authored Feb 02, 2006

(C1-X) == C2 --> X == C1-C2
(X+C1) == C2 --> X == C2-C1

This allows us to compile this:

bool %X(int %X) {
        %Y = add int %X, 14
        %Z = setne int %Y, 12345
        ret bool %Z
}

into this:

_X:
        cmpl $12331, 4(%esp)
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

not this:

_X:
        movl $14, %eax
        addl 4(%esp), %eax
        cmpl $12345, %eax
        setne %al
        movzbl %al, %eax
        andl $1, %eax
        ret

Testcase here: Regression/CodeGen/X86/compare-add.ll

nukage of the and coming up next.

llvm-svn: 25898

49ce3554
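
The two combines in r25898 can be checked numerically. The Python sketch below assumes 32-bit wraparound arithmetic with all values in the range 0 to 2^32 - 1. The helper names are hypothetical and are not part of the DAG combiner.

```python
MASK32 = 0xFFFFFFFF

def fold_add_compare(c1, c2):
    """(X + C1) == C2  -->  X == C2 - C1; returns the new constant."""
    return (c2 - c1) & MASK32

def fold_sub_compare(c1, c2):
    """(C1 - X) == C2  -->  X == C1 - C2; returns the new constant."""
    return (c1 - c2) & MASK32

def folds_agree(x, c1, c2):
    """True if both rewritten comparisons give the same answer as the originals."""
    add_ok = (((x + c1) & MASK32) == c2) == (x == fold_add_compare(c1, c2))
    sub_ok = (((c1 - x) & MASK32) == c2) == (x == fold_sub_compare(c1, c2))
    return add_ok and sub_ok

# The commit's example: (X + 14) != 12345 becomes X != 12331.
new_constant = fold_add_compare(14, 12345)
```

`new_constant` is 12331, the immediate that appears in the `cmpl` of the improved output.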

make -debug output less newliney · 0bd74558
Chris Lattner authored Feb 02, 2006
```
llvm-svn: 25895
```
0bd74558

Implement matching constraints. We can now say things like this: · 7f5880b1

Chris Lattner authored Feb 02, 2006

%C = call int asm "xyz $0, $1, $2, $3", "=r,r,r,0"(int %A, int %B, int 4)

and get:

xyz r2, r3, r4, r2

note that the r2's are pinned together.  Yaay for 2-address instructions.

llvm-svn: 25893

7f5880b1
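
How a matching constraint such as `0` pins an operand to an earlier operand's register can be sketched as follows. This is a toy Python allocator under simplifying assumptions: registers are handed out in order from a caller-supplied pool, only `r` and single-number constraints are handled, and the name `assign_registers` is hypothetical.

```python
def assign_registers(constraints, pool):
    """Pick one register per operand from a constraint string like "=r,r,r,0"."""
    regs = []
    free = iter(pool)
    for constraint in constraints.split(","):
        constraint = constraint.lstrip("=")  # '=' marks an output; ignored here
        if constraint.isdigit():
            # Matching constraint: reuse the register of the numbered operand.
            regs.append(regs[int(constraint)])
        else:
            # 'r': take the next free register from the pool.
            regs.append(next(free))
    return regs

operands = assign_registers("=r,r,r,0", ["r2", "r3", "r4", "r5"])
```

`operands` comes out as `["r2", "r3", "r4", "r2"]`, the same pairing of the two `r2` operands shown in the commit message.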

Feb 01, 2006

Implement smart printing of inline asm strings, handling variants and · aa23fa9f

Chris Lattner authored Feb 01, 2006

substituted operands.  For this testcase:

int %test(int %A, int %B) {
  %C = call int asm "xyz $0, $1, $2", "=r,r,r"(int %A, int %B)
  ret int %C
}

we now emit:

_test:
        or r2, r3, r3
        or r3, r4, r4
        xyz r2, r2, r3  ;; look here
        or r3, r2, r2
        blr

... note the substituted operands. :)

llvm-svn: 25886

aa23fa9f
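
The operand substitution described in r25886 can be sketched in Python. This covers only the `$N` operand references; asm variants are not modelled, and the name `print_inline_asm` is hypothetical rather than the AsmPrinter interface.

```python
import re

def print_inline_asm(template, operands):
    """Replace each $N in the asm template with the N-th operand's text."""
    def substitute(match):
        return operands[int(match.group(1))]
    return re.sub(r"\$(\d+)", substitute, template)

line = print_inline_asm("xyz $0, $1, $2", ["r2", "r2", "r3"])
```

`line` is `"xyz r2, r2, r3"`, the instruction marked "look here" in the commit's output.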

*** empty log message *** · 01bd9d99
Nate Begeman authored Feb 01, 2006
```
llvm-svn: 25879
```
01bd9d99

Implement simple register assignment for inline asms. This allows us to compile: · 1558fc64

Chris Lattner authored Feb 01, 2006

int %test(int %A, int %B) {
  %C = call int asm "xyz $0, $1, $2", "=r,r,r"(int %A, int %B)
  ret int %C
}

into:

 (0x8906130, LLVM BB @0x8902220):
        %r2 = OR4 %r3, %r3
        %r3 = OR4 %r4, %r4
        INLINEASM <es:xyz $0, $1, $2>, %r2<def>, %r2, %r3
        %r3 = OR4 %r2, %r2
        BLR

which asmprints as:

_test:
        or r2, r3, r3
        or r3, r4, r4
        xyz $0, $1, $2      ;; need to print the operands now :)
        or r3, r2, r2
        blr

llvm-svn: 25878

1558fc64

Fix some of the stuff in the PPC README file, and clean up legalization · 7e7f439f
Nate Begeman authored Feb 01, 2006
```
of the SELECT_CC, BR_CC, and BRTWOWAY_CC nodes.

llvm-svn: 25875
```
7e7f439f
adjust to changes in InlineAsm interface. Fix a few minor bugs. · 3a5ed551
Chris Lattner authored Feb 01, 2006
```
llvm-svn: 25865
```
3a5ed551

Jan 31, 2006

Allow the specification of explicit alignments for constant pool entries. · 32be2dc0
Evan Cheng authored Jan 31, 2006
```
llvm-svn: 25855
```
32be2dc0
Allow custom lowering of fabs. I forgot to check in this change which · 2443ab93
Evan Cheng authored Jan 31, 2006
```
caused several test failures.

llvm-svn: 25852
```
2443ab93
Only insert an AND when converting from BR_COND to BRCC if needed. · e9721b29
Chris Lattner authored Jan 31, 2006
```
llvm-svn: 25832
```
e9721b29

Handle physreg input/outputs. We now compile this: · 2e56e894

Chris Lattner authored Jan 31, 2006

int %test_cpuid(int %op) {
        %B = alloca int
        %C = alloca int
        %D = alloca int
        %A = call int asm "cpuid", "=eax,==ebx,==ecx,==edx,eax"(int* %B, int* %C, int* %D, int %op)
        %Bv = load int* %B
        %Cv = load int* %C
        %Dv = load int* %D
        %x = add int %A, %Bv
        %y = add int %x, %Cv
        %z = add int %y, %Dv
        ret int %z
}

to this:

_test_cpuid:
        sub %ESP, 16
        mov DWORD PTR [%ESP], %EBX
        mov %EAX, DWORD PTR [%ESP + 20]
        cpuid
        mov DWORD PTR [%ESP + 8], %ECX
        mov DWORD PTR [%ESP + 12], %EBX
        mov DWORD PTR [%ESP + 4], %EDX
        mov %ECX, DWORD PTR [%ESP + 12]
        add %EAX, %ECX
        mov %ECX, DWORD PTR [%ESP + 8]
        add %EAX, %ECX
        mov %ECX, DWORD PTR [%ESP + 4]
        add %EAX, %ECX
        mov %EBX, DWORD PTR [%ESP]
        add %ESP, 16
        ret

... note the proper register allocation.  :)

it is unclear to me why the loads aren't folded into the adds.

llvm-svn: 25827

2e56e894

Print the most trivial inline asms. · 57ecb561
Chris Lattner authored Jan 30, 2006
```
llvm-svn: 25822
```
57ecb561

Jan 30, 2006
- Fix a bug in my legalizer reworking that caused the X86 backend to not get · f263a237
  Chris Lattner authored Jan 30, 2006
```
a chance to custom legalize setcc, which broke a bunch of C++ codes.
Testcase here: CodeGen/X86/2006-01-30-LongSetcc.ll

llvm-svn: 25821
```
  f263a237
- don't insert an and node if it isn't needed here, this can prevent folding · d6f5ae44
  Chris Lattner authored Jan 30, 2006
```
of lowered target nodes.

llvm-svn: 25804
```
  d6f5ae44
- Move MaskedValueIsZero from the DAGCombiner to the TargetLowering... · f0b24d2d
  Chris Lattner authored Jan 30, 2006
```
Move MaskedValueIsZero from the DAGCombiner to the TargetLowering interface, making isMaskedValueZeroForTargetNode simpler, and usable from other parts of the compiler.

llvm-svn: 25803
```
  f0b24d2d
- pass the address of MaskedValueIsZero into isMaskedValueZeroForTargetNode, · 3b40e64a
  Chris Lattner authored Jan 30, 2006
```
to permit recursion

llvm-svn: 25799
```
  3b40e64a
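
The MaskedValueIsZero query moved in f0b24d2d, and the "only insert an AND if needed" decisions built on it, can be illustrated with a small Python sketch. The expression encoding, the node kinds, and the helper `and_is_redundant` are assumptions for illustration and do not mirror the real signature.

```python
def masked_value_is_zero(expr, mask):
    """Return True if every bit of expr selected by mask is provably zero."""
    if mask == 0:
        return True
    kind = expr[0]
    if kind == "const":   # ("const", value)
        return (expr[1] & mask) == 0
    if kind == "setcc":   # ("setcc",): assumed to produce only 0 or 1
        return (mask & 1) == 0
    if kind == "zext":    # ("zext", operand, source_bits)
        _, operand, bits = expr
        low = (1 << bits) - 1
        # Bits above the source width are zero; the rest depend on the operand.
        return masked_value_is_zero(operand, mask & low)
    if kind == "and":     # zero if the masked bits are all zero in one operand
        return masked_value_is_zero(expr[1], mask) or masked_value_is_zero(expr[2], mask)
    if kind == "or":      # zero only if the masked bits are zero in both operands
        return masked_value_is_zero(expr[1], mask) and masked_value_is_zero(expr[2], mask)
    return False          # unknown node: assume nothing

def and_is_redundant(value, and_mask, width=32):
    """An AND is unnecessary when every bit it would clear is already zero."""
    cleared = ~and_mask & ((1 << width) - 1)
    return masked_value_is_zero(value, cleared)

flag = ("zext", ("setcc",), 8)  # like: setne %al ; movzbl %al, %eax
redundant = and_is_redundant(flag, 1)
```

For the `setne`/`movzbl` pattern above, `redundant` is `True`, which is the situation in which the trailing `andl $1, %eax` can be dropped.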
Jan 29, 2006
- Fix RET of promoted values on targets that custom expand RET to a target node. · 4d1ea71a
  Chris Lattner authored Jan 29, 2006
```
llvm-svn: 25794
```
  4d1ea71a
- cleanups to the ValueTypeActions interface · 2c748afd
  Chris Lattner authored Jan 29, 2006
```
llvm-svn: 25785
```
  2c748afd
- Remove some special case hacks for CALLSEQ_*, using UpdateNodeOperands · ccb4476c
  Chris Lattner authored Jan 29, 2006
```
instead.

llvm-svn: 25780
```
  ccb4476c
- Allow custom expansion of ConstantVec nodes. PPC will use this in the future. · 2f292789
  Chris Lattner authored Jan 29, 2006
```
llvm-svn: 25774
```
  2f292789
- Legalize ConstantFP into TargetConstantFP when the target allows. Implement · 758b0ac5
  Chris Lattner authored Jan 29, 2006
```
custom expansion of ConstantFP nodes.

llvm-svn: 25772
```
  758b0ac5
- eliminate uses of SelectionDAG::getBR2Way_CC · 678da988
  Chris Lattner authored Jan 29, 2006
```
llvm-svn: 25767
```
  678da988
Jan 28, 2006

Use the new "UpdateNodeOperands" method to simplify LegalizeDAG and make it · d02b0547

Chris Lattner authored Jan 28, 2006

faster.  This cuts about 120 lines of code out of the legalizer (mostly code
checking to see if operands have changed).

It also fixes an ugly performance issue, where the legalizer cloned the entire
graph after any change.  Now the "UpdateNodeOperands" method gives it a chance
to reuse nodes if the operands of a node change but not its opcode or valuetypes.

This speeds up instruction selection time on kimwitu++ by about 8.2% with a
release build.

llvm-svn: 25746

d02b0547
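
The reuse that "UpdateNodeOperands" enables in r25746 might look roughly like the Python sketch below. It is a toy model, not SelectionDAG: nodes are uniqued in a dictionary keyed by opcode and operand identity, and the `Dag` and `Node` classes with their methods are hypothetical.

```python
class Node:
    def __init__(self, opcode, operands):
        self.opcode = opcode
        self.operands = tuple(operands)

class Dag:
    def __init__(self):
        self.cse = {}     # (opcode, operand identities) -> Node
        self.created = 0  # number of nodes allocated so far

    def _key(self, opcode, operands):
        return (opcode, tuple(id(o) for o in operands))

    def get_node(self, opcode, operands=()):
        """Return the unique node for this opcode and operands, creating it if needed."""
        key = self._key(opcode, operands)
        node = self.cse.get(key)
        if node is None:
            node = Node(opcode, operands)
            self.cse[key] = node
            self.created += 1
        return node

    def update_node_operands(self, node, new_operands):
        """Change a node's operands without allocating a new node when possible."""
        new_operands = tuple(new_operands)
        if node.operands == new_operands:
            return node      # nothing changed
        key = self._key(node.opcode, new_operands)
        existing = self.cse.get(key)
        if existing is not None:
            return existing  # an equivalent node already exists
        del self.cse[self._key(node.opcode, node.operands)]
        node.operands = new_operands  # mutate in place instead of cloning
        self.cse[key] = node
        return node

dag = Dag()
a, b, c = dag.get_node("a"), dag.get_node("b"), dag.get_node("c")
add = dag.get_node("add", (a, b))
updated = dag.update_node_operands(add, (a, c))
```

After the update, `updated` is the same object as `add` and no extra node has been allocated, which is the saving the commit describes.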

add another method variant · 580b12ad
Chris Lattner authored Jan 28, 2006
```
llvm-svn: 25744
```
580b12ad
add some methods for updating nodes · f34156e8
Chris Lattner authored Jan 28, 2006
```
llvm-svn: 25742
```
f34156e8
minor tweaks · eb637514
Chris Lattner authored Jan 28, 2006
```
llvm-svn: 25740
```
eb637514
move a bunch of code, no other change. · 689bdcc9
Chris Lattner authored Jan 28, 2006
```
llvm-svn: 25739
```
689bdcc9
remove a couple more now-extraneous legalizeop's · fcfda5a1
Chris Lattner authored Jan 28, 2006
```
llvm-svn: 25738
```
fcfda5a1
fix a bug · 364b89a7
Chris Lattner authored Jan 28, 2006
```
llvm-svn: 25737
```
364b89a7

Several major changes: · 9dcce6da

Chris Lattner authored Jan 28, 2006

1. Pull out the expand cases for BSWAP and CT* into a separate function,
   reducing the size of LegalizeOp.
2. Fix a bug where expand(bswap i64) was wrong when i64 is legal.
3. Changed LegalizeOp/PromoteOp so that the legalizer never needs to be
   iterative.  It now operates in a single pass over the nodes.
4. Simplify a LOT of code, with a net reduction of ~280 lines.

llvm-svn: 25736

9dcce6da
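
For readers unfamiliar with what "expanding" BSWAP and the CT* nodes means in r25736, here is a Python sketch of the usual shift-and-mask formulations. These are the textbook expansions and are not claimed to match the node sequences LegalizeDAG emits; the function names are hypothetical.

```python
def bswap32(x):
    """Byte-swap a 32-bit value using only shifts, masks, and ORs."""
    return (((x << 24) & 0xFF000000) |
            ((x << 8) & 0x00FF0000) |
            ((x >> 8) & 0x0000FF00) |
            ((x >> 24) & 0x000000FF))

def bswap64(x):
    """Byte-swap a 64-bit value by swapping each 32-bit half and exchanging them."""
    lo = x & 0xFFFFFFFF
    hi = (x >> 32) & 0xFFFFFFFF
    return (bswap32(lo) << 32) | bswap32(hi)

def ctpop32(x):
    """Count set bits in a 32-bit value with the parallel bit-summing trick."""
    x = x - ((x >> 1) & 0x55555555)
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)
    x = (x + (x >> 4)) & 0x0F0F0F0F
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24

swapped = bswap32(0x11223344)
```

`swapped` is `0x44332211`.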

Eliminate the need for ExpandOp to set 'needsanotheriteration', as it already · fd4a7f76

Chris Lattner authored Jan 28, 2006

relegalizes the stuff it returns.

Add the ability to custom expand ADD/SUB, so that targets don't need to deal
with ADD_PARTS/SUB_PARTS if they don't want.

Fix some obscure potential bugs and simplify code.

llvm-svn: 25732

fd4a7f76