- Feb 03, 2006
-
-
Chris Lattner authored
llvm-svn: 25926
-
Chris Lattner authored
llvm-svn: 25924
-
Chris Lattner authored
1. a target doesn't know how to fold load/stores into copies, or 2. the spiller rewrites the input to a copy to the same register as the dest instead of to the reloaded reg. This will be moved/improved in the near future, but allows elimination of some ancient x86 hacks. This eliminates 92 copies from SMG2000 on X86 and 163 copies from 252.eon. llvm-svn: 25922
-
Evan Cheng authored
llvm-svn: 25920
-
Chris Lattner authored
of this, and use it to our advantage (bwahahah). This allows us to eliminate another 60 instructions from smg2000 on PPC (probably significantly more on X86). A common old-new diff looks like this:
          stw r2, 3304(r1)
        - lwz r2, 3192(r1)
          stw r2, 3300(r1)
        - lwz r2, 3192(r1)
          stw r2, 3296(r1)
        - lwz r2, 3192(r1)
          stw r2, 3200(r1)
        - lwz r2, 3192(r1)
          stw r2, 3196(r1)
        - lwz r2, 3192(r1)
        + or r2, r2, r2
          stw r2, 3188(r1)
and
        - lwz r31, 604(r1)
        - lwz r13, 604(r1)
        - lwz r14, 604(r1)
        - lwz r15, 604(r1)
        - lwz r16, 604(r1)
        - lwz r30, 604(r1)
        + or r31, r30, r30
        + or r13, r30, r30
        + or r14, r30, r30
        + or r15, r30, r30
        + or r16, r30, r30
        + or r30, r30, r30
Removal of the R = R copies is coming next...
llvm-svn: 25919
-
Chris Lattner authored
this code:
        store [stack slot #0], R10
        = add R14, [stack slot #0]
The spiller didn't know that the store made the value of [stack slot #0] available in R10 *IF* the store came from a copy instruction with the store folded into it. This patch teaches VirtRegMap to look at these stores and recognize the values they make available. In one case Evan provided, this code:
             divsd %XMM0, %XMM1
             movsd %XMM1, QWORD PTR [%ESP + 40]
        1)   movsd QWORD PTR [%ESP + 48], %XMM1
        2)   movsd %XMM1, QWORD PTR [%ESP + 48]
             addsd %XMM1, %XMM0
        3)   movsd QWORD PTR [%ESP + 48], %XMM1
             movsd QWORD PTR [%ESP + 4], %XMM0
turns into:
             divsd %XMM0, %XMM1
             movsd %XMM1, QWORD PTR [%ESP + 40]
             addsd %XMM1, %XMM0
        3)   movsd QWORD PTR [%ESP + 48], %XMM1
             movsd QWORD PTR [%ESP + 4], %XMM0
In this case, instruction #2 was removed because of the value made available by #1, and inst #1 was later deleted because it is now never used before the stack slot is redefined by #3. This occurs here and there in a lot of code with high spilling. On PPC, most of the removed loads/stores are LSU-reject-causing loads, which is nice. On X86, things are much better (because it spills more): we nuke about 1% of the instructions from SMG2000 and several hundred from eon. More improvements to come...
llvm-svn: 25917
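A rough sketch of the idea, not the VirtRegMap code itself: track which register currently holds the value of each stack slot, treat a store as making the slot available in the stored register, and drop a reload that targets that same register. The names `Inst`, `Op`, `clobber`, and `removeRedundantLoads` are hypothetical, and the later deletion of the now-dead store (#1 above) is deliberately not modeled.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical toy instruction: Store writes Reg to Slot, Load reads Slot
// into Reg, Other redefines Reg (Slot is ignored for Other).
enum class Op { Store, Load, Other };
struct Inst {
  Op Kind;
  int Slot;
  int Reg;
};

// Forget every slot whose value was held in Reg, since Reg is being redefined.
static void clobber(std::map<int, int> &Avail, int Reg) {
  for (auto It = Avail.begin(); It != Avail.end();) {
    if (It->second == Reg)
      It = Avail.erase(It);
    else
      ++It;
  }
}

// Returns Code with reloads removed when the slot's value is already known
// to be in the destination register.
std::vector<Inst> removeRedundantLoads(const std::vector<Inst> &Code) {
  std::map<int, int> Avail; // stack slot -> register holding its value
  std::vector<Inst> Out;
  for (const Inst &I : Code) {
    switch (I.Kind) {
    case Op::Store:
      // The store makes the slot's value available in the stored register.
      Avail[I.Slot] = I.Reg;
      Out.push_back(I);
      break;
    case Op::Load: {
      auto It = Avail.find(I.Slot);
      if (It != Avail.end() && It->second == I.Reg)
        break; // value already in this register: drop the reload
      clobber(Avail, I.Reg);
      Avail[I.Slot] = I.Reg;
      Out.push_back(I);
      break;
    }
    case Op::Other:
      clobber(Avail, I.Reg);
      Out.push_back(I);
      break;
    }
  }
  return Out;
}
```

Feeding it the store/load/addsd/store sequence on slot 48 from the example (with XMM1 as register 1) leaves three instructions: the reload is gone, both stores remain.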
-
- Feb 02, 2006
-
-
Chris Lattner authored
Move isLoadFrom/StoreToStackSlot from MRegisterInfo to TargetInstrInfo, a far more logical place. Other methods should also be moved if anyone is interested. :) llvm-svn: 25913
-
Chris Lattner authored
and instruction. This allows us to compile stuff like this:
        bool %X(int %X) {
                %Y = add int %X, 14
                %Z = setne int %Y, 12345
                ret bool %Z
        }
to this:
        _X:
                cmpl $12331, 4(%esp)
                setne %al
                movzbl %al, %eax
                ret
instead of this:
        _X:
                cmpl $12331, 4(%esp)
                setne %al
                movzbl %al, %eax
                andl $1, %eax
                ret
This occurs quite a bit with the X86 backend. For example, 25 times in lambda, 30 times in 177.mesa, 14 times in galgel, 70 times in fma3d, 25 times in vpr, several hundred times in gcc, ~45 times in crafty, ~60 times in parser, ~140 times in eon, 110 times in perlbmk, 55 on gap, 16 times on bzip2, 14 times on twolf, and 1-2 times in many other SPEC2K programs.
llvm-svn: 25901
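The start of this message is cut off, so the exact mechanism isn't stated here. One way to see why the `andl $1` is removable, sketched with a hypothetical known-zero-bits model (`andIsRedundant` and `SetCCKnownZero` are made-up names, not LLVM APIs): an `and` with a mask is a no-op when every bit the mask would clear is already known to be zero.

```cpp
#include <cassert>
#include <cstdint>

// KnownZero has a 1 in every bit position proven to be 0 in the value.
// The and clears the bits in ~Mask; it is redundant if none of those bits
// could possibly be set.
constexpr bool andIsRedundant(uint32_t KnownZero, uint32_t Mask) {
  return (~Mask & ~KnownZero) == 0;
}

// Assumption for this sketch: a zero-extended setcc result is 0 or 1, so
// bits 1..31 are known zero.
constexpr uint32_t SetCCKnownZero = ~uint32_t(1);

static_assert(andIsRedundant(SetCCKnownZero, 1u),
              "and with 1 after a zero-extended setcc changes nothing");
static_assert(!andIsRedundant(0u, 1u),
              "with no known bits the and must stay");
```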
-
Chris Lattner authored
        (C1-X) == C2 --> X == C1-C2
        (X+C1) == C2 --> X == C2-C1
This allows us to compile this:
        bool %X(int %X) {
                %Y = add int %X, 14
                %Z = setne int %Y, 12345
                ret bool %Z
        }
into this:
        _X:
                cmpl $12331, 4(%esp)
                setne %al
                movzbl %al, %eax
                andl $1, %eax
                ret
not this:
        _X:
                movl $14, %eax
                addl 4(%esp), %eax
                cmpl $12345, %eax
                setne %al
                movzbl %al, %eax
                andl $1, %eax
                ret
Testcase here: Regression/CodeGen/X86/compare-add.ll
Nukage of the and coming up next.
llvm-svn: 25898
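A quick check of the two identities, as a standalone sketch with hypothetical helper names (this is not the DAG combiner code). It uses unsigned 32-bit arithmetic, which wraps modulo 2^32; that wraparound is why the rewrite holds even when the add or subtract overflows.

```cpp
#include <cassert>
#include <cstdint>

// (X + C1) == C2, evaluated the long way.
constexpr bool addThenCompare(uint32_t X, uint32_t C1, uint32_t C2) {
  return static_cast<uint32_t>(X + C1) == C2;
}
// Folded form: X == C2 - C1.
constexpr bool foldedAddCompare(uint32_t X, uint32_t C1, uint32_t C2) {
  return X == static_cast<uint32_t>(C2 - C1);
}

// (C1 - X) == C2, evaluated the long way.
constexpr bool subThenCompare(uint32_t X, uint32_t C1, uint32_t C2) {
  return static_cast<uint32_t>(C1 - X) == C2;
}
// Folded form: X == C1 - C2.
constexpr bool foldedSubCompare(uint32_t X, uint32_t C1, uint32_t C2) {
  return X == static_cast<uint32_t>(C1 - C2);
}

// The constant from the example above: comparing X+14 against 12345 is the
// same as comparing X against 12331.
static_assert(12345u - 14u == 12331u, "folded constant");
static_assert(addThenCompare(12331u, 14u, 12345u) ==
                  foldedAddCompare(12331u, 14u, 12345u),
              "add form agrees");
// A wraparound case: 0xFFFFFFFF + 2 wraps to 1.
static_assert(addThenCompare(0xFFFFFFFFu, 2u, 1u) &&
                  foldedAddCompare(0xFFFFFFFFu, 2u, 1u),
              "add form agrees under overflow");
```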
-
Chris Lattner authored
llvm-svn: 25895
-
Chris Lattner authored
        %C = call int asm "xyz $0, $1, $2, $3", "=r,r,r,0"(int %A, int %B, int 4)
and get:
        xyz r2, r3, r4, r2
Note that the r2's are pinned together. Yaay for 2-address instructions.
llvm-svn: 25893
-
- Feb 01, 2006
-
-
Chris Lattner authored
substituted operands. For this testcase:
        int %test(int %A, int %B) {
                %C = call int asm "xyz $0, $1, $2", "=r,r,r"(int %A, int %B)
                ret int %C
        }
we now emit:
        _test:
                or r2, r3, r3
                or r3, r4, r4
                xyz r2, r2, r3     ;; look here
                or r3, r2, r2
                blr
... note the substituted operands. :)
llvm-svn: 25886
-
Nate Begeman authored
llvm-svn: 25879
-
Chris Lattner authored
        int %test(int %A, int %B) {
                %C = call int asm "xyz $0, $1, $2", "=r,r,r"(int %A, int %B)
                ret int %C
        }
into:
        (0x8906130, LLVM BB @0x8902220):
                %r2 = OR4 %r3, %r3
                %r3 = OR4 %r4, %r4
                INLINEASM <es:xyz $0, $1, $2>, %r2<def>, %r2, %r3
                %r3 = OR4 %r2, %r2
                BLR
which asmprints as:
        _test:
                or r2, r3, r3
                or r3, r4, r4
                xyz $0, $1, $2     ;; need to print the operands now :)
                or r3, r2, r2
                blr
llvm-svn: 25878
-
Nate Begeman authored
of the SELECT_CC, BR_CC, and BRTWOWAY_CC nodes. llvm-svn: 25875
-
Chris Lattner authored
llvm-svn: 25865
-
- Jan 31, 2006
-
-
Evan Cheng authored
llvm-svn: 25855
-
Evan Cheng authored
caused several test failures. llvm-svn: 25852
-
Chris Lattner authored
llvm-svn: 25832
-
Chris Lattner authored
        int %test_cpuid(int %op) {
                %B = alloca int
                %C = alloca int
                %D = alloca int
                %A = call int asm "cpuid", "=eax,==ebx,==ecx,==edx,eax"(int* %B, int* %C, int* %D, int %op)
                %Bv = load int* %B
                %Cv = load int* %C
                %Dv = load int* %D
                %x = add int %A, %Bv
                %y = add int %x, %Cv
                %z = add int %y, %Dv
                ret int %z
        }
to this:
        _test_cpuid:
                sub %ESP, 16
                mov DWORD PTR [%ESP], %EBX
                mov %EAX, DWORD PTR [%ESP + 20]
                cpuid
                mov DWORD PTR [%ESP + 8], %ECX
                mov DWORD PTR [%ESP + 12], %EBX
                mov DWORD PTR [%ESP + 4], %EDX
                mov %ECX, DWORD PTR [%ESP + 12]
                add %EAX, %ECX
                mov %ECX, DWORD PTR [%ESP + 8]
                add %EAX, %ECX
                mov %ECX, DWORD PTR [%ESP + 4]
                add %EAX, %ECX
                mov %EBX, DWORD PTR [%ESP]
                add %ESP, 16
                ret
... note the proper register allocation. :) It is unclear to me why the loads aren't folded into the adds.
llvm-svn: 25827
-
Chris Lattner authored
llvm-svn: 25822
-
- Jan 30, 2006
-
-
Chris Lattner authored
a chance to custom legalize setcc, which broke a bunch of C++ codes. Testcase here: CodeGen/X86/2006-01-30-LongSetcc.ll llvm-svn: 25821
-
Chris Lattner authored
of lowered target nodes. llvm-svn: 25804
-
Chris Lattner authored
Move MaskedValueIsZero from the DAGCombiner to the TargetLowering interface, making isMaskedValueZeroForTargetNode simpler, and usable from other parts of the compiler. llvm-svn: 25803
-
Chris Lattner authored
to permit recursion llvm-svn: 25799
-
- Jan 29, 2006
-
-
Chris Lattner authored
llvm-svn: 25794
-
Chris Lattner authored
llvm-svn: 25785
-
Chris Lattner authored
instead. llvm-svn: 25780
-
Chris Lattner authored
llvm-svn: 25774
-
Chris Lattner authored
custom expansion of ConstantFP nodes. llvm-svn: 25772
-
Chris Lattner authored
llvm-svn: 25767
-
- Jan 28, 2006
-
-
Chris Lattner authored
faster. This cuts about 120 lines of code out of the legalizer (mostly code checking to see if operands have changed). It also fixes an ugly performance issue, where the legalizer cloned the entire graph after any change. Now the "UpdateNodeOperands" method gives it a chance to reuse nodes if the operands of a node change but not its opcode or valuetypes. This speeds up instruction selection time on kimwitu++ by about 8.2% with a release build. llvm-svn: 25746
-
Chris Lattner authored
llvm-svn: 25744
-
Chris Lattner authored
llvm-svn: 25742
-
Chris Lattner authored
llvm-svn: 25740
-
Chris Lattner authored
llvm-svn: 25739
-
Chris Lattner authored
llvm-svn: 25738
-
Chris Lattner authored
llvm-svn: 25737
-
Chris Lattner authored
1. Pull out the expand cases for BSWAP and CT* into a separate function, reducing the size of LegalizeOp.
2. Fix a bug where expand(bswap i64) was wrong when i64 is legal.
3. Changed LegalizeOp/PromoteOp so that the legalizer never needs to be iterative. It now operates in a single pass over the nodes.
4. Simplify a LOT of code, with a net reduction of ~280 lines.
llvm-svn: 25736
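For item 2, the message doesn't show what the expansion looks like. As a reference point only, here is one standard way to express a 64-bit byte swap with nothing but masks, shifts, and ors, which is the kind of sequence an expansion has to be equivalent to. `bswap64Expanded` is a hypothetical name, not the legalizer's code.

```cpp
#include <cassert>
#include <cstdint>

// Move each of the eight bytes to its mirrored position.
constexpr uint64_t bswap64Expanded(uint64_t V) {
  return ((V & 0x00000000000000FFULL) << 56) |
         ((V & 0x000000000000FF00ULL) << 40) |
         ((V & 0x0000000000FF0000ULL) << 24) |
         ((V & 0x00000000FF000000ULL) << 8) |
         ((V & 0x000000FF00000000ULL) >> 8) |
         ((V & 0x0000FF0000000000ULL) >> 24) |
         ((V & 0x00FF000000000000ULL) >> 40) |
         ((V & 0xFF00000000000000ULL) >> 56);
}

static_assert(bswap64Expanded(0x0102030405060708ULL) == 0x0807060504030201ULL,
              "bytes are reversed");
static_assert(bswap64Expanded(bswap64Expanded(0x1122334455667788ULL)) ==
                  0x1122334455667788ULL,
              "swapping twice is the identity");
```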
-
Chris Lattner authored
relegalizes the stuff it returns. Add the ability to custom expand ADD/SUB, so that targets don't need to deal with ADD_PARTS/SUB_PARTS if they don't want. Fix some obscure potential bugs and simplify code. llvm-svn: 25732
-