- Jan 19, 2005
-
-
Chris Lattner authored
llvm-svn: 19690
-
Chris Lattner authored
llvm-svn: 19689
-
Chris Lattner authored
llvm-svn: 19688
-
Chris Lattner authored
llvm-svn: 19687
-
Chris Lattner authored
This allows us to generate this: foo: mov %EAX, DWORD PTR [%ESP + 4] mov %EDX, DWORD PTR [%ESP + 8] shld %EDX, %EDX, 2 shl %EAX, 2 ret instead of this: foo: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] mov %EDX, %EAX shrd %EDX, %ECX, 30 shl %EAX, 2 ret Note the magically transmogrifying immediate. llvm-svn: 19686
-
Chris Lattner authored
instead of doing it manually. llvm-svn: 19685
-
Chris Lattner authored
Add default impl of commuteInstruction Add notes about ugly V9 code. llvm-svn: 19684
-
Chris Lattner authored
llvm-svn: 19683
-
Chris Lattner authored
llvm-svn: 19682
-
Chris Lattner authored
foo: mov %EAX, DWORD PTR [%ESP + 4] mov %EDX, DWORD PTR [%ESP + 8] shrd %EAX, %EDX, 2 sar %EDX, 2 ret instead of this: test1: mov %ECX, DWORD PTR [%ESP + 4] shr %ECX, 2 mov %EDX, DWORD PTR [%ESP + 8] mov %EAX, %EDX shl %EAX, 30 or %EAX, %ECX sar %EDX, 2 ret and long << 2 to this: foo: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, DWORD PTR [%ESP + 8] *** mov %EDX, %EAX shrd %EDX, %ECX, 30 shl %EAX, 2 ret instead of this: foo: mov %EAX, DWORD PTR [%ESP + 4] mov %ECX, %EAX shr %ECX, 30 mov %EDX, DWORD PTR [%ESP + 8] shl %EDX, 2 or %EDX, %ECX shl %EAX, 2 ret The extra copy (marked ***) can be eliminated when I teach the code generator that shrd32rri8 is really commutative. llvm-svn: 19681
-
Jeff Cohen authored
llvm-svn: 19680
-
Chris Lattner authored
select operations or to shifts that are by a constant. This automatically implements (with no special code) all of the special cases for shift by 32, shift by < 32 and shift by > 32. llvm-svn: 19679
-
Chris Lattner authored
llvm-svn: 19678
-
Chris Lattner authored
range. Either they are undefined (the default), they mask the shift amount to the size of the register (X86, Alpha, etc), or they extend the shift (PPC). This defaults to undefined, which is conservatively correct. llvm-svn: 19677
-
Chris Lattner authored
Add a hook to find out how the target handles shift amounts that are out of range. Either they are undefined (the default), they mask the shift amount to the size of the register (X86, Alpha, etc), or they extend the shift (PPC). This defaults to undefined, which is conservatively correct. llvm-svn: 19676
-
- Jan 18, 2005
-
-
Chris Lattner authored
llvm-svn: 19675
-
Chris Lattner authored
FP_EXTEND from! llvm-svn: 19674
-
Chris Lattner authored
llvm-svn: 19673
-
Chris Lattner authored
don't need to even think about F32 in the X86 code anymore. llvm-svn: 19672
-
Chris Lattner authored
of zero and sign extends. llvm-svn: 19671
-
Chris Lattner authored
llvm-svn: 19670
-
Chris Lattner authored
llvm-svn: 19669
-
Chris Lattner authored
do it. This results in better code on X86 for floats (because if strict precision is not required, we can elide some more expensive double -> float conversions like the old isel did), and allows other targets to emit CopyFromRegs that are not legal for arguments. llvm-svn: 19668
-
Chris Lattner authored
llvm-svn: 19667
-
Jeff Cohen authored
llvm-svn: 19665
-
Jeff Cohen authored
llvm-svn: 19664
-
Jeff Cohen authored
llvm-svn: 19663
-
Jeff Cohen authored
llvm-svn: 19662
-
Chris Lattner authored
llvm-svn: 19661
-
Tanya Lattner authored
llvm-svn: 19660
-
Chris Lattner authored
llvm-svn: 19659
-
Chris Lattner authored
* Insert some really pedantic assertions that will notice when we emit the same loads more than one time, exposing bugs. This turns a miscompilation in bzip2 into a compile-fail. yaay. llvm-svn: 19658
-
Chris Lattner authored
llvm-svn: 19657
-
Chris Lattner authored
llvm-svn: 19656
-
Chris Lattner authored
llvm-svn: 19655
-
Chris Lattner authored
match (X+Y)+(Z << 1), because we match the X+Y first, consuming the index register, then there is no place to put the Z. llvm-svn: 19652
-
Chris Lattner authored
llvm-svn: 19651
-
Chris Lattner authored
emitted too early. In particular, this fixes Regression/CodeGen/X86/regpressure.ll:regpressure3. This also improves the 2nd basic block in 164.gzip:flush_block, which went from .LBBflush_block_1: # loopentry.1.i movzx %EAX, WORD PTR [dyn_ltree + 20] movzx %ECX, WORD PTR [dyn_ltree + 16] mov DWORD PTR [%ESP + 32], %ECX movzx %ECX, WORD PTR [dyn_ltree + 12] movzx %EDX, WORD PTR [dyn_ltree + 8] movzx %EBX, WORD PTR [dyn_ltree + 4] mov DWORD PTR [%ESP + 36], %EBX movzx %EBX, WORD PTR [dyn_ltree] add DWORD PTR [%ESP + 36], %EBX add %EDX, DWORD PTR [%ESP + 36] add %ECX, %EDX add DWORD PTR [%ESP + 32], %ECX add %EAX, DWORD PTR [%ESP + 32] movzx %ECX, WORD PTR [dyn_ltree + 24] add %EAX, %ECX mov %ECX, 0 mov %EDX, %ECX to .LBBflush_block_1: # loopentry.1.i movzx %EAX, WORD PTR [dyn_ltree] movzx %ECX, WORD PTR [dyn_ltree + 4] add %ECX, %EAX movzx %EAX, WORD PTR [dyn_ltree + 8] add %EAX, %ECX movzx %ECX, WORD PTR [dyn_ltree + 12] add %ECX, %EAX movzx %EAX, WORD PTR [dyn_ltree + 16] add %EAX, %ECX movzx %ECX, WORD PTR [dyn_ltree + 20] add %ECX, %EAX movzx %EAX, WORD PTR [dyn_ltree + 24] add %ECX, %EAX mov %EAX, 0 mov %EDX, %EAX ... which results in less spilling in the function. This change alone speeds up 164.gzip from 37.23s to 36.24s on apoc. The default isel takes 37.31s. llvm-svn: 19650
-
Chris Lattner authored
llvm-svn: 19649
-
Chris Lattner authored
before other ops, causing it to spill like mad. This occurs in 164.gzip:flush_block. llvm-svn: 19648
-