- Jan 18, 2005
-
-
Chris Lattner authored
llvm-svn: 19667
-
Jeff Cohen authored
llvm-svn: 19665
-
Jeff Cohen authored
llvm-svn: 19664
-
Jeff Cohen authored
llvm-svn: 19663
-
Jeff Cohen authored
llvm-svn: 19662
-
Chris Lattner authored
llvm-svn: 19661
-
Tanya Lattner authored
llvm-svn: 19660
-
Chris Lattner authored
llvm-svn: 19659
-
Chris Lattner authored
* Insert some really pedantic assertions that will notice when we emit the same loads more than one time, exposing bugs. This turns a miscompilation in bzip2 into a compile-fail. yaay. llvm-svn: 19658
-
Chris Lattner authored
llvm-svn: 19657
-
Chris Lattner authored
llvm-svn: 19656
-
Chris Lattner authored
llvm-svn: 19655
-
Chris Lattner authored
match (X+Y)+(Z << 1), because we match the X+Y first, consuming the index register, then there is no place to put the Z. llvm-svn: 19652
-
Chris Lattner authored
llvm-svn: 19651
-
Chris Lattner authored
emitted too early. In particular, this fixes Regression/CodeGen/X86/regpressure.ll:regpressure3. This also improves the 2nd basic block in 164.gzip:flush_block, which went from

  .LBBflush_block_1:      # loopentry.1.i
          movzx %EAX, WORD PTR [dyn_ltree + 20]
          movzx %ECX, WORD PTR [dyn_ltree + 16]
          mov DWORD PTR [%ESP + 32], %ECX
          movzx %ECX, WORD PTR [dyn_ltree + 12]
          movzx %EDX, WORD PTR [dyn_ltree + 8]
          movzx %EBX, WORD PTR [dyn_ltree + 4]
          mov DWORD PTR [%ESP + 36], %EBX
          movzx %EBX, WORD PTR [dyn_ltree]
          add DWORD PTR [%ESP + 36], %EBX
          add %EDX, DWORD PTR [%ESP + 36]
          add %ECX, %EDX
          add DWORD PTR [%ESP + 32], %ECX
          add %EAX, DWORD PTR [%ESP + 32]
          movzx %ECX, WORD PTR [dyn_ltree + 24]
          add %EAX, %ECX
          mov %ECX, 0
          mov %EDX, %ECX

to

  .LBBflush_block_1:      # loopentry.1.i
          movzx %EAX, WORD PTR [dyn_ltree]
          movzx %ECX, WORD PTR [dyn_ltree + 4]
          add %ECX, %EAX
          movzx %EAX, WORD PTR [dyn_ltree + 8]
          add %EAX, %ECX
          movzx %ECX, WORD PTR [dyn_ltree + 12]
          add %ECX, %EAX
          movzx %EAX, WORD PTR [dyn_ltree + 16]
          add %EAX, %ECX
          movzx %ECX, WORD PTR [dyn_ltree + 20]
          add %ECX, %EAX
          movzx %EAX, WORD PTR [dyn_ltree + 24]
          add %ECX, %EAX
          mov %EAX, 0
          mov %EDX, %EAX

... which results in less spilling in the function. This change alone speeds up 164.gzip from 37.23s to 36.24s on apoc. The default isel takes 37.31s.
llvm-svn: 19650
-
Chris Lattner authored
llvm-svn: 19649
-
Chris Lattner authored
before other ops, causing it to spill like mad. This occurs in 164.gzip:flush_block. llvm-svn: 19648
-
Chris Lattner authored
llvm-svn: 19647
-
- Jan 17, 2005
-
-
Chris Lattner authored
llvm-svn: 19645
-
Chris Lattner authored
X86/reg-pressure.ll again, and allows us to do nice things in other cases. For example, we now codegen this sort of thing:

  int %loadload(int *%X, int* %Y) {
    %Z = load int* %Y
    %Y = load int* %X      ;; load between %Z and store
    %Q = add int %Z, 1
    store int %Q, int* %Y
    ret int %Y
  }

Into this:

  loadload:
          mov %EAX, DWORD PTR [%ESP + 4]
          mov %EAX, DWORD PTR [%EAX]
          mov %ECX, DWORD PTR [%ESP + 8]
          inc DWORD PTR [%ECX]
          ret

where we weren't able to form the 'inc [mem]' before. This also lets the instruction selector emit loads in any order it wants to, which can be good for register pressure as well.
llvm-svn: 19644
-
Chris Lattner authored
1. Fold [mem] += (1|-1) into inc [mem]/dec [mem] to save some icache space.
2. Do not let token factor nodes prevent forming '[mem] op= val' folds.
llvm-svn: 19643
-
Chris Lattner authored
llvm-svn: 19642
-
Chris Lattner authored
llvm-svn: 19641
-
Chris Lattner authored
the basic block that uses them if possible. This is a big win on X86, as it lets us fold the argument loads into instructions and reduce register pressure (by not loading all of the arguments in the entry block).

For this (contrived to show the optimization) testcase:

  int %argtest(int %A, int %B) {
          %X = sub int 12345, %A
          br label %L
  L:
          %Y = add int %X, %B
          ret int %Y
  }

we used to produce:

  argtest:
          mov %ECX, DWORD PTR [%ESP + 4]
          mov %EAX, 12345
          sub %EAX, %ECX
          mov %EDX, DWORD PTR [%ESP + 8]
  .LBBargtest_1:  # L
          add %EAX, %EDX
          ret

now we produce:

  argtest:
          mov %EAX, 12345
          sub %EAX, DWORD PTR [%ESP + 4]
  .LBBargtest_1:  # L
          add %EAX, DWORD PTR [%ESP + 8]
          ret

This also fixes the FIXME in the code.

BTW, this occurs in real code. 164.gzip shrinks from 8623 to 8608 lines of .s file. The stack frame in huft_build shrinks from 1644->1628 bytes, inflate_codes shrinks from 116->108 bytes, and inflate_block from 2620->2612, due to fewer spills. Take that alkis. :-)
llvm-svn: 19639
-
Chris Lattner authored
operations. The body of the if is less indented but unmodified in this patch. llvm-svn: 19638
-
Chris Lattner authored
llvm-svn: 19635
-
Chris Lattner authored
llvm-svn: 19634
-
Chris Lattner authored
  int %foo(int %X) {
    %T = add int %X, 13
    %S = mul int %T, 3
    ret int %S
  }

as this:

          mov %ECX, DWORD PTR [%ESP + 4]
          lea %EAX, DWORD PTR [%ECX + 2*%ECX + 39]
          ret

instead of this:

          mov %ECX, DWORD PTR [%ESP + 4]
          mov %EAX, %ECX
          add %EAX, 13
          imul %EAX, %EAX, 3
          ret

llvm-svn: 19633
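
The arithmetic behind the lea in the entry above can be checked directly: (X + 13) * 3 equals X + 2*X + 39, which fits the base + scale*index + displacement shape. The sketch below is illustration only and is not taken from the patch. The names mulForm and leaForm are hypothetical, and unsigned 32-bit arithmetic is used as an assumption to model register wraparound.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.
// The original computation: add 13, then multiply by 3.
constexpr std::uint32_t mulForm(std::uint32_t x) { return (x + 13) * 3; }

// The shape a single lea can compute: base + 2*index + displacement,
// with base == index == x and displacement 39 (= 13 * 3).
constexpr std::uint32_t leaForm(std::uint32_t x) { return x + 2 * x + 39; }

// Both forms agree, including when the 32-bit value wraps around.
static_assert(mulForm(0) == leaForm(0), "x = 0");
static_assert(mulForm(5) == leaForm(5), "x = 5");
static_assert(mulForm(0xFFFFFFFFu) == leaForm(0xFFFFFFFFu), "wraparound");
```
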
-
Tanya Lattner authored
llvm-svn: 19632
-
Chris Lattner authored
Do not fold a load into an operation if it will induce a cycle in the DAG. Repeat after me: dAg. llvm-svn: 19631
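
One way such a cycle could arise, sketched under assumptions: folding merges the load node and the operation node into one, so if another operand of the operation also depends on the load, the merged node ends up depending on itself. The code below is a simplified illustration, not the check used in the patch. The Dag type and the names reaches and foldWouldCreateCycle are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified DAG: operands[n] lists the nodes that node n reads.
using Dag = std::vector<std::vector<std::size_t>>;

// True if `target` can be reached from `from` by following operand edges.
bool reaches(const Dag &operands, std::size_t from, std::size_t target,
             std::vector<bool> &visited) {
  if (from == target) return true;
  if (visited[from]) return false;  // already explored without finding target
  visited[from] = true;
  for (std::size_t next : operands[from])
    if (reaches(operands, next, target, visited)) return true;
  return false;
}

// Folding `load` into `op` merges the two nodes. If some other operand of
// `op` also depends on `load`, the merged node would depend on itself.
bool foldWouldCreateCycle(const Dag &operands, std::size_t op,
                          std::size_t load) {
  std::vector<bool> visited(operands.size(), false);
  for (std::size_t operand : operands[op]) {
    if (operand == load) continue;  // the direct edge is the one being folded
    if (reaches(operands, operand, load, visited)) return true;
  }
  return false;
}
```

In this model, with node 0 as the load, node 1 reading the load, and node 2 reading both, foldWouldCreateCycle(g, 2, 0) reports a cycle. If node 1 does not read the load, the fold is safe.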
-
Chris Lattner authored
llvm-svn: 19630
-
Chris Lattner authored
useness. llvm-svn: 19629
-
Chris Lattner authored
Disable the xform for < > cases. It turns out that the following is being miscompiled:

  bool %test(sbyte %S) {
    %T = cast sbyte %S to uint
    %V = setgt uint %T, 255
    ret bool %V
  }

llvm-svn: 19628
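
A rough C++ analogue of the testcase above shows why the comparison cannot be folded to a constant. It assumes the cast sign-extends the sbyte before reinterpreting it as unsigned, as the corresponding C++ conversion does. The function name test mirrors the testcase; everything else is illustration only.

```cpp
#include <cstdint>

// Analogue of the testcase, assuming the cast sign-extends the signed byte.
// Negative inputs become large unsigned values, so the result depends on s.
constexpr bool test(std::int8_t s) {
  return static_cast<std::uint32_t>(s) > 255u;
}

static_assert(test(-1), "-1 converts to 0xFFFFFFFF, which is > 255");
static_assert(test(-128), "-128 converts to 0xFFFFFF80, which is > 255");
static_assert(!test(0), "0 stays 0");
static_assert(!test(127), "127 stays 127");
```
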
-
Chris Lattner authored
llvm-svn: 19627
-
Chris Lattner authored
The comparison will probably be folded, so this is not ok to do. This fixed 197.parser. llvm-svn: 19624
-
Reid Spencer authored
llvm-svn: 19623
-
Chris Lattner authored
of the bytereg. This fixes yacr2, 300.twolf and probably others. llvm-svn: 19622
-
Chris Lattner authored
llvm-svn: 19621
-
Chris Lattner authored
If we emit a load because we followed a token chain to get to it, try to fold it into its single user if possible. llvm-svn: 19620
-
Chris Lattner authored
llvm-svn: 19619
-