- Jun 20, 2004
Chris Lattner authored
llvm-svn: 14266
- Jun 18, 2004
Chris Lattner authored
        mov REG, C
        sub REG, X
generate:
        neg X
        add X, C
which uses one less reg
llvm-svn: 14213
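A hypothetical C-level source pattern for this rewrite (my illustration, not from the commit), which leans on the identity C - X == (-X) + C:

        /* Illustrative only: subtracting a variable from a constant.
           Because C - X == (-X) + C, the backend can negate X in place
           and add the constant, saving the register that would have
           held C. The constant 42 is arbitrary. */
        int sub_from_const(int x) {
            return 42 - x;
        }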
Chris Lattner authored
the setcc.
llvm-svn: 14212
Chris Lattner authored
we do not want to fold the load in cases like this:
        X = load
          = add A, X
          = add B, X
llvm-svn: 14204
- Jun 17, 2004
Chris Lattner authored
llvm-svn: 14201
- Jun 15, 2004
Chris Lattner authored
llvm-svn: 14189
Chris Lattner authored
llvm-svn: 14185
- Jun 11, 2004
Chris Lattner authored
comparisons. In an 'isunordered' predicate, which looks like this at the LLVM level:
        %a = call bool %llvm.isnan(double %X)
        %b = call bool %llvm.isnan(double %Y)
        %COM = or bool %a, %b
We used to generate this code:
        fxch %ST(1)
        fucomip %ST(0), %ST(0)
        setp %AL
        fucomip %ST(0), %ST(0)
        setp %AH
        or %AL, %AH
With this patch, we generate this code:
        fucomip %ST(0), %ST(1)
        fstp %ST(0)
        setp %AL
Which should make alkis happy. Tested as X86/compare_folding.llx:test1
llvm-svn: 14148
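At the C level the folded predicate is the classic unordered test; a minimal sketch (my illustration of the pattern, not code from the commit):

        #include <math.h>

        /* isnan(x) || isnan(y) is exactly the IEEE 'unordered' relation,
           so one fucomip can answer it through the parity flag instead
           of two separate NaN tests. */
        int is_unordered(double x, double y) {
            return isnan(x) || isnan(y);
        }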
Chris Lattner authored
we can get rid of the FpUCOM/FpUCOMi pseudo instructions, which makes stuff simpler and faster.
llvm-svn: 14144
Chris Lattner authored
testcase
llvm-svn: 14141
- Jun 09, 2004
John Criswell authored
that cast to bool.
llvm-svn: 14096
- Jun 02, 2004
Chris Lattner authored
llvm-svn: 13952
- May 23, 2004
Chris Lattner authored
llvm-svn: 13695
- May 14, 2004
Brian Gaeke authored
LLVM BasicBlock operands.
llvm-svn: 13566
- May 13, 2004
Chris Lattner authored
and passing a null pointer into a function. For this testcase:
        void %test(int** %X) {
                store int* null, int** %X
                call void %test(int** null)
                ret void
        }
we now generate this:
test:
        sub %ESP, 12
        mov %EAX, DWORD PTR [%ESP + 16]
        mov DWORD PTR [%EAX], 0
        mov DWORD PTR [%ESP], 0
        call test
        add %ESP, 12
        ret
instead of this:
test:
        sub %ESP, 12
        mov %EAX, DWORD PTR [%ESP + 16]
        mov %ECX, 0
        mov DWORD PTR [%EAX], %ECX
        mov %EAX, 0
        mov DWORD PTR [%ESP], %EAX
        call test
        add %ESP, 12
        ret
llvm-svn: 13558
Chris Lattner authored
the alloca address into common operations like loads/stores. In a simple testcase like this (which is just designed to exercise the alloca A, nothing more):

        int %test(int %X, bool %C) {
                %A = alloca int
                store int %X, int* %A
                store int* %A, int** %G
                br bool %C, label %T, label %F
        T:
                call int %test(int 1, bool false)
                %V = load int* %A
                ret int %V
        F:
                call int %test(int 123, bool true)
                %V2 = load int* %A
                ret int %V2
        }

We now generate:
test:
        sub %ESP, 12
        mov %EAX, DWORD PTR [%ESP + 16]
        mov %CL, BYTE PTR [%ESP + 20]
***     mov DWORD PTR [%ESP + 8], %EAX
        mov %EAX, OFFSET G
        lea %EDX, DWORD PTR [%ESP + 8]
        mov DWORD PTR [%EAX], %EDX
        test %CL, %CL
        je .LBB2 # PC rel: F
.LBB1: # T
        mov DWORD PTR [%ESP], 1
        mov DWORD PTR [%ESP + 4], 0
        call test
***     mov %EAX, DWORD PTR [%ESP + 8]
        add %ESP, 12
        ret
.LBB2: # F
        mov DWORD PTR [%ESP], 123
        mov DWORD PTR [%ESP + 4], 1
        call test
***     mov %EAX, DWORD PTR [%ESP + 8]
        add %ESP, 12
        ret

Instead of:
test:
        sub %ESP, 20
        mov %EAX, DWORD PTR [%ESP + 24]
        mov %CL, BYTE PTR [%ESP + 28]
***     lea %EDX, DWORD PTR [%ESP + 16]
***     mov DWORD PTR [%EDX], %EAX
        mov %EAX, OFFSET G
        mov DWORD PTR [%EAX], %EDX
        test %CL, %CL
***     mov DWORD PTR [%ESP + 12], %EDX
        je .LBB2 # PC rel: F
.LBB1: # T
        mov DWORD PTR [%ESP], 1
        mov %EAX, 0
        mov DWORD PTR [%ESP + 4], %EAX
        call test
***     mov %EAX, DWORD PTR [%ESP + 12]
***     mov %EAX, DWORD PTR [%EAX]
        add %ESP, 20
        ret
.LBB2: # F
        mov DWORD PTR [%ESP], 123
        mov %EAX, 1
        mov DWORD PTR [%ESP + 4], %EAX
        call test
***     mov %EAX, DWORD PTR [%ESP + 12]
***     mov %EAX, DWORD PTR [%EAX]
        add %ESP, 20
        ret
llvm-svn: 13557
Chris Lattner authored
sized allocas in the entry block). Instead of generating code like this:
        entry:
                reg1024 = ESP+1234
        ... (much later)
                *reg1024 = 17
Generate code that looks like this:
        entry:
                (no code generated)
        ... (much later)
                t = ESP+1234
                *t = 17
The advantage being that we DRAMATICALLY reduce the register pressure for these silly temporaries (they were all being spilled to the stack, resulting in very silly code). This is actually a manual implementation of rematerialization :)
I have a patch to fold the alloca address computation into loads & stores, which will make this much better still, but just getting this right took way too much time and I'm sleepy.
llvm-svn: 13554
- May 12, 2004
Chris Lattner authored
        mov DWORD PTR [%ESP + 4], 1
instead of:
        mov %EAX, 1
        mov DWORD PTR [%ESP + 4], %EAX
llvm-svn: 13494
- May 10, 2004
Chris Lattner authored
compiling things like 'add long %X, 1'. The problem is that we were switching the order of the operands for longs even though we can't fold them yet.
llvm-svn: 13451
Chris Lattner authored
llvm-svn: 13440
Chris Lattner authored
llvm-svn: 13439
- May 07, 2004
Chris Lattner authored
allows us to compile:
        store float 10.0, float* %P
into:
        mov DWORD PTR [%EAX], 1092616192
instead of:
.CPItest_0:                     # float 0x4024000000000000
        .long 1092616192        # float 10
        ...
        fld DWORD PTR [.CPItest_0]
        fstp DWORD PTR [%EAX]
llvm-svn: 13409
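A small C sketch (my illustration, not part of the commit) confirming that 1092616192 is simply the IEEE-754 bit pattern of the float 10.0, which is why the store can be emitted as one integer mov:

        #include <stdio.h>
        #include <string.h>

        int main(void) {
            float f = 10.0f;
            unsigned int bits;
            memcpy(&bits, &f, sizeof bits);  /* well-defined type pun */
            printf("%u\n", bits);            /* prints 1092616192 (0x41200000) */
            return 0;
        }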
Chris Lattner authored
against zero. In particular, don't emit:
        mov %ESI, 0
        cmp %ECX, %ESI
instead, emit:
        test %ECX, %ECX
llvm-svn: 13407
- May 04, 2004
Chris Lattner authored
llvm-svn: 13355
Chris Lattner authored
div:
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %ECX, 64
        mov %EAX, %EDX
        sar %EDX, 31
        idiv %ECX
        ret
to this:
div:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 5
        shr %ECX, 26
        mov %EDX, %EAX
        add %EDX, %ECX
        sar %EAX, 6
        ret
Note that the intel compiler is currently making this:
div:
        movl 4(%esp), %edx      #3.5
        movl %edx, %eax         #4.14
        sarl $5, %eax           #4.14
        shrl $26, %eax          #4.14
        addl %edx, %eax         #4.14
        sarl $6, %eax           #4.14
        ret                     #4.14
Which has one less register->register copy. (hint hint alkis :)
llvm-svn: 13354
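The replacement sequence is branchless round-toward-zero division by 64: negative inputs get a bias of 63 added before the final arithmetic shift. A C sketch of my reading of the trick (not code from the commit; it assumes arithmetic right shift of signed ints, which holds on x86):

        /* (x >> 5) is sign-extended, so shifting it right logically by 26
           leaves 63 for negative x and 0 otherwise; (x + bias) >> 6 then
           equals x / 64 with truncation toward zero. */
        int div64(int x) {
            int bias = (int)((unsigned)(x >> 5) >> 26);
            return (x + bias) >> 6;
        }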
Chris Lattner authored
llvm-svn: 13342
- May 01, 2004
Chris Lattner authored
llvm-svn: 13304
- Apr 28, 2004
Brian Gaeke authored
In InsertFPRegKills(), just check the MachineBasicBlock for successors instead of its corresponding BasicBlock.
llvm-svn: 13213
Brian Gaeke authored
LLVM CFG when trying to find the successors of BB.
llvm-svn: 13212
Brian Gaeke authored
llvm-svn: 13211
- Apr 14, 2004
John Criswell authored
The iterator is pointing at the next instruction, which should not disappear when doing the load/store replacement.
llvm-svn: 12954
John Criswell authored
On x86, memory operations occur in-order, so these are just lowered into volatile loads and stores.
llvm-svn: 12936
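As a hedged C-level illustration of why this lowering is sound (the device_reg name is hypothetical, and the source-level intrinsics are not shown in this log):

        /* On x86, ordinary loads and stores already happen in program
           order, so an ordered I/O access only needs the volatile
           qualifier to stop the compiler from reordering or removing it. */
        volatile int *device_reg;   /* hypothetical memory-mapped register */

        void write_reg(int v) { *device_reg = v; }    /* volatile store */
        int  read_reg(void)   { return *device_reg; } /* volatile load  */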
- Apr 13, 2004
Chris Lattner authored
X86/2004-04-13-FPCMOV-Crash.llx
A more robust fix is to follow.
llvm-svn: 12935
Chris Lattner authored
Fix several bugs in the intrinsics:
1. Make sure to copy the input registers before the instructions that use them.
2. Make sure to copy the value returned by 'in' out of EAX into the register it is supposed to be in.
This fixes assertions when using in/out and linear scan.
llvm-svn: 12896
- Apr 12, 2004
Chris Lattner authored
llvm-svn: 12855
Chris Lattner authored
of the fucom[p][p] instructions. This allows us to code generate this function:
        bool %test(double %X, double %Y) {
                %C = setlt double %Y, %X
                ret bool %C
        }
... into:
test:
        fld QWORD PTR [%ESP + 4]
        fld QWORD PTR [%ESP + 12]
        fucomip %ST(1)
        fstp %ST(0)
        setb %AL
        movsx %EAX, %AL
        ret
where before we generated:
test:
        fld QWORD PTR [%ESP + 4]
        fld QWORD PTR [%ESP + 12]
        fucompp
**      fnstsw
**      sahf
        setb %AL
        movsx %EAX, %AL
        ret
The two marked instructions (which are the ones eliminated) are very bad, because they serialize execution of the processor. These instructions are available on the PPRO and later, but since we already use cmov's we aren't losing any portability.
I retained the old code for the day when we decide we want to support back to the 386.
llvm-svn: 12852
Chris Lattner authored
llvm-svn: 12849
Chris Lattner authored
llvm-svn: 12848
Chris Lattner authored
If the source of the cast is a load, we can just use the source memory location, without having to create a temporary stack slot entry. Before we code generated this:
        double %int(int* %P) {
                %V = load int* %P
                %V2 = cast int %V to double
                ret double %V2
        }
into:
int:
        sub %ESP, 4
        mov %EAX, DWORD PTR [%ESP + 8]
        mov %EAX, DWORD PTR [%EAX]
        mov DWORD PTR [%ESP], %EAX
        fild DWORD PTR [%ESP]
        add %ESP, 4
        ret
Now we produce this:
int:
        mov %EAX, DWORD PTR [%ESP + 4]
        fild DWORD PTR [%EAX]
        ret
... which is nicer.
llvm-svn: 12846
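For reference, a hypothetical C equivalent of the testcase (my illustration, not from the commit) that produces this int-to-double pattern:

        /* With the patch, fild reads the int straight from the pointed-to
           memory instead of bouncing it through a temporary stack slot. */
        double load_and_convert(int *p) {
            return (double)*p;
        }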
Chris Lattner authored
test/Regression/CodeGen/X86/fp_load_fold.llx
llvm-svn: 12844