- Jun 20, 2004
Chris Lattner authored
llvm-svn: 14266

- Jun 18, 2004
Chris Lattner authored
        mov REG, C
        sub REG, X
generate:
        neg X
        add X, C
which uses one less reg

llvm-svn: 14213
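(For reference: the rewrite exploits the identity C - X == -X + C, so the constant can become the add's immediate operand instead of occupying a register. A minimal C sketch of the pattern, with a made-up constant; the asm in the comments mirrors the before/after above.)

        /* Hypothetical illustration: compute C - X for a constant C (here 5).
         *   before: mov REG, 5 ; sub REG, x    -- the 5 ties up a register
         *   after:  neg x      ; add x, 5      -- the 5 is just an immediate */
        int sub_from_const(int x) {
            return 5 - x;
        }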
Chris Lattner authored
the setcc. llvm-svn: 14212
Chris Lattner authored
we do not want to fold the load in cases like this:

        X = load
          = add A, X
          = add B, X

llvm-svn: 14204

- Jun 17, 2004
Chris Lattner authored
llvm-svn: 14201

- Jun 15, 2004
Chris Lattner authored
llvm-svn: 14189
Chris Lattner authored
llvm-svn: 14185

- Jun 11, 2004
Chris Lattner authored
comparisons.  In an 'isunordered' predicate, which looks like this at the
LLVM level:

        %a = call bool %llvm.isnan(double %X)
        %b = call bool %llvm.isnan(double %Y)
        %COM = or bool %a, %b

We used to generate this code:

        fxch %ST(1)
        fucomip %ST(0), %ST(0)
        setp %AL
        fucomip %ST(0), %ST(0)
        setp %AH
        or %AL, %AH

With this patch, we generate this code:

        fucomip %ST(0), %ST(1)
        fstp %ST(0)
        setp %AL

Which should make alkis happy.  Tested as X86/compare_folding.llx:test1

llvm-svn: 14148
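(At the source level this predicate is what C99 spells isunordered(X, Y): true when either operand is NaN, i.e. the two values do not compare ordered. A rough C sketch of the pattern being folded, assuming the standard math.h isnan; the commit itself matches the %llvm.isnan intrinsic form shown above.)

        #include <math.h>

        /* True iff x and y do not compare ordered, i.e. either is NaN.
         * This is the or-of-isnan pattern the patch now compiles down to
         * a single fucomip + setp. */
        int is_unordered(double x, double y) {
            return isnan(x) || isnan(y);
        }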
Chris Lattner authored
llvm-svn: 14146
Chris Lattner authored
llvm-svn: 14145
Chris Lattner authored
we can get rid of the FpUCOM/FpUCOMi pseudo instructions, which makes stuff simpler and faster. llvm-svn: 14144
Chris Lattner authored
twoarg cases. llvm-svn: 14143
Chris Lattner authored
testcase llvm-svn: 14141
Chris Lattner authored
llvm-svn: 14140
Chris Lattner authored
This makes the code much simpler, and the two cases really do belong apart. Once we do it, it's pretty obvious how flawed the logic was for the A != A case, so I fixed it (fixing PR369). This also uses freeStackSlotAfter instead of inserting an fxch then popStackAfter'ing in the case where there is a dead result (unlikely, but possible), producing better code.

llvm-svn: 14139

- Jun 10, 2004
Chris Lattner authored
llvm-svn: 14110

- Jun 09, 2004
John Criswell authored
that cast to bool. llvm-svn: 14096

- Jun 04, 2004
Chris Lattner authored
llvm-svn: 14005

- Jun 02, 2004
Chris Lattner authored
llvm-svn: 13952

- May 23, 2004
Chris Lattner authored
llvm-svn: 13696
Chris Lattner authored
llvm-svn: 13695
Chris Lattner authored
llvm-svn: 13694

- May 14, 2004
Brian Gaeke authored
MachineBasicBlocks instead. llvm-svn: 13568
Brian Gaeke authored
Get rid of separate numbering for LLVM BasicBlocks; use the automatically generated MachineBasicBlock numbering. llvm-svn: 13567
Brian Gaeke authored
LLVM BasicBlock operands. llvm-svn: 13566

- May 13, 2004
Chris Lattner authored
and passing a null pointer into a function.  For this testcase:

void %test(int** %X) {
        store int* null, int** %X
        call void %test(int** null)
        ret void
}

we now generate this:

test:
        sub %ESP, 12
        mov %EAX, DWORD PTR [%ESP + 16]
        mov DWORD PTR [%EAX], 0
        mov DWORD PTR [%ESP], 0
        call test
        add %ESP, 12
        ret

instead of this:

test:
        sub %ESP, 12
        mov %EAX, DWORD PTR [%ESP + 16]
        mov %ECX, 0
        mov DWORD PTR [%EAX], %ECX
        mov %EAX, 0
        mov DWORD PTR [%ESP], %EAX
        call test
        add %ESP, 12
        ret

llvm-svn: 13558
Chris Lattner authored
the alloca address into common operations like loads/stores.

In a simple testcase like this (which is just designed to exercise the
alloca A, nothing more):

int %test(int %X, bool %C) {
        %A = alloca int
        store int %X, int* %A
        store int* %A, int** %G
        br bool %C, label %T, label %F
T:
        call int %test(int 1, bool false)
        %V = load int* %A
        ret int %V
F:
        call int %test(int 123, bool true)
        %V2 = load int* %A
        ret int %V2
}

We now generate:

test:
        sub %ESP, 12
        mov %EAX, DWORD PTR [%ESP + 16]
        mov %CL, BYTE PTR [%ESP + 20]
***     mov DWORD PTR [%ESP + 8], %EAX
        mov %EAX, OFFSET G
        lea %EDX, DWORD PTR [%ESP + 8]
        mov DWORD PTR [%EAX], %EDX
        test %CL, %CL
        je .LBB2 # PC rel: F
.LBB1:  # T
        mov DWORD PTR [%ESP], 1
        mov DWORD PTR [%ESP + 4], 0
        call test
***     mov %EAX, DWORD PTR [%ESP + 8]
        add %ESP, 12
        ret
.LBB2:  # F
        mov DWORD PTR [%ESP], 123
        mov DWORD PTR [%ESP + 4], 1
        call test
***     mov %EAX, DWORD PTR [%ESP + 8]
        add %ESP, 12
        ret

Instead of:

test:
        sub %ESP, 20
        mov %EAX, DWORD PTR [%ESP + 24]
        mov %CL, BYTE PTR [%ESP + 28]
***     lea %EDX, DWORD PTR [%ESP + 16]
***     mov DWORD PTR [%EDX], %EAX
        mov %EAX, OFFSET G
        mov DWORD PTR [%EAX], %EDX
        test %CL, %CL
***     mov DWORD PTR [%ESP + 12], %EDX
        je .LBB2 # PC rel: F
.LBB1:  # T
        mov DWORD PTR [%ESP], 1
        mov %EAX, 0
        mov DWORD PTR [%ESP + 4], %EAX
        call test
***     mov %EAX, DWORD PTR [%ESP + 12]
***     mov %EAX, DWORD PTR [%EAX]
        add %ESP, 20
        ret
.LBB2:  # F
        mov DWORD PTR [%ESP], 123
        mov %EAX, 1
        mov DWORD PTR [%ESP + 4], %EAX
        call test
***     mov %EAX, DWORD PTR [%ESP + 12]
***     mov %EAX, DWORD PTR [%EAX]
        add %ESP, 20
        ret

llvm-svn: 13557
Chris Lattner authored
sized allocas in the entry block).  Instead of generating code like this:

entry:
        reg1024 = ESP+1234
... (much later)
        *reg1024 = 17

Generate code that looks like this:

entry:
        (no code generated)
... (much later)
        t = ESP+1234
        *t = 17

The advantage being that we DRAMATICALLY reduce the register pressure for
these silly temporaries (they were all being spilled to the stack, resulting
in very silly code).  This is actually a manual implementation of
rematerialization :)

I have a patch to fold the alloca address computation into loads & stores,
which will make this much better still, but just getting this right took
way too much time and I'm sleepy.

llvm-svn: 13554

- May 12, 2004
Chris Lattner authored
        mov DWORD PTR [%ESP + 4], 1

instead of:

        mov %EAX, 1
        mov DWORD PTR [%ESP + 4], %EAX

llvm-svn: 13494

- May 10, 2004
Chris Lattner authored
compiling things like 'add long %X, 1'. The problem is that we were switching the order of the operands for longs even though we can't fold them yet. llvm-svn: 13451
Chris Lattner authored
llvm-svn: 13440
Chris Lattner authored
llvm-svn: 13439

- May 07, 2004
Chris Lattner authored
allows us to compile:

        store float 10.0, float* %P

into:

        mov DWORD PTR [%EAX], 1092616192

instead of:

.CPItest_0:                                     # float 0x4024000000000000
        .long   1092616192      # float 10
        ...
        fld DWORD PTR [.CPItest_0]
        fstp DWORD PTR [%EAX]

llvm-svn: 13409
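(The magic number is simply the IEEE-754 single-precision bit pattern of 10.0: 0x41200000 == 1092616192, so storing the integer is bit-for-bit identical to storing the float. A quick stand-alone check, not part of the commit:)

        #include <stdio.h>
        #include <string.h>

        int main(void) {
            float f = 10.0f;
            unsigned int bits;                 /* assumes 32-bit unsigned int */
            memcpy(&bits, &f, sizeof bits);    /* reinterpret the float's bytes */
            printf("%u 0x%x\n", bits, bits);   /* prints: 1092616192 0x41200000 */
            return 0;
        }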
Chris Lattner authored
against zero.  In particular, don't emit:

        mov %ESI, 0
        cmp %ECX, %ESI

instead, emit:

        test %ECX, %ECX

llvm-svn: 13407

- May 04, 2004
Chris Lattner authored
llvm-svn: 13355
Chris Lattner authored
div:
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %ECX, 64
        mov %EAX, %EDX
        sar %EDX, 31
        idiv %ECX
        ret

to this:

div:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        sar %ECX, 5
        shr %ECX, 26
        mov %EDX, %EAX
        add %EDX, %ECX
        sar %EAX, 6
        ret

Note that the intel compiler is currently making this:

div:
        movl      4(%esp), %edx         #3.5
        movl      %edx, %eax            #4.14
        sarl      $5, %eax              #4.14
        shrl      $26, %eax             #4.14
        addl      %edx, %eax            #4.14
        sarl      $6, %eax              #4.14
        ret                             #4.14

Which has one less register->register copy. (hint hint alkis :)

llvm-svn: 13354
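(The shift sequence is the standard bias trick for signed division by a power of two: a bare arithmetic shift rounds toward negative infinity, so for negative dividends the code first derives a bias of divisor-1 from the sign bit, which is what the sar/shr pair computes, and adds it in before shifting. That recovers C's round-toward-zero semantics. A C sketch of the idea, not taken from the commit:)

        /* Signed x/64 without idiv.  Assumes 32-bit int and an arithmetic
         * right shift of negative values (true of x86 compilers). */
        int div64(int x) {
            int bias = (x >> 31) & 63;  /* 63 if x < 0, else 0 */
            return (x + bias) >> 6;     /* == x / 64, rounding toward zero */
        }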
Chris Lattner authored
llvm-svn: 13342

- May 01, 2004
Chris Lattner authored
llvm-svn: 13304
Chris Lattner authored
Look at all of the pretty minuses. :) llvm-svn: 13303

- Apr 28, 2004
Brian Gaeke authored
In InsertFPRegKills(), just check the MachineBasicBlock for successors instead of its corresponding BasicBlock. llvm-svn: 13213