- Aug 04, 2005
Nate Begeman authored
Scalar SSE: a < b ? c : 0.0 -> cmpss, andps
Scalar SSE: float -> i16 needs to be promoted
llvm-svn: 22637
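
For illustration: the cmpss/andps pattern works because an SSE scalar compare produces an all-ones or all-zeros bit mask rather than a boolean, and ANDing that mask with c yields either c or +0.0. A minimal C++ sketch of the idea (the function name is illustrative, not from the patch):

#include <cstdint>
#include <cstring>

// a < b ? c : 0.0f, computed branch-free in the cmpss/andps style.
float select_or_zero(float a, float b, float c) {
  // cmpss-style mask: all ones when a < b, all zeros otherwise.
  std::uint32_t mask = (a < b) ? 0xFFFFFFFFu : 0u;

  // andps-style bitwise AND of the mask with c's bit pattern.
  std::uint32_t cbits;
  std::memcpy(&cbits, &c, sizeof cbits);
  std::uint32_t rbits = mask & cbits;

  float result;
  std::memcpy(&result, &rbits, sizeof result);
  return result;  // all-zero bits are exactly +0.0f
}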
- Aug 02, 2005
Chris Lattner authored
Patch contributed by Jim Laskey! llvm-svn: 22594
- Jul 30, 2005
Jeff Cohen authored
llvm-svn: 22565
Chris Lattner authored
llvm-svn: 22561
Chris Lattner authored
Read the value stored by fnstcw back with a single 2-byte load instead of 1 byte loads and other operations. This is bad for store-forwarding on common CPUs. We now do this:
        fnstcw WORD PTR [%ESP]
        mov %AX, WORD PTR [%ESP]
instead of:
        fnstcw WORD PTR [%ESP]
        mov %AL, BYTE PTR [%ESP + 1]
llvm-svn: 22559
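
For illustration: the store-forwarding point is that a CPU can forward a store's data directly to a later load only when the load matches the stored width and address; a narrower load from inside the stored bytes stalls. A conceptual C++ sketch, assuming little-endian x86 layout (names are illustrative, not from the patch):

#include <cstdint>

volatile std::uint16_t slot;  // stands in for the stack slot at [%ESP]

std::uint8_t before_pattern(std::uint16_t cw) {
  slot = cw;  // 2-byte store, like fnstcw
  // 1-byte load from inside the 2-byte store's footprint; the width
  // mismatch defeats store-to-load forwarding (byte 1 is the high
  // byte on little-endian, like BYTE PTR [%ESP + 1]).
  return *(reinterpret_cast<volatile std::uint8_t *>(&slot) + 1);
}

std::uint16_t after_pattern(std::uint16_t cw) {
  slot = cw;    // 2-byte store, like fnstcw
  // Full-width reload matches the store, so forwarding succeeds;
  // any byte extraction then happens in a register.
  return slot;  // like mov %AX, WORD PTR [%ESP]
}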
Chris Lattner authored
FP-to-int-in-memory: this exposes the load from the stored slot to the selection dag, allowing it to be folded into other operations. llvm-svn: 22556
Andrew Lenharth authored
llvm-svn: 22553
- Jul 29, 2005
Chris Lattner authored
Move the FP-to-int conversions that the X86 does not support to the legalizer. This allows them to be better optimized, etc., and will help with SSE support. llvm-svn: 22551
Chris Lattner authored
llvm-svn: 22550
Chris Lattner authored
Compile:
        long %test4(double %X) {
                %tmp.1 = cast double %X to long   ; <long> [#uses=1]
                ret long %tmp.1
        }
to this:
_test4:
        sub %ESP, 12
        fld QWORD PTR [%ESP + 16]
        fistp QWORD PTR [%ESP]
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %EAX, DWORD PTR [%ESP]
        add %ESP, 12
        ret
instead of this:
_test4:
        sub %ESP, 28
        fld QWORD PTR [%ESP + 32]
        fstp QWORD PTR [%ESP]
        call ___fixdfdi
        add %ESP, 28
        ret
llvm-svn: 22549
- Jul 27, 2005
Jeff Cohen authored
llvm-svn: 22523
Jeff Cohen authored
llvm-svn: 22520
- Jul 22, 2005
Andrew Lenharth authored
llvm-svn: 22498
- Jul 19, 2005
Reid Spencer authored
This is the first incremental patch to implement this feature. It adds no functionality to LLVM but sets up the information needed from targets in order to implement the optimization correctly. Each target needs to specify the maximum number of store operations for conversion of the llvm.memset, llvm.memcpy, and llvm.memmove intrinsics into a sequence of store operations. The limit needs to be chosen at the threshold of performance for such an optimization (generally smallish). The target also needs to specify whether the target can support unaligned stores for multi-byte store operations. This helps ensure the optimization doesn't generate code that will trap on alignment errors. More patches to follow. llvm-svn: 22468
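
For illustration, a rough C++ sketch of the kind of per-target information this describes; the field names are hypothetical stand-ins, not the identifiers from the patch:

// Per-target limits consulted when expanding llvm.memset/memcpy/memmove
// into a sequence of stores instead of a library call.
struct TargetMemOpInfo {
  // An expanded intrinsic may emit at most this many stores before the
  // backend falls back to a library call; chosen at the performance
  // threshold, so generally smallish.
  unsigned MaxStoresPerMemset  = 8;
  unsigned MaxStoresPerMemcpy  = 8;
  unsigned MaxStoresPerMemmove = 8;

  // Whether multi-byte stores may be emitted at unaligned addresses
  // without trapping; if false, the expansion must keep stores aligned.
  bool SupportsUnalignedStores = false;
};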
- Jul 16, 2005
Nate Begeman authored
the target natively supports. This eliminates some special-case code from the x86 backend and generates better code as well. For an i8 to f64 conversion, before & after:

_x87 before:
        subl $2, %esp
        movb 6(%esp), %al
        movsbw %al, %ax
        movw %ax, (%esp)
        filds (%esp)
        addl $2, %esp
        ret
_x87 after:
        subl $2, %esp
        movsbw 6(%esp), %ax
        movw %ax, (%esp)
        filds (%esp)
        addl $2, %esp
        ret
_sse before:
        subl $12, %esp
        movb 16(%esp), %al
        movsbl %al, %eax
        cvtsi2sd %eax, %xmm0
        addl $12, %esp
        ret
_sse after:
        subl $12, %esp
        movsbl 16(%esp), %eax
        cvtsi2sd %eax, %xmm0
        addl $12, %esp
        ret
llvm-svn: 22452
Nate Begeman authored
llvm-svn: 22451
Nate Begeman authored
llvm-svn: 22450
Chris Lattner authored
legalizer to eliminate them. With this comes the expected code quality improvements, such as, for this:

double foo(unsigned short X) { return X; }

we now generate this:

_foo:
        subl $4, %esp
        movzwl 8(%esp), %eax
        movl %eax, (%esp)
        fildl (%esp)
        addl $4, %esp
        ret

instead of this:

_foo:
        subl $4, %esp
        movw 8(%esp), %ax
        movzwl %ax, %eax        ;; Load not folded into this.
        movl %eax, (%esp)
        fildl (%esp)
        addl $4, %esp
        ret

-Chris
llvm-svn: 22449
- Jul 15, 2005
Nate Begeman authored
working, and Olden/power. llvm-svn: 22441
Nate Begeman authored
llvm-svn: 22440
- Jul 12, 2005
Nate Begeman authored
working before modifying the asm printer to use the subtarget info. llvm-svn: 22408
Nate Begeman authored
to the constructor. llvm-svn: 22392
Chris Lattner authored
llvm-svn: 22391
Chris Lattner authored
llvm-svn: 22390
Nate Begeman authored
Implement the X86 Subtarget. This consolidates the checks on the target triple, and the setting of options based on it, into one place. This allows us to convert the asm printer and isel from being littered with "forDarwin", "forCygwin", etc. to having an appropriate flag for each subtarget feature, controlling the code for that feature. This patch also implements indirect external and weak references in the X86 pattern isel, for Darwin. Next up is to convert the asm printers over to use this new interface.
llvm-svn: 22389
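
For illustration, a hedged C++ sketch of the shape this takes (hypothetical names, not the actual X86Subtarget code): the triple is inspected once, and everything downstream queries boolean feature accessors instead of re-parsing strings:

#include <string>

// Parse the target triple once; the asm printer and isel then ask these
// accessors instead of testing "forDarwin", "forCygwin", etc. directly.
class SubtargetSketch {
  bool IsDarwin, IsCygwin;
public:
  explicit SubtargetSketch(const std::string &Triple)
      : IsDarwin(Triple.find("darwin") != std::string::npos),
        IsCygwin(Triple.find("cygwin") != std::string::npos) {}

  bool isTargetDarwin() const { return IsDarwin; }
  bool isTargetCygwin() const { return IsCygwin; }
};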
Nate Begeman authored
llvm-svn: 22388
- Jul 11, 2005
Chris Lattner authored
llvm-svn: 22381
Chris Lattner authored
llvm-svn: 22380
Chris Lattner authored
after itself. llvm-svn: 22376
Chris Lattner authored
llvm-svn: 22372
- Jul 10, 2005
Chris Lattner authored
This is the last MVTSDNode. This allows us to eliminate a bunch of special case code for handling MVTSDNodes. Also, remove some uses of dyn_cast that should really be cast (which is cheaper in a release build). llvm-svn: 22368
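
For the dyn_cast-vs-cast point, a small sketch using LLVM's casting utilities (modern header paths; the function is illustrative): dyn_cast<> pays for a runtime type check and a null test, while cast<> only asserts in debug builds and compiles to a plain pointer conversion in release builds:

#include "llvm/IR/Instruction.h"
#include "llvm/IR/Value.h"
#include "llvm/Support/Casting.h"
using namespace llvm;

void visit(Value *V) {
  // dyn_cast<> is for when the type is genuinely in question: it does a
  // runtime check and yields null on failure.
  if (auto *I = dyn_cast<Instruction>(V)) {
    (void)I; // ... instruction-only handling ...
  }
  // cast<> is for when the type is already guaranteed: the check that
  // dyn_cast would perform is redundant and costs cycles in a release
  // build, where cast<> is free.
  Instruction *Known = cast<Instruction>(V);
  (void)Known;
}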
Chris Lattner authored
llvm-svn: 22366
- Jul 08, 2005
Nate Begeman authored
Add support for running bugpoint on Mac OS X for Intel. llvm-svn: 22351
- Jul 07, 2005
Chris Lattner authored
This fixes the regressions from last night. llvm-svn: 22344
Nate Begeman authored
llvm-svn: 22341
- Jul 06, 2005
Nate Begeman authored
XMM registers. There are many known deficiencies and fixmes, which will be addressed ASAP. The major benefit of this work is that it will allow the LLVM register allocator to allocate FP registers across basic blocks.

The x86 backend will still default to x87 style FP. To enable this work, you must pass -enable-sse-scalar-fp and either -sse2 or -sse3 to llc. An example before and after would be for:

double foo(double *P) {
        double Sum = 0;
        int i;
        for (i = 0; i < 1000; ++i)
                Sum += P[i];
        return Sum;
}

The inner loop looks like the following:

x87:
.LBB_foo_1:     # no_exit
        fldl (%esp)
        faddl (%eax,%ecx,8)
        fstpl (%esp)
        incl %ecx
        cmpl $1000, %ecx
        #FP_REG_KILL
        jne .LBB_foo_1  # no_exit

SSE2:
        addsd (%eax,%ecx,8), %xmm0
        incl %ecx
        cmpl $1000, %ecx
        #FP_REG_KILL
        jne .LBB_foo_1  # no_exit

llvm-svn: 22340
- Jul 05, 2005
Chris Lattner authored
1. Pass Value*'s into lowering methods so that the proper pointers can be added to load/stores from the valist.
2. Intrinsics that return void should only return a token chain, not a token chain/retval pair.
3. Rename LowerVAArgNext -> LowerVAArg, because VANext is long gone.
4. Now that we have Value*'s available in the lowering methods, pass them into any load/stores from the valist that are emitted.
llvm-svn: 22339
Chris Lattner authored
llvm-svn: 22336
- Jul 03, 2005
Chris Lattner authored
llvm-svn: 22330
- Jul 02, 2005
Nate Begeman authored
llvm-svn: 22327