- Aug 04, 2005
Nate Begeman authored
Scalar SSE: a < b ? c : 0.0 -> cmpss, andps
Scalar SSE: float -> i16 needs to be promoted
llvm-svn: 22637
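
For illustration: the cmpss/andps pattern works because an SSE scalar compare produces an all-ones or all-zeros bit mask rather than a boolean, and ANDing that mask with c yields either c or +0.0. A minimal C++ sketch of the idea (the function name is illustrative, not from the patch):

#include <cstdint>
#include <cstring>

// a < b ? c : 0.0f, computed branch-free in the cmpss/andps style.
float select_or_zero(float a, float b, float c) {
  // cmpss-style mask: all ones when a < b, all zeros otherwise.
  std::uint32_t mask = (a < b) ? 0xFFFFFFFFu : 0u;

  // andps-style bitwise AND of the mask with c's bit pattern.
  std::uint32_t cbits;
  std::memcpy(&cbits, &c, sizeof cbits);
  std::uint32_t rbits = mask & cbits;

  float result;
  std::memcpy(&result, &rbits, sizeof result);
  return result;  // all-zero bits are exactly +0.0f
}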
- Aug 02, 2005
Chris Lattner authored
Patch contributed by Jim Laskey! llvm-svn: 22594
- Jul 30, 2005
Jeff Cohen authored
llvm-svn: 22565
Chris Lattner authored
llvm-svn: 22561
Chris Lattner authored
Read the value stored by fnstcw back with a single 2-byte load instead of 1 byte loads and other operations. This is bad for store-forwarding on common CPUs. We now do this:
        fnstcw WORD PTR [%ESP]
        mov %AX, WORD PTR [%ESP]
instead of:
        fnstcw WORD PTR [%ESP]
        mov %AL, BYTE PTR [%ESP + 1]
llvm-svn: 22559
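
For illustration: the store-forwarding point is that a CPU can forward a store's data directly to a later load only when the load matches the stored width and address; a narrower load from inside the stored bytes stalls. A conceptual C++ sketch, assuming little-endian x86 layout (names are illustrative, not from the patch):

#include <cstdint>

volatile std::uint16_t slot;  // stands in for the stack slot at [%ESP]

std::uint8_t before_pattern(std::uint16_t cw) {
  slot = cw;  // 2-byte store, like fnstcw
  // 1-byte load from inside the 2-byte store's footprint; the width
  // mismatch defeats store-to-load forwarding (byte 1 is the high
  // byte on little-endian, like BYTE PTR [%ESP + 1]).
  return *(reinterpret_cast<volatile std::uint8_t *>(&slot) + 1);
}

std::uint16_t after_pattern(std::uint16_t cw) {
  slot = cw;    // 2-byte store, like fnstcw
  // Full-width reload matches the store, so forwarding succeeds;
  // any byte extraction then happens in a register.
  return slot;  // like mov %AX, WORD PTR [%ESP]
}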
Chris Lattner authored
FP-to-int-in-memory: this exposes the load from the stored slot to the selection dag, allowing it to be folded into other operations. llvm-svn: 22556
Andrew Lenharth authored
llvm-svn: 22553
- Jul 29, 2005
Chris Lattner authored
Move the FP-to-int conversions that the X86 does not support to the legalizer. This allows them to be better optimized, etc., and will help with SSE support. llvm-svn: 22551
Chris Lattner authored
llvm-svn: 22550
Chris Lattner authored
Compile:
        long %test4(double %X) {
                %tmp.1 = cast double %X to long   ; <long> [#uses=1]
                ret long %tmp.1
        }
to this:
_test4:
        sub %ESP, 12
        fld QWORD PTR [%ESP + 16]
        fistp QWORD PTR [%ESP]
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %EAX, DWORD PTR [%ESP]
        add %ESP, 12
        ret
instead of this:
_test4:
        sub %ESP, 28
        fld QWORD PTR [%ESP + 32]
        fstp QWORD PTR [%ESP]
        call ___fixdfdi
        add %ESP, 28
        ret
llvm-svn: 22549
- Jul 27, 2005
Jeff Cohen authored
llvm-svn: 22523
Jeff Cohen authored
llvm-svn: 22520
- Jul 22, 2005
Andrew Lenharth authored
llvm-svn: 22498
- Jul 19, 2005
Reid Spencer authored
This is the first incremental patch to implement this feature. It adds no functionality to LLVM but sets up the information needed from targets in order to implement the optimization correctly. Each target needs to specify the maximum number of store operations for conversion of the llvm.memset, llvm.memcpy, and llvm.memmove intrinsics into a sequence of store operations. The limit needs to be chosen at the threshold of performance for such an optimization (generally smallish). The target also needs to specify whether the target can support unaligned stores for multi-byte store operations. This helps ensure the optimization doesn't generate code that will trap on alignment errors. More patches to follow. llvm-svn: 22468
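
For illustration, a rough C++ sketch of the kind of per-target information this describes; the field names are hypothetical stand-ins, not the identifiers from the patch:

// Per-target limits consulted when expanding llvm.memset/memcpy/memmove
// into a sequence of stores instead of a library call.
struct TargetMemOpInfo {
  // An expanded intrinsic may emit at most this many stores before the
  // backend falls back to a library call; chosen at the performance
  // threshold, so generally smallish.
  unsigned MaxStoresPerMemset  = 8;
  unsigned MaxStoresPerMemcpy  = 8;
  unsigned MaxStoresPerMemmove = 8;

  // Whether multi-byte stores may be emitted at unaligned addresses
  // without trapping; if false, the expansion must keep stores aligned.
  bool SupportsUnalignedStores = false;
};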
- Jul 16, 2005
Nate Begeman authored
the target natively supports. This eliminates some special-case code from the x86 backend and generates better code as well. For an i8 to f64 conversion, before & after:

_x87 before:
        subl $2, %esp
        movb 6(%esp), %al
        movsbw %al, %ax
        movw %ax, (%esp)
        filds (%esp)
        addl $2, %esp
        ret
_x87 after:
        subl $2, %esp
        movsbw 6(%esp), %ax
        movw %ax, (%esp)
        filds (%esp)
        addl $2, %esp
        ret
_sse before:
        subl $12, %esp
        movb 16(%esp), %al
        movsbl %al, %eax
        cvtsi2sd %eax, %xmm0
        addl $12, %esp
        ret
_sse after:
        subl $12, %esp
        movsbl 16(%esp), %eax
        cvtsi2sd %eax, %xmm0
        addl $12, %esp
        ret
llvm-svn: 22452
Nate Begeman authored
llvm-svn: 22451
Nate Begeman authored
llvm-svn: 22450
Chris Lattner authored
legalizer to eliminate them. With this comes the expected code quality improvements, such as, for this:

double foo(unsigned short X) { return X; }

we now generate this:

_foo:
        subl $4, %esp
        movzwl 8(%esp), %eax
        movl %eax, (%esp)
        fildl (%esp)
        addl $4, %esp
        ret

instead of this:

_foo:
        subl $4, %esp
        movw 8(%esp), %ax
        movzwl %ax, %eax        ;; Load not folded into this.
        movl %eax, (%esp)
        fildl (%esp)
        addl $4, %esp
        ret

-Chris
llvm-svn: 22449
- Jul 15, 2005
Nate Begeman authored
working, and Olden/power. llvm-svn: 22441
Nate Begeman authored
llvm-svn: 22440
- Jul 12, 2005
Nate Begeman authored
working before modifying the asm printer to use the subtarget info. llvm-svn: 22408
Nate Begeman authored
to the constructor. llvm-svn: 22392
Chris Lattner authored
llvm-svn: 22391
Chris Lattner authored
llvm-svn: 22390
Nate Begeman authored
Implement the X86 Subtarget. This consolidates the checks on the target triple, and the setting of options based on it, into one place. This allows us to convert the asm printer and isel from being littered with "forDarwin", "forCygwin", etc. to having an appropriate flag for each subtarget feature, controlling the code for that feature. This patch also implements indirect external and weak references in the X86 pattern isel, for Darwin. Next up is to convert the asm printers over to use this new interface.
llvm-svn: 22389
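
For illustration, a hedged C++ sketch of the shape this takes (hypothetical names, not the actual X86Subtarget code): the triple is inspected once, and everything downstream queries boolean feature accessors instead of re-parsing strings:

#include <string>

// Parse the target triple once; the asm printer and isel then ask these
// accessors instead of testing "forDarwin", "forCygwin", etc. directly.
class SubtargetSketch {
  bool IsDarwin, IsCygwin;
public:
  explicit SubtargetSketch(const std::string &Triple)
      : IsDarwin(Triple.find("darwin") != std::string::npos),
        IsCygwin(Triple.find("cygwin") != std::string::npos) {}

  bool isTargetDarwin() const { return IsDarwin; }
  bool isTargetCygwin() const { return IsCygwin; }
};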
Nate Begeman authored
llvm-svn: 22388
- Jul 11, 2005
Chris Lattner authored
llvm-svn: 22381
Chris Lattner authored
llvm-svn: 22380
Chris Lattner authored
after itself. llvm-svn: 22376
Chris Lattner authored
llvm-svn: 22372
- Jul 10, 2005
Chris Lattner authored
This is the last MVTSDNode. This allows us to eliminate a bunch of special case code for handling MVTSDNodes. Also, remove some uses of dyn_cast that should really be cast (which is cheaper in a release build). llvm-svn: 22368
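
For the dyn_cast-vs-cast point, a small sketch using LLVM's casting utilities (modern header paths; the function is illustrative): dyn_cast<> pays for a runtime type check and a null test, while cast<> only asserts in debug builds and compiles to a plain pointer conversion in release builds:

#include "llvm/IR/Instruction.h"
#include "llvm/IR/Value.h"
#include "llvm/Support/Casting.h"
using namespace llvm;

void visit(Value *V) {
  // dyn_cast<> is for when the type is genuinely in question: it does a
  // runtime check and yields null on failure.
  if (auto *I = dyn_cast<Instruction>(V)) {
    (void)I; // ... instruction-only handling ...
  }
  // cast<> is for when the type is already guaranteed: the check that
  // dyn_cast would perform is redundant and costs cycles in a release
  // build, where cast<> is free.
  Instruction *Known = cast<Instruction>(V);
  (void)Known;
}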
Chris Lattner authored
llvm-svn: 22366
- Jul 08, 2005
Nate Begeman authored
Add support for running bugpoint on Mac OS X for Intel. llvm-svn: 22351
- Jul 07, 2005
Chris Lattner authored
This fixes the regressions from last night. llvm-svn: 22344
Nate Begeman authored
llvm-svn: 22341
- Jul 06, 2005
Nate Begeman authored
XMM registers. There are many known deficiencies and fixmes, which will be addressed ASAP. The major benefit of this work is that it will allow the LLVM register allocator to allocate FP registers across basic blocks.

The x86 backend will still default to x87 style FP. To enable this work, you must pass -enable-sse-scalar-fp and either -sse2 or -sse3 to llc. An example before and after would be for:

double foo(double *P) {
        double Sum = 0;
        int i;
        for (i = 0; i < 1000; ++i)
                Sum += P[i];
        return Sum;
}

The inner loop looks like the following:

x87:
.LBB_foo_1:     # no_exit
        fldl (%esp)
        faddl (%eax,%ecx,8)
        fstpl (%esp)
        incl %ecx
        cmpl $1000, %ecx
        #FP_REG_KILL
        jne .LBB_foo_1  # no_exit

SSE2:
        addsd (%eax,%ecx,8), %xmm0
        incl %ecx
        cmpl $1000, %ecx
        #FP_REG_KILL
        jne .LBB_foo_1  # no_exit

llvm-svn: 22340
- Jul 05, 2005
Chris Lattner authored
1. Pass Value*'s into lowering methods so that the proper pointers can be added to load/stores from the valist.
2. Intrinsics that return void should only return a token chain, not a token chain/retval pair.
3. Rename LowerVAArgNext -> LowerVAArg, because VANext is long gone.
4. Now that we have Value*'s available in the lowering methods, pass them into any load/stores from the valist that are emitted.
llvm-svn: 22339
Chris Lattner authored
llvm-svn: 22336
- Jul 03, 2005
Chris Lattner authored
llvm-svn: 22330
- Jul 02, 2005
Nate Begeman authored
llvm-svn: 22327