Commits · 0f45b6fda5988719f55728cbfc70bb553c5733dd · Roger Ferrer / llvm-epi-0.8

May 13, 2008

Instead of a vector load, shuffle and then extract an element. Load the... · 1120279a

Evan Cheng authored May 13, 2008

Instead of a vector load, shuffle, and element extract, load the element directly from the address plus an offset.
        pshufd $1, (%rdi), %xmm0
        movd %xmm0, %eax
=>
        movl 4(%rdi), %eax

llvm-svn: 51026

1120279a
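
The rewrite in 1120279a can be checked with a small behavioural model. The sketch below is not LLVM code: the function names are hypothetical, and it assumes little-endian 32-bit lanes and the usual `pshufd` immediate encoding (result lane i takes source lane `(imm8 >> 2*i) & 3`).

```python
import struct

def pshufd(imm8, vec):
    """Model pshufd: result lane i takes source lane (imm8 >> 2*i) & 3."""
    return [vec[(imm8 >> (2 * i)) & 3] for i in range(4)]

def shuffle_then_extract(mem, addr, imm8):
    # pshufd $imm8, (addr), %xmm0 ; movd %xmm0, %eax
    vec = list(struct.unpack_from("<4I", mem, addr))  # 16-byte vector load
    return pshufd(imm8, vec)[0]                       # movd reads lane 0

def scalar_load(mem, addr, imm8):
    # movl 4*lane(addr), %eax, where lane is the source lane selected for lane 0
    lane = imm8 & 3
    return struct.unpack_from("<I", mem, addr + 4 * lane)[0]

mem = struct.pack("<4I", 0x11111111, 0x22222222, 0x33333333, 0x44444444)
```

With `imm8 = 1` both functions read the 32-bit value at byte offset 4, which matches the `movl 4(%rdi), %eax` form in the commit message.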

Xform bitconvert(build_pair(load a, load b)) to a single load if the load... · b980f6fb

Evan Cheng authored May 12, 2008

Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other.

llvm-svn: 51008

b980f6fb
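
As an illustration of why b980f6fb is safe, the sketch below models the two forms on a byte buffer. It is a hypothetical model, not the DAG combiner code: it assumes a little-endian target, two i32 loads, and an f64 result type, and all helper names are made up for the example.

```python
import struct

def load_i32(mem, addr):
    return struct.unpack_from("<I", mem, addr)[0]

def build_pair(lo, hi):
    """Combine two 32-bit halves into one 64-bit integer (lo is the low half)."""
    return (hi << 32) | lo

def bitconvert_i64_to_f64(bits):
    """Reinterpret a 64-bit integer bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def two_loads(mem, a):
    # bitconvert(build_pair(load a, load a+4))
    return bitconvert_i64_to_f64(build_pair(load_i32(mem, a), load_i32(mem, a + 4)))

def single_load(mem, a):
    # One 64-bit load from a; valid only because the second load sits at a+4.
    return struct.unpack_from("<d", mem, a)[0]

mem = struct.pack("<d", 3.5) + struct.pack("<d", -0.25)
```

The two functions agree only when the second load is exactly one element past the first, which is the "right offset" condition the commit message refers to.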

May 12, 2008
- Initial X86 codegen support for VSETCC. · d875c3e2
  Nate Begeman authored May 12, 2008
```
llvm-svn: 51000
```
  d875c3e2
- Refactor isConsecutiveLoad from X86 to TargetLowering so DAG combiner can make use of it. · 2609d5e7
  Evan Cheng authored May 12, 2008
```
llvm-svn: 50991
```
  2609d5e7
- Fix a compile error on compilers that still want a return value · 906716c4
  Dan Gohman authored May 12, 2008
```
in a non-void function that calls abort.

llvm-svn: 50969
```
  906716c4
May 10, 2008
- When transforming a vector_shuffle to a load, the base address must not be an undef. · 71b9afb0
  Evan Cheng authored May 10, 2008
```
llvm-svn: 50940
```
  71b9afb0
- For now, abort when an ISD::VAARG is encountered on x86-64, rather · 3c0e11af
  Dan Gohman authored May 10, 2008
```
than silently generate invalid code.

llvm-gcc does not currently use VAArgInst; it lowers va_arg in the
front-end.

llvm-svn: 50930
```
  3c0e11af
- If movl top bits are undef, let it be selected to movlps, etc. · bb48d55a
  Evan Cheng authored May 10, 2008
```
llvm-svn: 50928
```
  bb48d55a
May 09, 2008

Handle a few more cases of folding load i64 into xmm and zero top bits. · 961339bb

Evan Cheng authored May 09, 2008

Note: some of the code will be moved into the target-independent part of the DAG combiner in a subsequent patch.

llvm-svn: 50918

961339bb

May 08, 2008

Handle vector move / load which zero the destination register top bits (i.e.... · 78af38c3

Evan Cheng authored May 08, 2008

Handle vector moves / loads which zero the top bits of the destination register (i.e. movd, movq, movss (addr), movsd (addr)) with an X86-specific DAG combine.

llvm-svn: 50838

78af38c3
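
The property that 78af38c3 relies on can be sketched as follows: a `movd` or `movq` load already leaves the upper bits of the XMM register zeroed, so a separate "insert scalar into a zero vector" step is redundant. This is a hypothetical model with made-up helper names; a 128-bit register is represented as 16 little-endian bytes.

```python
def movd_load(mem, addr):
    """32-bit load into the low lane; the upper 96 bits are zeroed."""
    return mem[addr:addr + 4] + bytes(12)

def movq_load(mem, addr):
    """64-bit load into the low half; the upper 64 bits are zeroed."""
    return mem[addr:addr + 8] + bytes(8)

def load_then_zero_extend(mem, addr, width):
    # Unfused form: scalar load, then an explicit zero-extended vector build.
    scalar = int.from_bytes(mem[addr:addr + width], "little")
    return scalar.to_bytes(16, "little")

mem = bytes(range(1, 17))
```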

May 06, 2008
- Improved generated code for atomic operators · 310a38d5
  Mon P Wang authored May 05, 2008
```
llvm-svn: 50677
```
  310a38d5
- Code clean up. No functionality change. · dbfcce37
  Evan Cheng authored May 05, 2008
```
llvm-svn: 50675
```
  dbfcce37
May 05, 2008
- Added additional atomic intrinsics: and, or, xor, min, and max. · 3e58393c
  Mon P Wang authored May 05, 2008
```
llvm-svn: 50663
```
  3e58393c
May 04, 2008
- Add General Dynamic TLS model for X86-64. Some parts look really ugly (look for tlsaddr pattern), · 9205c856
  Anton Korobeynikov authored May 04, 2008
```
but should work. Work is in progress; more models will follow.

llvm-svn: 50630
```
  9205c856
- Select vector shift with non-immediate i32 shift amount operand by first... · d9481366
  Evan Cheng authored May 04, 2008
```
Select vector shift with non-immediate i32 shift amount operand by first moving the operand into the right register.

llvm-svn: 50619
```
  d9481366
Apr 30, 2008

Tail call optimization improvements: · be0de34e

Arnold Schwaighofer authored Apr 30, 2008

Move platform independent code (lowering of possibly overwritten
arguments, check for tail call optimization eligibility) from
target X86ISelectionLowering.cpp to TargetLowering.h and
SelectionDAGISel.cpp.

Initial PowerPC tail call implementation:

ppc32 support is implemented and tested (passes my tests and
test-suite llvm-test).
ppc64 support is implemented and half tested (passes my tests).
On ppc, tail call optimization is performed if:
 * caller and callee are fastcc
 * call is a tail call (in tail call position, call followed by ret)
 * no variable argument lists or byval arguments
 * option -tailcallopt is enabled
Supported:
 * non pic tail calls on linux/darwin
 * module-local tail calls on linux(PIC/GOT)/darwin(PIC)
 * inter-module tail calls on darwin(PIC)
If constraints are not met a normal call will be emitted.

A test checking the argument lowering behaviour on x86-64 was added.

llvm-svn: 50477

be0de34e
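
The ppc eligibility conditions listed in be0de34e can be read as a single predicate. The sketch below only restates that list; the `CallSite` type and its field names are hypothetical and do not correspond to LLVM data structures.

```python
from dataclasses import dataclass

@dataclass
class CallSite:
    caller_is_fastcc: bool
    callee_is_fastcc: bool
    in_tail_position: bool   # call immediately followed by ret
    is_vararg: bool
    has_byval_args: bool

def eligible_for_ppc_tail_call(call, tailcallopt_enabled):
    """True only if every condition from the commit message holds."""
    return (tailcallopt_enabled
            and call.caller_is_fastcc
            and call.callee_is_fastcc
            and call.in_tail_position
            and not call.is_vararg
            and not call.has_byval_args)
```

If the predicate is false, the commit message says a normal call is emitted instead.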

Apr 28, 2008
- Fix the SVOffset values for loads and stores produced by · da440548
  Dan Gohman authored Apr 28, 2008
```
memcpy/memset expansion. It was a bug for the SVOffset value
to be used in the actual address calculations.

llvm-svn: 50359
```
  da440548
- Properly lower vararg's FORMAL_ARGUMENTS node on win64 · e183b3cd
  Anton Korobeynikov authored Apr 27, 2008
```
llvm-svn: 50325
```
  e183b3cd
Apr 27, 2008

A few inline asm cleanups: · 724539c0

Chris Lattner authored Apr 26, 2008

  - Make targetlowering.h fit in 80 cols.
  - Make LowerAsmOperandForConstraint const.
  - Rename lowerXConstraint -> LowerXConstraint
  - Make LowerXConstraint return a const char* instead of taking a string byref.

llvm-svn: 50312

724539c0

Apr 25, 2008

Extract the lower 64 bits if an MMX value is passed in an XMM register. · 1e78184a
Evan Cheng authored Apr 25, 2008
```
llvm-svn: 50292
```
1e78184a
Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers. · ccde6dd0
Evan Cheng authored Apr 25, 2008
```
llvm-svn: 50289
```
ccde6dd0

MMX argument passing fixes: · df38b35a

Evan Cheng authored Apr 25, 2008

On Darwin / Linux x86-32, v8i8, v4i16, v2i32 values are passed in MM[0-2].
On Darwin / Linux x86-32, v1i64 values are passed in memory.
On Darwin x86-64, v8i8, v4i16, v2i32 values are passed in XMM[0-7].
On Darwin x86-64, v1i64 values are passed in 64-bit GPRs.

llvm-svn: 50257

df38b35a
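
The four rules in df38b35a amount to a small lookup table. The sketch below encodes exactly those rules and nothing more; the table and function names are hypothetical, and combinations the commit message does not mention (for example Linux x86-64) deliberately return `None`.

```python
_SMALL_MMX_TYPES = ("v8i8", "v4i16", "v2i32")

MMX_ARG_LOCATION = {}
for os_name in ("darwin", "linux"):
    for vt in _SMALL_MMX_TYPES:
        MMX_ARG_LOCATION[(os_name, "x86-32", vt)] = "MM[0-2]"
    MMX_ARG_LOCATION[(os_name, "x86-32", "v1i64")] = "memory"
for vt in _SMALL_MMX_TYPES:
    MMX_ARG_LOCATION[("darwin", "x86-64", vt)] = "XMM[0-7]"
MMX_ARG_LOCATION[("darwin", "x86-64", "v1i64")] = "GPR64"

def mmx_arg_location(os_name, arch, vtype):
    """Return where an MMX-typed argument is passed, or None if not stated."""
    return MMX_ARG_LOCATION.get((os_name, arch, vtype))
```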

Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not... · 9165e165

Evan Cheng authored Apr 25, 2008

Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not handled by rep instructions, a new memcpy / memset is introduced for them. However, since source / destination addresses are already adjusted, their offsets should be zero.

llvm-svn: 50239

9165e165
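
The fix in 9165e165 can be illustrated with a simplified model of the lowering. It assumes the bulk copy behaves like `rep movs` and leaves the source and destination addresses advanced past the copied units; the function names and the `bytearray` memory model are hypothetical.

```python
def rep_movs(mem, dst, src, count, unit):
    """Copy count units; return (dst, src) advanced past the copied bytes."""
    n = count * unit
    mem[dst:dst + n] = mem[src:src + n]
    return dst + n, src + n

def lowered_memcpy(mem, dst, src, size, unit=4):
    # Bulk copy; non-overlapping source and destination are assumed.
    dst, src = rep_movs(mem, dst, src, size // unit, unit)
    tail = size % unit
    # dst and src already point past the bulk copy, so the trailing
    # copy starts at offset 0 from them, not at offset size - tail.
    mem[dst:dst + tail] = mem[src:src + tail]

mem = bytearray(range(11)) + bytearray(11)
lowered_memcpy(mem, 11, 0, 11)
```

Adding the bulk size to the already adjusted addresses a second time would make the trailing copy read and write past the intended region, which is the bug the commit message describes.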

Apr 22, 2008

Implement an x86-64 ABI detail of passing structs by hidden first · f166d2d0

Dan Gohman authored Apr 21, 2008

argument. The x86-64 ABI requires the incoming value of %rdi to
be copied to %rax on exit from a function that is returning a
large C struct.

Also, add a README-X86-64 entry detailing the missed optimization
opportunity and proposing an alternative approach.

llvm-svn: 50075

f166d2d0

Apr 20, 2008
- Switch to using Simplified ConstantFP::get API. · 3b18762f
  Chris Lattner authored Apr 20, 2008
```
llvm-svn: 49977
```
  3b18762f
Apr 18, 2008
- Fix the handling of va_copy on x86-64. As of llvm-gcc r49920 · ad4071a9
  Dan Gohman authored Apr 18, 2008
```
llvm-gcc is now lowering va_copy on x86-64, so this completes
the fix for PR2230.

llvm-svn: 49922
```
  ad4071a9
Apr 16, 2008

Ongoing work on improving the instruction selection infrastructure: · a3ee1a38

Roman Levenstein authored Apr 16, 2008

Rename SDOperandImpl back to SDOperand.
Introduce the SDUse class that represents a use of the SDNode referred by
an SDOperand. Now it is more similar to Use/Value classes.

Patch is approved by Dan Gohman.

llvm-svn: 49795

a3ee1a38

Add support for the form of the SSE41 extractps instruction that · d43d3bee
Dan Gohman authored Apr 16, 2008
```
puts its result in a 32-bit GPR.

llvm-svn: 49762
```
d43d3bee

Recreate the size SDNode instead of reusing the old one in the x86 · 8c99ccaf

Dan Gohman authored Apr 16, 2008

memcpy lowering code; this ensures that the size node has the desired
result type. This fixes a regression from r49572 with @llvm.memcpy.i64
on x86-32.

llvm-svn: 49761

8c99ccaf

Apr 14, 2008
- Fix const-correctness issues with the SrcValue handling in the · 2505d867
  Dan Gohman authored Apr 14, 2008
```
memory intrinsic expansion code.

llvm-svn: 49666
```
  2505d867
Apr 12, 2008

This patch corrects the handling of byval arguments for tailcall · 634fc9a3

Arnold Schwaighofer authored Apr 12, 2008

optimized x86-64 (and x86) calls so that they work (... at least for
my test cases).

Should fix the following problems:

Problem 1: When I introduced the optimized handling of arguments for
tail called functions (using a sequence of copyto/copyfrom virtual
registers instead of always lowering to top of the stack), I did not
handle byval arguments correctly, i.e. they did not work at all :).

Problem 2: On x86-64 after the arguments of the tail called function
are moved to their registers (which include ESI/RSI etc.), tail call
optimization performs byval lowering, which causes the xSI, xDI, xCX
registers to be overwritten. This is handled in this patch by moving
the arguments to virtual registers first and after the byval lowering
the arguments are moved from those virtual registers back to
RSI/RDI/RCX.

llvm-svn: 49584

634fc9a3
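
Problem 2 in 634fc9a3 is an ordering issue, which the toy model below tries to show. It assumes the byval copy behaves like `rep movs` and therefore overwrites RSI, RDI and RCX; the register file is a plain dict, and every name here is hypothetical.

```python
def byval_copy(regs, mem, dst, src, size):
    """Model a rep;movs-style byval copy, which overwrites RSI, RDI and RCX."""
    regs["RSI"], regs["RDI"], regs["RCX"] = src, dst, size
    mem[dst:dst + size] = mem[src:src + size]
    regs["RSI"] += size
    regs["RDI"] += size
    regs["RCX"] = 0

def naive_lowering(regs, mem, reg_args, byval):
    regs.update(reg_args)            # arguments placed in physical registers
    byval_copy(regs, mem, *byval)    # ...and then overwritten by the copy

def fixed_lowering(regs, mem, reg_args, byval):
    virt = dict(reg_args)            # park arguments in virtual registers
    byval_copy(regs, mem, *byval)    # byval lowering may clobber freely
    regs.update(virt)                # move to physical registers afterwards
```

Here `byval` is a `(dst, src, size)` tuple. In the naive order the argument registers no longer hold the intended values after the copy; in the fixed order they do.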

Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal · 544ab2c5

Dan Gohman authored Apr 12, 2008

on any current target and aren't optimized in DAGCombiner. Instead
of using intermediate nodes, expand the operations, choosing between
simple loads/stores, target-specific code, and library calls,
immediately.

Previously, the code to emit optimized code for these operations
was only used at initial SelectionDAG construction time; now it is
used at all times. This fixes some cases where rep;movs was being
used for small copies where simple loads/stores would be better.

This also cleans up code that checks for alignments less than 4;
let the targets make that decision instead of doing it in
target-independent code. This allows x86 to use rep;movs in
low-alignment cases.

Also, this fixes a bug that resulted in the use of rep;stos for
memsets of 0 with non-constant memory size when the alignment was
at least 4. It's better to use the library in this case, which
can be significantly faster when the size is large.

This also preserves more SourceValue information when memory
intrinsics are lowered into simple loads/stores.

llvm-svn: 49572

544ab2c5

Fix a bug that prevented x86-64 from using rep.movsq for · 8c7cf88f
Dan Gohman authored Apr 12, 2008
```
8-byte-aligned data.

llvm-svn: 49571
```
8c7cf88f

Apr 09, 2008
- Make isVectorClearMaskLegal's operand list const. · 33b33001
  Dan Gohman authored Apr 09, 2008
```
llvm-svn: 49446
```
  33b33001
Apr 07, 2008
- Re-commit of the r48822, where the infinite looping problem discovered · 51f532f9
  Roman Levenstein authored Apr 07, 2008
```
by Dan Gohman is fixed.

llvm-svn: 49330
```
  51f532f9
Apr 05, 2008
- Favors pshufd over shufps when shuffling elements from one vector. pshufd is faster than shufps. · f77b5ef3
  Evan Cheng authored Apr 05, 2008
```
llvm-svn: 49244
```
  f77b5ef3
Apr 03, 2008
- Backing out 48222 temporarily. · 025cea11
  Evan Cheng authored Apr 03, 2008
```
llvm-svn: 49124
```
  025cea11
Apr 01, 2008
- Don't use __bzero for memset if the second argument isn't zero. · cb9f8f6e
  Dan Gohman authored Apr 01, 2008
```
llvm-svn: 49050
```
  cb9f8f6e
- Speculatively micro-optimize memory-zeroing calls on Darwin 10. · 980d7200
  Dan Gohman authored Apr 01, 2008
```
llvm-svn: 49048
```
  980d7200
- Accept 'y' constraint (MMX) in inline asm. · efa81a69
  Dale Johannesen authored Apr 01, 2008
```
llvm-svn: 49011
```
  efa81a69