  Apr 22, 2006
    • Nate Begeman authored · 4ca2ea5b
      JumpTable support! What this represents is working asm and jit support for
      x86 and ppc for 100% dense switch statements when relocations are non-PIC.
      This support will be extended and enhanced in the coming days to support
      PIC, and less dense forms of jump tables.
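      
      A 100% dense switch is one whose case values cover a contiguous range with
      no holes, so it can be lowered to an indexed jump through a table rather
      than a chain of compares and branches. A minimal C sketch of such an input
      (hypothetical function and values, not taken from the commit):
      
      /* Every value in 0..3 has a case, so the switch is 100% dense and a
         jump table can be used once the default-case bounds check is done. */
      int classify(int x) {
        switch (x) {
        case 0: return 10;
        case 1: return 20;
        case 2: return 30;
        case 3: return 40;
        default: return -1;
        }
      }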
      
      llvm-svn: 27947
    • Evan Cheng authored · e728efdf
      Don't do all the lowering stuff for 2-wide build_vectors. Also, a minor optimization for shuffle of undef.
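      
      A 2-wide build_vector builds a vector from two scalar elements. A minimal
      sketch of such a case (hypothetical function, assuming SSE2 intrinsics):
      
      #include <emmintrin.h>
      
      /* _mm_set_pd(b, a) produces the 2-wide vector <a, b>, i.e. a
         2-element build_vector node. */
      __m128d make_pair(double a, double b) {
        return _mm_set_pd(b, a);
      }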
      
      llvm-svn: 27946
    • Evan Cheng authored · 16ef94f4
      Fix a performance regression. Use {p}shuf* when there are only two distinct elements in a build_vector.
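      
      A minimal sketch of a build_vector whose lanes use only two distinct
      values, the case this change lowers with a {p}shuf* shuffle
      (hypothetical function, assuming SSE2 intrinsics):
      
      #include <emmintrin.h>
      
      /* Only two distinct values, a and b, appear across the four lanes,
         so the vector can be formed with a shuffle instead of building
         each element individually. */
      __m128i two_distinct(int a, int b) {
        return _mm_set_epi32(b, a, b, a);
      }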
      
      llvm-svn: 27945
    • Evan Cheng authored · 14215c36
      Revamp build_vector lowering to take advantage of movss and movd instructions.
      movd always clears the top 96 bits, and movss does so when it is loading the
      value from memory.
      The net result is that codegen for 4-wide shuffles is much improved. It is near
      optimal if one or more of the elements are zero. e.g.
      
      __m128i test(int a, int b) {
        return _mm_set_epi32(0, 0, b, a);
      }
      
      compiles to
      
      _test:
      	movd 8(%esp), %xmm1
      	movd 4(%esp), %xmm0
      	punpckldq %xmm1, %xmm0
      	ret
      
      compare to gcc:
      
      _test:
      	subl	$12, %esp
      	movd	20(%esp), %xmm0
      	movd	16(%esp), %xmm1
      	punpckldq	%xmm0, %xmm1
      	movq	%xmm1, %xmm0
      	movhps	LC0, %xmm0
      	addl	$12, %esp
      	ret
      
      or icc:
      
      _test:
              movd      4(%esp), %xmm0                                #5.10
              movd      8(%esp), %xmm3                                #5.10
              xorl      %eax, %eax                                    #5.10
              movd      %eax, %xmm1                                   #5.10
              punpckldq %xmm1, %xmm0                                  #5.10
              movd      %eax, %xmm2                                   #5.10
              punpckldq %xmm2, %xmm3                                  #5.10
              punpckldq %xmm3, %xmm0                                  #5.10
              ret                                                     #5.10
      
      There is still room for improvement, for example in the FP variant of the above example:
      
      __m128 test(float a, float b) {
        return _mm_set_ps(0.0, 0.0, b, a);
      }
      
      _test:
      	movss 8(%esp), %xmm1
      	movss 4(%esp), %xmm0
      	unpcklps %xmm1, %xmm0
      	xorps %xmm1, %xmm1
      	movlhps %xmm1, %xmm0
      	ret
      
      The xorps and movlhps are unnecessary. Handling this will require a post-legalizer optimization.
      
      llvm-svn: 27939