Commits · e4e4be3885e206a9733809ae8afaf2697cd740c2 · Roger Ferrer / llvm-epi-0.8

Sep 02, 2010
- Move condition out to prepare for more matching · e4e4be38
  Bruno Cardoso Lopes authored Sep 02, 2010
```
llvm-svn: 112805
```
- Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it · bf7fd146
  Bruno Cardoso Lopes authored Sep 02, 2010
```
llvm-svn: 112804
```
- become more strict about when it's safe to use X86ISD::MOVLPS · 6a7f6344
  Bruno Cardoso Lopes authored Sep 02, 2010
```
llvm-svn: 112799
```
- Revert r112689, avoid those kinds of checks because they interfere with MMX · 04c25c15
  Bruno Cardoso Lopes authored Sep 01, 2010
```
llvm-svn: 112760
```
Sep 01, 2010
- Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching... · b3825216
  Bruno Cardoso Lopes authored Sep 01, 2010
```
Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment

llvm-svn: 112694
```
- minor change, simplify some logic · 6aaebe87
  Bruno Cardoso Lopes authored Sep 01, 2010
```
llvm-svn: 112689
```
- Move some functions around so they can be used by another upcoming function · 2b025707
  Bruno Cardoso Lopes authored Sep 01, 2010
```
llvm-svn: 112687
```
- Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes · 4b56d872
  Bruno Cardoso Lopes authored Aug 31, 2010
```
llvm-svn: 112661
```
- Use x86 specific MOVSHDUP node and add more patterns to match it · 61996ef8
  Bruno Cardoso Lopes authored Aug 31, 2010
```
llvm-svn: 112657
```
Aug 31, 2010
- Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments · 5de15ce4
  Bruno Cardoso Lopes authored Aug 31, 2010
```
llvm-svn: 112644
```
- Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes · 03e4c353
  Bruno Cardoso Lopes authored Aug 31, 2010
```
llvm-svn: 112642
```
- Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the... · dfd9dd5d
  Bruno Cardoso Lopes authored Aug 31, 2010
```
Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles

llvm-svn: 112570
```
Aug 28, 2010

fix the buildvector->insertp[sd] logic to not always create a redundant · 94656b1c

Chris Lattner authored Aug 28, 2010

insertp[sd] $0, which is a noop.  Before:

_f32:                                   ## @f32
	pshufd	$1, %xmm1, %xmm2
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm2, %xmm3
	addss	%xmm1, %xmm0
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
	insertps	$0, %xmm0, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

after:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movdqa	%xmm2, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

The extra movs are due to a random (poor) scheduling decision.

llvm-svn: 112379

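To see why the `insertps $0, %xmm0, %xmm0` in the "before" listing is a no-op, the instruction's immediate can be modelled in a few lines of C++. This is a minimal sketch, not LLVM code: `Vec4` and `insertps` are names invented for illustration, and only the register-to-register form is modelled.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Scalar model of `insertps $imm, src, dst` (register form, AT&T operand order).
// imm[7:6] selects the source lane, imm[5:4] the destination lane,
// and imm[3:0] is a mask of result lanes that are forced to zero.
Vec4 insertps(Vec4 dst, const Vec4 &src, unsigned imm) {
  assert(imm < 256 && "insertps takes an 8-bit immediate");
  unsigned srcLane = (imm >> 6) & 3;
  unsigned dstLane = (imm >> 4) & 3;
  dst[dstLane] = src[srcLane];
  for (unsigned i = 0; i < 4; ++i)
    if (imm & (1u << i))
      dst[i] = 0.0f;
  return dst;
}
```

With immediate 0 and the same register as source and destination, lane 0 is copied onto itself and nothing is zeroed, so the vector comes back unchanged. Immediate `$16` (0x10) writes source lane 0 into destination lane 1, which is the only insert the "after" code keeps.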

fix the BuildVector -> unpcklps logic to not do pointless shuffles · bcb6090a

Chris Lattner authored Aug 28, 2010

when the top elements of a vector are undefined.  This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined.  For example, on:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

We used to produce (with SSE2, SSE4.1+ uses insertps):

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

We now produce:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movaps	%xmm2, %xmm0
	unpcklps	%xmm3, %xmm0
	ret

This implements rdar://8368414

llvm-svn: 112378

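The shorter SSE2 sequence can be checked with a small scalar model of the three instructions involved. The sketch below is illustrative only: `Vec4`, the helper functions and `complexAdd` are hypothetical names, lanes are modelled as floats even though `pshufd` is nominally an integer shuffle, and the upper two result lanes are treated as don't-care, as in the X86-64 ABI case described above.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// pshufd $imm, src, dst: result lane i is src lane (imm >> 2*i) & 3.
Vec4 pshufd(const Vec4 &src, unsigned imm) {
  assert(imm < 256 && "pshufd takes an 8-bit immediate");
  Vec4 result{};
  for (unsigned i = 0; i < 4; ++i)
    result[i] = src[(imm >> (2 * i)) & 3];
  return result;
}

// addss src, dst: adds lane 0 only; the upper lanes of dst pass through.
Vec4 addss(Vec4 dst, const Vec4 &src) {
  dst[0] += src[0];
  return dst;
}

// unpcklps src, dst: interleaves the low two lanes of dst and src.
Vec4 unpcklps(const Vec4 &dst, const Vec4 &src) {
  return {dst[0], src[0], dst[1], src[1]};
}

// Follows the "We now produce" listing: a and b hold {re, im, ?, ?}.
Vec4 complexAdd(const Vec4 &a, const Vec4 &b) {
  Vec4 re = addss(a, b);                        // lane 0 = a.re + b.re
  Vec4 im = addss(pshufd(a, 1), pshufd(b, 1));  // pshufd $1 moves lane 1 to lane 0
  return unpcklps(re, im);                      // lanes 0,1 = {re sum, im sum}
}
```

Only lanes 0 and 1 of the result carry the complex sum; the extra `pshufd $16` shuffles in the older sequence were populating lanes that nothing reads.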

improve comments in the unpcklps generating logic, introduce · 96db6e66

Chris Lattner authored Aug 28, 2010

a new EltStride variable instead of reusing NumElems variable
for a non-obvious purpose.  No functionality change.

llvm-svn: 112377


Clean up the logic of vector shuffles -> vector shifts. · a982aa24

Bruno Cardoso Lopes authored Aug 28, 2010

Also teach this logic how to handle target specific shuffles if
needed, this is necessary while searching recursively for zeroed
scalar elements in vector shuffle operands.

llvm-svn: 112348


Aug 27, 2010
- Properly handle passing of FP stuff to varargs function on Win64: · c0b36921
  Anton Korobeynikov authored Aug 27, 2010
```
value should be copied to the corresponding shadow reg as well.
Patch by Cameron Esfahani!

llvm-svn: 112262
```
Aug 26, 2010
- zap the now unused MVT::getIntVectorWithNumElements · e25ba0c7
  Bruno Cardoso Lopes authored Aug 26, 2010
```
llvm-svn: 112218
```
- implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1. · eb2cc0ce
  Chris Lattner authored Aug 26, 2010
```
llvm-svn: 112171
```
- fix sse1 only codegen in x86-64 mode, which is something we · cc60609c
  Chris Lattner authored Aug 26, 2010
```
apparently try to support.

llvm-svn: 112168
```
Aug 25, 2010
- Revert this for now, PUNPCKLDQ doesn't operate on v4f32 · d4085f6e
  Bruno Cardoso Lopes authored Aug 25, 2010
```
llvm-svn: 112090
```
- Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there. · b3b53eca
  Anton Korobeynikov authored Aug 25, 2010
```
Mark _alloca call as clobbering EFLAGS, otherwise some DCE might remove
other flags-clobbering stuff (e.g. cmp instructions) occurring after
_alloca call.

llvm-svn: 112034
```
- PUNPCKLDQ should also be used for v4f32 · 0770d257
  Bruno Cardoso Lopes authored Aug 25, 2010
```
llvm-svn: 112020
```
- teach lowering to get target specific nodes for pshufd, emulating the same... · 2e45d522
  Bruno Cardoso Lopes authored Aug 25, 2010
```
teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests

llvm-svn: 112017
```
Aug 24, 2010
- Fix X86's isLegalAddressingMode to recognize that static addresses · c88fda47
  Dan Gohman authored Aug 24, 2010
```
need not be RIP-relative in small mode.

llvm-svn: 111917
```
- Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments · 758d7b1f
  Bruno Cardoso Lopes authored Aug 24, 2010
```
llvm-svn: 111890
```
Aug 23, 2010
- Start using target specific nodes for shuffles: pshufhw and pshuflw · 264d90ff
  Bruno Cardoso Lopes authored Aug 23, 2010
```
llvm-svn: 111837
```
- Revert invalid r111792. Jump tables are not broken on x86-64 / coff, · cbbe4501
  Anton Korobeynikov authored Aug 23, 2010
```
it's the COFF emitter that does not support differences of two symbols
(and needs to be fixed). GAS is perfectly fine with the code produced.

llvm-svn: 111801
```
- Workaround broken jump tables on x86-64 COFF. · e8723123
  Michael J. Spencer authored Aug 23, 2010
```
llvm-svn: 111792
```
Aug 21, 2010

- Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly · 9f20e7a1
  Bruno Cardoso Lopes authored Aug 21, 2010
```
llvm-svn: 111704
```

This is the first step towards refactoring the x86 vector shuffle code. The · 6f3b38a8

Bruno Cardoso Lopes authored Aug 20, 2010

general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.

The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.

llvm-svn: 111691

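The lowering half of that idea can be pictured as a function from a shuffle mask to a node kind. The toy below assumes 4-element masks where indices 0-3 pick from the first operand and 4-7 from the second; the enum, its member names and `pickNode` are invented for illustration and do not mirror the actual X86ISD node list or the real lowering code, which also has to handle undef lanes and other vector types.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-ins for target specific shuffle nodes.
enum class ShuffleNode { MOVSS, MOVLHPS, MOVHLPS, UNPCKLPS, Generic };

// Shuffle mask for two 4-lane operands: 0..3 = first operand, 4..7 = second.
using Mask4 = std::array<int, 4>;

// Pick a specific node when the mask has a recognised shape, so that
// instruction selection only needs to match the node, not the mask.
ShuffleNode pickNode(const Mask4 &mask) {
  for (int idx : mask) {
    assert(idx >= 0 && idx < 8 && "mask index out of range");
    (void)idx;
  }
  if (mask == Mask4{4, 1, 2, 3}) return ShuffleNode::MOVSS;     // lane 0 from operand 2
  if (mask == Mask4{0, 1, 4, 5}) return ShuffleNode::MOVLHPS;   // low halves of both
  if (mask == Mask4{6, 7, 2, 3}) return ShuffleNode::MOVHLPS;   // high halves of both
  if (mask == Mask4{0, 4, 1, 5}) return ShuffleNode::UNPCKLPS;  // interleave low lanes
  return ShuffleNode::Generic;  // fall back to the existing pattern-based path
}
```

The design point is that the mask analysis happens once, during lowering, and the selector afterwards sees an unambiguous node rather than re-deriving the shape from a generic shuffle.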

Aug 17, 2010

More fixes for win64: · 231ab847

Anton Korobeynikov authored Aug 17, 2010

  - Do not clobber al during variadic calls, this is an AMD64 ABI-only feature
  - Emit wincall64, where necessary
Patch by Cameron Esfahani!

llvm-svn: 111289


Aug 14, 2010
- Rework how the non-sse2 memory barrier is lowered so that the · 54194bd1
  Eric Christopher authored Aug 14, 2010
```
encoding is correct for the built-in assembler.

Based on a patch from Chris.

llvm-svn: 111083
```
- improve indentation · 2f6c3434
  Chris Lattner authored Aug 14, 2010
```
llvm-svn: 111073
```
Aug 13, 2010
- Fix comment to reflect code, and remove an unused argument · 081861b6
  Bruno Cardoso Lopes authored Aug 13, 2010
```
llvm-svn: 111022
```
Aug 12, 2010

Begin to support some vector operations for AVX 256-bit instructions. The long · 7306c868

Bruno Cardoso Lopes authored Aug 12, 2010

term goal here is to be able to match enough of vector_shuffle and build_vector
so all avx intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step, support
building zeroed vectors.

llvm-svn: 110897


Aug 11, 2010

Use ISD::ADD instead of ISD::SUB with a negated constant. This · 5531aa4d

Dan Gohman authored Aug 11, 2010

avoids trouble if the return type of TD->getPointerSize() is
changed to something which doesn't promote to a signed type,
and is simpler anyway.

Also, use getCopyFromReg instead of getRegister to read a
physical register's value.

llvm-svn: 110835

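The pitfall can be reproduced outside LLVM. The following is a standalone sketch under the assumption that the constant was built by negating the pointer size and widening the result to 64 bits; `subNegated` and `addPlain` are made-up helpers, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Models "SUB base, -pointerSize". Whether this works depends on SizeT:
// a narrow type such as uint8_t is promoted to (signed) int before the
// negation, while uint32_t stays unsigned and wraps modulo 2^32.
template <typename SizeT>
uint64_t subNegated(uint64_t base, SizeT pointerSize) {
  assert(pointerSize != 0 && "pointer size must be non-zero");
  uint64_t negated = static_cast<uint64_t>(-pointerSize);
  return base - negated;
}

// Models "ADD base, pointerSize": no negation, so no dependence on promotion.
uint64_t addPlain(uint64_t base, uint64_t pointerSize) {
  return base + pointerSize;
}
```

With a `uint8_t` size of 8 the subtraction form yields `base + 8`, but with a `uint32_t` size the negated constant becomes 4294967288 and the result is wrong; the addition form gives `base + 8` in both cases.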

Add AVX matching patterns to Packed Bit Test intrinsics. · 91d61df3

Bruno Cardoso Lopes authored Aug 10, 2010

Apply the same approach as the SSE4.1 ptest intrinsics, but
create a new x86 node "testp" since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on sign bit AND and ANDN of packed floating-point sources.

This is slightly different from what "ptest" does.
Tests coming with the other 256 intrinsics tests.

llvm-svn: 110744

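The difference between the two tests can be sketched with a scalar model. This is an illustration under stated assumptions rather than the backend's implementation: `Bits4`, `Flags`, `ptest` and `vtestps` are hypothetical C++ names, lanes are held as raw 32-bit patterns, and only the 128-bit form of `vtestps` is modelled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bits4 = std::array<uint32_t, 4>;  // raw bit patterns of four 32-bit lanes

struct Flags {
  bool zf;  // set when (a AND b) has no relevant bits set
  bool cf;  // set when ((NOT a) AND b) has no relevant bits set
};

// Shared helper: `relevant` selects which bits of each lane take part.
static Flags testWithMask(const Bits4 &a, const Bits4 &b, uint32_t relevant) {
  uint32_t andBits = 0, andnBits = 0;
  for (unsigned i = 0; i < 4; ++i) {
    andBits |= (a[i] & b[i]) & relevant;
    andnBits |= (~a[i] & b[i]) & relevant;
  }
  return {andBits == 0, andnBits == 0};
}

// ptest looks at every bit of the operands.
Flags ptest(const Bits4 &a, const Bits4 &b) {
  return testWithMask(a, b, 0xFFFFFFFFu);
}

// vtestps looks only at the sign bit of each 32-bit lane.
Flags vtestps(const Bits4 &a, const Bits4 &b) {
  return testWithMask(a, b, 0x80000000u);
}
```

For operands that share only a low mantissa bit, `ptest` clears ZF while `vtestps` sets it, which is why the two cannot share one node.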

Aug 10, 2010
- Support AVX 256-bit load and store intrinsics · 85da72a8
  Bruno Cardoso Lopes authored Aug 10, 2010
```
llvm-svn: 110645
```
Aug 06, 2010

Support very basic (doesn't include ABI support in the front-end, varargs, ...)... · 77954bdf

Bruno Cardoso Lopes authored Aug 05, 2010

Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX

llvm-svn: 110394
