- Sep 02, 2010
  - Bruno Cardoso Lopes authored (llvm-svn: 112805)
  - Bruno Cardoso Lopes authored (llvm-svn: 112804)
  - Bruno Cardoso Lopes authored (llvm-svn: 112799)
  - Bruno Cardoso Lopes authored (llvm-svn: 112760)
- Sep 01, 2010
  - Bruno Cardoso Lopes authored: Use movlps, movlpd, movss, and movsd specific nodes instead of pattern matching with the movlp pattern fragment. (llvm-svn: 112694)
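    For illustration only (not from the commit): intrinsics that conventionally reach codegen through exactly these moves, assuming the usual SSE/SSE2 mappings.

        #include <xmmintrin.h>   /* SSE: movss, movlps */
        #include <emmintrin.h>   /* SSE2: movsd */

        /* movss: replace a[0] with b[0], keep a[1..3] */
        __m128 merge_low_f32(__m128 a, __m128 b) { return _mm_move_ss(a, b); }

        /* movsd (register form): replace a[0] with b[0] */
        __m128d merge_low_f64(__m128d a, __m128d b) { return _mm_move_sd(a, b); }

        /* movlps: 64-bit load into the low half of a */
        __m128 load_low_half(__m128 a, const __m64 *p) { return _mm_loadl_pi(a, p); }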
  - Bruno Cardoso Lopes authored (llvm-svn: 112689)
  - Bruno Cardoso Lopes authored (llvm-svn: 112687)
  - Bruno Cardoso Lopes authored (llvm-svn: 112661)
  - Bruno Cardoso Lopes authored (llvm-svn: 112657)
- Aug 31, 2010
  - Bruno Cardoso Lopes authored (llvm-svn: 112644)
  - Bruno Cardoso Lopes authored (llvm-svn: 112642)
  - Bruno Cardoso Lopes authored: Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, and fix the handling of those nodes when searching for scalars inside vector shuffles. (llvm-svn: 112570)
- Aug 28, 2010
  - Chris Lattner authored: insertp[sd] $0, which is a noop.
    Before:
        _f32:                       ## @f32
            pshufd   $1, %xmm1, %xmm2
            pshufd   $1, %xmm0, %xmm3
            addss    %xmm2, %xmm3
            addss    %xmm1, %xmm0
                                    ## kill: XMM0<def> XMM0<kill> XMM0<def>
            insertps $0, %xmm0, %xmm0
            insertps $16, %xmm3, %xmm0
            ret
    After:
        _f32:                       ## @f32
            movdqa   %xmm0, %xmm2
            addss    %xmm1, %xmm2
            pshufd   $1, %xmm1, %xmm1
            pshufd   $1, %xmm0, %xmm3
            addss    %xmm1, %xmm3
            movdqa   %xmm2, %xmm0
            insertps $16, %xmm3, %xmm0
            ret
    The extra movs are due to a random (poor) scheduling decision. (llvm-svn: 112379)
  - Chris Lattner authored: when the top elements of a vector are undefined. This happens all the time for X86-64 ABI stuff, because only the low 2 elements of a 4-element vector are defined. For example, on:
        _Complex float f32(_Complex float A, _Complex float B) {
          return A+B;
        }
    we used to produce (with SSE2; SSE4.1+ uses insertps):
        _f32:                       ## @f32
            movdqa   %xmm0, %xmm2
            addss    %xmm1, %xmm2
            pshufd   $16, %xmm2, %xmm2
            pshufd   $1, %xmm1, %xmm1
            pshufd   $1, %xmm0, %xmm0
            addss    %xmm1, %xmm0
            pshufd   $16, %xmm0, %xmm1
            movdqa   %xmm2, %xmm0
            unpcklps %xmm1, %xmm0
            ret
    We now produce:
        _f32:                       ## @f32
            movdqa   %xmm0, %xmm2
            addss    %xmm1, %xmm2
            pshufd   $1, %xmm1, %xmm1
            pshufd   $1, %xmm0, %xmm3
            addss    %xmm1, %xmm3
            movaps   %xmm2, %xmm0
            unpcklps %xmm3, %xmm0
            ret
    This implements rdar://8368414. (llvm-svn: 112378)
  - Chris Lattner authored: a new EltStride variable instead of reusing the NumElems variable for a non-obvious purpose. No functionality change. (llvm-svn: 112377)
  - Bruno Cardoso Lopes authored: Also teach this logic how to handle target-specific shuffles if needed; this is necessary while searching recursively for zeroed scalar elements in vector shuffle operands. (llvm-svn: 112348)
- Aug 27, 2010
  - Anton Korobeynikov authored: value should be copied to the corresponding shadow reg as well. Patch by Cameron Esfahani! (llvm-svn: 112262)
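    A hedged sketch (assuming the truncated message refers to the Win64 varargs convention, as the related wincall64 patch below suggests) of why the shadow-reg copy matters; the function names here are illustrative:

        #include <stdarg.h>
        #include <stdio.h>

        double sum2(int n, ...) {
            va_list ap;
            double total = 0.0;
            va_start(ap, n);
            for (int i = 0; i < n; i++)
                total += va_arg(ap, double);   /* on Win64, va_arg reads the integer-reg home area */
            va_end(ap);
            return total;
        }

        int main(void) {
            /* On Win64, 1.5 is passed in XMM1 and must also be copied into RDX
               (its shadow reg), and 2.5 into R8, so the callee's va_arg works. */
            printf("%f\n", sum2(2, 1.5, 2.5));
            return 0;
        }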
- Aug 26, 2010
  - Bruno Cardoso Lopes authored (llvm-svn: 112218)
  - Chris Lattner authored (llvm-svn: 112171)
  - Chris Lattner authored: apparently try to support. (llvm-svn: 112168)
- Aug 25, 2010
  - Bruno Cardoso Lopes authored (llvm-svn: 112090)
  - Anton Korobeynikov authored: Mark the _alloca call as clobbering EFLAGS; otherwise some DCE might remove other flags-clobbering stuff (e.g. cmp instructions) occurring after the _alloca call. (llvm-svn: 112034)
  - Bruno Cardoso Lopes authored (llvm-svn: 112020)
  - Bruno Cardoso Lopes authored: teach lowering to get target-specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests. (llvm-svn: 112017)
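    For illustration (not from the commit), the usual C-level idiom that lowers to pshufd:

        #include <emmintrin.h>   /* SSE2 */

        /* pshufd $0x1b: select elements 3,2,1,0, i.e. reverse the 32-bit lanes */
        __m128i reverse_lanes(__m128i v) {
            return _mm_shuffle_epi32(v, 0x1B);
        }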
- Aug 24, 2010
  - Dan Gohman authored: need not be RIP-relative in small mode. (llvm-svn: 111917)
  - Bruno Cardoso Lopes authored (llvm-svn: 111890)
- Aug 23, 2010
  - Bruno Cardoso Lopes authored (llvm-svn: 111837)
  - Anton Korobeynikov authored: it's the COFF emitter which does not support differences of two symbols (and needs to be fixed). GAS is pretty fine with the code produced. (llvm-svn: 111801)
  - Michael J. Spencer authored (llvm-svn: 111792)
- Aug 21, 2010
  - Bruno Cardoso Lopes authored (llvm-svn: 111704)
  - Bruno Cardoso Lopes authored: general idea here is to have a group of x86 target-specific nodes which are going to be selected during lowering and then directly matched in isel. The commit includes the addition of those specific nodes and a *bunch* of patterns, and incrementally we're going to switch between them and what we have right now. Both the patterns and target-specific nodes can change as we move forward with this work. (llvm-svn: 111691)
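    A toy, self-contained sketch of that lowering/isel split (names mirror the X86ISD style, but the code is illustrative, not LLVM's):

        #include <stdio.h>

        enum x86_node { X86_PSHUFD, X86_MOVSS };

        /* Lowering: classify a 4-lane shuffle mask into one target-specific
           node kind; isel then matches each node kind to one instruction. */
        enum x86_node lower_shuffle(const int mask[4]) {
            /* <4,1,2,3>: take lane 0 from the second vector -> movss pattern */
            if (mask[0] == 4 && mask[1] == 1 && mask[2] == 2 && mask[3] == 3)
                return X86_MOVSS;
            return X86_PSHUFD;   /* single-input permutes fall back to pshufd */
        }

        int main(void) {
            const int m[4] = {4, 1, 2, 3};
            printf("%s\n", lower_shuffle(m) == X86_MOVSS ? "movss" : "pshufd");
            return 0;
        }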
- Aug 17, 2010
  - Anton Korobeynikov authored:
      - Do not clobber al during variadic calls; this is an AMD64 ABI-only feature.
      - Emit wincall64, where necessary.
    Patch by Cameron Esfahani! (llvm-svn: 111289)
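    For context (a known ABI fact, not part of the patch text): the al rule exists only in the System V AMD64 ABI, where the caller sets AL to the number of vector registers used by a variadic call; Win64 has no such convention.

        #include <stdio.h>

        int main(void) {
            /* System V AMD64: 3.14 is passed in XMM0 and the caller sets AL = 1.
               Win64: no AL setting; the double is mirrored into RDX instead. */
            printf("%f\n", 3.14);
            return 0;
        }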
- Aug 14, 2010
  - Eric Christopher authored: encoding is correct for the built-in assembler. Based on a patch from Chris. (llvm-svn: 111083)
  - Chris Lattner authored (llvm-svn: 111073)
- Aug 13, 2010
  - Bruno Cardoso Lopes authored (llvm-svn: 111022)
- Aug 12, 2010
  - Bruno Cardoso Lopes authored: term goal here is to be able to match enough of vector_shuffle and build_vector so that all AVX intrinsics which aren't mapped to their own built-ins but to shufflevector calls can be codegen'd. This is the first (baby) step: support building zeroed vectors. (llvm-svn: 110897)
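    A minimal example of the zeroed-vector building block (assuming the usual mapping of these intrinsics to a single register xor, e.g. vxorps, under AVX):

        #include <immintrin.h>

        __m256  zero_f32(void) { return _mm256_setzero_ps(); }
        __m256d zero_f64(void) { return _mm256_setzero_pd(); }
        __m256i zero_int(void) { return _mm256_setzero_si256(); }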
- Aug 11, 2010
  - Dan Gohman authored: avoids trouble if the return type of TD->getPointerSize() is changed to something which doesn't promote to a signed type, and is simpler anyway. Also, use getCopyFromReg instead of getRegister to read a physical register's value. (llvm-svn: 110835)
  - Bruno Cardoso Lopes authored: Apply the same approach as the SSE4.1 ptest intrinsics, but create a new x86 node "testp", since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on the sign-bit AND and ANDN of packed floating-point sources. This is slightly different from what "ptest" does. Tests are coming with the other 256-bit intrinsics tests. (llvm-svn: 110744)
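    A hedged sketch of the described vtestps semantics via the C intrinsics (the wrapper names here are illustrative):

        #include <immintrin.h>

        /* ZF path: 1 iff (signbits(a) AND signbits(b)) == 0 */
        int sign_and_is_zero(__m256 a, __m256 b) {
            return _mm256_testz_ps(a, b);
        }

        /* CF path: 1 iff (NOT signbits(a) AND signbits(b)) == 0 */
        int sign_andn_is_zero(__m256 a, __m256 b) {
            return _mm256_testc_ps(a, b);
        }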
- Aug 10, 2010
  - Bruno Cardoso Lopes authored (llvm-svn: 110645)
- Aug 06, 2010
  - Bruno Cardoso Lopes authored: Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX. (llvm-svn: 110394)
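    A minimal example of what this enables (assuming an AVX-enabled build): the 256-bit arguments and return value travel in ymm registers.

        #include <immintrin.h>

        __m256 add256(__m256 a, __m256 b) {
            return _mm256_add_ps(a, b);   /* vaddps ymm, ymm, ymm */
        }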