  Aug 28, 2010
    • I have manually decoded the imm field of an insertps one too many · 7a05e6dc
      Chris Lattner authored
      times.  This patch causes llc and llvm-mc (which both default to
      verbose-asm) to print out comments after a few common shuffle
      instructions that indicate the shuffle mask, e.g.:
      
      	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
      	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
      	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]
      
      This is carefully factored to keep the information extraction (of the
      shuffle mask) separate from the printing logic.  I plan to move the
      extraction part out somewhere else at some point for other parts of
      the x86 backend that want to introspect on the behavior of shuffles.
      
      llvm-svn: 112387
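
      To make the mask comments above concrete, here is a rough, hypothetical
      sketch of how an insertps imm8 can be decoded into such a comment.  It is
      not the helper factored out in this patch, and unlike the real printer it
      does not merge adjacent lanes from the same register (e.g. xmm0[1,2]):

        #include <cstdio>
        #include <string>

        // Hypothetical sketch, not LLVM's code.  imm8 layout for insertps with
        // a register source: bits [7:6] source element, bits [5:4] destination
        // element, bits [3:0] zero mask (each set bit zeroes that dest lane).
        std::string DecodeInsertPSComment(unsigned Imm, const char *Dst,
                                          const char *Src) {
          unsigned SrcElt = (Imm >> 6) & 0x3;
          unsigned DstElt = (Imm >> 4) & 0x3;
          unsigned ZMask  = Imm & 0xF;
          std::string Out = std::string(Dst) + " = ";
          for (unsigned i = 0; i != 4; ++i) {
            if (i) Out += ",";
            if (ZMask & (1u << i))
              Out += "zero";
            else if (i == DstElt)
              Out += std::string(Src) + "[" + std::to_string(SrcElt) + "]";
            else
              Out += std::string(Dst) + "[" + std::to_string(i) + "]";
          }
          return Out;
        }

        int main() {
          // insertps $113, %xmm3, %xmm0
          // prints: xmm0 = zero,xmm0[1],xmm0[2],xmm3[1]
          std::printf("%s\n", DecodeInsertPSComment(113, "xmm0", "xmm3").c_str());
        }
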
    • fix the buildvector->insertp[sd] logic to not always create a redundant · 94656b1c
      Chris Lattner authored
      insertp[sd] $0, which is a noop.  Before:
      
      _f32:                                   ## @f32
      	pshufd	$1, %xmm1, %xmm2
      	pshufd	$1, %xmm0, %xmm3
      	addss	%xmm2, %xmm3
      	addss	%xmm1, %xmm0
                                              ## kill: XMM0<def> XMM0<kill> XMM0<def>
      	insertps	$0, %xmm0, %xmm0
      	insertps	$16, %xmm3, %xmm0
      	ret
      
      after:
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm3
      	addss	%xmm1, %xmm3
      	movdqa	%xmm2, %xmm0
      	insertps	$16, %xmm3, %xmm0
      	ret
      
      The extra movs are due to a random (poor) scheduling decision.
      
      llvm-svn: 112379
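
      For reference, a small hypothetical check (not the patch's code) showing
      why that insertps was dead: with imm 0 the instruction copies source lane
      0 into destination lane 0 and zeroes nothing, so when source and
      destination are the same register it is an identity:

        #include <cstdio>

        // Hypothetical helper, not the patch's code: an insertps whose source
        // and destination are the same register is an identity when the
        // immediate copies a lane onto itself and zeroes nothing (imm 0 is the
        // common case from the "before" listing above).
        bool IsInsertPSNoOp(unsigned Imm, bool SrcIsDst) {
          unsigned SrcElt = (Imm >> 6) & 0x3;
          unsigned DstElt = (Imm >> 4) & 0x3;
          unsigned ZMask  = Imm & 0xF;
          return SrcIsDst && SrcElt == DstElt && ZMask == 0;
        }

        int main() {
          // insertps $0, %xmm0, %xmm0
          std::printf("%s\n", IsInsertPSNoOp(0, true) ? "no-op" : "needed");
        }
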
    • fix the BuildVector -> unpcklps logic to not do pointless shuffles · bcb6090a
      Chris Lattner authored
      when the top elements of a vector are undefined.  This happens all
      the time for X86-64 ABI stuff because only the low 2 elements of
      a 4-element vector are defined.  For example, on:
      
      _Complex float f32(_Complex float A, _Complex float B) {
        return A+B;
      }
      
      We used to produce (with SSE2, SSE4.1+ uses insertps):
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$16, %xmm2, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm0
      	addss	%xmm1, %xmm0
      	pshufd	$16, %xmm0, %xmm1
      	movdqa	%xmm2, %xmm0
      	unpcklps	%xmm1, %xmm0
      	ret
      
      We now produce:
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm3
      	addss	%xmm1, %xmm3
      	movaps	%xmm2, %xmm0
      	unpcklps	%xmm3, %xmm0
      	ret
      
      This implements rdar://8368414
      
      llvm-svn: 112378
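
      To see why the dropped pshufd $16 shuffles were pointless: unpcklps
      already interleaves the low halves of its operands, and only the low two
      lanes of the result are defined for a _Complex float under the x86-64
      ABI.  A tiny model of the final pack (illustration only, with made-up
      values; not generated code):

        #include <array>
        #include <cstdio>

        using V4 = std::array<float, 4>;

        // Model of `unpcklps src, dst` (AT&T order): interleave the low halves,
        // giving dst = dst[0],src[0],dst[1],src[1].
        V4 unpcklps(V4 dst, V4 src) {
          return {dst[0], src[0], dst[1], src[1]};
        }

        int main() {
          // After the two addss instructions, the real and imaginary sums sit
          // in lane 0 of two registers; their upper lanes are undefined (0 here).
          V4 sumRe = {1.0f + 3.0f, 0, 0, 0};
          V4 sumIm = {2.0f + 4.0f, 0, 0, 0};
          V4 res = unpcklps(sumRe, sumIm);   // res = {re, im, undef, undef}
          std::printf("%g + %gi\n", res[0], res[1]);
        }
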
    • improve comments in the unpcklps generating logic, introduce · 96db6e66
      Chris Lattner authored
      a new EltStride variable instead of reusing NumElems variable
      for a non-obvious purpose.  No functionality change.
      
      llvm-svn: 112377
    • Clean up the logic of vector shuffles -> vector shifts. · a982aa24
      Bruno Cardoso Lopes authored
      Also teach this logic how to handle target-specific shuffles if
      needed; this is necessary while searching recursively for zeroed
      scalar elements in vector shuffle operands.
      
      llvm-svn: 112348
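
      As a rough illustration of the shuffle -> shift idea (a hypothetical
      check, not the backend's code): a shuffle of a vector with zero whose
      mask reads <K, K+1, ..., N-1, zero, ...> is just an element-wise logical
      right shift by K, which x86 can do with a single psrldq:

        #include <cstdio>
        #include <vector>

        // Hypothetical sketch, not LLVM's code.  Mask entries: 0..N-1 pick
        // lanes of the first operand, N..2N-1 pick lanes of the zero operand,
        // and -1 means undef.  Returns true when the shuffle is equivalent to
        // a right shift by K lanes with zeros shifted in.
        static bool isRightShiftByK(const std::vector<int> &Mask, int N, int K) {
          for (int i = 0; i != N; ++i) {
            int M = Mask[i];
            if (M < 0)
              continue;                        // undef lane matches anything
            if (i < N - K) {
              if (M != i + K) return false;    // must pick lane i+K of the input
            } else {
              if (M < N) return false;         // must pick a zero lane
            }
          }
          return true;
        }

        int main() {
          // shufflevector <4 x i32> %a, zeroinitializer, <1, 2, 3, 4>
          // is a right shift by one i32 lane (psrldq $4 on x86).
          std::vector<int> Mask = {1, 2, 3, 4};
          std::printf("%s\n", isRightShiftByK(Mask, 4, 1) ? "shift by 1"
                                                          : "not a shift");
        }
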