Commits · 572c9aaf537b5600e37ea65aee593cf540de1809 · Roger Ferrer / llvm-epi-0.8

Aug 11, 2011

Use the splat index to generate the desired shuffle. Otherwise we · 572c9aaf
Bruno Cardoso Lopes authored Aug 11, 2011
```
could only get undefs and the vector shuffle becomes an undef,
generating wrong code.

llvm-svn: 137295
```
572c9aaf

Fix X86TargetLowering::LowerExternalSymbol so that it actually works in... · 3ae39f8a

Eli Friedman authored Aug 11, 2011

Fix X86TargetLowering::LowerExternalSymbol so that it actually works in non-trivial cases. This hasn't been an issue before because the function isn't normally called (but apparently is used to generate a tail-call to sin() on ELF x86-32 with PIC and SSE2).

Fixes PR9693.

llvm-svn: 137292

3ae39f8a

Aug 10, 2011
- When performing a truncating store, it is sometimes possible to rearrange the · 410a11fe
  Nadav Rotem authored Aug 10, 2011
  
  data in-register prior to saving to memory. Reordering the data in-register first avoids the need to save multiple scalars to memory, so a single regular store suffices. llvm-svn: 137238
  410a11fe
- Fix a bug in vpermilps mask checking. Fix PR10560 · 278ffd7d
  Bruno Cardoso Lopes authored Aug 10, 2011
  
  llvm-svn: 137194
  278ffd7d
- Add 256-bit support for v8i32, v4i64 and v4f64 ISD::SELECT. Fix PR10556 · 72323966
  Bruno Cardoso Lopes authored Aug 09, 2011
  
  llvm-svn: 137179
  72323966
- Use fp unpack instructions to unpack int types. Until we have AVX2, this · 6963062a
  Bruno Cardoso Lopes authored Aug 09, 2011
  
  is the best we can do for these patterns. This fixes PR10554. llvm-svn: 137161
  6963062a
Aug 09, 2011
- Revert r137114 · 24dd1d4a
  Bruno Cardoso Lopes authored Aug 09, 2011
  
  llvm-svn: 137127
  24dd1d4a
- Handle sitofp between v4f64 <- v4i32. Fix PR10559 · ad3453cf
  Bruno Cardoso Lopes authored Aug 09, 2011
  
  llvm-svn: 137114
  ad3453cf
- Make LowerVSETCC aware of AVX types and add patterns to match them. · af6a8548
  Bruno Cardoso Lopes authored Aug 09, 2011
  
  llvm-svn: 137090
  af6a8548
Aug 08, 2011
- Add support for several vector shifts operations while in AVX mode. Fix PR10581 · c96953c1
  Bruno Cardoso Lopes authored Aug 08, 2011
  
  llvm-svn: 137067
  c96953c1
Aug 04, 2011
- Fix an obvious typo. Patch by Ivan Krasin. · 19e3f805
  Evan Cheng authored Aug 04, 2011
  
  llvm-svn: 136899
  19e3f805
- Only access both operands of an INSERT_SUBVECTOR if it is an INSERT_SUBVECTOR. · e234f6ae
  Bill Wendling authored Aug 04, 2011
  
  Fixes PR10527. llvm-svn: 136853
  e234f6ae
Aug 03, 2011
- Remove unused variables. · 103e2ec2
  Benjamin Kramer authored Aug 03, 2011
  
  llvm-svn: 136803
  103e2ec2
Aug 02, 2011

Don't create a ridiculous EXTRACT_ELEMENT. PR10563. · 04c5025c

Eli Friedman authored Aug 02, 2011

The testcase looks extremely fragile, so I'm adding an assertion which should catch any cases like this.

llvm-svn: 136711

04c5025c

Make this kind of lowering to be supported by 256-bit instructions: · 5ada9081

Bruno Cardoso Lopes authored Aug 02, 2011

  shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0>
To:
  shuffle (vload ptr), undef, <1, 1, 1, 1>
Fix PR10494

llvm-svn: 136691

5ada9081
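
The rewrite in 5ada9081 rests on a simple observation: with 4-byte elements, the scalar at `ptr + 4` is element 1 of a vector loaded from `ptr`, so splatting either one gives the same result. The sketch below models both forms on plain arrays. It does not use any LLVM API, the function names are hypothetical, and it assumes four `float` elements as written in the commit message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

using V4 = std::array<float, 4>;

// Model of: shuffle (scalar_to_vector (load (ptr + 4))), undef, <0, 0, 0, 0>
// Load one float at a byte offset, place it in lane 0, then splat lane 0.
V4 splatScalarLoad(const unsigned char *ptr, std::size_t byteOffset) {
  float scalar;
  std::memcpy(&scalar, ptr + byteOffset, sizeof scalar);
  V4 v{};        // lanes 1..3 stand in for undef
  v[0] = scalar;
  return {v[0], v[0], v[0], v[0]};
}

// Model of: shuffle (vload ptr), undef, <1, 1, 1, 1>
// Load the whole vector, then splat the requested lane.
V4 splatVectorLoad(const unsigned char *ptr, std::size_t lane) {
  assert(lane < 4 && "lane index must be inside the vector");
  V4 v;
  std::memcpy(v.data(), ptr, sizeof v);
  return {v[lane], v[lane], v[lane], v[lane]};
}
```

Calling `splatScalarLoad(p, 4)` and `splatVectorLoad(p, 1)` on the same buffer yields equal vectors, which is why the second form can replace the first.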

Aug 01, 2011
- Add v4f64 -> v2f32 fp_round support. Also add a testcase to exercise · a8e36738
  Bruno Cardoso Lopes authored Aug 01, 2011
  
  the legalizer. This commit together with the two previous ones fixes PR10495. llvm-svn: 136654
  a8e36738
- Teach PreprocessISelDAG to be aware of vector types and to not process them. · 616fe605
  Bruno Cardoso Lopes authored Aug 01, 2011
  
  llvm-svn: 136653
  616fe605
- Lower CONCAT_VECTORS to use two VINSERTF128 instructions instead of · bd30a4b5
  Bruno Cardoso Lopes authored Aug 01, 2011
  
  using a stack store. llvm-svn: 136652
  bd30a4b5
- Since vectors with all ones can't be created with a 256-bit instruction, · 7513939d
  Bruno Cardoso Lopes authored Aug 01, 2011
  
  avoid returning early for v8i32 types, which would only be valid for vectors with all zeros. Also split the handling of zeros and ones into separate checking logic since they are handled differently. This fixes PR10547. llvm-svn: 136642
  7513939d
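
The CONCAT_VECTORS change in bd30a4b5 can be pictured as two 128-bit inserts into a 256-bit value. The following is a minimal model on plain arrays, not the actual lowering code. `insertHalf` and `concatViaInserts` are hypothetical names, and the zero-filled start value is only a stand-in for undef.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

using V128 = std::array<float, 4>;
using V256 = std::array<float, 8>;

// Model of a 128-bit insert: overwrite the low (half = 0) or high (half = 1)
// 128 bits of dst with src and leave the other half untouched.
V256 insertHalf(V256 dst, const V128 &src, unsigned half) {
  assert(half < 2 && "a 256-bit vector has only two 128-bit halves");
  std::copy(src.begin(), src.end(), dst.begin() + 4 * half);
  return dst;
}

// CONCAT_VECTORS(lo, hi) expressed as two inserts, with no trip through memory.
V256 concatViaInserts(const V128 &lo, const V128 &hi) {
  V256 result{}; // zero-filled here; stands in for an undef starting value
  result = insertHalf(result, lo, 0);
  result = insertHalf(result, hi, 1);
  return result;
}
```

Because both halves are overwritten, the starting value never shows up in the result, which is what makes an undef start safe.
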
Jul 29, 2011
- Misc optimizer+codegen work for 'cmpxchg' and 'atomicrmw'. They appear to be · adec587d
  Eli Friedman authored Jul 29, 2011
  
  working on x86 (at least for trivial testcases); other architectures will need more work so that they actually emit the appropriate instructions for orderings stricter than 'monotonic'. (As far as I can tell, the ARM, PPC, Mips, and Alpha backends need such changes.) llvm-svn: 136457
  adec587d
- Fix two tests that I crashed in the previous commits. The mask elts · 65ce5ea3
  Bruno Cardoso Lopes authored Jul 29, 2011
  
  on the second half must be reindexed. llvm-svn: 136454
  65ce5ea3
- Match VPERMIL masks more strictly and update the target specific mask · 81eb193f
  Bruno Cardoso Lopes authored Jul 29, 2011
  
  generation to always catch the weird cases. llvm-svn: 136453
  81eb193f
- Add DecodeShuffle shuffle support for VPERMILPD variants · 795f5585
  Bruno Cardoso Lopes authored Jul 29, 2011
  
  llvm-svn: 136452
  795f5585
- Fix a bug while generating target specific VPERMIL masks: skip · c00f6728
  Bruno Cardoso Lopes authored Jul 29, 2011
  
  undef mask elements. This fixes PR10529. llvm-svn: 136450
  c00f6728
- Enable usage of SSE4 extracts and inserts in their 128-bit AVX forms. · b9ba465d
  Bruno Cardoso Lopes authored Jul 29, 2011
  
  Also tidy up code a bit. llvm-svn: 136449
  b9ba465d
- Cleanup PALIGNR handling and remove the old palign pattern fragment. · 6aee3884
  Bruno Cardoso Lopes authored Jul 29, 2011
  
  Also make PALIGNR masks not match 256-bit vectors, which aren't supported. It's also a step toward solving PR10489. llvm-svn: 136448
  6aee3884
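
Commit c00f6728 above fixes mask generation by skipping undef elements. A rough sketch of that idea for a single 128-bit lane of four elements follows. The `-1` undef marker, the constant name and the function name are assumptions made for illustration, and this is not the code from the commit.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kUndef = -1; // assumed marker for an undef shuffle-mask element

// Pack four in-lane selectors (0..3) into an 8-bit immediate, two bits each.
// Undef elements contribute nothing, so their selector bits stay 0.
std::uint8_t encodeLaneImm(const std::array<int, 4> &mask) {
  std::uint8_t imm = 0;
  for (unsigned i = 0; i < 4; ++i) {
    if (mask[i] == kUndef)
      continue; // skip undef instead of shifting a negative value into imm
    assert(mask[i] >= 0 && mask[i] < 4 && "selector must stay inside the lane");
    imm |= static_cast<std::uint8_t>(mask[i] << (2 * i));
  }
  return imm;
}
```

Without the `continue`, an undef element would be encoded as if it were a real selector and could corrupt the bits of its neighbours.
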
Jul 28, 2011
- Invert the subvector insertion to be more likely to be taken as a COPY · 8c19a8b5
  Bruno Cardoso Lopes authored Jul 28, 2011
  
  llvm-svn: 136324
  8c19a8b5
- Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move · 9e2a3012
  Bruno Cardoso Lopes authored Jul 28, 2011
  
  a convert pattern close to the instruction definition. llvm-svn: 136320
  9e2a3012
- Code generation for 'fence' instruction. · 26a48485
  Eli Friedman authored Jul 27, 2011
  
  llvm-svn: 136283
  26a48485
Jul 27, 2011

Explicitly cast narrowing conversions inside {}s that will become errors in · 6381c010
Jeffrey Yasskin authored Jul 27, 2011
```
C++0x.

llvm-svn: 136211
```
6381c010
Move some code around to open opportunity for more shuffle matching · f9324f4f
Bruno Cardoso Lopes authored Jul 27, 2011
```
llvm-svn: 136201
```
f9324f4f

The vpermilps and vpermilpd have different behaviour regarding the · 27a30a77

Bruno Cardoso Lopes authored Jul 27, 2011

usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high part is the same
used by the low part while in the latter both lanes have independent
masks. Handle this properly and add support for vpermilpd.

llvm-svn: 136200

27a30a77
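
The difference described in 27a30a77 is easier to see when the immediates are decoded into shuffle indices. The sketch below models the 256-bit immediate forms: `vpermilps` reuses the same four 2-bit selectors in both lanes, while `vpermilpd` has one selector bit per element, so each lane gets its own bits. The function names are hypothetical and this is not LLVM's decoding code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// 256-bit vpermilps: 8 floats. The immediate holds four 2-bit selectors and
// the same four selectors are applied to the low and the high 128-bit lane.
std::vector<int> decodePermilPS256(std::uint8_t imm) {
  std::vector<int> indices;
  for (int lane = 0; lane < 2; ++lane)
    for (int i = 0; i < 4; ++i)
      indices.push_back(lane * 4 + ((imm >> (2 * i)) & 3));
  return indices;
}

// 256-bit vpermilpd: 4 doubles. Bit i of the immediate picks the low or high
// double of element i's own lane, so the two lanes are controlled separately.
std::vector<int> decodePermilPD256(std::uint8_t imm) {
  std::vector<int> indices;
  for (int i = 0; i < 4; ++i) {
    int lane = i / 2;
    indices.push_back(lane * 2 + ((imm >> i) & 1));
  }
  return indices;
}

// Small self-check showing shared selectors versus per-lane selectors.
void checkPermilDecoders() {
  assert((decodePermilPS256(0x1B) == std::vector<int>{3, 2, 1, 0, 7, 6, 5, 4}));
  assert((decodePermilPD256(0x1) == std::vector<int>{1, 0, 2, 2}));
}
```

In the `vpermilpd` case an immediate of `0x1` changes only the low lane, something the `vpermilps` encoding cannot express.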

Add a neat little two's complement hack for x86. · 124ac2b9

Benjamin Kramer authored Jul 26, 2011

On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can
fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add,
which can be commuted and encoded efficiently.

This code is generated for __builtin_clz and friends.

llvm-svn: 136167

124ac2b9
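
The rewrite in 124ac2b9 follows from two's complement arithmetic: since `-y == ~y + 1` and `~(x ^ k) == x ^ ~k`, we get `c - (x ^ k) == (x ^ ~k) + (c + 1)`. The sketch below checks that identity on 32-bit unsigned values. The function names are hypothetical, and it illustrates only the arithmetic, not the DAG combine itself.

```cpp
#include <cassert>
#include <cstdint>

// Original form: the immediate c sits on the left-hand side of the sub.
std::uint32_t subOfXor(std::uint32_t c, std::uint32_t x, std::uint32_t k) {
  return c - (x ^ k);
}

// Rewritten form: the negation is folded into the xor constant, the immediate
// is bumped by one, and the sub becomes an add (which is commutative).
std::uint32_t addOfXor(std::uint32_t c, std::uint32_t x, std::uint32_t k) {
  return (x ^ ~k) + (c + 1u);
}

// Self-check over a range of inputs, using wrap-around unsigned arithmetic.
void checkRewrite() {
  for (std::uint32_t x = 0; x < 1000; ++x)
    assert(subOfXor(32u, x, 31u) == addOfXor(32u, x, 31u));
}
```

Unsigned arithmetic is used so that wrap-around is well defined in C++ and matches what the machine instructions do.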

Recognize unpckh* masks and match 256-bit versions. The new versions are · f8fe47bd
Bruno Cardoso Lopes authored Jul 26, 2011
```
different from the previous 128-bit because they work in lanes.
Update a few comments and add testcases

llvm-svn: 136157
```
f8fe47bd

Jul 26, 2011
- Prevent x86-specific DAGCombine from creating nodes with illegal type (which... · 93dc04d5
  Eli Friedman authored Jul 26, 2011
  
  Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130. llvm-svn: 136148
  93dc04d5
- More movsldup/movshdup cleanup. Rewrite the mask matching function and add · d77b3831
  Bruno Cardoso Lopes authored Jul 26, 2011
  
  support for 256-bit versions (but no instruction selection yet, coming next). llvm-svn: 136050
  d77b3831
- More cleanup, subtarget info isn't used here. · 5b268a4b
  Bruno Cardoso Lopes authored Jul 26, 2011
  
  llvm-svn: 136049
  5b268a4b
- Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128 · 9212bf27
  Bruno Cardoso Lopes authored Jul 25, 2011
  
  This also fixes PR10452 llvm-svn: 136004
  9212bf27
- - Handle special scalar_to_vector case: splats. Using a native 128-bit · 123dff0f
  Bruno Cardoso Lopes authored Jul 25, 2011
  
  shuffle before inserting on a 256-bit vector.
  - Add AVX versions of movd/movq instructions.
  - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoiding emitting one more instruction.
  
  llvm-svn: 136002
  123dff0f
- Reintroduce r135730, this is indeed the right approach, there is no · 276eb8de
  Bruno Cardoso Lopes authored Jul 25, 2011
  
  native 256-bit vector instruction to do scalar_to_vector. llvm-svn: 136001
  276eb8de