Commits · 81eb193f2ec80fa46c9caa7bc9f4164e9f580256 · Roger Ferrer / llvm-epi-0.8

Jul 29, 2011
- Match VPERMIL masks more strictly and update the target specific mask · 81eb193f
  Bruno Cardoso Lopes authored Jul 29, 2011
```
generation to always catch the weird cases.

llvm-svn: 136453
```
  81eb193f
- Add DecodeShuffle shuffle support for VPERMILPD variants · 795f5585
  Bruno Cardoso Lopes authored Jul 29, 2011
```
llvm-svn: 136452
```
  795f5585
- Fix a bug while generating target specific VPERMIL masks: skip · c00f6728
  Bruno Cardoso Lopes authored Jul 29, 2011
```
undef mask elements. This fixes PR10529.

llvm-svn: 136450
```
  c00f6728
- Enable usage of SSE4 extracts and inserts in their 128-bit AVX forms. · b9ba465d
  Bruno Cardoso Lopes authored Jul 29, 2011
```
Also tidy up code a bit.

llvm-svn: 136449
```
  b9ba465d
- Cleanup PALIGNR handling and remove the old palign pattern fragment. · 6aee3884
  Bruno Cardoso Lopes authored Jul 29, 2011
```
Also make PALIGNR masks not match 256-bit vectors, which aren't supported.
It's also a step toward solving PR10489.

llvm-svn: 136448
```
  6aee3884
Jul 28, 2011
- Invert the subvector insertion to be more likely to be taken as a COPY · 8c19a8b5
  Bruno Cardoso Lopes authored Jul 28, 2011
```
llvm-svn: 136324
```
  8c19a8b5
- Add SINT_TO_FP and FP_TO_SINT support for v8i32 types. Also move · 9e2a3012
  Bruno Cardoso Lopes authored Jul 28, 2011
```
a convert pattern close to the instruction definition.

llvm-svn: 136320
```
  9e2a3012
- Code generation for 'fence' instruction. · 26a48485
  Eli Friedman authored Jul 27, 2011
```
llvm-svn: 136283
```
  26a48485
Jul 27, 2011

Explicitly cast narrowing conversions inside {}s that will become errors in · 6381c010
Jeffrey Yasskin authored Jul 27, 2011
```
C++0x.

llvm-svn: 136211
```
6381c010
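
A hedged illustration of the kind of change this commit message describes: C++0x (C++11) makes a narrowing conversion inside a braced initializer ill-formed, so the conversion has to be written out. The `Entry` struct and `makeEntry` function are hypothetical names invented for this sketch; they are not taken from the LLVM sources.

```cpp
#include <cstdint>

// Hypothetical aggregate, used only for illustration.
struct Entry {
  std::uint8_t opcode;
  std::uint8_t flags;
};

inline Entry makeEntry(int opcode, int flags) {
  // Entry e = { opcode, flags };  // ill-formed in C++11: int -> uint8_t narrows
  // The explicit casts keep the old truncating behaviour and stay valid C++11.
  Entry e = { static_cast<std::uint8_t>(opcode),
              static_cast<std::uint8_t>(flags) };
  return e;
}
```

The cast does not change what the code computes; it only makes the truncation visible to the reader and acceptable to a C++11 compiler.
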
Move some code around to open opportunity for more shuffle matching · f9324f4f
Bruno Cardoso Lopes authored Jul 27, 2011
```
llvm-svn: 136201
```
f9324f4f

The vpermilps and vpermilpd have different behaviour regarding the · 27a30a77

Bruno Cardoso Lopes authored Jul 27, 2011

usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high part is the same
used by the low part, while in the latter both lanes have independent
masks. Handle this properly and add support for vpermilpd.

llvm-svn: 136200

27a30a77
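
The difference described in this commit message can be sketched by decoding the 8-bit immediate of the 256-bit forms into a shuffle mask. This is a minimal model of the instruction semantics, not the LLVM code; the function names `decodeVPermilPS256` and `decodeVPermilPD256` are made up for this example.

```cpp
#include <array>
#include <cstdint>

// 256-bit vpermilps with an immediate: 8 floats, two 128-bit lanes.
// Result element i reads source element mask[i].
inline std::array<int, 8> decodeVPermilPS256(std::uint8_t imm) {
  std::array<int, 8> mask{};
  for (int i = 0; i < 8; ++i) {
    int lane = i / 4;                        // 0 = low lane, 1 = high lane
    int sel = (imm >> (2 * (i % 4))) & 0x3;  // both lanes reuse the same 2-bit field
    mask[i] = lane * 4 + sel;                // the source never crosses the lane
  }
  return mask;
}

// 256-bit vpermilpd with an immediate: 4 doubles, two 128-bit lanes.
inline std::array<int, 4> decodeVPermilPD256(std::uint8_t imm) {
  std::array<int, 4> mask{};
  for (int i = 0; i < 4; ++i) {
    int lane = i / 2;
    int sel = (imm >> i) & 0x1;              // every element has its own bit
    mask[i] = lane * 2 + sel;                // still confined to the lane
  }
  return mask;
}
```

With these definitions, `decodeVPermilPS256(0x1B)` reverses the elements inside each lane using one shared selector set, while `decodeVPermilPD256(0x6)` leaves the low lane alone and swaps only the high lane.
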

Add a neat little two's complement hack for x86. · 124ac2b9

Benjamin Kramer authored Jul 26, 2011

On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can
fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add,
which can be commuted and encoded efficiently.

This code is generated for __builtin_clz and friends.

llvm-svn: 136167

124ac2b9
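
The rewrite in this commit message relies on the two's complement identity `-y == ~y + 1`, which gives `C - (X ^ K) == (X ^ ~K) + (C + 1)` in modular arithmetic. The sketch below checks that identity on 32-bit unsigned values; the function names are hypothetical and the code is not the DAG combine itself.

```cpp
#include <cstdint>

// Original shape: an immediate on the left-hand side of a subtraction.
inline std::uint32_t subImmOfXor(std::uint32_t c, std::uint32_t x, std::uint32_t k) {
  return c - (x ^ k);
}

// Rewritten shape: the negation is folded into the xor constant (k -> ~k)
// and one is added to the immediate, so the sub becomes a commutable add.
inline std::uint32_t addOfXorNot(std::uint32_t c, std::uint32_t x, std::uint32_t k) {
  return (x ^ ~k) + (c + 1u);
}
```

For example, with `c = 31`, `k = 31` and `x = 5`, both forms evaluate to 5; unsigned wraparound makes the intermediate overflow harmless.
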

Recognize unpckh* masks and match 256-bit versions. The new versions are · f8fe47bd
Bruno Cardoso Lopes authored Jul 26, 2011
```
different from the previous 128-bit because they work in lanes.
Update a few comments and add testcases

llvm-svn: 136157
```
f8fe47bd
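
To make "they work in lanes" concrete, here is a small sketch that builds the shuffle mask an unpckh-style instruction corresponds to, for a vector split into 128-bit lanes. The helper `unpackHighMask` and its parameters are assumptions made for this example, not LLVM functions. Indices at or above `numElts` refer to the second operand.

```cpp
#include <vector>

// numElts: total elements per vector; eltsPerLane: elements in one 128-bit lane.
inline std::vector<int> unpackHighMask(int numElts, int eltsPerLane) {
  std::vector<int> mask;
  int half = eltsPerLane / 2;
  for (int laneStart = 0; laneStart < numElts; laneStart += eltsPerLane) {
    for (int i = 0; i < half; ++i) {
      int src = laneStart + half + i;   // upper half of the current lane
      mask.push_back(src);              // element from the first operand
      mask.push_back(src + numElts);    // matching element from the second operand
    }
  }
  return mask;
}
```

For four floats in a single lane this yields `{2, 6, 3, 7}`. For eight floats in two lanes it yields `{2, 10, 3, 11, 6, 14, 7, 15}`, which interleaves the high half of each lane separately instead of the high half of the whole 256-bit vector.
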

Jul 26, 2011
- Prevent x86-specific DAGCombine from creating nodes with illegal type (which... · 93dc04d5
  Eli Friedman authored Jul 26, 2011
```
Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected).  Fixes a minor isel issue that was breaking the testcase from r136130.

llvm-svn: 136148
```
  93dc04d5
- More movsldup/movshdup cleanup. Rewrite the mask matching function and add · d77b3831
  Bruno Cardoso Lopes authored Jul 26, 2011
```
support for 256-bit versions (but no instruction selection yet, coming next).

llvm-svn: 136050
```
  d77b3831
- More cleanup, subtarget info isn't used here. · 5b268a4b
  Bruno Cardoso Lopes authored Jul 26, 2011
```
llvm-svn: 136049
```
  5b268a4b
- Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128 · 9212bf27
  Bruno Cardoso Lopes authored Jul 25, 2011
```
This also fixes PR10452

llvm-svn: 136004
```
  9212bf27
- - Handle special scalar_to_vector case: splats. Using a native 128-bit · 123dff0f
  Bruno Cardoso Lopes authored Jul 25, 2011
```
shuffle before inserting on a 256-bit vector.
- Add AVX versions of movd/movq instructions
- Introduce a few COPY patterns to match insert_subvector instructions.
This turns a trivial insert_subvector instruction into a register copy,
coalescing the xmm into a ymm and avoids emitting one more instruction.

llvm-svn: 136002
```
  123dff0f
- Reintroduce r135730, this is indeed the right approach, there is no · 276eb8de
  Bruno Cardoso Lopes authored Jul 25, 2011
```
native 256-bit vector instruction to do scalar_to_vector.

llvm-svn: 136001
```
  276eb8de
Jul 25, 2011
- Get rid of an incorrect optimization for shuffles with PALIGNR and simplify isPALIGNRMask. · ea8c66fe
  Eli Friedman authored Jul 25, 2011
```
Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle.

llvm-svn: 135980
```
  ea8c66fe
Jul 22, 2011

Turn shuffles into unpacks for VT == MVT::v2i64 and MVT::v2f64 · 77242dd5
Rafael Espindola authored Jul 22, 2011
```
too. Patch by Jeff Muizelaar.

llvm-svn: 135789
```
77242dd5

Fix x86's XALUO lowering to return its replacement values instead · c535278c

Dan Gohman authored Jul 22, 2011

of doing the RAUW calls for the overflow value itself. This makes
it more consistent with how the rest of LegalizeDAG works.

llvm-svn: 135788

c535278c

GCC complains about the angle of this line. · 959b7e9d
Benjamin Kramer authored Jul 22, 2011
```
Remove the escaped newline.

llvm-svn: 135739
```
959b7e9d

Remove the 128-bit special handling from SCALAR_TO_VECTOR. This isn't · 18721738

Bruno Cardoso Lopes authored Jul 22, 2011

the way to go. Doing this here will prevent several node matches later,
and would have to force looking all the way through several
VINSERTF128/VEXTRACTF128 chains to optimize simple things.

llvm-svn: 135730

18721738

- Inspected an AVX code block added by someone in early Feb. This was never used · 612e5617

Bruno Cardoso Lopes authored Jul 22, 2011

and was actually very wrong, fix it and make it simpler. Also remove the
ConcatVectors function, which is unused now.

- Fix an introduction of useless nodes in r126664 and r126264. The
VUNPCKL* should never be introduced because we don't want duplicate
nodes for 128 AVX and non-AVX modes, the actual instruction
difference only exists during isel, but not for target specific DAG
nodes. We only introduce V* target nodes when there is no 128-bit
version already there.

- Fix a fragile test and make it more useful.

llvm-svn: 135729

612e5617

Add a DAGCombine for transforming 128->256 casts into a simple · 91eff514
Bruno Cardoso Lopes authored Jul 22, 2011
```
vxorps + vinsertf128 pair of instructions

llvm-svn: 135727
```
91eff514
Introduce a new function to lower 256-bit vectors which are not · dbebd012
Bruno Cardoso Lopes authored Jul 22, 2011
```
directly supported and should be promoted and handled by smaller
shuffles

llvm-svn: 135726
```
dbebd012
Rename function to be more specific and be more strict about its usage · 95d03772
Bruno Cardoso Lopes authored Jul 22, 2011
```
llvm-svn: 135725
```
95d03772

Jul 21, 2011

- Register v16i16 as valid VR256 register class · 178fb406

Bruno Cardoso Lopes authored Jul 21, 2011

- Add more bitcasts for v16i16
- Since 135661 and 135662 already added the splat logic,
just add one more splat test for v16i16

llvm-svn: 135663

178fb406

Add support for 256-bit versions of VPERMIL instruction. This is a new · b878caa5

Bruno Cardoso Lopes authored Jul 21, 2011

instruction introduced in AVX, which can operate on 128 and 256-bit vectors.
It considers a 256-bit vector as two independent 128-bit lanes. It can permute
any 32-bit or 64-bit elements inside a lane, and restricts the second lane to
have the same permutation as the first one. With the improved splat support
introduced earlier today, adding codegen for this instruction enables more
efficient 256-bit code:

Instead of:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vextractf128  $1, %ymm0, %xmm1
  shufps  $1, %xmm1, %xmm1
  movss %xmm1, 28(%rsp)
  movss %xmm1, 24(%rsp)
  movss %xmm1, 20(%rsp)
  movss %xmm1, 16(%rsp)
  vextractf128  $0, %ymm0, %xmm0
  shufps  $1, %xmm0, %xmm0
  movss %xmm0, 12(%rsp)
  movss %xmm0, 8(%rsp)
  movss %xmm0, 4(%rsp)
  movss %xmm0, (%rsp)
  vmovaps (%rsp), %ymm0
We get:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vpermilps $85, %ymm0, %ymm0

llvm-svn: 135662

b878caa5
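
The `$85` immediate in the listing above can be reproduced with a small mask encoder. This is a sketch under stated assumptions: `encodeVPermilPS256` is a hypothetical helper, negative mask entries stand for undef elements and are skipped, and the function returns -1 when the mask cannot be expressed as a single immediate-form 256-bit vpermilps.

```cpp
#include <array>

// mask[i] is the source element read by result element i (8 x 32-bit).
// Returns the imm8 value, or -1 if the mask does not fit the instruction.
inline int encodeVPermilPS256(const std::array<int, 8> &mask) {
  int imm = 0;
  for (int i = 0; i < 4; ++i) {
    int lo = mask[i];       // low-lane element
    int hi = mask[i + 4];   // high-lane element in the same position
    if (lo >= 4) return -1;                           // low lane reads outside itself
    if (hi >= 0 && (hi < 4 || hi >= 8)) return -1;    // high lane reads outside itself
    if (lo >= 0 && hi >= 0 && hi - 4 != lo) return -1; // lanes must use the same permutation
    // Take the selector from whichever lane defines it; undef in both gives 0.
    int sel = lo >= 0 ? lo : (hi >= 0 ? hi - 4 : 0);
    imm |= sel << (2 * i);
  }
  return imm;
}
```

A splat of element 1 within each lane, `{1, 1, 1, 1, 5, 5, 5, 5}`, encodes to 85 (`0x55`), matching the `vpermilps $85` in the improved code sequence.
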

Improve splat promotion to handle AVX types: v32i8 and v16i16. Also · fb4920eb

Bruno Cardoso Lopes authored Jul 21, 2011

refactor the code and add a bunch of comments. The final shuffle
emitted by handling 256-bit types is suitable for the VPERM shuffle
instruction which is going to be introduced in a next commit (with
a testcase which covers this commit)

llvm-svn: 135661

fb4920eb

Tidy up code · 0bdeacf0
Bruno Cardoso Lopes authored Jul 21, 2011
```
llvm-svn: 135656
```
0bdeacf0

Jul 20, 2011

Goodbye TargetAsmInfo. This eliminates the last bit of CodeGen and Target in llvm-mc. · bbf3b0de

Evan Cheng authored Jul 20, 2011

There is still a bit more refactoring left to do in Targets. But we are now very
close to fixing all the layering issues in MC.

llvm-svn: 135611

bbf3b0de

Jul 18, 2011
- Sink getDwarfRegNum, getLLVMRegNum, getSEHRegNum from TargetRegisterInfo down · d60fa58b
  Evan Cheng authored Jul 18, 2011
```
to MCRegisterInfo. Also initialize the mapping at construction time.

This patch eliminates TargetRegisterInfo from TargetAsmInfo. It's another step
towards fixing the layering violation.

llvm-svn: 135424
```
  d60fa58b
- land David Blaikie's patch to de-constify Type, with a few tweaks. · 229907cd
  Chris Lattner authored Jul 18, 2011
```
llvm-svn: 135375
```
  229907cd
Jul 16, 2011

Fix a couple of things: · 8df9cfc2

Bruno Cardoso Lopes authored Jul 15, 2011

1) Make non-legal 256-bit loads to be promoted to v4i64. This lets us
canonicalize the loads and handle things the same way we handle them
for 128-bit registers. Despite what one of the removed comments
explained, the load promotion would not mess with VPERM; it's only a
matter of doing the appropriate bitcasts when this instruction comes
to be introduced. Also make LOAD v8i32 legal.

2) Doing 1) exposed two bugs:
- v4i64 was being promoted to itself for several opcodes (introduced
in r124447 by David Greene) causing endless recursion and the stack to
explode.
- there was no support for allOnes BUILD_VECTORs and ANDNP would fail to
match because it was generating early target constant pools during
lowering.

3) The testcases are already checked-in, doing 1) exposed the
bugs in the current testcases.

4) Tidy up code to be more clear and explicit about AVX.

llvm-svn: 135313

8df9cfc2

Jul 14, 2011

Check register class matching instead of width of type matching · 92464be2

Eric Christopher authored Jul 14, 2011

when determining validity of matching constraint. Allow i1
types access to the GR8 reg class for x86.

Fixes PR10352 and rdar://9777108

llvm-svn: 135180

92464be2

· 771f2967

Nadav Rotem authored Jul 14, 2011

[VECTOR-SELECT]
During type legalization we often use the SIGN_EXTEND_INREG SDNode.
When this SDNode is legalized during the LegalizeVector phase, it is
scalarized because non-simple types are automatically marked to be expanded.
In this patch we add support for lowering SIGN_EXTEND_INREG manually.
This fixes CodeGen/X86/vec_sext.ll when running with the '-promote-elements'
flag.

llvm-svn: 135144

771f2967
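
For readers unfamiliar with SIGN_EXTEND_INREG: it sign-extends the low bits of a value that already sits in a wider register. One common way to express it is a left shift followed by an arithmetic right shift, shown here for a single 32-bit scalar. Whether the patch uses exactly this sequence for vectors is an assumption; `signExtendInReg` is a hypothetical name.

```cpp
#include <cstdint>

// Sign-extends the low `bits` bits of `value` to 32 bits.
// Assumes 1 <= bits <= 32, and an arithmetic right shift on signed values
// (guaranteed since C++20, the usual behaviour of mainstream compilers before).
inline std::int32_t signExtendInReg(std::uint32_t value, unsigned bits) {
  unsigned shift = 32u - bits;
  std::uint32_t shifted = value << shift;             // move the sign bit to bit 31
  return static_cast<std::int32_t>(shifted) >> shift; // shift back, replicating the sign
}
```

For instance, extending the low 8 bits of `0xFF` gives -1, while `0x7F` stays 127. A vector lowering would apply the same pair of shifts to every element, which avoids scalarizing the node.
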

Jul 13, 2011
- Make X86ISD::ANDNP more general and Codegen 256-bit VANDNP. A more · 9613b649
  Bruno Cardoso Lopes authored Jul 13, 2011
```
general version of X86ISD::ANDNP also opened the room for a little bit
of refactoring.

llvm-svn: 135088
```
  9613b649
- The target specific node PANDN name is misleading. That happens because · 7ba479d2
  Bruno Cardoso Lopes authored Jul 13, 2011
```
it's later selected to an ANDNPD/ANDNPS instruction instead of the PANDN
instruction. Rename it.

llvm-svn: 135087
```
  7ba479d2