Commits · ca29bcfc10c9a9d0c32ff0e0d3211892ca7decde · Roger Ferrer / llvm-epi-0.8

Jan 30, 2012

Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic... · ca29bcfc

Craig Topper authored Jan 30, 2012

Move some XOP patterns into instruction definition. Replace VPCMOV intrinsic patterns with custom lowering to target specific nodes.

llvm-svn: 149216

ca29bcfc

Jan 25, 2012

Custom lower PSIGN and PSHUFB intrinsics to their corresponding target... · 78349009

Craig Topper authored Jan 25, 2012

Custom lower PSIGN and PSHUFB intrinsics to their corresponding target specific nodes so we can remove the isel patterns.

llvm-svn: 148933

78349009

Jan 24, 2012

Add comments near load pattern fragments indicating that all integer vector... · 0d8e67ae

Craig Topper authored Jan 24, 2012

Add comments near load pattern fragments indicating that all integer vector loads are promoted to v2i64 or v4i64 so that no one tries to reintroduce pattern fragments for other types.

llvm-svn: 148771

0d8e67ae

Jan 23, 2012
- Remove pattern fragments for v32i8, v16i16, v8i32, v16i8, v8i16, and v4i32... · 20c98df3
  Craig Topper authored Jan 23, 2012
```
Remove pattern fragments for v32i8, v16i16, v8i32, v16i8, v8i16, and v4i32 loads. All integer vector loads are promoted to v2i64 or v4i64 so these pattern fragments can never match. Fix or remove patterns that used these fragments.

llvm-svn: 148672
```
  20c98df3
- Combine X86 CMPPD and CMPPS node types. Simplifies selection code and pattern matching. · 0b7ad76b
  Craig Topper authored Jan 22, 2012
```
llvm-svn: 148670
```
  0b7ad76b
Jan 22, 2012

Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86... · bd488437

Craig Topper authored Jan 22, 2012

Merge PCMPEQB/PCMPEQW/PCMPEQD/PCMPEQQ and PCMPGTB/PCMPGTW/PCMPGTD/PCMPGTQ X86 ISD node types into only two node types. This simplifies opcode selection and pattern matching.

llvm-svn: 148667

bd488437

Add target specific ISD node types for SSE/AVX vector shuffle instructions and... · 09462641

Craig Topper authored Jan 22, 2012

Add target specific ISD node types for SSE/AVX vector shuffle instructions and change all the code that used to create intrinsic nodes to create the new nodes instead.

llvm-svn: 148664

09462641

Jan 19, 2012
- Merge 128-bit and 256-bit SHUFPS/SHUFPD handling. · 80576e8d
  Craig Topper authored Jan 19, 2012
```
llvm-svn: 148466
```
  80576e8d
Jan 01, 2012
- Merge X86 SHUFPS and SHUFPD node types. · 6e54ba7e
  Craig Topper authored Dec 31, 2011
```
llvm-svn: 147394
```
  6e54ba7e
Dec 17, 2011
- Remove an unused X86ISD node type. · a913dde0
  Craig Topper authored Dec 17, 2011
```
llvm-svn: 146833
```
  a913dde0
Dec 11, 2011

Remove some remnants of the old palign pattern fragment that were still hanging... · 1fdfec63

Craig Topper authored Dec 11, 2011

Remove some remnants of the old palign pattern fragment that were still hanging around. Also remove a cast from inside getShuffleVPERM2X128Immediate and getShuffleVPERMILPImmediate since the only caller had already done the cast.

llvm-svn: 146344

1fdfec63

Dec 06, 2011
- Merge floating point and integer UNPCK X86ISD node types. · 8d4ba198
  Craig Topper authored Dec 06, 2011
```
llvm-svn: 145926
```
  8d4ba198
Nov 30, 2011
- Merge VPERM2F128/VPERM2I128 ISD node types. · 0a672eaf
  Craig Topper authored Nov 30, 2011
```
llvm-svn: 145485
```
  0a672eaf
- Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node... · bafd224c
  Craig Topper authored Nov 30, 2011
```
Merge decoding of VPERMILPD and VPERMILPS shuffle masks. Merge X86ISD node type for VPERMILPD/PS. Add instruction selection support for VINSERTI128/VEXTRACTI128.

llvm-svn: 145483
```
  bafd224c
Nov 28, 2011

Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge... · 818a983e

Craig Topper authored Nov 28, 2011

Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.

llvm-svn: 145238

818a983e

Nov 26, 2011

Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD.... · 51280d56

Craig Topper authored Nov 26, 2011

Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. Simplify some shuffle lowering code since V1 can never be UNDEF due to the canonicalization that occurs when shuffle nodes are created.

llvm-svn: 145153

51280d56

Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to... · 7704bd7a

Craig Topper authored Nov 26, 2011

Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type.

llvm-svn: 145148

7704bd7a

Nov 24, 2011

Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit... · d65a4444

Craig Topper authored Nov 24, 2011

Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type distinguish. Also fix the load form of the v8i32 patterns for these to account for the load being promoted to v4i64.

llvm-svn: 145126

d65a4444

Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse... · d2646674

Craig Topper authored Nov 24, 2011

Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse the 128-bit versions and let the vector type distinguish.

llvm-svn: 145125

d2646674

Nov 21, 2011
- Lowering for v32i8 to VPUNPCKLBW/VPUNPCKHBW when AVX2 is enabled. · 6270d072
  Craig Topper authored Nov 21, 2011
```
llvm-svn: 145028
```
  6270d072
- Add support for lowering 256-bit shuffles to VPUNPCKL/H for i16, i32, i64 if AVX2 is enabled. · 669199ca
  Craig Topper authored Nov 21, 2011
```
llvm-svn: 145026
```
  669199ca
Nov 19, 2011
- Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from... · f984efbf
  Craig Topper authored Nov 19, 2011
```
Synthesize SSSE3/AVX 128-bit horizontal integer add/sub instructions from add/sub of appropriate shuffle vectors.

llvm-svn: 144989
```
  f984efbf
- Collapse X86 PSIGNB/PSIGNW/PSIGND node types. · 81390be0
  Craig Topper authored Nov 19, 2011
```
llvm-svn: 144988
```
  81390be0
- Extend VPBLENDVB and VPSIGN lowering to work for AVX2. · de6b73bb
  Craig Topper authored Nov 19, 2011
```
llvm-svn: 144987
```
  de6b73bb
Nov 02, 2011
- Add a bunch more X86 AVX2 instructions and their corresponding intrinsics. · 682b8506
  Craig Topper authored Nov 02, 2011
```
llvm-svn: 143529
```
  682b8506
Sep 22, 2011

Synthesize SSE3/AVX 128 bit horizontal add/sub instructions from · 0e4fcb8e

Duncan Sands authored Sep 22, 2011

floating point add/sub of appropriate shuffle vectors.  Does not
synthesize the 256 bit AVX versions because they work differently.

llvm-svn: 140332

0e4fcb8e
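
The pattern this commit recognizes can be modeled outside the compiler. The sketch below is a plain Python illustration, not LLVM code: `shufflevector`, `haddps`, and `add_of_shuffles` are hypothetical helper names, and vectors are modeled as Python lists. It shows that a 128-bit horizontal add equals an ordinary add of an "even elements" shuffle and an "odd elements" shuffle of the two inputs.

```python
def shufflevector(a, b, mask):
    """Pick elements from the concatenation of a and b, like an IR shuffle."""
    both = a + b
    return [both[i] for i in mask]


def haddps(a, b):
    """Reference semantics of the 128-bit HADDPS instruction."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


def add_of_shuffles(a, b):
    """The shape being matched: add(shuffle<0,2,4,6>, shuffle<1,3,5,7>)."""
    even = shufflevector(a, b, [0, 2, 4, 6])
    odd = shufflevector(a, b, [1, 3, 5, 7])
    return [x + y for x, y in zip(even, odd)]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
print(add_of_shuffles(a, b) == haddps(a, b))
```

The 256-bit AVX form applies this operation separately within each 128-bit lane, so the same pair of whole-vector shuffle masks does not describe it. That matches the commit's note that the 256-bit versions "work differently".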

Sep 13, 2011

Add 256-bit versions of alignedstore and alignedload, to be · 03d6002d

Bruno Cardoso Lopes authored Sep 13, 2011

more strict about the alignment checking. This was found by inspection
and I don't have any testcases so far, although the llvm testsuite runs
without any problem.

llvm-svn: 139625

03d6002d

Sep 12, 2011
- Format patterns, remove unused X86blend patterns · c0c71e16
  Nadav Rotem authored Sep 12, 2011
```
llvm-svn: 139491
```
  c0c71e16
Sep 09, 2011

Implement vector-select support for avx256. Refactor the vblend implementation... · de838dae

Nadav Rotem authored Sep 09, 2011

Implement vector-select support for avx256. Refactor the vblend implementation to have tablegen match the instruction by the node type

llvm-svn: 139400

de838dae

Sep 08, 2011

Add AVX versions of blend vector operations and fix some issues noticed · fb113a00

Bruno Cardoso Lopes authored Sep 08, 2011

in Nadav's r139285 and r139287 commits.

1) Rename vsel.ll to a more descriptive name
2) Change the order of BLEND operands to "Op1, Op2, Cond", this is
necessary because PBLENDVB is already used in different places with
this order, and it was being emitted in the wrong way for vselect
3) Add AVX patterns and tests for the same SSE41 instructions

llvm-svn: 139305

fb113a00

Add X86-SSE4 codegen support for vector-select. · 2550ba2a
Nadav Rotem authored Sep 08, 2011
```
llvm-svn: 139285
```
2550ba2a

Aug 17, 2011

Introduce matching patterns for vbroadcast AVX instruction. The idea is to · be5e9873

Bruno Cardoso Lopes authored Aug 17, 2011

match splats in the form (splat (scalar_to_vector (load ...))) whenever
the load can be folded. All the logic and instruction emission is
working but because of PR8156, there is no way to match loads, because
they can never be folded for splats. Thus, the tests are XFAILed, but
I've tested and exercised all the logic using a relaxed version for
checking the foldable loads, as if the bug was already fixed. This
should work out of the box once PR8156 gets fixed since MayFoldLoad will
work as expected.

llvm-svn: 137810

be5e9873

Aug 12, 2011

The VPERM2F128 is an AVX instruction which permutes between two 256-bit · f15dfe58

Bruno Cardoso Lopes authored Aug 12, 2011

vectors. It operates on 128-bit elements instead of regular scalar
types. Recognize shuffles that are suitable for VPERM2F128 and teach
the x86 legalizer how to handle them.

llvm-svn: 137519

f15dfe58
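
As an illustration of the 128-bit granularity described above, here is a small Python model of the VPERM2F128 immediate encoding. It is only a sketch: each 256-bit source is modeled as a two-item list `[low_half, high_half]`, and the function name and the `ZERO` placeholder are assumptions made for this example rather than anything from the LLVM sources.

```python
ZERO = "zero"  # placeholder for an all-zero 128-bit half (illustrative only)


def vperm2f128(src1, src2, imm8):
    """Model VPERM2F128: build a 256-bit result from 128-bit halves.

    imm8[1:0] selects the low result half and imm8[5:4] the high half:
    0 = src1 low, 1 = src1 high, 2 = src2 low, 3 = src2 high.
    imm8[3] and imm8[7] force the corresponding result half to zero.
    """
    halves = [src1[0], src1[1], src2[0], src2[1]]

    def pick(ctrl):
        if ctrl & 0b1000:
            return ZERO
        return halves[ctrl & 0b11]

    return [pick(imm8 & 0xF), pick((imm8 >> 4) & 0xF)]


a = ["A_lo", "A_hi"]
b = ["B_lo", "B_hi"]
print(vperm2f128(a, b, 0x31))  # high half of a, then high half of b
print(vperm2f128(a, b, 0x20))  # low half of a, then low half of b
```

Because the selectors address whole 128-bit halves, a shuffle mask is only a candidate for this instruction when each half of the result is a contiguous, in-order copy of one source half.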

Jul 29, 2011
- Cleanup PALIGNR handling and remove the old palign pattern fragment. · 6aee3884
  Bruno Cardoso Lopes authored Jul 29, 2011
```
Also make PALIGNR masks not match 256-bit vectors, which aren't supported.
It's also a step to solve PR10489

llvm-svn: 136448
```
  6aee3884
Jul 27, 2011

The vpermilps and vpermilpd have different behaviour regarding the · 27a30a77

Bruno Cardoso Lopes authored Jul 27, 2011

usage of the shuffle bitmask. Both work in 128-bit lanes without
crossing, but in the former the mask of the high part is the same
used by the low part while in the latter both lanes have independent
masks. Handle this properly and add support for vpermilpd.

llvm-svn: 136200

27a30a77
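
The independent per-lane masks of vpermilpd can be illustrated with a short Python model of the immediate form. This is a sketch written for this listing, not code from the commit, and `vpermilpd_imm` is a hypothetical name. Each 64-bit element gets its own selector bit, so the upper lane may be permuted differently from the lower one.

```python
def vpermilpd_imm(src, imm8):
    """Model the immediate form of VPERMILPD on a list of 64-bit elements.

    len(src) is 2 (xmm) or 4 (ymm). Bit i of imm8 chooses, for result
    element i, between the low (0) and high (1) element of the 128-bit
    lane that element i belongs to. Elements never cross lanes.
    """
    assert len(src) in (2, 4)
    out = []
    for i in range(len(src)):
        lane_base = (i // 2) * 2
        sel = (imm8 >> i) & 1
        out.append(src[lane_base + sel])
    return out


ymm = ["d0", "d1", "d2", "d3"]
print(vpermilpd_imm(ymm, 0b0101))  # swap within both lanes
print(vpermilpd_imm(ymm, 0b0001))  # swap in the low lane only
```

The second call is a permutation that a "same mask in both lanes" model cannot express, which is the distinction the commit message draws against vpermilps.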

Remove more dead code! · db5fb914
Bruno Cardoso Lopes authored Jul 27, 2011
```
llvm-svn: 136199
```
db5fb914
Recognize unpckh* masks and match 256-bit versions. The new versions are · f8fe47bd
Bruno Cardoso Lopes authored Jul 26, 2011
```
different from the previous 128-bit because they work in lanes.
Update a few comments and add testcases

llvm-svn: 136157
```
f8fe47bd
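
To make the "work in lanes" remark concrete, the following Python sketch models UNPCKHPS for both widths. The function name is chosen for this example and vectors are plain lists. The 256-bit form interleaves the high half of each 128-bit lane separately instead of interleaving the upper four elements of the whole vector.

```python
def unpckhps(a, b):
    """Model (V)UNPCKHPS on lists of 32-bit elements (4 = xmm, 8 = ymm).

    Within every 128-bit lane, the two highest elements of a and b are
    interleaved: a[2], b[2], a[3], b[3] relative to the lane base.
    """
    assert len(a) == len(b) and len(a) in (4, 8)
    out = []
    for base in range(0, len(a), 4):
        out += [a[base + 2], b[base + 2], a[base + 3], b[base + 3]]
    return out


a = [0, 1, 2, 3, 4, 5, 6, 7]
b = [10, 11, 12, 13, 14, 15, 16, 17]
print(unpckhps(a, b))
```

A shuffle-mask recognizer for the 256-bit case therefore has to expect indices 2, 3 and 6, 7 from each source, not 4 through 7.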

Jul 26, 2011

Cleanup movsldup/movshdup matching. · 957a6a13
Bruno Cardoso Lopes authored Jul 26, 2011
```
27 insertions(+), 62 deletions(-)

llvm-svn: 136047
```
957a6a13

- Handle special scalar_to_vector case: splats. Using a native 128-bit · 123dff0f

Bruno Cardoso Lopes authored Jul 25, 2011

shuffle before inserting on a 256-bit vector.
- Add AVX versions of movd/movq instructions
- Introduce a few COPY patterns to match insert_subvector instructions.
This turns a trivial insert_subvector instruction into a register copy,
coalescing the xmm into a ymm and avoiding emitting one more instruction.

llvm-svn: 136002

123dff0f

Jul 21, 2011

Add support for 256-bit versions of VPERMIL instruction. This is a new · b878caa5

Bruno Cardoso Lopes authored Jul 21, 2011

instruction introduced in AVX, which can operate on 128 and 256-bit vectors.
It considers a 256-bit vector as two independent 128-bit lanes. It can permute
any 32- or 64-bit elements inside a lane, and restricts the second lane to
have the same permutation as the first one. With the improved splat support
introduced earlier today, adding codegen for this instruction enables more
efficient 256-bit code:

Instead of:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vextractf128  $1, %ymm0, %xmm1
  shufps  $1, %xmm1, %xmm1
  movss %xmm1, 28(%rsp)
  movss %xmm1, 24(%rsp)
  movss %xmm1, 20(%rsp)
  movss %xmm1, 16(%rsp)
  vextractf128  $0, %ymm0, %xmm0
  shufps  $1, %xmm0, %xmm0
  movss %xmm0, 12(%rsp)
  movss %xmm0, 8(%rsp)
  movss %xmm0, 4(%rsp)
  movss %xmm0, (%rsp)
  vmovaps (%rsp), %ymm0
We get:
  vextractf128  $0, %ymm0, %xmm0
  punpcklbw %xmm0, %xmm0
  punpckhbw %xmm0, %xmm0
  vinsertf128 $0, %xmm0, %ymm0, %ymm1
  vinsertf128 $1, %xmm0, %ymm1, %ymm0
  vpermilps $85, %ymm0, %ymm0

llvm-svn: 135662

b878caa5
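
The final `vpermilps $85, %ymm0, %ymm0` in the listing above can be checked with a small Python model of the instruction's immediate form. This is a hedged sketch for illustration only: `vpermilps_imm` is a hypothetical name and registers are modeled as lists of element labels.

```python
def vpermilps_imm(src, imm8):
    """Model the immediate form of VPERMILPS on a list of 32-bit elements.

    len(src) is 4 (xmm) or 8 (ymm). Each 128-bit lane holds four elements,
    and the same four 2-bit selectors from imm8 are applied to every lane.
    """
    assert len(src) in (4, 8)
    out = []
    for lane_base in range(0, len(src), 4):
        for i in range(4):
            sel = (imm8 >> (2 * i)) & 0b11
            out.append(src[lane_base + sel])
    return out


ymm0 = ["a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7"]
# 85 == 0b01010101, so every 2-bit selector is 1.
print(vpermilps_imm(ymm0, 85))
```

With an immediate of 85, every selector picks element 1 of its lane, so each lane becomes a splat of its own second element. This is the effect the longer `shufps`/`movss` sequence achieved through the stack.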