Commits · 86b4dfac016231d728f060c1a97c664f87582768 · Roger Ferrer / llvm-epi-0.8

Jan 09, 2012
- Enable FISTTP* instructions when AVX is enabled. · c1ab7afe
  Craig Topper authored Jan 08, 2012
```
llvm-svn: 147758
```
  c1ab7afe
Jan 08, 2012
- Reverted commit #147601 upon Evan's request. · 540651cf
  Victor Umansky authored Jan 08, 2012
```
llvm-svn: 147748
```
  540651cf
Jan 07, 2012
- Fix typo in the X86 backend readme. Patch from Jaeden Amero. · f210619d
  Craig Topper authored Jan 07, 2012
```
llvm-svn: 147739
```
  f210619d
- Remove VectorExtras. This unused helper was written for a type of API that is discouraged now. · 6898db62
  Benjamin Kramer authored Jan 07, 2012
```
llvm-svn: 147738
```
  6898db62
- Remove unnecessary check of hasAVX(). It's already included in hasXMM(). · ca66bba4
  Craig Topper authored Jan 07, 2012
```
llvm-svn: 147734
```
  ca66bba4
- Make the 'x' constraint work for AVX registers as well. · c206d467
  Eric Christopher authored Jan 07, 2012
```
Fixes rdar://10614894

llvm-svn: 147704
```
  c206d467
Jan 05, 2012

Mark scalar FMA4 instructions as ignoring the VEX.L bit. · 29b07374
Craig Topper authored Jan 05, 2012
```
llvm-svn: 147602
```
29b07374

Peephole optimization of ptest-conditioned branch in X86 arch. Performs... · 9255b6d9

Victor Umansky authored Jan 05, 2012

Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX.

Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)

Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov
llvm-svn: 147601

9255b6d9
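
The flag semantics that the ptest peephole above relies on can be sketched as follows. This is a minimal model, not LLVM code: registers are modeled as 128-bit Python integers, and the helper names `ptest_flags`, `ptestz`, and `ptestc` are made up for illustration. PTEST sets ZF when `dst & src` is zero and CF when `~dst & src` is zero, so a branch on the intrinsic's result can jump on the flag directly instead of first materializing it in a register.

```python
MASK128 = (1 << 128) - 1  # model an XMM register as a 128-bit integer


def ptest_flags(dst, src):
    """Return (ZF, CF) as PTEST dst, src would set them (illustrative model)."""
    zf = ((dst & src) & MASK128) == 0   # ZF: no common bits set
    cf = ((~dst & src) & MASK128) == 0  # CF: every bit of src is also set in dst
    return zf, cf


def ptestz(a, b):
    """Model of the ptestz intrinsic: returns the ZF result."""
    return ptest_flags(a, b)[0]


def ptestc(a, b):
    """Model of the ptestc intrinsic: returns the CF result."""
    return ptest_flags(a, b)[1]


def branch_on_ptestz(a, b):
    # After the peephole, this condition corresponds to ptest + one jcc on ZF.
    return "taken" if ptestz(a, b) else "fallthrough"
```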

Replace the uint64_t -> double conversion algorithm with one that's more efficient. · ac27f0c8

Bill Wendling authored Jan 05, 2012

This small bit of ASM code is sufficient to do what the old algorithm did:

     movq       %rax,  %xmm0
     punpckldq  (c0),  %xmm0  // c0: (uint4){ 0x43300000U, 0x45300000U, 0U, 0U }
     subpd      (c1),  %xmm0  // c1: (double2){ 0x1.0p52, 0x1.0p52 * 0x1.0p32 }
   #ifdef __SSE3__
     haddpd   %xmm0, %xmm0          
   #else
     pshufd   $0x4e, %xmm0, %xmm1 
     addpd    %xmm1, %xmm0
   #endif

It's arguably faster. One caveat: the 'haddpd' instruction isn't very fast on
all processors.
<rdar://problem/7719814>

llvm-svn: 147593

ac27f0c8
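
A scalar sketch of the arithmetic behind the uint64_t -> double sequence above, assuming IEEE-754 doubles. The function names are illustrative and not taken from LLVM. Each 32-bit half of the input is placed in the low mantissa bits of a double whose high word is one of the constants from c0, the biases from c1 are subtracted (both subtractions are exact), and one final add produces the rounded result.

```python
import struct


def bits_to_double(hi32, lo32):
    """Build a double from its upper and lower 32-bit words."""
    return struct.unpack("<d", struct.pack("<II", lo32, hi32))[0]


def u64_to_double(x):
    """Convert an unsigned 64-bit integer to double using the bias trick."""
    lo = x & 0xFFFFFFFF
    hi = (x >> 32) & 0xFFFFFFFF
    # punpckldq with c0: pair each half with an exponent word.
    d_lo = bits_to_double(0x43300000, lo)  # value is 2**52 + lo
    d_hi = bits_to_double(0x45300000, hi)  # value is 2**84 + hi * 2**32
    # subpd with c1: remove the biases; both results are exact.
    d_lo -= 2.0 ** 52
    d_hi -= 2.0 ** 84
    # haddpd (or pshufd + addpd): a single add, so the result rounds once.
    return d_hi + d_lo
```

Because only the final add can round, the result should match a correctly rounded conversion, which is what Python's own `float(x)` performs.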

Jan 04, 2012

Silence warnings of a mysterious compiler that still defaults to C89. · 9c48f263
Benjamin Kramer authored Jan 04, 2012
```
llvm-svn: 147553
```
9c48f263

For x86, canonicalize max · 104dbb0f

Evan Cheng authored Jan 04, 2012

(x > y) ? x : y
=>
(x >= y) ? x : y

So for something like
(x - y) > 0 ? (x - y) : 0
It will be
(x - y) >= 0 ? (x - y) : 0

This makes it possible to test sign-bit and eliminate a comparison against
zero. e.g.
subl   %esi, %edi
testl  %edi, %edi
movl   $0, %eax
cmovgl %edi, %eax
=>
xorl   %eax, %eax
subl   %esi, %edi
cmovsl %eax, %edi

rdar://10633221

llvm-svn: 147512

104dbb0f
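
The equivalence behind the max canonicalization above can be checked with a small model. This is a sketch that assumes 32-bit wrapping subtraction, as `subl` performs; the helper names are illustrative. The strict form mirrors `testl` + `cmovgl`, and the sign-bit form mirrors `cmovsl` acting on the sign flag left by the subtraction. The two agree because the only input on which `>` and `>=` differ is a difference of zero, where both select 0.

```python
def to_i32(v):
    """Wrap an integer to a signed 32-bit value, as subl would."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


def max0_strict(x, y):
    """(x - y) > 0 ? (x - y) : 0, as in the testl + cmovgl sequence."""
    d = to_i32(x - y)
    return d if d > 0 else 0


def max0_signbit(x, y):
    """(x - y) >= 0 ? (x - y) : 0, using only the sign bit of the difference."""
    d = to_i32(x - y)
    sign = (d >> 31) & 1  # models SF after subl
    return 0 if sign else d
```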

Fix 80-column violations. · 6ca97df9
Chad Rosier authored Jan 03, 2012
```
llvm-svn: 147495
```
6ca97df9

Jan 03, 2012
- Revert 147426 because it caused pr11696. · 6d31bac8
  Nadav Rotem authored Jan 03, 2012
```
llvm-svn: 147485
```
  6d31bac8
- Enhance DAGCombine for transforming 128->256 casts into a vmovaps, rather · 493c1b31
  Chad Rosier authored Jan 03, 2012
```
than a vxorps + vinsertf128 pair if the original vector came from a load.
rdar://10594409

llvm-svn: 147481
```
  493c1b31
- Intel style asm variant does not need '%' prefix. · c1215324
  Devang Patel authored Jan 03, 2012
```
llvm-svn: 147453
```
  c1215324
Jan 02, 2012

Miscellaneous shuffle lowering cleanup. No functional changes. Primarily... · 5bacb7e9

Craig Topper authored Jan 02, 2012

Miscellaneous shuffle lowering cleanup. No functional changes. Primarily converting the indexing loops to unsigned to be consistent across functions.

llvm-svn: 147430

5bacb7e9

Make CanXFormVExtractWithShuffleIntoLoad reject loads with multiple uses. Also... · 53d55964

Craig Topper authored Jan 02, 2012

Make CanXFormVExtractWithShuffleIntoLoad reject loads with multiple uses. Also make it return false if there's not even a load at all. This makes the code better match the code in DAGCombiner that it tries to match. These two changes prevent some cases where vector_shuffles were making it to instruction selection and causing the older shuffle selection code to be triggered. Also needed to fix a bad pattern that this change exposed. This is the first step towards getting rid of the old shuffle selection support. No test cases yet because there's no way to tell whether a shuffle was handled in the legalize stage or at instruction selection.

llvm-svn: 147428

53d55964

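Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend instructions only look at the highest bit. · 6c7a0e6c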

Nadav Rotem authored Jan 02, 2012

Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend instructions only look at the highest bit.

llvm-svn: 147426

6c7a0e6c
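
The idea behind the blend(sign_extend(x)) to blend(shl(x)) rewrite can be sketched as follows. The assumptions are loud ones: lanes are 32 bits wide, each condition is a single bit held in the low bit of its lane, and all helper names are illustrative. Since the blend only reads the top bit of each mask lane, a mask built by shifting the condition bit to the top selects the same lanes as a mask built by sign extension. Note that this log also records the change being reverted later (r147485, citing pr11696), so the sketch illustrates only the intended equivalence, not the final code.

```python
LANE_BITS = 32  # assumed lane width for this sketch


def blendv(a, b, mask):
    """Per lane: take b where the mask lane's top bit is set, otherwise a."""
    top = 1 << (LANE_BITS - 1)
    return [bv if m & top else av for av, bv, m in zip(a, b, mask)]


def sign_extend_i1(bit):
    """All-ones lane if the condition bit is set, otherwise zero."""
    return (1 << LANE_BITS) - 1 if bit & 1 else 0


def shl_to_top(bit):
    """Move the condition bit into the lane's top bit; other bits stay zero."""
    return (bit & 1) << (LANE_BITS - 1)


def select_both_ways(cond, a, b):
    """Blend with both mask constructions and return the two results."""
    via_sext = blendv(a, b, [sign_extend_i1(c) for c in cond])
    via_shl = blendv(a, b, [shl_to_top(c) for c in cond])
    return via_sext, via_shl
```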

Jan 01, 2012
- Allow CRC32 instructions to be selected when AVX is enabled. · b9109844
  Craig Topper authored Jan 01, 2012
```
llvm-svn: 147411
```
  b9109844
- Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is... · 1c064e0a
  Craig Topper authored Jan 01, 2012
```
Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers.

llvm-svn: 147409
```
  1c064e0a
- X86Disassembler: Fix undefined behavior found by GCC 4.6 · 47aecca5
  Benjamin Kramer authored Jan 01, 2012
```
llvm-svn: 147404
```
  47aecca5
- Merge X86 SHUFPS and SHUFPD node types. · 6e54ba7e
  Craig Topper authored Dec 31, 2011
```
llvm-svn: 147394
```
  6e54ba7e
- Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load. · d51092d9
  Craig Topper authored Dec 31, 2011
```
llvm-svn: 147393
```
  d51092d9
- Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a... · 0e796fee
  Craig Topper authored Dec 31, 2011
```
Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected.

llvm-svn: 147392
```
  0e796fee
Dec 30, 2011

Make FMA4 imply AVX so that YMM registers would be available. Necessitates... · a5d1fc2c

Craig Topper authored Dec 30, 2011

Make FMA4 imply AVX so that YMM registers would be available. Necessitates removing FMA4 from the Bulldozer CPU types, since it would otherwise enable AVX code generation implicitly. Also make SSE4A imply SSE3. Without some level of SSE implied, XMM registers wouldn't be legal.

llvm-svn: 147369

a5d1fc2c
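
How implied features propagate can be sketched as a transitive closure over an implication table. This is a hypothetical model, not the LLVM subtarget code: the lowercase feature names and the `IMPLIES` table are assumptions, and the table holds only the implications stated in this log (FMA4 and FMA3 imply AVX, SSE4A implies SSE3).

```python
# Implication edges taken from the commit messages in this log only.
IMPLIES = {
    "fma4": ["avx"],
    "fma3": ["avx"],
    "sse4a": ["sse3"],
}


def expand_features(requested):
    """Return the requested features plus everything they transitively imply."""
    enabled = set()
    stack = list(requested)
    while stack:
        feature = stack.pop()
        if feature not in enabled:
            enabled.add(feature)
            stack.extend(IMPLIES.get(feature, []))
    return enabled
```

In this model, listing `fma4` for a CPU type silently turns on `avx` as well, which is the side effect the commit message gives as the reason for taking it out of the Bulldozer CPU types.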

Add disassembler support for VPERMIL2PD and VPERMIL2PS. · 2ba766ae
Craig Topper authored Dec 30, 2011
```
llvm-svn: 147368
```
2ba766ae
Add FMA4 instructions to disassembler. · 03a0beda
Craig Topper authored Dec 30, 2011
```
llvm-svn: 147367
```
03a0beda

Separate the concept of having memory access in operand 4 from the concept of... · cd93de93

Craig Topper authored Dec 30, 2011

Separate the concept of having memory access in operand 4 from the concept of having the W bit set for XOP instructions. Removes ORing W-bits in the encoder and will similarly simplify the disassembler implementation.

llvm-svn: 147366

cd93de93

Combine FMA4 SS/SD patterns with the instruction definitions. · c0f9bcb5
Craig Topper authored Dec 30, 2011
```
llvm-svn: 147365
```
c0f9bcb5
Combine FMA4 PS/PD patterns with the instruction definitions. · 51fe43fc
Craig Topper authored Dec 30, 2011
```
llvm-svn: 147364
```
51fe43fc

Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to... · 6c08930c

Craig Topper authored Dec 30, 2011

Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms.

llvm-svn: 147361

6c08930c

Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size,... · 2ca79b9d

Craig Topper authored Dec 30, 2011

Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere.

llvm-svn: 147360

2ca79b9d

Dec 29, 2011
- Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms of FMA3 instructions. · d773607e
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147353
```
  d773607e
- Expose FMA3 instructions to the disassembler. · 8cab06a2
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147351
```
  8cab06a2
- Make FMA3 imply AVX needs to be enabled. Particularly because 256-bit types... · e1bd0512
  Craig Topper authored Dec 29, 2011
```
Make FMA3 imply that AVX is enabled, particularly because 256-bit types aren't valid unless AVX is enabled.

llvm-svn: 147349
```
  e1bd0512
- Change XOP detection to use the correct CPUID bit instead of using the FMA4 bit. · dd286a52
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147348
```
  dd286a52
- Add FeaturePOPCNT to all CPU types that lost it when it was removed from SSE42/SSE4A in r147339. · a060afb5
  Craig Topper authored Dec 29, 2011
```
llvm-svn: 147347
```
  a060afb5
- Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled... · 97f05c57
  Craig Topper authored Dec 29, 2011
```
Mark non-VEX forms of PCLMUL instructions as requiring SSE2 to be enabled along with CLMUL. That's required for the XMM registers to be valid for integer data. Doesn't change any behavior since the CLMUL instructions don't have patterns yet.

llvm-svn: 147345
```
  97f05c57
- Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along... · 1559123c
  Craig Topper authored Dec 29, 2011
```
Mark non-VEX forms of AES instructions as requiring SSE2 to be enabled along with AES, since that's required for the XMM registers to be valid for integer data. Doesn't change any behavior, though, since you can't use an intrinsic with an illegal type anyway. Just makes it consistent with the VEX forms.

llvm-svn: 147344
```
  1559123c
- Remove the separate explicit AES instruction patterns. They are equivalent to... · 9e61291b
  Craig Topper authored Dec 29, 2011
```
Remove the separate explicit AES instruction patterns. They are equivalent to the patterns specified by the instructions. Also remove unnecessary bitconverts from the AES patterns.

llvm-svn: 147342
```
  9e61291b
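
The XOP detection fix listed above (dd286a52) concerns reading the right bit of a CPUID result. A minimal sketch of the distinction, assuming the bit positions from AMD's CPUID documentation for the ECX output of extended leaf 0x80000001 (XOP at bit 11, FMA4 at bit 16); the constant and function names are illustrative, and the ECX value is passed in rather than read from hardware.

```python
XOP_BIT = 11   # assumed: CPUID 0x80000001, ECX bit 11
FMA4_BIT = 16  # assumed: CPUID 0x80000001, ECX bit 16


def has_bit(reg, bit):
    """True if the given bit is set in a 32-bit register value."""
    return bool((reg >> bit) & 1)


def detect(ecx):
    """Decode XOP and FMA4 support independently from an ECX value."""
    return {"xop": has_bit(ecx, XOP_BIT), "fma4": has_bit(ecx, FMA4_BIT)}
```

Testing the FMA4 bit to decide XOP support would misreport any value where the two bits differ, such as `1 << 16`.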