Commits · 37521aa89c2a34301dbb8c1afc82823862ca5194 · Roger Ferrer / llvm-epi-0.8

Sep 16, 2012

The PMOVZXWD family of instructions had patterns that extend narrow vector types to wide vector types. · 37521aa8

Nadav Rotem authored Sep 16, 2012

It had patterns for zext-loading and extending. This commit adds patterns for loading a wide type, performing a bitcast,
and extending. This is an odd pattern, but it is commonly used when writing code with intrinsics.
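
As a rough illustration of the "load a wide type, bitcast, extend" sequence, here is a portable scalar model in C++. The helper name `load_bitcast_zext_u16x4` is hypothetical and not part of LLVM. In intrinsic code the same shape typically looks like a 64-bit load feeding a zero-extend, for instance `_mm_cvtepu16_epi32(_mm_loadl_epi64(p))`; the commit message does not name the intrinsics involved, so treat that as an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scalar model only: hypothetical helper, not an LLVM or intrinsic API.
std::array<std::uint32_t, 4> load_bitcast_zext_u16x4(const void* src) {
    assert(src != nullptr);

    // Step 1: load one wide 64-bit value, as intrinsic code often does.
    std::uint64_t wide;
    std::memcpy(&wide, src, sizeof wide);

    // Step 2: "bitcast" the same 64 bits to four 16-bit lanes.
    std::array<std::uint16_t, 4> lanes;
    std::memcpy(lanes.data(), &wide, sizeof wide);

    // Step 3: zero-extend each 16-bit lane to 32 bits.
    std::array<std::uint32_t, 4> out;
    for (std::size_t i = 0; i < lanes.size(); ++i) {
        out[i] = lanes[i];  // unsigned widening never sign-extends
    }
    return out;
}
```

The point of the added patterns, as described above, is that all three steps can be matched as one extending load instead of a separate load followed by a register-to-register extend.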

rdar://11897677

llvm-svn: 163995

37521aa8

Sep 10, 2012
- Enhance PR11334 fix to support extload from v2f32/v4f32 · 400f7ef8
  Michael Liao authored Sep 10, 2012
```
    
- Fix a remaining issue of PR11674 as well

llvm-svn: 163528
```
  400f7ef8
Sep 08, 2012
- Add instruction selection for ffloor of vectors when SSE4.1 or AVX is enabled. · 4ed79bd7
  Craig Topper authored Sep 08, 2012
```
llvm-svn: 163473
```
  4ed79bd7
Sep 06, 2012

Use iPTR instead of i32 for extract_subvector/insert_subvector index in... · f3e4aa8c

Craig Topper authored Sep 06, 2012

Use iPTR instead of i32 for extract_subvector/insert_subvector index in lowering and patterns. This makes it consistent with the incoming DAG nodes from the DAG builder.

llvm-svn: 163293

f3e4aa8c

Add patterns for converting stores of subvector_extracts of lower 128-bits of... · daa5ed1e

Craig Topper authored Sep 06, 2012

Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr.

llvm-svn: 163292

daa5ed1e

Sep 05, 2012

Remove some of the patterns added in r163196. Increasing the complexity on... · 81f06df6

Craig Topper authored Sep 05, 2012

Remove some of the patterns added in r163196. Increasing the complexity on insert_subvector into undef accomplishes the same thing.

llvm-svn: 163198

81f06df6

Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads.... · f7c87d6e

Craig Topper authored Sep 05, 2012

Add patterns for integer forms of VINSERTF128/VINSERTI128 folded with loads. Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS.

llvm-svn: 163196

f7c87d6e

Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build... · 2db2353b

Craig Topper authored Sep 05, 2012

Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. The same was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores.

llvm-svn: 163192

2db2353b

Sep 01, 2012
- Typos · d6cc4062
  Craig Topper authored Sep 01, 2012
```
llvm-svn: 163053
```
  d6cc4062
Aug 31, 2012
- Clean up AddedComplexity further after adding UseSSEx · 969f3913
  Michael Liao authored Aug 31, 2012
```
llvm-svn: 162973
```
  969f3913
- X86: Fix encoding of 'movd %xmm0, %rax' · e423e865
  Jim Grosbach authored Aug 31, 2012
```
The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v'
prefix, resulting in mis-assembly of the vanilla movd instruction.

llvm-svn: 162963
```
  e423e865
Aug 30, 2012

Introduce 'UseSSEx' to force SSE legacy encoding · bbd10792

Michael Liao authored Aug 30, 2012

- Add 'UseSSEx' to prevent SSE legacy insns from being selected when AVX is
  enabled.

  Because inter-mixing SSE and AVX instructions incurs a penalty, we need
  to prevent SSE legacy insns from being generated unless they are
  explicitly requested through intrinsics. For patterns supported by both
  SSE and AVX, we have so far made sure the AVX insn is tried first by
  relying on AddedComplexity or position in the td file. That is
  error-prone and introduces bugs accidentally.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force either VEX encoding or SSE
  legacy encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.
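
The difference between the two predicate families can be sketched in C++ as follows. This is a minimal model under stated assumptions: the `Subtarget` struct, the `SSELevel` enum, and the method names are hypothetical stand-ins for the subtarget feature queries, not the actual definitions from the patch.

```cpp
#include <cassert>

// Hypothetical feature levels, ordered so that a higher level implies the lower ones.
enum class SSELevel { NoSSE, SSE1, SSE2, SSE3, SSSE3, SSE41, SSE42, AVX, AVX2 };

// Hypothetical stand-in for the subtarget queries used by the predicates.
struct Subtarget {
    SSELevel level;

    // 'HasSSEx' style: stays true when AVX is enabled, because AVX implies SSE2.
    bool hasSSE2() const { return level >= SSELevel::SSE2; }
    bool hasAVX() const { return level >= SSELevel::AVX; }

    // 'UseSSEx' style: true only while AVX is off, so a pattern guarded by it
    // cannot select the legacy SSE encoding on an AVX target.
    bool useSSE2() const { return hasSSE2() && !hasAVX(); }
};
```

With a model like this, a legacy-encoded pattern guarded by `useSSE2()` and a VEX-encoded pattern guarded by `hasAVX()` are mutually exclusive, so selection no longer depends on AddedComplexity or on the order of definitions in the td file.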

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation differs from the SSE one. 'vcvtsi2sdq' cannot be
   selected by fast-isel because of its complicated pattern, so fast-isel
   falls back to materializing it from the constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation is different from SSE one after fixing SSE/AVX
   inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
   'vmovaps'.

llvm-svn: 162919

bbd10792

Aug 28, 2012
- The commutative flag is already correctly set within the multiclass. If we set · cc567180
  Bill Wendling authored Aug 28, 2012
```
it here, then a 'register-memory' version would wrongly get the commutative
flag.
<rdar://problem/12180135>

llvm-svn: 162741
```
  cc567180
- Convert V_SETALLONES/AVX_SETALLONES/AVX2_SETALLONES to Post-RA pseudos. · 72f51c39
  Craig Topper authored Aug 28, 2012
```
llvm-svn: 162740
```
  72f51c39
- Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo. · bd509eea
  Craig Topper authored Aug 28, 2012
```
llvm-svn: 162738
```
  bd509eea
- More missing mayLoad flags on AVX multiclasses. · 89d6b29d
  Jakob Stoklund Olesen authored Aug 28, 2012
```
llvm-svn: 162714
```
  89d6b29d
Aug 27, 2012

Don't allow vextractf128 to be folded with unaligned stores. We don't fold... · 5af2fed5

Craig Topper authored Aug 27, 2012

Don't allow vextractf128 to be folded with unaligned stores. We don't fold unaligned loads, so we shouldn't fold unaligned stores either, as doing so can cause an alignment fault.

llvm-svn: 162658

5af2fed5

Fold some patterns into instruction definitions so tablegen can infer flags... · 6d44554c

Craig Topper authored Aug 27, 2012

Fold some patterns into instruction definitions so tablegen can infer flags, removing the need for an explicit 'neverHasSideEffects = 1'.

llvm-svn: 162656

6d44554c

Add HasAVX1Only predicate and use it for patterns that have an AVX1... · f7828f91

Craig Topper authored Aug 27, 2012

Add HasAVX1Only predicate and use it for patterns that have an AVX1 instruction and an AVX2 instruction rather than relying on AddedComplexity.

llvm-svn: 162654

f7828f91

Aug 25, 2012
- Add missing mayLoad flags to a large class of AVX *_Int instructions. · 3d91b43a
  Jakob Stoklund Olesen authored Aug 24, 2012
```
llvm-svn: 162622
```
  3d91b43a
Aug 24, 2012

Remove some spurious mayLoad = 0 flags. · d3511235

Jakob Stoklund Olesen authored Aug 24, 2012

They were inserted to silence TableGen's warning about
redundant properties. That warning is now gone.

llvm-svn: 162517

d3511235

Aug 19, 2012

When unsafe math is used, we can use commutative FMAX and FMIN. In some cases · 178250ad

Nadav Rotem authored Aug 19, 2012

this allows for better code generation.

Added a new DAGCombine transformation to convert FMAX and FMIN to FMAXC and
FMINC, which are commutative.

For example:

  movaps  %xmm0, %xmm1
  movsd LC(%rip), %xmm0
  minsd %xmm1, %xmm0

becomes:

  minsd LC(%rip), %xmm0
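
Why the commutative form is only valid under unsafe math can be seen from a small C++ model of the scalar min semantics. This is a sketch, assuming the usual x86 rule that `minsd` returns its second (source) operand whenever the comparison is false; the names `x86_min` and `min_is_order_independent` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

const double kNaN = std::numeric_limits<double>::quiet_NaN();

// Model of scalar MINSD: the first operand wins only if a < b is true;
// otherwise (including NaN inputs and equal zeros) the second operand is returned.
double x86_min(double a, double b) { return a < b ? a : b; }

// Returns true if swapping the operands yields the same value, NaN-ness, and sign.
bool min_is_order_independent(double a, double b) {
    const double x = x86_min(a, b);
    const double y = x86_min(b, a);
    if (std::isnan(x) || std::isnan(y)) return std::isnan(x) && std::isnan(y);
    return x == y && std::signbit(x) == std::signbit(y);
}
```

For ordinary values the operand order does not matter, but it does for a NaN operand or for `+0.0` versus `-0.0`. Unsafe math lets the compiler ignore those cases, which is what allows the operands to be swapped so that the constant-pool load becomes the memory operand.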

llvm-svn: 162187

178250ad

Aug 14, 2012

fix PR11334 · 34107b91

Michael Liao authored Aug 14, 2012

- FP_EXTEND only supports extending from vectors with a matching number
  of elements. This results in the scalarization of extending to v2f64
  from v2f32, since v2f32 is legalized to v4f32, which does not match v2f64.
- add X86-specific VFPEXT supporting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover the original
  extension from v4f32 to v2f64.
- test case is enhanced to include different vector widths.
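
The intended semantics of the v4f32 to v2f64 extension can be modeled in a few lines of C++. This is a sketch only: `vfpext_low2` is a hypothetical name, and it assumes the node widens the low two lanes of the four-lane input, in the style of a CVTPS2PD conversion, which the commit message does not spell out.

```cpp
#include <array>
#include <cassert>

// Scalar model: the v2f32 value lives in the low two lanes of a v4f32 after
// legalization; the upper two lanes are ignored by the extension.
std::array<double, 2> vfpext_low2(const std::array<float, 4>& v) {
    return {{static_cast<double>(v[0]), static_cast<double>(v[1])}};
}
```

Because every float is exactly representable as a double, the widening itself is lossless; the work in the fix is in recognizing the scalarized form and mapping it back onto a single vector conversion.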

llvm-svn: 161894

34107b91

Aug 06, 2012

Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom... · ab47fe4e

Craig Topper authored Aug 06, 2012

Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.

llvm-svn: 161318

ab47fe4e

Aug 05, 2012
- Remove custom inserter for MWAIT. It doesn't do anything that couldn't be represented in a pattern. · 6d0408d3
  Craig Topper authored Aug 05, 2012
```
llvm-svn: 161306
```
  6d0408d3
Aug 02, 2012
- X86: mark GATHER instructions as mayLoad · 40591453
  Manman Ren authored Aug 01, 2012
```
llvm-svn: 161143
```
  40591453
Jul 30, 2012
- Give VCVTTPD2DQ priority over CVTTPD2DQ. · 14eac5dd
  Craig Topper authored Jul 30, 2012
```
llvm-svn: 160942
```
  14eac5dd
- Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1. · f881d385
  Craig Topper authored Jul 30, 2012
```
llvm-svn: 160941
```
  f881d385
- Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form.... · 415b3586
  Craig Topper authored Jul 30, 2012
```
Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form. Add an OptForSpeed to explicitly pair up with an OptForSize that was already on another pattern.

llvm-svn: 160939
```
  415b3586
- Fix load types on intrinsic forms of SS2SD and SD2SS AVX/SSE convert instruction patterns. · 28402efc
  Craig Topper authored Jul 29, 2012
```
llvm-svn: 160938
```
  28402efc
- Move more SSE/AVX convert instruction patterns into their definitions. · b6767f3a
  Craig Topper authored Jul 29, 2012
```
llvm-svn: 160937
```
  b6767f3a
Jul 28, 2012
- Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions. · fc93281c
  Craig Topper authored Jul 28, 2012
```
llvm-svn: 160922
```
  fc93281c
- Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects. · 024797b9
  Craig Topper authored Jul 28, 2012
```
llvm-svn: 160921
```
  024797b9
- Make CVTSS2SI instruction definition consistent with CVTSD2SI. · 44f9b534
  Craig Topper authored Jul 28, 2012
```
llvm-svn: 160914
```
  44f9b534
- Fix up memory load types for SSE scalar convert intrinsic patterns. · 1c1aef07
  Craig Topper authored Jul 28, 2012
```
llvm-svn: 160913
```
  1c1aef07
Jul 27, 2012

Remove the last mentions of sub_ss and sub_sd from patterns. · 77cd55b4
Jakob Stoklund Olesen authored Jul 26, 2012
```
I'll remove these two sub-register indexes shortly.

llvm-svn: 160831
```
77cd55b4

Eliminate sub_ss, sub_sd from broadcast patterns. · b96d0b4e

Jakob Stoklund Olesen authored Jul 26, 2012

The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but
copyPhysReg does the right thing with it. (The old pattern would
eventually produce the same cross-class copy).

llvm-svn: 160830

b96d0b4e

Eliminate more sub_ss / sub_sd patterns. · 206b825f

Jakob Stoklund Olesen authored Jul 26, 2012

This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns,
simplifying the emitted code a bit.

llvm-svn: 160820

206b825f

Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd. · 75d17b05

Jakob Stoklund Olesen authored Jul 26, 2012

The SUBREG_TO_REG instruction has magic semantics asserting that the
source value was defined by an instruction that cleared the high half of
the register. Those semantics are never actually exploited for xmm
registers.

llvm-svn: 160818

75d17b05

Jul 26, 2012

Eliminate a batch of uses of sub_ss and sub_sd in the X86 target. · ceee4a9d

Jakob Stoklund Olesen authored Jul 26, 2012

These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves. They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

llvm-svn: 160816

ceee4a9d