  Jan 11, 2012
    • Unify the interface of the three mask+shift transform helpers, and · 3dbcda84
      Chandler Carruth authored
      factor the differences that were hiding in one of them into its other
      caller, the SRL handling code. No change in behavior.
      
      llvm-svn: 147940
    • Clarify and make explicit some of the requirements for transforming · aa01e666
      Chandler Carruth authored
      mask+shift pairs at the beginning of the ISD::AND case block, and then
      hoist the final pattern into a helper function, simplifying and
      reflowing it appropriately. This should have no observable behavior
      change, but several simplifications fell out of this such as directly
      computing the new mask constant, etc.
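
      For illustration, a minimal C sketch of the kind of identity such a
      transform relies on, assuming it is the standard commutation of a mask
      with a right shift (the constants here are hypothetical, not the ones
      this commit handles):

        #include <assert.h>

        int main(void) {
          /* (x & M) >> C == (x >> C) & (M >> C), so the new mask
             constant can be computed directly as M >> C. */
          unsigned x = 0xDEADBEEFu, M = 0x0000FFF0u;
          unsigned C = 4;
          assert(((x & M) >> C) == ((x >> C) & (M >> C)));
          return 0;
        }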
      
      llvm-svn: 147939
    • Fix undefined code and reenable test case. · 60399837
      Jakob Stoklund Olesen authored
      I don't think the compact encoding code is right, but at least it has
      defined behavior now.
      
      llvm-svn: 147938
    • Hoist the logic to transform shift+mask combinations into sub-register · 51d3076b
      Chandler Carruth authored
      extracts and scaled addressing modes into its own helper function. No
      functionality changed here, just hoisting and layout fixes falling out
      of that hoisting.
      
      llvm-svn: 147937
    • Teach the X86 instruction selection to do some heroic transforms to · 55b2cdee
      Chandler Carruth authored
      detect a pattern which can be implemented with a small 'shl' embedded in
      the addressing mode scale. This happens in real code as follows:
      
        unsigned x = my_accelerator_table[input >> 11];
      
      Here we have some lookup table that we look into using the high bits of
      'input'. Each entry in the table is 4 bytes, which means this
      implicitly gets turned into (once lowered out of a GEP):
      
        *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));
      
      The shift right followed by a shift left is canonicalized to a smaller
      shift right and masking off the low bits. That hides the small shift
      left, which x86's addressing-mode scale is designed to support. We now detect
      masks of this form, and produce the longer shift right followed by the
      proper addressing mode. In addition to saving a (rather large)
      instruction, this also reduces stalls in Intel chips on benchmarks I've
      measured.
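
      The equivalence can be checked directly; here is a small self-contained
      C sketch using the constants from the example above (the assembly in
      the comment is a hand-written sketch of the intended selection, not
      compiler output):

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
          uint32_t input = 0xDEADBEEF;
          /* Canonicalized DAG form: smaller shift right, then mask. */
          uint32_t masked = (input >> 9) & ~3u;
          /* Form the addressing mode wants: shift right by 11, with the
             '<< 2' absorbed into the scale, roughly:
               shrl $11, %ecx
               movl my_accelerator_table(,%rcx,4), %eax */
          uint32_t scaled = (input >> 11) << 2;
          assert(masked == scaled);
          return 0;
        }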
      
      In order for all of this to work, one part of the DAG needs to be
      canonicalized *still further* than it currently is. This involves
      removing pointless 'trunc' nodes between a zextload and a zext. Without
      that, we end up generating spurious masks and hiding the pattern.
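
      A minimal C sketch of why that trunc is removable, assuming the
      truncation width matches the zextload's memory type (the real
      transform operates on SelectionDAG nodes; the C below only
      demonstrates the value identity):

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
          uint16_t mem = 0xBEEF;
          uint32_t loaded = mem;                 /* zextload i16 -> i32 */
          uint32_t roundtrip = (uint16_t)loaded; /* zext (trunc ...) */
          /* Truncating to i16 a value zero-extended from i16 loses
             nothing, so zext (trunc (zextload x)) == zextload x and the
             trunc can be removed. */
          assert(roundtrip == loaded);
          return 0;
        }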
      
      llvm-svn: 147936