Commits · b7ff9b1599fa81a67297786301678fed660f11e2 · Roger Ferrer / llvm-epi-0.8

Apr 26, 2012

· 81290f4b

Preston Gurd authored Apr 26, 2012

Trivial change to set UseLeaForSP flag in addition to toggling
the FeatureLeaForSP feature bit when LLVM auto-detects Intel Atom.

Patch by Andy Zhang

llvm-svn: 155655

81290f4b

Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to... · 08ccfbe5

Craig Topper authored Apr 26, 2012

Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to corei7-avx, core-avx-i, and core-avx2 cpu names.

llvm-svn: 155618

08ccfbe5
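
To illustrate the kind of CPUID-based check this commit describes, here is a minimal sketch that decodes AVX and AVX2 support from register values that have already been read. The bit positions follow the Intel SDM; the `CpuidSnapshot` struct and both function names are hypothetical and are not LLVM's own code. The check that the OS saves YMM state (OSXSAVE plus XCR0) is an assumption about good practice; the commit message does not say whether it performs one.

```cpp
#include <cstdint>

// Feature bits as documented in the Intel SDM.
constexpr uint32_t kOSXSAVEBit = 1u << 27;  // CPUID.(EAX=1):ECX[27]
constexpr uint32_t kAVXBit = 1u << 28;      // CPUID.(EAX=1):ECX[28]
constexpr uint32_t kAVX2Bit = 1u << 5;      // CPUID.(EAX=7,ECX=0):EBX[5]
constexpr uint64_t kXCR0YmmState = 0x6;     // XCR0 bits 1 (SSE) and 2 (AVX)

// Hypothetical container for values obtained from CPUID and XGETBV.
struct CpuidSnapshot {
  uint32_t leaf1ECX;  // ECX after CPUID with EAX=1
  uint32_t leaf7EBX;  // EBX after CPUID with EAX=7, ECX=0
  uint64_t xcr0;      // XGETBV(0); only meaningful when OSXSAVE is set
};

// AVX is usable only if the CPU reports it and the OS saves YMM state.
bool hasAVX(const CpuidSnapshot &s) {
  const bool cpuSupport = (s.leaf1ECX & kAVXBit) != 0;
  const bool osxsave = (s.leaf1ECX & kOSXSAVEBit) != 0;
  const bool osSavesYmm =
      osxsave && (s.xcr0 & kXCR0YmmState) == kXCR0YmmState;
  return cpuSupport && osSavesYmm;
}

// AVX2 additionally requires the leaf 7 feature bit.
bool hasAVX2(const CpuidSnapshot &s) {
  return hasAVX(s) && (s.leaf7EBX & kAVX2Bit) != 0;
}
```

Keeping the decoding separate from the instruction that reads the registers makes the logic testable on any host, including non-x86 machines.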

Apr 25, 2012

Use vector_shuffles instead of target specific unpack nodes for AVX... · 5ff6dc34

Craig Topper authored Apr 25, 2012

Use vector_shuffles instead of target specific unpack nodes for AVX ZERO_EXTEND/ANY_EXTEND combine. These will be converted to target specific nodes during lowering. This is more consistent with other code.

llvm-svn: 155537

5ff6dc34

Apr 24, 2012

AVX: Add additional vbroadcast replacement sequences for integers. · 810734b7
Nadav Rotem authored Apr 24, 2012
```
Remove the v2f64 patterns because they do not match any vbroadcast
instruction.

llvm-svn: 155461
```
810734b7

AVX2: The BLENDPW instruction selects between vectors of v16i16 using an i8 · 7b7b99c7

Nadav Rotem authored Apr 24, 2012

immediate. We can't use it here because the shuffle code does not check that
the lower part of the word is identical to the upper part.

llvm-svn: 155440

7b7b99c7

AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions · aa3ff8da

Nadav Rotem authored Apr 24, 2012

using the pattern (vbroadcast (i32load src)). In some cases, after we generate
this pattern, new users are added to the load node, which prevents the selection
of the blend pattern. This commit provides fallback patterns which perform
in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1).

llvm-svn: 155437

aa3ff8da

Remove dangling spaces. Fix some other formatting. · 0b65c408
Craig Topper authored Apr 24, 2012
```
llvm-svn: 155429
```
0b65c408
Simplify code a bit and make it compile better. Remove unused parameters. · 6f2a535d
Craig Topper authored Apr 24, 2012
```
llvm-svn: 155428
```
6f2a535d

Apr 23, 2012

Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the... · 3f8acfc3

Nadav Rotem authored Apr 23, 2012

Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the integer type is i8 (commonly used in graphics).

llvm-svn: 155397

3f8acfc3

This patch fixes a problem which arose when using the Post-RA scheduler · 9a091475

Preston Gurd authored Apr 23, 2012

on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().

It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.

It changes the X86 break-anti-dependencies test to use -mcpu=atom, in order
to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.

Patch by Andy Zhang!

Thanks to Jakob and Anton for their reviews.

llvm-svn: 155395

9a091475
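
The split between the two hooks in the commit above can be sketched as follows. This is a toy model only: `RegisterInfoSketch`, `ScavengingTargetSketch`, `X86Sketch`, and `shouldKeepLiveIns` are hypothetical stand-ins, not LLVM classes, and the real signatures may differ. It shows a target that does not need register scavenging but still asks for live-in information to be preserved when post-RA scheduling is enabled.

```cpp
// Hypothetical stand-in for a target's register info interface.
struct RegisterInfoSketch {
  virtual ~RegisterInfoSketch() = default;
  // Pre-existing hook: does the target need the register scavenger?
  virtual bool requiresRegisterScavenging() const { return false; }
  // New hook: should live-ins be kept up to date after register allocation?
  virtual bool trackLivenessAfterRegAlloc() const { return false; }
};

// A target that already required scavenging implements both hooks.
struct ScavengingTargetSketch : RegisterInfoSketch {
  bool requiresRegisterScavenging() const override { return true; }
  bool trackLivenessAfterRegAlloc() const override { return true; }
};

// X86-like target: no scavenging, but liveness is tracked when post-RA
// scheduling is on (per the commit, that is only the case for Atom).
struct X86Sketch : RegisterInfoSketch {
  explicit X86Sketch(bool postRA) : postRAScheduling(postRA) {}
  bool trackLivenessAfterRegAlloc() const override {
    return postRAScheduling;
  }
  bool postRAScheduling;
};

// The decision branch folding would make under this model: it consults the
// liveness hook rather than the scavenging hook.
bool shouldKeepLiveIns(const RegisterInfoSketch &tri) {
  return tri.trackLivenessAfterRegAlloc();
}
```

Separating the two questions lets a target opt in to liveness tracking without also claiming that it needs the scavenger.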

Use MVT instead of EVT through all of LowerVECTOR_SHUFFLEtoBlend and not just... · 153bb34a

Craig Topper authored Apr 23, 2012

Use MVT instead of EVT through all of LowerVECTOR_SHUFFLEtoBlend and not just the switch. Saves a little bit of binary size.

llvm-svn: 155339

153bb34a

Make getZeroVector and getOnesVector more alike as far as how they detect... · 0a2c809d

Craig Topper authored Apr 23, 2012

Make getZeroVector and getOnesVector more alike as far as how they detect 128-bit versus 256-bit vectors. Be explicit about both sizes and use llvm_unreachable. Similar changes to getLegalSplat.

llvm-svn: 155337

0a2c809d

Tidy up by removing some 'else' after 'return' · 2bbe8bcf
Craig Topper authored Apr 23, 2012
```
llvm-svn: 155336
```
2bbe8bcf

Tidy up spacing in LowerVECTOR_SHUFFLEtoBlend. Remove code that checks if... · 5c51eeec

Craig Topper authored Apr 23, 2012

Tidy up spacing in LowerVECTOR_SHUFFLEtoBlend. Remove code that checks if shuffle operand has a different type than the shuffle result since it can never happen.

llvm-svn: 155333

5c51eeec

Add a couple llvm_unreachables. · a52f0d09
Craig Topper authored Apr 23, 2012
```
llvm-svn: 155332
```
a52f0d09
Remove some tab characters. · 984dc015
Craig Topper authored Apr 23, 2012
```
llvm-svn: 155331
```
984dc015
Remove some 'else' after 'return'. No functional change. · ea428fd7
Craig Topper authored Apr 23, 2012
```
llvm-svn: 155330
```
ea428fd7

Apr 22, 2012

Make Extract128BitVector and Insert128BitVector take an unsigned instead of an... · bf7d5666

Craig Topper authored Apr 22, 2012

Make Extract128BitVector and Insert128BitVector take an unsigned instead of a ConstantNode SDValue. getConstant was almost always called just before only to have the functions take it apart and build a new ConstantSDNode.

llvm-svn: 155325

bf7d5666

Convert getNode(UNDEF) to getUNDEF. · 2d474d6d
Craig Topper authored Apr 22, 2012
```
llvm-svn: 155321
```
2d474d6d

Make calls to getVectorShuffle more consistent. Use shuffle VT for calls to... · 860ed0d2

Craig Topper authored Apr 22, 2012

Make calls to getVectorShuffle more consistent. Use shuffle VT for calls to getUNDEF instead of requerying. Use &Mask[0] instead of Mask.data().

llvm-svn: 155320

860ed0d2

Tidy up. 80 columns and argument alignment. · 43397c09
Craig Topper authored Apr 22, 2012
```
llvm-svn: 155319
```
43397c09

Simplify code by converting multiple places that were manually concatenating... · ad56a744

Craig Topper authored Apr 22, 2012

Simplify code by converting multiple places that were manually concatenating 128-bit vectors to use either CONCAT_VECTORS or a helper function. CONCAT_VECTORS will itself be lowered to the same pattern as before. The helper function is needed for concats of BUILD_VECTORs since getNode(CONCAT_VECTORS) will just return a large BUILD_VECTOR and we may be trying to lower large BUILD_VECTORS when this occurs.

llvm-svn: 155318

ad56a744

ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 · 8d7e56c4
Elena Demikhovsky authored Apr 22, 2012
```
llvm-svn: 155309
```
8d7e56c4

Apr 21, 2012
- Make some fixed arrays const. Use array_lengthof in a couple places instead of a hardcoded number. · 6eadae8e
  Craig Topper authored Apr 21, 2012
```
llvm-svn: 155294
```
  6eadae8e
- Tidy up. 80 columns and some other spacing issues. · 2568bf30
  Craig Topper authored Apr 21, 2012
```
llvm-svn: 155291
```
  2568bf30
Apr 20, 2012
- Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change... · abadc660
  Craig Topper authored Apr 20, 2012
```
Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent.

llvm-svn: 155186
```
  abadc660
Apr 19, 2012

Fixed the llvm-mc X86 disassembler so the 'C' API gets jumps properly · ec4bd312

Kevin Enderby authored Apr 18, 2012

symbolicated. These have an operand type of TYPE_RELv which was not handled
as isBranch in translateImmediate() in X86Disassembler.cpp. rdar://11268426

llvm-svn: 155074

ec4bd312

Apr 18, 2012
- Remove AVX vpermil intrinsics. I removed their uses from clang headers and builtins a while back. · d3c9e404
  Craig Topper authored Apr 18, 2012
```
llvm-svn: 154985
```
  d3c9e404
Apr 17, 2012
- Don't decode vperm2i128 or vperm2f128 into a shuffle if bit 3 or 7 of the immediate is set. · 354103d8
  Craig Topper authored Apr 17, 2012
```
llvm-svn: 154907
```
  354103d8
- Temporarily turn off anti-dependency checking · 5333e2e5
  Preston Gurd authored Apr 16, 2012
```
during Post RA scheduling in X86,
until the X86 target is changed to properly set up
post RA liveness.

llvm-svn: 154874
```
  5333e2e5
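
The vperm2i128/vperm2f128 commit listed above under Apr 17 can be illustrated with a small decoder. This is a sketch under stated assumptions: the function name and signature are hypothetical rather than LLVM's, and the immediate layout follows the Intel SDM (bits 1:0 and 5:4 select a 128-bit half from the two sources, bits 3 and 7 zero the corresponding result half). Because a zeroed half cannot be expressed as a plain two-input shuffle mask, the decoder refuses those immediates.

```cpp
#include <cstdint>
#include <vector>

// Decodes a vperm2f128/vperm2i128 immediate into a shuffle mask over the
// concatenation of the two sources: indices [0, numElts) refer to the first
// source and [numElts, 2*numElts) to the second.
// Returns false, leaving `mask` empty, if bit 3 or bit 7 is set.
bool decodeVPerm2X128(uint8_t imm, int numElts, std::vector<int> &mask) {
  mask.clear();
  if (imm & 0x88)
    return false;  // a result half is zeroed, so this is not a pure shuffle

  const int half = numElts / 2;  // elements per 128-bit half
  for (int lane = 0; lane < 2; ++lane) {
    // Selector: 0 = src1 low, 1 = src1 high, 2 = src2 low, 3 = src2 high.
    const int sel = (imm >> (lane * 4)) & 0x3;
    for (int i = 0; i < half; ++i)
      mask.push_back(sel * half + i);
  }
  return true;
}
```

For a four-element vector, an immediate of 0x31 picks the high half of each source, giving the mask 2, 3, 6, 7.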
Apr 16, 2012
- Fix incorrect atomics codegen introduced in r154705, and extend test to catch it. · 12da79b8
  Richard Smith authored Apr 16, 2012
```
llvm-svn: 154845
```
  12da79b8
- Replace vpermd/vpermps intrinsic patterns with custom lowering to target specific nodes. · 4badeb3f
  Craig Topper authored Apr 16, 2012
```
llvm-svn: 154801
```
  4badeb3f
- Change type profile for vpermv back to using operand type for the mask... · 26d7a949
  Craig Topper authored Apr 16, 2012
```
Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps.

llvm-svn: 154798
```
  26d7a949
- Flip the arguments when converting vpermd/vpermps intrinsics into... · c0075aa7
  Craig Topper authored Apr 16, 2012
```
Flip the arguments when converting vpermd/vpermps intrinsics into instructions. The intrinsic has the mask as the last operand, but the instruction has it as the second.

llvm-svn: 154797
```
  c0075aa7
- Merge vpermps/vpermd and vpermpd/vpermq SD nodes. · b86fa404
  Craig Topper authored Apr 16, 2012
```
llvm-svn: 154782
```
  b86fa404
- Fix SDTypeProfile for vpermps. The mask operand should be v8i32. · b04fe340
  Craig Topper authored Apr 16, 2012
```
llvm-svn: 154781
```
  b04fe340
- Spacing fixes and 80 column fixes. Use 0 instead of 0x80 for undef indices in... · 1f8c9eb9
  Craig Topper authored Apr 15, 2012
```
Spacing fixes and 80 column fixes. Use 0 instead of 0x80 for undef indices in vpermps/vpermd. Hardware only looks at lower 3 bits.

llvm-svn: 154780
```
  1f8c9eb9
- Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors. · bfc9a5f7
  Craig Topper authored Apr 15, 2012
```
llvm-svn: 154778
```
  bfc9a5f7
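
Several of the Apr 16 entries above concern vpermd/vpermps, whose variable permute can be modeled in a few lines. This is a scalar sketch, not LLVM code: `permuteVar8` is a hypothetical name, and the elements are treated as raw 32-bit values. It shows why 0 and 0x80 are interchangeable as an undef index: only the low 3 bits of each index are used.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of an 8 x 32-bit variable permute (vpermd/vpermps style):
// each result element is the source element chosen by the low 3 bits of
// the corresponding index; the upper index bits are ignored.
std::array<uint32_t, 8> permuteVar8(const std::array<uint32_t, 8> &src,
                                    const std::array<uint32_t, 8> &idx) {
  std::array<uint32_t, 8> out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = src[idx[i] & 0x7];
  return out;
}
```

The argument order here (source first, indices second) is a choice made for the sketch. As the commit messages note, the intrinsic takes the mask as its last operand while the instruction takes it as its second, so a lowering has to swap them.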
Apr 15, 2012
- Fix PR12529. The Vxx family of instructions are only supported by AVX. · 42bcd04e
  Nadav Rotem authored Apr 15, 2012
```
Use non-vex instructions for SSE4.

llvm-svn: 154770
```
  42bcd04e
- Added VPERM optimization for AVX2 shuffles · 779a72b4
  Elena Demikhovsky authored Apr 15, 2012
```
llvm-svn: 154761
```
  779a72b4