Commits · 99b017ae35c2a41f0970bf5c847eb8b34a696702 · Lorenzo Albano / LLVM bpEVL

Mar 31, 2016

[PPC] basic support for Power 9 direct move instructions · 99b017ae

Ehsan Amiri authored Mar 31, 2016

http://reviews.llvm.org/D18097

Initial support does not include any patterns to generate this instructions

llvm-svn: 265031

99b017ae

Mar 28, 2016

[Power9] Implement new vsx instructions: insert, extract, test data class,... · 80722719

Chuang-Yu Cheng authored Mar 28, 2016

[Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat

This change implements the following vsx instructions:

- Scalar Insert/Extract
    xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp

- Vector Insert/Extract
    xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp
    xxextractuw xxinsertw

- Scalar/Vector Test Data Class
    xststdcdp xststdcsp xststdcqp
    xvtstdcdp xvtstdcsp

- Maximum/Minimum
    xsmaxcdp xsmaxjdp
    xsmincdp xsminjdp

- Vector Byte-Reverse/Permute/Splat
    xxbrd xxbrh xxbrq xxbrw
    xxperm xxpermr
    xxspltib

30 instructions

Thanks Nemanja for invaluable discussion! Thanks Kit's great help!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16842

llvm-svn: 264567

80722719

[Power9] Implement new vsx instructions: quad-precision move, fp-arithmetic · 56638489

Chuang-Yu Cheng authored Mar 28, 2016

This change implements the following vsx instructions:

- quad-precision move
    xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp

- quad-precision fp-arithmetic
    xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o)
    xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o)

22 instructions

Thanks Nemanja and Kit for careful review and invaluable discussion!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16110

llvm-svn: 264565

56638489

Mar 08, 2016

[Power9] Implement new vsx instructions: load, store instructions for vector and scalar · ba532dc8

Kit Barton authored Mar 08, 2016

We follow the comments mentioned in http://reviews.llvm.org/D16842#344378 to
implement this new patch.

This patch implements the following vsx instructions:

Vector load/store:
lxv lxvx lxvb16x lxvl lxvll lxvh8x lxvwsx
stxv stxvb16x stxvh8x stxvl stxvll stxvx
Scalar load/store:
lxsd lxssp lxsibzx lxsihzx
stxsd stxssp stxsibx stxsihx
21 instructions

Phabricator: http://reviews.llvm.org/D16919
llvm-svn: 262906

ba532dc8

Feb 26, 2016

Power9] Implement new vsx instructions: compare and conversion · 93612ec5

Kit Barton authored Feb 26, 2016

This change implements the following vsx instructions:

Quad/Double-Precision Compare:
xscmpoqp xscmpuqp
xscmpexpdp xscmpexpqp
xscmpeqdp xscmpgedp xscmpgtdp xscmpnedp
xvcmpnedp(.) xvcmpnesp(.)
Quad-Precision Floating-Point Conversion
xscvqpdp(o) xscvdpqp
xscvqpsdz xscvqpswz xscvqpudz xscvqpuwz xscvsdqp xscvudqp
xscvdphp xscvhpdp xvcvhpsp xvcvsphp
xsrqpi xsrqpix xsrqpxp
28 instructions

Phabricator: http://reviews.llvm.org/D16709
llvm-svn: 262068

93612ec5

Dec 15, 2015

Bitcasts between FP and INT values using direct moves · 8922476b

Nemanja Ivanovic authored Dec 15, 2015

This patch corresponds to review:
http://reviews.llvm.org/D15286

This patch was meant to land in revision 255246, but I accidentally uploaded
the patch that corresponds to http://reviews.llvm.org/D15372 in that revision
accidentally.

Thereby, this patch is the actual Bitcasts using direct moves patch, whereas
http://reviews.llvm.org/rL255246 actually corresponds to
http://reviews.llvm.org/D15372.

llvm-svn: 255649

8922476b

Dec 11, 2015

Start replacing vector_extract/vector_insert with extractelt/insertelt · fbd9bbfd

Matt Arsenault authored Dec 11, 2015

These are redundant pairs of nodes defined for
INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT.
insertelement/extractelement are slightly closer to the corresponding
C++ node name, and has stricter type checking so prefer it.

Update targets to only use these nodes where it is trivial to do so.
AArch64, ARM, and Mips all have various type errors on simple replacement,
so they will need work to fix.

Example from AArch64:

def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8),
          (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>;

Which is trying to do sext_inreg i8, i8.

llvm-svn: 255359

fbd9bbfd

Dec 10, 2015

Bitcasts between FP and INT values using direct moves · ac8d01ad

Nemanja Ivanovic authored Dec 10, 2015

This patch corresponds to review:
http://reviews.llvm.org/D15286

LLVM IR frequently contains bitcast operations between floating point and
integer values of the same width. Doing this through memory operations is
quite expensive on PPC. This patch allows the use of direct register moves
between FPRs and GPRs for lowering bitcasts.

llvm-svn: 255246

ac8d01ad

Oct 09, 2015

Vector element extraction without stack operations on Power 8 · d3896573

Nemanja Ivanovic authored Oct 09, 2015

This patch corresponds to review:
http://reviews.llvm.org/D12032

This patch builds onto the patch that provided scalar to vector conversions
without stack operations (D11471).
Included in this patch:

    - Vector element extraction for all vector types with constant element number
    - Vector element extraction for v16i8 and v8i16 with variable element number
    - Removal of some unnecessary COPY_TO_REGCLASS operations that ended up
      unnecessarily moving things around between registers

Not included in this patch (will be in upcoming patch):

    - Vector element extraction for v4i32, v4f32, v2i64 and v2f64 with
      variable element number
    - Vector element insertion for variable/constant element number

Testing is provided for all extractions. The extractions that are not
implemented yet are just placeholders.

llvm-svn: 249822

d3896573

Sep 29, 2015

Addition of interfaces the BE to conform to Table A-2 of ELF V2 ABI V1.1 · 2c84b294

Nemanja Ivanovic authored Sep 29, 2015

This patch corresponds to review:
http://reviews.llvm.org/D13191

Back end portion of the fifth round of additions to altivec.h.

llvm-svn: 248809

2c84b294

Aug 31, 2015

[PowerPC] Fixup SELECT_CC (and SETCC) patterns with i1 comparison operands · a2cdbce6

Hal Finkel authored Aug 30, 2015

There were really two problems here. The first was that we had the truth tables
for signed i1 comparisons backward. I imagine these are not very common, but if
you have:
  setcc i1 x, y, LT
this has the '0 1' and the '1 0' results flipped compared to:
  setcc i1 x, y, ULT
because, in the signed case, '1 0' is really '-1 0', and the answer is not the
same as in the unsigned case.

The second problem was that we did not have patterns (at all) for the unsigned
comparisons select_cc nodes for i1 comparison operands. This was the specific
cause of PR24552. These had to be added (and a missing Altivec promotion added
as well) to make sure these function for all types. I've added a bunch more
test cases for these patterns, and there are a few FIXMEs in the test case
regarding code-quality.

Fixes PR24552.

llvm-svn: 246400

a2cdbce6

Aug 13, 2015

Scalar to vector conversions using direct moves · 1c39ca65

Nemanja Ivanovic authored Aug 13, 2015

This patch corresponds to review:
http://reviews.llvm.org/D11471

It improves the code generated for converting a scalar to a vector value. With
direct moves from GPRs to VSRs, we no longer require expensive stack operations
for this. Subsequent patches will handle the reverse case and more general
operations between vectors and their scalar elements.

llvm-svn: 244921

1c39ca65

Jul 14, 2015

Add missing builtins to the PPC back end for ABI compliance (vol. 4) · 984a3613

Nemanja Ivanovic authored Jul 14, 2015

This patch corresponds to review:
http://reviews.llvm.org/D11183

Back end portion of the fourth round of additions to altivec.h.

llvm-svn: 242167

984a3613

Jul 10, 2015
- NFC. Added a blank line for consistency. · d9e4b4ff
  Nemanja Ivanovic authored Jul 10, 2015
```
llvm-svn: 241913
```
  d9e4b4ff
- Add missing builtins to the PPC back end for ABI compliance (vol. 3) · 5655fb32
  Nemanja Ivanovic authored Jul 10, 2015
```
This patch corresponds to review:
http://reviews.llvm.org/D10973

Back end portion of the third round of additions to altivec.h.

llvm-svn: 241900
```
  5655fb32
Jul 05, 2015

Add missing builtins to the PPC back end for ABI compliance (vol. 2) · d358b8f8

Nemanja Ivanovic authored Jul 05, 2015

This patch corresponds to review:
http://reviews.llvm.org/D10874

Back end portion of the second round of additions to altivec.h.

llvm-svn: 241398

d358b8f8

Jun 26, 2015

Add missing builtins to the PPC back end for ABI compliance (vol. 1) · f502a428

Nemanja Ivanovic authored Jun 26, 2015

This patch corresponds to review:
http://reviews.llvm.org/D10638

This is the back end portion of patch
http://reviews.llvm.org/D10637
It just adds the code gen and intrinsic functions necessary to support that patch to the back end.

llvm-svn: 240820

f502a428

May 29, 2015

Add support for VSX FMA single-precision instructions to the PPC back end · 376e1736

Nemanja Ivanovic authored May 29, 2015

This patch corresponds to review:
http://reviews.llvm.org/D9941

It adds the various FMA instructions introduced in the version 2.07 of
the ISA along with the testing for them. These are operations on single
precision scalar values in VSX registers.

llvm-svn: 238578

376e1736

May 21, 2015

Add support for VSX scalar single-precision arithmetic in the PPC target · f02def6c

Nemanja Ivanovic authored May 21, 2015

http://reviews.llvm.org/D9891
Following up on the VSX single precision loads and stores added earlier, this
adds support for elementary arithmetic operations on single precision values
in VSX registers. These instructions utilize the new VSSRC register class.
Instructions added:
xsaddsp
xsdivsp
xsmulsp
xsresp
xsrsqrtesp
xssqrtsp
xssubsp

llvm-svn: 237937

f02def6c

May 07, 2015

Add VSX Scalar loads and stores to the PPC back end · f3c94b1e

Nemanja Ivanovic authored May 07, 2015

This patch corresponds to review:
http://reviews.llvm.org/D9440

It adds a new register class to the PPC back end to contain single precision
values in VSX registers. Additionally, it adds scalar loads and stores for
VSX registers.

llvm-svn: 236755

f3c94b1e

May 05, 2015

This patch adds ABI support for v1i128 data type. · d4eb73c0

Kit Barton authored May 05, 2015

It adds v1i128 to the appropriate register classes and checks parameter passing
and return values.

This is related to http://reviews.llvm.org/D9081, which will add instructions
that exploit the v1i128 datatype.

Phabricator review: http://reviews.llvm.org/D9475

llvm-svn: 236503

d4eb73c0

Apr 27, 2015

[PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations · fe723b9a

Bill Schmidt authored Apr 27, 2015

This patch adds a new SSA MI pass that runs on little-endian PPC64
code with VSX enabled. Loads and stores of 4x32 and 2x64 vectors
without alignment constraints are accomplished for little-endian using
lxvd2x/xxswapd and xxswapd/stxvd2x. The existence of the additional
xxswapd instructions hurts performance in comparison with big-endian
code, but they are necessary in the general case to support correct
semantics.

However, the general case does not apply to most vector code. Many
vector instructions are lane-insensitive; they do not "care" which
lanes the parallel computations are performed within, provided that
the resulting data is stored into the correct locations. Thus this
pass looks for computations that perform only lane-insensitive
operations, and remove the unnecessary swaps from loads and stores in
such computations.

Future improvements will allow computations using certain
lane-sensitive operations to also be optimized in this manner, by
modifying the lane-sensitive operations to account for the permuted
order of the lanes. However, this patch only adds the infrastructure
to permit this; no lane-sensitive operations are optimized at this
time.

This code is heavily exercised by the various vectorizing applications
in the projects/test-suite tree. For the time being, I have only added
one simple test case to demonstrate what the pass is doing. Although
it is quite simple, it provides coverage for much of the code,
including the special case handling of copies and subreg-to-reg
operations feeding the swaps. I plan to add additional tests in the
future as I fill in more of the "special handling" code.

Two existing tests were affected, because they expected the swaps to
be present, but they are now removed.

llvm-svn: 235910

fe723b9a

Apr 11, 2015

Add direct moves to/from VSR and exploit them for FP/INT conversions · c38b5311

Nemanja Ivanovic authored Apr 11, 2015

This patch corresponds to review:
http://reviews.llvm.org/D8928

It adds direct move instructions to/from VSX registers to GPR's. These are
exploited for FP <-> INT conversions.

llvm-svn: 234682

c38b5311

Mar 12, 2015

[PowerPC] Remove canFoldAsLoad from instruction definitions · 6a778fb7

Hal Finkel authored Mar 11, 2015

The PowerPC backend had a number of loads that were marked as canFoldAsLoad
(and I'm partially at fault here for copying around the relevant line of
TableGen definitions without really looking at what it meant). This is not
right; PPC (non-memory) instructions don't support direct memory operands, and
so there is nothing a 'foldable' instruction could be folded into.

Noticed by inspection, no test case.

The one thing we might lose by doing this is ability to fold some loads into
stackmap/patchpoint pseudo-instructions. However, this was untested, and would
not obviously have worked for extending loads, and I'd rather re-add support
for that once it can be tested.

llvm-svn: 231982

6a778fb7

Feb 18, 2015

This patch adds the VSX logical instructions introduced in the Power ISA 2.07.... · 298beb5e

Kit Barton authored Feb 18, 2015

This patch adds the VSX logical instructions introduced in the Power ISA 2.07. It also removes the added complexity that favors VMX versions of the three instructions.

Phabricator review: http://reviews.llvm.org/D7616

Commiting on Nemanja's behalf.

llvm-svn: 229694

298beb5e

Feb 01, 2015

[PowerPC] VSX stores don't also read · e3d2b20c

Hal Finkel authored Feb 01, 2015

The VSX store instructions were also picking up an implicit "may read" from the
default pattern, which was an intrinsic (and we don't currently have a way of
specifying write-only intrinsics).

This was causing MI verification to fail for VSX spill restores.

llvm-svn: 227759

e3d2b20c

Dec 09, 2014

[PowerPC 2/4] Little-endian adjustments for VSX insert/extract operations · 10f6eb91

Bill Schmidt authored Dec 09, 2014

For little endian, we need to make some straightforward adjustments in
the code expansions for scalar_to_vector and vector_extract of v2f64.
First, scalar_to_vector must place the scalar into vector element
zero.  However, our implementation of SUBREG_TO_REG will place it into
big-element vector element zero (high-order bits), and for little
endian we need it in the low-order bits.  The LE implementation splats
the high-order doubleword into the low-order doubleword.

Second, the meaning of (vector_extract x, 0) and (vector_extract x, 1)
must be reversed for similar reasons.

A new test is added that tests code generation for insertelement and
extractelement for both element 0 and element 1.  It is disabled in
this patch but enabled in patch 4/4, for reasons stated in the test.

llvm-svn: 223788

10f6eb91

[PowerPC 1/4] Little-endian adjustments for VSX loads/stores · fae5d715

Bill Schmidt authored Dec 09, 2014

This patch addresses the inherent big-endian bias in the lxvd2x,
lxvw4x, stxvd2x, and stxvw4x instructions. These instructions load
vector elements into registers left-to-right (with the first element
loaded into the high-order bits of the register), regardless of the
endian setting of the processor. However, these are the only
vector memory instructions that permit unaligned storage accesses, so
we want to use them for little-endian.

To make this work, a lxvd2x or lxvw4x is replaced with an lxvd2x
followed by an xxswapd, which swaps the doublewords. This works for
lxvw4x as well as lxvd2x, because for lxvw4x on an LE system the
vector elements are in LE order (right-to-left) within each
doubleword. (Thus after lxvw2x of a <4 x float> the elements will
appear as 1, 0, 3, 2. Following the swap, they will appear as 3, 2,
0, 1, as desired.) For stores, an stxvd2x or stxvw4x is replaced
with an stxvd2x preceded by an xxswapd.

Introduction of extra swap instructions provides correctness, but
obviously is not ideal from a performance perspective. Future patches
will address this with optimizations to remove most of the introduced
swaps, which have proven effective in other implementations.

The introduction of the swaps is performed during lowering of LOAD,
STORE, INTRINSIC_W_CHAIN, and INTRINSIC_VOID operations. The latter
are used to translate intrinsics that specify the VSX loads and stores
directly into equivalent sequences for little endian. Thus code that
uses vec_vsx_ld and vec_vsx_st does not have to be modified to be
ported from BE to LE.

We introduce new PPCISD opcodes for LXVD2X, STXVD2X, and XXSWAPD for
use during this lowering step. In PPCInstrVSX.td, we add new SDType
and SDNode definitions for these (PPClxvd2x, PPCstxvd2x, PPCxxswapd).
These are recognized during instruction selection and mapped to the
correct instructions.

Several tests that were written to use -mcpu=pwr7 or pwr8 are modified
to disable VSX on LE variants because code generation changes with
this and subsequent patches in this set. I chose to include all of
these in the first patch than try to rigorously sort out which tests
were broken by one or another of the patches. Sorry about that.

The new test vsx-ldst-builtin-le.ll, and the changes to vsx-ldst.ll,
are disabled until LE support is enabled because of breakages that
occur as noted in those tests. They are re-enabled in patch 4/4.

llvm-svn: 223783

fae5d715

Nov 26, 2014
- Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files. · c50d64b0
  Craig Topper authored Nov 26, 2014
```
llvm-svn: 222801
```
  c50d64b0
Nov 14, 2014

[PowerPC] Add VSX builtins for vec_div · 76746929

Bill Schmidt authored Nov 14, 2014

This patch adds builtin support for xvdivdp and xvdivsp, along with a
test case.  Straightforward stuff.

There's a companion patch for Clang.

llvm-svn: 221983

76746929

Nov 12, 2014

[PowerPC] Add vec_vsx_ld and vec_vsx_st intrinsics · 72954784

Bill Schmidt authored Nov 12, 2014

This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for
PowerPC, which provide programmer access to the lxvd2x, lxvw4x,
stxvd2x, and stxvw4x instructions.

New LLVM intrinsics are provided to represent these four instructions
in IntrinsicsPowerPC.td.  These are patterned after the similar
intrinsics for lvx and stvx (Altivec).  In PPCInstrVSX.td, these
intrinsics are tied to the code gen patterns, with additional patterns
to allow plain vanilla loads and stores to still generate these
instructions.

At -O1 and higher the intrinsics are immediately converted to loads
and stores in InstCombineCalls.cpp.  This will open up more
optimization opportunities while still allowing the correct
instructions to be generated.  (Similar code exists for aligned
Altivec loads and stores.)

The new intrinsics are added to the code that checks for consecutive
loads and stores in PPCISelLowering.cpp, as well as to
PPCTargetLowering::getTgtMemIntrinsic().

There's a new test to verify the correct instructions are generated.
The loads and stores tend to be reordered, so the test just counts
their number.  It runs at -O2, as it's not very effective to test this
at -O0, when many unnecessary loads and stores are generated.

I ended up having to modify vsx-fma-m.ll.  It turns out this test case
is slightly unreliable, but I don't know a good way to prevent
problems with it.  The xvmaddmdp instructions read and write the same
register, which is one of the multiplicands.  Commutativity allows
either to be chosen.  If the FMAs are reordered differently than
expected by the test, the register assignment can be different as a
result.  Hopefully this doesn't change often.

There is a companion patch for Clang.

llvm-svn: 221767

72954784

Oct 31, 2014

[PowerPC] Initial VSX intrinsic support, with min/max for vector double · 1ca69fa6

Bill Schmidt authored Oct 31, 2014

Now that we have initial support for VSX, we can begin adding
intrinsics for programmer access to VSX instructions.  This patch adds
basic support for VSX intrinsics in general, and tests it by
implementing intrinsics for minimum and maximum for the vector double
data type.

The LLVM portion of this is quite straightforward.  There is a
companion patch for Clang.

llvm-svn: 220988

1ca69fa6

Oct 22, 2014

[PATCH] Support select-cc for VSFRC when VSX is enabled · 9c54bbd7

Bill Schmidt authored Oct 22, 2014

A previous patch enabled SELECT_VSRC and SELECT_CC_VSRC for VSX to
handle <2 x double> cases.  This patch adds SELECT_VSFRC and
SELECT_CC_VSFRC to allow use of all 64 vector-scalar registers for the
f64 type when VSX is enabled.  The changes are analogous to those in
the previous patch.  I've added a new variant to vsx.ll to test the
code generation.

(I also cleaned up a little formatting in PPCInstrVSX.td from the
previous patch.)

llvm-svn: 220395

9c54bbd7

[PowerPC] Support select-cc for VSX · 61e65233

Bill Schmidt authored Oct 22, 2014

The tests test/CodeGen/Generic/select-cc.ll and
test/CodeGen/PowerPC/select-cc.ll both fail with VSX enabled.  The
problem is that the lowering logic for the SELECT and SELECT_CC
operations doesn't currently support the VSX registers.  This patch
fixes that.

In lib/Target/PowerPC/PPCInstrInfo.td, we have pseudos to handle this
for other register classes.  Similar pseudos are added in
PPCInstrVSX.td (they must be there, because the "vsrc" register class
definition appears there) for the VSRC register class.  The
SELECT_VSRC pseudo is then used in pattern matching for SELECT_CC.

The rest of the patch just adds logic for SELECT_VSRC wherever similar
logic appears for SELECT_VRRC.

There are no new test cases because the existing tests above test
this, along with a variant in test/CodeGen/PowerPC/vsx.ll.

After discussion with Hal, a future patch will add similar _VSFRC
variants to override f64 type handling (currently using F8RC).

llvm-svn: 220385

61e65233

Oct 17, 2014

[PowerPC] Enable use of lxvw4x/stxvw4x in VSX code generation · 2d1128ac

Bill Schmidt authored Oct 17, 2014

Currently the VSX support enables use of lxvd2x and stxvd2x for 2x64
types, but does not yet use lxvw4x and stxvw4x for 4x32 types.  This
patch adds that support.

As with lxvd2x/stxvd2x, this involves straightforward overriding of
the patterns normally recognized for lvx/stvx, with preference given
to the VSX patterns when VSX is enabled.

In addition, the logic for permitting misaligned memory accesses is
modified so that v4r32 and v4i32 are treated the same as v2f64 and
v2i64 when VSX is enabled.  Finally, the DAG generation for unaligned
loads is changed to just use a normal LOAD (which will become lxvw4x)
on P8 and later hardware, where unaligned loads are preferred over
lvsl/lvx/lvx/vperm.

A number of tests now generate the VSX loads/stores instead of
lvx/stvx, so this patch adds VSX variants to those tests.  I've also
added <4 x float> tests to the vsx.ll test case, and created a
vsx-p8.ll test case to be used for testing code generation for the
P8Vector feature.  For now, that simply tests the unaligned load/store
behavior.

This has been tested along with a temporary patch to enable the VSX
and P8Vector features, with no new regressions encountered with or
without the temporary patch applied.

llvm-svn: 220047

2d1128ac

Oct 09, 2014

[PPC64] VSX indexed-form loads use wrong instruction format · cb34fd09

Bill Schmidt authored Oct 09, 2014

The VSX instruction definitions for lxsdx, lxvd2x, lxvdsx, and lxvw4x
incorrectly use the XForm_1 instruction format, rather than the
XX1Form instruction format.  This is likely a pasto when creating
these instructions, which were based on lvx and so forth.  This patch
uses the correct format.

The existing reformatting test (test/MC/PowerPC/vsx.s) missed this
because the two formats differ only in that XX1Form has an extension
to the target register field in bit 31.  The tests for these
instructions used a target register of 7, so the default of 0 in bit
31 for XForm_1 didn't expose a problem.  For register numbers 32-63
this would be noticeable.  I've changed the test to use higher
register numbers to verify my change is effective.

llvm-svn: 219416

cb34fd09

May 22, 2014

Reset the subtarget for DAGToDAG on every iteration of runOnMachineFunction. · 1b8e7636

Eric Christopher authored May 22, 2014

This required updating the generated functions and TD file accordingly
to be pointers rather than const references.

llvm-svn: 209375

1b8e7636

Apr 01, 2014
- [PowerPC] Add some missing VSX bitcast patterns · 9e0baa6d
  Hal Finkel authored Apr 01, 2014
```
llvm-svn: 205352
```
  9e0baa6d
Mar 30, 2014

[PowerPC] Handle VSX v2i64 SIGN_EXTEND_INREG · 5c0d1454

Hal Finkel authored Mar 30, 2014

sitofp from v2i32 to v2f64 ends up generating a SIGN_EXTEND_INREG v2i64 node
(and similarly for v2i16 and v2i8). Even though there are no sign-extension (or
algebraic shifts) for v2i64 types, we can handle v2i32 sign extensions by
converting two and from v2i64. The small trick necessary here is to shift the
i32 elements into the right lanes before the i32 -> f64 step. This is because
of the big Endian nature of the system, we need the i32 portion in the high
word of the i64 elements.

For v2i16 and v2i8 we can do the same, but we first use the default Altivec
shift-based expansion from v2i16 or v2i8 to v2i32 (by casting to v4i32) and
then apply the above procedure.

llvm-svn: 205146

5c0d1454

Mar 29, 2014

[PowerPC] VSX instruction latency corrections · e8fba987

Hal Finkel authored Mar 29, 2014

The vector divide and sqrt instructions have high latencies, and the scalar
comparisons are like all of the others. On the P7, permutations take an extra
cycle over purely-simple vector ops.

llvm-svn: 205096

e8fba987