Commits · 3c8c385a77bbbcadb96cc0f4c65eaded257da6f8 · Lorenzo Albano / LLVM bpEVL

Jan 26, 2017

[PPC] cleanup of mayLoad/mayStore flags and memory operands. · 3c8c385a

Sean Fertile authored Jan 26, 2017

1) Explicitly sets mayLoad/mayStore property in the tablegen files on load/store
   instructions.
2) Updated the flags on a number of intrinsics indicating that they write
    memory.
3) Added SDNPMemOperand flags for some target dependent SDNodes so that they
   propagate their memory operand

Review: https://reviews.llvm.org/D28818
llvm-svn: 293200

3c8c385a

Dec 15, 2016

[Power9] Allow AnyExt immediates for XXSPLTIB · 552c8e96

Nemanja Ivanovic authored Dec 15, 2016

In some situations, the BUILD_VECTOR node that builds a v16i8 vector by
a splat of an i8 constant will end up with signed 8-bit values and in other
situations, it'll end up with unsigned ones. Handle both situations.
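
As an illustration of why both forms can be accepted, the sketch below reduces a splat constant to its low 8 bits, so the signed and unsigned views of the same byte compare equal. The helper name `splat_byte` is hypothetical and is not taken from the patch.

```python
def splat_byte(imm: int) -> int:
    """Reduce a splat constant to the 8-bit pattern that gets replicated.

    Hypothetical helper for illustration only: it accepts the constant
    whether it was sign-extended (-128..127) or zero-extended (0..255).
    """
    if not -128 <= imm <= 255:
        raise ValueError("constant does not fit in 8 bits")
    return imm & 0xFF


# The signed view (-1) and the unsigned view (255) describe the same byte.
same_byte = splat_byte(-1) == splat_byte(255)
```

Under this model, a matcher that only checked one of the two ranges would miss half of the equivalent constants.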

Fixes PR31340.

llvm-svn: 289804

552c8e96

Dec 09, 2016

[PPC] Add intrinsics for vector extract word and vector insert word. · 1c4109b4

Sean Fertile authored Dec 09, 2016

Revision: https://reviews.llvm.org/D26547
llvm-svn: 289227

1c4109b4

Dec 06, 2016

[PowerPC] Improvements for BUILD_VECTOR Vol. 4 · 15748f49

Nemanja Ivanovic authored Dec 06, 2016

This is the final patch in the series of patches that improves
BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations
to remove redundant instructions. It also adds a large test case which
encompasses a large set of code patterns that build vectors - this test
case was the motivator for this series of patches.

Differential Revision: https://reviews.llvm.org/D26066

llvm-svn: 288800

15748f49

Nov 30, 2016

Revert https://reviews.llvm.org/rL287679 · f57f150b

Nemanja Ivanovic authored Nov 29, 2016

This commit caused some miscompiles that did not show up on any of the bots.
Reverting until we can investigate the cause of those failures.

llvm-svn: 288214

f57f150b

Nov 29, 2016

[PowerPC] Improvements for BUILD_VECTOR Vol. 1 · df1cb520

Nemanja Ivanovic authored Nov 29, 2016

This patch corresponds to review:
https://reviews.llvm.org/D25912

This is the first patch in a series of 4 that improve the lowering and combining
for BUILD_VECTOR nodes on PowerPC.

llvm-svn: 288152

df1cb520

Nov 23, 2016

[PowerPC] Remove InstAlias definitions that cause incorrect assembly · 10fc3cfc

Nemanja Ivanovic authored Nov 23, 2016

In rL283190, I added some InstAlias definitions to generate extended mnemonics
for some uses of the XXPERMDI instruction. However, when the assembler matches
these extended mnemonics, it matches the new instruction in situations where it
should match the old one.
This patch removes these definitions and instead provides the extended
mnemonics through additional instructions that are isCodeGenOnly.

Fixes PR31127.

llvm-svn: 287765

10fc3cfc

Nov 22, 2016

[PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE · b8e30d6d

Nemanja Ivanovic authored Nov 22, 2016

This patch corresponds to review:
https://reviews.llvm.org/D26861

It also fixes PR30730.

Committing on behalf of Lei Huang.

llvm-svn: 287679

b8e30d6d

Nov 15, 2016

vector load store with length (left justified) llvm portion · a19c9e60

Zaara Syeda authored Nov 15, 2016

llvm-svn: 286993

a19c9e60

[PowerPC] Implement BE VSX load/store builtins - llvm portion. · 5f850cd1

Tony Jiang authored Nov 15, 2016

This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
they behave exactly the same as vec_xl and vec_xst, therefore they are
simply implemented by defining a matching macro. On LE, they are implemented
by defining new builtins and intrinsics. For int/float/long long/double, it
is just a load (lxvw4x/lxvd2x) or store (stxvw4x/stxvd2x). For char/char/short,
we also need some extra shuffling before or after calling the builtins to get
the desired BE order. For int128, simply call vec_xl or vec_xst.

llvm-svn: 286967

5f850cd1

Nov 14, 2016

[PPC] Add intrinsic mapping to the xscvhpsp instruction · a435e07d

Sean Fertile authored Nov 14, 2016

Add an intrinsic to expose the 'VSX Scalar Convert Half-Precision to
Single-Precision' instruction.

Differential review: https://reviews.llvm.org/D26536

llvm-svn: 286862

a435e07d

[PPC] add intrinsics for vec extract exp/significand and vec test data class. · adda5b2d

Sean Fertile authored Nov 14, 2016

Differential Revision: https://reviews.llvm.org/D26272

llvm-svn: 286829

adda5b2d

Nov 11, 2016

[PowerPC] Add remaining vector permute builtins in altivec.h - LLVM portion · ec4b0c36

Nemanja Ivanovic authored Nov 11, 2016

This patch corresponds to review:
https://reviews.llvm.org/D26480

Adds all the intrinsics used for various permute builtins that will
be added to altivec.h.

llvm-svn: 286638

ec4b0c36

[PowerPC] Add vector conversion builtins to altivec.h - LLVM portion · 2efc3cb9

Nemanja Ivanovic authored Nov 11, 2016

This patch corresponds to review:
https://reviews.llvm.org/D26307

Adds all the intrinsics used for various conversion builtins that will
be added to altivec.h. These are type conversions between various types of
vectors.

llvm-svn: 286596

2efc3cb9

Oct 26, 2016

[PowerPC] Implement vec_insert_exp builtins - llvm portion · 0f45998b

Nemanja Ivanovic authored Oct 26, 2016

This revision corresponds to review: https://reviews.llvm.org/D25957.
Committing on behalf of Zaara Syeda.

llvm-svn: 285225

0f45998b

Oct 24, 2016

[PPC] Generate positive FP zero using xor insn instead of loading from constant area · c90b02cf

Ehsan Amiri authored Oct 24, 2016

https://reviews.llvm.org/D23614

Currently we load +0.0 from the constant area. It can instead be generated
using an XOR instruction.
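
The idea rests on the fact that the all-zero bit pattern is +0.0 in IEEE 754, and XOR-ing any register with itself yields all zeros. The sketch below checks this for a 64-bit double. The function names are hypothetical and only model the bit-level behaviour, not the instruction the patch emits.

```python
import math
import struct


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit integer pattern as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def xor_with_self(bits: int) -> int:
    """Model XOR-ing a register with itself: the result is always zero."""
    return bits ^ bits


# Start from an arbitrary register value (here, the bit pattern of pi).
zero = bits_to_double(xor_with_self(0x400921FB54442D18))

# copysign distinguishes +0.0 from -0.0, which compare equal otherwise.
is_positive_zero = zero == 0.0 and math.copysign(1.0, zero) == 1.0
```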

llvm-svn: 284995

c90b02cf

Oct 04, 2016

[Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register set · 6354d235

Nemanja Ivanovic authored Oct 04, 2016

This patch corresponds to review:

The newly added VSX D-Form (register + offset) memory ops target the upper half
of the VSX register set. The existing ones target the lower half. In order to
unify these and have the ability to target all the VSX registers using D-Form
operations, this patch defines Pseudo-ops for the loads/stores which are
expanded post-RA. The expansion then chooses the correct opcode based on the
register that was allocated for the operation.
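
A minimal sketch of that post-RA choice, assuming VSX registers are numbered 0..63 with 0..31 as the lower half and 32..63 as the upper half. The function name is hypothetical, and the mnemonics `lfd` and `lxsd` are used here as plausible stand-ins for the existing and the new D-Form scalar loads.

```python
def expand_dform_load(vsx_reg: int) -> str:
    """Pick a D-Form scalar load opcode for an already-allocated register.

    Illustrative model only: the pseudo-op stays opcode-agnostic until the
    register allocator has decided which half of the register set is used.
    """
    if not 0 <= vsx_reg <= 63:
        raise ValueError("VSX register number out of range")
    # Lower half: existing D-Form op. Upper half: newly added D-Form op.
    return "lfd" if vsx_reg < 32 else "lxsd"


lower_choice = expand_dform_load(5)
upper_choice = expand_dform_load(40)
```

Deferring the decision this way lets the allocator use all 64 registers without the instruction selector having to commit to one half up front.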

llvm-svn: 283212

6354d235

[Power9] Part-word VSX integer scalar loads/stores and sign extend instructions · 11049f8f

Nemanja Ivanovic authored Oct 04, 2016

This patch corresponds to review:
https://reviews.llvm.org/D23155

This patch removes the VSHRC register class (based on D20310) and adds
exploitation of the Power9 sub-word integer loads into VSX registers as well
as vector sign extensions.
The new instructions are useful for a few purposes:

    - Int to FP conversions of 1 or 2-byte values loaded from memory
    - Building vectors of 1 or 2-byte integers with values loaded from memory
    - Storing individual 1 or 2-byte elements from integer vectors

This patch implements all of those uses.

llvm-svn: 283190

11049f8f

Sep 27, 2016

[Power9] Builtins for ELF v.2 ABI conformance - back end portion · 6f22b413

Nemanja Ivanovic authored Sep 27, 2016

This patch corresponds to review:
https://reviews.llvm.org/D24396

This patch adds support for the "vector count trailing zeroes",
"vector compare not equal" and "vector compare not equal or zero" instructions
as well as "scalar count trailing zeroes" instructions. It also changes the
vector negation to use XXLNOR (when VSX is enabled) so as not to increase
register pressure (previously this was done with a splat immediate of all
ones followed by an XXLXOR). This was done because the altivec.h
builtins (patch to follow) use vector negation and the use of an additional
register for the splat immediate is not optimal.
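
The equivalence behind this change can be checked on 128-bit integers, assuming "vector negation" here means bitwise complement. The function names below are hypothetical models of the logical operations, not LLVM code.

```python
MASK128 = (1 << 128) - 1  # all-ones pattern for a 128-bit vector register


def xxlnor(a: int, b: int) -> int:
    """Model of a 128-bit bitwise NOR."""
    return ~(a | b) & MASK128


def xxlxor(a: int, b: int) -> int:
    """Model of a 128-bit bitwise XOR."""
    return (a ^ b) & MASK128


def complement_via_splat_and_xor(x: int) -> int:
    """Old sequence: materialise all ones in an extra register, then XOR."""
    all_ones = MASK128  # this constant occupies a second register
    return xxlxor(x, all_ones)


def complement_via_nor(x: int) -> int:
    """New sequence: NOR the value with itself, no extra register needed."""
    return xxlnor(x, x)


sample = 0x0123456789ABCDEF_FEDCBA9876543210
same_result = complement_via_nor(sample) == complement_via_splat_and_xor(sample)
```

Both sequences compute the same value; the NOR form simply avoids holding the all-ones constant live.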

llvm-svn: 282478

6f22b413

Sep 23, 2016

[Power9] Exploit move and splat instructions for build_vector improvement · d2c3c51a

Nemanja Ivanovic authored Sep 23, 2016

This patch corresponds to review:
https://reviews.llvm.org/D21135

This patch exploits the following instructions to improve some build_vector
and extractelement patterns:
mtvsrws
lxvwsx
mtvsrdd
mfvsrld

llvm-svn: 282246

d2c3c51a

Sep 22, 2016

[PowerPC] Remove LE patterns matching generic stores/loads to VSX permuting ops · e78ffede

Nemanja Ivanovic authored Sep 22, 2016

This patch corresponds to:
https://reviews.llvm.org/D21409

The LXVD2X, LXVW4X, STXVD2X and STXVW4X instructions permute the two doublewords
in the vector register when in little-endian mode. Custom code ensures that the
necessary swaps are inserted for these. This patch simply removes the possibility
that a load/store node will match one of these instructions in the SDAG as that
would not insert the necessary swaps.
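
A small model of the problem, treating a vector as a pair of doublewords. The function names are hypothetical; the point is only that a permuting load without the matching swap leaves the doublewords in the wrong order.

```python
def permuting_load(memory: tuple) -> tuple:
    """Model a load that swaps the two doublewords in little-endian mode."""
    d0, d1 = memory
    return (d1, d0)


def swap_doublewords(vec: tuple) -> tuple:
    """Model the swap that custom lowering inserts after a permuting load."""
    return (vec[1], vec[0])


def load_with_swap(memory: tuple) -> tuple:
    """Permuting load followed by the swap: element order is restored."""
    return swap_doublewords(permuting_load(memory))


in_memory = (0x1111, 0x2222)
without_swap = permuting_load(in_memory)  # doublewords end up reversed
with_swap = load_with_swap(in_memory)     # matches the in-memory order
```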

llvm-svn: 282144

e78ffede

[Power9] Add exploitation of non-permuting memory ops · 6e7879c5

Nemanja Ivanovic authored Sep 22, 2016

This patch corresponds to review:
https://reviews.llvm.org/D19825

The new lxvx/stxvx instructions do not require the swaps to line the elements
up correctly. In order to select them over the lxvd2x/lxvw4x instructions which
require swaps, the patterns for the old instructions have a predicate that
ensures they won't be selected on Power9 and newer CPUs.

llvm-svn: 282143

6e7879c5

Aug 18, 2016

[SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround · 2bc3d4d4

Michael Kuperstein authored Aug 18, 2016

The names of the tablegen defs now match the names of the ISD nodes.
This makes the world a slightly saner place, as previously "fround" matched
ISD::FP_ROUND and not ISD::FROUND.

Differential Revision: https://reviews.llvm.org/D23597

llvm-svn: 279129

2bc3d4d4

Jul 18, 2016

[PowerPC] Remove redundant direct moves when extracting integers and converting to FP · d3c284f6

Nemanja Ivanovic authored Jul 18, 2016

This patch corresponds to review:
https://reviews.llvm.org/D21354

We use direct moves for extracting integer elements from vectors. We also use
direct moves when converting integers to FP. When these operations are chained,
we get a direct move out of a VSR followed by a direct move back into a VSR.
These are redundant - all we need to do is line up the element and convert.

llvm-svn: 275796

d3c284f6

Jul 12, 2016

[Power9] Add codegen for VSX word insert/extract instructions · b43bb614

Nemanja Ivanovic authored Jul 12, 2016

This patch corresponds to review:
http://reviews.llvm.org/D20239

It adds exploitation of XXINSERTW and XXEXTRACTUW instructions that
are useful in some cases for inserting and extracting vector elements of
v4[if]32 vectors.

llvm-svn: 275215

b43bb614

[PowerPC] Canonicalize applicable vector shift immediates as swaps · eebbcb6d

Nemanja Ivanovic authored Jul 12, 2016

This patch corresponds to review:
http://reviews.llvm.org/D21358

Vector shifts that have the same semantics as a vector swap are canonicalized
as such to provide additional opportunities for swap removal optimization to
remove unnecessary swaps.
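
One way to see when a shift is really a swap: model a vector as four words and a "shift left double" as taking four consecutive words out of the concatenation of two inputs. When both inputs are the same vector and the shift amount is two words, the result is the doubleword swap. The function names are hypothetical and this is a sketch of the semantics, not of the patch itself.

```python
def shift_left_double_words(a: list, b: list, n: int) -> list:
    """Take 4 words starting at word offset n from the concatenation a + b."""
    if not 0 <= n <= 4:
        raise ValueError("shift amount must be between 0 and 4 words")
    return (a + b)[n:n + 4]


def swap_doublewords(v: list) -> list:
    """Exchange the two doublewords (word pairs) of a 4-word vector."""
    return v[2:] + v[:2]


vec = [10, 11, 12, 13]
shifted_by_two = shift_left_double_words(vec, vec, 2)
is_swap = shifted_by_two == swap_doublewords(vec)
```

Expressing such shifts as swaps gives a later swap-removal pass a single form to look for.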

llvm-svn: 275168

eebbcb6d

Jul 05, 2016

[PowerPC] - Legalize vector types by widening instead of integer promotion · 44513e54

Nemanja Ivanovic authored Jul 05, 2016

This patch corresponds to review:
http://reviews.llvm.org/D20443

It changes the legalization strategy for illegal vector types from integer
promotion to widening. This only applies for vectors with elements of width
that is a multiple of a byte since we have hardware support for vectors with
1, 2, 4, 8 and 16 byte elements.
Integer promotion for vectors is quite expensive on PPC due to the sequence
of breaking apart the vector, extending the elements and reconstituting the
vector. Two of these operations are expensive.
This patch causes between minor and major improvements in performance on most
benchmarks. There are very few benchmarks whose performance regresses. These
regressions can be handled in a subsequent patch with a DAG combine (similar
to how this patch handles int -> fp conversions of illegal vector types).
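
To make the two strategies concrete, the sketch below computes the legal type each one would pick for a small illegal vector, assuming 128-bit vector registers and power-of-two element counts. This is a simplified model with hypothetical function names; the real legalizer considers more cases.

```python
VECTOR_BITS = 128  # assumed width of a vector register


def widen(num_elts: int, elt_bits: int) -> tuple:
    """Widening: keep the element width, add elements to fill the register."""
    return (VECTOR_BITS // elt_bits, elt_bits)


def promote(num_elts: int, elt_bits: int) -> tuple:
    """Integer promotion: keep the element count, enlarge each element."""
    return (num_elts, VECTOR_BITS // num_elts)


def type_name(shape: tuple) -> str:
    """Render (count, bits) in the vNiM notation used in the commit text."""
    return f"v{shape[0]}i{shape[1]}"


# A 4 x i8 vector is illegal: widening pads it, promotion extends each element.
widened = type_name(widen(4, 8))
promoted = type_name(promote(4, 8))
```

With widening, the original elements keep their width, so no per-element extension and reassembly of the vector is needed.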

llvm-svn: 274535

44513e54

May 04, 2016

[PowerPC] Generate VSX version of splat word · 1a2b2f03

Nemanja Ivanovic authored May 04, 2016

This patch corresponds to review:
http://reviews.llvm.org/D18592

It allows the PPC back end to generate the xxspltw instruction where we
previously only emitted vspltw.

llvm-svn: 268516

1a2b2f03

Mar 31, 2016

[PPC] basic support for Power 9 direct move instructions · 99b017ae

Ehsan Amiri authored Mar 31, 2016

http://reviews.llvm.org/D18097

Initial support does not include any patterns to generate these instructions.

llvm-svn: 265031

99b017ae

Mar 28, 2016

[Power9] Implement new vsx instructions: insert, extract, test data class,... · 80722719

Chuang-Yu Cheng authored Mar 28, 2016

[Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat

This change implements the following vsx instructions:

- Scalar Insert/Extract
    xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp

- Vector Insert/Extract
    xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp
    xxextractuw xxinsertw

- Scalar/Vector Test Data Class
    xststdcdp xststdcsp xststdcqp
    xvtstdcdp xvtstdcsp

- Maximum/Minimum
    xsmaxcdp xsmaxjdp
    xsmincdp xsminjdp

- Vector Byte-Reverse/Permute/Splat
    xxbrd xxbrh xxbrq xxbrw
    xxperm xxpermr
    xxspltib

30 instructions

Thanks Nemanja for invaluable discussion! Thanks to Kit for the great help!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16842

llvm-svn: 264567

80722719

[Power9] Implement new vsx instructions: quad-precision move, fp-arithmetic · 56638489

Chuang-Yu Cheng authored Mar 28, 2016

This change implements the following vsx instructions:

- quad-precision move
    xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp

- quad-precision fp-arithmetic
    xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o)
    xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o)

22 instructions

Thanks Nemanja and Kit for careful review and invaluable discussion!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16110

llvm-svn: 264565

56638489

Mar 08, 2016

[Power9] Implement new vsx instructions: load, store instructions for vector and scalar · ba532dc8

Kit Barton authored Mar 08, 2016

We follow the comments mentioned in http://reviews.llvm.org/D16842#344378 to
implement this new patch.

This patch implements the following vsx instructions:

Vector load/store:
lxv lxvx lxvb16x lxvl lxvll lxvh8x lxvwsx
stxv stxvb16x stxvh8x stxvl stxvll stxvx
Scalar load/store:
lxsd lxssp lxsibzx lxsihzx
stxsd stxssp stxsibx stxsihx
21 instructions

Phabricator: http://reviews.llvm.org/D16919
llvm-svn: 262906

ba532dc8

Feb 26, 2016

[Power9] Implement new vsx instructions: compare and conversion · 93612ec5

Kit Barton authored Feb 26, 2016

This change implements the following vsx instructions:

Quad/Double-Precision Compare:
xscmpoqp xscmpuqp
xscmpexpdp xscmpexpqp
xscmpeqdp xscmpgedp xscmpgtdp xscmpnedp
xvcmpnedp(.) xvcmpnesp(.)
Quad-Precision Floating-Point Conversion
xscvqpdp(o) xscvdpqp
xscvqpsdz xscvqpswz xscvqpudz xscvqpuwz xscvsdqp xscvudqp
xscvdphp xscvhpdp xvcvhpsp xvcvsphp
xsrqpi xsrqpix xsrqpxp
28 instructions

Phabricator: http://reviews.llvm.org/D16709
llvm-svn: 262068

93612ec5

Dec 15, 2015

Bitcasts between FP and INT values using direct moves · 8922476b

Nemanja Ivanovic authored Dec 15, 2015

This patch corresponds to review:
http://reviews.llvm.org/D15286

This patch was meant to land in revision 255246, but I accidentally uploaded
the patch that corresponds to http://reviews.llvm.org/D15372 in that revision.

Therefore, this patch is the actual Bitcasts using direct moves patch, whereas
http://reviews.llvm.org/rL255246 actually corresponds to
http://reviews.llvm.org/D15372.

llvm-svn: 255649

8922476b

Dec 11, 2015

Start replacing vector_extract/vector_insert with extractelt/insertelt · fbd9bbfd

Matt Arsenault authored Dec 11, 2015

These are redundant pairs of nodes defined for
INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT.
insertelement/extractelement are slightly closer to the corresponding
C++ node names and have stricter type checking, so prefer them.

Update targets to only use these nodes where it is trivial to do so.
AArch64, ARM, and Mips all have various type errors on simple replacement,
so they will need work to fix.

Example from AArch64:

def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8),
          (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>;

Which is trying to do sext_inreg i8, i8.

llvm-svn: 255359

fbd9bbfd

Dec 10, 2015

Bitcasts between FP and INT values using direct moves · ac8d01ad

Nemanja Ivanovic authored Dec 10, 2015

This patch corresponds to review:
http://reviews.llvm.org/D15286

LLVM IR frequently contains bitcast operations between floating point and
integer values of the same width. Doing this through memory operations is
quite expensive on PPC. This patch allows the use of direct register moves
between FPRs and GPRs for lowering bitcasts.
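
For readers unfamiliar with the operation, a bitcast reinterprets the same bits as another type of the same width without converting the value. The sketch below shows the semantics for 32-bit values using Python's `struct` module. The function names are hypothetical, and going through a byte buffer here is only a way to model the reinterpretation.

```python
import struct


def bitcast_f32_to_i32(value: float) -> int:
    """Reinterpret the bits of a 32-bit float as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bitcast_i32_to_f32(bits: int) -> float:
    """Reinterpret an unsigned 32-bit integer pattern as a 32-bit float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


one_bits = bitcast_f32_to_i32(1.0)          # IEEE 754 pattern of 1.0
round_trip = bitcast_i32_to_f32(one_bits)   # back to the original value
```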

llvm-svn: 255246

ac8d01ad

Oct 09, 2015

Vector element extraction without stack operations on Power 8 · d3896573

Nemanja Ivanovic authored Oct 09, 2015

This patch corresponds to review:
http://reviews.llvm.org/D12032

This patch builds onto the patch that provided scalar to vector conversions
without stack operations (D11471).
Included in this patch:

    - Vector element extraction for all vector types with constant element number
    - Vector element extraction for v16i8 and v8i16 with variable element number
    - Removal of some unnecessary COPY_TO_REGCLASS operations that ended up
      unnecessarily moving things around between registers

Not included in this patch (will be in upcoming patch):

    - Vector element extraction for v4i32, v4f32, v2i64 and v2f64 with
      variable element number
    - Vector element insertion for variable/constant element number

Testing is provided for all extractions. The extractions that are not
implemented yet are just placeholders.

llvm-svn: 249822

d3896573

Sep 29, 2015

Addition of interfaces to the BE to conform to Table A-2 of ELF V2 ABI V1.1 · 2c84b294

Nemanja Ivanovic authored Sep 29, 2015

This patch corresponds to review:
http://reviews.llvm.org/D13191

Back end portion of the fifth round of additions to altivec.h.

llvm-svn: 248809

2c84b294

Aug 31, 2015

[PowerPC] Fixup SELECT_CC (and SETCC) patterns with i1 comparison operands · a2cdbce6

Hal Finkel authored Aug 30, 2015

There were really two problems here. The first was that we had the truth tables
for signed i1 comparisons backward. I imagine these are not very common, but if
you have:
  setcc i1 x, y, LT
this has the '0 1' and the '1 0' results flipped compared to:
  setcc i1 x, y, ULT
because, in the signed case, '1 0' is really '-1 0', and the answer is not the
same as in the unsigned case.
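
The flipped rows can be reproduced with a small truth table, treating a signed i1 with bit value 1 as -1. The function names are hypothetical and only model the comparison semantics described above.

```python
def as_signed_i1(bit: int) -> int:
    """Interpret a 1-bit value as signed: bit 1 means -1, bit 0 means 0."""
    return -bit


def setcc_lt(x: int, y: int) -> bool:
    """Signed less-than on i1 operands."""
    return as_signed_i1(x) < as_signed_i1(y)


def setcc_ult(x: int, y: int) -> bool:
    """Unsigned less-than on i1 operands."""
    return x < y


# Truth tables over all (x, y) bit pairs.
signed_table = {(x, y): setcc_lt(x, y) for x in (0, 1) for y in (0, 1)}
unsigned_table = {(x, y): setcc_ult(x, y) for x in (0, 1) for y in (0, 1)}
```

Only the `(0, 1)` and `(1, 0)` rows differ between the two tables; the rows with equal operands are false in both.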

The second problem was that we did not have patterns (at all) for the unsigned
comparisons select_cc nodes for i1 comparison operands. This was the specific
cause of PR24552. These had to be added (and a missing Altivec promotion added
as well) to make sure these function for all types. I've added a bunch more
test cases for these patterns, and there are a few FIXMEs in the test case
regarding code-quality.

Fixes PR24552.

llvm-svn: 246400

a2cdbce6

Aug 13, 2015

Scalar to vector conversions using direct moves · 1c39ca65

Nemanja Ivanovic authored Aug 13, 2015

This patch corresponds to review:
http://reviews.llvm.org/D11471

It improves the code generated for converting a scalar to a vector value. With
direct moves from GPRs to VSRs, we no longer require expensive stack operations
for this. Subsequent patches will handle the reverse case and more general
operations between vectors and their scalar elements.

llvm-svn: 244921

1c39ca65