Commits · 79acbbe51353af3a47a6483c58d3bbf0ac8f2dfa · Lorenzo Albano / LLVM bpEVL

Jun 08, 2017

[Power9] Exploit vector integer extend instructions · 79acbbe5

Zaara Syeda authored Jun 08, 2017

This patch adds build vector patterns to exploit the vector integer
extend instructions:
vextsb2w - Vector Extend Sign Byte To Word
vextsb2d - Vector Extend Sign Byte To Doubleword
vextsh2w - Vector Extend Sign Halfword To Word
vextsh2d - Vector Extend Sign Halfword To Doubleword
vextsw2d - Vector Extend Sign Word To Doubleword
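
As a rough illustration of what these instructions compute, the sketch below models two of them on plain Python integers. It assumes each vector is passed as a list of unsigned element values; the function names mirror the mnemonics, but the helper `sign_extend` and the list-based representation are illustrative assumptions, not part of the patch.

```python
def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low from_bits of value into a to_bits-wide unsigned pattern."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def vextsb2w(words):
    """Model of vextsb2w: sign-extend the low-order byte of each 32-bit word."""
    return [sign_extend(w, 8, 32) for w in words]


def vextsh2d(dwords):
    """Model of vextsh2d: sign-extend the low-order halfword of each 64-bit doubleword."""
    return [sign_extend(d, 16, 64) for d in dwords]
```

The remaining three instructions follow the same shape with different source and destination widths.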

Differential Revision: https://reviews.llvm.org/D33510

llvm-svn: 304992

79acbbe5

May 31, 2017

[PowerPC] Fix a performance bug for PPC::XXPERMDI. · 60c247de

Tony Jiang authored May 31, 2017

Some VectorShuffle nodes in the SDAG can be selected to the XXPERMDI
instruction. This patch recognizes them and performs the selection to improve
PPC performance.
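
For context, a minimal model of what XXPERMDI computes, assuming each input register is represented as a `(dword0, dword1)` tuple in big-endian element order. The tuple representation is an assumption made for illustration; it is not how the backend represents the node.

```python
def xxpermdi(xa, xb, dm):
    """Pick one doubleword from each input.

    xa, xb: (dword0, dword1) tuples in big-endian element order.
    dm: 2-bit immediate; the high bit selects from xa, the low bit from xb.
    """
    return (xa[(dm >> 1) & 1], xb[dm & 1])
```

Any two-doubleword shuffle that takes its first element from one input and its second from the other fits this shape, and with both inputs equal and `dm == 2` the instruction acts as a doubleword swap.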

Differential Revision: https://reviews.llvm.org/D33404

llvm-svn: 304298

60c247de

May 29, 2017

[PPC] Fix assertion failure during binary encoding with -mcpu=pwr9 · e3c14ebb

Hiroshi Inoue authored May 29, 2017

Summary
clang -c -mcpu=pwr9 test/CodeGen/PowerPC/build-vector-tests.ll causes an assertion failure during the binary encoding.
The failure occurs when a D-form load instruction takes two register operands instead of a register + an immediate.

This patch fixes the problem and also adds an assertion to catch this failure earlier before the binary encoding (i.e. during lit test).
The fix is from Nemanja Ivanovic @nemanjai.

Differential Revision: https://reviews.llvm.org/D33482

llvm-svn: 304133

e3c14ebb

May 25, 2017

[PowerPC] Fix a performance bug for PPC::XXSLDWI. · 0a429f04

Tony Jiang authored May 24, 2017

Some VectorShuffle nodes in the SDAG can be selected to the XXSLDWI
instruction. This patch recognizes them and performs the selection to improve
PPC performance.
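
The kind of recognition described here can be sketched as follows, assuming a four-word shuffle mask whose indices 0..7 address the concatenation of the two inputs in big-endian order. `match_xxsldwi` is a hypothetical helper written for this note; the actual matching code in the backend also has to deal with endianness and operand swapping, which this sketch ignores.

```python
def xxsldwi(xa, xb, shw):
    """Words shw..shw+3 of the 8-word concatenation xa || xb (shw in 0..3)."""
    return tuple((tuple(xa) + tuple(xb))[shw:shw + 4])


def match_xxsldwi(mask):
    """Return the shift amount if mask is a 4-word sliding window, else None.

    Hypothetical helper: a mask qualifies when it selects four consecutive
    words starting at word 0, 1, 2 or 3 of the concatenated inputs.
    """
    start = mask[0]
    if 0 <= start <= 3 and list(mask) == list(range(start, start + 4)):
        return start
    return None
```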

llvm-svn: 303822

0a429f04

May 24, 2017

P9: D-form vector load/store. Differential Revision: https://reviews.llvm.org/D33248 · 93297831

Zaara Syeda authored May 24, 2017

llvm-svn: 303780

93297831

May 12, 2017

[PPC] Change the register constraint of the first source operand of... · 22e7da95

Guozhi Wei authored May 11, 2017

[PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0

According to the Power ISA V3.0 document, the first source operand of mtvsrdd is the constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0.

This bug caused 401.bzip2 to produce wrong output when -mcpu=power9 and FDO are specified.
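
A small model of the operand rule may make the failure mode clearer. It assumes the general-purpose registers are held in a Python list indexed by register number; that representation and the function signature are illustrative assumptions.

```python
def mtvsrdd(gpr, ra, rb):
    """Model of mtvsrdd: build a 128-bit value from two 64-bit GPRs.

    gpr: list of register values indexed by register number.
    Register number 0 in the first source position means the constant 0,
    not the contents of r0.
    Returns (dword0, dword1).
    """
    high = 0 if ra == 0 else gpr[ra]
    return (high, gpr[rb])
```

If the register allocator is allowed to assign r0 to the first operand, the value held in r0 is silently replaced by zero, which is why the operand has to use a register class that excludes r0.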

Differential Revision: https://reviews.llvm.org/D32880

llvm-svn: 302834

22e7da95

May 02, 2017

[PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE · b89c27f5

Nemanja Ivanovic authored May 02, 2017

Fixes PR30730.
This is a re-commit of a pulled commit. The commit was pulled because some
software projects contained uses of Altivec vectors that violated alignment
requirements. Known issues have now been fixed.

Committing on behalf of Lei Huang.

Differential Revision: https://reviews.llvm.org/D26861

llvm-svn: 301892

b89c27f5

Mar 30, 2017

Spelling mistakes in comments. NFCI. · 68168d17

Simon Pilgrim authored Mar 30, 2017

Based on corrections mentioned in patch for clang for PR27635

llvm-svn: 299072

68168d17

Mar 15, 2017

[PowerPC][Altivec] Add mfvrd and mffprd extended mnemonic · ffcf0fb1

Nemanja Ivanovic authored Mar 15, 2017

mfvrd and mffprd are both aliases of mfvsrd.
This patch enables correct parsing of the aliases, but we still emit mfvsrd.

Committing on behalf of brunoalr (Bruno Rosa).

Differential Revision: https://reviews.llvm.org/D29177

llvm-svn: 297849

ffcf0fb1

Jan 26, 2017

[PPC] cleanup of mayLoad/mayStore flags and memory operands. · 3c8c385a

Sean Fertile authored Jan 26, 2017

1) Explicitly sets the mayLoad/mayStore property in the tablegen files on load/store
   instructions.
2) Updates the flags on a number of intrinsics to indicate that they write
   memory.
3) Adds SDNPMemOperand flags to some target-dependent SDNodes so that they
   propagate their memory operand.

Review: https://reviews.llvm.org/D28818
llvm-svn: 293200

3c8c385a

Dec 15, 2016

[Power9] Allow AnyExt immediates for XXSPLTIB · 552c8e96

Nemanja Ivanovic authored Dec 15, 2016

In some situations, the BUILD_VECTOR node that builds a v16i8 vector by
a splat of an i8 constant will end up with signed 8-bit values and in other
situations, it'll end up with unsigned ones. Handle both situations.
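
The point can be illustrated with a short sketch: a signed and an unsigned spelling of the same 8-bit pattern must produce the same splat. The function below is a model written for this note, assuming the result is represented as 16 raw bytes; the accepted range check is an assumption about what "both situations" covers.

```python
def xxspltib(imm):
    """Model of a byte splat: replicate the low 8 bits of imm into 16 bytes.

    Accepts either a signed (-128..127) or an unsigned (0..255) spelling
    of the immediate.
    """
    if not -128 <= imm <= 255:
        raise ValueError("immediate does not fit in 8 bits")
    return bytes([imm & 0xFF]) * 16
```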

Fixes PR31340.

llvm-svn: 289804

552c8e96

Dec 09, 2016

[PPC] Add intrinsics for vector extract word and vector insert word. · 1c4109b4

Sean Fertile authored Dec 09, 2016

Revision: https://reviews.llvm.org/D26547

llvm-svn: 289227

1c4109b4

Dec 06, 2016

[PowerPC] Improvements for BUILD_VECTOR Vol. 4 · 15748f49

Nemanja Ivanovic authored Dec 06, 2016

This is the final patch in the series of patches that improves
BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations
to remove redundant instructions. It also adds a large test case which
encompasses a large set of code patterns that build vectors - this test
case was the motivator for this series of patches.

Differential Revision: https://reviews.llvm.org/D26066

llvm-svn: 288800

15748f49

Nov 30, 2016

Revert https://reviews.llvm.org/rL287679 · f57f150b

Nemanja Ivanovic authored Nov 29, 2016

This commit caused some miscompiles that did not show up on any of the bots.
Reverting until we can investigate the cause of those failures.

llvm-svn: 288214

f57f150b

Nov 29, 2016

[PowerPC] Improvements for BUILD_VECTOR Vol. 1 · df1cb520

Nemanja Ivanovic authored Nov 29, 2016

This patch corresponds to review:
https://reviews.llvm.org/D25912

This is the first patch in a series of 4 that improve the lowering and combining
for BUILD_VECTOR nodes on PowerPC.

llvm-svn: 288152

df1cb520

Nov 23, 2016

[PowerPC] Remove InstAlias definitions that cause incorrect assembly · 10fc3cfc

Nemanja Ivanovic authored Nov 23, 2016

In rL283190, I added some InstAlias definitions to generate extended mnemonics
for some uses of the XXPERMDI instruction. However, when the assembler matches
these extended mnemonics, it matches the new instruction in situations where it
should match the old one.
This patch removes these definitions and instead provides the
mnemonics through additional instructions that are isCodeGenOnly.

Fixes PR31127.

llvm-svn: 287765

10fc3cfc

Nov 22, 2016

[PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE · b8e30d6d

Nemanja Ivanovic authored Nov 22, 2016

This patch corresponds to review:
https://reviews.llvm.org/D26861

It also fixes PR30730.

Committing on behalf of Lei Huang.

llvm-svn: 287679

b8e30d6d

Nov 15, 2016

vector load store with length (left justified) llvm portion · a19c9e60

Zaara Syeda authored Nov 15, 2016

llvm-svn: 286993

a19c9e60

[PowerPC] Implement BE VSX load/store builtins - llvm portion. · 5f850cd1

Tony Jiang authored Nov 15, 2016

This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
they behave exactly the same as vec_xl and vec_xst, so they are
simply implemented by defining a matching macro. On LE, they are implemented
by defining new builtins and intrinsics. For int/float/long long/double, it
is just a load (lxvw4x/lxvd2x) or store (stxvw4x/stxvd2x). For char/char/short,
we also need some extra shuffling before or after calling the builtins to get the
desired BE order. For int128, simply call vec_xl or vec_xst.

llvm-svn: 286967

5f850cd1

Nov 14, 2016

[PPC] Add intrinsic mapping to the xscvhpsp instruction · a435e07d

Sean Fertile authored Nov 14, 2016

Add an intrinsic to expose the 'VSX Scalar Convert Half-Precision to
Single-Precision' instruction.

Differential review: https://reviews.llvm.org/D26536

llvm-svn: 286862

a435e07d

[PPC] add intrinsics for vec extract exp/significand and vec test data class. · adda5b2d

Sean Fertile authored Nov 14, 2016

Differential Revision: https://reviews.llvm.org/D26272

llvm-svn: 286829

adda5b2d

Nov 11, 2016

[PowerPC] Add remaining vector permute builtins in altivec.h - LLVM portion · ec4b0c36

Nemanja Ivanovic authored Nov 11, 2016

This patch corresponds to review:
https://reviews.llvm.org/D26480

Adds all the intrinsics used for various permute builtins that will
be added to altivec.h.

llvm-svn: 286638

ec4b0c36

[PowerPC] Add vector conversion builtins to altivec.h - LLVM portion · 2efc3cb9

Nemanja Ivanovic authored Nov 11, 2016

This patch corresponds to review:
https://reviews.llvm.org/D26307

Adds all the intrinsics used for various conversion builtins that will
be added to altivec.h. These are type conversions between various types of
vectors.

llvm-svn: 286596

2efc3cb9

Oct 26, 2016

[PowerPC] Implement vec_insert_exp builtins - llvm portion · 0f45998b

Nemanja Ivanovic authored Oct 26, 2016

This revision corresponds to review: https://reviews.llvm.org/D25957.
Committing on behalf of Zaara Syeda.

llvm-svn: 285225

0f45998b

Oct 24, 2016

[PPC] Generate positive FP zero using xor insn instead of loading from constant area · c90b02cf

Ehsan Amiri authored Oct 24, 2016

https://reviews.llvm.org/D23614

Currently we load +0.0 from the constant area. It can instead be generated
using an XOR instruction.
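
The reason this works is that +0.0 is the all-zero bit pattern, and XOR-ing any register with itself yields all zeros. A quick sketch on IEEE 754 doubles, using Python's `struct` module to get at the bits (the function names are illustrative):

```python
import math
import struct


def xor_with_self(x):
    """Reinterpret x as 64 raw bits, XOR the bits with themselves, reinterpret as double."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    zero_bits = bits ^ bits  # always 0, whatever x was
    return struct.unpack("<d", struct.pack("<Q", zero_bits))[0]


def is_positive_zero(value):
    """True only for +0.0 (copysign distinguishes it from -0.0)."""
    return value == 0.0 and math.copysign(1.0, value) == 1.0
```

The starting value does not matter, so no load and no constant pool entry are needed.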

llvm-svn: 284995

c90b02cf

Oct 04, 2016

[Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register set · 6354d235

Nemanja Ivanovic authored Oct 04, 2016

This patch corresponds to review:

The newly added VSX D-Form (register + offset) memory ops target the upper half
of the VSX register set. The existing ones target the lower half. In order to
unify these and have the ability to target all the VSX registers using D-Form
operations, this patch defines Pseudo-ops for the loads/stores which are
expanded post-RA. The expansion then chooses the correct opcode based on the
register that was allocated for the operation.
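
The post-RA choice can be pictured with the sketch below for a scalar double load. It assumes VSX registers are numbered 0..63, with the lower half overlapping the floating-point registers and the upper half overlapping the vector registers. The opcode names `lfd` and `lxsd` are given as plausible examples for this case and are an assumption of this note, as is the function itself; the patch's real expansion code is not shown in the commit message.

```python
def expand_dform_scalar_load(vsr):
    """Hypothetical expansion of a D-Form scalar load pseudo-op.

    vsr: VSX register number (0..63) chosen by the register allocator.
    Returns (opcode, register number as encoded by that opcode).
    """
    if not 0 <= vsr <= 63:
        raise ValueError("not a VSX register number")
    if vsr < 32:
        return ("lfd", vsr)        # lower half: existing FP D-Form load
    return ("lxsd", vsr - 32)      # upper half: newer VSX D-Form load
```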

llvm-svn: 283212

6354d235

[Power9] Part-word VSX integer scalar loads/stores and sign extend instructions · 11049f8f

Nemanja Ivanovic authored Oct 04, 2016

This patch corresponds to review:
https://reviews.llvm.org/D23155

This patch removes the VSHRC register class (based on D20310) and adds
exploitation of the Power9 sub-word integer loads into VSX registers as well
as vector sign extensions.
The new instructions are useful for a few purposes:

- Int to FP conversions of 1 or 2-byte values loaded from memory
- Building vectors of 1 or 2-byte integers with values loaded from memory
- Storing individual 1 or 2-byte elements from integer vectors

This patch implements all of those uses.

llvm-svn: 283190

11049f8f

Sep 27, 2016

[Power9] Builtins for ELF v.2 API conformance - back end portion · 6f22b413

Nemanja Ivanovic authored Sep 27, 2016

This patch corresponds to review:
https://reviews.llvm.org/D24396

This patch adds support for the "vector count trailing zeroes",
"vector compare not equal" and "vector compare not equal or zero" instructions
as well as the "scalar count trailing zeroes" instructions. It also changes the
vector negation to use XXLNOR (when VSX is enabled) so as not to increase
register pressure (previously this was done with a splat immediate of all
ones followed by an XXLXOR). This was done because the altivec.h
builtins (patch to follow) use vector negation and the use of an additional
register for the splat immediate is not optimal.

llvm-svn: 282478

6f22b413

Sep 23, 2016

[Power9] Exploit move and splat instructions for build_vector improvement · d2c3c51a

Nemanja Ivanovic authored Sep 23, 2016

This patch corresponds to review:
https://reviews.llvm.org/D21135

This patch exploits the following instructions to improve some build_vector
and extractelement patterns:
mtvsrws
lxvwsx
mtvsrdd
mfvsrld

llvm-svn: 282246

d2c3c51a

Sep 22, 2016

[PowerPC] Remove LE patterns matching generic stores/loads to VSX permuting ops · e78ffede

Nemanja Ivanovic authored Sep 22, 2016

This patch corresponds to:
https://reviews.llvm.org/D21409

The LXVD2X, LXVW4X, STXVD2X and STXVW4X instructions permute the two doublewords
in the vector register when in little-endian mode. Custom code ensures that the
necessary swaps are inserted for these. This patch simply removes the possibility
that a load/store node will match one of these instructions in the SDAG as that
would not insert the necessary swaps.
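
Why the swap is needed can be shown with a toy model. It assumes a register is written as a `(left, right)` pair of doublewords and memory as a `(d0, d1)` pair in increasing address order; all three functions are illustrative and model only the doubleword placement, not byte order within a doubleword.

```python
def lxvd2x_le(mem_dwords):
    """Toy model of LXVD2X in little-endian mode.

    The doubleword at the lower address lands in the left half of the register.
    """
    d0, d1 = mem_dwords
    return (d0, d1)


def expected_le_register(mem_dwords):
    """Layout a little-endian program expects: element 0 in the right half."""
    d0, d1 = mem_dwords
    return (d1, d0)


def swap_dwords(reg):
    """The doubleword swap that the custom lowering inserts after the load."""
    left, right = reg
    return (right, left)
```

A pattern that selected the load directly would produce the first layout without the correcting swap.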

llvm-svn: 282144

e78ffede

[Power9] Add exploitation of non-permuting memory ops · 6e7879c5

Nemanja Ivanovic authored Sep 22, 2016

This patch corresponds to review:
https://reviews.llvm.org/D19825

The new lxvx/stxvx instructions do not require the swaps to line the elements
up correctly. In order to select them over the lxvd2x/lxvw4x instructions which
require swaps, the patterns for the old instructions have a predicate that
ensures they won't be selected on Power9 and newer CPUs.

llvm-svn: 282143

6e7879c5

Aug 18, 2016

[SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround · 2bc3d4d4

Michael Kuperstein authored Aug 18, 2016

The names of the tablegen defs now match the names of the ISD nodes.
This makes the world a slightly saner place, as previously "fround" matched
ISD::FP_ROUND and not ISD::FROUND.

Differential Revision: https://reviews.llvm.org/D23597

llvm-svn: 279129

2bc3d4d4

Jul 18, 2016

[PowerPC] Remove redundant direct moves when extracting integers and converting to FP · d3c284f6

Nemanja Ivanovic authored Jul 18, 2016

This patch corresponds to review:
https://reviews.llvm.org/D21354

We use direct moves for extracting integer elements from vectors. We also use
direct moves when converting integers to FP. When these operations are chained,
we get a direct move out of a VSR followed by a direct move back into a VSR.
These are redundant - all we need to do is line up the element and convert.

llvm-svn: 275796

d3c284f6

Jul 12, 2016

[Power9] Add codegen for VSX word insert/extract instructions · b43bb614

Nemanja Ivanovic authored Jul 12, 2016

This patch corresponds to review:
http://reviews.llvm.org/D20239

It adds exploitation of XXINSERTW and XXEXTRACTUW instructions that
are useful in some cases for inserting and extracting vector elements of
v4[if]32 vectors.

llvm-svn: 275215

b43bb614

[PowerPC] Canonicalize applicable vector shift immediates as swaps · eebbcb6d

Nemanja Ivanovic authored Jul 12, 2016

This patch corresponds to review:
http://reviews.llvm.org/D21358

Vector shifts that have the same semantics as a vector swap are canonicalized
as such to provide additional opportunities for swap removal optimization to
remove unnecessary swaps.
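
One example of such an equivalence, sketched on 16-byte values: a shift-left-double by 8 bytes whose two inputs are the same register is a doubleword swap. The `vsldoi`-style model below is an assumption about which shift form is meant; the commit message does not name the instruction.

```python
def vsldoi(va, vb, sh):
    """16 bytes starting at byte sh of the 32-byte concatenation va || vb."""
    return (va + vb)[sh:sh + 16]


def swap_dwords(v):
    """Exchange the two 8-byte halves of a 16-byte vector."""
    return v[8:] + v[:8]


def is_swap(same_input, sh):
    """Illustrative check: the shift behaves as a swap only for this one case."""
    return same_input and sh == 8
```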

llvm-svn: 275168

eebbcb6d

Jul 05, 2016

[PowerPC] - Legalize vector types by widening instead of integer promotion · 44513e54

Nemanja Ivanovic authored Jul 05, 2016

This patch corresponds to review:
http://reviews.llvm.org/D20443

It changes the legalization strategy for illegal vector types from integer
promotion to widening. This only applies for vectors with elements of width
that is a multiple of a byte since we have hardware support for vectors with
1, 2, 4, 8 and 16 byte elements.
Integer promotion for vectors is quite expensive on PPC due to the sequence
of breaking apart the vector, extending the elements and reconstituting the
vector. Two of these operations are expensive.
This patch causes between minor and major improvements in performance on most
benchmarks. There are very few benchmarks whose performance regresses. These
regressions can be handled in a subsequent patch with a DAG combine (similar
to how this patch handles int -> fp conversions of illegal vector types).
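
The difference between the two strategies can be sketched with a deliberately simplified model that assumes 128-bit vector registers and describes a vector type as an `(element count, element bits)` pair. This is not LLVM's legalizer; the function names and the fill-the-register rule are assumptions chosen to show the shape of each strategy.

```python
VECTOR_BITS = 128  # assumed register width


def promote_elements(num_elems, elem_bits):
    """Integer promotion: keep the element count, grow each element."""
    if num_elems * elem_bits >= VECTOR_BITS:
        raise ValueError("type is not narrower than a vector register")
    return (num_elems, VECTOR_BITS // num_elems)


def widen_vector(num_elems, elem_bits):
    """Widening: keep the element width, pad with extra (unused) elements."""
    if num_elems * elem_bits >= VECTOR_BITS:
        raise ValueError("type is not narrower than a vector register")
    return (VECTOR_BITS // elem_bits, elem_bits)
```

Under this model a v4i8 value becomes v4i32 when promoted but v16i8 when widened, so widening keeps the original elements intact and avoids the extend step.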

llvm-svn: 274535

44513e54

May 04, 2016

[PowerPC] Generate VSX version of splat word · 1a2b2f03

Nemanja Ivanovic authored May 04, 2016

This patch corresponds to review:
http://reviews.llvm.org/D18592

It allows the PPC back end to generate the xxspltw instruction where we
previously only emitted vspltw.

llvm-svn: 268516

1a2b2f03

Mar 31, 2016

[PPC] basic support for Power 9 direct move instructions · 99b017ae

Ehsan Amiri authored Mar 31, 2016

http://reviews.llvm.org/D18097

Initial support does not include any patterns to generate these instructions.

llvm-svn: 265031

99b017ae

Mar 28, 2016

[Power9] Implement new vsx instructions: insert, extract, test data class,... · 80722719

Chuang-Yu Cheng authored Mar 28, 2016

[Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat

This change implements the following vsx instructions:

- Scalar Insert/Extract
    xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp

- Vector Insert/Extract
    xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp
    xxextractuw xxinsertw

- Scalar/Vector Test Data Class
    xststdcdp xststdcsp xststdcqp
    xvtstdcdp xvtstdcsp

- Maximum/Minimum
    xsmaxcdp xsmaxjdp
    xsmincdp xsminjdp

- Vector Byte-Reverse/Permute/Splat
    xxbrd xxbrh xxbrq xxbrw
    xxperm xxpermr
    xxspltib

30 instructions

Thanks to Nemanja for invaluable discussion! Thanks to Kit for great help!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16842

llvm-svn: 264567

80722719

[Power9] Implement new vsx instructions: quad-precision move, fp-arithmetic · 56638489

Chuang-Yu Cheng authored Mar 28, 2016

This change implements the following vsx instructions:

- quad-precision move
    xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp

- quad-precision fp-arithmetic
    xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o)
    xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o)

22 instructions

Thanks Nemanja and Kit for careful review and invaluable discussion!
Reviewers: hal, nemanja, kbarton, tjablin, amehsan

http://reviews.llvm.org/D16110

llvm-svn: 264565

56638489