Commits · b9d01aa29e5d0aa433c2fc62ace709fe69c45ceb · Lorenzo Albano / LLVM bpEVL

Jul 11, 2018

[Power9] Add remaining __float128 builtin support for FMA round to odd · b9d01aa2

Stefan Pintilie authored Jul 11, 2018

Implement this as it is done on GCC:

__float128 a, b, c, d;
a = __builtin_fmaf128_round_to_odd (b, c, d);         // generates xsmaddqpo
a = __builtin_fmaf128_round_to_odd (b, c, -d);        // generates xsmsubqpo
a = - __builtin_fmaf128_round_to_odd (b, c, d);       // generates xsnmaddqpo
a = - __builtin_fmaf128_round_to_odd (b, c, -d);      // generates xsnmsubqpo
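The four cases differ only in where the negations sit. A rough sketch of the sign patterns in portable C, using double and plain arithmetic as a stand-in (so without fusion, round to odd, or __float128); the function names are hypothetical labels, not real builtins:

```c
/* Sign patterns of the four FMA forms, modelled with plain double
 * arithmetic. These are hypothetical stand-ins: no fusion, no round to
 * odd, and no __float128 is involved. */

/* b * c + d : the pattern listed for xsmaddqpo */
double madd_pattern(double b, double c, double d)  { return b * c + d; }

/* b * c + (-d) : the pattern listed for xsmsubqpo */
double msub_pattern(double b, double c, double d)  { return b * c + (-d); }

/* -(b * c + d) : the pattern listed for xsnmaddqpo */
double nmadd_pattern(double b, double c, double d) { return -(b * c + d); }

/* -(b * c + (-d)) : the negated multiply-subtract pattern */
double nmsub_pattern(double b, double c, double d) { return -(b * c + (-d)); }
```

With b = 2, c = 3, d = 4 the four functions give 10, 2, -10 and -2 respectively.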

Differential Revision: https://reviews.llvm.org/D48218

llvm-svn: 336754

b9d01aa2

Jul 09, 2018

[Power9] Add __float128 builtins for Rounding Operations · 133acb22

Stefan Pintilie authored Jul 09, 2018

Added __float128 support for a number of rounding operations:

trunc
rint
nearbyint
round
floor
ceil

Differential Revision: https://reviews.llvm.org/D48415

llvm-svn: 336601

133acb22

[Power9] [LLVM] Add __float128 support for trunc to double round to odd · 58e3e0a8

Stefan Pintilie authored Jul 09, 2018

Add support for this builtin:
double __builtin_truncf128_round_to_odd(__float128)

Differential Revision: https://reviews.llvm.org/D48483

llvm-svn: 336595

58e3e0a8

[Power9] Add __float128 builtins for Round To Odd · 83a5fe14

Stefan Pintilie authored Jul 09, 2018

GCC has builtins for these round to odd instructions:

__float128 __builtin_sqrtf128_round_to_odd (__float128)
__float128 __builtin_{add,sub,mul,div}f128_round_to_odd (__float128, __float128)
__float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128)
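For context, round to odd truncates the exact result and then forces the lowest kept bit to 1 whenever any discarded bit was nonzero, which helps a later rounding to a narrower format avoid double-rounding errors. A minimal sketch of that rule on integers in portable C; the function name is hypothetical, and this illustrates only the rounding rule, not the Power9 instructions or the builtins themselves:

```c
#include <assert.h>
#include <stdint.h>

/* Drop the low `drop` bits of `x` using round to odd: truncate, then set
 * the lowest kept bit if any discarded bit was nonzero (inexact result).
 * Hypothetical helper that models the rounding rule only. */
uint64_t round_to_odd_shift(uint64_t x, unsigned drop)
{
    assert(drop > 0 && drop < 64);
    uint64_t kept = x >> drop;
    uint64_t lost = x & ((UINT64_C(1) << drop) - 1);
    return lost ? (kept | 1) : kept;
}
```

An exact input such as 0x20 with 4 bits dropped stays 2, while the inexact 0x21 becomes 3 because the discarded bits are recorded in the low bit.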

Differential Revision: https://reviews.llvm.org/D47550

llvm-svn: 336578

83a5fe14

Jul 05, 2018

[Power9] Optimize codegen for conversions of int to float128 · 66e22c21

Lei Huang authored Jul 05, 2018

Optimize code sequences for integer conversion to fp128 when the integer is a result of:
  * float->int
  * float->long
  * double->int
  * double->long

Differential Revision: https://reviews.llvm.org/D48429

llvm-svn: 336316

66e22c21

[Power9] Ensure float128 in non-homogeneous aggregates are passed via VSX reg · a855e17f

Lei Huang authored Jul 05, 2018

Non-homogeneous aggregates are passed in consecutive GPRs, in GPRs and in memory,
or in memory. This patch ensures that float128 members of non-homogeneous
aggregates are passed via VSX registers.

This is done via custom lowering a bitcast of a build_pair(i64,i64) to float128
to a new PPCISD node, BUILD_FP128.

Differential Revision: https://reviews.llvm.org/D48308

llvm-svn: 336310

a855e17f

[Power9]Legalize and emit code for quad-precision convert from single-precision · d17c39cc

Lei Huang authored Jul 05, 2018

Legalize and emit code for converting a single-precision value to
quad-precision.

Differential Revision: https://reviews.llvm.org/D47569

llvm-svn: 336307

d17c39cc

Jul 04, 2018

[Power9]Legalize and emit code for round & convert quad-precision values · 6270ab6c

Lei Huang authored Jul 04, 2018

Legalize and emit code for round & convert float128 to double precision and
single precision.

Differential Revision: https://reviews.llvm.org/D46997

llvm-svn: 336299

6270ab6c

May 28, 2018

[Power9]Legalize and emit code for HW/Byte vector extract and convert to QP · 651be449

Lei Huang authored May 28, 2018

Implement patterns to extract HWord and Byte vector elements and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46774

llvm-svn: 333377

651be449

May 24, 2018

[PowerPC] Remove the match pattern in the definition of LXSDX/STXSDX · f4ec6782

Lei Huang authored May 24, 2018

The match pattern in the definition of LXSDX is xoaddr, so the Pseudo
instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX post
RA based on the register pressure. To avoid ambiguity, we need to remove the
select pattern for LXSDX, same as what was done for LXSD. STXSDX also has
the same issue.

Patch by Qing Shan Zhang (steven.zhang).

Differential Revision: https://reviews.llvm.org/D47178

llvm-svn: 333150

f4ec6782

May 23, 2018

[Power9]Legalize and emit code for W vector extract and convert to QP · 8b0da65b

Lei Huang authored May 23, 2018

Implement patterns to extract [Un]signed Word vector element and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46536

llvm-svn: 333115

8b0da65b

[Power9]Legalize and emit code for DW vector extract and convert to QP · 8990168a

Lei Huang authored May 23, 2018

Implement patterns to extract [Un]signed DWord vector element and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46333

llvm-svn: 333112

8990168a

May 14, 2018

[NFC] [Power] Fix instruction format for xsrqpi · 421a5960

Zaara Syeda authored May 14, 2018

xsrqpi is currently using Z23Form_1.
The instruction format is xsrqpi R,VRT,VRB,RMC.
Rather than bits 11-15 being used for FRA, it should have
bits 11-14 reserved and bit 15 for R. This patch adds a new
class Z23Form_4 to fix the instruction format.
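The bit positions above use the Power ISA convention in which bit 0 is the most significant bit of the 32-bit instruction word. A small sketch in C of placing the R bit under that numbering; the helper names are hypothetical, only the fields named in this message are modelled, and this is not the TableGen class itself:

```c
#include <assert.h>
#include <stdint.h>

/* Place `value` into the field spanning IBM-numbered bits [first, last]
 * of a 32-bit instruction word, where bit 0 is the most significant bit. */
uint32_t put_field(uint32_t word, unsigned first, unsigned last, uint32_t value)
{
    assert(first <= last && last < 32);
    unsigned width = last - first + 1;
    uint32_t mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    assert((value & ~mask) == 0);
    unsigned shift = 31 - last;
    return (word & ~(mask << shift)) | (value << shift);
}

/* Bits 11-15 as described in the message: 11-14 reserved (kept zero),
 * bit 15 holds R. All other fields of the word are left untouched. */
uint32_t set_xsrqpi_r(uint32_t word, unsigned r)
{
    word = put_field(word, 11, 14, 0);
    return put_field(word, 15, 15, (uint32_t)(r & 1));
}
```

Under this numbering, bit 15 corresponds to the value 0x00010000 in the usual least-significant-bit-first view.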

Differential Revision: https://reviews.llvm.org/D46761

llvm-svn: 332253

421a5960

May 08, 2018

[Power9]Legalize and emit code for truncate and convert QP to HW and Byte · e41e3d32

Lei Huang authored May 08, 2018

Legalize and emit code for truncate and convert float128 to (un)signed short
and (un)signed char.

Differential Revision: https://reviews.llvm.org/D46194

llvm-svn: 331797

e41e3d32

[Power9]Legalize and emit code for truncate and convert Quad-Precision to Word · 6364288d

Lei Huang authored May 08, 2018

Legalize and emit code for:

  * xscvqpswz : VSX Scalar truncate & Convert Quad-Precision to Signed Word
  * xscvqpuwz : VSX Scalar truncate & Convert Quad-Precision to Unsigned Word

Differential Revision: https://reviews.llvm.org/D45635

llvm-svn: 331790

6364288d

[Power9]Legalize and emit code for truncate and convert QP to DW · c517e95b

Lei Huang authored May 08, 2018

Legalize and emit code for:

  * xscvqpsdz : VSX Scalar truncate & Convert Quad-Precision to Signed Dword
  * xscvqpudz : VSX Scalar truncate & Convert Quad-Precision to Unsigned Dword

Differential Revision: https://reviews.llvm.org/D45553

llvm-svn: 331787

c517e95b

[PowerPC] Unify handling for conversion of FP_TO_INT feeding a store · c29229a6

Lei Huang authored May 08, 2018

Existing DAG combine only handles conversions for FP_TO_SINT:
"{f32, f64} x { i32, i16 }"

This patch simplifies the code to handle:
"{ FP_TO_SINT, FP_TO_UINT } x { f64, f32 } x { i64, i32, i16, i8 }"

Differential Revision: https://reviews.llvm.org/D46102

llvm-svn: 331778

c29229a6

Apr 18, 2018

[Power9]Legalize and emit code for converting Unsigned HWord/Char to Quad-Precision · 192c6ccf

Lei Huang authored Apr 18, 2018

Legalize and emit code for converting unsigned HWord/Char to QP:

xscvsdqp
xscvudqp

Only covering patterns for unsigned forms because we don't have part-word
sign-extending integer loads into VSX registers.

Differential Revision: https://reviews.llvm.org/D45494

llvm-svn: 330278

192c6ccf

[Power9]Legalize and emit code for converting (Un)Signed Word to Quad-Precision · 198e6785

Lei Huang authored Apr 18, 2018

Legalize and emit code for converting (Un)Signed Word to quad-precision via:

xscvsdqp
xscvudqp

Differential Revision: https://reviews.llvm.org/D45389

llvm-svn: 330273

198e6785

Apr 12, 2018

[Power9]Legalize and emit code for converting (Un)Signed DWord to Quad-Precision · 10367eb4

Lei Huang authored Apr 12, 2018

Legalize and emit code for:

  * xscvsdqp
  * xscvudqp

Differential Revision: https://reviews.llvm.org/D45230

llvm-svn: 329931

10367eb4

Apr 04, 2018

[Power9]Legalize and emit code for quad-precision fma instructions · 09fda63a

Lei Huang authored Apr 04, 2018

Legalize and emit code for the following quad-precision fma:

  * xsmaddqp
  * xsnmaddqp
  * xsmsubqp
  * xsnmsubqp

Differential Revision: https://reviews.llvm.org/D44843

llvm-svn: 329206

09fda63a

Mar 26, 2018

[Power9]Legalize and emit code for quad-precision convert from double-precision · be0afb08

Lei Huang authored Mar 26, 2018

Legalize and emit code for quad-precision floating point operation xscvdpqp
and add option to guard the quad precision operation support.

Differential Revision: https://reviews.llvm.org/D44746

llvm-svn: 328558

be0afb08

[PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place. · 26d4f923

Stefan Pintilie authored Mar 26, 2018

A new function getOpcodeForSpill should now be the only place to get
the opcode for a given spilled register.

Differential Revision: https://reviews.llvm.org/D43086

llvm-svn: 328556

26d4f923

Mar 19, 2018

[Power9]Legalize and emit code for quad-precision copySign/abs/nabs/neg/sqrt · ecfede94

Lei Huang authored Mar 19, 2018

Legalize and emit code for quad-precision floating point operations:

  * xscpsgnqp
  * xsabsqp
  * xsnabsqp
  * xsnegqp
  * xssqrtqp

Differential Revision: https://reviews.llvm.org/D44530

llvm-svn: 327889

ecfede94

[PowerPC][Power9]Legalize and emit code for quad-precision add/div/mul/sub · 6d1596a9

Lei Huang authored Mar 19, 2018

Legalize and emit code for quad-precision floating point operations:

  * xsaddqp
  * xssubqp
  * xsdivqp
  * xsmulqp

Differential Revision: https://reviews.llvm.org/D44506

llvm-svn: 327878

6d1596a9

Mar 12, 2018

[PowerPC][NFC] Explicitly state types on FP SDAG patterns in anticipation of adding the f128 type · cd4f3857

Lei Huang authored Mar 12, 2018

llvm-svn: 327319

cd4f3857

Feb 23, 2018

[PowerPC] Code cleanup. Remove instructions that were withdrawn from Power 9. · 15e6b10e

Stefan Pintilie authored Feb 23, 2018

The following set of instructions was originally planned to be added for Power 9
and so code was added to support them. However, a decision was made later on to
withdraw support for these instructions in the hardware.
xscmpnedp
xvcmpnesp
xvcmpnedp
This patch removes support for the instructions that were not added.

Differential Revision: https://reviews.llvm.org/D43641

llvm-svn: 325918

15e6b10e

Nov 27, 2017

[Power9] Improvements to vector extract with variable index exploitation · 48cb3c15

Zaara Syeda authored Nov 27, 2017

This patch extends rL307174 so that the Power9 vector extract with variable
index instructions are not used when extracting word element 1. For such cases,
the existing selection of MFVSRWZ provides a better sequence.

Differential Revision: https://reviews.llvm.org/D38287

llvm-svn: 319049

48cb3c15

Nov 20, 2017

[PPC] Heuristic to choose between a X-Form VSX ld/st vs a X-Form FP ld/st. · 438bf4a6

Tony Jiang authored Nov 20, 2017

The VSX versions have the advantage of a full 64-register target whereas the FP
ones have the advantage of lower latency and higher throughput. So what we’re
after is using the faster instructions in low register pressure situations and
using the larger register file in high register pressure situations.

The heuristic chooses between the following 7 pairs of instructions.
PPC::LXSSPX vs PPC::LFSX
PPC::LXSDX vs PPC::LFDX
PPC::STXSSPX vs PPC::STFSX
PPC::STXSDX vs PPC::STFDX
PPC::LXSIWAX vs PPC::LFIWAX
PPC::LXSIWZX vs PPC::LFIWZX
PPC::STXSIWX vs PPC::STFIWX
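One way to picture the outcome, as a sketch only: assume the choice is made from the register that ends up assigned, using the fact that VSX registers 0-31 overlap the FPRs while 32-63 do not. The function and table names below are hypothetical, and the actual pass logic may differ:

```c
#include <assert.h>
#include <string.h>

/* The seven instruction pairs listed above: VSX form and FP form. */
struct form_pair { const char *vsx; const char *fp; };

static const struct form_pair PAIRS[] = {
    {"LXSSPX",  "LFSX"},   {"LXSDX",   "LFDX"},
    {"STXSSPX", "STFSX"},  {"STXSDX",  "STFDX"},
    {"LXSIWAX", "LFIWAX"}, {"LXSIWZX", "LFIWZX"},
    {"STXSIWX", "STFIWX"},
};

/* Hypothetical selection rule: if the assigned VSX register (0-63) also
 * exists as an FPR (0-31), prefer the faster FP form; otherwise the VSX
 * form is the only option. Returns NULL for an unknown opcode. */
const char *pick_form(const char *vsx_opcode, unsigned vsr)
{
    assert(vsr < 64);
    for (size_t i = 0; i < sizeof PAIRS / sizeof PAIRS[0]; ++i)
        if (strcmp(PAIRS[i].vsx, vsx_opcode) == 0)
            return vsr < 32 ? PAIRS[i].fp : PAIRS[i].vsx;
    return NULL;
}
```

For example, `pick_form("LXSDX", 5)` yields "LFDX", while `pick_form("LXSDX", 40)` stays with "LXSDX".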

Differential Revision: https://reviews.llvm.org/D38486

llvm-svn: 318651

438bf4a6

Nov 07, 2017

Use new vector insert half-word and byte instructions when we see... · 5cd044e8

Graham Yiu authored Nov 07, 2017

Use new vector insert half-word and byte instructions when we see insertelement on '8 x i16' and '16 x i8' types. Also extended existing lit testcase to cover these cases.

Differential Revision: https://reviews.llvm.org/D34630

llvm-svn: 317613

5cd044e8

Sep 21, 2017

[Power9] Spill gprs to vector registers rather than stack · fcd9697d

Zaara Syeda authored Sep 21, 2017

This patch updates register allocation to enable spilling GPRs to
volatile vector registers rather than the stack. It can be enabled
for Power9 with the option -ppc-enable-gpr-to-vsr-spills.

Differential Revision: https://reviews.llvm.org/D34815

llvm-svn: 313886

fcd9697d

Sep 05, 2017

[PPC][NFC] Renaming things with 'xxinsert' moniker to 'vecinsert' to make it more general. · 61ef1c54

Tony Jiang authored Sep 05, 2017

Commit on behalf of Graham Yiu (gyiu@ca.ibm.com)

llvm-svn: 312547

61ef1c54

Aug 14, 2017

[PowerPC] Add codegen for VSX word extract convert to FP · 451ef4ad

Lei Huang authored Aug 14, 2017

Add codegen for VSX word extract conversion from signed/unsigned to single/double
precision.

For UINT_TO_FP:
Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239.
Here we will add the missing extract integer and conversion to double. This
utilizes the new P9 instruction xxextractuw to extract an integer element
when the result will be converted to double, thereby saving 2 direct moves
(VSR <-> GPR).

For SINT_TO_FP:
We will implement the following sequence which will also reduce the number of
instructions by saving 2 direct moves.

v4i32->f32:
        xxspltw
        xvcvsxwsp
        xscvspdpn

v4i32->f64:
        xxspltw
        xvcvsxwdp

Differential Revision: https://reviews.llvm.org/D35859

llvm-svn: 310866

451ef4ad

Jul 13, 2017

[PowerPC] Ensure displacements for DQ-Form instructions are multiples of 16 · 3c7e276d

Nemanja Ivanovic authored Jul 13, 2017

As outlined in the PR, we didn't ensure that displacements for DQ-Form
instructions are multiples of 16. Since the instruction encoding encodes
a quad-word displacement, a sub-16 byte displacement is meaningless and
ends up being encoded incorrectly.
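The constraint can be sketched as a small check in C. This assumes the displacement field is 12 bits wide and holds the byte displacement divided by 16 (so the byte displacement spans a signed 16-bit range); the function names are hypothetical and this is not the code from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if `disp` (in bytes) is usable as a DQ-Form displacement:
 * a multiple of 16 that fits the assumed signed 16-bit byte range. */
bool dq_disp_is_encodable(int32_t disp)
{
    return (disp % 16 == 0) && disp >= -32768 && disp <= 32752;
}

/* The assumed 12-bit field value: the displacement in quad-words,
 * stored in two's complement. */
uint32_t dq_field(int32_t disp)
{
    assert(dq_disp_is_encodable(disp));
    return (uint32_t)(disp / 16) & 0xFFFu;
}
```

A displacement of 32 encodes as 2, whereas 8 is rejected because the low four bits have nowhere to go in the encoding.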

Fixes https://bugs.llvm.org/show_bug.cgi?id=33671.

Differential Revision: https://reviews.llvm.org/D35007

llvm-svn: 307934

3c7e276d

Jul 05, 2017

[Power9] Exploit vector extract with variable index. · aa5a6a1c

Tony Jiang authored Jul 05, 2017

This patch adds the exploitation for new power 9 instructions which extract
variable elements from vectors:
VEXTUBLX
VEXTUBRX
VEXTUHLX
VEXTUHRX
VEXTUWLX
VEXTUWRX

Differential Revision: https://reviews.llvm.org/D34032
Commit on behalf of Zaara Syeda (syzaara@ca.ibm.com)

llvm-svn: 307174

aa5a6a1c

[Power9] Exploit vector integer extend instructions when indices aren't correct. · 9a91a181

Tony Jiang authored Jul 05, 2017

This patch adds on to the exploitation added by https://reviews.llvm.org/D33510.
This now catches build vector nodes where the inputs are coming from sign
extended vector extract elements where the indices used by the vector extract
are not correct. We can still use the new hardware instructions by adding a
shuffle to move the elements to the correct indices. I introduced a new PPCISD
node here because adding a vector_shuffle and changing the elements of the
vector_extracts was getting undone by another DAG combine.

Commit on behalf of Zaara Syeda (syzaara@ca.ibm.com)
Differential Revision: https://reviews.llvm.org/D34009

llvm-svn: 307169

9a91a181

Jun 12, 2017

[PowerPC] Match vec_revb builtins to P9 instructions. · 1a8eec14

Tony Jiang authored Jun 12, 2017

Power9 has instructions that will reverse the bytes within an element for all
sizes (half-word, word, double-word and quad-word). These can be used for the
vec_revb builtins in altivec.h. However, we implement these to match vector
shuffle nodes as that will cover both the builtins and vector shuffles that
occur in the SDAG through other means.
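As an illustration of what such a shuffle looks like, the sketch below builds and checks the byte permutation that reverses bytes within each element of a 16-byte vector. It is a simplified model in C: the function names are hypothetical, and it ignores endianness and the two-input index numbering of real shuffle masks:

```c
#include <assert.h>
#include <stdbool.h>

/* Source byte index for result byte `i` when bytes are reversed within
 * each `elem_size`-byte element of a 16-byte vector. */
unsigned revb_mask_index(unsigned i, unsigned elem_size)
{
    assert(i < 16);
    assert(elem_size == 2 || elem_size == 4 || elem_size == 8 || elem_size == 16);
    unsigned base = i - (i % elem_size);        /* first byte of the element */
    return base + (elem_size - 1 - (i % elem_size));
}

/* True if `mask` is exactly the byte-reverse permutation for `elem_size`. */
bool is_revb_mask(const unsigned mask[16], unsigned elem_size)
{
    for (unsigned i = 0; i < 16; ++i)
        if (mask[i] != revb_mask_index(i, elem_size))
            return false;
    return true;
}
```

For half-word elements the permutation is 1,0,3,2,...,15,14; for the quad-word case it is the full reversal 15,14,...,0.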

Differential Revision: https://reviews.llvm.org/D33690

llvm-svn: 305214

1a8eec14

Jun 08, 2017

[Power9] Exploit vector integer extend instructions · 79acbbe5

Zaara Syeda authored Jun 08, 2017

This patch adds build vector patterns to exploit the vector integer
extend instructions:
vextsb2w - Vector Extend Sign Byte To Word
vextsb2d - Vector Extend Sign Byte To Doubleword
vextsh2w - Vector Extend Sign Halfword To Word
vextsh2d - Vector Extend Sign Halfword To Doubleword
vextsw2d - Vector Extend Sign Word To Doubleword
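Per element, each of these takes the low byte, halfword, or word of the element and sign-extends it to the element width. A scalar reference sketch of that per-element operation in portable C; the function name is hypothetical and it models a single element, not the vector instruction:

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of `value` to 64 bits.
 * bits = 8 models the byte forms, 16 the halfword forms, 32 the word form;
 * a word-sized result is simply the low 32 bits of the return value. */
int64_t sign_extend_low(uint64_t value, unsigned bits)
{
    assert(bits > 0 && bits < 64);
    uint64_t mask = (UINT64_C(1) << bits) - 1;
    uint64_t low  = value & mask;               /* keep only the source field */
    uint64_t sign = UINT64_C(1) << (bits - 1);  /* position of the sign bit */
    /* Flipping the sign bit and subtracting it back yields the signed value
     * without relying on implementation-defined narrowing conversions. */
    return (int64_t)(low ^ sign) - (int64_t)sign;
}
```

For instance, a word element holding 0x12345680 has low byte 0x80, so the byte-to-word form would produce -128 for that element.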

Differential Revision: https://reviews.llvm.org/D33510

llvm-svn: 304992

79acbbe5

May 31, 2017

[PowerPC] Fix a performance bug for PPC::XXPERMDI. · 60c247de

Tony Jiang authored May 31, 2017

Some VectorShuffle nodes in the SDAG can be selected to the XXPERMDI
instruction. This patch recognizes them and performs that selection to improve
PPC performance.

Differential Revision: https://reviews.llvm.org/D33404

llvm-svn: 304298

60c247de

May 29, 2017

[PPC] Fix assertion failure during binary encoding with -mcpu=pwr9 · e3c14ebb

Hiroshi Inoue authored May 29, 2017

Summary
clang -c -mcpu=pwr9 test/CodeGen/PowerPC/build-vector-tests.ll causes an assertion failure during the binary encoding.
The failure occurs when a D-form load instruction takes two register operands instead of a register + an immediate.

This patch fixes the problem and also adds an assertion to catch this failure earlier before the binary encoding (i.e. during lit test).
The fix is from Nemanja Ivanovic @nemanjai.

Differential Revision: https://reviews.llvm.org/D33482

llvm-svn: 304133

e3c14ebb