Commits · b9d01aa29e5d0aa433c2fc62ace709fe69c45ceb · Lorenzo Albano / LLVM bpEVL

Jul 11, 2018

[Power9] Add remaining __float128 builtin support for FMA round to odd · b9d01aa2

Stefan Pintilie authored Jul 11, 2018

Implement this as it is done on GCC:

__float128 a, b, c, d;
a = __builtin_fmaf128_round_to_odd (b, c, d);         // generates xsmaddqpo
a = __builtin_fmaf128_round_to_odd (b, c, -d);        // generates xsmsubqpo
a = - __builtin_fmaf128_round_to_odd (b, c, d);       // generates xsnmaddqpo
a = - __builtin_fmaf128_round_to_odd (b, c, -d);      // generates xsnmsubqpo

Differential Revision: https://reviews.llvm.org/D48218

llvm-svn: 336754

b9d01aa2

Jul 09, 2018

[Power9] Add __float128 builtins for Rounding Operations · 133acb22

Stefan Pintilie authored Jul 09, 2018

Added __float128 support for a number of rounding operations:

trunc
rint
nearbyint
round
floor
ceil

Differential Revision: https://reviews.llvm.org/D48415

llvm-svn: 336601

133acb22

[Power9] [LLVM] Add __float128 support for trunc to double round to odd · 58e3e0a8

Stefan Pintilie authored Jul 09, 2018

Add support for this builtin:
double __builtin_truncf128_round_to_odd(__float128)

Differential Revision: https://reviews.llvm.org/D48483

llvm-svn: 336595

58e3e0a8

[Power9] Add __float128 builtins for Round To Odd · 83a5fe14

Stefan Pintilie authored Jul 09, 2018

GCC has builtins for these round to odd instructions:

__float128 __builtin_sqrtf128_round_to_odd (__float128)
__float128 __builtin_{add,sub,mul,div}f128_round_to_odd (__float128, __float128)
__float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128)

Differential Revision: https://reviews.llvm.org/D47550

llvm-svn: 336578

83a5fe14

[Power9] Add __float128 support for compare operations · 3d76326d

Stefan Pintilie authored Jul 09, 2018

Added handling for the select f128.

Differential Revision: https://reviews.llvm.org/D48294

llvm-svn: 336548

3d76326d

Jul 06, 2018

[Power9] Add __float128 library call for frem · b351f09c

Stefan Pintilie authored Jul 06, 2018

Power 9 does not have a hardware instruction for frem but we can call fmodf128.

Differential Revision: https://reviews.llvm.org/D48552

llvm-svn: 336406

b351f09c

Jul 05, 2018

[Power9] Add lib calls for float128 operations with no equivalent PPC instructions · 5612b906

Lei Huang authored Jul 05, 2018

Map the following instructions to the proper float128 lib calls:
  pow[i], exp[2], log[2|10], sin, cos, fmin, fmax

Differential Revision: https://reviews.llvm.org/D48544

llvm-svn: 336361

5612b906

[Power9] Optimize codegen for conversions of int to float128 · 66e22c21

Lei Huang authored Jul 05, 2018

Optimize code sequences for integer conversion to fp128 when the integer is a result of:
  * float->int
  * float->long
  * double->int
  * double->long

Differential Revision: https://reviews.llvm.org/D48429

llvm-svn: 336316

66e22c21

[Power9] Ensure float128 in non-homogenous aggregates are passed via VSX reg · a855e17f

Lei Huang authored Jul 05, 2018

Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory,
or in memory. This patch ensures that float128 members of non-homogenous
aggregates are passed via VSX registers.

This is done by custom lowering a bitcast of a build_pair(i64,i64) to float128
into a new PPCISD node, BUILD_FP128.

Differential Revision: https://reviews.llvm.org/D48308

llvm-svn: 336310

a855e17f

[Power9]Legalize and emit code for quad-precision convert from single-precision · d17c39cc

Lei Huang authored Jul 05, 2018

Legalize and emit code for quad-precision floating point operation conversion of
single-precision value to quad-precision.

Differential Revision: https://reviews.llvm.org/D47569

llvm-svn: 336307

d17c39cc

[Power9] Implement float128 parameter passing and return values · a26f3be4

Lei Huang authored Jul 05, 2018

This patch enables parameter passing and return by value for float128 types.
Support for passing aggregates/unions that contain float128 members will be
submitted in subsequent patches.

Differential Revision: https://reviews.llvm.org/D47552

llvm-svn: 336306

a26f3be4

Jul 04, 2018

[Power9]Legalize and emit code for round & convert quad-precision values · 6270ab6c

Lei Huang authored Jul 04, 2018

Legalize and emit code for round & convert float128 to double precision and
single precision.

Differential Revision: https://reviews.llvm.org/D46997

llvm-svn: 336299

6270ab6c

[PowerPC] Replace the Post RA List Scheduler with the Machine Scheduler · cb4f0c5c

Stefan Pintilie authored Jul 04, 2018

  We want to run the Machine Scheduler instead of the List Scheduler after RA.
  Checked with a performance run on a Power 9 machine with SPEC 2006 and while
  some benchmarks improved and others degraded the geomean was slightly improved
  with the Machine Scheduler.

  Differential Revision: https://reviews.llvm.org/D45265

llvm-svn: 336295

cb4f0c5c

Jul 02, 2018

[PowerPC] Don't make it as pre-inc candidate if displacement isn't 4's... · 3b2aa2b4

QingShan Zhang authored Jul 02, 2018

[PowerPC] Don't make it as pre-inc candidate if displacement isn't 4's multiple for i64 pre-inc load/store

For the case below, pre-inc prep thinks the bucket is a good candidate for pre-inc, but the 64-bit integer load/store with update (pre-inc) instructions on Power require the displacement field to be DS-form (a multiple of 4). Since the displacement cannot satisfy this constraint, fix-ups have to be applied later. The original loads/stores in the example were already well-formed, so the transformation only makes things worse.

unsigned long long result = 0;
unsigned long long foo(char *p, unsigned long long n) {
  for (unsigned long long i = 0; i < n; i++) {
    unsigned long long x1 = *(unsigned long long *)(p - 50000 + i);
    unsigned long long x2 = *(unsigned long long *)(p - 61024 + i);
    unsigned long long x3 = *(unsigned long long *)(p - 62048 + i);
    unsigned long long x4 = *(unsigned long long *)(p - 64096 + i);
    result *= x1 * x2 * x3 * x4;
  }
  return result;
}
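
The constraint described above can be sketched in C. This is only an illustration, not the code in PPCLoopPreIncPrep.cpp: the helper names `fits_ds_form` and `is_preinc_candidate` are hypothetical, and the sketch assumes that the non-i64 update forms take an ordinary D-form (any signed 16-bit) displacement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DS-form encodes a 14-bit field that is shifted left by two, so the
   displacement must be a multiple of 4 within the signed 16-bit range. */
static bool fits_ds_form(int64_t disp) {
    return disp >= -32768 && disp <= 32764 && (disp % 4) == 0;
}

/* Hypothetical candidate filter: 64-bit accesses need a DS-form
   displacement, narrower accesses are assumed to use D-form. */
static bool is_preinc_candidate(int64_t disp, unsigned access_bits) {
    assert(access_bits == 8 || access_bits == 16 ||
           access_bits == 32 || access_bits == 64);
    if (access_bits == 64)
        return fits_ds_form(disp);
    return disp >= -32768 && disp <= 32767;
}
```

With a byte-stepping loop like the one above, a displacement of 1 is rejected for an i64 access but would still be accepted for a 32-bit access.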

Patch by jedilyn(Kewen Lin).

Differential Revision: https://reviews.llvm.org/D48813 

llvm-svn: 336074

3b2aa2b4

Jun 25, 2018

[PowerPC] Fix incorrectly encoded wait instruction · 5d109ee3

Lei Huang authored Jun 25, 2018

Encoding for the wait instruction was wrong. Fix according to ISA 3.0.

Differential Revision: https://reviews.llvm.org/D48550

llvm-svn: 335514

5d109ee3

Jun 19, 2018

[PowerPC] Fix label address calculation for ppc32 · bb2b00bb

Strahinja Petrovic authored Jun 19, 2018

This patch fixes calculating address of label on ppc32 (for -fPIC).

Differential Revision: https://reviews.llvm.org/D46582

llvm-svn: 335043

bb2b00bb

If the arch is P9, we will select the DFLOADf32/DFLOADf64 pseudo instruction... · 9f0fe9a3

QingShan Zhang authored Jun 19, 2018

If the arch is P9, we select the DFLOADf32/DFLOADf64 pseudo instruction when loading a floating-point value,
and expand it post RA based on the register pressure. However, we fail to apply the add-imm peephole to these pseudo instructions.

Differential Revision: https://reviews.llvm.org/D47568
Reviewed By: Nemanjai

llvm-svn: 335024

9f0fe9a3

Jun 15, 2018

[PowerPC] Add support for high and higha symbol modifiers on tls modifiers. · cac28aeb

Sean Fertile authored Jun 15, 2018

Enables using the high and high-adjusted symbol modifiers on thread local
storage modifiers in powerpc assembly. Needed to be able to support 64 bit
thread-pointer and dynamic-thread-pointer access sequences.

Differential Revision: https://reviews.llvm.org/D47754

llvm-svn: 334856

cac28aeb

[PPC64] Support "symbol@high" and "symbol@higha" symbol modifiers. · 80b8f82f

Sean Fertile authored Jun 15, 2018

Add support for the "@high" and "@higha" symbol modifiers in powerpc64 assembly.
The modifiers represent accessing the segment consisting of bits 16-31 of a
64-bit address/offset.
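
As a rough illustration of what the two modifiers compute, here is a minimal C sketch. It assumes the usual PowerPC64 ELF ABI definitions, where the adjusted form adds 0x8000 before shifting so that a sign-extended low 16-bit part added later reconstructs the value. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* @high: bits 16-31 of a 64-bit value. */
static uint16_t sym_high(uint64_t v) {
    return (uint16_t)((v >> 16) & 0xFFFFu);
}

/* @higha: like @high, but adjusted for a sign-extended low half. */
static uint16_t sym_higha(uint64_t v) {
    return (uint16_t)(((v + 0x8000u) >> 16) & 0xFFFFu);
}

/* Rebuild the low 32 bits from @higha plus the sign-extended low half,
   the way an addis/addi style pair would. */
static uint32_t rebuild_low32(uint64_t v) {
    int32_t lo = (int32_t)(v & 0xFFFFu);
    if (lo >= 0x8000)
        lo -= 0x10000; /* sign-extend the low 16 bits */
    return ((uint32_t)sym_higha(v) << 16) + (uint32_t)lo;
}

/* Self-check: the adjusted pair must reproduce the low 32 bits. */
static void check_roundtrip(uint64_t v) {
    assert(rebuild_low32(v) == (uint32_t)v);
}
```

The two modifiers differ only when bit 15 of the value is set, which is exactly the case where the low half sign-extends to a negative number.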

Differential Revision: https://reviews.llvm.org/D47729

llvm-svn: 334855

80b8f82f

Jun 13, 2018

[PowerPC] fix trivial typos in comment, NFC · 0f7f59f0

Hiroshi Inoue authored Jun 13, 2018

llvm-svn: 334583

0f7f59f0

[PowerPC] avoid verification failure due to PowerPC VSX Swap Removal pass · 9bffc94c

Hiroshi Inoue authored Jun 13, 2018

This patch fixes a failure in LNT tests run with the -verify-machineinstrs option.
When the VSX Swap Removal pass swapped two register operands, it did not maintain the kill flags associated with those operands. This patch swaps the flags along with the register numbers to avoid inconsistent kill flag information.
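
The idea can be shown with a small C sketch. The `struct operand` type below is a hypothetical stand-in for a machine operand, not LLVM's MachineOperand; the point is only that the kill flag travels together with its register number.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a machine operand: register number plus
   the kill flag that belongs to this use of the register. */
struct operand {
    unsigned reg;
    bool is_kill;
};

/* Swap both fields together so each kill flag stays with its register. */
static void swap_operands(struct operand *a, struct operand *b) {
    assert(a && b);
    struct operand tmp = *a;
    *a = *b;
    *b = tmp;
}
```

Swapping only `reg` would leave a kill flag attached to a register that is still live, which is the kind of inconsistency a machine verifier reports.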

llvm-svn: 334579

9bffc94c

Jun 08, 2018

[NFC] fix formatting · 863fb7a4

Hiroshi Inoue authored Jun 08, 2018

llvm-svn: 334263

863fb7a4

Jun 07, 2018

[PowerPC] avoid unprofitable Repl32 flag in BitPermutationSelector · 01ef4c2c

Hiroshi Inoue authored Jun 07, 2018

BitPermutationSelector sets the Repl32 flag for bit groups which can (potentially) benefit from 32-bit rotate-and-mask instructions with bit replication, i.e. rlwinm/rlwimi copies the lower 32 bits into the upper 32 bits on 64-bit PowerPC before rotation.
However, enforcing 32-bit instructions sometimes results in redundant generated code.
For example, the following simple code is compiled into rotldi + rlwimi, while it can be compiled into a single rldimi instruction if the Repl32 flag is not set on the bit group for (a & 0xFFFFFFFF).

uint64_t func(uint64_t a, uint64_t b) {
	return (a & 0xFFFFFFFF) | (b << 32) ;
}

To avoid this problem, this patch checks the potential benefit of the Repl32 flag before setting it. If a bit group does not require rotation (i.e. RLAmt == 0) and won't be merged into another group, we do not benefit from the Repl32 flag on this group.
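
To see why a single rldimi is enough for the example, the following C sketch models the instruction and compares it with the source expression. `model_rldimi` is a hypothetical helper that handles only non-wrapping masks, and the operand choice (shift 32, mask begin 0) is an assumption about how the instruction would be used here, not output taken from the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Reference computation from the commit message. */
static uint64_t func(uint64_t a, uint64_t b) {
    return (a & 0xFFFFFFFFu) | (b << 32);
}

/* Simplified model of rldimi: rotate rs left by sh, then insert bits
   mb..(63 - sh) (IBM numbering, bit 0 is the MSB) into ra. */
static uint64_t model_rldimi(uint64_t ra, uint64_t rs,
                             unsigned sh, unsigned mb) {
    assert(sh < 64 && mb <= 63 - sh); /* non-wrapping masks only */
    uint64_t rot = (rs << sh) | (rs >> ((64 - sh) & 63));
    unsigned me = 63 - sh;
    uint64_t mask = (~UINT64_C(0) >> mb) & (~UINT64_C(0) << (63 - me));
    return (rot & mask) | (ra & ~mask);
}
```

With `ra = a`, `rs = b`, `sh = 32` and `mb = 0`, the model keeps the low 32 bits of `a` untouched and fills the upper 32 bits from `b`, so no separate rotate of `a` is required.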

Differential Revision: https://reviews.llvm.org/D47867

llvm-svn: 334195

01ef4c2c

[PowerPC] fix trivial typos in comment, NFC · b5578460

Hiroshi Inoue authored Jun 07, 2018

llvm-svn: 334191

b5578460

Jun 06, 2018

[MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup · 57f661bd

Peter Smith authored Jun 06, 2018

On targets like Arm some relaxations may only be performed when certain
architectural features are available. As functions can be compiled with
differing levels of architectural support we must make a judgement on
whether we can relax based on the MCSubtargetInfo for the function. This
change passes through the MCSubtargetInfo for the function to
fixupNeedsRelaxation so that the decision on whether to relax can be made
per function. In this patch, only the ARM backend makes use of this
information. We must also pass the MCSubtargetInfo to applyFixup because
some fixups skip error checking on the assumption that relaxation has
occurred; to prevent code-generation errors, applyFixup must see the same
MCSubtargetInfo as fixupNeedsRelaxation.

Differential Revision: https://reviews.llvm.org/D44928

llvm-svn: 334078

57f661bd

Jun 05, 2018

[PowerPC] reduce rotate in BitPermutationSelector · 955655f5

Hiroshi Inoue authored Jun 05, 2018

BitPermutationSelector builds the output value by repeating rotate-and-mask instructions with input registers.
Here, we may avoid one rotate instruction if we start building from an input register that does not require rotation.

For example, in the test case bitfieldinsert.ll, the generated code first rotates r4 left by 8 bits and then inserts some bits from r5 without rotation.
This can be done with a single rlwimi instruction, which rotates r4 by 8 bits and inserts its bits into r5.
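
A small C model makes the equivalence concrete. `model_rlwimi` is a hypothetical 32-bit model restricted to non-wrapping masks, and the inserted field (IBM bits 16..23, i.e. mask 0x0000FF00) is an assumption chosen for illustration rather than the field used in bitfieldinsert.ll.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified 32-bit model of rlwimi: rotate rs left by sh, then insert
   bits mb..me (IBM numbering, bit 0 is the MSB) into ra. */
static uint32_t model_rlwimi(uint32_t ra, uint32_t rs,
                             unsigned sh, unsigned mb, unsigned me) {
    assert(sh < 32 && mb <= me && me < 32); /* non-wrapping masks only */
    uint32_t rot = (rs << sh) | (rs >> ((32 - sh) & 31));
    uint32_t mask = (UINT32_C(0xFFFFFFFF) >> mb) &
                    (UINT32_C(0xFFFFFFFF) << (31 - me));
    return (rot & mask) | (ra & ~mask);
}

/* The two-step form: rotate r4 first, then merge the field with r5. */
static uint32_t rotate_then_insert(uint32_t r4, uint32_t r5) {
    uint32_t t = (r4 << 8) | (r4 >> 24); /* rotate left by 8 */
    return (t & UINT32_C(0x0000FF00)) | (r5 & ~UINT32_C(0x0000FF00));
}
```

Starting from the unrotated input (r5) and letting the instruction rotate the other input (r4) while inserting saves the separate rotate.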

This patch adds a check for rotation amounts in the comparator used in sorting to process the input without rotation first.

Differential Revision: https://reviews.llvm.org/D47765

llvm-svn: 334011

955655f5

Jun 04, 2018

Move Analysis/Utils/Local.h back to Transforms · 31b98d2e

David Blaikie authored Jun 04, 2018

Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)

llvm-svn: 333954

31b98d2e

Jun 01, 2018

[NFC] Zero initialize local variables · 9796b47d

Hiroshi Inoue authored Jun 01, 2018

This patch makes local variables zero initialized to avoid broken values in debug output.

llvm-svn: 333754

9796b47d

Set ADDE/ADDC/SUBE/SUBC to expand by default · 8467411d

Amaury Sechet authored Jun 01, 2018

Summary:
They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while.

Targets that use these opcodes are changed in order to ensure their behavior doesn't change.
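
For readers unfamiliar with the replacement opcodes, the C sketch below shows the kind of computation they express: an add that takes an explicit carry-in and returns an explicit carry-out. It illustrates the semantics only and is not LLVM code; `add_carry` and `add128` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Add with carry-in, reporting carry-out. With carry_in == false this
   is the plain "unsigned add with overflow" case. */
static uint64_t add_carry(uint64_t a, uint64_t b, bool carry_in,
                          bool *carry_out) {
    assert(carry_out);
    uint64_t sum = a + b;
    bool c1 = sum < a;          /* wrapped on a + b */
    uint64_t res = sum + (carry_in ? 1u : 0u);
    bool c2 = res < sum;        /* wrapped on the carry-in */
    *carry_out = c1 || c2;
    return res;
}

/* 128-bit addition built from two 64-bit limbs (index 0 is the low limb). */
static void add128(const uint64_t a[2], const uint64_t b[2], uint64_t out[2]) {
    bool carry;
    out[0] = add_carry(a[0], b[0], false, &carry);
    out[1] = add_carry(a[1], b[1], carry, &carry);
}
```

Passing the carry as an ordinary value, rather than through an implicit flag, is what lets the second limb's add be expressed without hidden state.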

Reviewers: efriedma, craig.topper, dblaikie, bkramer

Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D47422

llvm-svn: 333748

8467411d

May 29, 2018

[PowerPC] Fix the incorrect iterator inside peephole · 716103f1

Lei Huang authored May 29, 2018

Instruction selection can insert nodes into the underlying list after the root
node, so iterating from the root will miss them. We should NOT assume that the
root node is the last element in the DAG nodelist.

Patch by: steven.zhang (Qing Shan Zhang)

Differential Revision: https://reviews.llvm.org/D47437

llvm-svn: 333415

716103f1

May 28, 2018

[Power9]Legalize and emit code for HW/Byte vector extract and convert to QP · 651be449

Lei Huang authored May 28, 2018

Implement patterns to extract HWord and Byte vector elements and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46774

llvm-svn: 333377

651be449

[PowerPC] Set isAsmParserOnly=1 for X-form TLS loads/stores · 6f3df02f

Zaara Syeda authored May 28, 2018

The X-form TLS load/store instructions added for optimizing the initial-exec
sequence in https://reviews.llvm.org/rL327635 fail to assemble. llvm-mc fails
with the error: invalid operand for instruction. This patch adds these
instructions into a block with isAsmParserOnly, similar to how ADD8TLS_ is
currently handled.

Differential Revision: https://reviews.llvm.org/D47382

llvm-svn: 333374

6f3df02f

May 24, 2018

[PowerPC] Remove the match pattern in the definition of LXSDX/STXSDX · f4ec6782

Lei Huang authored May 24, 2018

The match pattern in the definition of LXSDX is xoaddr, so the Pseudo
instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX post
RA based on the register pressure. To avoid ambiguity, we need to remove the
select pattern for LXSDX, same as what was done for LXSD. STXSDX also has
the same issue.

Patch by Qing Shan Zhang (steven.zhang).

Differential Revision: https://reviews.llvm.org/D47178

llvm-svn: 333150

f4ec6782

May 23, 2018

[Power9]Legalize and emit code for W vector extract and convert to QP · 8b0da65b

Lei Huang authored May 23, 2018

Implement patterns to extract [Un]signed Word vector element and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46536

llvm-svn: 333115

8b0da65b

[Power9]Legalize and emit code for DW vector extract and convert to QP · 8990168a

Lei Huang authored May 23, 2018

Implement patterns to extract [Un]signed DWord vector element and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46333

llvm-svn: 333112

8990168a

May 21, 2018

MC: Separate creating a generic object writer from creating a target object writer. NFCI. · dcd7d6c3

Peter Collingbourne authored May 21, 2018

With this we gain a little flexibility in how the generic object
writer is created.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47045

llvm-svn: 332868

dcd7d6c3

MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an MCObjectWriter. NFCI. · 571a3301

Peter Collingbourne authored May 21, 2018

To make this work I needed to add an endianness field to MCAsmBackend
so that writeNopData() implementations know which endianness to use.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47035

llvm-svn: 332857

571a3301

May 18, 2018

Support: Simplify endian stream interface. NFCI. · e3f65297

Peter Collingbourne authored May 18, 2018

Provide some free functions to reduce verbosity of endian-writing
a single value, and replace the endianness template parameter with
a field.
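
The change described here is to C++ code, but the design idea carries over to a C analogue: byte order becomes a runtime field of the writer, and a small free function writes one value. Everything in the sketch below (`struct endian_writer`, `write_u32`, the enum) is hypothetical and only mirrors the idea, not LLVM's actual interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG };

/* Hypothetical writer: endianness is a runtime field rather than a
   compile-time parameter. */
struct endian_writer {
    uint8_t *buf;
    size_t len;
    size_t cap;
    enum byte_order order;
};

/* Free function that writes a single 32-bit value in the writer's order. */
static void write_u32(struct endian_writer *w, uint32_t v) {
    assert(w->cap - w->len >= 4);
    for (unsigned i = 0; i < 4; ++i) {
        unsigned shift = (w->order == ORDER_LITTLE) ? 8 * i : 8 * (3 - i);
        w->buf[w->len++] = (uint8_t)(v >> shift);
    }
}
```

Because the order is data, one writer type serves both big-endian and little-endian targets, and callers do not need to be instantiated per byte order.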

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47032

llvm-svn: 332757

e3f65297

May 14, 2018

[NFC] [Power] Fix instruction format for xsrqpi · 421a5960

Zaara Syeda authored May 14, 2018

xsrqpi is currently using Z23Form_1.
The instruction format is xsrqpi R,VRT,VRB,RMC.
Rather than bits 11-15 being used for FRA, it should have
bits 11-14 reserved and bit 15 for R. This patch adds a new
class Z23Form_4 to fix the instruction format.

Differential Revision: https://reviews.llvm.org/D46761

llvm-svn: 332253

421a5960

Rename DEBUG macro to LLVM_DEBUG. · d34e60ca

Nicola Zaghen authored May 14, 2018

    
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually change DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240

d34e60ca