Commits · 438bf4a66b1573887823fe36c82c45a891107768 · Lorenzo Albano / LLVM bpEVL

Nov 20, 2017

[PPC] Heuristic to choose between a X-Form VSX ld/st vs a X-Form FP ld/st. · 438bf4a6

Tony Jiang authored Nov 20, 2017

The VSX versions have the advantage of a full 64-register target whereas the FP
ones have the advantage of lower latency and higher throughput. So what we’re
after is using the faster instructions in low register pressure situations and
using the larger register file in high register pressure situations.

The heuristic chooses between the following 7 pairs of instructions.
PPC::LXSSPX vs PPC::LFSX
PPC::LXSDX vs PPC::LFDX
PPC::STXSSPX vs PPC::STFSX
PPC::STXSDX vs PPC::STFDX
PPC::LXSIWAX vs PPC::LFIWAX
PPC::LXSIWZX vs PPC::LFIWZX
PPC::STXSIWX vs PPC::STFIWX

Differential Revision: https://reviews.llvm.org/D38486

llvm-svn: 318651

438bf4a6

Nov 17, 2017

Fix a bunch more layering of CodeGen headers that are in Target · b3bde2ea

David Blaikie authored Nov 17, 2017

All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490

b3bde2ea

Nov 16, 2017

[PPC] Change i32 constant in store instruction to i64 · 433e8d3e

Guozhi Wei authored Nov 16, 2017

This patch changes all i32 constants in store instructions to i64 with truncation, to increase the chance that the referenced constant can be shared with other i64 constants.

Differential Revision: https://reviews.llvm.org/D39352

llvm-svn: 318436

433e8d3e

Add backend name to Target to enable runtime info to be fed back into TableGen · 725584e2

Daniel Sanders authored Nov 15, 2017

Summary:
Make it possible to feed runtime information back to tablegen to enable
profile-guided tablegen-eration, detection of untested tablegen definitions, etc.

Being a cross-compiler by nature, LLVM will potentially collect data for multiple
architectures (e.g. when running 'ninja check'). We therefore need a way for
TableGen to figure out what data applies to the backend it is generating at the
time. This patch achieves that by including the name of the 'def X : Target ...'
for the backend in the TargetRegistry.

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: jholewinski, arsenm, jyknight, aditya_nandakumar, sdardis, nemanjai, ab, nhaehnle, t.p.northover, javed.absar, qcolombet, llvm-commits, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D39742

llvm-svn: 318352

725584e2

Nov 15, 2017

[PowerPC] Implement mayBeEmittedAsTailCall for PPC · 0f0837e8

Sean Fertile authored Nov 15, 2017

Implements TargetLowering callback 'mayBeEmittedAsTailCall' that enables
CodeGenPrepare to duplicate returns when they might enable a tail-call.

Differential Revision: https://reviews.llvm.org/D39777

llvm-svn: 318321

0f0837e8

[PowerPC] Split out the tailcall calling convention checks. NFC. · 7b056b30

Sean Fertile authored Nov 15, 2017

Move the calling convention checks for tail-call eligibility for the 64-bit
SysV ABI into a separate function. This is so that it can be shared with
'mayBeEmittedAsTailCall' in a subsequent change.

llvm-svn: 318305

7b056b30

[PowerPC] fix up in redundant compare elimination · 72a1f98a

Hiroshi Inoue authored Nov 15, 2017

This patch fixes a potential problem in my previous commit (https://reviews.llvm.org/rL312514) by introducing an additional check.

llvm-svn: 318266

72a1f98a

Nov 08, 2017

Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering · 3f833edc

David Blaikie authored Nov 08, 2017

This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647

3f833edc

Nov 07, 2017

Use new vector insert half-word and byte instructions when we see... · 5cd044e8

Graham Yiu authored Nov 07, 2017

Use new vector insert half-word and byte instructions when we see insertelement on '8 x i16' and '16 x i8' types. Also extended existing lit testcase to cover these cases.

Differential Revision: https://reviews.llvm.org/D34630

llvm-svn: 317613

5cd044e8

Nov 06, 2017

Fix buildbot breakages from r317503. Add parentheses to assignment when using... · 52a52a6c

Graham Yiu authored Nov 06, 2017

Fix buildbot breakages from r317503. Add parentheses to assignment when using result as a condition.

llvm-svn: 317508

52a52a6c

Adds code to PPC ISEL lowering to recognize byte inserts from vector_shuffles,... · 030621bb

Graham Yiu authored Nov 06, 2017

Adds code to PPC ISEL lowering to recognize byte inserts from vector_shuffles, and use P9 shift and vector insert byte instructions instead of vperm. Extends tests from vector insert half-word.

Differential Revision: https://reviews.llvm.org/D34497

llvm-svn: 317503

030621bb

[PPC] Use xxbrd to speed up bswap64 · e3b8d9a3

Guozhi Wei authored Nov 06, 2017

Power doesn't have bswap instructions, so llvm generates the following code sequence for bswap64.

  rotldi   5, 3, 16
  rotldi   4, 3, 8
  rotldi   9, 3, 24
  rotldi   10, 3, 32
  rotldi   11, 3, 48
  rotldi   12, 3, 56
  rldimi 4, 5, 8, 48
  rldimi 4, 9, 16, 40
  rldimi 4, 10, 24, 32
  rldimi 4, 11, 40, 16
  rldimi 4, 12, 48, 8
  rldimi 4, 3, 56, 0

But Power9 has vector bswap instructions, which can also be used to speed up the scalar bswap intrinsic. With this patch, bswap64 can be translated to:

  mtvsrdd 34, 3, 3
  xxbrd 34, 34
  mfvsrld 3, 34
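
For reference, the scalar operation that both sequences compute can be written in portable C. This is only a sketch of the semantics, not the code LLVM emits or the pattern it matches; the function name `bswap64` is chosen here for illustration.

```c
#include <stdint.h>

/* Reverses the order of the eight bytes in x (illustrative helper). */
uint64_t bswap64(uint64_t x) {
    /* Swap adjacent bytes. */
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    /* Swap adjacent 16-bit halves. */
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    /* Swap the two 32-bit words. */
    return (x << 32) | (x >> 32);
}
```

Applying the function twice returns the original value, which is a quick sanity check for any byte-swap lowering.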

Differential Revision: https://reviews.llvm.org/D39510

llvm-svn: 317499

e3b8d9a3

Nov 03, 2017

Move TargetFrameLowering.h to CodeGen where it's implemented · 1be62f03

David Blaikie authored Nov 03, 2017

This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.

This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.

llvm-svn: 317379

1be62f03

Nov 01, 2017

Adds code to PPC ISEL lowering to recognize half-word inserts from... · 67152614

Graham Yiu authored Nov 01, 2017

Adds code to PPC ISEL lowering to recognize half-word inserts from vector_shuffles, and use P9 shift and vector insert instructions instead of vperm.

Differential Revision: https://reviews.llvm.org/D34160

llvm-svn: 317111

67152614

Oct 30, 2017

Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat" · 6262fd4b

Stefan Pintilie authored Oct 30, 2017

Revert r316478.
A test case has failed.
Will recommit this change once we find and fix the failure.

This reverts commit 7c330fabaedaba3d02c58bc3cc1198896c895f34.

llvm-svn: 316952

6262fd4b

[PPC CodeGen] Fix the bitreverse.i64 intrinsic. · 2696db90

Fangrui Song authored Oct 30, 2017

Summary: The two 32-bit words were swapped. Update a test omitted in reverted r316270.
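
As an illustration of why the word order matters, here is a minimal C sketch that builds a 64-bit bit reversal from two 32-bit reversals. It models only the intended semantics, not the PPC lowering itself; the helper names `bitreverse32` and `bitreverse64` are hypothetical.

```c
#include <stdint.h>

/* Reverses the 32 bits of x (illustrative helper). */
static uint32_t bitreverse32(uint32_t x) {
    x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);
    x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);
    x = ((x & 0x0F0F0F0Fu) << 4) | ((x >> 4) & 0x0F0F0F0Fu);
    x = ((x & 0x00FF00FFu) << 8) | ((x >> 8) & 0x00FF00FFu);
    return (x << 16) | (x >> 16);
}

/* Reverses the 64 bits of x by reversing each word and exchanging them. */
uint64_t bitreverse64(uint64_t x) {
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    /* The reversed low word must become the new high word, and vice versa. */
    return ((uint64_t)bitreverse32(lo) << 32) | bitreverse32(hi);
}
```

Leaving the two reversed words in their original positions gives a result that is correct within each word but wrong overall, which is the kind of error the summary describes.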

Reviewers: jtony, aaron.ballman

Subscribers: nemanjai, kbarton

Differential Revision: https://reviews.llvm.org/D39163

llvm-svn: 316916

2696db90

[CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2). · b2c3eb8c

Clement Courbet authored Oct 30, 2017

 - Targets that want to support memcmp expansions now return the list of
   supported load sizes.
 - Expansion codegen does not assume that all power-of-two load sizes
   smaller than the max load size are valid. For example, this is not the
   case for x86(32bit)+sse2.
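
To make the second point concrete, the following C sketch covers a compare length greedily using only the load sizes a target reports. It is a simplified model and not the ExpandMemcmp implementation; the function `plan_loads` and the size list {16, 4, 2, 1} (a target with 16-byte vector loads but no 8-byte load) are assumptions made for illustration.

```c
#include <stddef.h>

/* Covers `len` bytes with loads drawn from `sizes`, which must be sorted
 * largest first. Writes the chosen sizes to `out` (capacity `cap`).
 * Returns the number of loads, or 0 if `len` cannot be covered. */
size_t plan_loads(const unsigned *sizes, size_t nsizes, unsigned len,
                  unsigned *out, size_t cap) {
    size_t n = 0;
    for (size_t i = 0; i < nsizes && len > 0; ++i) {
        if (sizes[i] == 0)
            continue;               /* ignore meaningless entries */
        while (len >= sizes[i]) {
            if (n == cap)
                return 0;           /* output buffer too small */
            out[n++] = sizes[i];
            len -= sizes[i];
        }
    }
    return len == 0 ? n : 0;
}
```

With the assumed list, a 15-byte compare is planned as 4 + 4 + 4 + 2 + 1 rather than 8 + 4 + 2 + 1, because the planner never assumes that an 8-byte load exists.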

Fixes PR34887.

llvm-svn: 316905

b2c3eb8c

Oct 26, 2017

[PowerPC] Use record-form instruction for Less-or-Equal -1 and Greater-or-Equal 1 · b72b1fb0

Hiroshi Inoue authored Oct 26, 2017

Currently a record-form instruction is used for comparison of "greater than -1" and "less than 1" by modifying the predicate (e.g. LT 1 into LE 0) in addition to the naive case of comparison against 0.
This patch also enables emitting a record-form instruction for "less than or equal to -1" (i.e. "less than 0") and "greater than or equal to 1" (i.e. "greater than 0") to increase the optimization opportunities.
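
The predicate rewrites described above can be summarized in a short C sketch for signed integers. The enum `Pred` and the function `normalize_to_zero` are hypothetical names used only for illustration; they do not correspond to identifiers in the patch.

```c
#include <stdbool.h>

typedef enum { PRED_LT, PRED_LE, PRED_GT, PRED_GE } Pred;

/* If "x <pred> imm" is equivalent to a comparison of x against 0,
 * rewrites *pred to the equivalent predicate and returns true. */
bool normalize_to_zero(Pred *pred, int imm) {
    if (imm == 0)
        return true;                                                   /* already against 0 */
    if (imm == 1 && *pred == PRED_LT)  { *pred = PRED_LE; return true; } /* x < 1   <=> x <= 0 */
    if (imm == 1 && *pred == PRED_GE)  { *pred = PRED_GT; return true; } /* x >= 1  <=> x > 0  */
    if (imm == -1 && *pred == PRED_GT) { *pred = PRED_GE; return true; } /* x > -1  <=> x >= 0 */
    if (imm == -1 && *pred == PRED_LE) { *pred = PRED_LT; return true; } /* x <= -1 <=> x < 0  */
    return false;
}
```

The first and third rewrites are the cases the message says were already handled; the second and fourth are the ones this patch adds.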

Differential Revision: https://reviews.llvm.org/D38941

llvm-svn: 316647

b72b1fb0

Oct 24, 2017

[PowerPC] Try to simplify a Swap if it feeds a Splat · 8f0c7830

Stefan Pintilie authored Oct 24, 2017

If we have the situation where a Swap feeds a Splat we can sometimes change the
index on the Splat and then remove the Swap instruction.

Fixed the test case that was failing and recommit after pulling the original
commit.

Original revision is here: https://reviews.llvm.org/D39009

llvm-svn: 316478

8f0c7830

PowerPC: support the separator character in the IAS · fb490a0b

Saleem Abdulrasool authored Oct 24, 2017

PowerPC uses ; as a comment leader and the @ as a separator character.
Support this properly.

llvm-svn: 316454

fb490a0b

Oct 23, 2017

Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat" · 52bbd587

Stefan Pintilie authored Oct 23, 2017

Revert commit r316366.
Previous commit causes p8-scalar_vector_conversions.ll to fail.

This reverts commit 990e764ad8a2eec206ce5dda6aefab059ccd4e92.

llvm-svn: 316371

52bbd587

[PowerPC] Try to simplify a Swap if it feeds a Splat · feafa1d7

Stefan Pintilie authored Oct 23, 2017

If we have the situation where a Swap feeds a Splat we can sometimes change the
index on the Splat and then remove the Swap instruction.
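
A small C model may help show why the swap can be dropped. It assumes the swap exchanges the two doublewords of a four-word vector and the splat broadcasts one word; the types and function names are hypothetical and the model says nothing about which cases the peephole actually accepts.

```c
#include <stdint.h>

typedef struct { uint32_t w[4]; } V4;

/* Exchanges the two 64-bit halves of the vector. */
V4 swap_doublewords(V4 v) {
    V4 r = {{ v.w[2], v.w[3], v.w[0], v.w[1] }};
    return r;
}

/* Broadcasts word `idx` to all four lanes. */
V4 splat_word(V4 v, unsigned idx) {
    uint32_t e = v.w[idx & 3];
    V4 r = {{ e, e, e, e }};
    return r;
}

/* Index to use on the original vector so the swap becomes unnecessary. */
unsigned splat_index_without_swap(unsigned idx) {
    return (idx + 2) & 3;
}
```

Under these assumptions, `splat_word(swap_doublewords(v), i)` and `splat_word(v, splat_index_without_swap(i))` produce the same vector for every index `i`.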

Differential Revision: https://reviews.llvm.org/D39009

llvm-svn: 316366

feafa1d7

Oct 21, 2017

Reverting r316270 due to failing build bots. · fc02869c

Aaron Ballman authored Oct 21, 2017

http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/12899
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7951

llvm-svn: 316276

fc02869c

[PPC CodeGen] Fix the bitreverse.i64 intrinsic. · c7b749bd

Fangrui Song authored Oct 21, 2017

Summary: The two 32-bit words were swapped.

Subscribers: nemanjai, kbarton

Differential Revision: https://reviews.llvm.org/D38705

llvm-svn: 316270

c7b749bd

Oct 20, 2017

Disabling the transformation introduced in r315888 · 0026c06e

Nemanja Ivanovic authored Oct 20, 2017

The commit at https://reviews.llvm.org/rL315888 is causing some failures
with internal testing. Disabling this code until we can resolve the issues.

llvm-svn: 316199

0026c06e

Oct 19, 2017

The cost of splitting a large vector instruction is not being taken into... · 488782ef

Graham Yiu authored Oct 19, 2017

The cost of splitting a large vector instruction is not being taken into account by the getUserCost function. This was leading to some loops being over unrolled. The cost of a vector instruction is now being multiplied by the cost of the type legalization. This will return a more accurate cost.
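
The scaling can be pictured with a minimal C sketch. It assumes the legalization cost is simply the number of legal-width registers a wide vector is split into, and the 128-bit width used in the usage note is an assumption; the function names are hypothetical and the real cost model may compute this factor differently.

```c
/* Number of legal-width pieces a vector of `vector_bits` splits into. */
unsigned legalization_factor(unsigned vector_bits, unsigned legal_bits) {
    if (legal_bits == 0 || vector_bits == 0)
        return 1;                                   /* nothing to split */
    return (vector_bits + legal_bits - 1) / legal_bits;
}

/* Base instruction cost scaled by the number of pieces after splitting. */
unsigned scaled_cost(unsigned base_cost, unsigned vector_bits,
                     unsigned legal_bits) {
    return base_cost * legalization_factor(vector_bits, legal_bits);
}
```

For example, an operation of cost 1 on a 512-bit vector with 128-bit legal registers is charged 4 instead of 1, which makes aggressive unrolling of such loops look less attractive.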

Committing on behalf of Brad Nemanich (brad.nemanich@ibm.com)

Differential Revision: https://reviews.llvm.org/D38961

llvm-svn: 316174

488782ef

Oct 18, 2017

[PowerPC] Use helper functions to check sign-/zero-extended value · 5388e66d

Hiroshi Inoue authored Oct 18, 2017

Helper functions to identify sign- and zero-extending machine instructions were introduced in rL315888.
This patch makes PPCInstrInfo::optimizeCompareInstr use the helper functions. It simplifies the code and also enables more optimizations, since the helpers can do more analysis than the original check code; I observed about 5000 more compare instructions being eliminated while building LLVM.

Also, this patch fixes a bug in the helpers' handling of the ANDIo instruction, caused by the order of checks. This bug causes a failure in an existing test case for optimizeCompareInstr.

Differential Revision: https://reviews.llvm.org/D38988

llvm-svn: 316071

5388e66d

Oct 16, 2017

Add iterator range MachineRegisterInfo::liveins(), adopt users, NFC · 72518eaa

Krzysztof Parzyszek authored Oct 16, 2017

llvm-svn: 315927

72518eaa

[PowerPC] fix up in sign-/zero-extension elimination · a7eb78b4

Hiroshi Inoue authored Oct 16, 2017

This patch fixes a potential problem in my previous commit (https://reviews.llvm.org/rL315888) by adding a null check.

llvm-svn: 315900

a7eb78b4

[PowerPC] Eliminate sign- and zero-extensions if already sign- or zero-extended · e3a3e3c9

Hiroshi Inoue authored Oct 16, 2017

This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass.
If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated.
One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI.
For example, for the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated.

void int_func(int);
void ii_test(int a) {
    if (a & 1) return int_func(a);
}

Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling LLVM+CLANG.
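
The notion of an "already sign-extended" value can be illustrated in C by modelling a 64-bit register. This is a sketch of the property only, not of the peephole's analysis; `sext32` and `is_sign_extended32` are hypothetical helper names.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sign-extends the low 32 bits of a 64-bit register value
 * (what an extsw-style operation computes). */
int64_t sext32(uint64_t reg) {
    uint64_t low = reg & 0xFFFFFFFFULL;
    /* Flip the sign bit, then subtract it back as a signed quantity. */
    return (int64_t)(low ^ 0x80000000ULL) - (int64_t)0x80000000LL;
}

/* True if sign-extending the register again would not change it. */
bool is_sign_extended32(uint64_t reg) {
    return (uint64_t)sext32(reg) == reg;
}
```

When `is_sign_extended32` is known to hold for an input, for instance because the ABI guarantees it for an `int` parameter, a further sign-extension is a no-op and can be removed.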

Differential Revision: https://reviews.llvm.org/D31319

llvm-svn: 315888

e3a3e3c9

Oct 15, 2017

Reverting r315590; it did not include changes for llvm-tblgen, which is... · 615eb470

Aaron Ballman authored Oct 15, 2017

Reverting r315590; it did not include changes for llvm-tblgen, which is causing link errors for several people.

Error LNK2019 unresolved external symbol "public: void __cdecl `anonymous namespace'::MatchableInfo::dump(void)const " (?dump@MatchableInfo@?A0xf4f1c304@@QEBAXXZ) referenced in function "public: void __cdecl `anonymous namespace'::AsmMatcherEmitter::run(class llvm::raw_ostream &)" (?run@AsmMatcherEmitter@?A0xf4f1c304@@QEAAXAEAVraw_ostream@llvm@@@Z) llvm-tblgen D:\llvm\2017\utils\TableGen\AsmMatcherEmitter.obj 1

llvm-svn: 315854

615eb470

Oct 13, 2017

DAG: Add opcode and source type to isFPExtFree · f2db97d8

Matt Arsenault authored Oct 13, 2017

This is only currently used for mad/fma transforms.
This is the only case where it should be used for AMDGPU,
so add an opcode to be sure.

llvm-svn: 315740

f2db97d8

Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine" · bb8507e6

Matthias Braun authored Oct 12, 2017

Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.

This reverts commit r315633.

llvm-svn: 315637

bb8507e6

TargetMachine: Merge TargetMachine and LLVMTargetMachine · 3a9c114b

Matthias Braun authored Oct 12, 2017

Merge LLVMTargetMachine into TargetMachine.

- There is no in-tree target anymore that just implements TargetMachine
  but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
  case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
  interface.

Differential Revision: https://reviews.llvm.org/D38489

llvm-svn: 315633

3a9c114b

Oct 12, 2017

[PowerPC] Add profitability check for conversion to mtctr loops · 0724fea2

Lei Huang authored Oct 12, 2017

Add profitability checks for modifying counted loops to use the mtctr instruction.

The latency of mtctr is only justified if there are more than 4 comparisons that
will be removed as a result. Usually counted loops are formed relatively early
and before unrolling, so most low trip count loops don't survive. However,
we want to ensure that if they do, we do not mistakenly update them to mtctr loops.

Use CodeMetrics to ensure we are only doing this for small loops with small trip counts.

Differential Revision: https://reviews.llvm.org/D38212

llvm-svn: 315592

0724fea2

[dump] Remove NDEBUG from test to enable dump methods [NFC] · 3e0199f7

Don Hinton authored Oct 12, 2017

Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.

Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.

Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.

Differential Revision: https://reviews.llvm.org/D38406

llvm-svn: 315590

3e0199f7

Oct 11, 2017

[PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex... · 263dc4ef

Lei Huang authored Oct 11, 2017

[PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex elimination to only use `lis/addi` if necessary.

Currently we produce a bunch of unnecessary code when emitting the
prologue/epilogue for spills/restores.  Namely, if the load from stack
slot/store to stack slot instruction is an X-Form instruction, we will
always produce an LIS/ORI sequence for the stack offset.

Furthermore, we have not exploited the P9 vector D-Form loads/stores for this
purpose.

This patch addresses both issues.

Specifying the D-Form load as the instruction to use for stack spills/reloads
should be safe because:

1. The stack should be aligned according to the ABI
2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will
   check for the offset being a multiple of 16 and will convert it to an
   X-Form instruction if it isn't.
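
The fallback in point 2 amounts to a simple check on the stack offset, sketched below in C. The enum and function names are hypothetical, and the sketch models only the multiple-of-16 test mentioned above; any other constraints on the displacement are deliberately left out.

```c
#include <stdint.h>

typedef enum { FORM_DQ, FORM_X } AccessForm;

/* Picks the instruction form for a spill or reload at `offset`:
 * the DQ-Form is usable only when the offset is a multiple of 16. */
AccessForm choose_form(int64_t offset) {
    return (offset % 16 == 0) ? FORM_DQ : FORM_X;
}
```

An offset of 32 keeps the DQ-Form instruction, while an offset of 40 falls back to the X-Form, which is when materializing the offset in a register becomes necessary.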

Differential Revision: https://reviews.llvm.org/D38758

llvm-svn: 315500

263dc4ef

[Asm] Add debug tracing in table-generated assembly matcher · 4191b9ea

Oliver Stannard authored Oct 11, 2017

This adds debug tracing to the table-generated assembly instruction matcher,
enabled by the -debug-only=asm-matcher option.

The changes in the target AsmParsers are to add an MCInstrInfo reference under
a consistent name, so that we can use it from table-generated code. This was
already being used this way for targets that use deprecation warnings, but 5
targets did not have it, and Hexagon had it under a different name to the other
backends.

llvm-svn: 315445

4191b9ea

Oct 10, 2017

[MC] Add a missing <memory> include left out of r315327. · 3a67075a

Lang Hames authored Oct 10, 2017

llvm-svn: 315331

3a67075a

[MC] Thread unique_ptr<MCObjectWriter> through the create.*ObjectWriter · 60fbc7cc

Lang Hames authored Oct 10, 2017

functions.

This makes the ownership of the resulting MCObjectWriter clear, and allows us
to remove one instance of MCObjectStreamer's bizarre "holding ownership via
someone else's reference" trick.

llvm-svn: 315327

60fbc7cc