Commits · 61ef1c540cd2f6245be54684e966cdffa9f394f6 · Lorenzo Albano / LLVM bpEVL

Sep 05, 2017

[PPC][NFC] Renaming things with 'xxinsert' moniker to 'vecinsert' to make it more general. · 61ef1c54
Tony Jiang authored Sep 05, 2017
```
Commit on behalf of Graham Yiu (gyiu@ca.ibm.com)

llvm-svn: 312547
```

[AVX512] Remove patterns for (v8f32 (X86vzmovl (insert_subvector undef, (v4f32... · 33caeadd

Craig Topper authored Sep 05, 2017

[AVX512] Remove patterns for (v8f32 (X86vzmovl (insert_subvector undef, (v4f32 (scalar_to_vector FR32X:)), (iPTR 0)))) and the same for v4f64.

We don't have this same pattern for AVX2 so I don't believe we should have it for AVX512. We also didn't have it for v16f32.

llvm-svn: 312543


AMDGPU/NFC: Cleanup/refactor SIMemoryLegalizer [2]: · 1aa667fe

Konstantin Zhuravlyov authored Sep 05, 2017

  - Make SIMemOpInfo a class
  - Add accessor methods to SIMemOpInfo
  - Move get*Info methods to SIMemOpInfo

Differential Revision: https://reviews.llvm.org/D37395

llvm-svn: 312541


AMDGPU/NFC: Cleanup/refactor SIMemoryLegalizer [1]: · 844845ae

Konstantin Zhuravlyov authored Sep 05, 2017

  - Rename MemOpInfo -> SIMemOpInfo
  - Move SIMemOpInfo class out of SIMemoryLegalizer class

Differential Revision: https://reviews.llvm.org/D37394

llvm-svn: 312540


[X86] Limit store merge size when implicitfloat is enabled (PR34421) · 49f9ba37

Simon Pilgrim authored Sep 05, 2017

As suggested by @niravd : https://bugs.llvm.org/show_bug.cgi?id=34421#c2

Differential Revision: https://reviews.llvm.org/D37464

llvm-svn: 312534


Strip trailing whitespace. NFCI. · 60ea09ea
Simon Pilgrim authored Sep 05, 2017
```
llvm-svn: 312531
```

[Decompression] Fail gracefully when out of memory · 0992d382

Jonas Devlieghere authored Sep 05, 2017

This patch makes decompression fail gracefully when memory runs out while
allocating the decompression buffer.
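
As an illustration only, the general idea of reporting an allocation failure instead of aborting can be sketched as below. This is not the code from the patch: `allocateDecompressionBuffer` is a hypothetical helper, and because LLVM is normally built without exceptions, the real change has to rely on a different mechanism than the `try`/`catch` shown here.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not an LLVM API: try to size Out for
// DecompressedSize bytes and report failure instead of aborting.
bool allocateDecompressionBuffer(std::vector<char> &Out,
                                 std::size_t DecompressedSize) {
  try {
    Out.resize(DecompressedSize);
  } catch (const std::bad_alloc &) {
    return false; // The allocator could not satisfy the request.
  } catch (const std::length_error &) {
    return false; // The size exceeds what the container can represent.
  }
  return true;
}
```

A caller would check the return value and surface an error to the user rather than crash, which is the behavior a fuzzer-supplied, absurdly large decompressed size calls for.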

This provides a work-around for:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3224

Differential revision: https://reviews.llvm.org/D37447

llvm-svn: 312526


[ARM] GlobalISel: Minor cleanups in inst selector · ac15473c

Diana Picus authored Sep 05, 2017

Use the STI member of ARMInstructionSelector instead of
TII.getSubtarget() and also make use of STI's methods instead of
checking the object format manually.

llvm-svn: 312522


[ARM] GlobalISel: Support global variables for RWPI · abb08869

Diana Picus authored Sep 05, 2017

In RWPI code, globals that are not read-only are accessed relative to
the SB register (R9). This is achieved by explicitly generating an ADD
instruction between SB and an offset that we either load from a constant
pool or movw + movt into a register.

llvm-svn: 312521


[X86] Add hasSideEffects=0 and mayLoad=1 to some instructions that recently... · c228d790
Craig Topper authored Sep 05, 2017
```
[X86] Add hasSideEffects=0 and mayLoad=1 to some instructions that recently had their patterns removed.

llvm-svn: 312520
```
[InstCombine] Move foldSelectICmpAnd helper function earlier in the file to... · 28d6d962
Craig Topper authored Sep 05, 2017
```
[InstCombine] Move foldSelectICmpAnd helper function earlier in the file to enable reuse in a future patch.

llvm-svn: 312518
```

[InstCombine] In foldSelectIntoOp, avoid creating a Constant before we know... · 4c766a05

Craig Topper authored Sep 05, 2017

[InstCombine] In foldSelectIntoOp, avoid creating a Constant before we know for sure we're going to use it and avoid an unnecessary call to m_APInt.

Instead of creating a Constant and then calling m_APInt with it (which will always return true), just create an APInt initially and use that for the checks in the isSelect01 function. If it turns out we do need the Constant, create it from the APInt.

This is a refactor for a future patch that will do some more checks of the constant values here.

llvm-svn: 312517


[PowerPC] eliminate redundant compare instruction · 614453b7

Hiroshi Inoue authored Sep 05, 2017

On PPC, when multiple conditional branches depend on the same comparison, they can all be executed based on the result of a single compare instruction. For example,

if (a == 0) { ... }
else if (a < 0) { ... }

can be executed by one compare and two conditional branches instead of two pairs of a compare and a conditional branch.

This patch identifies a code sequence of two compare-and-conditional-branch pairs and merges the compares if possible.
To maximize the opportunity, we do canonicalization of code sequence before merging compares.
For the above example, the input for this pass looks like:

cmplwi r3, 0
beq    0, .LBB0_3
cmpwi  r3, -1
bgt    0, .LBB0_4

So, before merging two compares, we canonicalize it as

cmpwi  r3, 0       ; cmplwi and cmpwi yield same result for beq
beq    0, .LBB0_3
cmpwi  r3, 0       ; greater than -1 means greater than or equal to 0
bge    0, .LBB0_4

The generated code should be

cmpwi  r3, 0
beq    0, .LBB0_3
bge    0, .LBB0_4
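
The canonicalization step above (turning `> -1` into `>= 0` so both branches can share one `cmpwi r3, 0`) can be sketched as a small model. This is only an illustration under stated assumptions, not the pass's code: `Pred`, `CmpBranch`, `taken`, and `retarget` are hypothetical names, and only signed compares with off-by-one immediates are handled.

```cpp
#include <cassert>
#include <cstdint>

// Signed branch predicates applied to the result of `cmpwi reg, imm`.
enum class Pred { LT, LE, GT, GE };

// Hypothetical model of one compare-and-branch pair.
struct CmpBranch {
  Pred P;
  int64_t Imm;
};

// Evaluate whether the branch is taken for a concrete register value.
bool taken(const CmpBranch &CB, int64_t X) {
  switch (CB.P) {
  case Pred::LT: return X < CB.Imm;
  case Pred::LE: return X <= CB.Imm;
  case Pred::GT: return X > CB.Imm;
  case Pred::GE: return X >= CB.Imm;
  }
  return false;
}

// Try to rewrite CB so that it compares against Target while branching on
// exactly the same inputs. Returns false when no off-by-one rewrite applies.
bool retarget(CmpBranch &CB, int64_t Target) {
  // cmpwi takes a signed 16-bit immediate.
  assert(CB.Imm >= INT16_MIN && CB.Imm <= INT16_MAX);
  assert(Target >= INT16_MIN && Target <= INT16_MAX);
  if (CB.Imm == Target)
    return true;
  if (CB.P == Pred::GT && CB.Imm + 1 == Target) { // x > C   <=>  x >= C+1
    CB = {Pred::GE, Target};
    return true;
  }
  if (CB.P == Pred::GE && CB.Imm - 1 == Target) { // x >= C  <=>  x > C-1
    CB = {Pred::GT, Target};
    return true;
  }
  if (CB.P == Pred::LT && CB.Imm - 1 == Target) { // x < C   <=>  x <= C-1
    CB = {Pred::LE, Target};
    return true;
  }
  if (CB.P == Pred::LE && CB.Imm + 1 == Target) { // x <= C  <=>  x < C+1
    CB = {Pred::LT, Target};
    return true;
  }
  return false;
}
```

Calling `retarget` on `{Pred::GT, -1}` with a target of `0` yields `{Pred::GE, 0}`, matching the `bgt` to `bge` rewrite in the example. Once both pairs compare against the same immediate, the second compare is redundant and can be dropped.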

Differential Revision: https://reviews.llvm.org/D37211

llvm-svn: 312514


[ORC] Add a pair of ORC layers that forward object-layer operations via RPC. · 617fc356

Lang Hames authored Sep 05, 2017

This patch introduces RemoteObjectClientLayer and RemoteObjectServerLayer,
which can be used to forward ORC object-layer operations from a JIT stack in
the client to a JIT stack (consisting only of object-layers) in the server.

This is a new way to support remote-JITing in LLVM. The previous approach
(supported by OrcRemoteTargetClient and OrcRemoteTargetServer) used a
remote-mapping memory manager that sat "beneath" the JIT stack and sent
fully-relocated binary blobs to the server. The main advantage of the new
approach is that relocatable objects can be cached on the server and re-used
(if the code that they represent hasn't changed), whereas fully-relocated blobs
cannot (since the addresses they have been permanently bound to will change
from run to run).

llvm-svn: 312511


NewGVN: Fix PR 34430 - we need to look through predicateinfo copies to detect... · f9c9455d

Daniel Berlin authored Sep 05, 2017

NewGVN: Fix PR 34430 - we need to look through predicateinfo copies to detect self-cycles of phi nodes.  We also need to not ignore certain types of arguments when testing whether the phi has a backedge or was originally constant.

llvm-svn: 312510


NewGVN: Fix PR 34452 by passing instruction all the way down when we do... · 54a92fcc
Daniel Berlin authored Sep 05, 2017
```
NewGVN: Fix PR 34452 by passing instruction all the way down when we do aggregate value simplification

llvm-svn: 312509
```
NewGVN: Detect copies through predicateinfo · 1a582582
Daniel Berlin authored Sep 05, 2017
```
llvm-svn: 312508
```

NewGVN: Change where check for original instruction in phi of ops leader... · 4ad7e8d2

Daniel Berlin authored Sep 05, 2017

NewGVN: Change where check for original instruction in phi of ops leader finding is done. Where we had it before, we would stop looking when we hit the original instruction, but skip it. Now we skip it and keep looking.

llvm-svn: 312507


Sep 04, 2017

Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" · f71bb198
Sam McCall authored Sep 04, 2017
```
This crashes on boringSSL on PPC (will send reduced testcase)

This reverts commit r312328.

llvm-svn: 312490
```

[X86][AVX512] Add support for VPERMILPS v16f32 shuffle lowering (PR34382) · 91751b42

Simon Pilgrim authored Sep 04, 2017

Avoid use of VPERMPS where we don't need it by instead using the variable mask version of VPERMILPS for unary shuffles.

llvm-svn: 312486


[DebugInfo] - Fix for lld DWARF parsing of base address selection entries in range lists. · 2f95c8bc

George Rimar authored Sep 04, 2017

This fixes the wrong section index being computed for ranges when a
base address is used.

Based on David Blaikie's patch D36097.

Differential revision: https://reviews.llvm.org/D37214

llvm-svn: 312477


[GlobalISel][X86] G_PHI support. · 2661ae48
Igor Breger authored Sep 04, 2017
```
llvm-svn: 312473
```

LoopVectorize: MaxVF should not be larger than the loop trip count · 9a087a35

Zvi Rackover authored Sep 04, 2017

Summary:
Improve how MaxVF is computed while taking into account that MaxVF should not be larger than the loop's trip count.

Other than saving on compile-time by pruning the possible MaxVF candidates, this patch fixes pr34438 which exposed the following flow:
1. Short trip count identified -> Don't bail out, set OptForSize:=True to avoid tail-loop and runtime checks.
2. Compute MaxVF returned 16 on a target supporting AVX512.
3. OptForSize -> choose VF:=MaxVF.
4. Bail out because TripCount = 8, VF = 16, TripCount % VF !=0 means we need a tail loop.

With this patch, step 2 will choose MaxVF=8 based on TripCount.
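
The flow above can be illustrated with a minimal sketch. It is not the code from the patch: `clampMaxVF`, `floorPow2`, and `needsTailLoop` are hypothetical helpers, a trip count of 0 is assumed to mean "unknown", and rounding a non-power-of-two trip count down to a power of two is an assumption made for this sketch.

```cpp
#include <cassert>

// Largest power of two that is <= N, for N >= 1.
unsigned floorPow2(unsigned N) {
  assert(N >= 1);
  unsigned P = 1;
  while (P <= N / 2)
    P *= 2;
  return P;
}

// Hypothetical helper: limit the target's maximum vectorization factor by a
// known constant trip count. ConstTripCount == 0 means the count is unknown.
unsigned clampMaxVF(unsigned TargetMaxVF, unsigned ConstTripCount) {
  if (ConstTripCount == 0 || ConstTripCount >= TargetMaxVF)
    return TargetMaxVF;
  return floorPow2(ConstTripCount);
}

// A tail loop is needed when the trip count is not a multiple of VF.
bool needsTailLoop(unsigned TripCount, unsigned VF) {
  return TripCount % VF != 0;
}
```

With `TargetMaxVF = 16` and `TripCount = 8`, the unclamped choice `VF = 16` needs a tail loop (step 4 above), whereas `clampMaxVF(16, 8)` gives 8 and the loop vectorizes without one.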

Reviewers: Ayal, dorit, mkuper, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D37425

llvm-svn: 312472


[LoopUnroll][DebugInfo] Don't add metadata to unrolled remainder loop · 7cd826a3

Sam Parker authored Sep 04, 2017

Debug information can be, and was, corrupted when the runtime
remainder loop was fully unrolled. This is because a !null node can
be created instead of a unique one describing the loop. In this case,
the original node gets incorrectly updated with the NewLoopID
metadata.

When the remainder loop is going to be fully unrolled shortly
afterwards, there is no need to add loop metadata for it anyway.

Differential Revision: https://reviews.llvm.org/D37338

llvm-svn: 312471


[X86] Remove duplicate FMA patterns from the isel table. · 69e22789

Craig Topper authored Sep 04, 2017

This reorders some patterns to get tablegen to detect them as duplicates. Tablegen only detects duplicates when creating variants for commutable operations. It does not detect duplicates between the patterns as written in the td file. So we need to ensure all the FMA patterns in the td file are unique.

This also uses null_frag to remove some other unneeded patterns.

llvm-svn: 312470


[X86] Mark the FMA nodes as commutable so tablegen will auto generate the patterns. · af0b992b

Craig Topper authored Sep 04, 2017

This uses the capability introduced in r312464 to make SDNode patterns commutable on the first two operands.

This allows us to remove some of the extra FMA patterns that have to put loads and mask operands in different places to cover all cases. This even covers patterns that were missing: matching a load in the first operand with FMA4, and non-broadcast loads with masking for AVX512.

I believe this is causing us to generate some duplicate patterns because tablegen's isomorphism checks don't catch isomorphism between the patterns as written in the td. It only detects isomorphism in the commuted variants it tries to create. The unmasked 231 and 132 memory forms are isomorphic as written in the td file so we end up keeping both. I think we precommute the 132 pattern to fix this.

We also need a follow up patch to go back to the legacy FMA3 instructions and add patterns to the 231 and 132 forms which we currently don't have.

llvm-svn: 312469


[XRay][CodeGen] Use PIC-friendly code in XRay sleds and remove synthetic references in .text · ebc16590

Dean Michael Berris authored Sep 04, 2017

Summary:
This is a re-roll of D36615 which uses PLT relocations in the back-end
for the call to __xray_CustomEvent() when building in -fPIC and
-fxray-instrument mode.

Reviewers: pcc, djasper, bkramer

Subscribers: sdardis, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D37373

llvm-svn: 312466


[X86] Add a combine to recognize when we have two insert subvectors that... · 76f44015

Craig Topper authored Sep 04, 2017

[X86] Add a combine to recognize when we have two insert subvectors that together write the whole vector, but the starting vector isn't undef.

In this case we should replace the starting vector with undef.

llvm-svn: 312462


[X86] Remove some unnecessary curly braces and blank line. NFC · 959fc08f
Craig Topper authored Sep 04, 2017
```
llvm-svn: 312461
```

[X86] Add a combine to turn (insert_subvector zero, (insert_subvector zero, X,... · bc13af84

Craig Topper authored Sep 03, 2017

[X86] Add a combine to turn (insert_subvector zero, (insert_subvector zero, X, Idx), Idx) into an insert of X into the larger zero vector.

llvm-svn: 312460


[X86] Add more patterns to use moves to zero the upper portions of a vector... · fcf6bc55
Craig Topper authored Sep 03, 2017
```
[X86] Add more patterns to use moves to zero the upper portions of a vector register that I missed in r312450.

llvm-svn: 312459
```
[X86] Combine inserting a vector of zeros into a vector of zeros to just the larger vector. · 788fbe08
Craig Topper authored Sep 03, 2017
```
llvm-svn: 312458
```

Sep 03, 2017

[X86] Add patterns to turn an insert into lower subvector of a zero vector... · 8ee36ffb

Craig Topper authored Sep 03, 2017

[X86] Add patterns to turn an insert into lower subvector of a zero vector into a move instruction which will implicitly zero the upper elements.

Ideally we'd be able to emit the SUBREG_TO_REG without the explicit register->register move, but we'd need to be sure the producing operation would select something that guaranteed the upper bits were already zeroed.

llvm-svn: 312450


[X86] Add VBLENDPS/VPBLENDD to the execution domain fixing tables. · fa82efb5
Craig Topper authored Sep 03, 2017
```
llvm-svn: 312449
```

[X86] Canonicalize (concat_vectors X, zero) -> (insert_subvector zero, X, 0). · bb6506d2

Craig Topper authored Sep 03, 2017

In a future patch, I plan to teach isel to use a small vector move with implicit zeroing of the upper elements when it sees the (insert_subvector zero, X, 0) pattern.

llvm-svn: 312448


[X86] Fix crash on assert of non-simple type after type-legalization · 44cde949

Ayman Musa authored Sep 03, 2017

The function combineShuffleToVectorExtend in DAGCombine might generate a node with an illegal type after the "legalize types" phase, causing an assertion on non-simple types to fail afterwards.

This adds a type check for the case where the combine runs after the type legalization pass.

Differential Revision: https://reviews.llvm.org/D37330

llvm-svn: 312438


[X86] Add output register to BTC/BTR/BTS instructions. · fe96ff73
Craig Topper authored Sep 03, 2017
```
llvm-svn: 312432
```

[ORC] Add an Error return to the JITCompileCallbackManager::grow method. · 8a6bab78

Lang Hames authored Sep 03, 2017

Calling grow may result in an error if, for example, this is a callback
manager for a remote target. We need to be able to return this error to the
caller.

llvm-svn: 312429


Move some CLI utils out of llvm-isel-fuzzer and into the library · 7f28d732

Justin Bogner authored Sep 02, 2017

FuzzMutate might not be the best place for these, but it makes more
sense than an entirely new library for now. This will make setting up
fuzz targets with consistent CLI handling easier.

llvm-svn: 312425


Sep 02, 2017

[X86] Teach fastisel to handle zext/sext i8->i16 and sext i1->i8/i16/i32/i64 · 619b759a

Craig Topper authored Sep 02, 2017

Summary:
ZExt and SExt from i8 to i16 aren't implemented in the autogenerated fast isel table because normal isel does a zext/sext to 32-bits and a subreg extract to avoid a partial register write or false dependency on the upper bits of the destination. This means without handling in fast isel we end up triggering a fast isel abort.

We had no custom sign extend handling at all so while I was there I went ahead and implemented sext i1->i8/i16/i32/i64 which was also missing. This generates an i1->i8 sign extend using a mask with 1, then an 8-bit negate, then continues with a sext from i8. A better sequence would be a wider and/negate, but would require more custom code.
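
The i1 sign-extension sequence described above (mask with 1, 8-bit negate, then sext from i8) can be modeled in plain C++ as a rough sketch. `signExtendI1` is a hypothetical helper, not fast-isel code, and it assumes the i1 value lives in bit 0 of an 8-bit register whose upper bits are unspecified.

```cpp
#include <cstdint>

// Hypothetical model of the three-step sequence. Reg is an 8-bit register
// holding the i1 value in bit 0; the other bits are assumed to be garbage.
int32_t signExtendI1(uint8_t Reg) {
  // Step 1: mask with 1 to isolate the i1 value.
  uint8_t Bit = Reg & 1;
  // Step 2: 8-bit negate, so 0 stays 0 and 1 becomes -1 (all bits set).
  int8_t AsI8 = static_cast<int8_t>(-static_cast<int>(Bit));
  // Step 3: ordinary sign extension from i8 to the wider type.
  return static_cast<int32_t>(AsI8);
}
```

The result is 0 for a false input and -1 (all ones) for a true input, regardless of what the upper seven bits of the source register held.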

Fast isel tests are a mess and I couldn't find a good home for the tests so I created a new one.

The test pr34381.ll had to have fast-isel removed because it was relying on a fast isel abort to hit the bug. The test case still seems valid with fast-isel disabled though some of the instructions changed.

Reviewers: spatel, zvi, igorb, guyblank, RKSimon

Reviewed By: guyblank

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37320

llvm-svn: 312422
