Commits · 052f14ef5a662b58d6db4acfae253252bcb6bdcd · Roger Ferrer / llvm-epi

Jan 24, 2018

Fix up and document controlling ccache via CMake options. · 052f14ef
Paul Robinson authored Jan 24, 2018
```
Patch by Matthew Davis!

Differential Revision: https://reviews.llvm.org/D41757

llvm-svn: 323357
```
052f14ef

[AMDGPU] Make sure all super regs of reserved regs are marked reserved. · c4796d47

Geoff Berry authored Jan 24, 2018

Summary:
Move reserveRegisterTuples into AMDGPURegisterInfo and use it in
R600RegisterInfo::getReservedRegs and
R600InstrInfo::reserveIndirectRegisters to ensure that all super
registers of reserved registers are also marked as reserved.

Before this change, under certain circumstances, the registers %t1_x and
%t1_xyzw would be marked as reserved, but %t1_xy and %t1_xyz would not
be, leading to the register allocator sometimes assigning a register to
%t1_xy, which is invalid since %t1_x is reserved.
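
As a rough illustration of the invariant described above, the sketch below reserves a register together with every register that contains it. The `SuperRegMap` table and the `reserveWithSuperRegs` helper are hypothetical names for this example only; they are not the `reserveRegisterTuples` implementation from the patch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy register hierarchy: maps a register to its direct super-registers.
// The register names mirror the commit message; the table is an assumption.
using SuperRegMap = std::map<std::string, std::vector<std::string>>;

// Reserve Reg and, transitively, every register that contains it.
void reserveWithSuperRegs(const SuperRegMap &Supers, const std::string &Reg,
                          std::set<std::string> &Reserved) {
  if (!Reserved.insert(Reg).second)
    return; // Already visited through this helper.
  auto It = Supers.find(Reg);
  if (It == Supers.end())
    return; // No super-registers recorded for Reg.
  for (const std::string &Super : It->second)
    reserveWithSuperRegs(Supers, Super, Reserved);
}
```

Starting from an empty reserved set, reserving `t1_x` in a chain `t1_x -> t1_xy -> t1_xyz -> t1_xyzw` marks all four registers, so the allocator can no longer pick `t1_xy`.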

Reviewers: arsenm, tstellar, MatzeB, qcolombet

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D42448

llvm-svn: 323356

c4796d47

Revert r321751, "StructurizeCFG: Fix broken backedge detection" · 4afb64e4

Nicolai Haehnle authored Jan 24, 2018

It causes regressions in various OpenGL test suites.

Keep the test cases introduced by r321751 as XFAIL, and add a test case
for the regression.

Change-Id: I90b4cc354f68cebe5fcef1f2422dc8fe1c6d3514
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36015
llvm-svn: 323355

4afb64e4

[ARM] Expand long shifts for Thumb1 to __aeabi_ calls · 665784f1

Weiming Zhao authored Jan 24, 2018

Summary: For long shifts, the inlined version takes about 20 instructions on Thumb1. To avoid the code bloat, expand to __aeabi_ calls if target is Thumb1.
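
To give a feel for why the inlined form is long, here is a minimal sketch of a 64-bit logical shift left built from 32-bit halves. `Pair32` and `shiftLeft64` are hypothetical names for this example; the code is not what the backend emits, only the kind of logic a library helper has to perform.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value split into two 32-bit registers, as on a 32-bit target.
struct Pair32 {
  uint32_t Lo;
  uint32_t Hi;
};

// Logical shift left of a 64-bit value using only 32-bit operations.
// Every compare, branch, shift, and OR here becomes instructions when inlined.
Pair32 shiftLeft64(Pair32 V, unsigned Amt) {
  assert(Amt < 64 && "shift amount out of range");
  if (Amt == 0)
    return V;
  if (Amt >= 32)
    return {0, V.Lo << (Amt - 32)}; // Low word moves entirely into the high word.
  return {V.Lo << Amt, (V.Hi << Amt) | (V.Lo >> (32 - Amt))};
}
```

For instance, shifting `{Lo = 0x80000001, Hi = 0}` left by 1 carries the top bit of the low word into the high word.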

Reviewers: samparker

Reviewed By: samparker

Subscribers: samparker, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42401

llvm-svn: 323354

665784f1

[X86] Fix some inconsistencies in the itineraries and Sched for (V)PEXTRW/(V)PINSRW · 05af43fb
Craig Topper authored Jan 24, 2018
```
The weirdest being that PEXTRWrr was tagged as a memory operation.

llvm-svn: 323353
```
05af43fb

[X86] Adjust names of PINSRW/PEXTRW instructions between MMX/SSE/AVX/AVX512 for... · b85b484f

Craig Topper authored Jan 24, 2018

[X86] Adjust names of PINSRW/PEXTRW instructions between MMX/SSE/AVX/AVX512 for consistency and to maybe enable more regular expression compaction in the scheduler models. NFCI

llvm-svn: 323352

b85b484f

[X86] Remove '(_REV)?' from a bunch of scheduler regular expressions. NFC · 23cc866c

Craig Topper authored Jan 24, 2018

The regexes are already treated as a prefix match, so checking for optional text at the end provides no value. Instead, the top-level question mark prevents the binary search optimization in tablegen from kicking in.
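
The redundancy can be illustrated with a small sketch using the standard `<regex>` library rather than tablegen's own matcher. `matchesAsPrefix` is a hypothetical helper, and the instruction names are only illustrative inputs.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if Pattern matches some prefix of Name: the match must start
// at the first character but does not have to consume the whole string.
bool matchesAsPrefix(const std::string &Pattern, const std::string &Name) {
  std::regex Re(Pattern);
  return std::regex_search(Name, Re, std::regex_constants::match_continuous);
}
```

Under prefix semantics, `ADD32rr` and `ADD32rr(_REV)?` accept exactly the same names, since anything the optional group could consume is already allowed to follow the prefix.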

llvm-svn: 323351

23cc866c

[ThinLTO] Add call edges' relative block frequency to per-module summary. · 5f7aff9a

Easwaran Raman authored Jan 24, 2018

Summary:
This allows relative block frequency of call edges to be passed to the
thinlink stage where it will be used to compute synthetic entry counts
of functions.

Reviewers: tejohnson, pcc

Subscribers: mehdi_amini, llvm-commits, inglorion

Differential Revision: https://reviews.llvm.org/D42212

llvm-svn: 323349

5f7aff9a

[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. · 4bd8e533

Alexey Bataev authored Jan 24, 2018

Summary:
If the same value is going to be vectorized several times in the same
tree entry, this entry is considered to be a gather entry and the cost of
this gather is counted as the cost of InsertElementInstrs for each gathered
value. But we can consider these elements as ShuffleInstr with
SK_PermuteSingle shuffle kind.

Reviewers: spatel, RKSimon, mkuper, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38697

llvm-svn: 323348

4bd8e533

[Hexagon] Run late copy propagation and dead code elimination passes · cf3ad584
Krzysztof Parzyszek authored Jan 24, 2018
```
llvm-svn: 323346
```
cf3ad584

Handle R_386_PLT32 in RuntimeDyldELF. · fc16f76e
Rafael Espindola authored Jan 24, 2018
```
This should fix the 32 bit buildbots.

llvm-svn: 323344
```
fc16f76e

InstSimplify: If divisor element is undef simplify to undef · 51f0d64b

Zvi Rackover authored Jan 24, 2018

Summary:
If any vector divisor element is undef, we can arbitrarily choose it to be
zero, which would make the div/rem an undef value by definition.

Reviewers: spatel, reames

Reviewed By: spatel

Subscribers: magabari, llvm-commits

Differential Revision: https://reviews.llvm.org/D42485

llvm-svn: 323343

51f0d64b

[globalisel] Introduce LegalityQuery to better encapsulate the legalizer decisions. NFC. · 262ed0ec

Daniel Sanders authored Jan 24, 2018

Summary:
`getAction(const InstrAspect &) const` breaks encapsulation by exposing
the smaller components that are used to decide how to legalize an
instruction.

This is a problem because we need to change the implementation of
LegalizerInfo so that it's able to describe particular type combinations
rather than just cartesian products of types.

For example, declaring the following
  setAction({..., 0, s32}, Legal)
  setAction({..., 0, s64}, Legal)
  setAction({..., 1, s32}, Legal)
  setAction({..., 1, s64}, Legal)
currently declares these type combinations as legal:
  {s32, s32}
  {s64, s32}
  {s32, s64}
  {s64, s64}
but we currently have no means to say that, for example, {s64, s32} is
not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/
G_UNMERGE_VALUES have relationships between the types that are currently
described incorrectly.

Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics
differently to atomics. The necessary information is in the MMO but we have no
way to use this in the legalizer. Similarly, there is currently no way for the
register type and the memory type to differ so there is no way to cleanly
represent extending-load/truncating-store in a way that can't be broken by
optimizers (resulting in illegal MIR).

This patch introduces LegalityQuery which provides all the information
needed by the legalizer to make a decision on whether something is legal
and how to legalize it.
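
The difference between the two approaches can be sketched in a few lines. Everything below (`PerIndexTable`, `QueryTable`, `Query`, and the example rule) is a hypothetical toy model, not the `LegalizerInfo` or `LegalityQuery` API; types are represented by their bit width only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <vector>

enum class Action { Legal, Unsupported };

// Per-index table: each type index is decided on its own, so the legal
// combinations always form a cartesian product of the per-index sets.
struct PerIndexTable {
  std::vector<std::set<unsigned>> LegalSizes; // LegalSizes[Idx] = legal widths
  Action getAction(const std::vector<unsigned> &Types) const {
    for (std::size_t Idx = 0; Idx < Types.size(); ++Idx)
      if (Idx >= LegalSizes.size() || !LegalSizes[Idx].count(Types[Idx]))
        return Action::Unsupported;
    return Action::Legal;
  }
};

// Query-based table: the rule sees all types of the instruction at once,
// so it can reject a single combination such as {s64, s32}.
struct Query {
  std::vector<unsigned> Types;
};
struct QueryTable {
  std::function<Action(const Query &)> Rule;
  Action getAction(const Query &Q) const { return Rule(Q); }
};
```

With `{32, 64}` allowed at both indices, the per-index table has no way to reject `{64, 32}`, whereas a query rule such as "type 0 must not be wider than type 1" (an assumed rule for this example) can.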

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, reames, bogner

Reviewed By: bogner

Subscribers: bogner, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D42244

llvm-svn: 323342

262ed0ec

[NFC] Make magic number for DJB hash function customizable. · 5803a674

Jonas Devlieghere authored Jan 24, 2018

This allows us to specify the magic number for the DJB hash function.
This feature is needed by dsymutil to emit Apple types accelerator
table.
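
For reference, a minimal sketch of a DJB (Bernstein) hash with a configurable starting value follows. The signature, including the parameter name `Seed` and the use of `std::string`, is an assumption for illustration and may differ from the function changed in this commit; 5381 is the conventional DJB starting value.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// DJB string hash with a caller-supplied starting value.
uint32_t djbHash(const std::string &Str, uint32_t Seed = 5381) {
  uint32_t H = Seed;
  for (unsigned char C : Str)
    H = (H << 5) + H + C; // H * 33 + C, wrapping modulo 2^32.
  return H;
}
```

Callers that need a different table format can pass their own starting value, while existing callers keep the default.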

llvm-svn: 323341

5803a674

[dsymutil] Make NonRelocatableStringPool a wrapper around DwarfStringPoolEntry. NFC · e7d3d907

Jonas Devlieghere authored Jan 24, 2018

This is needed in order to use our StringPool entries in the Apple
accelerator tables.

As this is NFC we rely on the existing tests for correctness.

llvm-svn: 323339

e7d3d907

[ValueTracking] add recursion depth param to matchSelectPattern · 1d91ec34

Sanjay Patel authored Jan 24, 2018

We're getting bug reports:
https://bugs.llvm.org/show_bug.cgi?id=35807
https://bugs.llvm.org/show_bug.cgi?id=35840
https://bugs.llvm.org/show_bug.cgi?id=36045
...where we blow up the stack in value tracking because other passes are sending 
in selects that have an operand that is itself the select.

We don't currently have a reliable way to avoid analyzing dead code that may take 
non-standard forms, so bail out when things go too far.

This mimics the recursion depth limitations in other parts of value tracking.

Unfortunately, this pushes the underlying problems for other passes (jump-threading,
simplifycfg, correlated-propagation) into hiding. If someone wants to uncover those
again, the first draft of this patch on Phab would do that (it would assert rather
than bail out).
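
The bail-out strategy can be sketched on a toy data structure. `Node`, `analyze`, and the `MaxDepth` value below are hypothetical and only model the idea; they are not the `matchSelectPattern` code.

```cpp
#include <cassert>

// Toy expression node. A node whose operand points back at itself models
// the self-referential selects described in the bug reports.
struct Node {
  const Node *Operand = nullptr;
};

constexpr unsigned MaxDepth = 6; // Assumed limit for this sketch.

// Returns true if the operand chain ends within MaxDepth steps.
// On a cycle the depth budget runs out and the analysis gives up.
bool analyze(const Node *N, unsigned Depth = 0) {
  assert(N && "expected a node");
  if (Depth >= MaxDepth)
    return false; // Bail out instead of recursing until the stack overflows.
  if (!N->Operand)
    return true;
  return analyze(N->Operand, Depth + 1);
}
```

A well-formed chain is analyzed normally, while a node that refers to itself simply yields a conservative "don't know" answer.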

Differential Revision: https://reviews.llvm.org/D42442

llvm-svn: 323331

1d91ec34

X86 Tests: Add more sdiv combine cases. NFC · 22bfa7e5
Zvi Rackover authored Jan 24, 2018
```
Add cases with vector non-splat pow2 constant divisor.

llvm-svn: 323329
```
22bfa7e5

Regenerate shuffle sink test · f15886eb
Simon Pilgrim authored Jan 24, 2018
```
llvm-svn: 323328
```
f15886eb

Reverted 323321. · d53504e3
Amjad Aboud authored Jan 24, 2018
```
llvm-svn: 323326
```
d53504e3

[AArch64] Avoid unnecessary vector byte-swapping in big-endian · 9b3d4c01

Pablo Barrio authored Jan 24, 2018

Summary:
Loads/stores of some NEON vector types are promoted to other vector
types with different lane sizes but the same vector size. This is not a
problem in little-endian but, in big-endian, it requires additional
byte reversals to preserve the lane ordering while keeping the right
endianness of the data inside each lane.
For example:

%1 = load <4 x half>, <4 x half>* %p

results in the following assembly:

ld1 { v0.2s }, [x1]
rev32 v0.4h, v0.4h

This patch changes the promotion of these loads/stores so that the
actual vector load/store (LD1/ST1) takes care of the endianness
correctly and there is no need for further byte reversals. The
previous code now results in the following assembly:

ld1 { v0.4h }, [x1]

Reviewers: olista01, SjoerdMeijer, efriedma

Reviewed By: efriedma

Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D42235

llvm-svn: 323325

9b3d4c01

[Hexagon] Remove unused HexagonISD opcodes, NFC · 5aef4b59
Krzysztof Parzyszek authored Jan 24, 2018
```
llvm-svn: 323324
```
5aef4b59

[DebugInfo] Emit DWARF reference for DIVariable 'count' in DISubrange · dc00becd

Sander de Smalen authored Jan 24, 2018

Summary:
This patch implements the codegen of DWARF debug info for non-constant
'count' fields for DISubrange.

This is patch [2/3] in a series to extend LLVM's DISubrange Metadata
node to support debugging of C99 variable length arrays and vectors with
runtime length like the Scalable Vector Extension for AArch64. It is
also a first step towards representing more complex cases like arrays
in Fortran.

Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie

Reviewed By: aprantl

Subscribers: fhahn, aemerson, rengolin, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D41696

llvm-svn: 323323

dc00becd

[InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine). · e4453233

Amjad Aboud authored Jan 24, 2018

Combine expression patterns to form expressions with fewer, simple instructions.
This pass does not modify the CFG.

For example, this pass reduces the width of expressions post-dominated by a
TruncInst to a smaller width when applicable.

It differs from the instcombine pass in that it contains pattern optimizations
that require more than O(1) complexity; thus, it should run fewer times than
the instcombine pass.
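
The width reduction relies on the fact that truncation commutes with operations such as addition. A small sketch, with hypothetical function names and an 8-bit/32-bit pairing chosen only for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Wide form: extend both operands to 32 bits, add, then truncate to 8 bits.
uint8_t addWide(uint8_t A, uint8_t B) {
  uint32_t Sum = static_cast<uint32_t>(A) + static_cast<uint32_t>(B);
  return static_cast<uint8_t>(Sum);
}

// Narrow form: the same value computed modulo 2^8. C++ promotes the operands
// to int, so the final cast is what models an 8-bit add here.
uint8_t addNarrow(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(A + B);
}
```

Because only the low 8 bits survive the truncation, both forms agree for every input, which is what allows the wide expression to be rewritten at the narrower width.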

Differential Revision: https://reviews.llvm.org/D38313

llvm-svn: 323321

e4453233

[X86][SSE] Avoid calls to combineX86ShufflesRecursively that can't combine to... · f26df478

Simon Pilgrim authored Jan 24, 2018

[X86][SSE] Avoid calls to combineX86ShufflesRecursively that can't combine to target shuffles (PR32037)

Don't bother making recursive calls to combineX86ShufflesRecursively if we have more shuffle source operands than will be combined together with the remaining recursive depth.

See https://bugs.llvm.org/show_bug.cgi?id=32037#c26 and https://bugs.llvm.org/show_bug.cgi?id=32037#c27 for the reduction in compile times from this patch.

Differential Revision: https://reviews.llvm.org/D42378

llvm-svn: 323320

f26df478

Fix typos of occurred and occurrence · 21e545d0
Malcolm Parsons authored Jan 24, 2018
```
llvm-svn: 323318
```
21e545d0

Fixes Sphinx issue ('undefined label') introduced in r323313. · 1cb9431e
Sander de Smalen authored Jan 24, 2018
```
(and also slightly reformatted the related lines to look better in
the rendered HTML)

llvm-svn: 323317
```
1cb9431e

[llvm-opt-fuzzer] Add couple of popular passes · 50acecf2
Igor Laevsky authored Jan 24, 2018
```
Differential Revision: https://reviews.llvm.org/D42410

llvm-svn: 323314
```
50acecf2

[Metadata] Extend 'count' field of DISubrange to take a metadata node · fdf40917

Sander de Smalen authored Jan 24, 2018

Summary:
This patch extends the DISubrange 'count' field to take either a
(signed) constant integer value or a reference to a DILocalVariable
or DIGlobalVariable.

This is patch [1/3] in a series to extend LLVM's DISubrange Metadata
node to support debugging of C99 variable length arrays and vectors with
runtime length like the Scalable Vector Extension for AArch64. It is
also a first step towards representing more complex cases like arrays
in Fortran.

Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie

Reviewed By: aprantl

Subscribers: rnk, probinson, fhahn, aemerson, rengolin, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D41695

llvm-svn: 323313

fdf40917

[DAGCombiner] Bail out if vector size is not a multiple · e8404780

Sven van Haastregt authored Jan 24, 2018

For the included test case, the DAG transformation
  concat_vectors(scalar, undef) -> scalar_to_vector(sclr)
would attempt to create a v2i32 vector for a v9i8
concat_vector.  Bail out to avoid creating a bitcast with
mismatching sizes later on.
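
The size mismatch is simple arithmetic: a v9i8 vector is 72 bits wide, which is not a whole number of 32-bit scalars, while v2i32 is 64 bits. A minimal sketch of such a guard follows; `isWholeMultiple` is a hypothetical helper, not the check added by the patch.

```cpp
#include <cassert>

// Returns true if a vector of NumElts elements, each EltBits wide, can be
// reinterpreted as a whole number of ScalarBits-wide scalars.
bool isWholeMultiple(unsigned NumElts, unsigned EltBits, unsigned ScalarBits) {
  assert(ScalarBits != 0 && "scalar width must be non-zero");
  unsigned VecBits = NumElts * EltBits;
  return VecBits % ScalarBits == 0;
}
```

For v9i8 and a 32-bit scalar the check fails (72 is not divisible by 32), so the transformation would be skipped; for v8i8 it succeeds.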

Differential Revision: https://reviews.llvm.org/D42379

llvm-svn: 323312

e8404780

[Doc] Guideline on adding exception handling support for a target · 83a56158

David Chisnall authored Jan 24, 2018

Summary:
This is the first attempt to write down a guideline on adding exception handling support for a target. The content is based on the discussion in [1]. If you know an exception handling expert, please add them as a reviewer. Thanks.

[1] http://lists.llvm.org/pipermail/llvm-dev/2018-January/120405.html

Reviewers: t.p.northover, theraven, nemanjai

Reviewed By: theraven

Subscribers: sdardis, llvm-commits

Differential Revision: https://reviews.llvm.org/D42178

llvm-svn: 323311

83a56158

[NFC] Remove overconfident assert from IRCE · 0f720e12

Max Kazantsev authored Jan 24, 2018

This patch removes the assert that SCEV is able to prove that a value is
non-negative. In fact, SCEV can sometimes be unable to do this because
its cache does not update properly. This assert will be restored once this
problem is resolved.

llvm-svn: 323309

0f720e12

[ARM] Call __chkstk for dynamic stack allocation in all windows environments · 4ed94a06

Martin Storsjö authored Jan 24, 2018

This matches what MSVC does for alloca() function calls on ARM.
Even if MSVC doesn't support VLAs at the language level, it does
support the alloca function.

On the clang level, both the _alloca() (when emulating MSVC, which is
what the alloca() function expands to) and __builtin_alloca() builtin
functions, and VLAs, map to the same LLVM IR "alloca" function - so
within LLVM they're not distinguishable from each other.

Differential Revision: https://reviews.llvm.org/D42292

llvm-svn: 323308

4ed94a06

[GlobalMerge] Don't merge dllexport globals · e8248f2e

Martin Storsjö authored Jan 24, 2018

Merging such globals loses the dllexport attribute. Add a test
to check that normal globals still are merged.

Differential Revision: https://reviews.llvm.org/D42127

llvm-svn: 323307

e8248f2e

[X86] Move 'Y' to correct place in FMA4 regular expression in Znver1 scheduler model. · 069e1dd8

Craig Topper authored Jan 24, 2018

I think these instructions used to be named differently and the regular expression reflected that. I guess we must have correct itinerary information that made this not matter for the scheduler test?

llvm-svn: 323305

069e1dd8

[X86] Rename 256-bit VFRCZ instructions to have the Y before the rr/rm to... · a55ac7b7
Craig Topper authored Jan 24, 2018
```
[X86] Rename 256-bit VFRCZ instructions to have the Y before the rr/rm to match other instructions. NFC

llvm-svn: 323304
```
a55ac7b7

[X86] Remove redundant regular expression from the Znver1 scheduler model. NFC · fd68c2d0
Craig Topper authored Jan 24, 2018
```
llvm-svn: 323303
```
fd68c2d0

[NFC] fix trivial typos in comments · 501931b1
Hiroshi Inoue authored Jan 24, 2018
```
"the the" -> "the"

llvm-svn: 323302
```
501931b1

[X86] Use ISD::SIGN_EXTEND instead of X86ISD::VSEXT for mask to xmm/ymm/zmm conversion · 0321ebc0

Craig Topper authored Jan 24, 2018

There are a couple tricky things with this patch.

I had to add an override of isVectorLoadExtDesirable to stop DAG combine from combining sign_extend with loads after legalization, since we legalize sextload using a load+sign_extend. Overriding this hook actually prevents a lot of sextloads from being created in the first place.

I also had to add isel patterns because DAG combine blindly combines sign_extend+truncate to a smaller sign_extend which defeats what legalization was trying to do.

Differential Revision: https://reviews.llvm.org/D42407

llvm-svn: 323301

0321ebc0

[Dominators] Introduce DomTree verification levels · ffb4fb7f

Jakub Kuderski authored Jan 24, 2018

Summary:
Currently, there are 2 ways to verify a DomTree:
* `DT.verify()` -- runs full tree verification, checks all the properties, and gives a reason why the tree is incorrect. This is run when EXPENSIVE_CHECKS are enabled or when the `-verify-dom-info` flag is set.
* `DT.verifyDominatorTree()` -- constructs a fresh tree and compares it against the old one. This does not check any other tree properties (DFS number, levels), nor does it ensure that the construction algorithm is correct. Used by some passes inside assertions.

This patch introduces DomTree verification levels that try to close the gap between the two ways of checking trees by introducing 3 verification levels:
- Full -- checks all properties, but can be slow (O(N^3)). Used when manually requested (e.g. `assert(DT.verify())`) or when `-verify-dom-info` is set.
- Basic -- checks all properties except the sibling property, and compares the current tree with a freshly constructed one instead. This should catch almost all errors, but does not guarantee that the construction algorithm is correct. Used when EXPENSIVE_CHECKS are enabled.
- Fast -- checks only basic properties (reachability, DFS numbers, levels, roots), and compares with a fresh tree. This is meant to replace the legacy `DT.verifyDominatorTree()` and in my tests doesn't cause any noticeable performance impact even in the most pessimistic examples.

When used to verify dom tree wrapper pass analysis on sqlite3, the 3 new levels make `opt -O3` take the following amount of time on my machine:
- no verification: 8.3s
- `DT.verify(VerificationLevel::Fast)`: 10.1s
- `DT.verify(VerificationLevel::Basic)`: 44.8s
- `DT.verify(VerificationLevel::Full)`: 1m 46.2s
(and the previous `DT.verifyDominatorTree()` is within the noise of the Fast level)

This patch makes `DT.verifyDominatorTree()` pick between the 3 verification levels depending on EXPENSIVE_CHECKS and `-verify-dom-info`.
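
One way to picture that selection is the sketch below. The `VerificationLevel` names come from the commit message; the `pickLevel` helper, its boolean parameters, and the precedence (the command-line flag wins over EXPENSIVE_CHECKS) are assumptions made for illustration.

```cpp
#include <cassert>

enum class VerificationLevel { Fast, Basic, Full };

// Chooses a verification level from the two switches named in the commit.
// Assumed precedence: -verify-dom-info first, then EXPENSIVE_CHECKS.
VerificationLevel pickLevel(bool VerifyDomInfo, bool ExpensiveChecks) {
  if (VerifyDomInfo)
    return VerificationLevel::Full;
  if (ExpensiveChecks)
    return VerificationLevel::Basic;
  return VerificationLevel::Fast;
}
```

Given the timings listed above, defaulting to Fast keeps ordinary builds close to the unverified run time, with the slower levels reserved for explicit requests.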

Reviewers: dberlin, brzycki, davide, grosser, dmgreen

Reviewed By: dberlin, brzycki

Subscribers: MatzeB, llvm-commits

Differential Revision: https://reviews.llvm.org/D42337

llvm-svn: 323298

ffb4fb7f

Don't assume a null GV is local for ELF and MachO. · 432a587c

Rafael Espindola authored Jan 24, 2018

This is already a simplification, and should help with avoiding a plt
reference when calling an intrinsic with -fno-plt.

With this change we return false for null GVs, so the caller only
needs to check the new metadata to decide if it should use foo@plt or
*foo@got.

llvm-svn: 323297

432a587c