Commits · e35a4491404bfa1a73c4f07cb65a97190a0c7f68 · Lorenzo Albano / LLVM bpEVL

Aug 17, 2017

[Dominators] Teach LoopUnswitch to use the incremental API · e35a4491

Jakub Kuderski authored Aug 17, 2017

Summary:
This patch makes LoopUnswitch use the new incremental API for updating dominators.
It also updates SplitCriticalEdge, as it is called in LoopUnswitch.

There doesn't seem to be any noticeable performance difference when bootstrapping clang with this patch.

Reviewers: dberlin, davide, sanjoy, grosser, chandlerc

Reviewed By: davide, grosser

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35528

llvm-svn: 311093

[AVX512] Don't switch unmasked subvector insert/extract instructions when AVX512DQI is enabled. · 3a622a14

Craig Topper authored Aug 17, 2017

There's no reason to switch instructions with and without DQI. It just creates extra isel patterns and test divergences.

There is however value in enabling the masked version of the instructions with DQI.

This required introducing some new multiclasses to enable this splitting.

Differential Revision: https://reviews.llvm.org/D36661

llvm-svn: 311091

[X86] Remove memopmmx pattern fragment · 59608480

Craig Topper authored Aug 17, 2017

Summary: Just like the FIXME says, there is no alignment requirement for MMX.

Reviewers: RKSimon, zvi, igorb

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36815

llvm-svn: 311090

[dfsan] Add explicit zero extensions for shadow parameters in function wrappers. · b5205c69

Simon Dardis authored Aug 17, 2017

In the case where dfsan provides a custom wrapper for a function,
shadow parameters are added for each parameter of the function.
These parameters are i16s. For targets which do not consider this
a legal type, the lack of sign extension information would cause
LLVM to generate anyexts around their usage with phi variables
and calling convention logic.

Address this by introducing zero exts for each shadow parameter.

Reviewers: pcc, slthakur

Differential Revision: https://reviews.llvm.org/D33349

llvm-svn: 311087

[DAGCombiner] Add support for non-uniform constant vectors to (mul x, (1 << c)) -> x << c · 8be9f4af

Simon Pilgrim authored Aug 17, 2017

llvm-svn: 311083

[X86] Refactoring of X86TargetLowering::EmitLoweredSelect. NFC. · 19f15843

Amjad Aboud authored Aug 17, 2017

Authored by aivchenk
Differential Revision: https://reviews.llvm.org/D35685

llvm-svn: 311082

[Verifier] Avoid visiting DIGlobalVariables twice. · 903fd3ea

Davide Italiano authored Aug 17, 2017

We currently visit them twice.
Once, through `visitMDNode()` -> (the code generated by)
  `../include/llvm/IR/Metadata.def:109` -> `visitDIGlobalVariable()`
Then, through `visitMDNode()` -> `visitDIGlobalVariableExpression()`
  -> `visitDIGlobalVariable()`

This results in verification failures printed twice, e.g.:

  $ ./opt -verify ../../test/DebugInfo/pr34186.ll
  missing global variable type
  !4 = distinct !DIGlobalVariable(name: "pat", scope: !0,
    file: !1, line: 27, isLocal: true, isDefinition: true)
  missing global variable type
  !4 = distinct !DIGlobalVariable(name: "pat", scope: !0,
    file: !1, line: 27, isLocal: true, isDefinition: true)
  ./opt: ../../test/DebugInfo/pr34186.ll: error: input module is broken!

The patch removes one call so we ensure each GV is visited exactly once.

Differential Revision:  https://reviews.llvm.org/D36797

llvm-svn: 311081

[LV] Using VPlan to model the vectorized code and drive its transformation · 66278833

Ayal Zaks authored Aug 17, 2017

VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This
patch introduces the VPlan model into LV and uses it to represent the vectorized
code and drive the generation of vectorized IR.

In this patch VPlan models the vectorized loop body: the vectorized control-flow
is represented using VPlan's Hierarchical CFG, with predication refactored from
being a post-vectorization-step into a vectorization planning step modeling
if-then VPRegionBlocks, and generating code inline with non-predicated code. The
vectorized code within each VPBasicBlock is represented as a sequence of
Recipes, each responsible for modelling and generating a sequence of IR
instructions. To keep the size of this commit manageable the Recipes in this
patch are coarse-grained and capture large chunks of LV's code-generation logic.
The constructed VPlans are dumped in dot format under -debug.

This commit retains current vectorizer output, except for minor instruction
reorderings; see associated modifications to lit tests.

For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst
and its references.

Authors: Gil Rapaport and Ayal Zaks

Differential Revision: https://reviews.llvm.org/D32871

llvm-svn: 311077

Re-commit: [globalisel][tablegen] Support zero-instruction emission. · edd0784b

Daniel Sanders authored Aug 17, 2017

Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.

The previous commit failed on Windows machines due to a flaw in the sort
predicate which allowed both A < B < C and B == C to be satisfied
simultaneously. The cause of this was some sloppiness in the priority order of
G_CONSTANT instructions compared to other instructions. These had equal priority
because it makes no difference, however there were operands that had higher priority
than G_CONSTANT but lower priority than any other instruction. As a result, a
priority order between G_CONSTANT and other instructions must be enforced to
ensure the predicate defines a strict weak order.
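
The failure mode described above can be illustrated with a small sketch. This is not the TableGen code from the patch: the `Kind` enum, `flawedLess`, `fixedLess`, `rankOf`, and `isTransitive` are hypothetical names chosen for illustration, and the ranking used in the repaired predicate is just one possible total order. The flawed predicate treats two kinds as tied while a third kind sits strictly between them, so it is not transitive and is not a strict weak order.

```cpp
#include <cassert>

// Hypothetical node kinds, for illustration only.
enum class Kind { Instruction, Operand, Constant };

// Flawed predicate: Instruction and Constant are treated as equivalent,
// yet Operand ranks strictly between them.
bool flawedLess(Kind A, Kind B) {
  if (A == Kind::Instruction && B == Kind::Operand)
    return true;
  if (A == Kind::Operand && B == Kind::Constant)
    return true;
  return false; // Instruction vs. Constant: neither is less than the other.
}

// Repaired predicate: every kind gets a distinct rank, so the ordering
// between Instruction and Constant is enforced.
int rankOf(Kind K) {
  switch (K) {
  case Kind::Instruction:
    return 0;
  case Kind::Operand:
    return 1;
  case Kind::Constant:
    return 2;
  }
  return 3; // Unreachable for valid kinds.
}

bool fixedLess(Kind A, Kind B) { return rankOf(A) < rankOf(B); }

// Checks transitivity (A < B and B < C implies A < C) over all kinds.
bool isTransitive(bool (*Less)(Kind, Kind)) {
  const Kind All[] = {Kind::Instruction, Kind::Operand, Kind::Constant};
  for (Kind A : All)
    for (Kind B : All)
      for (Kind C : All)
        if (Less(A, B) && Less(B, C) && !Less(A, C))
          return false;
  return true;
}
```

Passing a non-transitive predicate to `std::sort` is undefined behaviour, which is consistent with a failure that shows up only with some standard library implementations.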

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36084

llvm-svn: 311076

[SystemZ] Also wrap TII with #ifndef NDEBUG in constructor initializer list. · 593d49c0

Jonas Paulsson authored Aug 17, 2017

TII needs to be wrapped with #ifndef NDEBUG to silence compiler warnings.

llvm-svn: 311075

[SystemZ] Add a wrapping with #ifndef NDEBUG to silence warning. · d346924a

Jonas Paulsson authored Aug 17, 2017

SystemZHazardRecognizer::TII is only used for debug output, so it also
needs to be wrapped with #ifndef NDEBUG.
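
A minimal sketch of this kind of wrapping follows. It is not the SystemZ source: `InstrInfo` and `HazardRecognizer` are hypothetical stand-ins, and the debug output is an assumed example. The member that is only read by debug output is guarded, and so is its entry in the constructor initializer list, so a release build neither declares nor initializes it.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a target-info object; not an LLVM class.
struct InstrInfo {
  std::string name(unsigned Opcode) const {
    return "op" + std::to_string(Opcode);
  }
};

class HazardRecognizer {
#ifndef NDEBUG
  const InstrInfo *TII; // Only read by debug output.
#endif
  unsigned Emitted = 0;

public:
  explicit HazardRecognizer(const InstrInfo *Info)
#ifndef NDEBUG
      : TII(Info)
#endif
  {
    (void)Info; // Keeps release builds free of unused-parameter warnings.
  }

  // Records one emitted instruction and returns the running count.
  unsigned emit(unsigned Opcode) {
#ifndef NDEBUG
    std::fprintf(stderr, "emitting %s\n", TII->name(Opcode).c_str());
#else
    (void)Opcode;
#endif
    return ++Emitted;
  }
};
```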

llvm-svn: 311074

[SystemZ, MachineScheduler] Improve post-RA scheduling. · 57a705d9

Jonas Paulsson authored Aug 17, 2017

The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.

The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.

Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between two regions inside an MBB.

Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053

llvm-svn: 311072

[SelectionDAG] Teach the vector-types operand scalarizer about SETCC · 124d3282

Elad Cohen authored Aug 17, 2017

When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illegal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands do
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.

This patch attempts to teach the legalizer to handle these cases
by scalarizing the operands, converting the node into a scalar
SETCC node.

Differential revision: https://reviews.llvm.org/D36651

llvm-svn: 311071

[llvm-dlltool] Improve an error message when unable to open files. NFC. · caff3268

Martin Storsjö authored Aug 17, 2017

Differential Revision: https://reviews.llvm.org/D36818

llvm-svn: 311069

[llvm-dlltool] Don't crash if no def file is provided or it can't be opened · 9d8ecb43

Martin Storsjö authored Aug 17, 2017

Differential Revision: https://reviews.llvm.org/D36780

llvm-svn: 311068

[CGP] Fix the rematerialization of gc.relocates · 9e5604db

Serguei Katkov authored Aug 17, 2017

If we want to substitute the relocation of derived pointer with gep of base then
we must ensure that relocation of base dominates the relocation of derived pointer.

Currently only a check for the basic block is present. However, it is possible that both
relocations are in the same basic block but the relocation of the derived pointer is defined
earlier.

The patch moves the relocation of base pointer right before relocation of derived
pointer in this case.

Reviewers: sanjoy,artagnon,igor-laevsky,reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36462

llvm-svn: 311067

Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" · 4e38e02e

Geoff Berry authored Aug 17, 2017

This reverts commit r311038.

Several buildbots are breaking, and at least one appears to be due to
the forwarding of physical regs enabled by this change.  Reverting while
I investigate further.

llvm-svn: 311062

ARM: mark CPSR as clobbered for Windows VLAs · dd8c16b5

Saleem Abdulrasool authored Aug 17, 2017

When lowering a VLA, we emit a __chkstk call.  However, this call can
internally clobber CPSR.  We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`.  In such a case, the CPSR could be clobbered, and the
check invalidated.  When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case.  Mark the register as clobbered to fix a possible
state corruption.

llvm-svn: 311061

[X86] Exchange the memory op predicate for PALIGNR/VPALIGNR. I accidentally swapped them. · 2f9743d2

Craig Topper authored Aug 17, 2017

llvm-svn: 311060

[X86] Cleanup multiclasses for SSE/AVX2 PALIGNR. Add missing load patterns. · 5357526c

Craig Topper authored Aug 17, 2017

We used to have a separate multiclass for AVX2 and SSE/AVX. Now we have one multiclass and pass the relevant differences.

We were also missing load patterns, though we had them for the AVX-512 version.

llvm-svn: 311059

[X86] Remove patterns for PALIGNR with non-vXi8 types. · bbe3e46b

Craig Topper authored Aug 17, 2017

llvm-svn: 311058

Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators · fd5c5c91

Jakub Kuderski authored Aug 17, 2017

Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

I didn't notice any performance impact when bootstrapping clang with this patch.

The patch was originally committed in r311039 and reverted in r311049.
This revision fixes the problem with not adding a dependency on the
DominatorTreeWrapperPass for the LegacyPassManager.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

llvm-svn: 311057

[X86] Put multiclass closer to its use and simplify slightly. NFC · 42a53535

Craig Topper authored Aug 16, 2017

llvm-svn: 311055

[X86] Use a static array instead of a SmallVector for a small fixed size array. NFC · 9025579e

Craig Topper authored Aug 16, 2017

llvm-svn: 311054

[InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including... · 86111c66

Amjad Aboud authored Aug 16, 2017

[InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount)

Differential Revision: https://reviews.llvm.org/D36784

llvm-svn: 311050

Revert "[ADCE][Dominators] Teach ADCE to preserve dominators" · cbcffb17

Jakub Kuderski authored Aug 16, 2017

This reverts commit r311039. The patch caused the
`test/Bindings/OCaml/Output/scalar_opts.ml` to fail.

llvm-svn: 311049

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings;... · bb1b2d09

Eugene Zelenko authored Aug 16, 2017

[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

llvm-svn: 311048

Aug 16, 2017

[InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) +... · 882f2963

Craig Topper authored Aug 16, 2017

[InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors

This also uses decomposeBitTestICmp to decode the compare.
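
The scalar identity behind this fold can be checked with a short sketch. The function names below are hypothetical and this is not InstCombine code; it assumes 32-bit lanes and wrapping (modular) arithmetic, which is why the constants are unsigned. For a splat vector the same identity would apply to each lane independently.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right by 31, written portably: all-ones if X is
// negative, zero otherwise (the value of X >>s 31 for a 32-bit X).
uint32_t ashr31(int32_t X) {
  return 0u - (static_cast<uint32_t>(X) >> 31);
}

// Original form: (X >s -1) ? C1 : C2
uint32_t selectForm(int32_t X, uint32_t C1, uint32_t C2) {
  return X > -1 ? C1 : C2;
}

// Folded form: ((X >>s 31) & (C2 - C1)) + C1
// Non-negative X: the mask is zero, leaving C1.
// Negative X: the mask is all-ones, so (C2 - C1) + C1 wraps back to C2.
uint32_t foldedForm(int32_t X, uint32_t C1, uint32_t C2) {
  return (ashr31(X) & (C2 - C1)) + C1;
}
```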

Differential Revision: https://reviews.llvm.org/D36781

llvm-svn: 311044

[ADCE][Dominators] Teach ADCE to preserve dominators · 4552e9de

Jakub Kuderski authored Aug 16, 2017

Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

I didn't notice any performance impact when bootstrapping clang with this patch.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

llvm-svn: 311039

[MachineCopyPropagation] Extend pass to do COPY source forwarding · 87f8d251

Geoff Berry authored Aug 16, 2017

This change extends MachineCopyPropagation to do COPY source forwarding.

This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.

Reviewers: qcolombet, javed.absar, MatzeB, jonpa

Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311038

[LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolution · 40549ad1

Geoff Berry authored Aug 16, 2017

Summary:
Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving
ScalarEvolution since they do not alter loop structure and should not
alter any SCEV values (though LoopDataPrefetch may introduce new
instructions that won't have cached SCEV values yet).

This can result in slight code differences, mainly w.r.t. nsw/nuw flags
on SCEVs, since these are computed somewhat lazily when a zext/sext
instruction is encountered.  As a result, passes after the modified
passes may see SCEVs with more nsw/nuw flags present.

Reviewers: sanjoy, anemet

Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D36716

llvm-svn: 311032

Add a convenience overload of DWARFDie::dump() for debugging purposes. · 3d523a65

Adrian Prantl authored Aug 16, 2017

llvm-svn: 311026

Add more comment · 5a57b842

Xinliang David Li authored Aug 16, 2017

llvm-svn: 311025

[PGO] Fix ThinLTO crash · 71ecaa19

Xinliang David Li authored Aug 16, 2017

Differential Revision: http://reviews.llvm.org/D36640

llvm-svn: 311023

[AMDGPU] NFC: test commit · bf975176

Evgeny Mankov authored Aug 16, 2017

llvm-svn: 311019

AMDGPU/NFC: Sort files in CMakeLists.txt alphabetically · d3d89efa

Konstantin Zhuravlyov authored Aug 16, 2017

llvm-svn: 311017

[Dominators] Introduce batch updates · 624463a0

Jakub Kuderski authored Aug 16, 2017

Summary:
This patch introduces a way of informing the (Post)DominatorTree about multiple CFG updates that happened since the last tree update. This makes performing tree updates much easier, as it internally takes care of applying the updates in lockstep with the (virtual) updates to the CFG, which is done by reverse-applying future CFG updates.

The batch updater is able to remove redundant updates that cancel each other out. In the future, it should be also possible to reorder updates to reduce the amount of work needed to perform the updates.
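
The cancellation of redundant updates can be sketched as a toy model. This is not the DominatorTree implementation or its API: `UpdateKind`, `Update`, and `cancelRedundant` are hypothetical names, blocks are represented by plain integers, and the sketch assumes a valid update sequence in which each edge's net effect is at most one insertion or one deletion.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical model of a CFG edge update; blocks are plain integers.
enum class UpdateKind { Insert, Delete };

struct Update {
  UpdateKind Kind;
  int From;
  int To;
};

// Drops updates whose insertions and deletions of the same edge cancel
// out, keeping one update per edge with a non-zero net effect.
// The result is ordered by edge, not by the original update order.
std::vector<Update> cancelRedundant(const std::vector<Update> &Updates) {
  std::map<std::pair<int, int>, int> Net;
  for (const Update &U : Updates)
    Net[{U.From, U.To}] += (U.Kind == UpdateKind::Insert) ? 1 : -1;

  std::vector<Update> Result;
  for (const auto &Entry : Net) {
    if (Entry.second == 0)
      continue; // Insertions and deletions of this edge cancel out.
    UpdateKind Kind =
        Entry.second > 0 ? UpdateKind::Insert : UpdateKind::Delete;
    Result.push_back({Kind, Entry.first.first, Entry.first.second});
  }
  return Result;
}
```

For example, inserting and then deleting the edge 1 -> 2 leaves nothing to apply for that edge, so the tree update never sees it.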

Reviewers: dberlin, sanjoy, grosser, davide, brzycki

Reviewed By: brzycki

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D36167

llvm-svn: 311015

[BDCE] Don't check demanded bits on unsized types · 9e54b709

Hal Finkel authored Aug 16, 2017

To clear assumptions that are potentially invalid after trivialization, we need
to walk the use/def chain. Normally, the only way to reach an instruction with
an unsized type is via an instruction that has side effects (or otherwise will
demand its input bits). That would stop the walk. However, if we have a
readnone function that returns an unsized type (e.g., void), we must avoid
asking for the demanded bits of the function call's return value. A
void-returning readnone function is always dead (and so we can stop walking the
use/def chain here), but the check is necessary to avoid asserting.

Fixes PR34211.

llvm-svn: 311014

[Verifier] Reject globals without a type associated. · cd21378f

Davide Italiano authored Aug 16, 2017

llvm-svn: 311012

[AMDGPU][MC][GFX9] Added op_sel support for v_mad_*16, v_fma_f16, v_div_fixup_f16 · b865ef53

Dmitry Preobrazhensky authored Aug 16, 2017

This change implements features postponed in https://reviews.llvm.org/D35424 because of a dependency on https://reviews.llvm.org/D36322

Reviewers: SamWot, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D36694

llvm-svn: 311011
