Commits · 3616891046e7f13a758e53dcc6fa73a7c3232b35 · Lorenzo Albano / LLVM bpEVL

Nov 18, 2018

Vedant Kumar authored Nov 18, 2018

Fix all of the missing debug location errors in CVP found by debugify.

This includes the missing-location-after-udiv-truncation case described
in llvm.org/PR38178.

llvm-svn: 347147

35f504c1

Nov 17, 2018

Use llvm::copy. NFC · 75709329
Fangrui Song authored Nov 17, 2018
```
llvm-svn: 347126
```
75709329

Nov 16, 2018

[SimpleLoopUnswitch] adding cost multiplier to cap exponential unswitch with · 2e3e224e

Fedor Sergeev authored Nov 16, 2018

We need to control exponential behavior of loop-unswitch so we do not get
run-away compilation.

The suggested solution is to introduce a multiplier for the unswitch cost that
makes cost prohibitive as soon as there are too many candidates and too
many sibling loops (meaning we have already started duplicating loops
by unswitching).
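
The idea of a cost multiplier can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the patch: the struct `UnswitchLimits`, its fields and defaults, and the functions `costMultiplier` and `isUnswitchAffordable` are all hypothetical names. The multiplier stays at 1 for small cases and grows linearly with the number of sibling loops and exponentially with the number of candidates beyond a free allowance.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical tuning knobs; names and defaults are illustrative only and
// are not the options added by the patch.
struct UnswitchLimits {
  unsigned SiblingsDiv = 2;    // sibling loops per unit of multiplier
  unsigned FreeCandidates = 8; // candidates tolerated before scaling starts
  unsigned MaxPower = 20;      // cap on the exponent to keep values bounded
  uint64_t Threshold = 50;     // unswitching is rejected at or above this cost
};

// Multiplier that is 1 for small cases, grows linearly with the number of
// sibling loops and exponentially with the number of extra candidates.
uint64_t costMultiplier(unsigned Candidates, unsigned Siblings,
                        const UnswitchLimits &L = UnswitchLimits()) {
  uint64_t SiblingFactor = std::max<uint64_t>(Siblings / L.SiblingsDiv, 1);
  unsigned Extra =
      Candidates > L.FreeCandidates ? Candidates - L.FreeCandidates : 0;
  unsigned Power = std::min(Extra, L.MaxPower);
  return SiblingFactor * (uint64_t{1} << Power);
}

// The scaled cost is compared against the threshold, so a cheap unswitch
// becomes prohibitive once enough loops have already been duplicated.
bool isUnswitchAffordable(uint64_t BaseCost, unsigned Candidates,
                          unsigned Siblings,
                          const UnswitchLimits &L = UnswitchLimits()) {
  return BaseCost * costMultiplier(Candidates, Siblings, L) < L.Threshold;
}
```

With these assumed defaults, a base cost of 10 is accepted for 3 candidates and 1 sibling loop (multiplier 1) but rejected for 10 candidates and 4 sibling loops (multiplier 8).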

It does solve the currently known problem with compile-time degradation
(PR 39544).

Tests are built on top of the recently implemented CHECK-COUNT-<num>
FileCheck directive.

Reviewed By: chandlerc, mkazantsev
Differential Revision: https://reviews.llvm.org/D54223

llvm-svn: 347097

2e3e224e

GlobalDCE: Teach isEmptyFunction() to ignore debug intrinsics. · 83d87520
Adrian Prantl authored Nov 16, 2018
```
This fixes PR39669.
https://bugs.llvm.org/show_bug.cgi?id=39669

llvm-svn: 347065
```
83d87520

[ThinLTO] Internalize readonly globals · bf46e741

Eugene Leviant authored Nov 16, 2018

An attempt to recommit r346584 after a failure on the OSX build bot.
Fixed cache key computation in ThinLTOCodeGenerator and added
a test case.

llvm-svn: 347033

bf46e741

Nov 15, 2018

[LTO] Load sample profile in LTO link step. · 642c8d35

Xin Tong authored Nov 15, 2018

Summary:
Load sample profile in LTO link step.
ThinLTO calls populateModulePassManager to load the profile

Reviewers: tejohnson, davidxl, danielcdh

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D54564

llvm-svn: 346971

642c8d35

[InstCombine] fix rotate narrowing bug for non-pow-2 types · bc56b243
Sanjay Patel authored Nov 15, 2018
```
llvm-svn: 346968
```
bc56b243

Nov 14, 2018

[InstCombine] Remove a couple of asserts based on incorrect assumptions · 0905fc77

Mandeep Singh Grang authored Nov 14, 2018

Summary:
These asserts are based on the assumption that the order of true/false operands in a select and those in the compare would always be the same.
This fixes PR39595.

Reviewers: craig.topper, spatel, dmgreen

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54359

llvm-svn: 346874

0905fc77

[InstCombine] fix formatting for matchBSwap(); NFC · 60728427

Sanjay Patel authored Nov 14, 2018

We should have a similar function for matching rotate and/or 
funnel shift, so tidy up the related existing call.

llvm-svn: 346871

60728427

[VPlan, SLP] Use SmallPtrSet for Candidates. · 6df11868
Florian Hahn authored Nov 14, 2018
```
This slightly improves the candidate handling in getBest().

llvm-svn: 346870
```
6df11868
[VPlan] Remove LLVM_DEBUG from VPlanSlp::dumpBundle. · 02cb67de
Florian Hahn authored Nov 14, 2018
```
The caller should take care of only calling it with debug enabled.

llvm-svn: 346860
```
02cb67de
[VPlan] Update ifdef. · 2eca3728
Florian Hahn authored Nov 14, 2018
```
llvm-svn: 346858
```
2eca3728

[VPlan, SLP] Add simple SLP analysis on top of VPlan. · 09e516c5

Florian Hahn authored Nov 14, 2018

This patch adds an initial implementation of the look-ahead SLP tree
construction described in 'Look-Ahead SLP: Auto-vectorization in the Presence
of Commutative Operations', CGO 2018, by Vasileios Porpodas, Rodrigo C. O. Rocha,
and Luís F. W. Góes.

It returns an SLP tree represented as VPInstructions, with combined
instructions represented as a single, wider VPInstruction.

This initial version does not support instructions with multiple
different users (either inside or outside the SLP tree) or
non-instruction operands; it won't generate any shuffles or
insertelement instructions.

It also just adds the analysis that builds an SLP tree rooted in a set
of stores. It does not include any cost modeling or memory legality
checks. The plan is to integrate it with VPlan-based cost modeling, once
available, and to only apply it to operations that can be widened.

A follow-up patch will add support for replacing instructions in a
VPlan with their SLP counterparts.

Reviewers: Ayal, mssimpso, rengolin, mkuper, hfinkel, hsaito, dcaballe, vporpo, RKSimon, ABataev

Reviewed By: rengolin

Differential Revision: https://reviews.llvm.org/D4949

llvm-svn: 346857

09e516c5

Recommit r346483: [CallSiteSplitting] Only record conditions up to the IDom(call site). · 505091a8
Florian Hahn authored Nov 14, 2018
```
The underlying problem causing the expensive-check failure was fixed in
rL346769.

llvm-svn: 346843
```
505091a8

Revert r346810 "Preserve loop metadata when splitting exit blocks" · 41390b47

Reid Kleckner authored Nov 14, 2018

It broke the Windows self-host:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/1457

llvm-svn: 346823

41390b47

[InstCombine] fold funnel shift amount based on demanded bits · a1395648

Sanjay Patel authored Nov 13, 2018

The shift amount of a funnel shift is modulo the scalar bitwidth:
http://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic
...so we can use demanded bits analysis on that operand to simplify it
when we have a power-of-2 bitwidth.

This is another step towards canonicalizing {shift/shift/or} to the 
intrinsics in IR.
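
A small reference model may help show why only the low bits of the shift amount matter. The sketch below models `llvm.fshl.i32` in plain C++ following the LangRef description (concatenate, shift left by the amount modulo the bit width, keep the high half). The function names `fshl32` and `highAmountBitsAreIgnored` are made up for this illustration. For a power-of-2 width such as 32, "modulo 32" is the same as keeping the low 5 bits, so masking or setting the upper bits of the amount cannot change the result.

```cpp
#include <cstdint>

// Reference model of llvm.fshl.i32: concatenate Hi:Lo, shift left by
// (Amt mod 32), and return the most significant 32 bits.
uint32_t fshl32(uint32_t Hi, uint32_t Lo, uint32_t Amt) {
  uint32_t S = Amt % 32;
  if (S == 0)
    return Hi; // avoid the out-of-range shift Lo >> 32
  return (Hi << S) | (Lo >> (32 - S));
}

// Checks that only the low 5 bits of the shift amount are "demanded":
// clearing or setting all higher bits leaves the result unchanged.
bool highAmountBitsAreIgnored(uint32_t Hi, uint32_t Lo) {
  for (uint32_t Amt = 0; Amt < 256; ++Amt) {
    uint32_t Expected = fshl32(Hi, Lo, Amt);
    if (fshl32(Hi, Lo, Amt & 31) != Expected)
      return false;
    if (fshl32(Hi, Lo, Amt | 0xFFFFFFE0u) != Expected)
      return false;
  }
  return true;
}
```

For a non-power-of-2 width the modulo is not a simple bit mask, which is why the fold is restricted to power-of-2 bit widths.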

Differential Revision: https://reviews.llvm.org/D54478

llvm-svn: 346814

a1395648

Preserve loop metadata when splitting exit blocks · 3c87c2a3

Craig Topper authored Nov 13, 2018

LoopUtils.cpp contains a utility that splits a loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block, which is inserted on this path, is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.)

Patch by Chang Lin (clin1)

Differential Revision: https://reviews.llvm.org/D53876

llvm-svn: 346810

3c87c2a3

Nov 13, 2018

[InstCombine] canonicalize rotate patterns with cmp/select · f8f12272

Sanjay Patel authored Nov 13, 2018

The cmp+branch variant of this pattern is shown in:
https://bugs.llvm.org/show_bug.cgi?id=34924
...and as discussed there, we probably can't transform
that without a rotate intrinsic. We do have that now
via funnel shift, but we're not quite ready to 
canonicalize IR to that form yet. The case with 'select'
should already be transformed though, so that's this patch.

The sequence with negation followed by masking is what we
use in the backend and partly in clang (though that part 
should be updated).

https://rise4fun.com/Alive/TplC
  %cmp = icmp eq i32 %shamt, 0
  %sub = sub i32 32, %shamt
  %shr = lshr i32 %x, %shamt
  %shl = shl i32 %x, %sub
  %or = or i32 %shr, %shl
  %r = select i1 %cmp, i32 %x, i32 %or
  =>
  %neg = sub i32 0, %shamt
  %masked = and i32 %shamt, 31
  %maskedneg = and i32 %neg, 31
  %shl2 = lshr i32 %x, %masked
  %shr2 = shl i32 %x, %maskedneg
  %r = or i32 %shl2, %shr2
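
The same equivalence can be checked in C++. This is only an illustrative sketch with made-up function names (`rotrSelect`, `rotrMasked`, `rotatesAgree`), and it assumes the shift amount is below 32, as the source pattern requires for its shifts to be well defined.

```cpp
#include <cstdint>

// Source pattern: rotate right guarded by a select, so the out-of-range
// shift by 32 is never used when Amt == 0. Assumes Amt < 32.
uint32_t rotrSelect(uint32_t X, uint32_t Amt) {
  return Amt == 0 ? X : ((X >> Amt) | (X << (32 - Amt)));
}

// Target pattern: negate and mask the shift amount, no compare or select.
uint32_t rotrMasked(uint32_t X, uint32_t Amt) {
  uint32_t Neg = 0u - Amt;
  return (X >> (Amt & 31)) | (X << (Neg & 31));
}

// Compares both forms for every valid shift amount.
bool rotatesAgree(uint32_t X) {
  for (uint32_t Amt = 0; Amt < 32; ++Amt)
    if (rotrSelect(X, Amt) != rotrMasked(X, Amt))
      return false;
  return true;
}
```

For a zero shift amount both masked amounts are 0, so the masked form returns `X | X`, which is `X`, matching the select.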

llvm-svn: 346807

f8f12272

[CSP, Cloning] Update DuplicateInstructionsInSplitBetween to use DomTreeUpdater. · 107d0a87

Florian Hahn authored Nov 13, 2018

This patch updates DuplicateInstructionsInSplitBetween to update a DTU
instead of applying updates to the DT directly.

Given that there only are 2 users, also updated them in this patch to
avoid churn.

I slightly moved the code in CallSiteSplitting around to reduce the
places where we have to pass in DTU. If necessary, I could split those
changes in a separate patch.

This fixes missing DT updates when dealing with musttail calls in
CallSiteSplitting, by using DTU->deleteBB.

Reviewers: junbuml, kuhar, NutshellySima, indutny, brzycki

Reviewed By: NutshellySima

llvm-svn: 346769

107d0a87

Revert "[ThinLTO] Internalize readonly globals" · fa43892d
Steven Wu authored Nov 13, 2018
```
This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2.

llvm-svn: 346768
```
fa43892d

[VPlan] VPlan version of InterleavedAccessInfo. · a4dc7fee

Florian Hahn authored Nov 13, 2018

This patch turns InterleaveGroup into a template with the instruction type
being a template parameter. It also adds a VPInterleavedAccessInfo class, which
only contains a mapping from VPInstructions to their respective InterleaveGroup.
As we do not have access to scalar evolution in VPlan, we can re-use an existing
InterleavedAccessInfo by converting it to a VPInterleavedAccessInfo.

Reviewers: Ayal, mssimpso, hfinkel, dcaballe, rengolin, mkuper, hsaito

Reviewed By: rengolin

Differential Revision: https://reviews.llvm.org/D49489

llvm-svn: 346758

a4dc7fee

Introduce DebugCounter into ConstProp pass · cc633af5

Zhizhou Yang authored Nov 13, 2018

Summary:
This patch introduces DebugCounter into the ConstProp pass at the per-transformation level.

It provides an option to skip the first n transformations or to stop after n transformations for the whole ConstProp pass.

This makes the pass easier to debug and also makes transformation-level bisection possible.
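
The skip/stop behaviour can be sketched with a small standalone class. This is a hypothetical stand-in written for illustration, not LLVM's DebugCounter: the class name `TransformCounter`, its constructor parameters, and the exact boundary semantics are assumptions.

```cpp
#include <cstdint>

// Hypothetical stand-in for a per-transformation debug counter.
// Skips the first `Skip` transformations, then allows `StopAfter` of them;
// a negative StopAfter means "no limit".
class TransformCounter {
  int64_t Seen = 0;
  int64_t Skip;
  int64_t StopAfter;

public:
  TransformCounter(int64_t Skip, int64_t StopAfter)
      : Skip(Skip), StopAfter(StopAfter) {}

  // Called once per candidate transformation; returns true if the
  // transformation should actually be performed.
  bool shouldExecute() {
    int64_t Index = Seen++;
    if (Index < Skip)
      return false;
    if (StopAfter >= 0 && Index >= Skip + StopAfter)
      return false;
    return true;
  }
};

// Helper for experiments: how many of `Attempts` candidates get executed.
int countExecuted(TransformCounter &C, int Attempts) {
  int Executed = 0;
  for (int I = 0; I < Attempts; ++I)
    if (C.shouldExecute())
      ++Executed;
  return Executed;
}
```

Bisecting then amounts to re-running the pass with different skip/stop values until the single transformation that introduces a miscompile is isolated.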

Reviewers: davide, fhahn

Reviewed By: fhahn

Subscribers: llozano, george.burgess.iv, llvm-commits

Differential Revision: https://reviews.llvm.org/D50094

llvm-svn: 346720

cc633af5

Nov 12, 2018

[InstCombine] narrow width of rotate patterns, part 3 · 35b1c2d1

Sanjay Patel authored Nov 12, 2018

This is a longer variant of the pattern handled in
rL346713; this one includes zexts.

Eventually, we should canonicalize all rotate patterns 
to the funnel shift intrinsics, but we need a bit more
infrastructure to make sure the vectorizers handle those
intrinsics as well as the shift+logic ops.

https://rise4fun.com/Alive/FMn

Name: narrow rotateright
  %neg = sub i8 0, %shamt
  %rshamt = and i8 %shamt, 7
  %rshamtconv = zext i8 %rshamt to i32
  %lshamt = and i8 %neg, 7
  %lshamtconv = zext i8 %lshamt to i32
  %conv = zext i8 %x to i32
  %shr = lshr i32 %conv, %rshamtconv
  %shl = shl i32 %conv, %lshamtconv
  %or = or i32 %shl, %shr
  %r = trunc i32 %or to i8
  =>
  %maskedShAmt2 = and i8 %shamt, 7
  %negShAmt2 = sub i8 0, %shamt
  %maskedNegShAmt2 = and i8 %negShAmt2, 7
  %shl2 = lshr i8 %x, %maskedShAmt2
  %shr2 = shl i8 %x, %maskedNegShAmt2
  %r = or i8 %shl2, %shr2

llvm-svn: 346716

35b1c2d1

[InstCombine] narrow width of rotate patterns, part 2 (PR39624) · 98e427cc

Sanjay Patel authored Nov 12, 2018

The sub-pattern for the shift amount in a rotate can take on
several different forms, and there's apparently no way to
canonicalize those without seeing the entire rotate sequence.

This is the form noted in:
https://bugs.llvm.org/show_bug.cgi?id=39624

https://rise4fun.com/Alive/qnT

  %zx = zext i8 %x to i32
  %maskedShAmt = and i32 %shAmt, 7
  %shl = shl i32 %zx, %maskedShAmt
  %negShAmt = sub i32 0, %shAmt
  %maskedNegShAmt = and i32 %negShAmt, 7
  %shr = lshr i32 %zx, %maskedNegShAmt
  %rot = or i32 %shl, %shr
  %r = trunc i32 %rot to i8
  =>
  %truncShAmt = trunc i32 %shAmt to i8
  %maskedShAmt2 = and i8 %truncShAmt, 7
  %shl2 = shl i8 %x, %maskedShAmt2
  %negShAmt2 = sub i8 0, %truncShAmt
  %maskedNegShAmt2 = and i8 %negShAmt2, 7
  %shr2 = lshr i8 %x, %maskedNegShAmt2
  %r = or i8 %shl2, %shr2
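
A C++ sketch of the same narrowing, for readers who prefer to test it outside Alive. The function names (`rotlWide`, `rotlNarrow`, `narrowingIsSound`) are made up for this illustration. The wide form mirrors the source pattern (zext, 32-bit shifts, trunc) and the narrow form mirrors the 8-bit target pattern.

```cpp
#include <cstdint>

// Source pattern: rotate-left of an 8-bit value computed in 32 bits.
uint8_t rotlWide(uint8_t X, uint32_t ShAmt) {
  uint32_t ZX = X;                         // zext i8 -> i32
  uint32_t Shl = ZX << (ShAmt & 7);
  uint32_t Shr = ZX >> ((0u - ShAmt) & 7);
  return static_cast<uint8_t>(Shl | Shr);  // trunc i32 -> i8
}

// Target pattern: the same rotate done entirely in 8 bits.
uint8_t rotlNarrow(uint8_t X, uint32_t ShAmt) {
  uint8_t T = static_cast<uint8_t>(ShAmt); // trunc i32 -> i8
  uint8_t Masked = T & 7;
  uint8_t NegMasked = static_cast<uint8_t>(0 - T) & 7;
  // C++ promotes to int for the shifts; the cast models i8 wrap-around.
  return static_cast<uint8_t>((X << Masked) | (X >> NegMasked));
}

// Exhaustive over all 8-bit inputs and a range of shift amounts.
bool narrowingIsSound() {
  for (uint32_t X = 0; X < 256; ++X)
    for (uint32_t ShAmt = 0; ShAmt < 1024; ++ShAmt)
      if (rotlWide(static_cast<uint8_t>(X), ShAmt) !=
          rotlNarrow(static_cast<uint8_t>(X), ShAmt))
        return false;
  return true;
}
```

The narrowing works because both shift amounts are masked to 3 bits, so truncating the amount to 8 bits first loses nothing that the mask would have kept.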

llvm-svn: 346713

98e427cc

[InstCombine] refactor code for matching shift amount of a rotate; NFC · ceab2329

Sanjay Patel authored Nov 12, 2018

As shown in existing test cases and with:
https://bugs.llvm.org/show_bug.cgi?id=39624
...we're missing at least 2 more patterns for rotate narrowing.

llvm-svn: 346711

ceab2329

[GC][InstCombine] Fix a potential iteration issue · b8d8db30

Philip Reames authored Nov 12, 2018

Noticed via inspection. Appears to be largely innocuous in practice, but a slight code change could have resulted in either visit-order-dependent missed optimizations or infinite loops. May be a minor compile-time problem today.

llvm-svn: 346698

b8d8db30

[CostModel] Add more realistic SK_ExtractSubvector generic costs. · 631f2bf5

Simon Pilgrim authored Nov 12, 2018

Instead of defaulting to a cost = 1, expand to element extract/insert like we do for other shuffles.

This exposes an issue in LoopVectorize which could call SK_ExtractSubvector with a scalar subvector type.

llvm-svn: 346656

631f2bf5

[LICM] Hoist guards from non-header blocks · 7d49a3a8

Max Kazantsev authored Nov 12, 2018

This patch relaxes overconservative checks on whether or not we could write
memory before we execute an instruction. This allows us to hoist guards out of
loops even if they are not in the header block.

Differential Revision: https://reviews.llvm.org/D50891
Reviewed By: fedor.sergeev

llvm-svn: 346643

7d49a3a8

[GCOV] Add options to filter files which must be instrumented. · c6fabeac

Calixte Denizet authored Nov 12, 2018

Summary:
When generating code coverage, many files (such as those coming from /usr/include) are removed when post-processing gcno/gcda, so they don't need to be instrumented or to appear in gcno/gcda at all.
The goal of this patch is to be able to filter the files we want to instrument, which has several advantages:
- improved speed (no instrumentation overhead on files we don't care about)
- reduced gcno/gcda size
- the possibility to easily instrument only a few files (e.g. the ones modified in a patch) without changing the build system

Enabling this in clang requires the following patch to be accepted as well: https://reviews.llvm.org/D52034
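
The kind of file filtering described above can be sketched as a pair of regex lists. This is an assumption-laden illustration, not the option handling from the patch: the struct `FileFilter`, the function `shouldInstrument`, and the rule "exclude wins over include, empty include list means everything" are all hypothetical choices made for the example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical filter: a file is instrumented if it matches no Exclude
// pattern and either Include is empty or some Include pattern matches.
struct FileFilter {
  std::vector<std::regex> Include;
  std::vector<std::regex> Exclude;
};

bool shouldInstrument(const FileFilter &F, const std::string &Path) {
  auto MatchesAny = [&](const std::vector<std::regex> &Patterns) {
    for (const std::regex &R : Patterns)
      if (std::regex_match(Path, R))
        return true;
    return false;
  };
  if (MatchesAny(F.Exclude))
    return false;
  return F.Include.empty() || MatchesAny(F.Include);
}
```

With an include pattern of `.*[.]cpp` and an exclude pattern of `/usr/include/.*`, system headers are skipped and only project `.cpp` files are instrumented.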

Reviewers: marco-c, vsk

Reviewed By: marco-c

Subscribers: llvm-commits, sylvestre.ledru

Differential Revision: https://reviews.llvm.org/D52033

llvm-svn: 346641

c6fabeac

Nov 11, 2018

[IPSCCP,PM] Preserve PDT in the new pass manager. · 9026d4ee

Florian Hahn authored Nov 11, 2018

Reviewers: kuhar, chandlerc, NutshellySima, brzycki

Reviewed By: NutshellySima, brzycki

Differential Revision: https://reviews.llvm.org/D54317

llvm-svn: 346618

9026d4ee

Nov 10, 2018

[InstCombine] simplify code for merging stores; NFCI · 4a12aa97
Sanjay Patel authored Nov 10, 2018
```
llvm-svn: 346596
```
4a12aa97

[ThinLTO] Internalize readonly globals · be8d1996

Eugene Leviant authored Nov 10, 2018

This patch allows internalising globals if all accesses to them
(from live functions) are from non-volatile load instructions

Differential revision: https://reviews.llvm.org/D49362

llvm-svn: 346584

be8d1996

Nov 09, 2018

[JumpThreading] Fix exponential time algorithm computing known values. · 15930bf3

Eli Friedman authored Nov 09, 2018

ComputeValueKnownInPredecessors has a "visited" set to prevent infinite
loops, since a value can be visited more than once.  However, the
implementation didn't prevent the algorithm from taking exponential
time. Instead of removing elements from the RecursionSet one at a time,
we should keep around the whole set until
ComputeValueKnownInPredecessors finishes, then discard it.
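
The difference between the two strategies can be reproduced on a toy graph. The sketch below is not the JumpThreading code: `Graph`, `makeDiamondChain`, and `countVisits` are hypothetical names, and the graph is a synthetic chain of diamonds in which every join node reaches the next join node along two paths. Erasing a node from the set when the recursion unwinds still prevents infinite loops, but lets each join be re-explored once per path.

```cpp
#include <set>
#include <vector>

using Graph = std::vector<std::vector<int>>; // Graph[N] = operands of node N

// Chain of Depth diamonds: join 3*I has operands 3*I+1 and 3*I+2, and both
// of those have the next join 3*(I+1) as their only operand.
Graph makeDiamondChain(int Depth) {
  Graph G(3 * Depth + 1);
  for (int I = 0; I < Depth; ++I) {
    G[3 * I] = {3 * I + 1, 3 * I + 2};
    G[3 * I + 1] = {3 * (I + 1)};
    G[3 * I + 2] = {3 * (I + 1)};
  }
  return G;
}

// Counts how many node visits a recursive walk performs. With EraseOnExit
// the set only guards against cycles; without it, every node is visited at
// most once for the whole query.
long countVisits(const Graph &G, int N, std::set<int> &Visited,
                 bool EraseOnExit) {
  if (!Visited.insert(N).second)
    return 0; // already seen: give up on this path
  long Visits = 1;
  for (int Op : G[N])
    Visits += countVisits(G, Op, Visited, EraseOnExit);
  if (EraseOnExit)
    Visited.erase(N);
  return Visits;
}
```

For a chain of 10 diamonds the erase-on-exit walk performs 4093 visits while the keep-everything walk performs 31, one per node.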

The testcase is synthetic because I was having trouble effectively
reducing the original.  But it's basically the same idea.

Instead of failing, we could theoretically cache the result.
But I don't think it would help substantially in practice.

Differential Revision: https://reviews.llvm.org/D54239

llvm-svn: 346562

15930bf3

Revert r346483: [CallSiteSplitting] Only record conditions up to the IDom(call site). · 9f878e9b
Florian Hahn authored Nov 09, 2018
```
This causes a failure with EXPENSIVE_CHECKS.

llvm-svn: 346492
```
9f878e9b

[IPSCCP,PM] Preserve DT in the new pass manager. · a1062f4b

Florian Hahn authored Nov 09, 2018

After D45330, Dominators are required for IPSCCP and can be preserved.

This patch preserves DominatorTreeAnalysis in the new pass manager. AFAIK the legacy pass manager cannot preserve function analysis required by a module analysis.

Reviewers: davide, dberlin, chandlerc, efriedma, kuhar, NutshellySima

Reviewed By: chandlerc, kuhar, NutshellySima

Differential Revision: https://reviews.llvm.org/D47259

llvm-svn: 346486

a1062f4b

[CallSiteSplitting] Only record conditions up to the IDom(call site). · 52578f95

Florian Hahn authored Nov 09, 2018

We can stop recording conditions once we reach the immediate dominator
of the block containing the call site. Conditions in predecessors of
that node will be the same for all paths to the call site, so splitting
is not beneficial.

This patch makes CallSiteSplitting dependent on the DT analysis, because
the immediate dominators seem to be the easiest way of finding the node
to stop at.

I had to update some existing tests, because they were checking for
conditions that were true/false on all paths to the call site. Those
should now be handled by instcombine/ipsccp.

Reviewers: davide, junbuml

Reviewed By: junbuml

Differential Revision: https://reviews.llvm.org/D44627

llvm-svn: 346483

52578f95

[DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG. · fa9cf897

Carlos Alberto Enciso authored Nov 09, 2018

In SimplifyCFG, when given a conditional branch that goes to BB1 and BB2, hoisting the common terminator instruction from the two blocks caused the debug line records associated with subsequent select instructions to become ambiguous. This caused the debugger to display unreachable source lines.

Differential Revision: https://reviews.llvm.org/D53390

llvm-svn: 346481

fa9cf897

[NFC] Add utility function for SafetyInfo updates for moveBefore · 9883d1e1
Max Kazantsev authored Nov 09, 2018
```
llvm-svn: 346472
```
9883d1e1

Nov 08, 2018

[LoopInterchange] Support reductions across inner and outer loop. · a684a994

Florian Hahn authored Nov 08, 2018

This patch adds logic to detect reductions across the inner and outer
loop by following the incoming values of PHI nodes in the outer loop. If
the incoming values take part in a reduction in the inner loop or come
from outside the outer loop, we found a reduction spanning across inner
and outer loop.
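
At the source level, a reduction spanning both loops looks like the sketch below. This is only an illustration of the loop shape, not of the pass itself; the names `Matrix`, `makeSequential`, `sumColumnOrder`, and `sumRowOrder` are made up. The accumulator `Sum` is carried through both the inner and the outer loop, and interchanging the loops leaves the result unchanged because integer addition can be reordered.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t Rows = 4, Cols = 3;
using Matrix = std::array<std::array<int, Cols>, Rows>;

// Fills the matrix with 1, 2, 3, ... in row-major order.
Matrix makeSequential() {
  Matrix A{};
  int Next = 1;
  for (auto &Row : A)
    for (int &V : Row)
      V = Next++;
  return A;
}

// Before interchange: the inner loop walks down a column (strided access).
int sumColumnOrder(const Matrix &A) {
  int Sum = 0; // reduction carried across both loops
  for (std::size_t J = 0; J < Cols; ++J)
    for (std::size_t I = 0; I < Rows; ++I)
      Sum += A[I][J];
  return Sum;
}

// After interchange: the inner loop walks along a row (contiguous access).
int sumRowOrder(const Matrix &A) {
  int Sum = 0;
  for (std::size_t I = 0; I < Rows; ++I)
    for (std::size_t J = 0; J < Cols; ++J)
      Sum += A[I][J];
  return Sum;
}
```

Both functions return the same value for any input; only the memory access order differs, which is what makes the interchange worthwhile.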

With this change, ~10% more loops are interchanged in the LLVM
test-suite + SPEC2006.

Fixes https://bugs.llvm.org/show_bug.cgi?id=30472

Reviewers: mcrosier, efriedma, karthikthecool, davide, hfinkel, dmgreen

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D43245

llvm-svn: 346438

a684a994

[LTO] Drop non-prevailing definitions only if linkage is not local or appending · e61652a3

Pirama Arumuga Nainar authored Nov 08, 2018

Summary:
This fixes PR 37422

In ELF, non-weak symbols can also be non-prevailing.  In this particular
PR, the __llvm_profile_* symbols are non-prevailing but weren't getting
dropped - causing multiply-defined errors with lld.

Also add a test, strong_non_prevailing.ll, to ensure that multiple
copies of a strong symbol are dropped.

To fix the test regressions exposed by this fix,
- do not mark prevailing copies for symbols with 'appending' linkage.
There's no one prevailing copy for such symbols.
- fix the prevailing version in dead-strip-fulllto.ll
- explicitly pass exported symbols to llvm-lto in funcimport.ll and
funcimport_var.ll

Reviewers: tejohnson, pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith,
dang, srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D54125

llvm-svn: 346436

e61652a3