Commits · 9b22b29f684f6f5fbe3dd1478704270023f24226 · Lorenzo Albano / LLVM bpEVL

Jun 17, 2020

[OpenMPOPT][NFC] Introducing OMPInformationCache. · 7cfd267c

sstefan1 authored Jun 13, 2020

Summary:
Introduction of OpenMP-specific information cache based on Attributor's `InformationCache`. This should make it easier to share information between them.

Reviewers: jdoerfert, JonChesterfield, hamax97, jhuber6, uenoku

Subscribers: yaxunl, hiraditya, guansong, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81798

7cfd267c

ScalarEvolution.h - reduce LoopInfo.h include to forward declarations. NFC. · a5f1f9c9

Simon Pilgrim authored Jun 17, 2020

Move ScalarEvolution::forgetLoopDispositions implementation to ScalarEvolution.cpp to remove the dependency.

Add implicit header dependency to source files where necessary.

a5f1f9c9

Follow up of rGe345d547a0d5, and attempt to pacify buildbot: · c1034d04

Sjoerd Meijer authored Jun 17, 2020

"error: 'get' is deprecated: The base class version of get with the scalable
argument defaulted to false is deprecated."

Changed VectorType::get() -> FixedVectorType::get().

c1034d04

Recommit "[LV] Emit @llvm.get.active.lane.mask for tail-folded loops" · e345d547

Sjoerd Meijer authored Jun 17, 2020

Fixed ARM regression test.

Please see the original commit message rG47650451738c for details.

e345d547

[LSR] Filter for postinc formulae · 076e08aa

David Green authored May 29, 2020

In more complicated loops we can easily hit the complexity limits of
loop strength reduction. If we do and filtering occurs, it's all too
easy to remove the wrong formulae for post-inc preferring accesses due
to it attempting to maximise register re-use. The patch adds an
alternative filtering step when the target is preferring postinc to pick
postinc formulae instead, hopefully lowering the complexity to below the
limit so that aggressive filtering is not needed.

There is also a change in here to stop considering existing addrecs as
free under postinc. We should already be modelling them as a reg so
don't want it to cause us to get the cost wrong. (I'm not sure that code
makes sense in general, but there are X86 tests specifically for it
where it seems to be helping so have left it around for the standard
non-post-inc case).

Differential Revision: https://reviews.llvm.org/D80273

076e08aa

Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant" · 5bf0858c

Sam Parker authored Jun 17, 2020

I originally reverted the patch because it was causing performance
issues, but now I think it's just enabling simplify-cfg to do
something that I don't want instead :)

Sorry for the noise.

This reverts commit 3e39760f.

5bf0858c

[IR] Don't copy profile metadata in createCallMatchingInvoke() · 16ad6eeb

Hans Wennborg authored Jun 17, 2020

The invoke instruction can have profile metadata with branch_weights,
which does not make sense for a call instruction and will be
rejected by the verifier.

Differential revision: https://reviews.llvm.org/D81996

16ad6eeb

Fix LoopIdiomRecognize pass return status · 1cafd8a5

serge-sans-paille authored Jun 04, 2020

Introduce a helper class to aggregate the cleanup in case of rollback.

Differential Revision: https://reviews.llvm.org/D81230

1cafd8a5

Revert "[LV] Emit @llvm.get.active.mask for tail-folded loops" · d4e183f6

Sjoerd Meijer authored Jun 17, 2020

This reverts commit 47650451
while I investigate the build bot failures.

d4e183f6

[SCCP] Move common code to simplify basic block to helper (NFC). · 773353be

Florian Hahn authored Jun 17, 2020

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81755

773353be

[LV] Emit @llvm.get.active.mask for tail-folded loops · 47650451

Sjoerd Meijer authored Jun 10, 2020

This emits new IR intrinsic @llvm.get.active.mask for tail-folded vectorised
loops if the intrinsic is supported by the backend, which is checked by
querying TargetTransform hook emitGetActiveLaneMask.

This intrinsic creates a mask representing active and inactive vector lanes,
which is used by the masked load/store instructions that are created for
tail-folded loops. The semantics of @llvm.get.active.mask are described here in
LangRef:

https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics

This intrinsic is also used to provide a hint to the backend. That is, the
second argument of the intrinsic represents the back-edge taken count of the
loop. For MVE, for example, we use that to set up tail-predication, which is a
new form of predication in MVE for vector loops that implicitly predicates the
last vector loop iteration by implicitly setting active/inactive lanes, i.e.
the tail loop is predicated. In order to set up a tail-predicated vector loop,
we need to know the number of data elements processed by the vector loop, which
corresponds to the trip count of the scalar loop, which we can now reconstruct
using @llvm.get.active.mask.
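
A scalar model may help make the mask semantics concrete. The sketch below is not the vectorizer's code: `activeLaneMask` and `copyTailFolded` are hypothetical names, and the lane test `Base + Lane <= BTC` is an assumption derived from the statement above that the second argument is the back-edge taken count (the LangRef entry is the authority on the exact comparison). Overflow of `Base + Lane` is ignored here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of an active lane mask for one vector iteration.
// Base: index of the first element handled by this vector iteration.
// BTC:  back-edge taken count of the scalar loop (trip count - 1).
// VF:   number of vector lanes.
std::vector<bool> activeLaneMask(std::uint64_t Base, std::uint64_t BTC,
                                 unsigned VF) {
  assert(VF > 0 && "vector factor must be positive");
  std::vector<bool> Mask(VF);
  for (unsigned Lane = 0; Lane < VF; ++Lane)
    Mask[Lane] = (Base + Lane) <= BTC; // assumed comparison, see lead-in
  return Mask;
}

// Tail-folded copy loop: there is no scalar remainder loop. The last vector
// iteration simply runs with some lanes switched off by the mask.
void copyTailFolded(const int *Src, int *Dst, std::uint64_t TripCount,
                    unsigned VF) {
  if (TripCount == 0)
    return;
  const std::uint64_t BTC = TripCount - 1; // trip count == BTC + 1
  for (std::uint64_t Base = 0; Base < TripCount; Base += VF) {
    const std::vector<bool> Mask = activeLaneMask(Base, BTC, VF);
    for (unsigned Lane = 0; Lane < VF; ++Lane)
      if (Mask[Lane]) // models a masked load followed by a masked store
        Dst[Base + Lane] = Src[Base + Lane];
  }
}
```

With a trip count of 10 and four lanes, the third vector iteration starts at element 8 and only its first two lanes are active, so elements past index 9 are never touched.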

Differential Revision: https://reviews.llvm.org/D79100

47650451

Jun 16, 2020

[SVE] Eliminate calls to default-false VectorType::get() from Vectorize · ff628f5f

Christopher Tetreault authored Jun 16, 2020

Reviewers: efriedma, fhahn, spatel, sdesmalen, kmclaughlin

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81521

ff628f5f

[VectorCombine] scalarize compares with insertelement operand(s) · ed67f5e7

Sanjay Patel authored Jun 16, 2020

Generalize scalarization (recently enhanced with D80885)
to allow compares as well as binops.
Similar to binops, we are avoiding scalarization of a loaded
value because that could avoid a register transfer in codegen.
This requires 1 extra predicate that I am aware of: we do not
want to scalarize the condition value of a vector select. That
might also invert a transform that we do in instcombine that
prefers a vector condition operand for a vector select.
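
As a rough illustration of the compare case, the sketch below models a four-lane vector with `std::array`. It is not the pass's implementation: `compareVectorForm`, `compareScalarizedForm`, and the choice of a signed less-than compare are assumptions made for the example. It shows that comparing after inserting a scalar into a constant vector gives the same lanes as comparing the constants first and then inserting the scalar compare result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;
using Mask4 = std::array<bool, 4>;

// Vector form: insert X into lane Idx of Base, then compare every lane.
Mask4 compareVectorForm(const Vec4 &Base, int X, std::size_t Idx,
                        const Vec4 &Rhs) {
  assert(Idx < 4 && "lane index out of range");
  Vec4 V = Base;
  V[Idx] = X; // models insertelement
  Mask4 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = V[I] < Rhs[I];
  return R;
}

// Scalarized form: compare the constant vectors (foldable at compile time),
// do one scalar compare, and insert its result into lane Idx.
Mask4 compareScalarizedForm(const Vec4 &Base, int X, std::size_t Idx,
                            const Vec4 &Rhs) {
  assert(Idx < 4 && "lane index out of range");
  Mask4 R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = Base[I] < Rhs[I];
  R[Idx] = X < Rhs[Idx]; // the only work that depends on the scalar X
  return R;
}
```

The scalarized form leaves a single scalar compare as the only operation that depends on the inserted value.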

I think this is the final step in solving PR37463:
https://bugs.llvm.org/show_bug.cgi?id=37463

Differential Revision: https://reviews.llvm.org/D81661

ed67f5e7

Revert "[AssumeBundles] add cannonicalisation to the assume builder" · d7deef12

Tyker authored Jun 16, 2020

This reverts commit 90c50cad.

d7deef12

[AssumeBundles] add cannonicalisation to the assume builder · 90c50cad

Tyker authored Jun 15, 2020

Summary:
This significantly reduces the number of assumes generated without affecting
the preserved information too much. It also significantly improves the
compile-time cost of enable-knowledge-retention.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, asbirlea, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79650

90c50cad

[NFC][OpenMPOpt] Provide function-specific foreachUse. · e099c7b6

sstefan1 authored Jun 16, 2020

e099c7b6

Revert "[IR] Clean up dead instructions after simplifying a conditional branch" · 6fdd5a28

Jay Foad authored Jun 16, 2020

This reverts commit 69bdfb07.

Reverting to investigate https://bugs.llvm.org/show_bug.cgi?id=46343

6fdd5a28

[MSAN] Pass Origin by parameter to __msan_warning functions · b0ffa8be

Gui Andrade authored Jun 15, 2020

Summary:
Normally, the Origin is passed over TLS, which seems like it introduces unnecessary overhead. It's in the (extremely) cold path though, so the only overhead is in code size.

But with eager-checks, calls to __msan_warning functions are extremely common, so this becomes a useful optimization.

This can save ~5% code size.

Reviewers: eugenis, vitalybuka

Reviewed By: eugenis, vitalybuka

Subscribers: hiraditya, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D81700

b0ffa8be

Jun 15, 2020

[DSE,MSSA] Port partial store merging. · 120c0592

Florian Hahn authored Jun 15, 2020

Port partial constant store merging logic to MemorySSA backed DSE. The
heavy lifting is done by the existing helper function. It is used in
a context where we have already ensured that the later instruction can
eliminate the earlier one, if it is a complete overwrite.

120c0592

[DSE] Hoist partial store merging code into function (NFC). · 71a91b98

Florian Hahn authored Jun 15, 2020

Hoist the general logic into a new function, because it can be re-used
by the MemorySSA backed DSE as well.

71a91b98

[DSE,MSSA] Delete instructions after printing it. · 8c61f13a

Florian Hahn authored Jun 15, 2020

Also enables a now-passing test case, that exposed a crash caused by the
wrong order.

8c61f13a

[CostModel] getCFInstrCost in getUserCost. · 2596da31

Sam Parker authored Jun 15, 2020

Have BasicTTI call the base implementation so that both agree on the
default behaviour, with the default being a cost of '1'. This has
required an X86 specific implementation as it seems to be very
reliant on those instructions being free. Changes are also made to
AMDGPU so that their implementations distinguish between cost kinds,
so that the unrolling isn't affected. PowerPC also has its own
implementation to prevent changes to the reg-usage vectorizer test.

The cost model test changes now reflect that ret instructions are not
generally free.

Differential Revision: https://reviews.llvm.org/D79164

2596da31

[NFC] Bail early simplifying unconditional branches · 60da4369

Max Kazantsev authored Jun 15, 2020

60da4369

Revert "Return "[InstCombine] Simplify compare of Phi with constant inputs against a constant"" · 3e39760f

Sam Parker authored Jun 15, 2020

This reverts commit 23291b98.

This caused performance regressions.

3e39760f

Jun 14, 2020

[LoopUnroll] Allow loops with multiple exiting blocks where loop latch · 5225cd43

Whitney Tsang authored Jun 14, 2020

is not necessarily one of them.

Summary: Currently LoopUnrollPass already allows loops with multiple
exiting blocks, but only when the loop latch is one of the
exiting blocks.
When the loop latch is not an exiting block, only a single exiting
block is supported.
When possible, the single loop latch or the single exiting block
terminator is optimized to an unconditional branch in the unrolled loop.

This patch allows loops with multiple exiting blocks even if the loop
latch is not one of them. However, the optimization of exiting block
terminator to unconditional branch is not done when there exists more
than one exiting block.

Reviewer: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour

Reviewed By: efriedma

Subscribers: hiraditya, zzheng, llvm-commits

Tag: LLVM

Differential Revision: https://reviews.llvm.org/D81053

5225cd43

[PassManager] restore early-cse to vector cleanup · 098e48a6

Sanjay Patel authored Jun 14, 2020

As noted in D80236 - the early-cse pass was included here before:
D75145 / rG71a316883d50
But it got moved outside of the "extra" option there, then it
got dropped while adjusting -vector-combine:
rG6438ea45e053
rG57bb4787d72f

So this is restoring the behavior and adding a test to prevent
accidental changes again. I don't see an equivalent option for
the new pass manager.

098e48a6

[InstCombine] reassociate FP diff of sums into sum of diffs · b5fb2695

Sanjay Patel authored Jun 14, 2020

(a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] + b[3]) -->
(a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3])
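
The two forms can be written out as plain C++ to make the reordering explicit. This is only a sketch: the function names are hypothetical, and the scalar adds stand in for the vector ops and reduction intrinsics the pattern match actually sees. Because floating-point addition is not associative, the two forms can round differently in general, so a rewrite like this is normally only legal where reassociation is permitted (for example under fast-math flags).

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Original form: reduce each vector to a sum, then subtract the sums.
float diffOfSums(const Vec4 &A, const Vec4 &B) {
  return (A[0] + A[1] + A[2] + A[3]) - (B[0] + B[1] + B[2] + B[3]);
}

// Reassociated form: subtract lane by lane, then reduce once.
float sumOfDiffs(const Vec4 &A, const Vec4 &B) {
  return (A[0] - B[0]) + (A[1] - B[1]) + (A[2] - B[2]) + (A[3] - B[3]);
}

// Small integer values are exactly representable in float, so both
// evaluation orders give the same result for this particular input.
inline void checkExactCase() {
  const Vec4 A{10.0f, 20.0f, 30.0f, 40.0f};
  const Vec4 B{1.0f, 2.0f, 3.0f, 4.0f};
  assert(diffOfSums(A, B) == sumOfDiffs(A, B));
}
```

The reassociated form needs one vector subtract and one reduction instead of two reductions and a scalar subtract.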

This should be the last step in solving PR43953:
https://bugs.llvm.org/show_bug.cgi?id=43953

We started emitting reduction intrinsics with:
D80867/ rGe50059f6b6b3
So it's a relatively easy pattern match now to re-order those ops.
Also, I have not seen any complaints for the switch to intrinsics
yet, so I'll propose to remove the "experimental" tag from the
intrinsics soon.

Differential Revision: https://reviews.llvm.org/D81491

b5fb2695

[InstCombine] allow undef elements when comparing vector constants for min/max bailout · aeb50448

Sanjay Patel authored Jun 14, 2020

This is a hacky, but low-risk fix to avoid the infinite loop in PR46271:
https://bugs.llvm.org/show_bug.cgi?id=46271

As discussed there, the problem is that FoldOpIntoSelect() can get into a conflict
with a transform that wants to pull a 'not' op through min/max via
SimplifyDemandedVectorElts(). We need to relax our matching of min/max to include
undefined elements in vector constants to avoid that. Alternatively, we could
improve or cripple the demanded elements analysis, but that could create even
more problems.

The likely better, safer alternative will be to create min/max intrinsics, so
we can remove all of the hacks related to min/max matching in instcombine.

Differential Revision: https://reviews.llvm.org/D81698

aeb50448

Jun 13, 2020

[NFCI][AggressiveInstCombiner] Add `STATISTIC()`s for transforms · e987ee63

Roman Lebedev authored Jun 13, 2020

e987ee63

[DSE,MSSA] Fix location order in isOverwrite call. · 97e7147e

Florian Hahn authored Jun 13, 2020

isOverwrite expects the later location as first argument and the earlier
result later. The adjusted call is intended to check whether CC
overwrites DefLoc.

97e7147e

Jun 12, 2020

Temporarily revert "[MemCpyOptimizer] Simplify API of processStore and processMem* functions" · b422fe7d

Eric Christopher authored Jun 12, 2020

as it seems to be causing some internal crashes in AA after
email with the author.

This reverts commit f79e6a88.

b422fe7d

[NFCI] VectorCombine: add statistic for bitcast(shuf()) -> shuf(bitcast()) xform · 7aeb41b3

Roman Lebedev authored Jun 12, 2020

7aeb41b3

[NFC] OpenMPOpt: add a statistic for num of parallel regions deleted · 55eb714a

Roman Lebedev authored Jun 12, 2020

55eb714a

[ASan][NFC] Refactor redzone size calculation · 8af7fa07

Marco Elver authored Jun 12, 2020

Refactor redzone size calculation. This will simplify changing the
redzone size calculation in future.

Note that AddressSanitizer.cpp violates the latest LLVM style guide in
various ways due to capitalized function names. Only code related to the
change here was changed to adhere to the style guide.

No functional change intended.

Reviewed By: andreyknvl

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D81367

8af7fa07

[BreakCritEdges] Add option to opt-out of perserving loop-simplify. · 4495a6b1

Florian Hahn authored Jun 12, 2020

This patch adds a new option to CriticalEdgeSplittingOptions to control
whether loop-simplify form must be preserved. It is then used by GVN to
indicate that loop-simplify form does not have to be preserved.

This fixes a crash exposed by 189efe29.

If the critical edge we are splitting goes from a block inside a loop to
a block outside the loop, splitting the edge will create a new exit
block. As a result, the new block will branch to the original exit
block, which will add a non-loop predecessor, breaking loop-simplify
form. To preserve loop-simplify form, the predecessor blocks of the
original exit are split, but that does not work for blocks with
indirectbr terminators. If preserving loop-simplify form is requested,
bail out before making any changes.

Reviewers: reames, hfinkel, davide, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D81582

4495a6b1

[VPlan] Reject loops without computable backedge taken counts · 3a846d4d

Florian Hahn authored Jun 12, 2020

getOrCreateTripCount is used to generate code for the outer loop, but it
requires a computable backedge taken count. Check that in the VPlan
native path.

Reviewers: Ayal, gilr, rengolin, sguggill

Reviewed By: sguggill

Differential Revision: https://reviews.llvm.org/D81088

3a846d4d

[InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0" · 012909dc

EgorBo authored Jun 12, 2020

Summary:
"X % C == 0" is optimized to "X & C-1 == 0" (where C is a power of two).
However, "X % Y" can also be represented as "X - (X / Y) * Y", so if I rewrite the initial expression as
"X - (X / C) * C == 0", it's not currently optimized to "X & C-1 == 0"; see godbolt: https://godbolt.org/z/KzuXUj

This is my first contribution to LLVM so I hope I didn't mess things up
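
The equivalence described in the summary above can be checked with a small unsigned model. This is a sketch rather than the InstCombine code: the helper names are hypothetical and 32-bit unsigned arithmetic is an assumption. Note that in C++ the mask form needs explicit parentheses, `(X & (C - 1)) == 0`, because `==` binds tighter than `&`.

```cpp
#include <cassert>
#include <cstdint>

// Remainder written out by hand: X - (X / C) * C, which equals X % C.
bool isMultipleViaDivMul(std::uint32_t X, std::uint32_t C) {
  assert(C != 0 && "division by zero");
  return X - (X / C) * C == 0;
}

// Mask form from the commit title. Only equivalent when C is a power of two.
bool isMultipleViaMask(std::uint32_t X, std::uint32_t C) {
  return (X & (C - 1)) == 0;
}

// Hypothetical helper: do both forms agree for every X below Limit?
bool formsAgree(std::uint32_t C, std::uint32_t Limit) {
  for (std::uint32_t X = 0; X < Limit; ++X)
    if (isMultipleViaDivMul(X, C) != isMultipleViaMask(X, C))
      return false;
  return true;
}
```

For example, `formsAgree(8, 1000)` holds, while `formsAgree(6, 100)` does not, since 6 is not a power of two (X = 6 is a multiple of 6 but `6 & 5` is nonzero).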

Reviewers: lebedev.ri, spatel

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79369

012909dc

[JumpThreading] Handle zero !prof branch_weights · 707836ed

Yevgeny Rouban authored Jun 12, 2020

Avoid division by zero in updatePredecessorProfileMetadata().

Reviewers: yamauchi
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81499

707836ed

Verify MemorySSA after all updates. · 519b019a

Alina Sbirlea authored Jun 11, 2020

Verify after completing all updates.
Resolves PR46275.

519b019a

[VectorCombine] remove unused parameters; NFC · 039ff29e

Sanjay Patel authored Jun 11, 2020

039ff29e