Commits · 5f7a5e3bdba3663c8ec9704e28f75f6b2850f9f8 · Lorenzo Albano / LLVM bpEVL

May 13, 2020

[LSR][ARM] Add new TTI hook to mark some LSR chains as profitable · 2668775f

Pierre-vh authored May 05, 2020

This patch adds a new TTI hook to allow targets to tell LSR that
a chain including some instruction is already profitable and
should not be optimized. This patch also adds an implementation
of this TTI hook for ARM so LSR doesn't optimize chains that include
the VCTP intrinsic.

Differential Revision: https://reviews.llvm.org/D79418

2668775f

Recommit #2: "[LV] Induction Variable does not remain scalar under tail-folding." · 9529597c

Sjoerd Meijer authored May 12, 2020

This was reverted because of a miscompilation. At closer inspection, the
problem was actually visible in a changed llvm regression test too. This
one-line follow up fix/recommit will splat the IV, which is what we are trying
to avoid if unnecessary in general, if tail-folding is requested even if all
users are scalar instructions after vectorisation. Because with tail-folding,
the splat IV will be used by the predicate of the masked loads/stores
instructions. The previous version omitted this, which caused the
miscompilation. The original commit message was:

If tail-folding of the scalar remainder loop is applied, the primary induction
variable is splat to a vector and used by the masked load/store vector
instructions, thus the IV does not remain scalar. Because we now mark
that the IV does not remain scalar for these cases, we don't emit the vector IV
if it is not used. Thus, the vectoriser produces less dead code.

Thanks to Ayal Zaks for the direction how to fix this.

9529597c

[StructurizeCFG] Fix region nodes ordering · 897d8ee5

Ehud Katz authored May 13, 2020

This is a reimplementation of the `orderNodes` function, as the old
implementation didn't take into account all cases.

Fix PR41509

Differential Revision: https://reviews.llvm.org/D79037

897d8ee5

[BrachProbablityInfo] Set edge probabilities at once. NFC. · eef95f27

Yevgeny Rouban authored May 13, 2020

Hide the method that allows setting probability for particular
edge and introduce a public method that sets probabilities for
all outgoing edges at once.
Setting individual edge probability is error prone. More over
it is difficult to check that the total probability is 1.0
because there is no easy way to know when the user finished
setting all the probabilities.

Reviewers: yamauchi, ebrevnov
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79396

eef95f27

[LoopReroll] Fix rerolling loop with use outside the loop · 272bc25b

KAWASHIMA Takahiro authored May 07, 2020

Fixes PR41696

The loop-reroll pass generates an invalid IR (or its assertion
fails in debug build) if values of the base instruction and
other root instructions (terms used in the loop-reroll pass)
are used outside the loop block. See IRs written in PR41696
as examples.

The current implementation of the loop-reroll pass can reroll
only loops that don't have values that are used outside the
loop, except reduced values (the last values of reduction chains).
This is described in the comment of the `LoopReroll::reroll`
function.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1600

This is checked in the `LoopReroll::DAGRootTracker::validate`
function.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1393

However, the base instruction and other root instructions skip
this check in the validation loop.
https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1229

Moving the check in front of the skip is the logically simplest
fix. However, inserting the check in an earlier stage is better
in terms of compilation time of unrerollable loops. This fix
inserts the check for the base instruction into the function
to validate possible base/root instructions. Check for other
root instructions is unnecessary because they don't match any
base instructions if they have uses outside the loop.

Differential Revision: https://reviews.llvm.org/D79549

272bc25b

[Attributor][FIX] Stabilize the state of AAReturnedValues each update · af48351c

Johannes Doerfert authored May 12, 2020

For AAReturnedValues we treated new and existing information differently
in the updateImpl. Only the latter was properly analyzed and
categorized. The former was thought to be analyzed in the subsequent
update. Since the Attributor does not support "self-updates" we need to
make sure the state is "stable" after each updateImpl invocation. That
is, if the surrounding information does not change, the state is valid.
Now we make sure all return values have been handled and properly
categorized each iteration. We might not update again if we have not
requested a non-fix attribute so we cannot "wait" for the next update to
analyze a new return value.

Bug reported by @sdmitriev.

af48351c

Add nomerge function attribute to supress tail merge optimization in simplifyCFG · cb22ab74

Zequan Wu authored May 12, 2020

We want to add a way to avoid merging identical calls so as to keep the
separate debug-information for those calls. There is also an asan
usecase where having this attribute would be beneficial to avoid
alternative work-arounds.

Here is the link to the feature request:
https://bugs.llvm.org/show_bug.cgi?id=42783.

`nomerge` is different from `noline`. `noinline` prevents function from
inlining at callsites, but `nomerge` prevents multiple identical calls
from being merged into one.

This patch adds `nomerge` to disable the optimization in IR level. A
followup patch will be needed to let backend understands `nomerge` and
avoid tail merge at backend.

Reviewed By: asbirlea, rnk

Differential Revision: https://reviews.llvm.org/D78659

cb22ab74

May 12, 2020

[Attributor] Fixup block addresses after rewriting function signature · 32f5ee83

Sergey Dmitriev authored May 12, 2020

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79801

32f5ee83

[ValueTracking] Let propagatesPoison support binops/unaryops/cast/etc. · e5f602d8

Juneyoung Lee authored Apr 21, 2020

Summary:
This patch makes propagatesPoison be more accurate by returning true on
more bin ops/unary ops/casts/etc.

The changed test in ScalarEvolution/nsw.ll was introduced by
https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 .
IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has
no-overflow flags even if the loop isn't in the wanted form.
It becomes more accurate with this patch, so think this is okay.

Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy

Reviewed By: spatel, nikic

Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78615

e5f602d8

[gcov] Default coverage version to '408*' and delete CC1 option -coverage-exit-block-before-body · b56b1e67

Fangrui Song authored May 11, 2020

gcov 4.8 (r189778) moved the exit block from the last to the second.
The .gcda format is compatible with 4.7 but

* decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the
  exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings,
  and print wrong `" returned %s` for branch statistics (-b).
* decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues.

Also, rename "return block" to "exit block" because the latter is the
appropriate term.

b56b1e67

Fix typos encountered while working on pass pipeline for O1. · a42e53cc
Eric Christopher authored May 12, 2020

a42e53cc

May 11, 2020

[Attributor][FIX] Disallow function signature rewrite for casted calls · 8d94d3c3

Johannes Doerfert authored May 11, 2020

We will now ensure ensure the return type of called function is the type
of all call sites we are going to rewrite. This avoids a problem
partially fixed by D79680. The part that was not covered is a use of
this "weird" casted call site (see `@func3` in `misc_crash.ll`).

misc_crash.ll checks are auto-generated now.

8d94d3c3

[Attributor] Make AAIsDead dependences optional to prevent top state · c115a78f

Johannes Doerfert authored May 11, 2020

We should never give up on AAIsDead as it guards other AAs from
unreachable code (in which SSA properties are meaningless). We did
however use required dependences on some queries in AAIsDead which
caused us to invalidate AAIsDead if the queried AA got invalidated.
We now use optional dependences instead. The bug that exposed this is
added to the liveness.ll test and other test changes show the impact.

Bug report by @sdmitriev.

c115a78f

[Attributor] Force update of "newly live" abstract attributes · c86fd333

Johannes Doerfert authored May 11, 2020

During an update of AAIsDead, new instructions become live. If we query
information from them, the result is often just the initial state, e.g.,
for call site `noreturn` and `nounwind`. We will now trigger an update
for cached attributes during the AAIsDead update, though other AAs might
later use the same API.

c86fd333

[VectorCombine] account for extra uses in scalarization cost · 5f730b64
Sanjay Patel authored May 11, 2020
```
Follow-up to D79452.
Mimics the extra use cost formula for the inverse transform with extracts.
```
5f730b64

[llvm][NFC] Move inlining decision-related APIs in InliningAdvisor. · 48fa355e

Mircea Trofin authored May 07, 2020

Summary: Factoring out in preparation to https://reviews.llvm.org/D79042

Reviewers: dblaikie, davidxl

Subscribers: mgorny, eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79613

48fa355e

[Attributor] Fix for a crash on RAUW when rewriting function signature · 3df40007

Sergey Dmitriev authored May 11, 2020

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: uenoku

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79680

3df40007

[AssumeBundles] fix crashes · 78d85c20

Tyker authored May 11, 2020

Summary:
this patch fixe crash/asserts found in the test-suite.
the AssumeptionCache cannot be assumed to have all assumes contrary to what i tought.
prevent generation of information for terminators, because this can create broken IR in transfromation where we insert the new terminator before removing the old one.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79458

78d85c20

[NFC][DwarfDebug] Add test for variables with a single location which · da100de0

OCHyams authored May 07, 2020

don't span their entire scope.

The previous commit (6d1c40c1) is an older version of the test.

Reviewed By: aprantl, vsk

Differential Revision: https://reviews.llvm.org/D79573

da100de0

Remove an unused Module param · 44e5aaf9

Xun Li authored May 10, 2020

Summary:
In D65848 the function getFuncNameInModule was refactored to no longer use module.
This diff removes the parameter and rename the function name to avoid confusion.

Reviewers: wenlei, wmi, davidxl

Reviewed By: wenlei

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79310

44e5aaf9

[Attributor] Merge the query set into AbstractAttribute · 3a8740bd

Johannes Doerfert authored Apr 16, 2020

The old QuerriedAAs contained two vectors, one for required one for
optional dependences (=queries). We now use a single vector and encode
the kind directly in the pointer.

This reduces memory consumption and makes the connection between
abstract attributes and their dependences clearer.

No functional change is intended, changes in the test are due to
different order in the query map. Neither the order before nor now is in
any way special.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 543734 (329735/s)
temporary memory allocations: 105895 (64217/s)
peak heap memory consumption: 19.19MB
peak RSS (including heaptrack overhead): 102.26MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 513292 (341511/s)
temporary memory allocations: 106028 (70544/s)
peak heap memory consumption: 13.35MB
peak RSS (including heaptrack overhead): 95.64MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: -30442 (208506/s)
temporary memory allocations: 133 (-910/s)
peak heap memory consumption: -5.84MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D78729

3a8740bd

[Attributor][FIX] Carefully handle/ignore/forget `argmemonly` · 5e06b251

Johannes Doerfert authored May 09, 2020

When we have an existing `argmemonly` or `inaccessiblememorargmemonly`
we used to "know" that information. However, interprocedural constant
propagation can invalidate these attributes. We now ignore and remove
these attributes for internal functions (which may be affected by IP
constant propagation), if we are deriving new attributes for the
function.

5e06b251

[Attributor] Use "simplify to constant" in genericValueTraversal · 713ee3aa

Johannes Doerfert authored May 09, 2020

As we replace values with constants interprocedurally, we also need to
do this "look-through" step during the generic value traversal or we
would derive properties from replaced values. While this is often not
problematic, it is when we use the "kind" of a value for reasoning,
e.g., accesses to arguments allow `argmemonly`.

713ee3aa

[Attributor] Ignore illegal accesses to `null` · 513ac6e9

Johannes Doerfert authored May 10, 2020

When we categorize a pointer value we bailed at `null` before. If we
know `null` is not a valid memory location we can ignore it as there
won't be an access at all.

513ac6e9

[Attributor] Use existing helpers to determine IR facts · 31c03b92

Johannes Doerfert authored May 10, 2020

We now use getPointerDereferenceableBytes to determine `nonnull` and
`dereferenceable` facts from the IR. We also use getPointerAlignment in
AAAlign for the same reason. The latter can interfere with callbacks so
we do restrict it to non-function-pointers for now.

31c03b92

[Attributor][NFC] Clang format Attributor*.cpp · a9ee8b49
Johannes Doerfert authored May 09, 2020

a9ee8b49

[gcov] Default coverage version to '407*' and delete CC1 option -coverage-cfg-checksum · 25544ce2

Fangrui Song authored May 10, 2020

Defaulting to -Xclang -coverage-version='407*' makes .gcno/.gcda
compatible with gcov [4.7,8)

In addition, delete clang::CodeGenOptionsBase::CoverageExtraChecksum and GCOVOptions::UseCfgChecksum.
We can infer the information from the version.

With this change, .gcda files produced by `clang --coverage a.o` linked executable can be read by gcov 4.7~7.
We don't need other -Xclang -coverage* options.
There may be a mismatching version warning, though.

(Note, GCC r173147 "split checksum into cfg checksum and line checksum"
 made gcov 4.7 incompatible with previous versions.)

25544ce2

May 10, 2020

[gcov] Delete CC1 option -coverage-no-function-names-in-data · 13a633b4

Fangrui Song authored May 10, 2020

rL144865 incorrectly wrote function names for GCOV_TAG_FUNCTION
(this might be part of the reasons the header says
"We emit files in a corrupt version of GCOV's "gcda" file format").

rL176173 and rL177475 realized the problem and introduced -coverage-no-function-names-in-data
to work around the issue. (However, the description is wrong.
libgcov never writes function names, even before GCC 4.2).

In reality, the linker command line has to look like:

clang --coverage -Xclang -coverage-version='407*' -Xclang -coverage-cfg-checksum -Xclang -coverage-no-function-names-in-data

Failing to pass -coverage-no-function-names-in-data can make gcov 4.7~7
either produce wrong results (for one gcov-4.9 program, I see "No executable lines")
or segfault (gcov-7).
(gcov-8 uses an incompatible format.)

This patch deletes -coverage-no-function-names-in-data and the related
function names support from libclang_rt.profile

13a633b4

[AssumeBundles] Remove non-determinisme from assume builder · 5957e058

Tyker authored May 10, 2020

Summary:
The assume builder was non-deterministic when working on unamed values.
this patch fixes this.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, mgrang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78616

5957e058

[AssumeBundles] Prevent generation of some redundant assumes · 821a0f23

Tyker authored May 10, 2020

Summary: with this patch the assume salvageKnowledge will not generate assume if all knowledge is already available in an assume with valid context. assume bulider can also in some cases update an existing assume with better information.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78014

821a0f23

[LAA] Move runtime-check generation to Transforms/Utils/loopUtils (NFC) · 8528186b

Florian Hahn authored May 10, 2020

Currently LAA's uses of ScalarEvolutionExpander blocks moving the
expander from Analysis to Transforms. Conceptually the expander does not
fit into Analysis (it is only used for code generation) and
runtime-check generation also seems to be better suited as a
transformation utility.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D78460

8528186b

[InstCombine] canonicalize bitcast after insertelement into undef · 856cc60b

Sanjay Patel authored May 10, 2020

We have a transform in the opposite direction only for the x86 MMX type,
Other types are not handled either way before this patch.

The motivating case from PR45748:
https://bugs.llvm.org/show_bug.cgi?id=45748
...is the last test diff. In that example, we are triggering an existing
bitcast transform, so we reduce the number of casts, and that should give
us the ideal x86 codegen.

Differential Revision: https://reviews.llvm.org/D79171

856cc60b

[InstCombine] matchOrConcat - match BITREVERSE · bab44a69

Simon Pilgrim authored May 10, 2020

Fold or(zext(bitreverse(x)),shl(zext(bitreverse(y)),bw/2) -> bitreverse(or(zext(x),shl(zext(y),bw/2))

Practically this is the same as the BSWAP pattern so we might as well handle it.

bab44a69

Recommit "[LAA] Remove one addRuntimeChecks function (NFC)." · 96c63f54

Florian Hahn authored May 10, 2020

The failing assertion has been fixed and the problematic test case has
been added.

This reverts the revert commit fc44617f.

96c63f54

Revert "[LAA] Remove one addRuntimeChecks function (NFC)." · fc44617f

Florian Hahn authored May 10, 2020

This reverts commit c28114c8.

This causes some bots to fail:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-android/builds/30596/steps/build%20android%2Faarch64/logs/stdio

fc44617f

[LAA] Remove one addRuntimeChecks function (NFC). · c28114c8

Florian Hahn authored May 10, 2020

In order to reduce the API surface area (preparation for D78460), remove
a addRuntimeChecks() function and do the additional check in the single
caller.

Reviewers: Ayal, anemet

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D79679

c28114c8

[InstCombine] fold fpext into exact integer-to-FP cast · a62533c2

Sanjay Patel authored May 10, 2020

We can combine a floating-point extension cast with a conversion
from integer if we know the earlier cast is exact.

This is an optimization suggested in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c19

However, this patch does not change the example suggested there.
This patch only uses the existing analysis to handle cases where
the integer source value magnitude is narrower than the
intermediate FP mantissa (guarantees that the conversion to FP is
exact). Follow-up patches to the analysis function can enable
more cases.

Differential Revision: https://reviews.llvm.org/D79116

a62533c2

Add missing pass initialization · 73a9b7de

Arthur Eubanks authored May 08, 2020

Summary: This was preventing MemorySanitizerLegacyPass from appearing in --print-after-all.

Reviewers: vitalybuka

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79661

73a9b7de

[sanitizer] Enable whitelist/blacklist in new PM · a72b9dfd

Jinsong Ji authored May 10, 2020

https://reviews.llvm.org/D63616 added `-fsanitize-coverage-whitelist`
and `-fsanitize-coverage-blacklist` for clang.

However, it was done only for legacy pass manager.
This patch enable it for new pass manager as well.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D79653

a72b9dfd

May 09, 2020

InstCombine: Broaden copy-constant-to-alloca optimization · 16295d52

Matt Arsenault authored May 08, 2020

Consider any constant memory type, not just global constants. AMDGPU
kernel parameters are effectively global constants, but appear as
either reads from an intrinsic derived pointer or function argument.

16295d52