Commits · a05f1e5ae4e0d0fc789a4caeff108fe4a50dc652 · Lorenzo Albano / LLVM bpEVL

Jun 01, 2020

[PGO] Improve the working set size heuristics under the partial sample PGO. · 6c27c61d

Hiroshi Yamauchi authored Apr 08, 2020

Summary:
The working set size heuristics (ProfileSummaryInfo::hasHugeWorkingSetSize)
under the partial sample PGO may not be accurate because the profile is partial
and the number of hot profile counters in the ProfileSummary may not reflect the
actual working set size of the program being compiled.

To improve this, the (approximated) ratio of the the number of profile counters
of the program being compiled to the number of profile counters in the partial
sample profile is computed (which is called the partial profile ratio) and the
working set size of the profile is scaled by this ratio to reflect the working
set size of the program being compiled and used for the working set size
heuristics.

The partial profile ratio is approximated based on the number of the basic
blocks in the program and the NumCounts field in the ProfileSummary and computed
through the thin LTO indexing. This means that there is the limitation that the
scaled working set size is available to the thin LTO post link passes only.

Reviewers: davidxl

Subscribers: mgorny, eraman, hiraditya, steven_wu, dexonsmith, arphaman, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79831

6c27c61d

May 29, 2020
- Preserve DbgLoc when DeadArgumentElimination rewrites a 'ret'. · 8c2d2d97
  Paul Robinson authored May 28, 2020
```
Fixes PR46002.
```
  8c2d2d97
May 27, 2020

[llvm]NFC] Simplify ProfileSummaryInfo state transitions · fa3b5871

Mircea Trofin authored May 13, 2020

ProfileSummaryInfo is updated seldom, as result of very specific
triggers. This patch clearly demarcates state updates from read-only uses.
This, arguably, improves readability and maintainability.

fa3b5871

May 26, 2020
- [Transforms] Check validity of profile reader before invoking it · c1c9eb0a
  Yi Kong authored May 26, 2020
```
Although an invalid sampling profile would fail the compilation anyway,
this avoids crashing the compiler.
```
  c1c9eb0a
May 25, 2020
- [Transforms] Fix typos. NFC · bc93c2d7
  Marek Kurdej authored May 25, 2020
  
  bc93c2d7
May 24, 2020

[Pass Manager] remove EarlyCSE as clean-up for VectorCombine · 57bb4787

Sanjay Patel authored May 24, 2020

EarlyCSE was added with D75145, but the motivating test is
not regressed by removing the extra pass now. That might be
because VectorCombine altered the way it processes instructions,
or it might be from (re)moving VectorCombine in the pipeline.

The extra round of EarlyCSE appears to cost approximately
0.26% in compile-time as discussed in D80236, so we need some
evidence to justify its inclusion here, but we do not have
that (yet).

I suspect that between SLP and VectorCombine, we are creating
patterns that InstCombine and/or codegen are not prepared for,
but we will need to reduce those examples and include them as
PhaseOrdering and/or test-suite benchmarks.

57bb4787

May 23, 2020

[Align] Remove operations on MaybeAlign that asserted that it had a defined value. · 7392820f

Craig Topper authored May 22, 2020

If the caller needs to reponsible for making sure the MaybeAlign
has a value, then we should just make the caller convert it to an Align
with operator*.

I explicitly deleted the relational comparison operators that
were being inherited from Optional. It's unclear what the meaning
of two MaybeAligns were one is defined and the other isn't
should be. So make the caller reponsible for defining the behavior.

I left the ==/!= operators from Optional. But now that exposed a
weird quirk that ==/!= between Align and MaybeAlign required the
MaybeAlign to be defined. But now we use the operator== from
Optional that takes an Optional and the Value.

Differential Revision: https://reviews.llvm.org/D80455

7392820f

May 22, 2020

[VectorCombine] position pass after SLP in the optimization pipeline rather than before · 6438ea45

Sanjay Patel authored May 22, 2020

There are 2 known problem patterns shown in the test diffs here:
vector horizontal ops (an x86 specialization) and vector reductions.

SLP has greater ability to match and fold those than vector-combine,
so let SLP have first chance at that.

This is a quick fix while we continue to improve vector-combine and
possibly canonicalize to reduction intrinsics.

In the longer term, we should improve matching of these patterns
because if they were created in the "bad" forms shown here, then we
would miss optimizing them.

I'm not sure what is happening with alias analysis on the addsub test.
The old pass manager now shows an extra line for that, and we see an
improvement that comes from SLP vectorizing a store. I don't know
what's missing with the new pass manager to make that happen.
Strangely, I can't reproduce the behavior if I compile from C++ with
clang and invoke the new PM with "-fexperimental-new-pass-manager".

Differential Revision: https://reviews.llvm.org/D80236

6438ea45

May 21, 2020

Make Value::getPointerAlignment() return an Align, not a MaybeAlign. · f26bdb53

Eli Friedman authored May 16, 2020

If we don't know anything about the alignment of a pointer, Align(1) is
still correct: all pointers are at least 1-byte aligned.

Included in this patch is a bugfix for an issue discovered during this
cleanup: pointers with "dereferenceable" attributes/metadata were
assumed to be aligned according to the type of the pointer.  This
wasn't intentional, as far as I can tell, so Loads.cpp was fixed to
stop making this assumption. Frontends may need to be updated.  I
updated clang's handling of C++ references, and added a release note for
this.

Differential Revision: https://reviews.llvm.org/D80072

f26bdb53

May 20, 2020

Reland [X86] Codegen for preallocated · 8a887556

Arthur Eubanks authored Mar 16, 2020

See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.

The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.

The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer fo the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame
pointer.

Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).

Aside from the tests added here, I checked that this codegen produces
correct code for something like

```
struct A {
        A();
        A(A&&);
        ~A();
};

void bar() {
        foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.

Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77689

8a887556

Revert "[X86] Codegen for preallocated" · b8cbff51
Arthur Eubanks authored May 20, 2020
```
This reverts commit 810567dc.

Some tests are unexpectedly passing
```
b8cbff51

[X86] Codegen for preallocated · 810567dc

Arthur Eubanks authored Mar 16, 2020

See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.

In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.

This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.

The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.

The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer fo the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.

Force any function containing a preallocated call to use the frame
pointer.

Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.

Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).

Aside from the tests added here, I checked that this codegen produces
correct code for something like

```
struct A {
        A();
        A(A&&);
        ~A();
};

void bar() {
        foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```

by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77689

810567dc

May 18, 2020
- [llvm][NFC] Fixed non-compliant style in InlineAdvisor.h · 691980eb
  Mircea Trofin authored May 18, 2020
```
Changed OnPass{Entry|Exit} -> onPass{Entry|Exit}

Also fixed a small typo in a comment.
```
  691980eb
May 16, 2020

AllocaInst should store Align instead of MaybeAlign. · 4f04db4b

Eli Friedman authored May 15, 2020

Along the lines of D77454 and D79968. Unlike loads and stores, the
default alignment is getPrefTypeAlign, to match the existing handling in
various places, including SelectionDAG and InstCombine.

Differential Revision: https://reviews.llvm.org/D80044

4f04db4b

May 15, 2020

Revert "Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs"" · 08e2386d

Mircea Trofin authored May 14, 2020

This reverts commit 454de99a.

The problem was that one of the ctor arguments of CallAnalyzer was left
to be const std::function<>&. A function_ref was passed for it, and then
the ctor stored the value in a function_ref field. So a std::function<>
would be created as a temporary, and not survive past the ctor
invocation, while the field would.

Tested locally by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild

Original Differential Revision: https://reviews.llvm.org/D79917

08e2386d

Revert "[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs" · 454de99a
Mircea Trofin authored May 14, 2020
```
This reverts commit 767db5be.
```
454de99a

[llvm][NFC] Cleanup uses of std::function in Inlining-related APIs · 767db5be

Mircea Trofin authored May 13, 2020

Summary:
Replacing uses of std::function pointers or refs, or Optional, to
function_ref, since the usage pattern allows that. If the function is
optional, using a default parameter value (nullptr). This led to a few
parameter reshufles, to push all optionals to the end of the parameter
list.

Reviewers: davidxl, dblaikie

Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79917

767db5be

May 14, 2020

[Attributor] Improve the alignment of the loads · 425333c2

Omar Ahmed authored May 13, 2020

This patch introduces an improvement in the Alignment of the loads
generated in createReplacementValues() by querying AAAlign attribute for
the best Alignment for the base.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76550

425333c2

May 13, 2020

[Attributor] Use AAValueConstantRange to infer dereferencability. · e5780776
Kuter Dinel authored May 13, 2020
```
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D76208
```
e5780776
Remove unused Debugging variable. · d6e3e55c
Eric Christopher authored May 13, 2020

d6e3e55c

[llvm] Add interface to drive inlining decision using ML model · d6695e18

Mircea Trofin authored Apr 28, 2020

Summary:

This change introduces InliningAdvisor (and related APIs), the interface
that abstracts decision making away from the inlining pass. We will use
this interface to delegate decision making to a trained ML model,
subsequently (see referenced RFC).

RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140763.html

Reviewers: davidxl, eraman, dblaikie

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79042

d6695e18

[NewPassManager] Add assertions when getting statefull cached analysis. · bd541b21

Alina Sbirlea authored Jan 14, 2020

Summary:
Analyses that are statefull should not be retrieved through a proxy from
an outer IR unit, as these analyses are only invalidated at the end of
the inner IR unit manager.
This patch disallows getting the outer manager and provides an API to
get a cached analysis through the proxy. If the analysis is not
stateless, the call to getCachedResult will assert.

Reviewers: chandlerc

Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72893

bd541b21

OpenMPOpt Remarks Support · 4d4ea9ac

Huber, Joseph authored May 13, 2020

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D79359

4d4ea9ac

[Attributor][FIX] Stabilize the state of AAReturnedValues each update · af48351c

Johannes Doerfert authored May 12, 2020

For AAReturnedValues we treated new and existing information differently
in the updateImpl. Only the latter was properly analyzed and
categorized. The former was thought to be analyzed in the subsequent
update. Since the Attributor does not support "self-updates" we need to
make sure the state is "stable" after each updateImpl invocation. That
is, if the surrounding information does not change, the state is valid.
Now we make sure all return values have been handled and properly
categorized each iteration. We might not update again if we have not
requested a non-fix attribute so we cannot "wait" for the next update to
analyze a new return value.

Bug reported by @sdmitriev.

af48351c

May 12, 2020

[Attributor] Fixup block addresses after rewriting function signature · 32f5ee83

Sergey Dmitriev authored May 12, 2020

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: jdoerfert

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79801

32f5ee83

May 11, 2020

[Attributor][FIX] Disallow function signature rewrite for casted calls · 8d94d3c3

Johannes Doerfert authored May 11, 2020

We will now ensure ensure the return type of called function is the type
of all call sites we are going to rewrite. This avoids a problem
partially fixed by D79680. The part that was not covered is a use of
this "weird" casted call site (see `@func3` in `misc_crash.ll`).

misc_crash.ll checks are auto-generated now.

8d94d3c3

[Attributor] Make AAIsDead dependences optional to prevent top state · c115a78f

Johannes Doerfert authored May 11, 2020

We should never give up on AAIsDead as it guards other AAs from
unreachable code (in which SSA properties are meaningless). We did
however use required dependences on some queries in AAIsDead which
caused us to invalidate AAIsDead if the queried AA got invalidated.
We now use optional dependences instead. The bug that exposed this is
added to the liveness.ll test and other test changes show the impact.

Bug report by @sdmitriev.

c115a78f

[Attributor] Force update of "newly live" abstract attributes · c86fd333

Johannes Doerfert authored May 11, 2020

During an update of AAIsDead, new instructions become live. If we query
information from them, the result is often just the initial state, e.g.,
for call site `noreturn` and `nounwind`. We will now trigger an update
for cached attributes during the AAIsDead update, though other AAs might
later use the same API.

c86fd333

[llvm][NFC] Move inlining decision-related APIs in InliningAdvisor. · 48fa355e

Mircea Trofin authored May 07, 2020

Summary: Factoring out in preparation to https://reviews.llvm.org/D79042

Reviewers: dblaikie, davidxl

Subscribers: mgorny, eraman, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79613

48fa355e

[Attributor] Fix for a crash on RAUW when rewriting function signature · 3df40007

Sergey Dmitriev authored May 11, 2020

Reviewers: jdoerfert, sstefan1, uenoku

Reviewed By: uenoku

Subscribers: hiraditya, uenoku, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79680

3df40007

[NFC][DwarfDebug] Add test for variables with a single location which · da100de0

OCHyams authored May 07, 2020

don't span their entire scope.

The previous commit (6d1c40c1) is an older version of the test.

Reviewed By: aprantl, vsk

Differential Revision: https://reviews.llvm.org/D79573

da100de0

Remove an unused Module param · 44e5aaf9

Xun Li authored May 10, 2020

Summary:
In D65848 the function getFuncNameInModule was refactored to no longer use module.
This diff removes the parameter and rename the function name to avoid confusion.

Reviewers: wenlei, wmi, davidxl

Reviewed By: wenlei

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79310

44e5aaf9

[Attributor] Merge the query set into AbstractAttribute · 3a8740bd

Johannes Doerfert authored Apr 16, 2020

The old QuerriedAAs contained two vectors, one for required one for
optional dependences (=queries). We now use a single vector and encode
the kind directly in the pointer.

This reduces memory consumption and makes the connection between
abstract attributes and their dependences clearer.

No functional change is intended, changes in the test are due to
different order in the query map. Neither the order before nor now is in
any way special.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 543734 (329735/s)
temporary memory allocations: 105895 (64217/s)
peak heap memory consumption: 19.19MB
peak RSS (including heaptrack overhead): 102.26MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 513292 (341511/s)
temporary memory allocations: 106028 (70544/s)
peak heap memory consumption: 13.35MB
peak RSS (including heaptrack overhead): 95.64MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: -30442 (208506/s)
temporary memory allocations: 133 (-910/s)
peak heap memory consumption: -5.84MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

Reviewed By: uenoku

Differential Revision: https://reviews.llvm.org/D78729

3a8740bd

[Attributor][FIX] Carefully handle/ignore/forget `argmemonly` · 5e06b251

Johannes Doerfert authored May 09, 2020

When we have an existing `argmemonly` or `inaccessiblememorargmemonly`
we used to "know" that information. However, interprocedural constant
propagation can invalidate these attributes. We now ignore and remove
these attributes for internal functions (which may be affected by IP
constant propagation), if we are deriving new attributes for the
function.

5e06b251

[Attributor] Use "simplify to constant" in genericValueTraversal · 713ee3aa

Johannes Doerfert authored May 09, 2020

As we replace values with constants interprocedurally, we also need to
do this "look-through" step during the generic value traversal or we
would derive properties from replaced values. While this is often not
problematic, it is when we use the "kind" of a value for reasoning,
e.g., accesses to arguments allow `argmemonly`.

713ee3aa

[Attributor] Ignore illegal accesses to `null` · 513ac6e9

Johannes Doerfert authored May 10, 2020

When we categorize a pointer value we bailed at `null` before. If we
know `null` is not a valid memory location we can ignore it as there
won't be an access at all.

513ac6e9

[Attributor] Use existing helpers to determine IR facts · 31c03b92

Johannes Doerfert authored May 10, 2020

We now use getPointerDereferenceableBytes to determine `nonnull` and
`dereferenceable` facts from the IR. We also use getPointerAlignment in
AAAlign for the same reason. The latter can interfere with callbacks so
we do restrict it to non-function-pointers for now.

31c03b92

[Attributor][NFC] Clang format Attributor*.cpp · a9ee8b49
Johannes Doerfert authored May 09, 2020

a9ee8b49

May 08, 2020

[Attributor][FIX] Record dependences for assumed dead abstract attributes · edf03914

Johannes Doerfert authored May 07, 2020

In a recent patch we introduced a problem with abstract attributes that
were assumed dead at some point. Since `Attributor::updateAA` was
introduced in 95e0d28b, we did not
remember the dependence on the liveness AA when an abstract attribute
was assumed dead and therefore not updated.

Explicit reproducer added in liveness.ll.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 509242 (345483/s)
temporary memory allocations: 98666 (66937/s)
peak heap memory consumption: 18.60MB
peak RSS (including heaptrack overhead): 103.29MB
total memory leaked: 269.10KB
```

After:
```
calls to allocation functions: 529332 (355494/s)
temporary memory allocations: 102107 (68574/s)
peak heap memory consumption: 19.40MB
peak RSS (including heaptrack overhead): 102.79MB
total memory leaked: 269.10KB
```

Difference:
```
calls to allocation functions: 20090 (1339333/s)
temporary memory allocations: 3441 (229400/s)
peak heap memory consumption: 801.45KB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

edf03914

[Attributor] Mark dependence as optional · 675334da
Johannes Doerfert authored May 07, 2020

675334da