- Jan 22, 2017
-
Chandler Carruth authored
trace its behavior. llvm-svn: 292756
-
- Jan 21, 2017
-
Anmol P. Paralkar authored
Summary:
Under option -mergefunc-preserve-debug-info we:
- Do not create a new function for a thunk.
- Retain the debug info for a thunk's parameters (and associated instructions for the debug info) from the entry block. Note: -debug will display the algorithm at work.
- Create debug info for the call (to the shared implementation) made by a thunk and its return value.
- Erase the rest of the function, retaining the (minimally sized) entry block to create a thunk.
- Preserve a thunk's call site to point to the thunk even when both occur within the same translation unit, to aid debuggability. Note that this behaviour differs from the underlying -mergefunc implementation, which modifies the thunk's call site to point to the shared implementation when both occur within the same translation unit.

Reviewers: echristo, eeckstein, dblaikie, aprantl, friss
Reviewed By: aprantl
Subscribers: davide, fhahn, jfb, mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D28075
llvm-svn: 292702
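To make the thunk shape concrete, here is a hand-written source-level sketch (shared_impl and thunk are made-up names; this is an illustration, not output of the pass): the thunk keeps its own symbol and parameters, but its body reduces to a forwarding call, and its call sites are left pointing at it.

```cpp
#include <cstdio>

// Hypothetical "shared implementation": the body that two structurally
// identical functions were merged into.
static int shared_impl(int x, int y) { return x * y + 1; }

// A thunk retains its own symbol (and, with -mergefunc-preserve-debug-info,
// the debug info for its parameters), but its body is reduced to a single
// forwarding call plus a return.
static int thunk(int a, int b) { return shared_impl(a, b); }

int main() {
  // Call sites keep pointing at the thunk, which aids debuggability.
  std::printf("%d\n", thunk(6, 7));
  return 0;
}
```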
-
Peter Collingbourne authored
llvm-svn: 292700
-
Peter Collingbourne authored
LowerTypeTests: Simplify; always create SizeM1 with type IntPtrTy, move initialization out of if statement. llvm-svn: 292674
-
- Jan 20, 2017
-
Dehao Chen authored
Summary: This patch adds metadata for indirect call promotion in the sample profile loader.

Reviewers: xur, davidxl, dnovillo
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28923
llvm-svn: 292672
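For context, indirect call promotion turns a hot indirect call into a guarded direct call that later passes can inline; a source-level sketch of the transformation shape (hot_target and the guard are illustrative, not taken from the patch):

```cpp
#include <cstdio>

static int hot_target(int x) { return x + 1; }
static int cold_target(int x) { return x - 1; }

// Before promotion: a plain indirect call through a function pointer.
static int call_indirect(int (*fp)(int), int v) { return fp(v); }

// After promotion: if profile metadata says fp is almost always hot_target,
// emit a compare-and-direct-call, falling back to the indirect call otherwise.
static int call_promoted(int (*fp)(int), int v) {
  if (fp == &hot_target)   // guard inserted by the promotion
    return hot_target(v);  // direct call, now visible to the inliner
  return fp(v);            // fallback for other targets
}

int main() {
  std::printf("%d %d\n", call_indirect(cold_target, 10),
              call_promoted(hot_target, 10));
  return 0;
}
```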
-
Easwaran Raman authored
This adds the following to the new PM based inliner in PGO mode:
* Use block frequency analysis to derive callsite's profile count and use that to adjust thresholds of hot and cold callsites.
* Incrementally update the BFI of the caller after a callee gets inlined into it. This incremental update is only within an invocation of the run method - BFI is not preserved across calls to run. Update the function entry count of the callee after inlining it into a caller.
* I've tuned the thresholds for the hot and cold callsites using a hacked up version of the old inliner that explicitly computes BFI on a set of internal benchmarks and spec. Once the new PM based pipeline stabilizes (IIRC Chandler mentioned there are known issues) I'll benchmark this again and adjust the thresholds if required.

Inliner PGO support.
Differential revision: https://reviews.llvm.org/D28331
llvm-svn: 292666
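A rough sketch of the callsite-count-based threshold adjustment described above (the cutoffs and multipliers are invented for illustration; they are not the tuned values from this patch):

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical profile summary cutoffs: counts above HotCountThreshold are
// "hot", counts at or below ColdCountThreshold are "cold".
struct ProfileSummary {
  uint64_t HotCountThreshold = 10000;
  uint64_t ColdCountThreshold = 10;
};

// Derive an inline threshold for a callsite from its profile count, which in
// turn would come from the caller's entry count scaled by the callsite's
// block frequency (BFI).
static int callsiteThreshold(const ProfileSummary &PS, uint64_t CallsiteCount,
                             int DefaultThreshold) {
  if (CallsiteCount >= PS.HotCountThreshold)
    return DefaultThreshold * 3;  // be more aggressive on hot callsites
  if (CallsiteCount <= PS.ColdCountThreshold)
    return DefaultThreshold / 4;  // be stingy on cold callsites
  return DefaultThreshold;
}

int main() {
  ProfileSummary PS;
  std::printf("hot=%d cold=%d default=%d\n",
              callsiteThreshold(PS, 50000, 225),
              callsiteThreshold(PS, 2, 225),
              callsiteThreshold(PS, 500, 225));
  return 0;
}
```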
-
Peter Collingbourne authored
Differential Revision: https://reviews.llvm.org/D28840 llvm-svn: 292661
-
Teresa Johnson authored
Summary: Allow non-ODR weak/linkonce non-prevailing copies to be marked as available_externally in the index. Add support for dropping these to declarations in the backend.

Reviewers: mehdi_amini, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28806
llvm-svn: 292656
-
Peter Collingbourne authored
To import a type identifier we read the summary and create external references to the symbols defined when exporting. Differential Revision: https://reviews.llvm.org/D28546 llvm-svn: 292654
-
Peter Collingbourne authored
This avoids needing to store it in a separate field in TypeIdLowering. llvm-svn: 292647
-
Dehao Chen authored
llvm-svn: 292533
-
- Jan 19, 2017
-
Peter Collingbourne authored
Type identifiers are exported by:
- Adding coarse-grained information about how to test the type identifier to the summary.
- Creating symbols in the object file (aliases and absolute symbols) containing fine-grained information about the type identifier.

Differential Revision: https://reviews.llvm.org/D28424
llvm-svn: 292462
-
- Jan 18, 2017
-
Peter Collingbourne authored
Differential Revision: https://reviews.llvm.org/D28839 llvm-svn: 292431
-
- Jan 13, 2017
-
Benjamin Kramer authored
With some minor manual fixes for using function_ref instead of std::function. No functional change intended. llvm-svn: 291904
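For readers unfamiliar with llvm::function_ref: it is a non-owning, trivially copyable view of a callable, suitable when the callee does not need to outlive the call. A minimal stand-alone sketch of the idea (not LLVM's actual implementation):

```cpp
#include <cstdio>
#include <type_traits>
#include <utility>

// Minimal non-owning callable reference, in the spirit of llvm::function_ref.
template <typename Fn> class function_ref;

template <typename Ret, typename... Params>
class function_ref<Ret(Params...)> {
  Ret (*callback)(void *callable, Params... params) = nullptr;
  void *callable = nullptr;

  template <typename Callable>
  static Ret callback_fn(void *callable, Params... params) {
    return (*reinterpret_cast<Callable *>(callable))(
        std::forward<Params>(params)...);
  }

public:
  template <typename Callable>
  function_ref(Callable &&c)
      : callback(callback_fn<typename std::remove_reference<Callable>::type>),
        callable(reinterpret_cast<void *>(&c)) {}

  Ret operator()(Params... params) const {
    return callback(callable, std::forward<Params>(params)...);
  }
};

static int applyTwice(function_ref<int(int)> f, int x) { return f(f(x)); }

int main() {
  int bias = 3;
  // The lambda (and its capture) lives on the stack; function_ref only points at it.
  std::printf("%d\n", applyTwice([&](int v) { return v + bias; }, 10));
  return 0;
}
```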
-
- Jan 12, 2017
-
Teresa Johnson authored
Summary: We can sometimes end up with multiple copies of a local function that have the same GUID in the index. This happens when there are local functions with the same name that are in different source files with the same name (but in different directories), and they were compiled in their own directory so had the same path at compile time. In this case make sure we import the copy in the caller's module. While it isn't a correctness problem (the renamed reference, which is based on the module IR hash, will be unique since the module must have had an externally visible function that was imported), importing the wrong copy will result in lost performance opportunity since it won't be referenced and inlined.

Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28440
llvm-svn: 291841
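To make the collision concrete: the index keys a local function by a GUID derived from its source path plus its name, so two distinct locals can collide when both were compiled with the same relative path. A toy illustration using std::hash in place of LLVM's actual hash:

```cpp
#include <cstddef>
#include <cstdio>
#include <functional>
#include <string>

// Toy stand-in for a summary-index GUID: hash of "<source path>:<local name>".
// LLVM uses its own hash function; std::hash is just for illustration.
static std::size_t localGUID(const std::string &SourcePath,
                             const std::string &Name) {
  return std::hash<std::string>{}(SourcePath + ":" + Name);
}

int main() {
  // Two different translation units, each with a local (static) function
  // "helper", both compiled as "src/util.c" from within their own directory:
  std::size_t A = localGUID("src/util.c", "helper");  // project A's copy
  std::size_t B = localGUID("src/util.c", "helper");  // project B's copy
  std::printf("collision: %s\n", A == B ? "yes" : "no");  // prints "yes"
  return 0;
}
```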
-
- Jan 11, 2017
-
Peter Collingbourne authored
This means that we can use a shorter instruction sequence in the case where the size is a power of two and on the boundary between two representations. Differential Revision: https://reviews.llvm.org/D28421 llvm-svn: 291706
-
Peter Collingbourne authored
Re-apply r291205, "LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.", with a fix for an off-by-one error. llvm-svn: 291699
-
Ivan Krasin authored
Summary: Revert "LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase."

This change separates how type identifiers are resolved from how intrinsic calls are lowered. All information required to lower an intrinsic call is stored in a new TypeIdLowering data structure. The idea is that this data structure can either be initialized using the module itself during regular LTO, or using the module summary in ThinLTO backends.

Original URL: https://reviews.llvm.org/D28341

Reviewers: pcc
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D28532
llvm-svn: 291684
-
- Jan 07, 2017
-
Peter Collingbourne authored
Also move command line handling out of the pass constructor and into a separate function. Differential Revision: https://reviews.llvm.org/D28422 llvm-svn: 291323
-
- Jan 06, 2017
-
Peter Collingbourne authored
This change separates how type identifiers are resolved from how intrinsic calls are lowered. All information required to lower an intrinsic call is stored in a new TypeIdLowering data structure. The idea is that this data structure can either be initialized using the module itself during regular LTO, or using the module summary in ThinLTO backends. Differential Revision: https://reviews.llvm.org/D28341 llvm-svn: 291205
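As a rough illustration of the split (the kinds and fields below are invented for this sketch and are not the pass's actual TypeIdLowering definition): resolution fills in a small description of how a type test can be answered, and lowering only consults it.

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative only: one possible shape for "everything needed to lower a
// type test", resolved either from the module (regular LTO) or from the
// summary (ThinLTO backends).
enum class TypeTestKind { Unsat, AllOnes, Inline, ByteArray };

struct TypeIdLoweringSketch {
  TypeTestKind Kind;
  uint64_t OffsetedGlobal; // start of the range of possibly-valid addresses
  unsigned AlignLog2;      // log2 alignment of valid addresses in the range
  uint64_t SizeM1;         // number of valid addresses, minus one
  uint64_t InlineBits;     // bit vector used by the Inline representation
};

// The "lowering" half only consults the resolved description.
static bool evalTypeTest(const TypeIdLoweringSketch &TIL, uint64_t Addr) {
  if (TIL.Kind == TypeTestKind::Unsat)
    return false;
  uint64_t Diff = Addr - TIL.OffsetedGlobal;
  if ((Diff & ((1ULL << TIL.AlignLog2) - 1)) != 0)
    return false;                       // misaligned: cannot be a member
  uint64_t Bit = Diff >> TIL.AlignLog2;
  if (Bit > TIL.SizeM1)
    return false;                       // outside the range
  if (TIL.Kind == TypeTestKind::AllOnes)
    return true;                        // every in-range address is a member
  if (TIL.Kind == TypeTestKind::Inline)
    return (TIL.InlineBits >> Bit) & 1; // small bit vector, tested inline
  return false;                         // ByteArray lookup omitted in this sketch
}

int main() {
  TypeIdLoweringSketch TIL = {TypeTestKind::AllOnes, 0x1000, 3, 3, 0};
  std::printf("%d %d\n", (int)evalTypeTest(TIL, 0x1008),
              (int)evalTypeTest(TIL, 0x1040));
  return 0;
}
```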
-
- Jan 05, 2017
-
Teresa Johnson authored
Summary: Using the linker-supplied list of "preserved" symbols, we can compute the list of "dead" symbols, i.e. the ones that are not reachable from a "preserved" symbol transitively on the reference graph. Right now we are using this information to mark these functions as non-eligible for import.

The impact is twofold:
- Reduction of compile time: we don't import these functions anywhere, or import the functions these symbols are calling.
- The limited number of imports/exports leads to better internalization.

Patch originally by Mehdi Amini.

Reviewers: mehdi_amini, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23488
llvm-svn: 291177
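The computation described above is essentially a worklist reachability traversal over the reference graph; a compact stand-alone sketch (the graph representation and names are assumptions, not the ThinLTO index API):

```cpp
#include <cstdio>
#include <map>
#include <set>
#include <string>
#include <vector>

// Each symbol lists the symbols it references (calls or takes the address of).
using RefGraph = std::map<std::string, std::vector<std::string>>;

// Everything not transitively reachable from a "preserved" root is dead.
static std::set<std::string>
computeDeadSymbols(const RefGraph &G, const std::set<std::string> &Preserved) {
  std::set<std::string> Live(Preserved.begin(), Preserved.end());
  std::vector<std::string> Worklist(Preserved.begin(), Preserved.end());
  while (!Worklist.empty()) {
    std::string Sym = Worklist.back();
    Worklist.pop_back();
    auto It = G.find(Sym);
    if (It == G.end())
      continue;
    for (const std::string &Ref : It->second)
      if (Live.insert(Ref).second)  // newly discovered live symbol
        Worklist.push_back(Ref);
  }
  std::set<std::string> Dead;
  for (const auto &Entry : G)
    if (!Live.count(Entry.first))
      Dead.insert(Entry.first);
  return Dead;
}

int main() {
  RefGraph G = {{"main", {"foo"}}, {"foo", {"bar"}}, {"bar", {}}, {"unused", {"bar"}}};
  for (const std::string &S : computeDeadSymbols(G, {"main"}))
    std::printf("dead: %s\n", S.c_str());  // prints "dead: unused"
  return 0;
}
```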
-
Teresa Johnson authored
Summary: This adds a new summary flag NotEligibleToImport that subsumes several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal and IsNotViableToInline). It also subsumes the checking of references on the summary that was being done during the thin link by eligibleForImport() for each candidate. It is much more efficient to do that checking once during the per-module summary build and record it in the summary.

Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28169
llvm-svn: 291108
-
Peter Collingbourne authored
IR: Module summary representation for type identifiers; summary test scaffolding for lowertypetests.

Set up basic YAML I/O support for module summaries, plumb the summary into the pass and add a few command line flags to test YAML I/O support. Bitcode support to come separately, as will the code in LowerTypeTests that actually uses the summary.

Also add a couple of tests that pass by virtue of the pass doing nothing with the summary (which happens to be the correct thing to do for those tests).

Differential Revision: https://reviews.llvm.org/D28041
llvm-svn: 291069
-
- Jan 04, 2017
-
Mehdi Amini authored
Summary: This is a relatively simple scheme: we use the index emitted in the bitcode to avoid loading all the global metadata. Instead we load the index, which records their positions in the bitcode, so that we can load each of them individually. Materializing the global metadata block in this condition only triggers loading the named metadata, and the ones referenced from there (transitively). When materializing a function, metadata from the global block are loaded lazily as they are referenced.

Two main current limitations are:
1) Global values other than functions are not materialized on demand, so we need to eagerly load METADATA_GLOBAL_DECL_ATTACHMENT records (and their transitive dependencies).
2) When we load a single metadata, we don't recurse on the operands; instead we use a placeholder or a temporary metadata. Unfortunately temporary nodes are very expensive. This is why we don't enable it always, only for importing.

These two limitations can be lifted in a subsequent improvement if needed.

With this change, the total link time of opt with ThinLTO and Debug Info enabled goes down from 282s to 224s (~20%).

Reviewers: pcc, tejohnson, dexonsmith
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28113
llvm-svn: 291027
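The scheme amounts to recording, per metadata node, where its record lives in the bitcode and parsing it only on first use; a toy stand-alone sketch of that indexing idea (the loader below is invented for illustration and is not the BitcodeReader API):

```cpp
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>

// Pretend "bitcode": metadata ID -> offset recorded in the index; the payload
// is only parsed when a node is actually requested.
struct LazyMetadataLoaderSketch {
  std::map<uint32_t, uint64_t> IndexedOffsets;  // read eagerly: just positions
  std::map<uint32_t, std::string> Parsed;       // filled lazily on demand
  unsigned ParseCount = 0;

  std::string parseAtOffset(uint64_t Offset) {  // stand-in for real parsing
    ++ParseCount;
    return "MDNode@" + std::to_string(Offset);
  }

  // Materialize a single metadata node without touching the others.
  const std::string &getMetadata(uint32_t ID) {
    auto It = Parsed.find(ID);
    if (It != Parsed.end())
      return It->second;                        // already materialized
    return Parsed[ID] = parseAtOffset(IndexedOffsets.at(ID));
  }
};

int main() {
  LazyMetadataLoaderSketch Loader;
  Loader.IndexedOffsets = {{0, 128}, {1, 256}, {2, 512}};  // from the index
  std::printf("%s\n", Loader.getMetadata(1).c_str());
  std::printf("%s (parsed %u of 3 nodes)\n", Loader.getMetadata(1).c_str(),
              Loader.ParseCount);               // still only one record parsed
  return 0;
}
```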
-
- Jan 02, 2017
-
Davide Italiano authored
The pass has been on by default for a long time without problems. llvm-svn: 290814
-
- Dec 28, 2016
-
Chandler Carruth authored
when they are call edges at the leaf but may (transitively) be reached via ref edges. It turns out there is a simple rule: insert everything as a ref edge which is a safe conservative default. Then we let the existing update logic handle promoting some of those to call edges. Note that it would be fairly cheap to make these call edges right away if that is desirable by testing whether there is some existing call path from the source to the target. It just seemed like slightly more complexity in this code path that isn't strictly necessary. If anyone feels strongly about handling this differently I'm happy to change it. llvm-svn: 290649
-
- Dec 27, 2016
-
Chandler Carruth authored
skipping indirectly recursive inline chains. To do this, we implicitly build an inline stack for each callsite and check prior to inlining that doing so would not form a cycle. This uses the exact same technique and even shares some code with the legacy PM inliner. This solution remains deeply unsatisfying to me because it means we cannot actually iterate the inliner externally. Doing so would not be able to easily detect and avoid such cycles. Some day I would very much like to have a solution that works without this internal state to detect cycles, but this is not that day. llvm-svn: 290590
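The check described here boils down to walking an inline-history chain for the callsite and refusing to inline a callee that already appears on it; a stand-alone sketch of that idea (the data structures below are invented for illustration, not the inliner's actual ones):

```cpp
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Each entry records (inlined callee, index of the entry it was inlined
// through), forming a chain that can be walked back like a stack.
using InlineHistory = std::vector<std::pair<std::string, int>>;

// Returns true if inlining `Callee` at a callsite whose history starts at
// `HistoryID` would re-inline a function already on the chain (a cycle).
static bool wouldFormInlineCycle(const InlineHistory &History, int HistoryID,
                                 const std::string &Callee) {
  for (int ID = HistoryID; ID != -1; ID = History[ID].second)
    if (History[ID].first == Callee)
      return true;
  return false;
}

int main() {
  // Simulate: a() inlined b(), then within that inlined body b() inlined c().
  InlineHistory H;
  H.push_back({"b", -1});  // entry 0: b inlined at a top-level callsite
  H.push_back({"c", 0});   // entry 1: c inlined through entry 0
  std::printf("inline b again? cycle=%d\n", (int)wouldFormInlineCycle(H, 1, "b"));
  std::printf("inline d?       cycle=%d\n", (int)wouldFormInlineCycle(H, 1, "d"));
  return 0;
}
```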
-
Chandler Carruth authored
Also enable the new PM in the attributes test case which caught this issue. llvm-svn: 290572
-
Chandler Carruth authored
removing fully-dead comdats without removing dead entries in comdats with live members. This factors the core logic out of the current inliner's internals to a reusable utility and leverages that in both places. The factored out code should also be (minorly) more efficient in cases where we have very few dead functions or dead comdats to consider. I've added a test case to cover this behavior of the always inliner. This is the last significant bug in the new PM's always inliner I've found (so far). llvm-svn: 290557
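The rule being factored out is roughly: a dead function in a comdat can only be deleted once every member of its group is dead, otherwise deleting it would leave a partial group. A small stand-alone sketch of that filtering (the types here are assumptions, and the real utility's interface differs):

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <vector>

struct Func {
  std::string Name;
  std::string Comdat;  // empty if not in a comdat group
  bool Dead;
};

// Keep only those dead functions whose entire comdat group is dead.
static std::vector<std::string>
filterDeadComdatFunctions(const std::vector<Func> &Funcs) {
  std::map<std::string, bool> GroupFullyDead;
  for (const Func &F : Funcs)
    if (!F.Comdat.empty()) {
      auto It = GroupFullyDead.emplace(F.Comdat, true).first;
      It->second = It->second && F.Dead;  // one live member poisons the group
    }
  std::vector<std::string> Removable;
  for (const Func &F : Funcs)
    if (F.Dead && (F.Comdat.empty() || GroupFullyDead[F.Comdat]))
      Removable.push_back(F.Name);
  return Removable;
}

int main() {
  std::vector<Func> Funcs = {{"f", "g1", true}, {"g", "g1", false},
                             {"h", "g2", true}, {"i", "g2", true},
                             {"j", "", true}};
  for (const std::string &Name : filterDeadComdatFunctions(Funcs))
    std::printf("removable: %s\n", Name.c_str());  // h, i, j but not f
  return 0;
}
```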
-
- Dec 26, 2016
-
Davide Italiano authored
NewGVN can be tested passing `-mllvm -enable-newgvn` to clang. Differential Revision: https://reviews.llvm.org/D28059 llvm-svn: 290548
-
- Dec 24, 2016
-
Chandler Carruth authored
whether functions are removed, and fix the new PM's always inliner to actually pass this test. Without this, the new PM's always inliner leaves all the functions kicking around which won't work out very well given the semantics of always inline. Doing this really highlights how frustrating the current alwaysinline semantic contract is though -- why can we put it on *external* functions, etc? Also I've added a number of tricky and interesting test cases for removing functions with the always inliner. There is one remaining case not handled -- fully removing comdats -- and I've left a FIXME about this. llvm-svn: 290457
-
- Dec 23, 2016
-
Mehdi Amini authored
Function-import: Disable IRVerifier on lazy-loaded modules: the ODR TypeUniquing generates invalid debug info. llvm-svn: 290442
-
Mehdi Amini authored
llvm-svn: 290438
-
Mehdi Amini authored
llvm-svn: 290437
-
Mehdi Amini authored
llvm-svn: 290416
-
- Dec 22, 2016
-
Evgeniy Stepanov authored
Use a dummy private function with inline asm calls instead of module level asm blocks for CFI jumptables.

The main advantage is that now jumptable codegen can be affected by the function attributes (like target_cpu on ARM). Module level asm gets the default subtarget based on the target triple, which is often not good enough.

This change also uses asm constraints/arguments to reference jumptable targets and aliases directly. We no longer do asm name mangling in an IR pass.

Differential Revision: https://reviews.llvm.org/D28012
llvm-svn: 290384
-
Easwaran Raman authored
Differential revision: https://reviews.llvm.org/D28038 llvm-svn: 290295
-
- Dec 21, 2016
-
Adam Nemet authored
In r267672, where the loop distribution pragma was introduced, I tried hard to keep the old behavior for opt: when opt is invoked with -loop-distribute, it should distribute the loop (it's off by default when run via the optimization pipeline).

As MichaelZ has discovered, this has the unintended consequence of breaking a very common developer workflow for reproducing compilations with opt: first you print clang's pass pipeline with -debug-pass=Arguments, then you invoke opt with the returned arguments. clang -debug-pass will include -loop-distribute, but the pass is invoked with default=off so nothing happens unless the loop carries the pragma, while through opt (default=on) we would try to distribute all loops.

This changes opt's default to off as well, to match clang. The tests are modified to explicitly enable the transformation.

llvm-svn: 290235
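For reference, distribution can still be requested per loop with the clang loop pragma introduced in r267672; a minimal usage sketch compiled with clang (the loop body is arbitrary):

```cpp
#include <cstddef>

// With -loop-distribute now defaulting to off in opt as well, distribution is
// requested per loop via the pragma from r267672.
void split(int *a, int *b, int *c, int *d, std::size_t n) {
#pragma clang loop distribute(enable)
  for (std::size_t i = 1; i < n; ++i) {
    a[i] = a[i - 1] + b[i];  // one partition: carries a loop dependence
    c[i] = b[i] * d[i];      // other partition: independent, vectorizable
  }
}
```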
-
Peter Collingbourne authored
No existing client is passing a non-null value here. This will come back in a slightly different form as part of the type identifier summary work. Differential Revision: https://reviews.llvm.org/D28006 llvm-svn: 290222
-
- Dec 20, 2016
-
Chandler Carruth authored
This doesn't implement *every* feature of the existing inliner, but tries to implement the most important ones for building a functional optimization pipeline and beginning to sort out bugs, regressions, and other problems.

Notable, but intentional omissions:
- No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it.
- No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here.
- Optimization remarks. Why? Purely to make the patch smaller, no other reason at all.

The last one I'll probably work on almost immediately. But I wanted to skip it in the initial patch to try to focus the change as much as possible as there is already a lot of code moving around and both of these *could* be skipped without really disrupting the core logic.

A summary of the different things happening here:
1) Adding the usual new PM class and rigging.
2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world.
3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world.
4) Updating the LazyCallGraph (in the new PM) based on the *newly inlined* calls and references. This is very minimal because we cannot form cycles.
5) When inlining removes the last use of a function, eagerly nuking the body of the function so that any "one use remaining" inline cost heuristics are immediately refined, and queuing these functions to be completely deleted once inlining is complete and the call graph updated to reflect that they have become dead.
6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the function-local simplifications that are done immediately and internally by the inline utilities. These are the exact same fundamental set of CG updates done by arbitrary function passes.
7) Adding a bunch of test cases to specifically target CGSCC and other subtle aspects in the new PM world.

Many thanks to the careful review from Easwaran and Sanjoy and others!

Differential Revision: https://reviews.llvm.org/D24226
llvm-svn: 290161
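For item 3 above, the core is essentially a worklist loop over the call sites in a function, re-examining calls exposed by inlining; a toy stand-alone sketch of that shape (none of these types are the real LLVM classes, and the cost model is a placeholder):

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Toy call graph: each function body is just the list of callees it contains.
using Bodies = std::map<std::string, std::vector<std::string>>;

// Decide-and-inline loop over one function's calls, in the spirit of the
// new-PM inliner's core: inlining a call splices the callee's calls into the
// worklist so newly exposed call sites are also considered.
static void inlineCallsIn(const std::string &Caller, Bodies &B,
                          unsigned CostBudget) {
  std::vector<std::string> Calls = B[Caller];
  std::vector<std::string> NewBody;
  while (!Calls.empty()) {
    std::string Callee = Calls.back();
    Calls.pop_back();
    bool Profitable = B[Callee].size() <= CostBudget;  // toy cost model
    if (!Profitable || Callee == Caller) {
      NewBody.push_back(Callee);  // keep the call site as-is
      continue;
    }
    std::printf("inlining %s into %s\n", Callee.c_str(), Caller.c_str());
    for (const std::string &Inner : B[Callee])
      Calls.push_back(Inner);     // newly inlined calls get re-examined
  }
  B[Caller] = NewBody;
}

int main() {
  Bodies B = {{"main", {"helper"}}, {"helper", {"leaf", "leaf"}}, {"leaf", {}}};
  inlineCallsIn("main", B, 2);
  std::printf("main now contains %zu call sites\n", B["main"].size());
  return 0;
}
```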
-