Commits · 3616891046e7f13a758e53dcc6fa73a7c3232b35 · Lorenzo Albano / LLVM bpEVL

Nov 16, 2018

GlobalDCE: Teach isEmptyFunction() to ignore debug intrinsics. · 83d87520
Adrian Prantl authored Nov 16, 2018
```
This fixes PR39669.
https://bugs.llvm.org/show_bug.cgi?id=39669

llvm-svn: 347065
```
83d87520

[ThinLTO] Internalize readonly globals · bf46e741

Eugene Leviant authored Nov 16, 2018

An attempt to recommit r346584 after failure on OSX build bot.
Fixed cache key computation in ThinLTOCodeGenerator and added
test case

llvm-svn: 347033

bf46e741

Nov 15, 2018

[LTO] Load sample profile in LTO link step. · 642c8d35

Xin Tong authored Nov 15, 2018

Summary:
Load sample profile in LTO link step.
ThinLTO calls populateModulePassManager to load the profile

Reviewers: tejohnson, davidxl, danielcdh

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D54564

llvm-svn: 346971

642c8d35

Nov 13, 2018
- Revert "[ThinLTO] Internalize readonly globals" · fa43892d
  Steven Wu authored Nov 13, 2018
```
This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2.

llvm-svn: 346768
```
  fa43892d
Nov 11, 2018

[IPSCCP,PM] Preserve PDT in the new pass manager. · 9026d4ee

Florian Hahn authored Nov 11, 2018

Reviewers: kuhar, chandlerc, NutshellySima, brzycki

Reviewed By: NutshellySima, brzycki

Differential Revision: https://reviews.llvm.org/D54317

llvm-svn: 346618

9026d4ee

Nov 10, 2018

[ThinLTO] Internalize readonly globals · be8d1996

Eugene Leviant authored Nov 10, 2018

This patch allows internalising globals if all accesses to them
(from live functions) are from non-volatile load instructions

Differential revision: https://reviews.llvm.org/D49362

llvm-svn: 346584

be8d1996

Nov 09, 2018

[IPSCCP,PM] Preserve DT in the new pass manager. · a1062f4b

Florian Hahn authored Nov 09, 2018

After D45330, Dominators are required for IPSCCP and can be preserved.

This patch preserves DominatorTreeAnalysis in the new pass manager. AFAIK the legacy pass manager cannot preserve function analysis required by a module analysis.

Reviewers: davide, dberlin, chandlerc, efriedma, kuhar, NutshellySima

Reviewed By: chandlerc, kuhar, NutshellySima

Differential Revision: https://reviews.llvm.org/D47259

llvm-svn: 346486

a1062f4b

Nov 08, 2018

[LTO] Drop non-prevailing definitions only if linkage is not local or appending · e61652a3

Pirama Arumuga Nainar authored Nov 08, 2018

Summary:
This fixes PR 37422

In ELF, non-weak symbols can also be non-prevailing.  In this particular
PR, the __llvm_profile_* symbols are non-prevailing but weren't getting
dropped - causing multiply-defined errors with lld.

Also add a test, strong_non_prevailing.ll, to ensure that multiple
copies of a strong symbol are dropped.

To fix the test regressions exposed by this fix,
- do not mark prevailing copies for symbols with 'appending' linkage.
There's no one prevailing copy for such symbols.
- fix the prevailing version in dead-strip-fulllto.ll
- explicitly pass exported symbols to llvm-lto in fumcimport.ll and
funcimport_var.ll

Reviewers: tejohnson, pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith,
dang, srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D54125

llvm-svn: 346436

e61652a3

[MergeFuncs] Improve ordering of equal functions · 73cb9784

whitequark authored Nov 08, 2018

Summary:
MergeFunctions currently tries to process strong functions before
weak functions, because weak functions can simply call strong
functions, while a strong/weak function cannot call a weak function
(a backing strong function is needed).

This patch additionally tries to process external functions before
local functions, because we definitely have to keep the external
function, but may be able to drop the local one (and definitely
can if it is also unnamed_addr).

Unfortunately, this exposes an existing bug in the implementation:
The FnTree and FNodesInTree structures can currently go out of
sync in the case where two weak functions are merged, because the
function in FnTree/FNodesInTree is RAUWed. This leaves it behind in
FnTree (this is intended, as it is the strong backing function which
should be used for further merges), while it is replaced in
FNodesInTree (this is not intended).

This is fixed by switching FNodesInTree from using a ValueMap to
using a DenseMap of AssertingVH.

This exposes another minor issue: Currently FNodesInTree is not
cleared after MergeFunctions finishes running. Currently, this is
potentially dangerous (e.g. if something else wants to RAUW a function
with a non-function), but at the very least it is unnecessary/inefficient.
After the change to use AssertingVH it becomes more problematic,
because there are certainly passes that remove functions.

This issue is fixed by clearing FNodesInTree at the end of the pass.

Reviewers: jfb, whitequark

Reviewed By: whitequark

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D53271

llvm-svn: 346386

73cb9784

[MergeFuncs] Call removeUsers() prior to unnamed_addr RAUW · 3580ac61

whitequark authored Nov 08, 2018

Summary:
For unnamed_addr functions we RAUW instead of only replacing direct callers. However, functions in which replacements were performed currently are not added back to the worklist, resulting in missed merging opportunities.

Fix this by calling removeUsers() prior to RAUW.

Reviewers: jfb, whitequark

Reviewed By: whitequark

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D53262

llvm-svn: 346385

3580ac61

Nov 06, 2018

[ThinLTO] Split NotEligibleToImport into legality and inlinability flags · cb397461

Teresa Johnson authored Nov 06, 2018

Summary:
The NotEligibleToImport flag on the GlobalValueSummary was set if it
isn't legal to import (e.g. because it references unpromotable locals)
and when it can't be inlined (in which case importing is pointless).

I split out the inlinable piece into a separate flag on the
FunctionSummary (doesn't make sense for aliases or global variables),
because in the future we may want to import for reasons other than
inlining.

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53345

llvm-svn: 346261

cb397461

Nov 05, 2018

[HotColdSplitting] Use TTI to inform outlining threshold · d2a895a9

Vedant Kumar authored Nov 04, 2018

Using TargetTransformInfo allows the splitting pass to factor in the
code size cost of instructions as it decides whether or not outlining is
profitable.

This did not regress the overall amount of outlining seen on the handful
of internal frameworks I tested.

Thanks to Jun Bum Lim for suggesting this!

Differential Revision: https://reviews.llvm.org/D53835

llvm-svn: 346108

d2a895a9

Oct 31, 2018

ADT/STLExtras: Introduce llvm::empty; NFC · 9fd397b4

Matthias Braun authored Oct 31, 2018

This is modeled after C++17 std::empty().

Differential Revision: https://reviews.llvm.org/D53909

llvm-svn: 345679

9fd397b4

Oct 29, 2018

[HotColdSplitting] Allow outlining single-block cold regions · dd4be53b

Vedant Kumar authored Oct 29, 2018

It can be profitable to outline single-block cold regions because they
may be large.

Allow outlining single-block regions if they have over some threshold of
non-debug, non-terminator instructions. I chose 3 as the threshold after
experimenting with several internal frameworks.

In practice, reducing the threshold further did not give much
improvement, whereas increasing it resulted in substantial regressions.

Differential Revision: https://reviews.llvm.org/D53824

llvm-svn: 345524

dd4be53b

Oct 25, 2018

[HotColdSplitting] Identify larger cold regions using domtree queries · c2990068

Vedant Kumar authored Oct 24, 2018

The current splitting algorithm works in three stages:

  1) Identify cold blocks, then
  2) Use forward/backward propagation to mark hot blocks, then
  3) Grow a SESE region of blocks *outside* of the set of hot blocks and
  start outlining.

While testing this pass on Apple internal frameworks I noticed that some
kinds of control flow (e.g. loops) are never outlined, even though they
unconditionally lead to / follow cold blocks. I noticed two other issues
related to how cold regions are identified:

  - An inconsistency can arise in the internal state of the hotness
  propagation stage, as a block may end up in both the ColdBlocks set
  and the HotBlocks set. Further inconsistencies can arise as these sets
  do not match what's in ProfileSummaryInfo.

  - It isn't necessary to limit outlining to single-exit regions.

This patch teaches the splitting algorithm to identify maximal cold
regions and outline them. A maximal cold region is defined as the set of
blocks post-dominated by a cold sink block, or dominated by that sink
block. This approach can successfully outline loops in the cold path. As
a side benefit, it maintains less internal state than the current
approach.

Due to a limitation in CodeExtractor, blocks within the maximal cold
region which aren't dominated by a single entry point (a so-called "max
ancestor") are filtered out.

Results:
  - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to
  47KB pre-patch, or a ~3x improvement. Did not see a performance impact
  across two runs.
  - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB
  of TEXT were outlined. Ditto re: performance impact.
  - Outlining results improve marginally in the internal frameworks I
  tested.

Follow-ups:
  - Outline more than once per function, outline large single basic
  blocks, & try to remove unconditional branches in outlined functions.

Differential Revision: https://reviews.llvm.org/D53627

llvm-svn: 345209

c2990068

Oct 24, 2018

[hot-cold-split] Name split functions with ".cold" suffix · c8dba682

Teresa Johnson authored Oct 24, 2018

Summary:
The current default of appending "_"+entry block label to the new
extracted cold function breaks demangling. Change the deliminator from
"_" to "." to enable demangling. Because the header block label will
be empty for release compile code, use "extracted" after the "." when
the label is empty.

Additionally, add a mechanism for the client to pass in an alternate
suffix applied after the ".", and have the hot cold split pass use
"cold."+Count, where the Count is currently 1 but can be used to
uniquely number multiple cold functions split out from the same function
with D53588.

Reviewers: sebpop, hiraditya

Subscribers: llvm-commits, erik.pilkington

Differential Revision: https://reviews.llvm.org/D53534

llvm-svn: 345178

c8dba682

[PM] keeping history when original SCC split and then merge into itself · 80a0c97e

Wei Mi authored Oct 23, 2018

in the same round of SCC update.

In https://reviews.llvm.org/rL309784, inline history is added to prevent
infinite inlining across multiple run of inliner and SCC update, but the
history will only be kept when new SCC is actually generated during SCC update.

We found a case that SCC can be split and then merge into itself in the same
round of SCC update, so the same SCC will be pop out from UR.CWorklist and
then added back immediately, without any new SCC generated, that is why the
existing patch cannot catch the infinite inline case.

What the patch does is even if no new SCC is generated, if only the current
SCC appears in UR.CWorklist again, then keep the inline history.

Differential Revision: https://reviews.llvm.org/D52915

llvm-svn: 345103

80a0c97e

Oct 23, 2018

[HotColdSplitting] Attach MinSize to outlined code · 50315461

Vedant Kumar authored Oct 23, 2018

Outlined code is cold by assumption, so it makes sense to optimize it
for minimal code size rather than performance.

After r344869 moved the splitting pass to the end of the IR pipeline,
this does not result in much of a code size reduction. This is probably
because a comparatively small number backend transforms make use of the
MinSize hint.

Running LNT on x86_64, I see that 33/1020 binaries shrink for a total of
919 bytes of TEXT reduction. I didn't measure a significant performance
impact.

Differential Revision: https://reviews.llvm.org/D53518

llvm-svn: 345072

50315461

[DebugInfo][GlobalOpt] Fix -debugify for globalopt shrinking globals to booleans. · 2fed6ac1

Jordan Rupprecht authored Oct 23, 2018

Summary:
TryToShrinkGlobalToBoolean, when possible, will split store <value> + load <value> into store <bool> + select <bool ? value : 0>. This preserves DebugLoc during that pass.

Fixes PR37959. The test case here is the simplified .ll for:

```
static int foo;
int bar() {
  foo = 5;
  return foo;
}
```

Reviewers: dblaikie, gbedwell, aprantl

Reviewed By: dblaikie

Subscribers: mehdi_amini, JDevlieghere, dexonsmith, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D53531

llvm-svn: 345046

2fed6ac1

Oct 22, 2018

[hot-cold-split] Add opt remark on success · f431a2f2

Teresa Johnson authored Oct 22, 2018

Summary: Emit optimization remark on successful hot cold split.

Reviewers: sebpop, hiraditya

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53512

llvm-svn: 344938

f431a2f2

Oct 21, 2018

Schedule Hot Cold Splitting pass after most optimization passes · d9e2e383

Aditya Kumar authored Oct 21, 2018

Summary:
In the new+old pass manager, hot cold splitting was schedule too early.
Thanks to Vedant for pointing this out.

Reviewers: sebpop, vsk

Reviewed By: sebpop, vsk

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D53437

llvm-svn: 344869

d9e2e383

Oct 18, 2018
- [TI removal] Switch MergeFunctions to directly use Instruction API. · 93cf2ea2
  Chandler Carruth authored Oct 18, 2018
```
llvm-svn: 344714
```
  93cf2ea2
Oct 17, 2018

[ThinLTO] Add importing stats to thin link · d2c234a4

Teresa Johnson authored Oct 16, 2018

Summary:
Previously we could only get the number of imported functions and
variables from the backend. This adds stats to the thin link where the
importing is decided.

Reviewers: wmi

Subscribers: inglorion, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D53337

llvm-svn: 344658

d2c234a4

Oct 16, 2018

[InstCombine] Cleanup libfunc attribute inferring · 7c7760da

David Bolvansky authored Oct 16, 2018

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53338

llvm-svn: 344645

7c7760da

Change a TerminatorInst* to an Instruction* in HotColdSplitting.cpp. · 8f9a2446

Lang Hames authored Oct 15, 2018

r344558 added an assignment to a TerminatorInst* from
BasicBlock::getTerminatorInst(), but BasicBlock::getTerminatorInst() returns an
Instruction* rather than a TerminatorInst* since r344504 so this fails to
compile.

Changing the variable to an Instruction* should get the bots building again.

llvm-svn: 344566

8f9a2446

Oct 15, 2018

[hot-cold-split] fix static analysis of cold regions · 542e522b

Sebastian Pop authored Oct 15, 2018

Make the code of blockEndsInUnreachable to match the function
blockEndsInUnreachable in CodeGen/BranchFolding.cpp. I also have
added a note to make sure the code of this function will not be
modified unless the back-end version is also modified.

An early return before outlining has been added to avoid
outlining the full function body when the first block in the
function is marked cold.

The static analysis of cold code has been amended to avoid
marking the whole function as cold by back-propagation
because the back-propagation would mark blocks with return
statements as cold.

The patch adds debug statements to help discover these problems.

Differential Revision: https://reviews.llvm.org/D52904

llvm-svn: 344558

542e522b

[TI removal] Make variables declared as `TerminatorInst` and initialized · edb12a83

Chandler Carruth authored Oct 15, 2018

by `getTerminator()` calls instead be declared as `Instruction`.

This is the biggest remaining chunk of the usage of `getTerminator()`
that insists on the narrow type and so is an easy batch of updates.
Several files saw more extensive updates where this would cascade to
requiring API updates within the file to use `Instruction` instead of
`TerminatorInst`. All of these were trivial in nature (pervasively using
`Instruction` instead just worked).

llvm-svn: 344502

edb12a83

Oct 13, 2018

[InstCombine] Fixed crash with aliased functions · e8b3bba7

David Bolvansky authored Oct 13, 2018

Summary: Fixes PR39177

Reviewers: spatel, jbuening

Reviewed By: jbuening

Subscribers: jbuening, llvm-commits

Differential Revision: https://reviews.llvm.org/D53129

llvm-svn: 344454

e8b3bba7

Oct 12, 2018
- [ThinLTO] Don't import GV which contains blockaddress · eddf6b5d
  Eugene Leviant authored Oct 12, 2018
```
Differential revision: https://reviews.llvm.org/D53139

llvm-svn: 344325
```
  eddf6b5d
Oct 11, 2018

Add a flag to remap manglings when reading profile data information. · 6c676628

Richard Smith authored Oct 10, 2018

This can be used to preserve profiling information across codebase
changes that have widespread impact on mangled names, but across which
most profiling data should still be usable. For example, when switching
from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
or even from a 32-bit to a 64-bit build.

The user can provide a remapping file specifying parts of mangled names
that should be treated as equivalent (eg, std::__1 should be treated as
equivalent to std::__cxx11), and profile data will be treated as
applying to a particular function if its name is equivalent to the name
of a function in the profile data under the provided equivalences. See
the documentation change for a description of how this is configured.

Remapping is supported for both sample-based profiling and instruction
profiling. We do not support remapping indirect branch target
information, but all other profile data should be remapped
appropriately.

Support is only added for the new pass manager. If someone wants to also
add support for this for the old pass manager, doing so should be
straightforward.

This is the LLVM side of Clang r344199.

Reviewers: davidxl, tejohnson, dlj, erik.pilkington

Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D51249

llvm-svn: 344200

6c676628

Oct 10, 2018

Replace most users of UnknownSize with LocationSize::unknown(); NFC · 6ef8002c

George Burgess IV authored Oct 10, 2018

Moving away from UnknownSize is part of the effort to migrate us to
LocationSizes (e.g. the cleanup promised in D44748).

This doesn't entirely remove all of the uses of UnknownSize; some uses
require tweaks to assume that UnknownSize isn't just some kind of int.
This patch is intended to just be a trivial replacement for all places
where LocationSize::unknown() will Just Work.

llvm-svn: 344186

6ef8002c

Oct 08, 2018

[ThinLTO] Keep non-prevailing (linkonce|weak)_odr symbols live · bfdad33b

Xin Tong authored Oct 08, 2018

Summary:
If we have a symbol with (linkonce|weak)_odr linkage, we do not want
to dead strip it even it is not prevailing.

IR level (linkonce|weak)_odr symbol can become non-prevailing when we mix
ELF objects and IR objects where the (linkonce|weak)_odr symbol in the ELF
object is prevailing and the ones in the IR objects are not. Stripping
them will prevent us from doing optimizations with them.

By not dead stripping them, We will convert these symbols to
available_externally linkage as a result of non-prevailing and eventually
dropping them after inlining.

I modified cache-prevailing.ll to use linkonce linkage as it is
testing whether cache prevailing bit is effective or not, not
we should treat linkonce_odr alive or not

Reviewers: tejohnson, pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D52893

llvm-svn: 343970

bfdad33b

Oct 03, 2018

Improve static analysis of cold basic blocks · a27014b8

Aditya Kumar authored Oct 03, 2018

Differential Revision: https://reviews.llvm.org/D52704

Reviewers: sebpop, tejohnson, brzycki, SirishP
Reviewed By: sebpop

llvm-svn: 343663

a27014b8

Add support for new pass manager · 9e20ade7

Aditya Kumar authored Oct 03, 2018

Modified the testcases to use both pass managers
Use single commandline flag for both pass managers.

Differential Revision: https://reviews.llvm.org/D52708
Reviewers: sebpop, tejohnson, brzycki, SirishP
Reviewed By: tejohnson, brzycki

llvm-svn: 343662

9e20ade7

Oct 01, 2018

Temporarily revert "[GVNHoist] Re-enable GVNHoist by default" · dcf1d97c

Eric Christopher authored Oct 01, 2018

This reverts commit r342387 as it's showing significant performance
regressions in a number of benchmarks. Followed up with the
committer and original thread with an example and will get performance
numbers before recommitting.

llvm-svn: 343522

dcf1d97c

Recommit r343308: [LoopInterchange] Turn into a loop pass. · 8600fee5
Florian Hahn authored Oct 01, 2018
```
llvm-svn: 343450
```
8600fee5

Sep 28, 2018

Revert "[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints" · 29b29801

whitequark authored Sep 28, 2018

This reverts commit c4baf7c2f06ff5459c4f5998ce980346e72bff97.

Broke the bots, and should really be in Transforms/Coroutines
instead.

llvm-svn: 343337

29b29801

[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints · 937afbc3

whitequark authored Sep 28, 2018

Summary: This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: mehdi_amini, modocache, llvm-commits

Differential Revision: https://reviews.llvm.org/D51642

llvm-svn: 343336

937afbc3

Revert r343308: [LoopInterchange] Turn into a loop pass. · 8d72ecc3
Florian Hahn authored Sep 28, 2018
```
llvm-svn: 343310
```
8d72ecc3

[LoopInterchange] Turn into a loop pass. · 0694c159

Florian Hahn authored Sep 28, 2018

This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.

The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.

It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.


Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D51702

llvm-svn: 343308

0694c159