  1. Nov 17, 2016
  2. Nov 14, 2016
    • Teresa Johnson's avatar
      [ThinLTO] Only promote exported locals as marked in index · 4fef68cb
      Teresa Johnson authored
      Summary:
      We have always speculatively promoted all renamable local values
      (except const non-address taken variables) for both the exporting
      and importing module. We would then internalize them back based on
      the ThinLink results if they weren't actually exported. This is
      inefficient, and results in unnecessary renames. It also meant we
      had to check the non-renamability of a value in the summary, which
      was already checked during function importing analysis in the ThinLink.
      
      Made renameModuleForThinLTO (which does the promotion/renaming) instead
      use the index when exporting, to avoid unnecessary renames/promotions.
      For importing modules, we can simply promote all values, as any local
      we import is by definition exported and needs promotion.
      
      This required changes to the method used by the FunctionImport pass
      (only invoked from 'opt' for testing) and when invoked from llvm-link,
      since neither does a ThinLink. We simply conservatively mark all locals
      in the index as promoted, which preserves the current aggressive
      promotion behavior.
      
      I also needed to change an llvm-lto based test where we had previously
      been aggressively promoting values that weren't importable (aliasees),
      but now will not promote.
      
      Reviewers: mehdi_amini
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D26467
      
      llvm-svn: 286871
      4fef68cb
    • Teresa Johnson's avatar
      [ThinLTO] Make inline assembly handling more efficient in summary · d5033a45
      Teresa Johnson authored
      Summary:
      The change in r285513 to prevent exporting of locals used in
      inline asm added all locals in the llvm.used set to the reference
      set of functions containing inline asm. Since these locals were marked
      NoRename, this automatically prevented importing of the function.
      
      Unfortunately, this caused an explosion in the summary reference lists
      in some cases. In my particular example, it happened for a large protocol
      buffer generated C++ file, where many of the generated functions
      contained an inline asm call. It was exacerbated when doing a ThinLTO
      PGO instrumentation build, where the PGO instrumentation included
      thousands of private __profd_* values that were added to llvm.used.
      
      We really only need to include a single llvm.used local (NoRename) value
      in the reference list of a function containing inline asm to block it
      from being imported. However, it seems cleaner to add a flag to the summary
      that explicitly describes this situation, which is what this patch does.
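The flag-based approach can be modeled with a small sketch (struct and field names are illustrative, not the real summary classes): one bit on the function summary replaces thousands of reference-list entries, and the importer checks that bit.

```cpp
#include <vector>

// Illustrative model of a function summary: instead of listing every
// llvm.used local in Refs, a single bit records "contains inline asm
// that may reference internal values".
struct FunctionSummarySketch {
  bool HasInlineAsmMaybeReferencingInternal = false;
  std::vector<int> Refs; // reference list stays small
};

// The importer refuses functions whose inline asm may touch local values.
bool eligibleForImport(const FunctionSummarySketch &S) {
  return !S.HasInlineAsmMaybeReferencingInternal;
}
```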
      
      Reviewers: mehdi_amini
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D26402
      
      llvm-svn: 286840
      d5033a45
  3. Nov 13, 2016
  4. Nov 11, 2016
    • Evgeniy Stepanov's avatar
      [cfi] Fix weak functions handling. · 1fe189d7
      Evgeniy Stepanov authored
      When a function pointer is replaced with a jumptable pointer, a special
      case is needed to preserve the semantics of extern_weak functions.
      Since a jumptable entry can not be extern_weak, we emulate that
      behaviour by replacing all references to F (the extern_weak function)
      with the following expression: F != nullptr ? JumpTablePtr : nullptr.
      
      Extra special care is needed for global initializers, since most (or
      probably all) backends can not lower an initializer that includes
      this kind of constant expression. Initializers like that are replaced
      with a global constructor (i.e. a runtime initializer).
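The replacement expression the commit describes can be shown as a small runtime sketch (function names hypothetical; in the pass this is a constant expression over the symbol, not a runtime call): a null extern_weak symbol must stay null, so only non-null references are redirected to the jump table.

```cpp
using FnPtr = int (*)();

int jumpTableEntry() { return 42; } // stands in for the CFI jump table slot

// The rewrite the pass performs: every use of the extern_weak function F
// becomes this expression, so an undefined (null) weak symbol stays null.
FnPtr replaceWeakReference(FnPtr F, FnPtr JumpTablePtr) {
  return F != nullptr ? JumpTablePtr : nullptr;
}
```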
      
      llvm-svn: 286636
      1fe189d7
    • Erik Eckstein's avatar
      Make the FunctionComparator of the MergeFunctions pass a stand-alone utility. · 4d6fb72a
      Erik Eckstein authored
      This is pure refactoring. NFC.
      
      This change moves the FunctionComparator (together with the GlobalNumberState
      utility) in to a separate file so that it can be used by other passes.
      For example, the SwiftMergeFunctions pass in the Swift compiler:
      https://github.com/apple/swift/blob/master/lib/LLVMPasses/LLVMMergeFunctions.cpp
      
      Details of the change:
      
      *) The big part is just moving code out of MergeFunctions.cpp into FunctionComparator.h/cpp
      *) Make FunctionComparator member functions protected (instead of private)
         so that a derived comparator class can use them.
      
      The following refactoring helps to share code between the base FunctionComparator
      class and a derived class:
      
      *) Add a beginCompare() function
      *) Move some basic function property comparisons into a separate function compareSignature()
      *) Do the GEP comparison inside cmpOperations() which now has a new
         needToCmpOperands reference parameter
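The protected-member change can be illustrated with a minimal class shape (names and structure are illustrative only, not the real LLVM interfaces): a derived comparator reuses the base's signature check, then substitutes its own body comparison, which is the pattern SwiftMergeFunctions relies on.

```cpp
// Illustrative sketch: with helpers protected rather than private, a derived
// comparator can reuse the signature check and add a looser body comparison.
class FunctionComparatorSketch {
public:
  virtual ~FunctionComparatorSketch() = default;
  virtual int compare() { return compareSignature(); }

protected:
  // 0 means "equal", mirroring the cmp-style return convention.
  int compareSignature() { return 0; }
};

class SwiftStyleComparator : public FunctionComparatorSketch {
public:
  int compare() override {
    if (int Res = compareSignature()) // reuse the protected base helper
      return Res;
    return compareBodiesLoosely();
  }

private:
  int compareBodiesLoosely() { return 0; }
};
```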
      
      https://reviews.llvm.org/D25385
      
      llvm-svn: 286632
      4d6fb72a
    • Peter Collingbourne's avatar
      6de481a3
    • Evgeniy Stepanov's avatar
      [cfi] Implement cfi-icall using inline assembly. · f48ffab5
      Evgeniy Stepanov authored
      The current implementation is emitting a global constant that happens
      to evaluate to the same bytes + relocation as a jump instruction on
      X86. This does not work for PIE executables and shared libraries
      though, because we end up with a wrong relocation type. And it has no
      chance of working on ARM/AArch64, which use different relocation types
      for jump instructions (R_ARM_JUMP24) that are never generated for
      data.
      
      This change replaces the constant with module-level inline assembly
      followed by a hidden declaration of the jump table. Works fine for
      ARM/AArch64, but has some drawbacks.
      * Extra symbols are added to the static symbol table, which inflate
      the size of the unstripped binary a little. Stripped binaries are not
      affected. This happens because jump table declarations must be
      external (because their body is in the inline asm).
      * Original functions that were anonymous are now named
      <original name>.cfi, and it affects symbolization sometimes. This is
      necessary because the only user of these functions is the (inline
      asm) jump table, so they had to be added to @llvm.used, which does
      not allow unnamed functions.
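The shape of the emitted module-level asm can be sketched like this (the exact directives are illustrative, x86 mnemonics shown; this is a model of the output, not the pass's emitter): each jump table entry is an external symbol whose body is a jump to the renamed original function.

```cpp
#include <string>

// Sketch of the per-entry module-level asm this change emits: the entry is
// a globally visible symbol whose body jumps to the original function,
// which now lives under <name>.cfi.
std::string jumpTableEntryAsm(const std::string &Name) {
  return ".globl " + Name + "\n" +
         Name + ":\n" +
         "  jmp " + Name + ".cfi\n";
}
```

This is why the entries must be external (their bodies exist only in the inline asm) and why the original functions need names: the asm has to reference them.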
      
      llvm-svn: 286611
      f48ffab5
  5. Nov 10, 2016
  6. Nov 09, 2016
  7. Nov 08, 2016
  8. Nov 07, 2016
  9. Nov 04, 2016
  10. Oct 28, 2016
  11. Oct 25, 2016
  12. Oct 18, 2016
    • Rong Xu's avatar
      Conditionally eliminate library calls where the result value is not used · 1c0e9b97
      Rong Xu authored
      Summary:
      This pass shrink-wraps a condition to some library calls where the call
      result is not used. For example:
         sqrt(val);
       is transformed to
         if (val < 0)
           sqrt(val);
      Even if the result of the library call is not used, the compiler cannot
      safely delete the call, because the function can set errno on error
      conditions.
      Note that in many functions the error condition depends solely on the
      incoming parameter. In this optimization, we generate the condition that
      leads to errno being set and use it to shrink-wrap the call. Since the
      chance of hitting the error condition is low, the runtime call is
      effectively eliminated.
      
      These partially dead calls are usually results of C++ abstraction penalty
      exposed by inlining. This optimization hits 108 times in 19 C/C++ programs
      in SPEC2006.
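The transformed form of the sqrt example can be written out directly (a hand-written model of the pass's output, assuming the platform's sqrt reports domain errors via errno): the call survives only on the error path, purely for its errno side effect.

```cpp
#include <cerrno>
#include <cmath>

// Model of the shrink-wrapped code the pass generates for an unused-result
// sqrt call: only make the runtime call when the argument is in the error
// domain, preserving errno semantics while skipping the call on the hot path.
void sqrtResultUnused(double Val) {
  if (Val < 0)                 // sqrt sets errno (EDOM) only for negatives
    (void)std::sqrt(Val);      // keep the call just for its errno side effect
}
```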
      
      Reviewers: hfinkel, mehdi_amini, davidxl
      
      Subscribers: modocache, mgorny, mehdi_amini, xur, llvm-commits, beanz
      
      Differential Revision: https://reviews.llvm.org/D24414
      
      llvm-svn: 284542
      1c0e9b97
  13. Oct 08, 2016
  14. Oct 05, 2016
    • David Callahan's avatar
      Modify df_iterator to support post-order actions · c1051ab2
      David Callahan authored
      Summary: This changes the state used to maintain visited information for the depth-first iterator. We now assume a method "completed(...)" which is called after all children of a node have been visited. In all existing cases this method does nothing, so this patch has no functional changes. It will, however, allow a client to distinguish back edges from cross edges in a DFS tree.
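The edge classification this hook enables can be shown with a standalone DFS model (not the real df_iterator API): a neighbor that is started but not yet completed is a back edge; one that is both started and completed is a cross (or forward) edge.

```cpp
#include <map>
#include <utility>
#include <vector>

// Standalone model of the completed(...) hook: Done[N] is set exactly where
// the iterator would invoke completed(N), after all of N's children finish.
struct DFS {
  std::map<int, std::vector<int>> Adj;
  std::map<int, bool> Started, Done;
  std::vector<std::pair<int, int>> BackEdges;

  void visit(int N) {
    Started[N] = true;
    for (int C : Adj[N]) {
      if (!Started[C])
        visit(C);                      // tree edge
      else if (!Done[C])
        BackEdges.push_back({N, C});   // in-progress ancestor => back edge
      // Started && Done => cross/forward edge
    }
    Done[N] = true;                    // the completed(N) callback point
  }
};
```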
      
      Reviewers: nadav, mehdi_amini, dberlin
      
      Subscribers: MatzeB, mzolotukhin, twoh, freik, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D25191
      
      llvm-svn: 283391
      c1051ab2
  15. Oct 03, 2016
  16. Oct 01, 2016
  17. Sep 30, 2016
  18. Sep 29, 2016
  19. Sep 28, 2016
  20. Sep 27, 2016
    • Adam Nemet's avatar
      [Inliner] Fold the analysis remark into the missed remark · 1142147e
      Adam Nemet authored
      There is really no reason for these to be separate.
      
      The vectorizer started this pretty bad tradition that the text of the
      missed remarks is pretty meaningless, i.e. vectorization failed.  There,
      you have to query analysis to get the full picture.
      
      I think we should just explain the reason for missing the optimization
      in the missed remark when possible.  Analysis remarks should provide
      information that the pass gathers regardless whether the optimization is
      passing or not.
      
      llvm-svn: 282542
      1142147e
    • Adam Nemet's avatar
      Output optimization remarks in YAML · a62b7e1a
      Adam Nemet authored
      (Re-committed after moving the template specialization under the yaml
      namespace.  GCC was complaining about this.)
      
      This allows various presentation of this data using an external tool.
      This was first recommended here[1].
      
      As an example, consider this module:
      
        1 int foo();
        2 int bar();
        3
        4 int baz() {
        5   return foo() + bar();
        6 }
      
      The inliner generates these missed-optimization remarks today (the
      hotness information is pulled from PGO):
      
        remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
        remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)
      
      Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:
      
        --- !Missed
        Pass:            inline
        Name:            NotInlined
        DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
        Function:        baz
        Hotness:         30
        Args:
          - Callee: foo
          - String:  will not be inlined into
          - Caller: baz
        ...
        --- !Missed
        Pass:            inline
        Name:            NotInlined
        DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
        Function:        baz
        Hotness:         30
        Args:
          - Callee: bar
          - String:  will not be inlined into
          - Caller: baz
        ...
      
      This is a summary of the high-level decisions:
      
      * There is a new streaming interface to emit optimization remarks.
      E.g. for the inliner remark above:
      
         ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                      DEBUG_TYPE, "NotInlined", &I)
                  << NV("Callee", Callee) << " will not be inlined into "
                  << NV("Caller", CS.getCaller()) << setIsVerbose());
      
      NV stands for named value and allows the YAML client to process a remark
      using its name (NotInlined) and the named arguments (Callee and Caller)
      without parsing the text of the message.
      
      Subsequent patches will update ORE users to use the new streaming API.
      
      * I am using YAML I/O for writing the YAML file.  YAML I/O requires you
      to specify reading and writing at once but reading is highly non-trivial
      for some of the more complex LLVM types.  Since it's not clear that we
      (ever) want to use LLVM to parse this YAML file, the code supports and
      asserts that we're writing only.
      
      On the other hand, I did verify experimentally that the class hierarchy
      starting at DiagnosticInfoOptimizationBase can be mapped back from the
      YAML generated here (see D24479).
      
      * The YAML stream is stored in the LLVM context.
      
      * In the example, we can probably further specify the IR value used,
      i.e. print "Function" rather than "Value".
      
      * As before, hotness is computed in the analysis pass instead of
      DiagnosticInfo.  This avoids the layering problem, since BFI is in
      Analysis while DiagnosticInfo is in IR.
      
      [1] https://reviews.llvm.org/D19678#419445
      
      Differential Revision: https://reviews.llvm.org/D24587
      
      llvm-svn: 282539
      a62b7e1a
    • Adam Nemet's avatar
      Revert "Output optimization remarks in YAML" · cc2a3fa8
      Adam Nemet authored
      This reverts commit r282499.
      
      The GCC bots are failing
      
      llvm-svn: 282503
      cc2a3fa8
    • Adam Nemet's avatar
      Output optimization remarks in YAML · 92e928c1
      Adam Nemet authored
      This allows various presentation of this data using an external tool.
      This was first recommended here[1].
      
      As an example, consider this module:
      
        1 int foo();
        2 int bar();
        3
        4 int baz() {
        5   return foo() + bar();
        6 }
      
      The inliner generates these missed-optimization remarks today (the
      hotness information is pulled from PGO):
      
        remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
        remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)
      
      Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:
      
        --- !Missed
        Pass:            inline
        Name:            NotInlined
        DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
        Function:        baz
        Hotness:         30
        Args:
          - Callee: foo
          - String:  will not be inlined into
          - Caller: baz
        ...
        --- !Missed
        Pass:            inline
        Name:            NotInlined
        DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
        Function:        baz
        Hotness:         30
        Args:
          - Callee: bar
          - String:  will not be inlined into
          - Caller: baz
        ...
      
      This is a summary of the high-level decisions:
      
      * There is a new streaming interface to emit optimization remarks.
      E.g. for the inliner remark above:
      
         ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                      DEBUG_TYPE, "NotInlined", &I)
                  << NV("Callee", Callee) << " will not be inlined into "
                  << NV("Caller", CS.getCaller()) << setIsVerbose());
      
      NV stands for named value and allows the YAML client to process a remark
      using its name (NotInlined) and the named arguments (Callee and Caller)
      without parsing the text of the message.
      
      Subsequent patches will update ORE users to use the new streaming API.
      
      * I am using YAML I/O for writing the YAML file.  YAML I/O requires you
      to specify reading and writing at once but reading is highly non-trivial
      for some of the more complex LLVM types.  Since it's not clear that we
      (ever) want to use LLVM to parse this YAML file, the code supports and
      asserts that we're writing only.
      
      On the other hand, I did verify experimentally that the class hierarchy
      starting at DiagnosticInfoOptimizationBase can be mapped back from the
      YAML generated here (see D24479).
      
      * The YAML stream is stored in the LLVM context.
      
      * In the example, we can probably further specify the IR value used,
      i.e. print "Function" rather than "Value".
      
      * As before, hotness is computed in the analysis pass instead of
      DiagnosticInfo.  This avoids the layering problem, since BFI is in
      Analysis while DiagnosticInfo is in IR.
      
      [1] https://reviews.llvm.org/D19678#419445
      
      Differential Revision: https://reviews.llvm.org/D24587
      
      llvm-svn: 282499
      92e928c1
    • Ivan Krasin's avatar
      Revert r277556. Add -lowertypetests-bitsets-level to control bitsets generation · 4ff4f21e
      Ivan Krasin authored
      Summary:
      We don't currently need this facility for CFI. Disabling individual hot methods proved
      to be a better strategy in Chrome.
      
      Also, the design of the feature is suboptimal, as pointed out by Peter Collingbourne.
      
      Reviewers: pcc
      
      Subscribers: kcc
      
      Differential Revision: https://reviews.llvm.org/D24948
      
      llvm-svn: 282461
      4ff4f21e
    • Peter Collingbourne's avatar
      LowerTypeTests: Remove unused variable. · 53a852b6
      Peter Collingbourne authored
      llvm-svn: 282456
      53a852b6
    • Peter Collingbourne's avatar
      LowerTypeTests: Create LowerTypeTestsModule class and move implementation... · 6ed92e3f
      Peter Collingbourne authored
      LowerTypeTests: Create LowerTypeTestsModule class and move implementation there. Related simplifications.
      
      llvm-svn: 282455
      6ed92e3f
  21. Sep 26, 2016
    • Piotr Padlewski's avatar
      [thinlto] Basic thinlto fdo heuristic · d9830eb7
      Piotr Padlewski authored
      Summary:
      This patch improves the thinlto importer
      by importing up to 3x larger functions when they are called from a hot block.
      
      I compared performance with trunk on SPEC, and there were improvements
      of about 2% on povray and 3.33% on milc. These results seem
      to be consistent and match the results Teresa got with her simple
      heuristic. Some benchmarks got slower, but I think they are just
      noisy (mcf, xalancbmk, omnetpp); I am running the benchmarks again with
      more iterations to confirm. The geomean of all benchmarks, including the
      noisy ones, was about +0.02%.
      
      I see a much better improvement on the google branch with Easwaran's patch
      for pgo callsite inlining (the inliner actually inlines those big functions).
      Overall I see a +0.5% improvement, and I get +8.65% on povray.
      So I guess we will see a much bigger change when Easwaran's patch lands
      (it depends on the new pass manager), but it is still worth putting this
      in trunk before then.
      
      Implementation detail changes:
      - Removed CallsiteCount.
      - ProfileCount got replaced by Hotness.
      - hot-import-multiplier is set to 3.0 for now;
      I didn't have time to tune it, but I see that we get most of the interesting
      functions with 3, so there is not much performance difference with a higher
      value, and binary size doesn't grow as much as with 10.0.
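The multiplier's effect on the import decision can be sketched as follows (function and constant names are illustrative, not the actual importer code): a hot call site gets a 3x larger instruction-count threshold, so bigger callees are still imported.

```cpp
// Sketch of the heuristic: hot call sites multiply the base import
// threshold by 3.0 (mirroring hot-import-multiplier=3.0), so callees up
// to 3x larger remain importable from hot blocks.
enum class Hotness { Unknown, Cold, Hot };

unsigned importThreshold(unsigned BaseThreshold, Hotness H) {
  const double HotImportMultiplier = 3.0;
  if (H == Hotness::Hot)
    return static_cast<unsigned>(BaseThreshold * HotImportMultiplier);
  return BaseThreshold;
}
```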
      
      Reviewers: eraman, mehdi_amini, tejohnson
      
      Subscribers: mehdi_amini, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24638
      
      llvm-svn: 282437
      d9830eb7
  22. Sep 21, 2016
  23. Sep 19, 2016