  1. Mar 14, 2012
    • Change where we enable the heuristic that delays inlining into functions · 30b8416d
      Chandler Carruth authored
      which are small enough to themselves be inlined. Delaying in this manner
      can be harmful if the function is ineligible for inlining in some (or
      many) contexts as it pessimizes the code of the function itself in the
      event that inlining does not eventually happen.
      
      Previously the check was written to only do this delaying of inlining
      for static functions in the hope that they could be entirely deleted and
      in the knowledge that all callers of static functions will have the
      opportunity to inline if it is in fact profitable. However, with C++ we
      get two other important sources of functions where the definition is
      always available for inlining: inline functions and templated functions.
      This patch generalizes the inliner to allow linkonce-ODR (the linkage
      such C++ routines receive) to also qualify for this delay-based
      inlining.
      
      Benchmarking across a range of large real-world applications shows
      roughly 2% size increase across the board, but an average speedup of
      about 0.5%. Some benchmarks improved by over 2%, and the 'clang' binary
      itself (when bootstrapped with this feature) shows a 1% -O0 performance
      improvement when run over all Sema, Lex, and Parse source code smashed
      into a single file. A clean re-build of Clang+LLVM with a bootstrapped
      Clang shows approximately 2% improvement, but that measurement is often
      noisy.
      
      llvm-svn: 152737
  2. Mar 12, 2012
    • When inlining a function and adding its inner call sites to the · 595fda84
      Chandler Carruth authored
      candidate set for subsequent inlining, try to simplify the arguments to
      the inner call site now that inlining has been performed.
      
      The goal here is to propagate and fold constants through deeply nested
      call chains. Without doing this, we lose the inliner bonus that should
      be applied because the arguments don't match the exact pattern the cost
      estimator uses.
      
      Reviewed on IRC by Benjamin Kramer.
      
      llvm-svn: 152556
  3. Feb 25, 2012
  4. Oct 20, 2011
    • Refactor code from inlining and globalopt that checks whether a function... · 1923a330
      Eli Friedman authored
      Refactor code from inlining and globalopt that checks whether a function definition is unused, and enhance it so it can tell that functions which are only used by a blockaddress are in fact dead.  This probably doesn't happen much on most code, but the Linux kernel's _THIS_IP_ can trigger this issue with blockaddress.  (GlobalDCE can also handle the given testcase, but we only run that at -O3.)  Found while looking at PR11180.
      
      llvm-svn: 142572
  5. Jul 18, 2011
  6. Apr 23, 2011
  7. Jan 04, 2011
    • Improve the accuracy of the inlining heuristic looking for the · a71d2cc8
      Dale Johannesen authored
      case where a static caller is itself inlined everywhere else, and
      thus may go away if it doesn't get too big due to inlining other
      things into it.  If there are references to the caller other than
      calls, it will not be removed; account for this.
      This results in same-day completion of the case in PR8853.
      
      llvm-svn: 122821
  8. Dec 06, 2010
    • Fix PR8735, a really terrible problem in the inliner's "alloca merging" · fb212de0
      Chris Lattner authored
      optimization.
      
      Consider:
      static void foo() {
        A = alloca
        ...
      }
      
      static void bar() {
        B = alloca
        ...
        call foo();
      }
      
      int main() {
        bar()
      }
      
      The inliner proceeds bottom up, but let's pretend it decides not to inline foo
      into bar.  When it gets to main, it inlines bar into main(), and says "hey, I
      just inlined an alloca "B" into main, let's remember that".  Then it keeps going
      and finds that it now contains a call to foo.  It decides to inline foo into
      main, and says "hey, foo has an alloca A, and I have an alloca B from another
      inlined call site, let's reuse it".  The problem with this, of course, is that
      the lifetimes of A and B are nested, not disjoint.
      
      Unfortunately I can't create a reasonable testcase for this: the one in the
      PR is both huge and extremely sensitive, because minor tweaks end up
      causing foo to get inlined into bar too early.  We already have tests for the
      basic alloca merging optimization and this does not break them.
      
      llvm-svn: 120995
    • improve -debug output and comments a little. · 5b6a865f
      Chris Lattner authored
      llvm-svn: 120993
  9. Nov 03, 2010
  10. Aug 06, 2010
  11. Jul 29, 2010
  12. Jul 13, 2010
  13. May 31, 2010
  14. May 01, 2010
  15. Apr 25, 2010
  16. Apr 23, 2010
  17. Apr 20, 2010
  18. Apr 17, 2010
  19. Mar 10, 2010
    • Try to keep the cached inliner costs around for a bit longer for big functions. · b495cad7
      Jakob Stoklund Olesen authored
      The Caller cost info would be reset every time a callee was inlined. If the
      caller has lots of calls and there is some mutual recursion going on, the
      caller cost info could be calculated many times.
      
      This patch reduces inliner runtime from 240s to 0.5s for a function with 20000
      small function calls.
      
      This is a more conservative version of r98089 that doesn't break the clang
      test CodeGenCXX/temp-order.cpp. That test relies on rather extreme inlining
      for constant folding.
      
      llvm-svn: 98099
  20. Mar 09, 2010
  21. Feb 13, 2010
    • Enable the inlinehint attribute in the Inliner. · 492b8b42
      Jakob Stoklund Olesen authored
      Functions explicitly marked inline will get an inlining threshold slightly
      more aggressive than the default for -O3. This means that -O3 builds are
      mostly unaffected while -Os builds will be a bit bigger and faster.
      
      The difference depends entirely on how many 'inline's are sprinkled on the
      source.
      
      In the CINT2006 suite, only these tests are significantly affected under -Os:
      
                     Size   Time
      471.omnetpp   +1.63% -1.85%
      473.astar     +4.01% -6.02%
      483.xalancbmk +4.60%  0.00%
      
      Note that 483.xalancbmk runs too quickly to give useful timing results.
      
      llvm-svn: 96066
  22. Feb 06, 2010
    • Reintroduce the InlineHint function attribute. · 74bb06c0
      Jakob Stoklund Olesen authored
      This time it's for real! I am going to hook this up in the frontends as well.
      
      The inliner has some experimental heuristics for dealing with the inline hint.
      When given a -respect-inlinehint option, functions marked with the inline
      keyword are given a threshold just above the default for -O3.
      
      We need some experiments to determine if that is the right thing to do.
      
      llvm-svn: 95466
  23. Feb 04, 2010
    • Increase inliner thresholds by 25. · 113fb54b
      Jakob Stoklund Olesen authored
      This makes the inliner about as aggressive as it was before my changes to the
      inliner cost calculations. These levels give the same performance and slightly
      smaller code than before.
      
      llvm-svn: 95320
  24. Jan 20, 2010
  25. Jan 05, 2010
  26. Nov 12, 2009
    • use isInstructionTriviallyDead, as pointed out by Duncan · 5c89f4b4
      Chris Lattner authored
      llvm-svn: 87035
    • implement a nice little efficiency hack in the inliner. Since we're now · eb9acbfb
      Chris Lattner authored
      running IPSCCP early, and we run functionattrs interlaced with the inliner,
      we often (particularly for small or noop functions) completely propagate
      all of the information about a call to its call site in IPSCCP (making a call
      dead) and functionattrs is smart enough to realize that the function is
      readonly (because it is interlaced with the inliner).
      
      To improve compile time and make the inliner threshold more accurate, realize
      that we don't have to inline dead readonly function calls.  Instead, just 
      delete the call.  This happens all the time in C++ code; here are some
      counters from opt/llvm-ld counting the number of times calls were deleted vs
      inlined on various apps:
      
      Tramp3d opt:
        5033 inline                - Number of call sites deleted, not inlined
       24596 inline                - Number of functions inlined
      llvm-ld:
        667 inline           - Number of functions deleted because all callers found
        699 inline           - Number of functions inlined
      
      483.xalancbmk opt:
        8096 inline                - Number of call sites deleted, not inlined
       62528 inline                - Number of functions inlined
      llvm-ld:
         217 inline           - Number of allocas merged together
        2158 inline           - Number of functions inlined
      
      471.omnetpp:
        331 inline                - Number of call sites deleted, not inlined
       8981 inline                - Number of functions inlined
      llvm-ld:
        171 inline           - Number of functions deleted because all callers found
        629 inline           - Number of functions inlined
      
      
      Deleting a call is much faster than inlining it, and is insensitive to the
      size of the callee. :)
      
      llvm-svn: 86975
  27. Oct 13, 2009