Commits · e505a5abe9fc28a1bb50d900f23ef29766aaf38a · Roger Ferrer / llvm-epi-0.8

Mar 24, 2012
- add EP_OptimizerLast extension point · e505a5ab
  Kostya Serebryany authored Mar 23, 2012
```
llvm-svn: 153353
```
  e505a5ab
Mar 16, 2012

Rip out support for 'llvm.noinline'. This thing has a strange history... · b37fc13a

Chandler Carruth authored Mar 16, 2012

It was added in 2007 as the first cut at supporting no-inline
attributes, but we didn't have function attributes of any form at the
time. However, it was added without any mention in the LangRef or other
documentation.

Later on, in 2008, Devang added function notes for 'inline=never' and
then turned them into proper function attributes. From that point
onward, as far as I can tell, the world moved on, and no one has touched
'llvm.noinline' in any meaningful way since.

Its time has now come. We have had better mechanisms for doing this for
a long time, all the frontends I'm aware of use them, and this is just
holding back progress. Given that it was never a documented feature of
the IR, I've provided no auto-upgrade support. If people know of real,
in-the-wild bitcode that relies on this, yell at me and I'll add it, but
I *seriously* doubt anyone cares.

llvm-svn: 152904

b37fc13a

Start removing the use of an ad-hoc 'never inline' set and instead · d7a5f2ad

Chandler Carruth authored Mar 16, 2012

directly query the function information which this set was representing.
This simplifies the interface of the inline cost analysis, and makes the
always-inline pass significantly more efficient.

Previously, always-inline would first make a single set of every
function in the module *except* those marked with the always-inline
attribute. It would then query this set at every call site to see if the
function was a member of the set, and if so, refuse to inline it. This
is quite wasteful. Instead, simply check the function attribute directly
when looking at the callsite.

The normal inliner also had similar redundancy. It added every function
in the module with the noinline attribute to its set to ignore, even
though inside the cost analysis function we *already tested* the
noinline attribute and produced the same result.

The only tricky part of removing this is that we have to be able to
correctly remove only the functions inlined by the always-inline pass
when finalizing, which requires a bit of a hack. Still, much less of
a hack than the set of all non-always-inline functions was. While I was
touching this function, I switched a heavy-weight set to a vector with
sort+unique. The algorithm already had a two-phase insert and removal
pattern, we were just needlessly paying the uniquing cost on every
insert.
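
The two-phase pattern mentioned above (append everything first, deduplicate once at the end) can be sketched in plain C++. This is only an illustration: it uses `int` IDs in place of functions, and the names `InlinedLog`, `record`, `finalize`, and `contains` are hypothetical, not taken from the LLVM source.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the inliner's bookkeeping: phase one records
// every inlined callee (duplicates allowed), phase two deduplicates once.
struct InlinedLog {
  std::vector<int> ids;
  bool finalized = false;

  // Phase 1: cheap append, no uniquing cost per insert.
  void record(int id) {
    ids.push_back(id);
    finalized = false;
  }

  // Phase 2: sort, then drop adjacent duplicates in a single pass.
  void finalize() {
    std::sort(ids.begin(), ids.end());
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
    finalized = true;
  }

  // Queries are only meaningful on the sorted, deduplicated vector.
  bool contains(int id) const {
    assert(finalized && "call finalize() before querying");
    return std::binary_search(ids.begin(), ids.end(), id);
  }
};
```

A set pays the uniquing cost on every insert; the vector pays it once in `finalize()`, which fits a workload where all inserts happen before any lookup or removal.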

This probably speeds up some compiles by a small amount (-O0 compiles
with lots of always-inline, so potentially heavy libc++ users), but I've
not tried to measure it.

I believe there is no functional change here, but yell if you spot one.
None are intended.

Finally, the direction this is going in is to greatly simplify the
inline cost query interface so that we can replace its implementation
with a much more clever one. Along the way, all the APIs get simplified,
so it seems incrementally good.

llvm-svn: 152903

d7a5f2ad

Mar 14, 2012

Change where we enable the heuristic that delays inlining into functions · 30b8416d

Chandler Carruth authored Mar 14, 2012

which are small enough to themselves be inlined. Delaying in this manner
can be harmful if the function is ineligible for inlining in some (or
many) contexts as it pessimizes the code of the function itself in the
event that inlining does not eventually happen.

Previously the check was written to only do this delaying of inlining
for static functions in the hope that they could be entirely deleted and
in the knowledge that all callers of static functions will have the
opportunity to inline if it is in fact profitable. However, with C++ we
get two other important sources of functions where the definition is
always available for inlining: inline functions and templated functions.
This patch generalizes the inliner to allow linkonce-ODR (the linkage
such C++ routines receive) to also qualify for this delay-based
inlining.
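
The change in eligibility described above can be pictured as a predicate on the caller's linkage. The sketch below is a toy model, not the inliner's code: the `Linkage` enum and both function names are hypothetical, and it models only the linkage test, not the size threshold.

```cpp
#include <cassert>

// Toy linkage kinds; only the distinctions that matter for this sketch.
enum class Linkage { External, Internal, LinkOnceODR };

// Before the patch, as described above: only static (internal) functions
// qualified for delayed inlining.
inline bool qualifiesForDelayOld(Linkage l) { return l == Linkage::Internal; }

// After the patch: linkonce-ODR functions (C++ inline functions and
// templates) qualify as well, since their definition is always available.
inline bool qualifiesForDelayNew(Linkage l) {
  return l == Linkage::Internal || l == Linkage::LinkOnceODR;
}
```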

Benchmarking across a range of large real-world applications shows
roughly 2% size increase across the board, but an average speedup of
about 0.5%. Some benchmarks improved by over 2%, and the 'clang' binary
itself (when bootstrapped with this feature) shows a 1% -O0 performance
improvement when run over all Sema, Lex, and Parse source code smashed
into a single file. A clean re-build of Clang+LLVM with a bootstrapped
Clang shows approximately 2% improvement, but that measurement is often
noisy.

llvm-svn: 152737

30b8416d

Mar 13, 2012
- Teach globalopt how to evaluate an invoke with a non-void return type. · eab06fa3
  Dan Gohman authored Mar 13, 2012
```
llvm-svn: 152634
```
  eab06fa3
Mar 12, 2012

When inlining a function and adding its inner call sites to the · 595fda84

Chandler Carruth authored Mar 12, 2012

candidate set for subsequent inlining, try to simplify the arguments to
the inner call site now that inlining has been performed.

The goal here is to propagate and fold constants through deeply nested
call chains. Without doing this, we lose the inliner bonus that should
be applied because the arguments don't match the exact pattern the cost
estimator uses.

Reviewed on IRC by Benjamin Kramer.

llvm-svn: 152556

595fda84

Mar 08, 2012

Take into account Duncan's comments on r149481 from 2 Feb 2012: · 5b648afb

Stepan Dyatkovskiy authored Mar 08, 2012

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html

Implemented CaseIterator, which solves almost all of the described issues: we no longer need to mix operand/case/successor indexing. The base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*".

ConstCaseIt is a read-only iterator.
CaseIt is a read-write iterator; it allows changing the case successor and case value.

Using the iterator makes it possible to remove the resolveXXXX methods entirely. All index conversions are done automatically inside the iterator's getters.

Typical iterator usage looks like this:
SwitchInst *SI = ... // initialize it somehow

for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) {
  BasicBlock *BB = i.getCaseSuccessor();
  ConstantInt *V = i.getCaseValue();
  // Do something.
}

To convert a case number to a TerminatorInst successor index, use the iterator's getSuccessorIndex method.
To initialize an iterator from a TerminatorInst successor index, use the CaseIt::fromSuccessorIndex(...) method.
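
To show why an iterator removes the need to juggle index spaces, here is a self-contained toy model. It is not the LLVM class: `ToySwitch` is hypothetical, targets and case values are plain `int`s, and the layout (successor slot 0 is the default target, case i lives in slot i + 1) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy switch. Assumed layout: successors[0] is the default target and
// successors[i + 1] is the target of case i.
struct ToySwitch {
  std::vector<int> caseValues;  // one value per case
  std::vector<int> successors;  // default target followed by case targets

  class CaseIt {
    ToySwitch *sw;
    std::size_t index;  // always a case index, never a raw successor index

  public:
    CaseIt(ToySwitch *s, std::size_t i) : sw(s), index(i) {}

    int getCaseValue() const { return sw->caseValues[index]; }
    int getCaseSuccessor() const { return sw->successors[index + 1]; }
    void setSuccessor(int target) { sw->successors[index + 1] = target; }

    // The only places where the two index spaces are converted.
    std::size_t getSuccessorIndex() const { return index + 1; }
    static CaseIt fromSuccessorIndex(ToySwitch *s, std::size_t succ) {
      assert(succ >= 1 && succ < s->successors.size() && "not a case slot");
      return CaseIt(s, succ - 1);
    }

    CaseIt &operator++() {
      ++index;
      return *this;
    }
    bool operator!=(const CaseIt &other) const { return index != other.index; }
  };

  CaseIt caseBegin() { return CaseIt(this, 0); }
  CaseIt caseEnd() { return CaseIt(this, caseValues.size()); }
};
```

Callers only ever see case indices; the `+ 1` offset is confined to the iterator, so the underlying layout could change without touching client loops.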

There are also related changes in llvm-clients: klee and clang.

llvm-svn: 152297

5b648afb

Feb 27, 2012
- Plug a memleak in GlobalOpt. · 93887631
  Benjamin Kramer authored Feb 27, 2012
```
Found by valgrind.

llvm-svn: 151525
```
  93887631
Feb 25, 2012
- Add comment. · 50e0b81e
  Chad Rosier authored Feb 25, 2012
```
llvm-svn: 151431
```
  50e0b81e
- Add support for disabling llvm.lifetime intrinsics in the AlwaysInliner. These · 07d37bc1
  Chad Rosier authored Feb 25, 2012
```
are optimization hints, but at -O0 we're not optimizing.  This becomes a problem
when the alwaysinline attribute is abused.
rdar://10921594

llvm-svn: 151429
```
  07d37bc1
- Fix indentation. · e48e5d29
  Chad Rosier authored Feb 25, 2012
```
llvm-svn: 151420
```
  e48e5d29
Feb 23, 2012
- GCC fails to understand that NextBB is always initialized if EvaluateBlock · 4730cb9c
  Duncan Sands authored Feb 23, 2012
```
returns 'true' and emits a warning.  Help it out.

llvm-svn: 151242
```
  4730cb9c
Feb 21, 2012

Use the target-aware constant folder on expressions to improve the chance · 9d0da185

Nick Lewycky authored Feb 21, 2012

they'll be simple enough to simulate, and to reduce the chance we'll encounter
equal but different simple pointer constants.

This removes the symptoms from PR11352 but is not a full fix. A proper fix would
either require a guarantee that two constant objects we simulate are folded
when equal, or a different way of handling equal pointers (i.e., trying a
constantexpr icmp on them to see whether we know they're equal or non-equal or
unsure).

llvm-svn: 151093

9d0da185

Check for the correct size in the invariant marker. · 519561f4
Nick Lewycky authored Feb 20, 2012
```
llvm-svn: 151003
```
519561f4

Feb 20, 2012
- Rename class Evaluate to Evaluator and put it in an anonymous namespace. · 60829a58
  Nick Lewycky authored Feb 20, 2012
```
llvm-svn: 150947
```
  60829a58
- Move EvaluateFunction and EvaluateBlock into a class, and make the class store · 73be5e31
  Nick Lewycky authored Feb 19, 2012
```
the information that they pass around between them. No functionality change!

llvm-svn: 150939
```
  73be5e31
Feb 17, 2012

Add support for invariant.start inside the static constructor evaluator. This is · 68f9f9d9

Nick Lewycky authored Feb 17, 2012

useful to represent a variable that is const in the source but can't be constant
in the IR because of a non-trivial constructor. If globalopt evaluates the
constructor, and there was an invariant.start with no matching invariant.end
possible, it will mark the global constant afterwards.

llvm-svn: 150794

68f9f9d9

Feb 12, 2012
- Handle InvokeInst in EvaluateBlock. Don't try to support exceptions, it's just · c1572e4c
  Nick Lewycky authored Feb 12, 2012
```
that no optz'ns have run yet to convert invokes to calls.

llvm-svn: 150326
```
  c1572e4c
- false is totally null! · f285256f
  Nick Lewycky authored Feb 12, 2012
```
llvm-svn: 150324
```
  f285256f
- Remove redundant getAnalysis<> calls in GlobalOpt. Add a few Itanium ABI calls · 4b273cb7
  Nick Lewycky authored Feb 12, 2012
```
to TargetLibraryInfo and use one of them in GlobalOpt.

llvm-svn: 150323
```
  4b273cb7
- Pass TargetData and TargetLibraryInfo through to the constant folder. Fixes a · cf6aae68
  Nick Lewycky authored Feb 12, 2012
```
few fixme's when TLI was added.

llvm-svn: 150322
```
  cf6aae68
- Fix function name in comment to match actual name. Fix comments that are using · 1480f1d3
  Nick Lewycky authored Feb 12, 2012
```
doxy-style on local variables to not do so. Fix one 80-col violation.

llvm-svn: 150320
```
  1480f1d3
- Don't traverse the PHI nodes twice. No functionality change! · 4231c41c
  Nick Lewycky authored Feb 12, 2012
```
llvm-svn: 150319
```
  4231c41c
Feb 09, 2012

Tweak comment readability and grammar. · 1a4695a0
Benjamin Kramer authored Feb 09, 2012
```
llvm-svn: 150183
```
1a4695a0

GlobalOpt: Be more aggressive about eliminating side-effect free static dtors. · 487a3962

Benjamin Kramer authored Feb 09, 2012

GlobalOpt runs early in the pipeline (before inlining) and complex class
hierarchies often introduce bitcasts or GEPs which weren't optimized away.
Teach it to ignore side-effect free instructions instead of depending on
other passes to remove them.

llvm-svn: 150174

487a3962

Feb 06, 2012
- [unwind removal] We no longer have 'unwind' instructions being generated, so · d5d95b0b
  Bill Wendling authored Feb 06, 2012
```
remove the code that handles them.

llvm-svn: 149901
```
  d5d95b0b
- Split part of EvaluateFunction into a new EvaluateBlock method. No functionality · 239fdf0f
  Nick Lewycky authored Feb 06, 2012
```
change.

llvm-svn: 149861
```
  239fdf0f
Feb 05, 2012

Teach GlobalOpt to handle atomic accesses to globals. · 52da72b1

Nick Lewycky authored Feb 05, 2012

 * Most of the transforms come through intact by having each transformed load or
store copy the ordering and synchronization scope of the original.
 * The transform that turns a global only accessed in main() into an alloca
(since main is non-recursive) with a store of the initial value uses an
unordered store, since it's guaranteed to be the first thing to happen in main.
(Threads may have started before main (!) but they can't have the address of a
function local before the point in the entry block we insert our code.)
 * The heap-SRoA transforms are disabled in the face of atomic operations. This
can probably be improved; it seems odd to have atomic accesses to an alloca
that doesn't have its address taken.

AnalyzeGlobal keeps track of the strongest ordering found in any use of the
global. This is more information than we need right now, but it's cheap to
compute and likely to be useful.
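
Tracking the strongest ordering across all uses amounts to folding a "stronger of two" operation over them. A minimal sketch follows, using a local toy enum rather than LLVM's own types; the function names are hypothetical. The one subtlety is that acquire and release are incomparable, so combining them yields acquire-release.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy ordering scale, weakest to strongest. The enum is local to this sketch.
enum Ordering {
  NotAtomic,
  Unordered,
  Monotonic,
  Acquire,
  Release,
  AcquireRelease,
  SequentiallyConsistent
};

// Return an ordering at least as strong as both inputs. Acquire and Release
// do not imply each other, so their combination is AcquireRelease.
inline Ordering strongerOrdering(Ordering x, Ordering y) {
  if ((x == Acquire && y == Release) || (x == Release && y == Acquire))
    return AcquireRelease;
  return std::max(x, y);
}

// Fold over every use of a global, starting from the weakest ordering.
inline Ordering strongestOf(const std::vector<Ordering> &uses) {
  Ordering result = NotAtomic;
  for (Ordering o : uses)
    result = strongerOrdering(result, o);
  return result;
}
```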

llvm-svn: 149847

52da72b1

Clean up some whitespace and comments. No functionality change. · bbd1156b
Nick Lewycky authored Feb 05, 2012
```
llvm-svn: 149845
```
bbd1156b

Feb 01, 2012

SwitchInst refactoring. · 513aaa56

Stepan Dyatkovskiy authored Feb 01, 2012

The purpose of this refactoring is to hide operand roles from the SwitchInst user (the programmer). If you want to work with operands directly, you will probably need lower-level methods than the SwitchInst ones (TerminatorInst or maybe User). After this patch we can reorganize SwitchInst operands and successors as we want.

What was done:

1. Changed the semantics of the index in the getCaseValue method:
getCaseValue(0) means "get first case", not the condition. Use getCondition() if you want the condition. I propose not mixing SwitchInst case indexing with low-level indexing (TI successor indexing, User operand indexing), since it may be dangerous.
2. For the same reason, findCaseValue(ConstantInt*) returns the actual case number. 0 means the first case, not the default. If there is no case with the given value, ErrorIndex will be returned.
3. Added the getCaseSuccessor method. I propose avoiding TerminatorInst::getSuccessor if you want the case successor BB. Use getCaseSuccessor instead, since the internal SwitchInst organization of operands/successors is hidden and may change at any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to show how case successors are really mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" is for stepping down from SwitchInst to TerminatorInst. It returns the TerminatorInst successor index for a given case successor.
4.2 "resolveCaseIndex" converts a low-level successor index to the case index that corresponds to the given successor.
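
The index semantics in points 1 to 4 can be illustrated with free functions over a plain vector. This is a sketch under stated assumptions, not the SwitchInst implementation: the value chosen for `ErrorIndex` is hypothetical (the text does not give it), and the mapping "successor 0 is the default, case i is successor i + 1" is assumed for illustration.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical sentinel for "no such case"; the real value is not stated above.
const unsigned ErrorIndex = UINT_MAX;

// The vector holds case values only (no condition, no default), so index 0
// always means "first case".
unsigned findCaseValue(const std::vector<int> &caseValues, int value) {
  for (unsigned i = 0; i != caseValues.size(); ++i)
    if (caseValues[i] == value)
      return i;
  return ErrorIndex;
}

// Assumed layout: successor 0 is the default target, case i is successor i + 1.
unsigned resolveSuccessorIndex(unsigned caseIndex) { return caseIndex + 1; }

unsigned resolveCaseIndex(unsigned successorIndex) {
  assert(successorIndex != 0 && "successor 0 is the default, not a case");
  return successorIndex - 1;
}
```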

Note: There are also related compatibility fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.
llvm-svn: 149481

513aaa56

Add a basic-block autovectorization pass. · c34e5113

Hal Finkel authored Feb 01, 2012

This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).

llvm-svn: 149468

c34e5113

Jan 27, 2012
- continue making the world safe for ConstantDataVector. At this point, · 0256be96
  Chris Lattner authored Jan 27, 2012
```
we should (theoretically) optimize and codegen ConstantDataVector as well
as ConstantVector.

llvm-svn: 149116
```
  0256be96
Jan 26, 2012
- Continue improving support for ConstantDataAggregate, and use the · fa77500d
  Chris Lattner authored Jan 26, 2012
```
new methods recently added to (sometimes greatly!) simplify code.

llvm-svn: 149024
```
  fa77500d
Jan 25, 2012
- use Constant::getAggregateElement to simplify a bunch of code. · 6705883a
  Chris Lattner authored Jan 25, 2012
```
llvm-svn: 148934
```
  6705883a
Jan 20, 2012
- More dead code removal (using -Wunreachable-code) · 46a9f016
  David Blaikie authored Jan 20, 2012
```
llvm-svn: 148578
```
  46a9f016
Jan 17, 2012

Add a new PassManagerBuilder customization point, · b9936296

Dan Gohman authored Jan 17, 2012

EP_ModuleOptimizerEarly, to allow passes to be added before the
main ModulePass optimizers.

llvm-svn: 148329

b9936296

Jan 11, 2012

Re-fix the issue Bill fixed in r147899 in a slightly different way, which... · b31c627b

Eli Friedman authored Jan 11, 2012

Re-fix the issue Bill fixed in r147899 in a slightly different way, which doesn't abuse the semantics of linker_private.  We don't really want to merge any string constant with a weak_odr global.

llvm-svn: 147971

b31c627b

If the global variable is removed by the linker, then don't constant merge it · c7915519

Bill Wendling authored Jan 11, 2012

with other symbols.

An object in the __cfstring section is supposed to be filled with CFString
objects, which have a pointer to ___CFConstantStringClassReference followed by a
pointer to a __cstring. If we allow the object in the __cstring section to be
merged with another global, then it could end up in any section. Because the
linker is going to remove these symbols in the final executable, we shouldn't
bother to merge them.
<rdar://problem/10564621>

llvm-svn: 147899

c7915519

Jan 06, 2012

PR11705, part 2: globalopt shouldn't put inttoptr/ptrtoint operations into... · 55fa49f3

Eli Friedman authored Jan 05, 2012

PR11705, part 2: globalopt shouldn't put inttoptr/ptrtoint operations into global initializers if there's an implied extension or truncation.

llvm-svn: 147625

55fa49f3

Jan 05, 2012
- SCCCaptured is trivially false on entry to this loop and not modified inside it. · f740db31
  Nick Lewycky authored Jan 05, 2012
```
Eliminate the dead test for it on each loop iteration. No functionality change.

llvm-svn: 147616
```
  f740db31