Commits · a968caf8e024c51cb90872fef60cd529043e25d0 · Roger Ferrer / llvm-epi-0.8

May 14, 2012

Move the capture analysis from MemoryDependencyAnalysis to a more general place · a968caf8

Chad Rosier authored May 14, 2012

so that it can be reused in MemCpyOptimizer.  This analysis is needed to remove
an unnecessary memcpy when returning a struct into a local variable.
rdar://11341081
PR12686

llvm-svn: 156776

a968caf8

May 10, 2012
- Teach DeadStoreElimination to eliminate exit-block stores with phi addresses. · ed7c24e2
  Dan Gohman authored May 10, 2012
```
llvm-svn: 156558
```
  ed7c24e2
- teach DSE and isInstructionTriviallyDead() about calloc · 300d6299
  Nuno Lopes authored May 10, 2012
```
llvm-svn: 156553
```
  300d6299
- Fix the objc_storeStrong recognizer to stop before walking off the · f8b19d09
  Dan Gohman authored May 09, 2012
```
end of a basic block if there's no store.

llvm-svn: 156520
```
  f8b19d09
May 09, 2012
- Remove unused variable to get rid of warning. · 28540adf
  Craig Topper authored May 09, 2012
```
llvm-svn: 156466
```
  28540adf
- Miscellaneous accumulated cleanups. · 41375a35
  Dan Gohman authored May 08, 2012
```
llvm-svn: 156445
```
  41375a35
- Fix objc_storeStrong pattern matching to catch a potential use of the · 61708d37
  Dan Gohman authored May 08, 2012
```
old value after the store but before it is released.
This fixes rdar://11116986.

llvm-svn: 156442
```
  61708d37
May 08, 2012

Calling ReassociateExpression recursively is extremely dangerous since it will · 3bbb1d50

Duncan Sands authored May 08, 2012

replace the operands of expressions with only one use with undef and generate
a new expression for the original without using RAUW to update the original.
Thus any copies of the original expression held in a vector may end up
referring to some bogus value - and using a ValueHandle won't help since there
is no RAUW. There is already a mechanism for getting the effect of recursion
non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't
being used systematically. Have various places where recursion had snuck in at
some point use the RedoInsts mechanism instead. Fixes PR12169.

llvm-svn: 156379

3bbb1d50
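
The pattern described in this message, queueing work instead of recursing, can be sketched outside LLVM. The following is an illustrative Python model only: `process_with_redo_queue` and `sub_expressions` are hypothetical names, and the `redo` queue merely plays the role the message assigns to RedoInsts. Each step returns the items it wants revisited rather than calling itself on them, so no caller is left holding a reference into a half-rewritten expression.

```python
from collections import deque


def process_with_redo_queue(initial, step):
    """Process items without recursion.

    `step(item)` must not call itself; it returns the items that need
    processing next, and this loop picks them up from the queue.
    """
    redo = deque(initial)  # hypothetical stand-in for RedoInsts
    processed = []
    while redo:
        item = redo.popleft()
        processed.append(item)
        redo.extend(step(item))
    return processed


def sub_expressions(expr):
    """Example step: expressions are tuples (opcode, operand, ...).

    Returns the operands that are themselves expressions.
    """
    return [op for op in expr[1:] if isinstance(op, tuple)]
```

For example, `process_with_redo_queue([("add", ("mul", "a", "b"), "c")], sub_expressions)` visits the outer `add` first and the inner `mul` second, and the depth of nesting never grows the call stack.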

May 07, 2012

Teach reassociate to commute FMul's and FAdd's in order to canonicalize the... · f4f80e1f

Owen Anderson authored May 07, 2012

Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions.  This allows for greater CSE opportunities.

llvm-svn: 156323

f4f80e1f
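
Why a canonical operand order helps CSE can be shown with a small Python sketch. This is not the Reassociate implementation: the tuple encoding, `canonicalize`, and `count_unique` are hypothetical, and sorting operands by name is an assumption made for illustration (the message does not say how the order is chosen).

```python
# Opcodes whose operands may be swapped without changing the result.
COMMUTATIVE = {"fadd", "fmul"}


def canonicalize(inst):
    """Return (opcode, lhs, rhs) with commutative operands in sorted order."""
    opcode, lhs, rhs = inst
    if opcode in COMMUTATIVE and rhs < lhs:
        lhs, rhs = rhs, lhs
    return (opcode, lhs, rhs)


def count_unique(insts):
    """Number of distinct expressions left after canonicalization.

    Two instructions that canonicalize to the same tuple are candidates
    for common subexpression elimination.
    """
    return len({canonicalize(inst) for inst in insts})
```

With this model, `("fmul", "a", "b")` and `("fmul", "b", "a")` collapse to one expression, while `("fsub", "a", "b")` and `("fsub", "b", "a")` stay distinct because subtraction is not commutative.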

May 06, 2012

Switch the select to branch transformation on by default. · 3d38c17b

Benjamin Kramer authored May 06, 2012

The primitive conservative heuristic seems to give a slight overall
improvement while not regressing stuff. Make it available to wider
testing. If you notice any speed regressions (or significant code
size regressions) let me know!

llvm-svn: 156258

3d38c17b

May 05, 2012

CodeGenPrepare: Add a transform to turn selects into branches in some cases. · 047d7ca0

Benjamin Kramer authored May 05, 2012

This came up when a change in block placement formed a cmov and slowed down a
hot loop by 50%:

	ucomisd	(%rdi), %xmm0
	cmovbel	%edx, %esi

cmov is a really bad choice in this context because it doesn't get branch
prediction. If we emit it as a branch, an out-of-order CPU can do a better job
(if the branch is predicted right) and avoid waiting for the slow load+compare
instruction to finish. Of course it won't help if the branch is unpredictable,
but those are really rare in practice.

This patch uses a dumb conservative heuristic: it turns all cmovs that have one
use and a direct memory operand into branches. cmovs usually save some code
size, so we disable the transform in -Os mode. In-order architectures are
unlikely to benefit either; those are covered by the
"predictableSelectIsExpensive" flag.

It would be better to reuse branch probability info here, but BPI doesn't
support select instructions currently. It would make sense to use the same
heuristics as the if-converter pass, which does the opposite direction of this
transform.


The test suite shows a small improvement here and there on corei7-level
machines, but the actual results depend a lot on the microarchitecture used.
The transformation is currently disabled by default and is available by passing
the -enable-cgp-select2branch flag to the code generator.

Thanks to Chandler for the initial test case, and to him and Evan Cheng for
providing me with comments and test-suite numbers that were more stable than mine :)

llvm-svn: 156234

047d7ca0
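
The heuristic in this message can be written down as a single predicate. The sketch below is an illustrative Python model, not the CodeGenPrepare code: `SelectInfo`, its fields, and `should_form_branch` are hypothetical names. Only the conditions themselves come from the message (one use, a direct memory operand, disabled for -Os, gated on "predictableSelectIsExpensive", and off unless -enable-cgp-select2branch is passed).

```python
from dataclasses import dataclass


@dataclass
class SelectInfo:
    """Hypothetical summary of one select/cmov candidate."""
    has_one_use: bool
    has_direct_memory_operand: bool


def should_form_branch(sel, *, enabled, optimize_for_size,
                       predictable_select_is_expensive):
    """Decide whether to turn the select into a branch.

    enabled: models passing -enable-cgp-select2branch.
    optimize_for_size: models -Os, where cmov's smaller code size wins.
    predictable_select_is_expensive: models the target flag of that name.
    """
    if not enabled or optimize_for_size:
        return False
    if not predictable_select_is_expensive:
        return False
    return sel.has_one_use and sel.has_direct_memory_operand
```

A candidate with one use and a memory operand is converted only when the transform is enabled, the function is not built for size, and the target reports that predictable selects are expensive.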

May 04, 2012

- Add 'landingpad' instructions to the list of instructions to ignore. · fa0ebcd1
  Bill Wendling authored May 04, 2012
```
Also combine the code in the 'assert' statement.

llvm-svn: 156155
```
  fa0ebcd1

A pile of long over-due refactorings here. There are some very, *very* · da7513a8

Chandler Carruth authored May 04, 2012

minor behavior changes with this, but nothing I have seen evidence of in
the wild or expect to be meaningful. The real goal is unifying our logic
and simplifying the interfaces. A summary of the changes follows:

- Make 'callIsSmall' actually accept a callsite so it can handle
  intrinsics, and simplify callers appropriately.
- Nuke a completely bogus declaration of 'callIsSmall' that was still
  lurking in InlineCost.h... No idea how this got missed.
- Teach 'isInstructionFree' about the various more intelligent
  'free' heuristics that got added to the inline cost analysis during
  review and testing. This mostly surrounds int->ptr and ptr->int casts.
- Switch most of the interesting parts of the inline cost analysis that
  were essentially computing 'is this instruction free?' to use the code
  metrics routine instead. This way we won't keep duplicating logic.

All of this is motivated by the desire to allow other passes to compute
a roughly equivalent 'cost' metric for a particular basic block as the
inline cost analysis. Sadly, re-using the same analysis for both is
really messy because only the actual inline cost analysis is ever going
to go to the contortions required for simplification, SROA analysis,
etc.

llvm-svn: 156140

da7513a8
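
The idea of one shared "is this instruction free?" query feeding a per-block cost can be sketched as follows. This is a simplified Python model, not LLVM's code metrics: the opcode strings and `block_cost` are hypothetical, and treating every int->ptr and ptr->int cast as free is an assumption (the message only says the heuristics mostly concern those casts).

```python
# Assumption for illustration: these casts never cost anything.
FREE_OPCODES = {"inttoptr", "ptrtoint"}


def is_instruction_free(opcode):
    """Single place that decides whether an instruction is free."""
    return opcode in FREE_OPCODES


def block_cost(opcodes):
    """Rough cost of a basic block: the number of non-free instructions."""
    return sum(1 for opcode in opcodes if not is_instruction_free(opcode))
```

Because every client calls the same `is_instruction_free`, a new "free" rule added there changes the inline cost and any other block-cost consumer together, which is the duplication the message wants to avoid.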

May 03, 2012
- Whitespace cleanup. · c94d86c4
  Bill Wendling authored May 02, 2012
```
llvm-svn: 156034
```
  c94d86c4
May 02, 2012
- The value held in the vector may be RAUW'ed by some of the canonicalization · 274ba89d
  Bill Wendling authored May 02, 2012
```
methods. Use a weak value handle to keep up with this.
PR12245

llvm-svn: 155984
```
  274ba89d
May 01, 2012
- An instruction in a loop is not guaranteed to be executed just because the loop · 78ee67e8
  Nick Lewycky authored May 01, 2012
```
has no exit blocks. Fixes PR12706!

llvm-svn: 155884
```
  78ee67e8
Apr 30, 2012

Second attempt at PR12573: · bf4b9afb

Bill Wendling authored Apr 30, 2012

Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If
the pass is *sure* that it thinks it knows what it's doing, then it may go ahead
and specify that the landing pad can have its critical edge split. The loop
unswitch pass is one of these passes. It will split the critical edges of all
edges coming from a loop to a landing pad not within the loop. Doing so will
retain important loop analysis information, such as loop simplify.

llvm-svn: 155817

bf4b9afb
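
For readers unfamiliar with the term, a critical edge runs from a block with more than one successor to a block with more than one predecessor, and splitting it means placing a new block on that edge. The Python sketch below models this on a plain successor map. It is not the SplitCriticalEdge implementation: the function names and the `.split` naming scheme are hypothetical, and the extra constraints that landing pads impose are ignored.

```python
def critical_edges(succs):
    """Return the (src, dst) edges that are critical.

    succs maps each block name to the list of its successor names.
    """
    preds = {}
    for src, dsts in succs.items():
        for dst in dsts:
            preds.setdefault(dst, []).append(src)
    return [(src, dst)
            for src, dsts in succs.items() if len(dsts) > 1
            for dst in dsts if len(preds[dst]) > 1]


def split_edge(succs, src, dst):
    """Return a new successor map with a fresh block placed on src -> dst."""
    new_block = f"{src}.{dst}.split"  # hypothetical naming scheme
    result = {block: list(dsts) for block, dsts in succs.items()}
    result[src] = [new_block if d == dst else d for d in result[src]]
    result[new_block] = [dst]
    return result
```

In a CFG where `loop` branches to itself and to `lpad`, and `lpad` is also reached from another block, the edge `loop -> lpad` is critical; after `split_edge(cfg, "loop", "lpad")` it no longer appears in `critical_edges`.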

- Remove hack from r154987. The problem persists even with it, so it's not even a good hack. · 712d85a8
  Bill Wendling authored Apr 30, 2012
```
llvm-svn: 155813
```
  712d85a8
- Make sure HoistInsertPosition finds a position that is dominated by all · dd489314
  Rafael Espindola authored Apr 30, 2012
```
inputs.

llvm-svn: 155809
```
  dd489314

Apr 27, 2012
- Change recurse depth limit to uint32 to fix warning. · 84e4b399
  David Blaikie authored Apr 27, 2012
```
llvm-svn: 155727
```
  84e4b399
- Miscellaneous accumulated cleanups. · dae3349a
  Dan Gohman authored Apr 27, 2012
```
llvm-svn: 155725
```
  dae3349a
- Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks. · 6120cfb8
  Mon P Wang authored Apr 27, 2012
```
The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow
issues. <rdar://problem/11286839>.

llvm-svn: 155722
```
  6120cfb8
- Break up getProfitableChainIncrement(). · c90abc89
  Jakob Stoklund Olesen authored Apr 26, 2012
```
The required checks are moved to ChainInstruction() itself and the
policy decisions are moved to IVChain::isProfitableInc().

Also cache the ExprBase in IVChain to avoid frequent recomputations.

No functional change intended.

llvm-svn: 155676
```
  c90abc89
- Turn IVChain into a struct. · a0337d7b
  Jakob Stoklund Olesen authored Apr 26, 2012
```
No functional change intended.

llvm-svn: 155675
```
  a0337d7b
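
The early bailout added to IsValueFullyAvailableInBlock (6120cfb8 above) follows a common pattern: carry a depth counter through the recursion and answer conservatively once it passes a limit. The Python sketch below is a simplified model, not the GVN code: `is_fully_available` and its parameters are hypothetical, the CFG is assumed to be acyclic, and the limit is passed in by the caller because Python's own default recursion limit is close to the 1000 quoted in the message.

```python
def is_fully_available(block, preds, available, max_depth, depth=0):
    """Is the value available on every path into `block`?

    preds: dict mapping a block to the list of its predecessors.
    available: set of blocks where the value is known to be available.
    Returns False (the conservative answer) once depth exceeds max_depth.
    """
    if depth > max_depth:
        return False  # bail out instead of recursing further
    if block in available:
        return True
    block_preds = preds.get(block, [])
    if not block_preds:
        return False  # reached the entry without finding the value
    for pred in block_preds:
        if not is_fully_available(pred, preds, available, max_depth, depth + 1):
            return False
    return True
```

On a straight chain of 100 blocks where only the first has the value, a query on the last block succeeds with `max_depth=200` and returns False with `max_depth=50`: the result becomes less precise, but the stack depth stays bounded.
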
Apr 26, 2012

Teach the reassociate pass to fold chains of multiplies with repeated · 739ef80f

Chandler Carruth authored Apr 26, 2012

elements to minimize the number of multiplies required to compute the
final result. This uses a heuristic to attempt to form near-optimal
binary exponentiation-style multiply chains. While there are some cases
it misses, it seems to do at least a decent job on a very diverse range of
inputs.

Initial benchmarks show no interesting regressions, and an 8%
improvement on SPASS. Let me know if any other interesting results (in
either direction) crop up!

Credit to Richard Smith for the core algorithm, and helping code the
patch itself.

llvm-svn: 155616

739ef80f
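
As background for "binary exponentiation-style multiply chains": computing x^n by repeated squaring needs far fewer multiplies than the n - 1 of the naive product. The Python sketch below shows the textbook square-and-multiply method for a single repeated factor and counts the multiplies it performs. It is not the Reassociate heuristic, which handles products with several repeated elements; `power_by_squaring` is a hypothetical name.

```python
def power_by_squaring(x, n):
    """Compute x**n for n >= 1 and count the multiplies used.

    Returns (result, multiplies).
    """
    assert n >= 1
    result = None      # product of the powers selected so far
    square = x         # x**(2**k) for the current bit k
    multiplies = 0
    while n:
        if n & 1:
            if result is None:
                result = square          # first factor: no multiply needed
            else:
                result = result * square
                multiplies += 1
        n >>= 1
        if n:
            square = square * square     # prepare the next power of two
            multiplies += 1
    return result, multiplies
```

For x^8 this uses 3 multiplies instead of 7. For x^15 it uses 6, although a chain of 5 exists (x^2, x^3, x^6, x^12, x^15), which illustrates why such schemes are only near-optimal.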

Apr 25, 2012
- Print IV chain numbers while collecting them. · 293673d7
  Jakob Stoklund Olesen authored Apr 25, 2012
```
llvm-svn: 155567
```
  293673d7
- Simplify the known retain count tracking; use a boolean state instead · 62079b43
  Dan Gohman authored Apr 25, 2012
```
of a precise count. Also, move RRInfo's Partial field into PtrState,
now that it won't increase the size.

llvm-svn: 155513
```
  62079b43
- Build custom predecessor and successor lists for each basic block. · c24c66f2
  Dan Gohman authored Apr 24, 2012
```
These lists exclude invoke unwind edges and loop backedges which
are being ignored. This makes it easier to ignore them
consistently.

llvm-svn: 155500
```
  c24c66f2
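
The custom predecessor and successor lists from c24c66f2 can be pictured as a filtered copy of the CFG. The Python sketch below is a model under stated assumptions, not the pass's code: `filtered_cfg` and the `"normal"`/`"unwind"` edge labels are hypothetical, and a backedge is taken to be an edge to a block that is still on the depth-first search stack.

```python
def filtered_cfg(entry, edges):
    """Build successor and predecessor lists that skip ignored edges.

    edges: dict mapping every block to a list of (successor, kind) pairs,
           where kind is "normal" or "unwind".
    Returns (succs, preds) without unwind edges and without DFS backedges.
    """
    succs = {block: [] for block in edges}
    preds = {block: [] for block in edges}
    on_stack, done = set(), set()

    def visit(block):
        on_stack.add(block)
        for succ, kind in edges[block]:
            if kind == "unwind":
                continue  # invoke unwind edge: ignored
            if succ in on_stack:
                continue  # loop backedge: ignored
            succs[block].append(succ)
            preds[succ].append(block)
            if succ not in done:
                visit(succ)
        on_stack.discard(block)
        done.add(block)

    visit(entry)
    return succs, preds
```

Code that walks `succs` and `preds` afterwards never sees the ignored edges, so it cannot forget to skip them in one place and remember in another.
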
Apr 20, 2012
- Put this expensive check below the less expensive ones. · 9f975952
  Bill Wendling authored Apr 19, 2012
```
llvm-svn: 155166
```
  9f975952
Apr 19, 2012
- Avoid a bug in the path count computation, preventing an infinite · 26aa8274
  Dan Gohman authored Apr 19, 2012
```
loop repeatedly making the same change. This is for rdar://11256239.

llvm-svn: 155160
```
  26aa8274
- Don't crash on code where the user put __attribute__((constructor)) on · 22fbe8d7
  Dan Gohman authored Apr 18, 2012
```
a function with arguments. This fixes rdar://11265785.

llvm-svn: 155073
```
  22fbe8d7
Apr 18, 2012

Use a heavy hammer to fix PR12573. · 4d4d0257

Bill Wendling authored Apr 18, 2012

If the loop contains invoke instructions, whose unwind edge escapes the loop,
then don't try to unswitch the loop. Doing so may cause the unwind edge to be
split, which not only is non-trivial but doesn't preserve loop simplify
information.

Fixes PR12573

llvm-svn: 154987

4d4d0257

loop-reduce: Add an early bailout to catch extremely large loops. · 19f80c1e

Andrew Trick authored Apr 18, 2012

This introduces a threshold of 200 IV Users, which is very
conservative but should be sufficient to avoid serious compile time
sink or stack overflow. The llvm test-suite with LTO never exceeds 190
users per loop.

The bug doesn't relate to a specific type of loop. Checking in an
arbitrary giant loop as a unit test would be silly.

Fixes rdar://11262507.

llvm-svn: 154983

19f80c1e

fix pr12559: mark unavailable win32 math libcalls · a81bcbb9

Joe Groff authored Apr 17, 2012

also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint

llvm-svn: 154960

a81bcbb9

Apr 13, 2012
- Add some comments, and fix a few places that missed setting Changed. · 670f9374
  Dan Gohman authored Apr 13, 2012
```
llvm-svn: 154687
```
  670f9374
- Consider ObjC runtime calls objc_storeWeak and others which make a copy of · e1e352af
  Dan Gohman authored Apr 13, 2012
```
their argument as "escape" points for objc_retainBlock optimization.
This fixes rdar://11229925.

llvm-svn: 154682
```
  e1e352af
- Use the new Use-aware dominates method to apply the objc runtime · de8d2c44
  Dan Gohman authored Apr 13, 2012
```
library return value optimization for phi uses. Even when the
phi itself is not dominated, the specific use may be dominated.

llvm-svn: 154647
```
  de8d2c44
- Don't move objc_autorelease calls past autorelease pool boundaries when · 8478d76d
  Dan Gohman authored Apr 13, 2012
```
optimizing autorelease calls on phi nodes with null operands.
This fixes rdar://11207070.

llvm-svn: 154642
```
  8478d76d
Apr 11, 2012
- Typo. · cc899f3b
  Chad Rosier authored Apr 11, 2012
```
llvm-svn: 154522
```
  cc899f3b
Apr 10, 2012

Fix 12513: Loop unrolling breaks with indirect branches. · 4442bfe5

Andrew Trick authored Apr 10, 2012

Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.

llvm-svn: 154386

4442bfe5