- May 14, 2012
-
-
Chad Rosier authored
so that it can be reused in MemCpyOptimizer. This analysis is needed to remove an unnecessary memcpy when returning a struct into a local variable. rdar://11341081 PR12686 llvm-svn: 156776
-
- May 10, 2012
-
-
Dan Gohman authored
llvm-svn: 156558
-
Nuno Lopes authored
llvm-svn: 156553
-
Dan Gohman authored
end of a basic block if there's no store. llvm-svn: 156520
-
- May 09, 2012
-
-
Craig Topper authored
llvm-svn: 156466
-
Dan Gohman authored
llvm-svn: 156445
-
Dan Gohman authored
old value after the store but before it is released. This fixes rdar://11116986. llvm-svn: 156442
-
- May 08, 2012
-
-
Duncan Sands authored
replace the operands of expressions with only one use with undef and generate a new expression for the original without using RAUW to update the original. Thus any copies of the original expression held in a vector may end up referring to some bogus value - and using a ValueHandle won't help since there is no RAUW. There is already a mechanism for getting the effect of recursion non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't being used systematically. Have various places where recursion had snuck in at some point use the RedoInsts mechanism instead. Fixes PR12169. llvm-svn: 156379
-
- May 07, 2012
-
-
Owen Anderson authored
Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. This allows for greater CSE opportunities. llvm-svn: 156323
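As a rough illustration of why a canonical operand order exposes more CSE, here is a minimal sketch on a hypothetical mini-IR. The names 'BinOp', 'canonicalize' and 'CSETable' are invented for this example and are not LLVM APIs, and ordering operands by name is an assumption made for brevity, not the criterion the reassociate pass actually uses.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical mini-IR: one commutative binary operation on two named values.
struct BinOp {
  char opcode;           // '*' stands in for FMul, '+' for FAdd (illustrative)
  std::string lhs, rhs;  // operand names
};

// Put the operands of a commutative operation into one fixed order.
// Ordering by name is an assumption for this sketch only.
BinOp canonicalize(BinOp op) {
  assert((op.opcode == '*' || op.opcode == '+') && "commutative ops only");
  if (op.rhs < op.lhs)
    std::swap(op.lhs, op.rhs);
  return op;
}

// Toy CSE table: returns the id of an identical expression seen earlier,
// or registers the expression under a fresh id.
class CSETable {
  std::map<std::tuple<char, std::string, std::string>, int> seen;

public:
  int lookupOrInsert(const BinOp &op) {
    auto key = std::make_tuple(op.opcode, op.lhs, op.rhs);
    auto it = seen.find(key);
    if (it != seen.end())
      return it->second;
    int id = static_cast<int>(seen.size());
    seen.emplace(key, id);
    return id;
  }
};
```

Without canonicalization, x*y and y*x get different keys and are treated as distinct expressions; after canonicalization both map to the same table entry, so the second one can be eliminated.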
-
- May 06, 2012
-
-
Benjamin Kramer authored
The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! llvm-svn: 156258
-
- May 05, 2012
-
-
Benjamin Kramer authored
This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0; cmovbel %edx, %esi. A cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic: it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-order architectures are unlikely to benefit either; those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't currently support select instructions. It would make sense to use the same heuristics as the if-converter pass, which performs the opposite transform. The test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the microarchitecture used. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) llvm-svn: 156234
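The heuristic described above can be read as a simple predicate. The sketch below is only an interpretation of the commit message, not the code that was committed: 'SelectCandidate', its fields, and 'shouldConvertSelectToBranch' are hypothetical names, and the way the conditions are combined is an assumption.

```cpp
// Hypothetical summary of one select (cmov candidate). These fields are
// illustrative stand-ins, not LLVM data structures.
struct SelectCandidate {
  bool hasOneUse;               // the "one use" condition from the message
  bool hasDirectMemoryOperand;  // the compare reads straight from memory
};

// Decide whether to emit a branch instead of a cmov.
// Assumed reading of the message: the target must opt in through its
// "predictableSelectIsExpensive" flag, the transform is off when optimizing
// for size (-Os), and the candidate must satisfy both structural conditions.
bool shouldConvertSelectToBranch(const SelectCandidate &sel,
                                 bool predictableSelectIsExpensive,
                                 bool optimizeForSize) {
  if (!predictableSelectIsExpensive)
    return false;  // e.g. in-order cores, which are unlikely to benefit
  if (optimizeForSize)
    return false;  // cmov usually saves code size
  return sel.hasOneUse && sel.hasDirectMemoryOperand;
}
```

A heuristic driven by branch probability would replace the two structural checks, but as the message notes, BPI did not support select instructions at the time.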
-
- May 04, 2012
-
-
Bill Wendling authored
Also combine the code in the 'assert' statement. llvm-svn: 156155
-
Chandler Carruth authored
minor behavior changes with this, but nothing I have seen evidence of in the wild or expect to be meaningful. The real goal is unifying our logic and simplifying the interfaces. A summary of the changes follows: - Make 'callIsSmall' actually accept a callsite so it can handle intrinsics, and simplify callers appropriately. - Nuke a completely bogus declaration of 'callIsSmall' that was still lurking in InlineCost.h... No idea how this got missed. - Teach 'isInstructionFree' about the various more intelligent 'free' heuristics that got added to the inline cost analysis during review and testing. This mostly surrounds int->ptr and ptr->int casts. - Switch most of the interesting parts of the inline cost analysis that were essentially computing 'is this instruction free?' to use the code metrics routine instead. This way we won't keep duplicating logic. All of this is motivated by the desire to allow other passes to compute, for a particular basic block, a 'cost' metric roughly equivalent to the one the inline cost analysis computes. Sadly, re-using the same analysis for both is really messy because only the actual inline cost analysis is ever going to go to the contortions required for simplification, SROA analysis, etc. llvm-svn: 156140
-
- May 03, 2012
-
-
Bill Wendling authored
llvm-svn: 156034
-
- May 02, 2012
-
-
Bill Wendling authored
methods. Use a weak value handle to keep up with this. PR12245 llvm-svn: 155984
-
- May 01, 2012
-
-
Nick Lewycky authored
has no exit blocks. Fixes PR12706! llvm-svn: 155884
-
- Apr 30, 2012
-
-
Bill Wendling authored
Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If the pass is *sure* that it thinks it knows what it's doing, then it may go ahead and specify that the landing pad can have its critical edge split. The loop unswitch pass is one of these passes. It will split the critical edges of all edges coming from a loop to a landing pad not within the loop. Doing so will retain important loop analysis information, such as loop simplify. llvm-svn: 155817
-
Bill Wendling authored
llvm-svn: 155813
-
Rafael Espindola authored
inputs. llvm-svn: 155809
-
- Apr 27, 2012
-
-
David Blaikie authored
llvm-svn: 155727
-
Dan Gohman authored
llvm-svn: 155725
-
Mon P Wang authored
The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow issues. <rdar://problem/11286839>. llvm-svn: 155722
-
Jakob Stoklund Olesen authored
The required checks are moved to ChainInstruction() itself and the policy decisions are moved to IVChain::isProfitableInc(). Also cache the ExprBase in IVChain to avoid frequent recomputations. No functional change intended. llvm-svn: 155676
-
Jakob Stoklund Olesen authored
No functional change intended. llvm-svn: 155675
-
- Apr 26, 2012
-
-
Chandler Carruth authored
elements to minimize the number of multiplies required to compute the final result. This uses a heuristic to attempt to form near-optimal binary exponentiation-style multiply chains. While there are some cases it misses, it seems to do at least a decent job on a very diverse range of inputs. Initial benchmarks show no interesting regressions, and an 8% improvement on SPASS. Let me know if any other interesting results (in either direction) crop up! Credit to Richard Smith for the core algorithm, and helping code the patch itself. llvm-svn: 155616
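For readers unfamiliar with binary exponentiation-style multiply chains, the following is a minimal sketch of plain square-and-multiply that also counts the multiplies it performs. It illustrates the general technique only and is not the heuristic from this commit; 'PowResult' and 'powBySquaring' are names invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Value of base^exp together with the number of multiplies used to form it.
struct PowResult {
  std::uint64_t value;
  unsigned multiplies;
};

// Left-to-right square-and-multiply for exp >= 1.
// Each exponent bit below the leading one costs one squaring, plus one extra
// multiply by the base when that bit is set. Overflow wraps modulo 2^64.
PowResult powBySquaring(std::uint64_t base, unsigned exp) {
  assert(exp >= 1 && "exponent must be positive");

  // Find the index of the highest set bit of exp.
  unsigned top = 0;
  for (unsigned e = exp; e > 1; e >>= 1)
    ++top;

  PowResult r{base, 0};
  for (unsigned bit = top; bit-- > 0;) {
    r.value *= r.value;  // square
    ++r.multiplies;
    if ((exp >> bit) & 1u) {
      r.value *= base;  // fold in one more factor of the base
      ++r.multiplies;
    }
  }
  return r;
}
```

Computing x^15 this way takes 6 multiplies instead of the 14 needed by repeated multiplication. The shortest possible chain for 15 uses 5, which is the kind of gap the "near-optimal" wording refers to.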
-
- Apr 25, 2012
-
-
Jakob Stoklund Olesen authored
llvm-svn: 155567
-
Dan Gohman authored
of a precise count. Also, move RRInfo's Partial field into PtrState, now that it won't increase the size. llvm-svn: 155513
-
Dan Gohman authored
These lists exclude invoke unwind edges and loop backedges which are being ignored. This makes it easier to ignore them consistently. llvm-svn: 155500
-
- Apr 20, 2012
-
-
Bill Wendling authored
llvm-svn: 155166
-
- Apr 19, 2012
-
-
Dan Gohman authored
loop repeatedly making the same change. This is for rdar://11256239. llvm-svn: 155160
-
Dan Gohman authored
a function with arguments. This fixes rdar://11265785. llvm-svn: 155073
-
- Apr 18, 2012
-
-
Bill Wendling authored
If the loop contains invoke instructions whose unwind edge escapes the loop, then don't try to unswitch the loop. Doing so may cause the unwind edge to be split, which not only is non-trivial but also doesn't preserve loop simplify information. Fixes PR12573. llvm-svn: 154987
-
Andrew Trick authored
This introduces a threshold of 200 IV Users, which is very conservative but should be sufficient to avoid serious compile time sink or stack overflow. The llvm test-suite with LTO never exceeds 190 users per loop. The bug doesn't relate to a specific type of loop. Checking in an arbitrary giant loop as a unit test would be silly. Fixes rdar://11262507. llvm-svn: 154983
-
Joe Groff authored
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint. llvm-svn: 154960
-
- Apr 13, 2012
-
-
Dan Gohman authored
llvm-svn: 154687
-
Dan Gohman authored
their argument as "escape" points for objc_retainBlock optimization. This fixes rdar://11229925. llvm-svn: 154682
-
Dan Gohman authored
library return value optimization for phi uses. Even when the phi itself is not dominated, the specific use may be dominated. llvm-svn: 154647
-
Dan Gohman authored
optimizing autorelease calls on phi nodes with null operands. This fixes rdar://11207070. llvm-svn: 154642
-
- Apr 11, 2012
-
-
Chad Rosier authored
llvm-svn: 154522
-
- Apr 10, 2012
-
-
Andrew Trick authored
Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying. llvm-svn: 154386
-