- May 12, 2012
-
-
Jay Foad authored
the address of a function. llvm-svn: 156703
-
- May 11, 2012
-
-
Nuno Lopes authored
llvm-svn: 156625
-
Eli Friedman authored
llvm-svn: 156600
-
Nuno Lopes authored
add an additional parameter to InstCombiner::EmitGEPOffset() to force it to *not* emit operations with NUW flag llvm-svn: 156585
-
- May 10, 2012
-
-
Dan Gohman authored
llvm-svn: 156558
-
Nuno Lopes authored
llvm-svn: 156553
-
Dan Gohman authored
end of a basic block if there's no store. llvm-svn: 156520
-
- May 09, 2012
-
-
Nuno Lopes authored
refactor code a bit to enable future changes to support run-time information add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs) llvm-svn: 156515
-
Craig Topper authored
llvm-svn: 156466
-
Dan Gohman authored
llvm-svn: 156445
-
Dan Gohman authored
old value after the store but before it is released. This fixes rdar:/11116986. llvm-svn: 156442
-
- May 08, 2012
-
-
Duncan Sands authored
replace the operands of expressions with only one use with undef and generate a new expression for the original without using RAUW to update the original. Thus any copies of the original expression held in a vector may end up referring to some bogus value - and using a ValueHandle won't help since there is no RAUW. There is already a mechanism for getting the effect of recursion non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't being used systematically. Have various places where recursion had snuck in at some point use the RedoInsts mechanism instead. Fixes PR12169. llvm-svn: 156379
-
Andrew Trick authored
llvm-svn: 156358
-
- May 07, 2012
-
-
Owen Anderson authored
Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. This allows for greater CSE opportunities. llvm-svn: 156323
-
- May 06, 2012
-
-
Benjamin Kramer authored
The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! llvm-svn: 156258
-
Jakub Staszak authored
llvm-svn: 156257
-
- May 05, 2012
-
-
Benjamin Kramer authored
This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0 cmovbel %edx, %esi cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic, it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-Order architectures are unlikely to benefit as well, those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) llvm-svn: 156234
-
Stepan Dyatkovskiy authored
Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw". Also added fix to 2011-06-13-nsw-alloca.ll test. llvm-svn: 156231
-
- May 04, 2012
-
-
Chandler Carruth authored
RegionInfo's RegionNode. This mirrors the logic for automating the extraction from a Loop. llvm-svn: 156208
-
Chandler Carruth authored
of the CodeExtractor utility. This allows speculatively computing input and output sets to measure the likely size impact of the code extraction. These sets cannot be reused sadly -- we mutate the function prior to forming the final sets used by the actual extraction. The interface has been revamped slightly to make it easier to use correctly by making the interface const and sinking the computation of the number of exit blocks into the full extraction function and away from the rest of this logic which just computed two output parameters. llvm-svn: 156168
-
Chandler Carruth authored
blocks, assert that this doesn't happen. We don't want to bother trying to support this call pattern as it isn't necessary. llvm-svn: 156167
-
Chandler Carruth authored
detect an in-eligible block rather than just breaking out of the loop. llvm-svn: 156166
-
Chandler Carruth authored
of the extractor itself. llvm-svn: 156164
-
Chandler Carruth authored
and expose it as a utility class rather than as free function wrappers. The simple free-function interface works well for the bugpoint-specific pass's uses of code extraction, but in an upcoming patch for more advanced code extraction, they simply don't expose a rich enough interface. I need to expose various stages of the process of doing the code extraction and query information to decide whether or not to actually complete the extraction or give up. Rather than build up a new predicate model and pass that into these functions, just take the class that was actually implementing the functions and lift it up into a proper interface that can be used to perform code extraction. The interface is cleaned up and re-documented to work better in a header. It also is now setup to accept the blocks to be extracted in the constructor rather than in a method. In passing this essentially reverts my previous commit here exposing a block-level query for eligibility of extraction. That is no longer necessary with the more rich interface as clients can query the extraction object for eligibility directly. This will reduce the number of walks of the input basic block sequence by quite a bit which is useful if this enters the normal optimization pipeline. llvm-svn: 156163
-
Bill Wendling authored
Also combine the code in the 'assert' statement. llvm-svn: 156155
-
Chandler Carruth authored
minor behavior changes with this, but nothing I have seen evidence of in the wild or expect to be meaningful. The real goal is unifying our logic and simplifying the interfaces. A summary of the changes follows: - Make 'callIsSmall' actually accept a callsite so it can handle intrinsics, and simplify callers appropriately. - Nuke a completely bogus declaration of 'callIsSmall' that was still lurking in InlineCost.h... No idea how this got missed. - Teach the 'isInstructionFree' about the various more intelligent 'free' heuristics that got added to the inline cost analysis during review and testing. This mostly surrounds int->ptr and ptr->int casts. - Switch most of the interesting parts of the inline cost analysis that were essentially computing 'is this instruction free?' to use the code metrics routine instead. This way we won't keep duplicating logic. All of this is motivated by the desire to allow other passes to compute a roughly equivalent 'cost' metric for a particular basic block as the inline cost analysis. Sadly, re-using the same analysis for both is really messy because only the actual inline cost analysis is ever going to go to the contortions required for simplification, SROA analysis, etc. llvm-svn: 156140
-
Chandler Carruth authored
extraction into a public interface. Also clean it up and apply it more consistently such that we check for landing pads *anywhere* in the extracted code, not just in single-block extraction. This will be used to guide decisions in passes that are planning to eventually perform a round of code extraction. llvm-svn: 156114
-
Nuno Lopes authored
fix a few typos found by Chad in my previous commit llvm-svn: 156110
-
- May 03, 2012
-
-
Nuno Lopes authored
llvm-svn: 156102
-
Nuno Lopes authored
replace 'break's with 'return 0' in visitCallInst code for objectsize, since there is no need to fallback to visitCallSite. This gives a 0.9% in a test case llvm-svn: 156069
-
Bill Wendling authored
llvm-svn: 156034
-
- May 02, 2012
-
-
Kostya Serebryany authored
llvm-svn: 155986
-
Bill Wendling authored
methods. Use a weak value handle to keep up with this. PR12245 llvm-svn: 155984
-
- May 01, 2012
-
-
Nick Lewycky authored
has no exit blocks. Fixes PR12706! llvm-svn: 155884
-
Lang Hames authored
<rdar://problem/11291436>. This is a second attempt at a fix for this, the first was r155468. Thanks to Chandler, Bob and others for the feedback that helped me improve this. llvm-svn: 155866
-
- Apr 30, 2012
-
-
Bill Wendling authored
Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If the pass is *sure* that it thinks it knows what it's doing, then it may go ahead and specify that the landing pad can have its critical edge split. The loop unswitch pass is one of these passes. It will split the critical edges of all edges coming from a loop to a landing pad not within the loop. Doing so will retain important loop analysis information, such as loop simplify. llvm-svn: 155817
-
Bill Wendling authored
llvm-svn: 155816
-
Bill Wendling authored
llvm-svn: 155813
-
Rafael Espindola authored
inputs. llvm-svn: 155809
-
- Apr 27, 2012
-
-
Hal Finkel authored
Target specific types should not be vectorized. As a practical matter, these types are already register matched (at least in the x86 case), and codegen does not always work correctly (at least in the ppc case, and this is not worth fixing because ppc_fp128 is currently broken and will probably go away soon). llvm-svn: 155729
-