- Jun 01, 2013
-
-
David Majnemer authored
llvm-svn: 183078
-
- May 23, 2013
-
-
Benjamin Kramer authored
llvm-svn: 182590
-
- Apr 29, 2013
-
-
Arnold Schwaighofer authored
This resurrects r179957, but adds code that makes sure we don't touch atomic/volatile stores: This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case where the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. llvm-svn: 180731
-
- Apr 21, 2013
-
-
Arnold Schwaighofer authored
There is the temptation to make this tranform dependent on target information as it is not going to be beneficial on all (sub)targets. Therefore, we should probably do this in MI Early-Ifconversion. This reverts commit r179957. Original commit message: "SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case were the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the builtbots and will revert this if anything unexpected comes up." llvm-svn: 179980
-
- Apr 20, 2013
-
-
Arnold Schwaighofer authored
This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case were the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the builtbots and will revert this if anything unexpected comes up. llvm-svn: 179957
-
- Apr 16, 2013
-
-
Hans Wennborg authored
If a switch instruction has a case for every possible value of its type, with the same successor, SimplifyCFG would replace it with an icmp ult, but the computation of the bound overflows in that case, which inverts the test. Patch by Jed Davis! llvm-svn: 179587
-
- Mar 11, 2013
-
-
Bill Wendling authored
An invoke may require a table entry. For instance, when the function it calls is expected to throw. <rdar://problem/13360379> llvm-svn: 176827
-
- Mar 07, 2013
-
-
Andrew Trick authored
Fixes rdar:13349374. Volatile loads and stores need to be preserved even if the language standard says they are undefined. "volatile" in this context means "get out of the way compiler, let my platform handle it". Additionally, this is the only way I know of with llvm to write to the first page (when hardware allows) without dropping to assembly. llvm-svn: 176599
-
- Jan 27, 2013
-
-
Chandler Carruth authored
out bug fixes, or functionality preserving refactorings. llvm-svn: 173610
-
- Jan 25, 2013
-
-
Chandler Carruth authored
loops over instructions in the basic block or the use-def list of the value, neither of which are really efficient when repeatedly querying about values in the same basic block. What's more, we already know that the CondBB is small, and so we can do a much more efficient test by counting the uses in CondBB, and seeing if those account for all of the uses. Finally, we shouldn't blanket fail on any such instruction, instead we should conservatively assume that those instructions are part of the cost. Note that this actually fixes a bug in the pass because isUsedInBasicBlock has a really terrible bug in it. I'll fix that in my next commit, but the fix for it would make this code suddenly take the compile time hit I thought it already was taking, so I wanted to go ahead and migrate this code to a faster & better pattern. The bug in isUsedInBasicBlock was also causing other tests to test the wrong thing entirely: for example we weren't actually disabling speculation for floating point operations as intended (and tested), but the test passed because we failed to speculate them due to the isUsedInBasicBlock failure. llvm-svn: 173417
-
- Jan 24, 2013
-
-
Benjamin Kramer authored
Original commit message: Plug TTI into the speculation logic, giving it a real cost interface that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173357
-
Chandler Carruth authored
of stage2 in a bootstrap. Still investigating.... llvm-svn: 173343
-
Chandler Carruth authored
that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173342
-
Chandler Carruth authored
unfolded constant expressions rather than checking each one independently. llvm-svn: 173341
-
Chandler Carruth authored
a cost fuction that seems both a bit ad-hoc and also poorly suited to evaluating constant expressions. Notably, it is missing any support for trivial expressions such as 'inttoptr'. I could fix this routine, but it isn't clear to me all of the constraints its other users are operating under. The core protection that seems relevant here is avoiding the formation of a select instruction wich a further chain of select operations in a constant expression operand. Just explicitly encode that constraint. Also, update the comments and organization here to make it clear where this needs to go -- this should be driven off of real cost measurements which take into account the number of constants expressions and the depth of the constant expression tree. llvm-svn: 173340
-
Chandler Carruth authored
terms of cost rather than hoisting a single instruction. This does *not* change the cost model! We still set the cost threshold at 1 here, it's just that we track it by accumulating cost rather than by storing an instruction. The primary advantage is that we no longer leave no-op intrinsics in the basic block. For example, this will now move both debug info intrinsics and a single instruction, instead of only moving the instruction and leaving a basic block with nothing bug debug info intrinsics in it, and those intrinsics now no longer ordered correctly with the hoisted value. Instead, we now splice the entire conditional basic block's instruction sequence. This also places the code for checking the safety of hoisting next to the code computing the cost. Currently, the only observable side-effect of this change is that debug info intrinsics are no longer abandoned. I'm not sure how to craft a test case for this, and my real goal was the refactoring, but I'll talk to Dave or Eric about how to add a test case for this. llvm-svn: 173339
-
Chandler Carruth authored
Previously, the code would scan the PHI nodes and build up a small setvector of candidate value pairs in phi nodes to go and rewrite. Once certain the rewrite could be performed, the code walks the set, and for each one re-scans the entire PHI node list looking for nodes to rewrite operands. Instead, scan the PHI nodes once to check for hazards, and then scan it a second time to rewrite the operands to selects. No set vector, and a max of two scans. The only downside is that we might form identical selects, but instcombine or anything else should fold those easily, and it seems unlikely to happen often. llvm-svn: 173337
-
Chandler Carruth authored
structure being analyzed. No functionality changed. llvm-svn: 173334
-
Chandler Carruth authored
tests. No need to pay the high cost when we're never going to do anything. No functionality changed. llvm-svn: 173331
-
Chandler Carruth authored
pretty in doxygen, adding some of the details actually present in a classic example where this matters (a loop from gzip and many other compression algorithms), and a cautionary note about the risks inherent in the transform. This has come up on the mailing lists recently, and I suspect folks reading this code could benefit from going and looking at the MI pass that can really deal with these issues. llvm-svn: 173329
-
- Jan 23, 2013
-
-
Duncan Sands authored
used uninitialized, since it fails to understand that Array is only used when SingleValue is not, and outputs a warning. It also seems generally safer given that the constructor is non-trivial and has plenty of early exits. llvm-svn: 173242
-
- Jan 07, 2013
-
-
Chandler Carruth authored
through as a reference rather than a pointer. There is always *some* implementation of this available, so this simplifies code by not having to test for whether it is available or not. Further, it turns out there were piles of places where SimplifyCFG was recursing and not passing down either TD or TTI. These are fixed to be more pedantically consistent even though I don't have any particular cases where it would matter. llvm-svn: 171691
-
Chandler Carruth authored
longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686
-
- Jan 05, 2013
-
-
Chandler Carruth authored
the ScalarTargetTransformInfo interface. llvm-svn: 171617
-
- Jan 02, 2013
-
-
Chandler Carruth authored
into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
-
- Dec 03, 2012
-
-
Chandler Carruth authored
Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
-
- Nov 30, 2012
-
-
Chandler Carruth authored
We're iterating over a non-deterministically ordered container looking for two saturating flags. To do this correctly, we have to saturate both, and only stop looping if both saturate to their final value. Otherwise, which flag we see first changes the result. This is also a micro-optimization of the previous version as now we don't go into the (possibly expensive) test logic once the first violation of either constraint is detected. llvm-svn: 168989
-
Chandler Carruth authored
functionality changed. Evan's commit r168970 moved the code that the primary comment in this function referred to to the other end of the function without moving the comment, and there has been a steady creep of "boolean" logic in it that is simpler if handled via early exit. That way each special case can have its own comments. I've also made the variable name a bit more explanatory than "AllFit". This is in preparation to fix the non-deterministic output of this function. llvm-svn: 168988
-
Evan Cheng authored
the tables cannot fit in registers (i.e. bitmap), do not emit the table if it's using an illegal type. rdar://12779436 llvm-svn: 168970
-
- Nov 16, 2012
-
-
Hans Wennborg authored
Patch by Pekka Jääskeläinen! llvm-svn: 168176
-
- Nov 15, 2012
-
-
Andrew Trick authored
llvm-svn: 168058
-
Andrew Trick authored
llvm-svn: 168057
-
- Nov 07, 2012
-
-
Hans Wennborg authored
is available. llvm-svn: 167552
-
- Nov 01, 2012
-
-
Chandler Carruth authored
getIntPtrType support for multiple address spaces via a pointer type, and also introduced a crasher bug in the constant folder reported in PR14233. These commits also contained several problems that should really be addressed before they are re-committed. I have avoided reverting various cleanups to the DataLayout APIs that are reasonable to have moving forward in order to reduce the amount of churn, and minimize the number of commits that were reverted. I've also manually updated merge conflicts and manually arranged for the getIntPtrType function to stay in DataLayout and to be defined in a plausible way after this revert. Thanks to Duncan for working through this exact strategy with me, and Nick Lewycky for tracking down the really annoying crasher this triggered. (Test case to follow in its own commit.) After discussing with Duncan extensively, and based on a note from Micah, I'm going to continue to back out some more of the more problematic patches in this series in order to ensure we go into the LLVM 3.2 branch with a reasonable story here. I'll send a note to llvmdev explaining what's going on and why. Summary of reverted revisions: r166634: Fix a compiler warning with an unused variable. r166607: Add some cleanup to the DataLayout changes requested by Chandler. r166596: Revert "Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this! r166591: Delete a directory that wasn't supposed to be checked in yet. r166578: Add in support for getIntPtrType to get the pointer type based on the address space. llvm-svn: 167221
-
- Oct 31, 2012
-
-
Hans Wennborg authored
SimplifyCFG will have removed those cases for us. llvm-svn: 167132
-
Hans Wennborg authored
llvm-svn: 167130
-
Hans Wennborg authored
- Use 0 instead of NULL - Helper function for "dyn_cast, else lookup in the constant pool". llvm-svn: 167121
-
Hans Wennborg authored
llvm-svn: 167117
-
Hans Wennborg authored
By propagating the value for the switch condition, LLVM can now build lookup tables for code such as: switch (x) { case 1: return 5; case 2: return 42; case 3: case 4: case 5: return x - 123; default: return 123; } Given that x is known for each case, "x - 123" becomes a constant for cases 3, 4, and 5. llvm-svn: 167115
-
- Oct 30, 2012
-
-
Hans Wennborg authored
When the switch-to-lookup tables transform landed in SimplifyCFG, it was pointed out that this could be inappropriate for some targets. Since there was no way at the time for the pass to know anything about the target, an awkward reverse-transform was added in CodeGenPrepare that turned lookup tables back into switches for some targets. This patch uses the new TargetTransformInfo to determine if a switch should be transformed, and removes CodeGenPrepare::ConvertLoadToSwitch. llvm-svn: 167011
-