- Apr 17, 2013
-
-
Peter Collingbourne authored
Differential Revision: http://llvm-reviews.chandlerc.com/D620 llvm-svn: 179661
-
- Apr 16, 2013
-
-
Hans Wennborg authored
If a switch instruction has a case for every possible value of its type, with the same successor, SimplifyCFG would replace it with an icmp ult, but the computation of the bound overflows in that case, which inverts the test. Patch by Jed Davis! llvm-svn: 179587
-
- Apr 10, 2013
-
-
Joey Gouly authored
rather than checking if the source and destination have the same number of arguments and copying the attributes over directly. llvm-svn: 179169
-
- Mar 22, 2013
-
-
Bill Wendling authored
llvm-svn: 177757
-
Bill Wendling authored
llvm-svn: 177749
-
Evgeniy Stepanov authored
llvm-svn: 177713
-
Bill Wendling authored
How did this ever work? Basically, if you have a function that's inlined into the caller, it may not have any 'call' instructions, but any 'resume' instructions it may have should still be forwarded to the outer (caller's) landing pad. This requires that all of the 'landingpad' instructions in the callee have their clauses merged with the caller's outer 'landingpad' instruction (hence the bit of ugly code in the `forwardResume' method). Testcase in a follow commit to the test-suite repository. <rdar://problem/13360379> & PR15555 llvm-svn: 177680
-
- Mar 12, 2013
-
-
Meador Inge authored
Nadav reported a performance regression due to the work I did to merge the library call simplifier into instcombine [1]. The issue is that a new LibCallSimplifier object is being created whenever InstCombiner::runOnFunction is called. Every time a LibCallSimplifier object is used to optimize a call it creates a hash table to map from a function name to an object that optimizes functions of that name. For short-lived LibCallSimplifier instances this is quite inefficient. Especially for cases where no calls are actually simplified. This patch fixes the issue by dropping the hash table and implementing an explicit lookup function to correlate the function name to the object that optimizes functions of that name. This avoids the cost of always building and destroying the hash table in cases where the LibCallSimplifier object is short-lived and avoids the cost of building the table when no simplifications are actually preformed. On a benchmark containing 100,000 calls where none of them are simplified I noticed a 30% speedup. On a benchmark containing 100,000 calls where all of them are simplified I noticed an 8% speedup. [1] http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130304/167639.html llvm-svn: 176840
-
- Mar 11, 2013
-
-
Bill Wendling authored
An invoke may require a table entry. For instance, when the function it calls is expected to throw. <rdar://problem/13360379> llvm-svn: 176827
-
- Mar 07, 2013
-
-
Pekka Jaaskelainen authored
different size argument list and without attributes in the arguments. llvm-svn: 176632
-
Andrew Trick authored
Fixes rdar:13349374. Volatile loads and stores need to be preserved even if the language standard says they are undefined. "volatile" in this context means "get out of the way compiler, let my platform handle it". Additionally, this is the only way I know of with llvm to write to the first page (when hardware allows) without dropping to assembly. llvm-svn: 176599
-
- Mar 04, 2013
-
-
Preston Gurd authored
* Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442
-
- Mar 02, 2013
-
-
Peter Collingbourne authored
llvm-svn: 176397
-
- Feb 27, 2013
-
-
Nadav Rotem authored
For each function that we optimize we initialize a new list of lib functions. For each function name we malloc memory. This patch changes the Libcall map to use BumpPtrAllocator. Now we malloc only once. This speeds up instcombine by a few % on a large c++ program. llvm-svn: 176170
-
Pedro Artigas authored
enhancement done the trivial way; by extending inputs and truncating outputs which is addequate for targets with little or no support for integer arithmetic on integer types less than 32 bits. llvm-svn: 176139
-
- Feb 22, 2013
-
-
Bill Wendling authored
The 'nobuiltin' attribute is applied to call sites to indicate that LLVM should not treat the callee function as a built-in function. I.e., it shouldn't try to replace that function with different code. llvm-svn: 175835
-
- Feb 19, 2013
-
-
Bill Wendling authored
llvm-svn: 175476
-
Bill Wendling authored
llvm-svn: 175470
-
- Feb 09, 2013
-
-
Jakub Staszak authored
llvm-svn: 174786
-
- Feb 08, 2013
-
-
Chad Rosier authored
isn't using the default calling convention. However, if the transformation is from a call to inline IR, then the calling convention doesn't matter. rdar://13157990 llvm-svn: 174724
-
- Feb 05, 2013
-
-
Chad Rosier authored
edge is critical, then split it so we can insert the store. rdar://13126179 llvm-svn: 174418
-
- Jan 31, 2013
-
-
Manman Ren authored
This is a re-worked version of r174048. Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. rdar://problem/13089880 llvm-svn: 174093
-
Alexey Samsonov authored
llvm-svn: 174048
-
Bill Wendling authored
llvm-svn: 173992
-
- Jan 30, 2013
-
-
Manman Ren authored
Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. llvm-svn: 173946
-
- Jan 27, 2013
-
-
Chandler Carruth authored
out bug fixes, or functionality preserving refactorings. llvm-svn: 173610
-
- Jan 26, 2013
-
-
Bill Wendling authored
llvm-svn: 173536
-
- Jan 25, 2013
-
-
Chandler Carruth authored
loops over instructions in the basic block or the use-def list of the value, neither of which are really efficient when repeatedly querying about values in the same basic block. What's more, we already know that the CondBB is small, and so we can do a much more efficient test by counting the uses in CondBB, and seeing if those account for all of the uses. Finally, we shouldn't blanket fail on any such instruction, instead we should conservatively assume that those instructions are part of the cost. Note that this actually fixes a bug in the pass because isUsedInBasicBlock has a really terrible bug in it. I'll fix that in my next commit, but the fix for it would make this code suddenly take the compile time hit I thought it already was taking, so I wanted to go ahead and migrate this code to a faster & better pattern. The bug in isUsedInBasicBlock was also causing other tests to test the wrong thing entirely: for example we weren't actually disabling speculation for floating point operations as intended (and tested), but the test passed because we failed to speculate them due to the isUsedInBasicBlock failure. llvm-svn: 173417
-
- Jan 24, 2013
-
-
Benjamin Kramer authored
Original commit message: Plug TTI into the speculation logic, giving it a real cost interface that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173357
-
Chandler Carruth authored
of stage2 in a bootstrap. Still investigating.... llvm-svn: 173343
-
Chandler Carruth authored
that can be specialized by targets. The goal here is not to be more aggressive, but to just be more accurate with very obvious cases. There are instructions which are known to be truly free and which were not being modeled as such in this code -- see the regression test which is distilled from an inner loop of zlib. Everywhere the TTI cost model is insufficiently conservative I've added explicit checks with FIXME comments to go add proper modelling of these cost factors. If this causes regressions, the likely solution is to make TTI even more conservative in its cost estimates, but test cases will help here. llvm-svn: 173342
-
Chandler Carruth authored
unfolded constant expressions rather than checking each one independently. llvm-svn: 173341
-
Chandler Carruth authored
a cost fuction that seems both a bit ad-hoc and also poorly suited to evaluating constant expressions. Notably, it is missing any support for trivial expressions such as 'inttoptr'. I could fix this routine, but it isn't clear to me all of the constraints its other users are operating under. The core protection that seems relevant here is avoiding the formation of a select instruction wich a further chain of select operations in a constant expression operand. Just explicitly encode that constraint. Also, update the comments and organization here to make it clear where this needs to go -- this should be driven off of real cost measurements which take into account the number of constants expressions and the depth of the constant expression tree. llvm-svn: 173340
-
Chandler Carruth authored
terms of cost rather than hoisting a single instruction. This does *not* change the cost model! We still set the cost threshold at 1 here, it's just that we track it by accumulating cost rather than by storing an instruction. The primary advantage is that we no longer leave no-op intrinsics in the basic block. For example, this will now move both debug info intrinsics and a single instruction, instead of only moving the instruction and leaving a basic block with nothing bug debug info intrinsics in it, and those intrinsics now no longer ordered correctly with the hoisted value. Instead, we now splice the entire conditional basic block's instruction sequence. This also places the code for checking the safety of hoisting next to the code computing the cost. Currently, the only observable side-effect of this change is that debug info intrinsics are no longer abandoned. I'm not sure how to craft a test case for this, and my real goal was the refactoring, but I'll talk to Dave or Eric about how to add a test case for this. llvm-svn: 173339
-
Chandler Carruth authored
Previously, the code would scan the PHI nodes and build up a small setvector of candidate value pairs in phi nodes to go and rewrite. Once certain the rewrite could be performed, the code walks the set, and for each one re-scans the entire PHI node list looking for nodes to rewrite operands. Instead, scan the PHI nodes once to check for hazards, and then scan it a second time to rewrite the operands to selects. No set vector, and a max of two scans. The only downside is that we might form identical selects, but instcombine or anything else should fold those easily, and it seems unlikely to happen often. llvm-svn: 173337
-
Chandler Carruth authored
structure being analyzed. No functionality changed. llvm-svn: 173334
-
Chandler Carruth authored
tests. No need to pay the high cost when we're never going to do anything. No functionality changed. llvm-svn: 173331
-
Chandler Carruth authored
pretty in doxygen, adding some of the details actually present in a classic example where this matters (a loop from gzip and many other compression algorithms), and a cautionary note about the risks inherent in the transform. This has come up on the mailing lists recently, and I suspect folks reading this code could benefit from going and looking at the MI pass that can really deal with these issues. llvm-svn: 173329
-
- Jan 23, 2013
-
-
Anton Korobeynikov authored
Otherwise this might hide the problems. llvm-svn: 173265
-
Duncan Sands authored
used uninitialized, since it fails to understand that Array is only used when SingleValue is not, and outputs a warning. It also seems generally safer given that the constructor is non-trivial and has plenty of early exits. llvm-svn: 173242
-