- May 20, 2013
-
-
Hal Finkel authored
As discussed, LoopUtils.h is a better name. llvm-svn: 182314
-
Hal Finkel authored
Other passes, PPC counter-loop formation for example, also need to add loop preheaders outside of the regular loop simplification pass. This makes InsertPreheaderForLoop a global function so that it can be used by other passes. No functionality change intended. llvm-svn: 182299
-
- May 05, 2013
-
-
Dmitri Gribenko authored
Patch by Robert Wilhelm. llvm-svn: 181138
-
- May 01, 2013
-
-
Filip Pizlo authored
the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881
-
Richard Trieu authored
prevent this, capture the location before RI is freed. llvm-svn: 180824
-
- Apr 30, 2013
-
-
Adrian Prantl authored
llvm-svn: 180794
-
Adrian Prantl authored
the inlined function has multiple returns. rdar://problem/12415623 llvm-svn: 180793
-
- Apr 29, 2013
-
-
Arnold Schwaighofer authored
This resurrects r179957, but adds code that makes sure we don't touch atomic/volatile stores: This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case where the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. llvm-svn: 180731
-
- Apr 26, 2013
-
-
Adrian Prantl authored
rdar://problem/13056109 llvm-svn: 180618
-
Adrian Prantl authored
Since we can't guarantee that the original dbg.declare instrinsic is removed by LowerDbgDeclare(), we need to make sure that we are not inserting the same dbg.value intrinsic over and over. This removes tons of redundant DIEs when compiling optimized code. rdar://problem/13056109 llvm-svn: 180615
-
- Apr 23, 2013
-
-
Adrian Prantl authored
debug location. This solves a problem where range of an inlined subroutine is emitted wrongly. Patch by Manman Ren. Fixes rdar://problem/12415623 llvm-svn: 180140
-
Eric Christopher authored
or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063
-
- Apr 21, 2013
-
-
Arnold Schwaighofer authored
There is the temptation to make this tranform dependent on target information as it is not going to be beneficial on all (sub)targets. Therefore, we should probably do this in MI Early-Ifconversion. This reverts commit r179957. Original commit message: "SimplifyCFG: If convert single conditional stores This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case were the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the builtbots and will revert this if anything unexpected comes up." llvm-svn: 179980
-
- Apr 20, 2013
-
-
Arnold Schwaighofer authored
This transformation will transform a conditional store with a preceeding uncondtional store to the same location: a[i] = may-alias with a[i] load if (cond) a[i] = Y into an unconditional store. a[i] = X may-alias with a[i] load tmp = cond ? Y : X; a[i] = tmp We assume that on average the cost of a mispredicted branch is going to be higher than the cost of a second store to the same location, and that the secondary benefits of creating a bigger basic block for other optimizations to work on outway the potential case were the branch would be correctly predicted and the cost of the executing the second store would be noticably reflected in performance. hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With this change we are on par with gcc's performance (gcc also performs this transformation). There was a 1.2 % performance improvement on a ARM swift chip. Other tests in the test-suite+external seem to be mostly uninfluenced in my experiments: This optimization was triggered on 41 tests such that the executable was different before/after the patch. Only 1 out of the 40 tests (dealII) was reproducable below 100% (by about .4%). Given that hmmer benefits so much I believe this to be a fair trade off. I am going to watch performance numbers across the builtbots and will revert this if anything unexpected comes up. llvm-svn: 179957
-
- Apr 17, 2013
-
-
Peter Collingbourne authored
Differential Revision: http://llvm-reviews.chandlerc.com/D620 llvm-svn: 179661
-
- Apr 16, 2013
-
-
Hans Wennborg authored
If a switch instruction has a case for every possible value of its type, with the same successor, SimplifyCFG would replace it with an icmp ult, but the computation of the bound overflows in that case, which inverts the test. Patch by Jed Davis! llvm-svn: 179587
-
- Apr 10, 2013
-
-
Joey Gouly authored
rather than checking if the source and destination have the same number of arguments and copying the attributes over directly. llvm-svn: 179169
-
- Mar 22, 2013
-
-
Bill Wendling authored
llvm-svn: 177757
-
Bill Wendling authored
llvm-svn: 177749
-
Evgeniy Stepanov authored
llvm-svn: 177713
-
Bill Wendling authored
How did this ever work? Basically, if you have a function that's inlined into the caller, it may not have any 'call' instructions, but any 'resume' instructions it may have should still be forwarded to the outer (caller's) landing pad. This requires that all of the 'landingpad' instructions in the callee have their clauses merged with the caller's outer 'landingpad' instruction (hence the bit of ugly code in the `forwardResume' method). Testcase in a follow commit to the test-suite repository. <rdar://problem/13360379> & PR15555 llvm-svn: 177680
-
- Mar 12, 2013
-
-
Meador Inge authored
Nadav reported a performance regression due to the work I did to merge the library call simplifier into instcombine [1]. The issue is that a new LibCallSimplifier object is being created whenever InstCombiner::runOnFunction is called. Every time a LibCallSimplifier object is used to optimize a call it creates a hash table to map from a function name to an object that optimizes functions of that name. For short-lived LibCallSimplifier instances this is quite inefficient. Especially for cases where no calls are actually simplified. This patch fixes the issue by dropping the hash table and implementing an explicit lookup function to correlate the function name to the object that optimizes functions of that name. This avoids the cost of always building and destroying the hash table in cases where the LibCallSimplifier object is short-lived and avoids the cost of building the table when no simplifications are actually preformed. On a benchmark containing 100,000 calls where none of them are simplified I noticed a 30% speedup. On a benchmark containing 100,000 calls where all of them are simplified I noticed an 8% speedup. [1] http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130304/167639.html llvm-svn: 176840
-
- Mar 11, 2013
-
-
Bill Wendling authored
An invoke may require a table entry. For instance, when the function it calls is expected to throw. <rdar://problem/13360379> llvm-svn: 176827
-
- Mar 07, 2013
-
-
Pekka Jaaskelainen authored
different size argument list and without attributes in the arguments. llvm-svn: 176632
-
Andrew Trick authored
Fixes rdar:13349374. Volatile loads and stores need to be preserved even if the language standard says they are undefined. "volatile" in this context means "get out of the way compiler, let my platform handle it". Additionally, this is the only way I know of with llvm to write to the first page (when hardware allows) without dropping to assembly. llvm-svn: 176599
-
- Mar 04, 2013
-
-
Preston Gurd authored
* Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442
-
- Mar 02, 2013
-
-
Peter Collingbourne authored
llvm-svn: 176397
-
- Feb 27, 2013
-
-
Nadav Rotem authored
For each function that we optimize we initialize a new list of lib functions. For each function name we malloc memory. This patch changes the Libcall map to use BumpPtrAllocator. Now we malloc only once. This speeds up instcombine by a few % on a large c++ program. llvm-svn: 176170
-
Pedro Artigas authored
enhancement done the trivial way; by extending inputs and truncating outputs which is addequate for targets with little or no support for integer arithmetic on integer types less than 32 bits. llvm-svn: 176139
-
- Feb 22, 2013
-
-
Bill Wendling authored
The 'nobuiltin' attribute is applied to call sites to indicate that LLVM should not treat the callee function as a built-in function. I.e., it shouldn't try to replace that function with different code. llvm-svn: 175835
-
- Feb 19, 2013
-
-
Bill Wendling authored
llvm-svn: 175476
-
Bill Wendling authored
llvm-svn: 175470
-
- Feb 09, 2013
-
-
Jakub Staszak authored
llvm-svn: 174786
-
- Feb 08, 2013
-
-
Chad Rosier authored
isn't using the default calling convention. However, if the transformation is from a call to inline IR, then the calling convention doesn't matter. rdar://13157990 llvm-svn: 174724
-
- Feb 05, 2013
-
-
Chad Rosier authored
edge is critical, then split it so we can insert the store. rdar://13126179 llvm-svn: 174418
-
- Jan 31, 2013
-
-
Manman Ren authored
This is a re-worked version of r174048. Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. rdar://problem/13089880 llvm-svn: 174093
-
Alexey Samsonov authored
llvm-svn: 174048
-
Bill Wendling authored
llvm-svn: 173992
-
- Jan 30, 2013
-
-
Manman Ren authored
Given source IR: call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15 we used to generate call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29 !27 = metadata !{null} With this patch, we will correctly generate call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28 Looking up %argc.addr in ValueMap will return null, since %argc.addr is already correctly set up, we can use identity mapping. llvm-svn: 173946
-
- Jan 27, 2013
-
-
Chandler Carruth authored
out bug fixes, or functionality preserving refactorings. llvm-svn: 173610
-