- Nov 17, 2013
-
-
Hal Finkel authored
This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The transformation aims to take loops like this:

  for (int i = 0; i < 3200; i += 5) {
    a[i] += alpha * b[i];
    a[i + 1] += alpha * b[i + 1];
    a[i + 2] += alpha * b[i + 2];
    a[i + 3] += alpha * b[i + 3];
    a[i + 4] += alpha * b[i + 4];
  }

and turn them into this:

  for (int i = 0; i < 3200; ++i) {
    a[i] += alpha * b[i];
  }

and loops like this:

  for (int i = 0; i < 500; ++i) {
    x[3*i] = foo(0);
    x[3*i+1] = foo(0);
    x[3*i+2] = foo(0);
  }

and turn them into this:

  for (int i = 0; i < 1500; ++i) {
    x[i] = foo(0);
  }

There are two motivations for this transformation:

  1. Code-size reduction (especially relevant, obviously, when compiling for code size).

  2. Providing greater choice to the loop vectorizer (and generic unroller) to choose the unrolling factor (and a better ability to vectorize). The loop vectorizer can take vector lengths and register pressure into account when choosing an unrolling factor, for example, and a pre-unrolled loop limits that choice. This is especially problematic if the manual unrolling was optimized for a machine different from the current target.

The current implementation is limited to single basic-block loops only. The rerolling recognition should work regardless of how the loop iterations are intermixed within the loop body (subject to dependency and side-effect constraints), but the significant restriction is that the order of the instructions in each iteration must be identical. This seems sufficient to capture all current use cases.

This pass is not currently enabled by default at any optimization level.

llvm-svn: 194939
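As a source-level illustration of the first example in this message, the following C++ sketch writes both loop forms as functions so that their equivalence can be checked. The function names `axpyUnrolled` and `axpyRerolled` and the use of `std::vector<double>` are assumptions made for this example; the pass itself works on LLVM IR, not on C++ source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: the manually unrolled form (factor 5).
// Assumes both vectors have the same size and that it is a multiple of 5.
void axpyUnrolled(std::vector<double> &a, const std::vector<double> &b,
                  double alpha) {
  assert(a.size() == b.size() && a.size() % 5 == 0);
  for (std::size_t i = 0; i < a.size(); i += 5) {
    a[i] += alpha * b[i];
    a[i + 1] += alpha * b[i + 1];
    a[i + 2] += alpha * b[i + 2];
    a[i + 3] += alpha * b[i + 3];
    a[i + 4] += alpha * b[i + 4];
  }
}

// Hypothetical helper: the rerolled form, one element per iteration.
void axpyRerolled(std::vector<double> &a, const std::vector<double> &b,
                  double alpha) {
  assert(a.size() == b.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    a[i] += alpha * b[i];
}
```

Both functions apply the same update to each element exactly once, so they should leave `a` in the same state; only the rerolled form leaves the choice of unrolling factor open to later passes.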
-
- Nov 15, 2013
-
-
Manman Ren authored
We used to use std::map<IndicesVector, LoadInst*> for OriginalLoads. When we tried to promote two arguments, both wrote to the same OriginalLoads entries, so the loads created for the two arguments ended up with the same original load, and therefore the same tbaa tag and alignment. The fix is to use std::map<std::pair<Argument*, IndicesVector>, LoadInst*> for OriginalLoads, so each Argument writes to a different part of the map. PR17906 llvm-svn: 194846
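The key collision described above can be reproduced outside LLVM. In this minimal sketch, `Argument` and `LoadInst` are empty stand-in structs rather than the real LLVM classes, and `recordOld` and `recordNew` are hypothetical helpers. Whether the first or the last write wins in the old map is a detail of this sketch, not a claim about the original code.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using IndicesVector = std::vector<unsigned>;
struct Argument {};                       // stand-in for llvm::Argument
struct LoadInst { unsigned Alignment; };  // stand-in for llvm::LoadInst

// Old key: the indices alone, so two arguments can share an entry.
using LoadsByIndices = std::map<IndicesVector, LoadInst *>;
// New key: the (argument, indices) pair, so each argument has its own entries.
using LoadsByArgAndIndices =
    std::map<std::pair<Argument *, IndicesVector>, LoadInst *>;

void recordOld(LoadsByIndices &M, const IndicesVector &Idx, LoadInst *LI) {
  assert(LI && "expected a load");
  M[Idx] = LI;  // a second argument with the same indices overwrites this
}

void recordNew(LoadsByArgAndIndices &M, Argument *A, const IndicesVector &Idx,
               LoadInst *LI) {
  assert(A && LI && "expected an argument and a load");
  M[std::make_pair(A, Idx)] = LI;  // distinct arguments never collide
}
```

With two arguments that are both loaded through the same index path, the old map ends up with one entry and the new map with two, which is what keeps the tbaa tag and alignment of each argument's load separate.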
-
- Nov 12, 2013
-
-
Rafael Espindola authored
Constant merge can merge a constant with implicit alignment with one that has explicit alignment. Before this change it assumed that the explicit alignment was higher than the implicit one, causing the result to be under-aligned in some cases. Fixes PR17815. Patch by Chris Smowton! llvm-svn: 194506
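One way to picture the under-alignment problem is with plain integers. The sketch below is an assumption-laden model, not the code from the patch: an explicit alignment of 0 stands for "no explicit alignment", `AbiAlign` stands for the implicit alignment the type would get, and the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model: 0 means the constant carries no explicit alignment,
// in which case the implicit (ABI) alignment of its type applies.
unsigned effectiveAlignment(unsigned ExplicitAlign, unsigned AbiAlign) {
  assert(AbiAlign != 0 && "ABI alignment must be known");
  return ExplicitAlign != 0 ? ExplicitAlign : AbiAlign;
}

// The merged constant must satisfy both originals, so take the larger of the
// two effective alignments instead of trusting the explicit one.
unsigned mergedAlignment(unsigned ExplicitA, unsigned ExplicitB,
                         unsigned AbiAlign) {
  return std::max(effectiveAlignment(ExplicitA, AbiAlign),
                  effectiveAlignment(ExplicitB, AbiAlign));
}
```

For instance, merging a constant with explicit alignment 1 into one with an implicit alignment of 4 has to yield 4 in this model; assuming the explicit value always wins would give 1.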
-
- Nov 10, 2013
-
-
Matt Arsenault authored
llvm-svn: 194342
-
- Nov 04, 2013
-
-
Shuxin Yang authored
llvm-svn: 194017
-
- Nov 03, 2013
-
-
David Majnemer authored
llvm-svn: 193954
-
- Oct 31, 2013
-
-
Rafael Espindola authored
There are two ways one could implement hiding of linkonce_odr symbols in LTO:

* LLVM tells the linker which symbols can be hidden if not used from native files.
* The linker tells LLVM which symbols are not used from other object files, but will be put in the dso symbol table if present.

GOLD's API is the second option. It was implemented almost 1:1 in llvm by passing the list down to internalize.

LLVM already had partial support for the first option. It is also very similar to how ld64 handles hiding these symbols when *not* doing LTO.

This patch then

* removes the APIs for the DSO list.
* marks as LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr global values and other linkonce_odr values whose address is not used.
* makes the gold plugin responsible for handling the API mismatch.

llvm-svn: 193800
-
Rafael Espindola authored
llvm-svn: 193734
-
- Oct 27, 2013
-
-
Shuxin Yang authored
llvm-svn: 193489
-
- Oct 23, 2013
-
-
Shuxin Yang authored
Major steps include:

1). Introduce a not-addr-taken bit-field in GlobalVariable.
2). The GlobalOpt pass sets "not-address-taken" if it proves a global variable doesn't have its address taken.
3). AA uses this info for disambiguation.

llvm-svn: 193251
-
- Oct 22, 2013
-
-
Eric Christopher authored
llvm-svn: 193130
-
- Oct 21, 2013
-
-
Matt Arsenault authored
llvm-svn: 193109
-
Rafael Espindola authored
When a linkonce_odr value that is on the dso list is not unnamed_addr we can still look to see if anything is actually using its address. If not, it is safe to hide it. This patch implements that by moving GlobalStatus to Transforms/Utils and using it in Internalize. llvm-svn: 193090
-
- Oct 19, 2013
-
-
Nadav Rotem authored
llvm-svn: 193013
-
- Oct 17, 2013
-
-
Rafael Espindola authored
llvm-svn: 192910
-
Rafael Espindola authored
llvm-svn: 192907
-
Rafael Espindola authored
No functionality change. llvm-svn: 192906
-
- Oct 09, 2013
-
-
Shuxin Yang authored
If a function seen at compile time is not necessarily the one linked into the binary being built, it is illegal to change the actual arguments passed to it. e.g.

--------------------------
void foo(int lol) {
  // foo() has linkage satisfying isWeakForLinker()
  // "lol" is not used at all.
}

void bar(int lol2) {
  // xform to foo(undef) is illegal, as the compiler does not know which
  // instance of foo() will be linked into the binary being built.
  foo(lol2);
}
-----------------------------

Such functions can be captured by isWeakForLinker(). NOTE that mayBeOverridden() is insufficient for this purpose as it doesn't include linkage types like AvailableExternallyLinkage and LinkOnceODRLinkage. Take link_odr* as an example: it indicates a set of *EQUIVALENT* globals that can be merged at link-time. However, the semantics of *EQUIVALENT* functions include parameters. Changing parameters breaks the assumption.

Thanks to John McCall for help, especially for the explanation of the subtle differences between linkage types.

rdar://11546243
llvm-svn: 192302
-
- Oct 07, 2013
-
-
Alexey Samsonov authored
llvm-svn: 192121
-
- Oct 03, 2013
-
-
Rafael Espindola authored
Generalize the API so we can distinguish symbols that are needed just for a DSO symbol table from those that are used from some native .o. The symbols that are only wanted for the dso symbol table can be dropped if llvm can prove every other dso has a copy (linkonce_odr) and the address is not important (unnamed_addr). llvm-svn: 191922
-
- Oct 02, 2013
-
-
Alexey Samsonov authored
Summary: As discussed in http://llvm-reviews.chandlerc.com/D1754, this optimization isn't really valid for C, and fires too rarely anyway.

Reviewers: rafael, nicholas
Reviewed By: nicholas
CC: rnk, llvm-commits, nicholas
Differential Revision: http://llvm-reviews.chandlerc.com/D1769
llvm-svn: 191834
-
- Oct 01, 2013
-
-
Matt Arsenault authored
It's silly to merge functions like these:

  define void @foo(i32 %x) {
    ret void
  }

  define void @bar(i32 %x) {
    ret void
  }

to get

  define void @bar(i32) {
    tail call void @foo(i32 %0)
    ret void
  }

llvm-svn: 191786
-
- Sep 22, 2013
-
-
Benjamin Kramer authored
This makes using array_pod_sort significantly safer. The implementation relies on function pointer casting but that should be safe as we're dealing with void* here. llvm-svn: 191175
-
- Sep 17, 2013
-
-
Stepan Dyatkovskiy authored
Wrong cast operation. MergeFunctions emitted a Bitcast where a pointer-to-integer operation was required. The patch fixes the MergeFunctions::writeThunk function: it replaces the unconditional Bitcast creation with a "Value* createCast(...)" method, which checks the operand types and selects the proper instruction. See the unit test for an example. llvm-svn: 190859
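The selection logic described here can be sketched without the LLVM headers. This is a toy model under stated assumptions: `TypeKind`, `CastOp` and `selectCast` are hypothetical names invented for this example, and the real createCast operates on LLVM values and types and may handle further cases.

```cpp
#include <cassert>

// Hypothetical, heavily simplified view of an operand type.
enum class TypeKind { Integer, Pointer, Other };
// The cast instructions this sketch chooses between.
enum class CastOp { BitCast, PtrToInt, IntToPtr };

// Pick a cast from the source and destination kinds instead of always
// emitting a bitcast, which is invalid between pointers and integers.
CastOp selectCast(TypeKind Src, TypeKind Dst) {
  if (Src == TypeKind::Pointer && Dst == TypeKind::Integer)
    return CastOp::PtrToInt;
  if (Src == TypeKind::Integer && Dst == TypeKind::Pointer)
    return CastOp::IntToPtr;
  return CastOp::BitCast;  // fallback for every other pairing in this sketch
}
```

In this model a thunk that forwards a pointer argument to a function expecting an integer of the same width would get `CastOp::PtrToInt` rather than an invalid bitcast.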
-
- Sep 16, 2013
-
-
Peter Collingbourne authored
Previous discussion: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html Differential Revision: http://llvm-reviews.chandlerc.com/D1191 llvm-svn: 190773
-
- Sep 13, 2013
-
-
Duncan Sands authored
disabled. llvm-svn: 190668
-
- Sep 11, 2013
-
-
Matt Arsenault authored
This doesn't change anything since malloc always returns address space 0. llvm-svn: 190498
-
- Sep 10, 2013
-
-
Eli Friedman authored
LLVM IR doesn't currently allow atomic bool load/store operations, and the transformation is dubious anyway because it isn't profitable on all platforms. PR17163. llvm-svn: 190357
-
- Sep 05, 2013
-
-
Rafael Espindola authored
llvm-svn: 190090
-
Nick Lewycky authored
llvm-svn: 190035
-
- Sep 04, 2013
-
-
Rafael Espindola authored
I am about to patch this code, and this makes the diff far more readable. llvm-svn: 189982
-
Rafael Espindola authored
llvm-svn: 189971
-
Rafael Espindola authored
No functionality change. llvm-svn: 189969
-
Rafael Espindola authored
llvm-svn: 189967
-
Rafael Espindola authored
This reverts commit r189886. I found a corner case where this optimization is not valid. Say we have a "linkonce_odr unnamed_addr" in two translation units:

* In TU 1 this optimization kicks in and makes it hidden.
* In TU 2 it gets const merged with a constant that is *not* unnamed_addr, resulting in a non unnamed_addr constant with default visibility.
* The static linker rules for combining visibility then produce a hidden symbol, which is incorrect from the point of view of the non unnamed_addr constant.

The one place we can do this is when we know that the symbol is not used from another TU in the same shared object, i.e., during LTO. I will move it there.

llvm-svn: 189954
-
Rafael Espindola authored
Original message: If a constant or a function has linkonce_odr linkage and unnamed_addr, mark hidden. Being linkonce_odr guarantees that it is available in every dso that needs it. Being a constant/function with unnamed_addr guarantees that the copies don't have to be merged. llvm-svn: 189886
-
- Sep 03, 2013
-
-
Nadav Rotem authored
This patch changes the default setting for the LateVectorization flag that controls where the loop-vectorizer is run.

Perf gains:
  SingleSource/Benchmarks/Shootout/matrix                           -37.33%
  MultiSource/Benchmarks/PAQ8p/paq8p                                -22.83%
  SingleSource/Benchmarks/Linpack/linpack-pc                        -16.22%
  SingleSource/Benchmarks/Shootout-C++/ary3                         -15.16%
  MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt   -10.34%
  MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl    -7.12%

Regressions:
  SingleSource/Benchmarks/Misc/lowercase                             15.10%
  MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt    13.18%
  SingleSource/Benchmarks/Shootout-C++/matrix                         8.27%
  SingleSource/Benchmarks/CoyoteBench/lpbench                         7.30%

llvm-svn: 189858
-
- Aug 30, 2013
-
-
Bill Wendling authored
llvm-svn: 189697
-
Bill Wendling authored
llvm-svn: 189632
-
- Aug 29, 2013
-
-
Nadav Rotem authored
Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons:

1. They are a kind of canonicalization.
2. The performance measurements show that it is better to keep them in.

There should be no functional change if you are not enabling the LateVectorization mode.

llvm-svn: 189539
-