- Feb 24, 2014
-
-
Arnold Schwaighofer authored
During the LTO phase LICM will move loop invariant global variables out of loops (informed by GlobalModRef). This makes more loops countable presenting opportunity for the loop vectorizer. Adding the loop vectorizer improves some TSVC benchmarks and twolf/ref dataset (5%) on x86-64. radar://15970632 llvm-svn: 202051
-
- Jan 13, 2014
-
-
Chandler Carruth authored
directory. These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis. Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that evn Clang's CFGBlock's can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which both caches, reconstructs, and supports a nice update API. But those are very long term, and so I don't want to leave the really confusing structure until that day arrives. llvm-svn: 199082
-
- Dec 05, 2013
-
-
Renato Golin authored
The intended behaviour is to force vectorization on the presence of the flag (either turn on or off), and to continue the behaviour as expected in its absence. Tests were added to make sure the all cases are covered in opt. No tests were added in other tools with the assumption that they should use the PassManagerBuilder in the same way. This patch also removes the outdated -late-vectorize flag, which was on by default and not helping much. The pragma metadata is being attached to the same place as other loop metadata, but nothing forbids one from attaching it to a function (to enable #pragma optimize) or basic blocks (to hint the basic-block vectorizers), etc. The logic should be the same all around. Patches to Clang to produce the metadata will be produced after the initial implementation is agreed upon and committed. Patches to other vectorizers (such as SLP and BB) will be added once we're happy with the pass manager changes. llvm-svn: 196537
-
- Nov 17, 2013
-
-
Hal Finkel authored
This adds a boolean member variable to the PassManagerBuilder to control loop rerolling (just like we have for unrolling and the various vectorization options). This is necessary for control by the frontend. Loop rerolling remains disabled by default at all optimization levels. llvm-svn: 194966
-
Hal Finkel authored
This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The transformation aims to take loops like this: for (int i = 0; i < 3200; i += 5) { a[i] += alpha * b[i]; a[i + 1] += alpha * b[i + 1]; a[i + 2] += alpha * b[i + 2]; a[i + 3] += alpha * b[i + 3]; a[i + 4] += alpha * b[i + 4]; } and turn them into this: for (int i = 0; i < 3200; ++i) { a[i] += alpha * b[i]; } and loops like this: for (int i = 0; i < 500; ++i) { x[3*i] = foo(0); x[3*i+1] = foo(0); x[3*i+2] = foo(0); } and turn them into this: for (int i = 0; i < 1500; ++i) { x[i] = foo(0); } There are two motivations for this transformation: 1. Code-size reduction (especially relevant, obviously, when compiling for code size). 2. Providing greater choice to the loop vectorizer (and generic unroller) to choose the unrolling factor (and a better ability to vectorize). The loop vectorizer can take vector lengths and register pressure into account when choosing an unrolling factor, for example, and a pre-unrolled loop limits that choice. This is especially problematic if the manual unrolling was optimized for a machine different from the current target. The current implementation is limited to single basic-block loops only. The rerolling recognition should work regardless of how the loop iterations are intermixed within the loop body (subject to dependency and side-effect constraints), but the significant restriction is that the order of the instructions in each iteration must be identical. This seems sufficient to capture all current use cases. This pass is not currently enabled by default at any optimization level. llvm-svn: 194939
-
- Oct 31, 2013
-
-
Rafael Espindola authored
There are two ways one could implement hiding of linkonce_odr symbols in LTO: * LLVM tells the linker which symbols can be hidden if not used from native files. * The linker tells LLVM which symbols are not used from other object files, but will be put in the dso symbol table if present. GOLD's API is the second option. It was implemented almost 1:1 in llvm by passing the list down to internalize. LLVM already had partial support for the first option. It is also very similar to how ld64 handles hiding these symbols when *not* doing LTO. This patch then * removes the APIs for the DSO list. * marks LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr global values and other linkonce_odr whose address is not used. * makes the gold plugin responsible for handling the API mismatch. llvm-svn: 193800
-
- Oct 19, 2013
-
-
Nadav Rotem authored
llvm-svn: 193013
-
- Oct 03, 2013
-
-
Rafael Espindola authored
Generalize the API so we can distinguish symbols that are needed just for a DSO symbol table from those that are used from some native .o. The symbols that are only wanted for the dso symbol table can be dropped if llvm can prove every other dso has a copy (linkonce_odr) and the address is not important (unnamed_addr). llvm-svn: 191922
-
- Sep 03, 2013
-
-
Nadav Rotem authored
This patch changes the default setting for the LateVectorization flag that controls where the loop-vectorizer is ran. Perf gains: SingleSource/Benchmarks/Shootout/matrix -37.33% MultiSource/Benchmarks/PAQ8p/paq8p -22.83% SingleSource/Benchmarks/Linpack/linpack-pc -16.22% SingleSource/Benchmarks/Shootout-C++/ary3 -15.16% MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt -10.34% MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl -7.12% Regressions: SingleSource/Benchmarks/Misc/lowercase 15.10% MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt 13.18% SingleSource/Benchmarks/Shootout-C++/matrix 8.27% SingleSource/Benchmarks/CoyoteBench/lpbench 7.30% llvm-svn: 189858
-
- Aug 30, 2013
-
-
Bill Wendling authored
llvm-svn: 189632
-
- Aug 29, 2013
-
-
Nadav Rotem authored
Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons: 1. They are a kind of cannonicalization. 2. The performance measurements show that it is better to keep them in. There should be no functional change if you are not enabling the LateVectorization mode. llvm-svn: 189539
-
- Aug 28, 2013
-
-
Hal Finkel authored
When unrolling is disabled in the pass manager, the loop vectorizer should also not unroll loops. This will allow the -fno-unroll-loops option in Clang to behave as expected (even for vectorizable loops). The loop vectorizer's -force-vector-unroll option will (continue to) override the pass-manager setting (including -force-vector-unroll=0 to force use of the internal auto-selection logic). In order to test this, I added a flag to opt (-disable-loop-unrolling) to force disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also, this fixes a small bug in opt where the loop vectorizer was enabled only after the pass manager populated the queue of passes (the global_alias.ll test needed a slight update to the RUN line as a result of this fix). llvm-svn: 189499
-
- Aug 13, 2013
-
-
Arnold Schwaighofer authored
llvm-svn: 188285
-
Arnold Schwaighofer authored
I have moved this logic into clang and opt. llvm-svn: 188281
-
- Aug 06, 2013
-
-
Tom Stellard authored
Patch by: Mei Ye llvm-svn: 187764
-
- Aug 02, 2013
-
-
Nadav Rotem authored
llvm-svn: 187628
-
- Aug 01, 2013
-
-
Nadav Rotem authored
llvm-svn: 187595
-
- Jul 27, 2013
-
-
Tom Stellard authored
Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278
-
- Jun 24, 2013
-
-
Chandler Carruth authored
CGSCC pass manager. This should insulate the inlining decisions from the vectorization decisions, however it may have both compile time and code size problems so it is just an experimental option right now. Adding this based on a discussion with Arnold and it seems at least worth having this flag for us to both run some experiments to see if this strategy is workable. It may solve some of the regressions seen with the loop vectorizer. llvm-svn: 184698
-
- Jun 20, 2013
-
-
Meador Inge authored
This commit completely removes what is left of the simplify-libcalls pass. All of the functionality has now been migrated to the instcombine and functionattrs passes. The following C API functions are now NOPs: 1. LLVMAddSimplifyLibCallsPass 2. LLVMPassManagerBuilderSetDisableSimplifyLibCalls llvm-svn: 184459
-
- Jun 17, 2013
-
-
Nadav Rotem authored
llvm-svn: 184089
-
Nadav Rotem authored
llvm-svn: 184084
-
- Jun 07, 2013
-
-
Nadav Rotem authored
Jeffrey Yasskin volunteered to benchmark the vectorizer on -O2 or -Os when compiling chrome. This patch adds a new flag to enable vectorization on all levels and not only on -O3. It should go away once we make a decision. llvm-svn: 183456
-
- May 01, 2013
-
-
Filip Pizlo authored
the things, and renames it to CBindingWrapping.h. I also moved CBindingWrapping.h into Support/. This new file just contains the macros for defining different wrap/unwrap methods. The calls to those macros, as well as any custom wrap/unwrap definitions (like for array of Values for example), are put into corresponding C++ headers. Doing this required some #include surgery, since some .cpp files relied on the fact that including Wrap.h implicitly caused the inclusion of a bunch of other things. This also now means that the C++ headers will include their corresponding C API headers; for example Value.h must include llvm-c/Core.h. I think this is harmless, since the C API headers contain just external function declarations and some C types, so I don't believe there should be any nasty dependency issues here. llvm-svn: 180881
-
- Apr 23, 2013
-
-
Eric Christopher authored
or the C++ files themselves. This enables people to use just a C compiler to interoperate with LLVM. llvm-svn: 180063
-
- Apr 16, 2013
-
-
Nadav Rotem authored
SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops. llvm-svn: 179562
-
- Apr 15, 2013
-
-
Nadav Rotem authored
Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make -fslp-vectorize run the slp-vectorizer. llvm-svn: 179508
-
Nadav Rotem authored
llvm-svn: 179505
-
- Mar 10, 2013
-
-
Nick Lewycky authored
llvm-svn: 176793
-
- Mar 06, 2013
-
-
Andrew Trick authored
Always print options that differ from their implicit default. At least for simple option types. llvm-svn: 176572
-
Andrew Trick authored
This way, clang -mllvm -print-options shows that the driver is overriding it. llvm-svn: 176569
-
- Jan 29, 2013
-
-
Hal Finkel authored
Because BBVectorize may significantly shorten a loop body, unroll again after vectorization. This is especially important when using runtime or partial unrolling. llvm-svn: 173730
-
- Jan 07, 2013
-
-
Chandler Carruth authored
builder these days, and this thing hasn't seen updates for a very long time. llvm-svn: 171741
-
- Jan 04, 2013
-
-
Nadav Rotem authored
Move the loop vectorizer from O2 to O3. It looks like the increase in code size actually hurts the performance on many programs. llvm-svn: 171471
-
- Dec 21, 2012
-
-
Roman Divacky authored
llvm-svn: 170902
-
- Dec 19, 2012
-
-
Nadav Rotem authored
Enable the loop vectorizer in clang and not in the pass manager, so that we can disable it in clang. llvm-svn: 170470
-
- Dec 18, 2012
-
-
Nadav Rotem authored
llvm-svn: 170416
-
- Dec 15, 2012
-
-
NAKAMURA Takumi authored
llvm-svn: 170267
-
- Dec 14, 2012
-
-
Nadav Rotem authored
llvm-svn: 170246
-
Nadav Rotem authored
llvm-svn: 170172
-