- Apr 16, 2013
-
-
Nadav Rotem authored
SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops. llvm-svn: 179562
-
- Apr 15, 2013
-
-
Nadav Rotem authored
Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make -fslp-vectorize run the slp-vectorizer. llvm-svn: 179508
-
Nadav Rotem authored
llvm-svn: 179505
-
- Mar 10, 2013
-
-
Nick Lewycky authored
llvm-svn: 176793
-
- Mar 06, 2013
-
-
Andrew Trick authored
Always print options that differ from their implicit default. At least for simple option types. llvm-svn: 176572
-
Andrew Trick authored
This way, clang -mllvm -print-options shows that the driver is overriding it. llvm-svn: 176569
-
- Jan 29, 2013
-
-
Hal Finkel authored
Because BBVectorize may significantly shorten a loop body, unroll again after vectorization. This is especially important when using runtime or partial unrolling. llvm-svn: 173730
-
- Jan 07, 2013
-
-
Chandler Carruth authored
builder these days, and this thing hasn't seen updates for a very long time. llvm-svn: 171741
-
- Jan 04, 2013
-
-
Nadav Rotem authored
Move the loop vectorizer from O2 to O3. It looks like the increase in code size actually hurts the performance on many programs. llvm-svn: 171471
-
- Dec 21, 2012
-
-
Roman Divacky authored
llvm-svn: 170902
-
- Dec 19, 2012
-
-
Nadav Rotem authored
Enable the loop vectorizer in clang and not in the pass manager, so that we can disable it in clang. llvm-svn: 170470
-
- Dec 18, 2012
-
-
Nadav Rotem authored
llvm-svn: 170416
-
- Dec 15, 2012
-
-
NAKAMURA Takumi authored
llvm-svn: 170267
-
- Dec 14, 2012
-
-
Nadav Rotem authored
llvm-svn: 170246
-
Nadav Rotem authored
llvm-svn: 170172
-
Nadav Rotem authored
llvm-svn: 170166
-
Nadav Rotem authored
llvm-svn: 170162
-
Nadav Rotem authored
Enable the Loop Vectorizer by default for O2 and O3. Disable if-conversion by default. I plan to revert this patch later today. llvm-svn: 170157
-
- Dec 12, 2012
-
-
Nadav Rotem authored
LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size. llvm-svn: 170004
-
Nadav Rotem authored
LoopVectorizer: When -Os is used, vectorize only loops that don't require a tail loop. There is no testcase because I don't know of a way to initialize the loop vectorizer pass without adding an additional hidden flag. llvm-svn: 169950
-
- Dec 10, 2012
-
-
Nadav Rotem authored
llvm-svn: 169774
-
- Dec 03, 2012
-
-
Chandler Carruth authored
Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
-
- Nov 29, 2012
-
-
Nadav Rotem authored
llvm-svn: 168928
-
- Nov 15, 2012
-
-
Dmitri Gribenko authored
llvm-svn: 168049
-
- Oct 30, 2012
-
-
Nadav Rotem authored
llvm-svn: 167036
-
- Oct 29, 2012
-
-
Nadav Rotem authored
llvm-svn: 166948
-
Nadav Rotem authored
Change the PassManagerBuilder (used by -O3) loop vectorizer flag from -vectorize to -vectorize-loops because we don't want to share the same flag as the bb-vectorizer. llvm-svn: 166937
-
- Oct 26, 2012
-
-
Rafael Espindola authored
list of externals. This makes sense since a shared library with no symbols can still be useful if it has static constructors. llvm-svn: 166795
-
- Oct 25, 2012
-
-
Nadav Rotem authored
llvm-svn: 166643
-
Nadav Rotem authored
llvm-svn: 166642
-
- Oct 18, 2012
-
-
Chandler Carruth authored
over the implicitly-formed-and-nesting CGSCC pass manager and function pass managers, especially when using them on the opt commandline or using extension points in the module builder. The '-barrier' opt flag (or the pass itself) will create a no-op module pass in the pipeline, resetting the pass manager stack, and allowing a new pipeline of function passes or CGSCC passes to be created that is independent from any previous pipelines. For example, this can be used to test running two CGSCC passes in independent CGSCC pass managers as opposed to in the same CGSCC pass manager. It also allows us to introduce a further hack into the PassManagerBuilder to separate the O0 pipeline extension passes from the always-inliner's CGSCC pass manager, which they likely do not want to participate in... At the very least none of the Sanitizer passes want this behavior. This fixes a bug with ASan at O0 currently, and I'll commit the ASan test which covers this pass. I'm happy to add a test case that this pass exists and works, but not sure how much time folks would like me to spend adding test cases for the details of its behavior of partitioning pass managers.... The whole thing is just vile, and mostly intended to unblock ASan, so I'm hoping to rip this all out in a brave new pass manager world. llvm-svn: 166172
-
- Oct 17, 2012
-
-
Nadav Rotem authored
llvm-svn: 166112
-
- Oct 02, 2012
-
-
Chandler Carruth authored
Again, let me know if anything breaks due to this! llvm-svn: 164986
-
- Sep 28, 2012
-
-
- Sep 27, 2012
-
-
Nick Lewycky authored
have testcases for the current problems. llvm-svn: 164731
-
- Sep 24, 2012
-
-
Chandler Carruth authored
Cue the fallout. ;] llvm-svn: 164480
-
- Sep 18, 2012
-
-
Benjamin Kramer authored
llvm-svn: 164124
-
Chandler Carruth authored
FCAs. This is essential in order to promote allocas that are used in struct returns by frontends like Clang. The FCA load would block the rest of the pass from firing, resulting in significant regressions with the bullet benchmark in the nightly test suite. Thanks to Duncan for repeated discussions about how best to do this, and to both him and Benjamin for review. This appears to have blocked many places where the pass tries to fire, and so I expect somewhat different results with this fix added. As with the last big patch, I'm including a change to enable the SROA by default *temporarily*. Ben is going to remove this as soon as the LNT bots pick up the patch. I'm just trying to get a round of LNT numbers from the stable machines in the lab. NOTE: Four clang tests are expected to fail in the brief window where this is enabled. Sorry for the noise! llvm-svn: 164119
-
- Sep 15, 2012
-
-
Benjamin Kramer authored
What we have so far: - Some clang test failures (these were known already) - Perf results are mixed, some big regressions http://llvm.org/perf/db_default/v4/nts/3844 http://llvm.org/perf/db_default/v4/nts/3845 bullet suffers a lot. matmul is interesting: slower scalar code, faster with -vectorize. - Some dragonegg selfhost bots crash in SROA during selfhost now http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.6-self-host-checks/builds/1632 http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.5-self-host/builds/1891 llvm-svn: 163968
-
Chandler Carruth authored
new one, and add support for running the new pass in that mode and in that slot of the pass manager. With this the new pass can completely replace the old one within the pipeline. The strategy for enabling or disabling the SSAUpdater logic is to do it by making the requirement of the domtree analysis optional. By default, it is required and we get the standard mem2reg approach. This is usually the desired strategy when run in stand-alone situations. Within the CGSCC pass manager, we disable requiring of the domtree analysis and consequently trigger fallback to the SSAUpdater promotion. In theory this would allow the pass to re-use a domtree if one happened to be available even when run in a mode that doesn't require it. In practice, it lets us have a single pass rather than two, which was simpler for me to wrap my head around. There is a hidden flag to force the use of the SSAUpdater code path for the purpose of testing. The primary testing strategy is just to run the existing tests through that path. One notable difference is that it has custom code to handle lifetime markers, and one of the tests has been enhanced to exercise that code. This has survived a bootstrap and the test suite without serious correctness issues; however, my run of the test suite produced *very* alarming performance numbers. I don't entirely understand or trust them though, so more investigation is ongoing. To aid my understanding of the performance impact of the new SROA now that it runs throughout the optimization pipeline, I'm enabling it by default in this commit, and will disable it again once the LNT bots have picked up one iteration with it. I want to get those bots (which are much more stable) to evaluate the impact of the change before I jump to any conclusions. NOTE: Several Clang tests will fail because they run -O3 and check the result's order of output. They'll go back to passing once I disable it again. llvm-svn: 163965
-