Commits · b9116e69666de1d6da41af6554beb9ceb2835db8 · Roger Ferrer / llvm-epi-0.8

Apr 16, 2013

SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops. · b9116e69

Nadav Rotem authored Apr 15, 2013

llvm-svn: 179562

b9116e69

Apr 15, 2013
- Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make -fslp-vectorize run the slp-vectorizer. · d4dcc003
  Nadav Rotem authored Apr 15, 2013
```
llvm-svn: 179508
```
  d4dcc003
- Rename the slp-vectorizer clang/llvm flags. No functionality change. · a1e5e44e
  Nadav Rotem authored Apr 15, 2013
```
llvm-svn: 179505
```
  a1e5e44e
Mar 10, 2013
- Use LLVMBool instead of 'bool' in the C API. Based on a patch by Peter Zotov! · 5f508541
  Nick Lewycky authored Mar 10, 2013
```
llvm-svn: 176793
```
  5f508541
Mar 06, 2013

Generalize my previous fix for -print-options. · fcb37243

Andrew Trick authored Mar 06, 2013

Always print options that differ from their implicit default. At least
for simple option types.

llvm-svn: 176572

fcb37243

Give -loop-vectorize an explicit default. · 946c2b32

Andrew Trick authored Mar 06, 2013

This way, clang -mllvm -print-options shows that the driver is overriding it.

llvm-svn: 176569

946c2b32

Jan 29, 2013

Unroll again after running BBVectorize · bf4db4fe

Hal Finkel authored Jan 29, 2013

Because BBVectorize may significantly shorten a loop body, unroll
again after vectorization. This is especially important when using
runtime or partial unrolling.

llvm-svn: 173730

bf4db4fe

Jan 07, 2013
- Remove the long defunct 'DefaultPasses' header. We have a pass manager · 683ff2d7
  Chandler Carruth authored Jan 07, 2013
```
builder these days, and this thing hasn't seen updates for a very long
time.

llvm-svn: 171741
```
  683ff2d7
Jan 04, 2013

Move the loop vectorizer from O2 to O3. It looks like the increase in code size actually hurts the performance on many programs. · be6570d4

Nadav Rotem authored Jan 04, 2013

llvm-svn: 171471

be6570d4

Dec 21, 2012
- Remove duplicate includes. · a229186a
  Roman Divacky authored Dec 21, 2012
```
llvm-svn: 170902
```
  a229186a
Dec 19, 2012
- Enable the loop vectorizer in clang and not in the pass manager, so that we can disable it in clang. · 9aee065e
  Nadav Rotem authored Dec 18, 2012
```
llvm-svn: 170470
```
  9aee065e
Dec 18, 2012
- Enable the loop vectorizer. · c0699854
  Nadav Rotem authored Dec 18, 2012
```
llvm-svn: 170416
```
  c0699854
Dec 15, 2012
- Revert r170246, "Enable the loop vectorizer by default." · 8f45b6c7
  NAKAMURA Takumi authored Dec 15, 2012
```
llvm-svn: 170267
```
  8f45b6c7
Dec 14, 2012
- Enable the loop vectorizer by default. · acde7748
  Nadav Rotem authored Dec 14, 2012
```
llvm-svn: 170246
```
  acde7748
- revert r170166 - disable the loop vectorizer. · d3a3c9fd
  Nadav Rotem authored Dec 14, 2012
```
llvm-svn: 170172
```
  d3a3c9fd
- Enable the loop vectorizer. · 3b606d6f
  Nadav Rotem authored Dec 14, 2012
```
llvm-svn: 170166
```
  3b606d6f
- Disable the loop vectorizer. · b4ea4b37
  Nadav Rotem authored Dec 14, 2012
```
llvm-svn: 170162
```
  b4ea4b37
- Enable the Loop Vectorizer by default for O2 and O3. Disable if-conversion by default. I plan to revert this patch later today. · e5e28b48
  Nadav Rotem authored Dec 13, 2012
```
llvm-svn: 170157
```
  e5e28b48
Dec 12, 2012

LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size. · d0bb22bb

Nadav Rotem authored Dec 12, 2012

llvm-svn: 170004

d0bb22bb

LoopVectorizer: When -Os is used, vectorize only loops that don't require a tail loop. There is no testcase because I don't know of a way to initialize the loop vectorizer pass without adding an additional hidden flag. · aeb17df8

Nadav Rotem authored Dec 12, 2012

llvm-svn: 169950

aeb17df8
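
The -Os restriction described in aeb17df8 can be illustrated with a minimal sketch. It assumes a loop whose trip count is known at compile time and a fixed vectorization factor; the function names below are hypothetical and are not taken from the LoopVectorizer sources.

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM's API: a loop with a known trip count needs a
// scalar tail loop when the count is not a multiple of the vectorization
// factor (the leftover iterations cannot fill a whole vector).
bool needsTailLoop(std::uint64_t tripCount, unsigned vf) {
  return vf != 0 && tripCount % vf != 0;
}

// Hypothetical policy mirroring the commit message: when optimizing for
// size, vectorize only if no tail loop would have to be emitted.
bool shouldVectorize(std::uint64_t tripCount, unsigned vf, bool optForSize) {
  if (vf <= 1)
    return false; // a factor of 0 or 1 means there is nothing to vectorize
  if (optForSize && needsTailLoop(tripCount, vf))
    return false; // the tail loop would increase code size
  return true;
}
```

With a factor of 4, a 10-iteration loop leaves 2 iterations for a tail loop, so this sketch would reject it under -Os but accept an 8-iteration loop.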

Dec 10, 2012
- Enable the loop vectorizer only on O2 and above. (Still disabled by default) · 36cdd826
  Nadav Rotem authored Dec 10, 2012
```
llvm-svn: 169774
```
  36cdd826
Dec 03, 2012

Use the new script to sort the includes of every file under lib. · ed0881b2

Chandler Carruth authored Dec 03, 2012

Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131

ed0881b2

Nov 29, 2012
- No need to run LICM after loop vectorization because we don't generate invariant code any more. · ec739205
  Nadav Rotem authored Nov 29, 2012
```
llvm-svn: 168928
```
  ec739205
Nov 15, 2012
- Use empty parens for empty function parameter list instead of '(void)'. · 0011bbf9
  Dmitri Gribenko authored Nov 15, 2012
```
llvm-svn: 168049
```
  0011bbf9
Oct 30, 2012
- 80-col · d3df6651
  Nadav Rotem authored Oct 30, 2012
```
llvm-svn: 167036
```
  d3df6651
Oct 29, 2012
- Rename the BB-vectorize flag to match the dragonegg name · 39aab03b
  Nadav Rotem authored Oct 29, 2012
```
llvm-svn: 166948
```
  39aab03b
- Change the PassManagerBuilder (used by -O3) loop vectorizer flag from -vectorize to -vectorize-loops because we don't want to share the same flag as the bb-vectorizer. · c59ae207
  Nadav Rotem authored Oct 29, 2012
```
llvm-svn: 166937
```
  c59ae207
Oct 26, 2012

Change the internalize pass to internalize all symbols when given an empty · 4253bd8f

Rafael Espindola authored Oct 26, 2012

list of externals. This makes sense since a shared library with no symbols
can still be useful if it has static constructors.

llvm-svn: 166795

4253bd8f

Oct 25, 2012
- revert accidental change · 086ea5c1
  Nadav Rotem authored Oct 24, 2012
```
llvm-svn: 166643
```
  086ea5c1
- Implement a basic cost model for vector and scalar instructions. · 4a87683a
  Nadav Rotem authored Oct 24, 2012
```
llvm-svn: 166642
```
  4a87683a
Oct 18, 2012

Introduce a BarrierNoop pass, a hack designed to allow *some* control · e8479e15

Chandler Carruth authored Oct 18, 2012

over the implicitly-formed-and-nesting CGSCC pass manager and function
pass managers, especially when using them on the opt commandline or
using extension points in the module builder. The '-barrier' opt flag
(or the pass itself) will create a no-op module pass in the pipeline,
resetting the pass manager stack, and allowing a new pipeline of
function passes or CGSCC passes to be created that is independent
from any previous pipelines.

For example, this can be used to test running two CGSCC passes in
independent CGSCC pass managers as opposed to in the same CGSCC pass
manager. It also allows us to introduce a further hack into the
PassManagerBuilder to separate the O0 pipeline extension passes from the
always-inliner's CGSCC pass manager, which they likely do not want to
participate in... At the very least none of the Sanitizer passes want
this behavior.

This fixes a bug with ASan at O0 currently, and I'll commit the ASan
test which covers this pass. I'm happy to add a test case that this pass
exists and works, but not sure how much time folks would like me to
spend adding test cases for the details of how it partitions
pass managers.... The whole thing is just vile, and mostly intended to
unblock ASan, so I'm hoping to rip this all out in a brave new pass
manager world.

llvm-svn: 166172

e8479e15

Oct 17, 2012
- Add a loop vectorizer. · 6b94c2a0
  Nadav Rotem authored Oct 17, 2012
```
llvm-svn: 166112
```
  6b94c2a0
Oct 02, 2012
- Turn the new SROA pass back on. Let's see if it sticks this time. =] · 4e435993
  Chandler Carruth authored Oct 02, 2012
```
Again, let me know if anything breaks due to this!

llvm-svn: 164986
```
  4e435993
Sep 28, 2012
- GlobalDCE should be run at -O2 / -Os to eliminate unused dtor, etc. rdar://9142819 · 8c6b06d4
  Evan Cheng authored Sep 28, 2012
```
llvm-svn: 164850
```
  8c6b06d4
Sep 27, 2012
- Disable the new SROA pass to get the tree back in working order. We don't yet · 2e646236
  Nick Lewycky authored Sep 26, 2012
```
have testcases for the current problems.

llvm-svn: 164731
```
  2e646236
Sep 24, 2012
- Enable the new SROA pass by default. · 8232bf53
  Chandler Carruth authored Sep 24, 2012
```
Cue the fallout. ;]

llvm-svn: 164480
```
  8232bf53
Sep 18, 2012

LNT builders have picked up new SROA, disable it to get the remaining builders green again. · 9bc3efc8

Benjamin Kramer authored Sep 18, 2012

llvm-svn: 164124

9bc3efc8

Add a major missing piece to the new SROA pass: aggressive splitting of · 42cb9cb1

Chandler Carruth authored Sep 18, 2012

FCAs. This is essential in order to promote allocas that are used in
struct returns by frontends like Clang. The FCA load would block the
rest of the pass from firing, resulting in significant regressions with
the bullet benchmark in the nightly test suite.

Thanks to Duncan for repeated discussions about how best to do this, and
to both him and Benjamin for review.

This appears to have blocked many places where the pass tries to fire,
and so I expect somewhat different results with this fix added.

As with the last big patch, I'm including a change to enable the SROA by
default *temporarily*. Ben is going to remove this as soon as the LNT
bots pick up the patch. I'm just trying to get a round of LNT numbers
from the stable machines in the lab.

NOTE: Four clang tests are expected to fail in the brief window where
this is enabled. Sorry for the noise!

llvm-svn: 164119

42cb9cb1

Sep 15, 2012

Disable new sroa now that all buildbots have tested it. · ed11e35e

Benjamin Kramer authored Sep 15, 2012

What we have so far:
- Some clang test failures (these were known already)

- Perf results are mixed, some big regressions
  http://llvm.org/perf/db_default/v4/nts/3844
  http://llvm.org/perf/db_default/v4/nts/3845

  bullet suffers a lot. matmul is interesting: slower scalar code, faster with -vectorize.

- Some dragonegg selfhost bots crash in SROA during selfhost now
  http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.6-self-host-checks/builds/1632
  http://lab.llvm.org:8011/builders/dragonegg-x86_64-linux-gcc-4.5-self-host/builds/1891

llvm-svn: 163968

ed11e35e

Port the SSAUpdater-based promotion logic from the old SROA pass to the · 70b44c5c

Chandler Carruth authored Sep 15, 2012

new one, and add support for running the new pass in that mode and in
that slot of the pass manager. With this the new pass can completely
replace the old one within the pipeline.

The strategy for enabling or disabling the SSAUpdater logic is to do it
by making the requirement of the domtree analysis optional. By default,
it is required and we get the standard mem2reg approach. This is usually
the desired strategy when run in stand-alone situations. Within the
CGSCC pass manager, we disable requiring of the domtree analysis and
consequently trigger fallback to the SSAUpdater promotion.

In theory this would allow the pass to re-use a domtree if one happened
to be available even when run in a mode that doesn't require it. In
practice, it lets us have a single pass rather than two which was
simpler for me to wrap my head around.

There is a hidden flag to force the use of the SSAUpdater code path for
the purpose of testing. The primary testing strategy is just to run the
existing tests through that path. One notable difference is that it has
custom code to handle lifetime markers, and one of the tests has been
enhanced to exercise that code.

This has survived a bootstrap and the test suite without serious
correctness issues, however my run of the test suite produced *very*
alarming performance numbers. I don't entirely understand or trust them
though, so more investigation is on-going.

To aid my understanding of the performance impact of the new SROA now
that it runs throughout the optimization pipeline, I'm enabling it by
default in this commit, and will disable it again once the LNT bots have
picked up one iteration with it. I want to get those bots (which are
much more stable) to evaluate the impact of the change before I jump to
any conclusions.

NOTE: Several Clang tests will fail because they run -O3 and check the
result's order of output. They'll go back to passing once I disable it
again.

llvm-svn: 163965

70b44c5c