Commits · 6ccda923e594bff47aade5a2342f01e5d2dbc661 · Roger Ferrer / llvm-epi-0.8

Feb 24, 2014

LTO: Add the loop vectorizer to the LTO pipeline. · 6ccda923

Arnold Schwaighofer authored Feb 24, 2014

During the LTO phase, LICM will move loop-invariant global variables out of
loops (informed by GlobalModRef). This makes more loops countable, presenting
opportunities for the loop vectorizer.
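
A minimal hand-written sketch of the effect described in this message, not compiler output. The names `bound`, `fillBefore`, `fillAfter`, and `sameResult` are hypothetical. In `fillBefore`, a store through `a` could in principle modify the global `bound`, so the bound must be re-read each iteration unless the compiler can prove otherwise. `fillAfter` shows the shape of the loop once the load has been hoisted, which is the form where the trip count is known on entry.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical global used as a loop bound; not taken from the commit.
int bound = 8;

// Before hoisting: `bound` is re-read on every iteration, because a store
// through `a` might alias it as far as the compiler can tell.
void fillBefore(int *a) {
  for (int i = 0; i < bound; ++i)
    a[i] = i;
}

// After hoisting (done by hand here): the bound is loaded once, so the trip
// count is fixed on loop entry and the loop is countable.
void fillAfter(int *a) {
  const int n = bound;
  for (int i = 0; i < n; ++i)
    a[i] = i;
}

// Returns true when both versions write the same values into fresh buffers.
bool sameResult() {
  int x[8] = {0};
  int y[8] = {0};
  fillBefore(x);
  fillAfter(y);
  for (std::size_t i = 0; i < 8; ++i)
    if (x[i] != y[i])
      return false;
  return true;
}
```

The two functions agree only because nothing in the loop actually writes to `bound`; establishing that fact is the job the message assigns to GlobalModRef.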

Adding the loop vectorizer improves some TSVC benchmarks and the twolf/ref
dataset (5%) on x86-64.

radar://15970632

llvm-svn: 202051

6ccda923

Jan 13, 2014

[cleanup] Move the Dominators.h and Verifier.h headers into the IR · 5ad5f15c

Chandler Carruth authored Jan 13, 2014

directory. These passes are already defined in the IR library, and it
doesn't make any sense to have the headers in Analysis.

Long term, I think there is going to be a much better way to divide
these matters. The dominators code should be fully separated into the
abstract graph algorithm and have that put in Support where it becomes
obvious that even Clang's CFGBlocks can use it. Then the verifier can
manually construct dominance information from the Support-driven
interface while the Analysis library can provide a pass which
caches, reconstructs, and supports a nice update API.

But those are very long term, and so I don't want to leave the really
confusing structure until that day arrives.

llvm-svn: 199082

5ad5f15c

Dec 05, 2013

Add #pragma vectorize enable/disable to LLVM · 729a3ae9

Renato Golin authored Dec 05, 2013

The intended behaviour is to force vectorization on the presence
of the flag (either turn on or off), and to continue the behaviour
as expected in its absence. Tests were added to make sure all the
cases are covered in opt. No tests were added in other tools with
the assumption that they should use the PassManagerBuilder in the
same way.
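
The three-way behaviour described above (force on, force off, or fall back to the default) can be sketched as follows. This is an illustrative model only: `ForceHint` and `shouldVectorize` are hypothetical names, not the identifiers used in the patch.

```cpp
#include <cassert>

// Hypothetical three-state hint; the names are illustrative only.
enum class ForceHint { Unset, Disable, Enable };

// Returns whether a loop should be considered for vectorization.
// An explicit hint wins in either direction; without one, the
// pipeline default is kept unchanged.
bool shouldVectorize(ForceHint hint, bool enabledByDefault) {
  switch (hint) {
  case ForceHint::Enable:
    return true;
  case ForceHint::Disable:
    return false;
  case ForceHint::Unset:
    break;
  }
  return enabledByDefault;
}
```

For example, `shouldVectorize(ForceHint::Disable, true)` yields `false` even though the default is on, while `ForceHint::Unset` simply returns the default.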

This patch also removes the outdated -late-vectorize flag, which was
on by default and not helping much.

The pragma metadata is being attached to the same place as other loop
metadata, but nothing forbids one from attaching it to a function
(to enable #pragma optimize) or basic blocks (to hint the basic-block
vectorizers), etc. The logic should be the same all around.

Patches to Clang to produce the metadata will be produced after the
initial implementation is agreed upon and committed. Patches to other
vectorizers (such as SLP and BB) will be added once we're happy with
the pass manager changes.

llvm-svn: 196537

729a3ae9

Nov 17, 2013

Add a loop rerolling flag to the PassManagerBuilder · 29aeb205

Hal Finkel authored Nov 17, 2013

This adds a boolean member variable to the PassManagerBuilder to control loop
rerolling (just like we have for unrolling and the various vectorization
options). This is necessary for control by the frontend. Loop rerolling remains
disabled by default at all optimization levels.

llvm-svn: 194966

29aeb205

Add a loop rerolling pass · bf45efde

Hal Finkel authored Nov 16, 2013

This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The
transformation aims to take loops like this:

for (int i = 0; i < 3200; i += 5) {
  a[i]     += alpha * b[i];
  a[i + 1] += alpha * b[i + 1];
  a[i + 2] += alpha * b[i + 2];
  a[i + 3] += alpha * b[i + 3];
  a[i + 4] += alpha * b[i + 4];
}

and turn them into this:

for (int i = 0; i < 3200; ++i) {
  a[i] += alpha * b[i];
}

and loops like this:

for (int i = 0; i < 500; ++i) {
  x[3*i] = foo(0);
  x[3*i+1] = foo(0);
  x[3*i+2] = foo(0);
}

and turn them into this:

for (int i = 0; i < 1500; ++i) {
  x[i] = foo(0);
}
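
As a quick sanity check of the first pair of loops above, the following sketch runs both forms on the same data and compares the results. The function names and the test data are assumptions made for illustration; the pass itself works on LLVM IR, not on source code like this.

```cpp
#include <cassert>
#include <vector>

// Unrolled-by-five form, mirroring the first example in the message.
void axpyUnrolled(std::vector<double> &a, const std::vector<double> &b,
                  double alpha) {
  for (int i = 0; i < 3200; i += 5) {
    a[i] += alpha * b[i];
    a[i + 1] += alpha * b[i + 1];
    a[i + 2] += alpha * b[i + 2];
    a[i + 3] += alpha * b[i + 3];
    a[i + 4] += alpha * b[i + 4];
  }
}

// Rerolled form: one element per iteration, same total work.
void axpyRerolled(std::vector<double> &a, const std::vector<double> &b,
                  double alpha) {
  for (int i = 0; i < 3200; ++i)
    a[i] += alpha * b[i];
}

// Runs both forms on identical inputs and reports whether they agree.
bool rerollPreservesResult() {
  std::vector<double> b(3200), a1(3200), a2(3200);
  for (int i = 0; i < 3200; ++i) {
    b[i] = i * 0.5; // values chosen so every product is exact
    a1[i] = a2[i] = i;
  }
  axpyUnrolled(a1, b, 2.0);
  axpyRerolled(a2, b, 2.0);
  return a1 == a2;
}
```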

There are two motivations for this transformation:

  1. Code-size reduction (especially relevant, obviously, when compiling for
code size).

  2. Providing greater choice to the loop vectorizer (and generic unroller) to
choose the unrolling factor (and a better ability to vectorize). The loop
vectorizer can take vector lengths and register pressure into account when
choosing an unrolling factor, for example, and a pre-unrolled loop limits that
choice. This is especially problematic if the manual unrolling was optimized
for a machine different from the current target.

The current implementation is limited to single basic-block loops only. The
rerolling recognition should work regardless of how the loop iterations are
intermixed within the loop body (subject to dependency and side-effect
constraints), but the significant restriction is that the order of the
instructions in each iteration must be identical. This seems sufficient to
capture all current use cases.

This pass is not currently enabled by default at any optimization level.

llvm-svn: 194939

bf45efde

Oct 31, 2013

Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list". · 282a4703

Rafael Espindola authored Oct 31, 2013

There are two ways one could implement hiding of linkonce_odr symbols in LTO:
* LLVM tells the linker which symbols can be hidden if not used from native
  files.
* The linker tells LLVM which symbols are not used from other object files,
  but will be put in the dso symbol table if present.

GOLD's API is the second option. It was implemented almost 1:1 in llvm by
passing the list down to internalize.

LLVM already had partial support for the first option. It is also very similar
to how ld64 handles hiding these symbols when *not* doing LTO.

This patch then
* removes the APIs for the DSO list.
* marks as LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN all linkonce_odr unnamed_addr
  global values and other linkonce_odr values whose address is not used.
* makes the gold plugin responsible for handling the API mismatch.
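
The marking rule in the list above can be read as a small predicate. The sketch below is a simplified model under that reading: `GlobalInfo` and `canBeHidden` are hypothetical and do not correspond to LLVM types or functions.

```cpp
#include <cassert>

// Hypothetical, simplified view of a global value; not an LLVM type.
struct GlobalInfo {
  bool isLinkOnceODR;
  bool hasUnnamedAddr;
  bool addressIsUsed;
};

// Sketch of the rule: a linkonce_odr value may be reported as
// "default, can be hidden" when it is unnamed_addr, or when nothing
// uses its address. Anything that is not linkonce_odr never qualifies.
bool canBeHidden(const GlobalInfo &g) {
  if (!g.isLinkOnceODR)
    return false;
  return g.hasUnnamedAddr || !g.addressIsUsed;
}
```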

llvm-svn: 193800

282a4703

Oct 19, 2013

Mark some command line flags as hidden · 7f27e0b0

Nadav Rotem authored Oct 18, 2013

llvm-svn: 193013

7f27e0b0

Oct 03, 2013

Optimize linkonce_odr unnamed_addr functions during LTO. · cda2911c

Rafael Espindola authored Oct 03, 2013

Generalize the API so we can distinguish symbols that are needed just for a DSO
symbol table from those that are used from some native .o.

The symbols that are only wanted for the dso symbol table can be dropped if
llvm can prove every other dso has a copy (linkonce_odr) and the address is not
important (unnamed_addr).

llvm-svn: 191922

cda2911c

Sep 03, 2013

Enable late-vectorization by default. · 5d78dba6

Nadav Rotem authored Sep 03, 2013

This patch changes the default setting for the LateVectorization flag that controls where the loop-vectorizer is run.

Perf gains:
SingleSource/Benchmarks/Shootout/matrix -37.33%
MultiSource/Benchmarks/PAQ8p/paq8p  -22.83%
SingleSource/Benchmarks/Linpack/linpack-pc  -16.22%
SingleSource/Benchmarks/Shootout-C++/ary3 -15.16%
MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt -10.34%
MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl -7.12%

Regressions:
SingleSource/Benchmarks/Misc/lowercase  15.10%
MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt 13.18%
SingleSource/Benchmarks/Shootout-C++/matrix 8.27%
SingleSource/Benchmarks/CoyoteBench/lpbench 7.30%

llvm-svn: 189858

5d78dba6

Aug 30, 2013

Random cleanup: No need to use a std::vector here, since createInternalizePass uses an ArrayRef. · 4c0d9ade

Bill Wendling authored Aug 30, 2013

llvm-svn: 189632

4c0d9ade

Aug 29, 2013

Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC... · 4c459bcd

Nadav Rotem authored Aug 28, 2013

Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons:
1. They are a kind of canonicalization.
2. The performance measurements show that it is better to keep them in.

There should be no functional change if you are not enabling the LateVectorization mode.

llvm-svn: 189539

4c459bcd

Aug 28, 2013

Disable unrolling in the loop vectorizer when disabled in the pass manager · 6d09904c

Hal Finkel authored Aug 28, 2013

When unrolling is disabled in the pass manager, the loop vectorizer should also
not unroll loops. This will allow the -fno-unroll-loops option in Clang to
behave as expected (even for vectorizable loops). The loop vectorizer's
-force-vector-unroll option will (continue to) override the pass-manager
setting (including -force-vector-unroll=0 to force use of the internal
auto-selection logic).
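
The precedence described above can be sketched as a small decision function. This models the behaviour as the message states it; the actual pass may structure the check differently. `ForcedUnroll` and `chooseUnrollFactor` are hypothetical names, and `autoSelected` stands in for whatever the vectorizer's internal heuristic would pick.

```cpp
#include <cassert>

// Hypothetical stand-in for a command-line option: `given` records whether
// the flag appeared at all, `value` holds what was passed.
struct ForcedUnroll {
  bool given;
  unsigned value;
};

// Picks the unroll factor used by the vectorizer in this model.
// 1. An explicit force flag always wins; a value of 0 means
//    "use the internal auto-selection".
// 2. Otherwise, if the pass manager disabled unrolling, do not unroll
//    (factor 1).
// 3. Otherwise, use the auto-selected factor.
unsigned chooseUnrollFactor(bool unrollDisabledInPassManager,
                            ForcedUnroll forced, unsigned autoSelected) {
  if (forced.given)
    return forced.value == 0 ? autoSelected : forced.value;
  if (unrollDisabledInPassManager)
    return 1;
  return autoSelected;
}
```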

In order to test this, I added a flag to opt (-disable-loop-unrolling) to force
disable unrolling through opt (the analog of -fno-unroll-loops in Clang). Also,
this fixes a small bug in opt where the loop vectorizer was enabled only after
the pass manager populated the queue of passes (the global_alias.ll test needed
a slight update to the RUN line as a result of this fix).

llvm-svn: 189499

6d09904c

Aug 13, 2013

Also remove logic in LateVectorize · 124ccf3a

Arnold Schwaighofer authored Aug 13, 2013

llvm-svn: 188285

124ccf3a

Remove logic that decides whether to vectorize or not depending on O-levels · c14b59d1

Arnold Schwaighofer authored Aug 13, 2013

I have moved this logic into clang and opt.

llvm-svn: 188281

c14b59d1

Aug 06, 2013

Factor FlattenCFG out from SimplifyCFG · aa664d9b

Tom Stellard authored Aug 06, 2013

Patch by: Mei Ye

llvm-svn: 187764

aa664d9b

Aug 02, 2013

Move the optlevel check to the frontend. · e4e6e9ed

Nadav Rotem authored Aug 01, 2013

llvm-svn: 187628

e4e6e9ed

Aug 01, 2013

Only enable SLP-vectorization on O3 builds. · 9153b387

Nadav Rotem authored Aug 01, 2013

llvm-svn: 187595

9153b387

Jul 27, 2013

SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions · 8b1e021e

Tom Stellard authored Jul 27, 2013

Merge consecutive if-regions if they contain identical statements.
Both transformations reduce the number of branches. The transformation
is guarded by a target-hook, and is currently enabled only for +R600,
but the correctness has been tested on the X86 target using a variety of
CPU benchmarks.
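
One way to picture the branch consolidation is at the source level, as sketched below. This is an assumption about what "parallel-and" and "parallel-or" mean here: both conditions are evaluated and combined with a bitwise operator, so two branches become one. That is only valid when the conditions have no side effects. The function names are hypothetical, and the real transformation operates on the control-flow graph, not on C++ source.

```cpp
#include <cassert>

// Two nested branches: the body runs only when both conditions hold.
int nestedBefore(bool a, bool b) {
  if (a) {
    if (b)
      return 1;
  }
  return 0;
}

// Parallel-and form: a single branch on the combined condition.
int nestedAfter(bool a, bool b) {
  if (a & b)
    return 1;
  return 0;
}

// Two consecutive branches leading to the same statement.
int chainBefore(bool a, bool b) {
  if (a)
    return 1;
  if (b)
    return 1;
  return 0;
}

// Parallel-or form: again a single branch.
int chainAfter(bool a, bool b) {
  if (a | b)
    return 1;
  return 0;
}
```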

Patch by: Mei Ye

llvm-svn: 187278

8b1e021e

Jun 24, 2013

Add a flag to defer vectorization into a phase after the inliner and its · 08e1b874

Chandler Carruth authored Jun 24, 2013

CGSCC pass manager. This should insulate the inlining decisions from the
vectorization decisions, however it may have both compile time and code
size problems so it is just an experimental option right now.

I am adding this based on a discussion with Arnold; it seems at least
worth having this flag so that we can both run some experiments to see if
this strategy is workable. It may solve some of the regressions seen
with the loop vectorizer.

llvm-svn: 184698

08e1b874

Jun 20, 2013

Remove the simplify-libcalls pass (finally) · dfb08a2c

Meador Inge authored Jun 20, 2013

This commit completely removes what is left of the simplify-libcalls
pass.  All of the functionality has now been migrated to the instcombine
and functionattrs passes.  The following C API functions are now NOPs:

  1. LLVMAddSimplifyLibCallsPass
  2. LLVMPassManagerBuilderSetDisableSimplifyLibCalls

llvm-svn: 184459

dfb08a2c

Jun 17, 2013

Disable vectorization for -Oz. · cde24ef3

Nadav Rotem authored Jun 17, 2013

llvm-svn: 184089

cde24ef3

Enable the loop vectorizer by default for -Os and -O2. · 7dd8210b

Nadav Rotem authored Jun 17, 2013

llvm-svn: 184084

7dd8210b

Jun 07, 2013

Jeffrey Yasskin volunteered to benchmark the vectorizer on -O2 or -Os when... · 99e529ea

Nadav Rotem authored Jun 06, 2013

Jeffrey Yasskin volunteered to benchmark the vectorizer on -O2 or -Os when compiling Chrome. This patch adds a new flag to enable vectorization on all levels and not only on -O3. It should go away once we make a decision.

llvm-svn: 183456

99e529ea

May 01, 2013

This patch breaks up Wrap.h so that it does not have to include all of · dec20e43

Filip Pizlo authored May 01, 2013

the things, and renames it to CBindingWrapping.h.  I also moved 
CBindingWrapping.h into Support/.

This new file just contains the macros for defining different wrap/unwrap 
methods.

The calls to those macros, as well as any custom wrap/unwrap definitions 
(like for array of Values for example), are put into corresponding C++ 
headers.

Doing this required some #include surgery, since some .cpp files relied 
on the fact that including Wrap.h implicitly caused the inclusion of a 
bunch of other things.

This also now means that the C++ headers will include their corresponding 
C API headers; for example Value.h must include llvm-c/Core.h.  I think 
this is harmless, since the C API headers contain just external function 
declarations and some C types, so I don't believe there should be any 
nasty dependency issues here.
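
To illustrate the split described in this message, here is a minimal sketch of a macro that generates a wrap/unwrap pair, invoked next to the C++ class it serves. Everything in it is hypothetical (`DEMO_DEFINE_CONVERSIONS`, `DemoValue`, `DemoValueRef`, `DemoGetValueId`); it shows the general pattern, not the contents of CBindingWrapping.h.

```cpp
#include <cassert>

// Opaque handle as a C API header might declare it (hypothetical names).
typedef struct DemoOpaqueValue *DemoValueRef;

// C++ class hidden behind the handle.
class DemoValue {
public:
  explicit DemoValue(int id) : Id(id) {}
  int getId() const { return Id; }

private:
  int Id;
};

// Hypothetical macro: generates the conversions between one C++ type and
// its opaque C handle. Only the macro would live in the shared header.
#define DEMO_DEFINE_CONVERSIONS(Ty, Ref)                                       \
  inline Ty *unwrap(Ref p) { return reinterpret_cast<Ty *>(p); }               \
  inline Ref wrap(const Ty *p) {                                               \
    return reinterpret_cast<Ref>(const_cast<Ty *>(p));                         \
  }

// The macro call sits with the C++ class, as the message describes.
DEMO_DEFINE_CONVERSIONS(DemoValue, DemoValueRef)

// C-callable entry point implemented in terms of the C++ class.
extern "C" int DemoGetValueId(DemoValueRef v) { return unwrap(v)->getId(); }
```

A caller holding a `DemoValue v(7)` can pass `wrap(&v)` to `DemoGetValueId` and get the id back without the C side ever seeing the class definition.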

llvm-svn: 180881

dec20e43

Apr 23, 2013

Move C++ code out of the C headers and into either C++ headers · 04d4e931

Eric Christopher authored Apr 22, 2013

or the C++ files themselves. This enables people to use
just a C compiler to interoperate with LLVM.

llvm-svn: 180063

04d4e931

Apr 16, 2013

SLPVectorizer: Make it a function pass and add code for hoisting the... · b9116e69

Nadav Rotem authored Apr 15, 2013

SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops.

llvm-svn: 179562

b9116e69

Apr 15, 2013

Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make... · d4dcc003

Nadav Rotem authored Apr 15, 2013

Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make -fslp-vectorize run the slp-vectorizer.

llvm-svn: 179508

d4dcc003

Rename the slp-vectorizer clang/llvm flags. No functionality change. · a1e5e44e

Nadav Rotem authored Apr 15, 2013

llvm-svn: 179505

a1e5e44e

Mar 10, 2013

Use LLVMBool instead of 'bool' in the C API. Based on a patch by Peter Zotov! · 5f508541

Nick Lewycky authored Mar 10, 2013

llvm-svn: 176793

5f508541

Mar 06, 2013

Generalize my previous fix for -print-options. · fcb37243

Andrew Trick authored Mar 06, 2013

Always print options that differ from their implicit default. At least
for simple option types.

llvm-svn: 176572

fcb37243

Give -loop-vectorize an explicit default. · 946c2b32

Andrew Trick authored Mar 06, 2013

This way, clang -mllvm -print-options shows that the driver is overriding it.

llvm-svn: 176569

946c2b32

Jan 29, 2013

Unroll again after running BBVectorize · bf4db4fe

Hal Finkel authored Jan 29, 2013

Because BBVectorize may significantly shorten a loop body, unroll
again after vectorization. This is especially important when using
runtime or partial unrolling.

llvm-svn: 173730

bf4db4fe

Jan 07, 2013

Remove the long defunct 'DefaultPasses' header. We have a pass manager · 683ff2d7

Chandler Carruth authored Jan 07, 2013

builder these days, and this thing hasn't seen updates for a very long
time.

llvm-svn: 171741

683ff2d7

Jan 04, 2013

Move the loop vectorizer from O2 to O3. It looks like the increase in code... · be6570d4

Nadav Rotem authored Jan 04, 2013

Move the loop vectorizer from O2 to O3. It looks like the increase in code size actually hurts the performance on many programs.

llvm-svn: 171471

be6570d4

Dec 21, 2012

Remove duplicate includes. · a229186a

Roman Divacky authored Dec 21, 2012

llvm-svn: 170902

a229186a

Dec 19, 2012

Enable the loop vectorizer in clang and not in the pass manager, so that we... · 9aee065e

Nadav Rotem authored Dec 18, 2012

Enable the loop vectorizer in clang and not in the pass manager, so that we can disable it in clang.

llvm-svn: 170470

9aee065e

Dec 18, 2012

Enable the loop vectorizer. · c0699854

Nadav Rotem authored Dec 18, 2012

llvm-svn: 170416

c0699854

Dec 15, 2012

Revert r170246, "Enable the loop vectorizer by default." · 8f45b6c7

NAKAMURA Takumi authored Dec 15, 2012

llvm-svn: 170267

8f45b6c7

Dec 14, 2012

Enable the loop vectorizer by default. · acde7748

Nadav Rotem authored Dec 14, 2012

llvm-svn: 170246

acde7748

revert r170166 - disable the loop vectorizer. · d3a3c9fd

Nadav Rotem authored Dec 14, 2012

llvm-svn: 170172

d3a3c9fd