Commits · bf45efde2dda05cd20f3bb6b6037fc34d427df61 · Roger Ferrer / llvm-epi-0.8

Nov 17, 2013

Hal Finkel authored Nov 16, 2013

This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The
transformation aims to take loops like this:

for (int i = 0; i < 3200; i += 5) {
  a[i]     += alpha * b[i];
  a[i + 1] += alpha * b[i + 1];
  a[i + 2] += alpha * b[i + 2];
  a[i + 3] += alpha * b[i + 3];
  a[i + 4] += alpha * b[i + 4];
}

and turn them into this:

for (int i = 0; i < 3200; ++i) {
  a[i] += alpha * b[i];
}

and loops like this:

for (int i = 0; i < 500; ++i) {
  x[3*i] = foo(0);
  x[3*i+1] = foo(0);
  x[3*i+2] = foo(0);
}

and turn them into this:

for (int i = 0; i < 1500; ++i) {
  x[i] = foo(0);
}

There are two motivations for this transformation:

  1. Code-size reduction (especially relevant, obviously, when compiling for
code size).

  2. Providing greater choice to the loop vectorizer (and generic unroller) to
choose the unrolling factor (and a better ability to vectorize). The loop
vectorizer can take vector lengths and register pressure into account when
choosing an unrolling factor, for example, and a pre-unrolled loop limits that
choice. This is especially problematic if the manual unrolling was optimized
for a machine different from the current target.

The current implementation is limited to single basic-block loops only. The
rerolling recognition should work regardless of how the loop iterations are
intermixed within the loop body (subject to dependency and side-effect
constraints), but the significant restriction is that the order of the
instructions in each iteration must be identical. This seems sufficient to
capture all current use cases.
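
As an illustrative check (not part of the commit), the first pair of loops above can be written as two C++ functions and compared. The function names are made up for this sketch; the trip count of 3200 and the unroll factor of 5 come from the example, and integer data is used so the comparison is exact.

```cpp
#include <cassert>
#include <vector>

// Hand-unrolled form: five statements per iteration, stride 5.
void axpyUnrolled(std::vector<int> &a, const std::vector<int> &b, int alpha) {
  for (int i = 0; i < 3200; i += 5) {
    a[i]     += alpha * b[i];
    a[i + 1] += alpha * b[i + 1];
    a[i + 2] += alpha * b[i + 2];
    a[i + 3] += alpha * b[i + 3];
    a[i + 4] += alpha * b[i + 4];
  }
}

// Rerolled form: one statement per iteration, stride 1.
void axpyRerolled(std::vector<int> &a, const std::vector<int> &b, int alpha) {
  for (int i = 0; i < 3200; ++i) {
    a[i] += alpha * b[i];
  }
}

// Returns true when both forms leave the same contents in the array.
bool rerollingPreservesResult() {
  std::vector<int> b(3200), a1(3200), a2(3200);
  for (int i = 0; i < 3200; ++i) {
    b[i] = i % 7;
    a1[i] = a2[i] = i;
  }
  axpyUnrolled(a1, b, 3);
  axpyRerolled(a2, b, 3);
  return a1 == a2;
}
```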

This pass is not currently enabled by default at any optimization level.

llvm-svn: 194939

bf45efde

Nov 15, 2013

ArgumentPromotion: correctly transfer TBAA tags and alignments. · bc37658a

Manman Ren authored Nov 15, 2013

We used to use std::map<IndicesVector, LoadInst*> for OriginalLoads. When we
tried to promote two arguments, both wrote to the same OriginalLoads entries, so
the loads created for the two arguments shared one original load and therefore
received the same TBAA tag and alignment.

The fix is to use std::map<std::pair<Argument*, IndicesVector>, LoadInst*>
for OriginalLoads, so each Argument will write to different parts of the map.
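
A minimal sketch of the key collision, using stand-in types rather than the real LLVM classes. The `Argument` and `LoadInst` structs below are placeholders, and the "last write wins" behaviour is an assumption about how the collision showed up, not a quote from the patch.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Stand-ins for the LLVM classes; only identity and one field matter here.
struct Argument {};
struct LoadInst { unsigned Alignment; };

typedef std::vector<unsigned long long> IndicesVector;

// Old scheme: keyed by indices only, so two arguments that are loaded
// through the same indices share a single map slot.
unsigned alignmentSeenWithOldKey(LoadInst *loadA, LoadInst *loadB) {
  std::map<IndicesVector, LoadInst *> OriginalLoads;
  IndicesVector Indices(1, 0);
  OriginalLoads[Indices] = loadA;  // recorded for argument A
  OriginalLoads[Indices] = loadB;  // argument B overwrites A's entry
  return OriginalLoads[Indices]->Alignment;  // what A's new load would copy
}

// New scheme: the argument is part of the key, so the entries stay separate.
unsigned alignmentSeenWithNewKey(Argument *argA, Argument *argB,
                                 LoadInst *loadA, LoadInst *loadB) {
  std::map<std::pair<Argument *, IndicesVector>, LoadInst *> OriginalLoads;
  IndicesVector Indices(1, 0);
  OriginalLoads[std::make_pair(argA, Indices)] = loadA;
  OriginalLoads[std::make_pair(argB, Indices)] = loadB;
  return OriginalLoads[std::make_pair(argA, Indices)]->Alignment;
}
```

With loads of alignment 4 and 16, the old key hands argument A the alignment of B's load, while the new key returns A's own.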

PR17906

llvm-svn: 194846

bc37658a

Nov 12, 2013

Correctly merge constants with explicit and implicit alignments. · dd8757ab

Rafael Espindola authored Nov 12, 2013

Constant merge can merge a constant with implicit alignment with one that has
explicit alignment. Before this change it assumed that the explicit
alignment was higher than the implicit one, causing the result to be
under-aligned in some cases.
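
One way to picture the corrected rule is sketched below. This is an assumption-laden model, not the patch itself: it treats an alignment of 0 as "implicit" (the type's ABI alignment applies), and the helper names and the `abiAlign` parameter are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Assumption for this sketch: 0 means "no explicit alignment", in which case
// the ABI alignment of the type is what the constant actually gets.
unsigned effectiveAlignment(unsigned explicitAlign, unsigned abiAlign) {
  return explicitAlign != 0 ? explicitAlign : abiAlign;
}

// The merged constant must satisfy both originals, so take the larger of the
// two effective alignments instead of assuming the explicit one is larger.
unsigned mergedAlignment(unsigned alignA, unsigned alignB, unsigned abiAlign) {
  return std::max(effectiveAlignment(alignA, abiAlign),
                  effectiveAlignment(alignB, abiAlign));
}
```

For example, merging an explicit alignment of 2 with an implicit one on a type whose ABI alignment is 8 yields 8 here, whereas keeping the explicit value alone would under-align the result.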

Fixes pr17815.

Patch by Chris Smowton!

llvm-svn: 194506

dd8757ab

Nov 10, 2013
- Teach MergeFunctions about address spaces · 5bcefabc
  Matt Arsenault authored Nov 10, 2013
```
llvm-svn: 194342
```
  5bcefabc
Nov 04, 2013
- Remove dead code · d1382b6c
  Shuxin Yang authored Nov 04, 2013
```
llvm-svn: 194017
```
  d1382b6c
Nov 03, 2013
- Spell "Actual" correctly · 927df85d
  David Majnemer authored Nov 03, 2013
```
llvm-svn: 193954
```
  927df85d
Oct 31, 2013

Use LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN instead of the "dso list". · 282a4703

Rafael Espindola authored Oct 31, 2013

There are two ways one could implement hiding of linkonce_odr symbols in LTO:
* LLVM tells the linker which symbols can be hidden if not used from native
  files.
* The linker tells LLVM which symbols are not used from other object files,
  but will be put in the dso symbol table if present.

GOLD's API is the second option. It was implemented almost 1:1 in llvm by
passing the list down to internalize.

LLVM already had partial support for the first option. It is also very similar
to how ld64 handles hiding these symbols when *not* doing LTO.

This patch then
* removes the APIs for the DSO list.
* marks all linkonce_odr unnamed_addr global values, and other linkonce_odr
  values whose address is not used, as LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN.
* makes the gold plugin responsible for handling the API mismatch.
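
The marking rule in the list above can be read as a small predicate. The sketch below is a toy model: the `GlobalInfo` struct and its field names are invented for illustration and do not correspond to LLVM's data structures.

```cpp
#include <cassert>

// Toy description of a global value; field names are hypothetical.
struct GlobalInfo {
  bool IsLinkOnceODR;
  bool IsUnnamedAddr;
  bool AddressIsUsed;
};

// A linkonce_odr value can be hidden if its address is declared
// insignificant (unnamed_addr) or is simply never used.
bool canBeHidden(const GlobalInfo &G) {
  if (!G.IsLinkOnceODR)
    return false;
  return G.IsUnnamedAddr || !G.AddressIsUsed;
}
```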

llvm-svn: 193800

282a4703

Merge CallGraph and BasicCallGraph. · 6554e5a9
Rafael Espindola authored Oct 31, 2013
```
llvm-svn: 193734
```
6554e5a9

Oct 27, 2013
- Revert r193251 : Use address-taken to disambiguate global variable and indirect memops. · 2e1890e1
  Shuxin Yang authored Oct 27, 2013
```
llvm-svn: 193489
```
  2e1890e1
Oct 23, 2013

Use address-taken to disambiguate global variable and indirect memops. · e4fb3759

Shuxin Yang authored Oct 23, 2013

 Major steps include:
 1). introduce a not-addr-taken bit-field in GlobalVariable.
 2). The GlobalOpt pass sets "not-address-taken" if it proves a global variable
    doesn't have its address taken.
 3). AA uses this info for disambiguation.

llvm-svn: 193251

e4fb3759

Oct 22, 2013
- Fix spelling, grammar, and match naming convention for test files. · 874fa0f6
  Eric Christopher authored Oct 21, 2013
```
llvm-svn: 193130
```
  874fa0f6
Oct 21, 2013

Use more type helper functions · 404c60a7
Matt Arsenault authored Oct 21, 2013
```
llvm-svn: 193109
```
404c60a7

Optimize more linkonce_odr values during LTO. · 3d7fc25c

Rafael Espindola authored Oct 21, 2013

When a linkonce_odr value that is on the dso list is not unnamed_addr
we can still look to see if anything is actually using its address. If
not, it is safe to hide it.

This patch implements that by moving GlobalStatus to Transforms/Utils
and using it in Internalize.

llvm-svn: 193090

3d7fc25c

Oct 19, 2013
- Mark some command line flags as hidden · 7f27e0b0
  Nadav Rotem authored Oct 18, 2013
```
llvm-svn: 193013
```
  7f27e0b0
Oct 17, 2013
- Rename fields of GlobalStatus to match the coding style. · 045a78fa
  Rafael Espindola authored Oct 17, 2013
```
llvm-svn: 192910
```
  045a78fa
- rename SafeToDestroyConstant to isSafeToDestroyConstant and clang-format. · 27797bae
  Rafael Espindola authored Oct 17, 2013
```
llvm-svn: 192907
```
  27797bae
- Simplify the interface of AnalyzeGlobal a bit and rename to analyzeGlobal. · 026c9cbe
  Rafael Espindola authored Oct 17, 2013
```
No functionality change.

llvm-svn: 192906
```
  026c9cbe
Oct 09, 2013

Fix a bug in Dead Argument Elimination. · 1cab418c

Shuxin Yang authored Oct 09, 2013

  If a function seen at compile time is not necessarily the one linked into
the binary being built, it is illegal to change the actual arguments
passed to it.

  e.g.
   --------------------------
   void foo(int lol) {
     // foo() has linkage satisfying isWeakForLinker()
     // "lol" is not used at all.
   }

   void bar(int lol2) {
      // xform to foo(undef) is illegal, as the compiler does not know which
      // instance of foo() will be linked into the binary being built.
      foo(lol2);
   }
  -----------------------------

  Such functions can be captured by isWeakForLinker(). NOTE that
mayBeOverridden() is insufficient for this purpose as it doesn't include
linkage types like AvailableExternallyLinkage and LinkOnceODRLinkage.
Take linkonce_odr* as an example: it indicates a set of *EQUIVALENT* globals
that can be merged at link-time. However, the semantics of
*EQUIVALENT* functions include parameters. Changing parameters breaks
the assumption.
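
The difference between the two predicates can be sketched with a deliberately simplified model. The enum below lists only a handful of linkage kinds, and the `toy...` functions are illustrative stand-ins that encode just what the message states; they are not LLVM's actual definitions of isWeakForLinker() or mayBeOverridden().

```cpp
#include <cassert>

// A reduced set of linkage kinds, enough to show the distinction.
enum Linkage {
  ExternalLinkage,
  InternalLinkage,
  WeakAnyLinkage,
  LinkOnceODRLinkage,
  AvailableExternallyLinkage
};

// Toy stand-in: per the message, this check misses the ODR and
// available_externally cases.
bool toyMayBeOverridden(Linkage L) {
  return L == WeakAnyLinkage;
}

// Toy stand-in: also covers the cases the narrower check misses.
bool toyIsWeakForLinker(Linkage L) {
  return L == WeakAnyLinkage || L == LinkOnceODRLinkage ||
         L == AvailableExternallyLinkage;
}

// Call-site arguments may only be rewritten (e.g. to undef) when the
// definition seen at compile time is the one that will be linked.
bool mayRewriteCallerArguments(Linkage L) {
  return !toyIsWeakForLinker(L);
}
```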

  Thanks to John McCall for his help, especially for explaining the subtle
differences between linkage types.

  rdar://11546243

llvm-svn: 192302

1cab418c

Oct 07, 2013
- Revert r191834 until we measure the effect of this on benchmarks and maybe find a better way to fix it · a1944e6d
  Alexey Samsonov authored Oct 07, 2013
```
llvm-svn: 192121
```
  a1944e6d
Oct 03, 2013

Optimize linkonce_odr unnamed_addr functions during LTO. · cda2911c

Rafael Espindola authored Oct 03, 2013

Generalize the API so we can distinguish symbols that are needed just for a DSO
symbol table from those that are used from some native .o.

The symbols that are only wanted for the dso symbol table can be dropped if
llvm can prove every other dso has a copy (linkonce_odr) and the address is not
important (unnamed_addr).

llvm-svn: 191922

cda2911c

Oct 02, 2013

Remove "localize global" optimization · 31540172

Alexey Samsonov authored Oct 02, 2013

Summary:
As discussed in http://llvm-reviews.chandlerc.com/D1754,
this optimization isn't really valid for C, and fires too rarely anyway.

Reviewers: rafael, nicholas

Reviewed By: nicholas

CC: rnk, llvm-commits, nicholas

Differential Revision: http://llvm-reviews.chandlerc.com/D1769

llvm-svn: 191834

31540172

Oct 01, 2013

Don't merge tiny functions. · 517d84e2

Matt Arsenault authored Oct 01, 2013

It's silly to merge functions like these:

define void @foo(i32 %x) {
  ret void
}

define void @bar(i32 %x) {
  ret void
}

to get

define void @bar(i32) {
  tail call void @foo(i32 %0)
  ret void
}

llvm-svn: 191786

517d84e2

Sep 22, 2013

Provide basic type safety for array_pod_sort comparators. · 8817cca5

Benjamin Kramer authored Sep 22, 2013

This makes using array_pod_sort significantly safer. The implementation relies
on function pointer casting but that should be safe as we're dealing with void*
here.
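
For illustration only, here is one way to give a `qsort`-style sort a typed comparator. Note that this sketch uses a different technique from the one the message describes: instead of casting the function pointer, it routes the call through a small trampoline template. The names `typedPodSort`, `comparatorTrampoline`, and `compareInts` are invented here and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Adapts a typed comparator to the void*-based signature qsort expects.
template <typename T, int (*Cmp)(const T *, const T *)>
int comparatorTrampoline(const void *lhs, const void *rhs) {
  return Cmp(static_cast<const T *>(lhs), static_cast<const T *>(rhs));
}

// Sorts [begin, end); the comparator's parameter types are checked against T
// at compile time, which is the type-safety gain.
template <typename T, int (*Cmp)(const T *, const T *)>
void typedPodSort(T *begin, T *end) {
  if (end - begin <= 1)
    return;
  std::qsort(begin, static_cast<std::size_t>(end - begin), sizeof(T),
             comparatorTrampoline<T, Cmp>);
}

// Example comparator: ascending order for ints.
int compareInts(const int *a, const int *b) {
  return (*a > *b) - (*a < *b);
}
```

Passing a comparator written for a different element type fails to compile, whereas a raw `void*` comparator would be accepted silently.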

llvm-svn: 191175

8817cca5

Sep 17, 2013

Bugfix for PR17099: · dc2c4b44

Stepan Dyatkovskiy authored Sep 17, 2013

Wrong cast operation.
MergeFunctions emitted a Bitcast where a pointer-to-integer operation was needed.
The patch fixes the MergeFunctions::writeThunk function. It replaces
unconditional Bitcast creation with a "Value* createCast(...)" method that
checks operand types and selects the proper instruction.
See the unit test for an example.
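
The selection logic can be pictured with a toy model. The enums and the `selectCast` function below are hypothetical and only mirror the idea of choosing the cast from the operand types; the real createCast operates on LLVM types and values and may handle further cases.

```cpp
#include <cassert>

// Reduced type and cast vocabularies for the sketch.
enum TypeKind { IntegerTy, PointerTy, OtherTy };
enum CastKind { BitCast, PtrToInt, IntToPtr };

// Pick the cast from the source and destination kinds instead of always
// emitting a bitcast.
CastKind selectCast(TypeKind Src, TypeKind Dest) {
  if (Src == PointerTy && Dest == IntegerTy)
    return PtrToInt;
  if (Src == IntegerTy && Dest == PointerTy)
    return IntToPtr;
  return BitCast;
}
```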

llvm-svn: 190859

dc2c4b44

Sep 16, 2013

Implement function prefix data as an IR feature. · 3fa50f9b

Peter Collingbourne authored Sep 16, 2013

Previous discussion:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html

Differential Revision: http://llvm-reviews.chandlerc.com/D1191

llvm-svn: 190773

3fa50f9b

Sep 13, 2013
- Avoid a compiler warning about Found not being used when assertions are · c9e95ad0
  Duncan Sands authored Sep 13, 2013
```
disabled.

llvm-svn: 190668
```
  c9e95ad0
Sep 11, 2013

Use type form of getIntPtrType · d3471e9e

Matt Arsenault authored Sep 11, 2013

This doesn't change anything since malloc always returns
address space 0.

llvm-svn: 190498

d3471e9e

Sep 10, 2013

Don't shrink atomic ops to bool in GlobalOpt. · 33d37007

Eli Friedman authored Sep 09, 2013

LLVM IR doesn't currently allow atomic bool load/store operations, and the
transformation is dubious anyway because it isn't profitable on all platforms.

PR17163.

llvm-svn: 190357

33d37007

Sep 05, 2013
- Remove unused argument. · d21ac19b
  Rafael Espindola authored Sep 05, 2013
```
llvm-svn: 190090
```
  d21ac19b
- Declare missing dependency on AliasAnalysis. Patch by Liu Xin! · 2c88067a
  Nick Lewycky authored Sep 05, 2013
```
llvm-svn: 190035
```
  2c88067a
Sep 04, 2013

Rename some variables to match the style guide. · b7c0b4a3
Rafael Espindola authored Sep 04, 2013
```
I am about to patch this code, and this makes the diff far more readable.

llvm-svn: 189982
```
b7c0b4a3
Small simplification given that insert of an empty range is a nop. · b832d498
Rafael Espindola authored Sep 04, 2013
```
llvm-svn: 189971
```
b832d498
Refactor duplicated logic to a helper function. · 49a6c153
Rafael Espindola authored Sep 04, 2013
```
No functionality change.

llvm-svn: 189969
```
49a6c153
Remove dead code. · 9406516a
Rafael Espindola authored Sep 04, 2013
```
llvm-svn: 189967
```
9406516a

Revert "Add r159136 back now that pr13124 has been fixed." · 128c5ea9

Rafael Espindola authored Sep 04, 2013

This reverts commit r189886.

I found a corner case where this optimization is not valid:

Say we have a "linkonce_odr unnamed_addr" in two translation units:
* In TU 1 this optimization kicks in and makes it hidden.
* In TU 2 it gets const merged with a constant that is *not* unnamed_addr,
  resulting in a non unnamed_addr constant with default visibility.
* The static linker rules for combining visibility then produce a hidden
  symbol, which is incorrect from the point of view of the non unnamed_addr
  constant.

The one place we can do this is when we know that the symbol is not used from
another TU in the same shared object, i.e., during LTO. I will move it there.
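
The corner case follows from the usual rule that the most constraining visibility wins when symbols are combined. A simplified sketch of that rule is shown below; the enum ordering and the `combineVisibility` name are choices made for this example, not linker source code.

```cpp
#include <algorithm>
#include <cassert>

// Ordered from least to most constraining for the purposes of this sketch.
enum Visibility {
  DefaultVisibility = 0,
  ProtectedVisibility = 1,
  HiddenVisibility = 2
};

// Combining two definitions of the same symbol keeps the more constraining
// visibility.
Visibility combineVisibility(Visibility A, Visibility B) {
  return std::max(A, B);
}
```

So a hidden copy from TU 1 combined with the default-visibility copy from TU 2 yields a hidden symbol, which is the outcome the message identifies as incorrect for the non unnamed_addr constant.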

llvm-svn: 189954

128c5ea9

Add r159136 back now that pr13124 has been fixed. · 5eb7df68

Rafael Espindola authored Sep 03, 2013

Original message:
If a constant or a function has linkonce_odr linkage and unnamed_addr, mark
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 189886

5eb7df68

Sep 03, 2013

Enable late-vectorization by default. · 5d78dba6

Nadav Rotem authored Sep 03, 2013

This patch changes the default setting for the LateVectorization flag that controls where the loop-vectorizer is run.

Perf gains:
SingleSource/Benchmarks/Shootout/matrix -37.33%
MultiSource/Benchmarks/PAQ8p/paq8p  -22.83%
SingleSource/Benchmarks/Linpack/linpack-pc  -16.22%
SingleSource/Benchmarks/Shootout-C++/ary3 -15.16%
MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt -10.34%
MultiSource/Benchmarks/TSVC/NodeSplitting-dbl/NodeSplitting-dbl -7.12%

Regressions:
SingleSource/Benchmarks/Misc/lowercase  15.10%
MultiSource/Benchmarks/TSVC/Equivalencing-flt/Equivalencing-flt 13.18%
SingleSource/Benchmarks/Shootout-C++/matrix 8.27%
SingleSource/Benchmarks/CoyoteBench/lpbench 7.30%

llvm-svn: 189858

5d78dba6

Aug 30, 2013
- Compulsive reformatting. · 2865be79
  Bill Wendling authored Aug 30, 2013
```
llvm-svn: 189697
```
  2865be79
- Random cleanup: No need to use a std::vector here, since createInternalizePass uses an ArrayRef. · 4c0d9ade
  Bill Wendling authored Aug 30, 2013
```
llvm-svn: 189632
```
  4c0d9ade
Aug 29, 2013

Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC... · 4c459bcd

Nadav Rotem authored Aug 28, 2013

Vectorizer/PassManager: I am working on moving the vectorizer out of the SCC passes. This patch moves the SLP-vectorizer and BB-vectorizer back into SCC passes for two reasons:
1. They are a kind of canonicalization.
2. The performance measurements show that it is better to keep them in.

There should be no functional change if you are not enabling the LateVectorization mode.

llvm-svn: 189539

4c459bcd