Commits · 9e00eb38a2e16a79bf3ebd567511225f622a9b1e · Roger Ferrer / llvm-epi-0.8

May 22, 2013

SLPVectorizer: Change the order in which new instructions are added to the function. · 9e00eb38

Nadav Rotem authored May 22, 2013

We are not working on a DAG, and I ran into a number of problems when I enabled the vectorization of 'diamond-trees' (trees that share leaves).
* Improved the numbering API.
* Changed the placement of new instructions to the last root.
* Fixed a bug with external tree users with non-zero lane.
* Fixed a bug in the placement of in-tree users.

llvm-svn: 182508

9e00eb38

This is an update to a previous commit (r181216). · 0dda6f16

Jean-Luc Duprat authored May 22, 2013

The earlier change list introduced the following inst combines:
B * (uitofp i1 C) -> select C, B, 0
A * (1 - uitofp i1 C) -> select C, 0, A
select C, 0, B + select C, A, 0 -> select C, A, B

Together, these three changes would simplify:
A * (1 - uitofp i1 C) + B * uitofp i1 C
down to:
select C, B, A

In practice we found that the first two substitutions can have a
negative effect on performance, because they reduce opportunities to
use FMA contractions; between the two options FMAs are often the
better choice.  This change list amends the previous one to enable
just these inst combines:

select C, B, 0 + select C, 0, A -> select C, B, A
A * (1 - uitofp i1 C) + B * uitofp i1 C -> select C, B, A
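
The equivalence behind the last fold can be checked at the source level. The following is a minimal sketch in plain C++ rather than LLVM IR; `blend_arith`, `blend_select`, and `check_blend` are hypothetical names, and the `bool`-to-`float` cast is assumed to stand in for `uitofp i1`.

```cpp
#include <cassert>

// Hypothetical helper names; this models the fold, it is not LLVM code.

// Arithmetic form: A * (1 - uitofp C) + B * uitofp C
float blend_arith(bool c, float a, float b) {
  float f = static_cast<float>(c);  // stands in for uitofp i1 C: 0.0f or 1.0f
  return a * (1.0f - f) + b * f;
}

// Folded form: select C, B, A
float blend_select(bool c, float a, float b) {
  return c ? b : a;
}

// Checks that both forms agree for one pair of finite inputs.
void check_blend(float a, float b) {
  assert(blend_arith(false, a, b) == blend_select(false, a, b));
  assert(blend_arith(true, a, b) == blend_select(true, a, b));
}
```

The two forms agree for finite values. With infinities or NaNs the multiplication by zero yields NaN, so the arithmetic form would differ; that case is outside what this sketch covers.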

llvm-svn: 182499

0dda6f16

LoopVectorize: Make Value pointers that could be RAUW'ed a VH · 12b0d1cd

Arnold Schwaighofer authored May 22, 2013

The Value pointers we store in the induction variable list can be RAUW'ed by a
call to SCEVExpander::expandCodeFor, use a TrackingVH instead. Do the same thing
in some other places where we store pointers that could potentially be RAUW'ed.

Fixes PR16073.

llvm-svn: 182485

12b0d1cd

May 21, 2013

[msan] A no-op implementation of VarArg handling. · ebd7f8e7

Evgeniy Stepanov authored May 21, 2013

This stuff is used on platforms where MSan does not have a proper VarArg
implementation (anything other than x86_64 at the moment).

llvm-svn: 182375

ebd7f8e7

May 20, 2013

Remove unused #include. · 5f474039
Bill Wendling authored May 20, 2013
```
llvm-svn: 182315
```
5f474039
Rename LoopSimplify.h to LoopUtils.h · a969df84
Hal Finkel authored May 20, 2013
```
As discussed, LoopUtils.h is a better name.

llvm-svn: 182314
```
a969df84

Expose InsertPreheaderForLoop from LoopSimplify to other passes · a12d82b4

Hal Finkel authored May 20, 2013

Other passes, PPC counter-loop formation for example, also need to add loop
preheaders outside of the regular loop simplification pass. This makes
InsertPreheaderForLoop a global function so that it can be used by other
passes.

No functionality change intended.

llvm-svn: 182299

a12d82b4

May 18, 2013

LoopVectorize: Handle single edge PHIs · 693a1ca6

Arnold Schwaighofer authored May 18, 2013

We might encounter single edge PHIs - handle them with an identity select.

Fixes PR15990.

llvm-svn: 182199

693a1ca6

May 17, 2013
- Add missing -*- C++ -*- to headers · 52ddb7bc
  Matt Arsenault authored May 17, 2013
```
llvm-svn: 182164
```
  52ddb7bc
- LoopVectorize: Simplify code. No functionality change. · d84a6339
  Benjamin Kramer authored May 17, 2013
```
llvm-svn: 182100
```
  d84a6339
May 16, 2013

[msan] Switch TLS globals to initial-exec model. · 1e764324
Evgeniy Stepanov authored May 16, 2013
```
They are always defined in the main executable.

llvm-svn: 181994
```
1e764324

LoopVectorize: Move call of canHoistAllLoads to canVectorizeWithIfConvert · 88e7fddc

Arnold Schwaighofer authored May 15, 2013

We only want to check this once, not for every conditional block in the loop.

No functionality change (except that we no longer perform the check
redundantly).

llvm-svn: 181942

88e7fddc

May 15, 2013

[objc-arc] Fixed a spelling error and made the statistic descriptions be... · b4e7f4d8

Michael Gottesman authored May 15, 2013

[objc-arc] Fixed a spelling error and made the statistic descriptions be consistent about their usage of periods.

llvm-svn: 181901

b4e7f4d8

LoopVectorize: Fix comments · 09cee972
Arnold Schwaighofer authored May 15, 2013
```
No functionality change.

llvm-svn: 181862
```
09cee972

LoopVectorize: Hoist conditional loads if possible · 2d920477

Arnold Schwaighofer authored May 15, 2013

InstCombine can be uncooperative to vectorization and sink loads into
conditional blocks. This prevents vectorization.

Undo this optimization if there are unconditional memory accesses to the same
addresses in the loop.
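
A source-level sketch of the pattern, assuming a loop in which `a[i]` is read both unconditionally and inside a conditional block. The function names are hypothetical, and the actual transformation operates on LLVM IR, not on C++.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper names; a model of the loop shape, not LLVM code.

// Load sunk into the conditional block: the second read of a[i] only
// happens when c[i] is set, leaving a conditional memory access in the loop.
int sum_sunk(const int *a, const int *c, std::size_t n) {
  assert(a != nullptr && c != nullptr);
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    sum += a[i];        // unconditional access to a[i]
    if (c[i])
      sum += a[i] * 2;  // conditional load from the same address
  }
  return sum;
}

// Load hoisted: a[i] is read once, unconditionally, and the branch
// reduces to a select on an already-loaded value.
int sum_hoisted(const int *a, const int *c, std::size_t n) {
  assert(a != nullptr && c != nullptr);
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i) {
    int v = a[i];
    sum += v;
    sum += c[i] ? v * 2 : 0;
  }
  return sum;
}
```

Hoisting is safe in this sketch because the loop already reads `a[i]` on every iteration, so the hoisted load cannot touch an address the original loop would not have touched.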

radar://13815763

llvm-svn: 181860

2d920477

Fix two typos · 149e281a
Sylvestre Ledru authored May 14, 2013
```
llvm-svn: 181848
```
149e281a

May 14, 2013

GlobalOpt: fix an issue where CXAAtExitFn points to a deleted function. · b3c52fb4

Manman Ren authored May 14, 2013

CXAAtExitFn was set outside a loop and before optimizations where functions
can be deleted. This patch will set CXAAtExitFn inside the loop and after
optimizations.

Seg fault when running LTO because of accesses to a deleted function.
rdar://problem/13838828

llvm-svn: 181838

b3c52fb4

Removed trailing whitespace. · 0c8b5628
Michael Gottesman authored May 14, 2013
```
llvm-svn: 181760
```
0c8b5628

LoopVectorize: Handle loops with multiple forward inductions · 2e7a922a

Arnold Schwaighofer authored May 14, 2013

We used to give up if we saw two integer inductions. After this patch, we base
further induction variables on the chosen one like we do in the reverse
induction and pointer induction case.
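
As an illustration of what basing one induction on another means, here is a small sketch with two forward integer inductions. The names are hypothetical and the step of 2 is an assumption chosen for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper names; a model of the idea, not LLVM code.

// Two forward integer inductions: i steps by 1, j steps by 2.
void fill_two_ivs(int *out, std::size_t n) {
  assert(out != nullptr || n == 0);
  std::size_t j = 0;
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<int>(j);
    j += 2;
  }
}

// The same loop with j expressed in terms of the chosen induction
// variable i, using the identity j == 2 * i.
void fill_one_iv(int *out, std::size_t n) {
  assert(out != nullptr || n == 0);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = static_cast<int>(2 * i);
}
```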

Fixes PR15720.

radar://13851975

llvm-svn: 181746

2e7a922a

[objc-arc-opts] Added debug statements when we set and unset whether a pointer is known positive. · f3f9e3b1
Michael Gottesman authored May 14, 2013
```
llvm-svn: 181745
```
f3f9e3b1

[objc-arc-opts] In the presence of an alloca unconditionally remove RR pairs... · a76143ee

Michael Gottesman authored May 13, 2013

[objc-arc-opts] In the presence of an alloca unconditionally remove RR pairs if and only if we are both KnownSafeBU/KnownSafeTD rather than just either or.

In the presence of a block being initialized, the frontend will emit the
objc_retain on the original pointer and the release on the pointer loaded from
the alloca. Through the provenance analysis, the optimizer will realize that the
two are related (albeit different), but since we only require KnownSafe in one
direction, it will match the inner retain on the original pointer with the guard
release on the original pointer. This is fixed by ensuring that in the presence
of allocas we only unconditionally remove pointers if both our retain and our
release are KnownSafe (i.e. we are KnownSafe in both directions), since we must
deal with the possibility that the frontend will emit what (to the optimizer)
appears to be unbalanced retain/releases.

An example of the miscompile is:

  %A = alloca
  retain(%x)
  retain(%x) <--- Inner Retain
  store %x, %A
  %y = load %A
  ... DO STUFF ...
  release(%y)
  call void @use(%x)
  release(%x) <--- Guarding Release

getting optimized to:

  %A = alloca
  retain(%x)
  store %x, %A
  %y = load %A
  ... DO STUFF ...
  release(%y)
  call void @use(%x)

rdar://13750319

llvm-svn: 181743

a76143ee

May 13, 2013
- Move a couple more statistics inside '#ifndef NDEBUG'. · e55d9492
  Matt Beaumont-Gay authored May 13, 2013
```
Suppresses an unused-variable warning in -Asserts builds.

llvm-svn: 181733
```
  e55d9492
- [objc-arc-opts] Add comment to BBState making it clear that... · 993fbf70
  Michael Gottesman authored May 13, 2013
```
[objc-arc-opts] Add comment to BBState making it clear that get{TopDown,BottomUp}PtrState will create a new PtrState object if it does not find a PtrState for Arg.

llvm-svn: 181726
```
  993fbf70
- [objc-arc] Move the before optimization statistics gathering phase out of OptimizeIndividualCalls. · 9fc50b82
  Michael Gottesman authored May 13, 2013
```
This makes the statistics gathering completely independent of the actual
optimization occurring, preventing any sort of bleeding over from occurring.

Additionally, it simplifies a switch statement in the non-statistic gathering case.

llvm-svn: 181719
```
  9fc50b82
- Suppress GCC compiler warnings in release builds about variables that are only · 0480b9b5
  Duncan Sands authored May 13, 2013
```
read in asserts.

llvm-svn: 181689
```
  0480b9b5
- SLPVectorizer: Swap LHS and RHS. No functionality change. · 33dcf0a7
  Nadav Rotem authored May 13, 2013
```
llvm-svn: 181684
```
  33dcf0a7
- SLPVectorizer: Fix a bug in the code that generates extracts for values with multiple users. · ce42cc6d
  Nadav Rotem authored May 12, 2013
```
The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract.

llvm-svn: 181674
```
  ce42cc6d
- SLPVectorizer: Clear the map that maps between scalars to vectors after each... · cbf6d24d
  Nadav Rotem authored May 12, 2013
```
SLPVectorizer: Clear the map that maps between scalars to vectors after each round of vectorization.
Testcase in the next commit.

llvm-svn: 181673
```
  cbf6d24d
May 12, 2013

InstCombine: Flip the order of two urem transforms · 6c30f49a

David Majnemer authored May 12, 2013

There are two transforms in visitUrem that conflict with each other.

*) One, if a divisor is a power of two, subtracts one from the divisor
   and turns it into a bitwise-and.
*) The other unwraps both operands if they are surrounded by zext
   instructions.

Flipping the order allows the subtraction to go beneath the zero
extension.
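
The effect of the ordering can be sketched in C++, assuming 8-bit operands extended to 32 bits. The function names and the chosen widths are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names; a model of the two transforms, not LLVM code.

// Original shape: both operands are zero-extended, then the urem is
// performed in the wide type.
std::uint32_t urem_wide(std::uint8_t a, std::uint8_t d) {
  assert(d != 0);
  return static_cast<std::uint32_t>(a) % static_cast<std::uint32_t>(d);
}

// Shape after unwrapping the zext first and then applying the
// power-of-two rewrite: the mask d - 1 is computed in the narrow type,
// and only the final result is extended.
std::uint32_t urem_narrow_mask(std::uint8_t a, std::uint8_t d) {
  assert(d != 0 && (d & (d - 1)) == 0);  // d must be a power of two
  std::uint8_t mask = static_cast<std::uint8_t>(d - 1);
  return static_cast<std::uint32_t>(static_cast<std::uint8_t>(a & mask));
}
```

For example, `urem_wide(200, 16)` and `urem_narrow_mask(200, 16)` both evaluate to 8, since 200 = 12 * 16 + 8 and 0xC8 & 0x0F = 0x08.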

llvm-svn: 181668

6c30f49a

LoopVectorize: Use the widest induction variable type · f2305e44

Arnold Schwaighofer authored May 11, 2013

Use the widest induction type encountered for the canonical induction variable.

We used to turn the following loop into an empty loop because we used i8 as
the induction variable type and truncated the trip count of 1024 to 0.

int a[1024];
void fail() {
  int reverse_induction = 1023;
  unsigned char forward_induction = 0;
  while ((reverse_induction) >= 0) {
    forward_induction++;
    a[reverse_induction] = forward_induction;
    --reverse_induction;
  }
}
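
The truncation that empties the loop is plain modular arithmetic. A minimal sketch, with hypothetical helper names, of what happens when a 32-bit trip count is narrowed to 8 bits:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names; a model of the truncation, not LLVM code.

// Models truncating a trip count to an 8-bit induction type:
// only the low 8 bits survive.
std::uint8_t trunc_to_i8(std::uint32_t trip_count) {
  return static_cast<std::uint8_t>(trip_count);
}

// Worked values for the loop above.
void check_trip_count() {
  assert(trunc_to_i8(1024) == 0);    // 0x400: the low 8 bits are all zero
  assert(trunc_to_i8(1023) == 255);  // 0x3FF: the low 8 bits are all one
}
```

A trip count of 1024 therefore becomes 0 in the narrow type, which is why the widest induction type has to be used for the canonical induction variable.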

radar://13862901

llvm-svn: 181667

f2305e44

LoopVectorize: Use variable instead of repeated function call · a544fefa
Arnold Schwaighofer authored May 11, 2013
```
No functionality change intended.

llvm-svn: 181666
```
a544fefa
LoopVectorize: Use IRBuilder interface in more places · 1ba84df4
Arnold Schwaighofer authored May 11, 2013
```
No functionality change intended.

llvm-svn: 181665
```
1ba84df4

May 11, 2013

InstCombine: Turn urem to bitwise-and more often · 470b077b

David Majnemer authored May 11, 2013

Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively
fold away urem instructions.
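
A sketch of the kind of fold this enables, assuming a divisor of the form `1u << k`, which is not a constant but is always a power of two. The function names are hypothetical, and whether this exact shape is matched is an assumption of the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names; a model of the fold, not LLVM code.

// Remainder by a divisor that is known to be a power of two.
std::uint32_t urem_shifted(std::uint32_t x, unsigned k) {
  assert(k < 32);  // shifting a 32-bit value by 32 or more is undefined
  return x % (1u << k);
}

// Equivalent bitwise-and: keep only the low k bits.
std::uint32_t and_shifted(std::uint32_t x, unsigned k) {
  assert(k < 32);
  return x & ((1u << k) - 1u);
}
```

For instance, with `x = 1000` and `k = 4` both functions return 8.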

llvm-svn: 181661

470b077b

SLPVectorizer: Add support for trees with external users. · cdfb48d2

Nadav Rotem authored May 10, 2013

For example:
bar() {
  int a = A[i];
  int b = A[i+1];
  B[i] = a;
  B[i+1] = b;
  foo(a);  <--- a is used outside the vectorized expression.
}

llvm-svn: 181648

cdfb48d2

Add a debug print · 0686e5cb
Nadav Rotem authored May 10, 2013
```
llvm-svn: 181647
```
0686e5cb

May 10, 2013

InstCombine: Don't claim to be able to evaluate any shl in a zexted type. · 14e915f7

Benjamin Kramer authored May 10, 2013

The shift amount may be larger than the type, leading to undefined behavior.
Limit the transform to constant shift amounts. While there, update the bits to
clear in the result, which may enable additional optimizations.
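
The general idea can be sketched as follows, assuming an 8-bit value extended to 32 bits. The function names are hypothetical, and the sketch only illustrates why a bounded shift amount and a mask are needed; it does not reproduce the InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names; a model of the idea, not LLVM code.

// Narrow form: shift within 8 bits, then zero-extend the result.
std::uint32_t shl_narrow_then_zext(std::uint8_t x, unsigned amt) {
  assert(amt < 8);  // an 8-bit shl by 8 or more has no defined result
  std::uint8_t narrow = static_cast<std::uint8_t>(x << amt);  // drops bits 8 and up
  return narrow;
}

// Wide form: zero-extend first, shift in 32 bits, then clear the bits
// that the narrow shift would have discarded.
std::uint32_t shl_wide_then_mask(std::uint8_t x, unsigned amt) {
  assert(amt < 8);
  return (static_cast<std::uint32_t>(x) << amt) & 0xFFu;
}
```

With a known shift amount below the narrow width the two forms agree, for example both give 0xF0 for `x = 0xFF` and `amt = 4`. For an arbitrary shift amount no such guarantee exists, because the narrow shift itself is undefined.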

PR15959.

llvm-svn: 181604

14e915f7

InstCombine: Verify the type before transforming uitofp into select. · a6645e8b
Benjamin Kramer authored May 10, 2013
```
PR15952.

llvm-svn: 181586
```
a6645e8b

May 09, 2013

Fix a documentation warning: \bried -> \brief · 9bf66a5f
Dmitri Gribenko authored May 09, 2013
```
llvm-svn: 181551
```
9bf66a5f

[GVN] Split critical-edge on the fly, instead of postponing edge-splitting to the next · 1d8d7e4d

Shuxin Yang authored May 09, 2013

  iteration.
  
This is one step toward non-iterative GVN. My local hack suggests that getting rid
of iteration will speed up GVN by 30%+ on a medium-sized input (2k LOC, C++).
I cannot explain why it is not 2x or more at this moment.

llvm-svn: 181532

1d8d7e4d

Don't replace an alias in llvm.used with its target. · 00752167

Rafael Espindola authored May 09, 2013

When we replace an internal alias with its target, be careful not to
replace the entry in llvm.used (and llvm.compiler_used).

llvm-svn: 181524

00752167