Commits · dfd8fcbb00336a0577637936a1c9bc296be60ecd · Roger Ferrer / llvm-epi-0.8

Apr 20, 2013
- Fix the header comment. · dfd8fcbb
  Nadav Rotem authored Apr 20, 2013
  
  llvm-svn: 179928
  dfd8fcbb
- Use 64bit arithmetic for calculating distance between pointers. · 5ed99674
  Nadav Rotem authored Apr 20, 2013
  
  llvm-svn: 179927
  5ed99674
- MergeFunc: Make pointer and integer types generate the same hash. · 630e6e14
  Benjamin Kramer authored Apr 19, 2013
  
  The logic that actually compares the types considers pointers and integers the same if they are of the same size. This created a strange mismatch between hash and reality and made the test case for this fail on some platforms (yay, test cases). llvm-svn: 179905
  630e6e14
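The hash/compare mismatch described in the MergeFunc entry can be illustrated with a small model. This is only a sketch: `Ty`, `types_equivalent`, `type_hash`, and the 64-bit pointer width are hypothetical names and assumptions for illustration, not MergeFunctions' actual code. The point is that a hash must be derived only from what the equality check looks at, here the size.

```python
from dataclasses import dataclass

POINTER_BITS = 64  # assumed pointer width for this sketch


@dataclass(frozen=True)
class Ty:
    kind: str  # "int" or "ptr"
    bits: int


def types_equivalent(a: Ty, b: Ty) -> bool:
    # Model of the comparison described in the commit message:
    # pointers and integers count as the same type if their sizes match.
    return a.bits == b.bits


def type_hash(t: Ty) -> int:
    # Hash only the size, so equivalent types always hash alike.
    return hash(("sized", t.bits))


i64 = Ty("int", 64)
i32 = Ty("int", 32)
ptr = Ty("ptr", POINTER_BITS)
```

With this definition, `i64` and `ptr` compare as equivalent and also land in the same hash bucket, which is the invariant the commit restores.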
Apr 19, 2013
- LoopVectorizer: Use matcher from PatternMatch.h for the min/max patterns · 51469403
  Arnold Schwaighofer authored Apr 19, 2013
  
  Also turn some static functions into class member functions to avoid having to mention the class namespace for enums all the time. No functionality change intended. llvm-svn: 179886
  51469403
- Keep coding standard. Don't use "else if" after "return". · 99317268
  Jakub Staszak authored Apr 19, 2013
  
  llvm-svn: 179826
  99317268
- Implement a better fix for PR15185. · 3b21eb69
  Bill Wendling authored Apr 18, 2013
  
  If the return type is a pointer and the call returns an integer, then do the inttoptr conversions. And vice versa. llvm-svn: 179817
  3b21eb69
Apr 18, 2013

Fix a -Wdocumentation warning · d29ea044
Dmitri Gribenko authored Apr 18, 2013
```
llvm-svn: 179789
```
d29ea044

In the function InstCombiner::visitExtractElementInst() removed the limitation... · 5570318f

Anat Shemer authored Apr 18, 2013

In the function InstCombiner::visitExtractElementInst() removed the limitation that extract is promoted over a cast only if the cast has only one use.

llvm-svn: 179786

5570318f

Added a function scalarizePHI() that scalarizes a vector phi instruction if it... · 0c95efad

Anat Shemer authored Apr 18, 2013

Added a function scalarizePHI() that scalarizes a vector phi instruction if it has only 2 uses: one to promote the vector phi in a loop and the other use is an extract operation of one element at a constant location.

llvm-svn: 179783

0c95efad

Fix a comment, PR15777. · 8cf09416
Chris Lattner authored Apr 18, 2013
```
llvm-svn: 179775
```
8cf09416

LoopVectorizer: Recognize min/max reductions · 4cd6aa11

Arnold Schwaighofer authored Apr 18, 2013

A min/max operation is represented by a select(cmp(lt/le/gt/ge, X, Y), X, Y)
sequence in LLVM. If we see such a sequence we can treat it just as any other
commutative binary instruction and reduce it.

This appears to help bzip2 by about 1.5% on an imac12,2.

radar://12960601

llvm-svn: 179773

4cd6aa11
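A rough model of why a min pattern can be treated as a reduction: each vector lane keeps its own running minimum, and the lanes are combined once at the end. This is a sketch in plain Python, not the LoopVectorizer's implementation; the function names, the lane width of 4, and the assumption that the input length is a multiple of the width are all illustrative.

```python
def select_min(x, y):
    # Models select(cmp(lt, X, Y), X, Y).
    return x if x < y else y


def scalar_reduce(values, init):
    acc = init
    for v in values:
        acc = select_min(acc, v)
    return acc


def vectorized_reduce(values, init, width=4):
    # Each lane tracks its own running minimum.
    # Assumes len(values) is a multiple of width.
    lanes = [init] * width
    for i in range(0, len(values), width):
        for lane in range(width):
            lanes[lane] = select_min(lanes[lane], values[i + lane])
    # Horizontal reduction of the lanes after the loop.
    acc = lanes[0]
    for lane_value in lanes[1:]:
        acc = select_min(acc, lane_value)
    return acc


data = [7, 3, 9, 4, 8, 1, 6, 5]
```

Because min is associative and commutative, both functions return the same value for `data` regardless of how the elements are distributed across lanes.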

LoopVectorize: Use a set to avoid longer cycles in the reduction chain too. · 8df2cfb8
Benjamin Kramer authored Apr 18, 2013
```
Fixes PR15748.

llvm-svn: 179757
```
8df2cfb8

Revert "Combine bit test + conditional or into simple math" · 81af06e0
David Majnemer authored Apr 18, 2013
```
It is causing stage2 builds to fail, let's get them running again.

llvm-svn: 179750
```
81af06e0

Combine bit test + conditional or into simple math · bdf0caf6

David Majnemer authored Apr 18, 2013

Simplify:
(select (icmp eq (and X, C1), 0), Y, (or Y, C2))

Into:
(or (shl (and X, C1), C3), Y)

Where:
C3 = Log2(C2) - Log2(C1)

If:
C1 and C2 are both powers of two

llvm-svn: 179748

bdf0caf6
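The identity in this commit message can be sanity-checked by brute force at a small bit width. The sketch below is an illustration only: it assumes 6-bit values and restricts itself to C2 >= C1, so that C3 is non-negative and a left shift suffices; the commit message does not say how a negative C3 is handled, and the helper names are hypothetical.

```python
BITS = 6  # small width so the exhaustive check stays fast


def log2_exact(c):
    # c is assumed to be a power of two.
    return c.bit_length() - 1


def original(x, y, c1, c2):
    # select (icmp eq (and X, C1), 0), Y, (or Y, C2)
    return y if (x & c1) == 0 else (y | c2)


def combined(x, y, c1, c2):
    # or (shl (and X, C1), C3), Y   with C3 = Log2(C2) - Log2(C1)
    c3 = log2_exact(c2) - log2_exact(c1)
    return ((x & c1) << c3) | y


def check_all():
    powers = [1 << s for s in range(BITS)]
    for c1 in powers:
        for c2 in powers:
            if c2 < c1:
                continue  # negative C3 is outside this sketch
            for x in range(1 << BITS):
                for y in range(1 << BITS):
                    if original(x, y, c1, c2) != combined(x, y, c1, c2):
                        return False
    return True
```

The check works because `X & C1` is either 0 or C1, and shifting C1 left by C3 yields exactly C2.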

[objc-arc] Do not mismatch up retains inside a for loop with releases outside... · 323964ca

Michael Gottesman authored Apr 18, 2013

[objc-arc] Do not mismatch up retains inside a for loop with releases outside said for loop in the presence of differing provenance caused by escaping blocks.

This occurs due to an alloca representing a separate ownership from the
original pointer. Thus consider the following pseudo-IR:

  objc_retain(%a)
  for (...) {
    objc_retain(%a)
    %block <- %a
    F(%block)
    objc_release(%block)
  }
  objc_release(%a)

From the perspective of the optimizer, the %block is a separate
provenance from the original %a. Thus the optimizer pairs up the inner
retain for %a and the outer release from %a, resulting in segfaults.

This is fixed by noting that the signature of a mismatch of
retain/releases inside the for loop is a Use/CanRelease top down with a
None bottom up (since bottom up the Retain-CanRelease-Use-Release
sequence is completed by the inner objc_retain, but top down due to the
differing provenance from the objc_release said sequence is not
completed). In said case in CheckForCFGHazards, we now clear the state
of %a implying that no pairing will occur.

Additionally a test case is included.

rdar://12969722

llvm-svn: 179747

323964ca

Removed trailing whitespace. · 9e518139
Michael Gottesman authored Apr 18, 2013
```
llvm-svn: 179746
```
9e518139

Apr 17, 2013
- [objc-arc] Added annotation option to only emit annotations for a specific ssa identifier. · 4e88ce68
  Michael Gottesman authored Apr 17, 2013
  
  llvm-svn: 179729
  4e88ce68
- Fixed typo. · adb921af
  Michael Gottesman authored Apr 17, 2013
  
  llvm-svn: 179721
  adb921af
- [objc-arc] Added descriptions for EnableARCAnnotations,... · 6806b51a
  Michael Gottesman authored Apr 17, 2013
  
  [objc-arc] Added descriptions for EnableARCAnnotations, EnableCheckForCFGHazards, EnableARCOptimizations. llvm-svn: 179718
  6806b51a
- [objc-arc] Added an option to arc-annotations for turning off CheckForCFGHazard. · ffef24f9
  Michael Gottesman authored Apr 17, 2013
  
  llvm-svn: 179717
  ffef24f9
- Do not optimise an fprintf() call if its return value is used. · 37ae72b5
  Peter Collingbourne authored Apr 17, 2013
  
  Differential Revision: http://llvm-reviews.chandlerc.com/D620 llvm-svn: 179661
  37ae72b5
Apr 16, 2013

simplifycfg: Fix integer overflow converting switch into icmp. · c9e1d992

Hans Wennborg authored Apr 16, 2013

If a switch instruction has a case for every possible value of its type,
with the same successor, SimplifyCFG would replace it with an icmp ult,
but the computation of the bound overflows in that case, which inverts
the test.

Patch by Jed Davis!

llvm-svn: 179587

c9e1d992
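The overflow can be modelled in a few lines. This is a simplified sketch of the kind of range check described above, not SimplifyCFG's code: the function names are hypothetical, and the model assumes the bound is the number of cases materialised at the switch operand's own bit width.

```python
def icmp_ult_bound(num_cases, bits):
    # The bound constant is truncated to the operand's bit width.
    return num_cases & ((1 << bits) - 1)


def range_check(x, min_case, num_cases, bits):
    # Models: icmp ult (sub x, min_case), num_cases
    mask = (1 << bits) - 1
    return ((x - min_case) & mask) < icmp_ult_bound(num_cases, bits)


# An i8 switch with a case for every value 0..255 has 256 cases.
# 256 does not fit in 8 bits, so the bound wraps to 0 and the
# unsigned comparison can never be true.
full_cover_bound = icmp_ult_bound(256, 8)
```

For a partial range such as cases 10..14 the check behaves as intended; only full coverage of the type makes the bound wrap, which is why the test ends up inverted.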

We are not able to bitcast a pointer to an integral value. · 37891719

Bill Wendling authored Apr 15, 2013

Two return types are not equivalent if one is a pointer and the other is an
integer. This is because we cannot bitcast a pointer to an integral value.
PR15185

llvm-svn: 179569

37891719

SLPVectorizer: Make it a function pass and add code for hoisting the... · b9116e69

Nadav Rotem authored Apr 15, 2013

SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops.

llvm-svn: 179562

b9116e69

Apr 15, 2013
- Fix a typo in comment. · 0f38c1e3
  Jim Grosbach authored Apr 15, 2013
  
  llvm-svn: 179542
  0f38c1e3
- Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make... · d4dcc003
  Nadav Rotem authored Apr 15, 2013
  
  Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make -fslp-vectorize run the slp-vectorizer. llvm-svn: 179508
  d4dcc003
- Rename the slp-vectorizer clang/llvm flags. No functionality change. · a1e5e44e
  Nadav Rotem authored Apr 15, 2013
  
  llvm-svn: 179505
  a1e5e44e
- SLPVectorizer: Add support for vectorizing trees that start at compare instructions. · 5d393c41
  Nadav Rotem authored Apr 15, 2013
  
  llvm-svn: 179504
  5d393c41
Apr 14, 2013

Reorders two transforms that collide with each other · 1fae1955

David Majnemer authored Apr 14, 2013

One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1

The problem is that there are certain values of C1 and C2 that
trigger both transforms, but the first one blocks out the second,
which generates suboptimal code.

Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
  %shr = lshr i32 %X, 4
  %and = and i32 %shr, 15
  %add = add i32 %and, -14
  %tobool = icmp ne i32 %add, 0

into:
  %and = and i32 %X, 240
  %tobool = icmp ne i32 %and, 224

llvm-svn: 179493

1fae1955
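The before/after IR in this entry, and the first of the two transforms, can be checked numerically. The sketch below models 32-bit unsigned arithmetic with a mask; the function names are hypothetical and the check is a sanity test, not a proof of the InstCombine logic.

```python
MASK32 = 0xFFFFFFFF


def before(x):
    shr = (x & MASK32) >> 4        # %shr = lshr i32 %X, 4
    and_ = shr & 15                # %and = and i32 %shr, 15
    add = (and_ - 14) & MASK32     # %add = add i32 %and, -14
    return add != 0                # %tobool = icmp ne i32 %add, 0


def after(x):
    return (x & 240) != 224        # and i32 %X, 240 ; icmp ne ..., 224


def or_form(x):
    return x == 13 or x == 14      # (X == 13 | X == 14)


def range_form(x):
    return ((x - 13) & MASK32) < 2  # X-13 <u 2
```

Only bits 4 to 7 of `x` influence `before` and `after`, so checking all 12-bit inputs already covers every distinct case.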

Miscellaneous cleanups for VecUtils.h · 7d62ea86
Benjamin Kramer authored Apr 14, 2013
```
llvm-svn: 179483
```
7d62ea86

SLP: Document the scalarization cost method. · 3403c115
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179479
```
3403c115

SLPVectorizer: Add support for trees that don't start at binary operators, and... · 54b413d1

Nadav Rotem authored Apr 14, 2013

SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree.

llvm-svn: 179475

54b413d1

SLPVectorizer: add initial support for reduction variable vectorization. · 0b9cf856
Nadav Rotem authored Apr 14, 2013
```
llvm-svn: 179470
```
0b9cf856

Apr 13, 2013

GlobalDCE: Fix an oversight in my last commit that could lead to crashes. · adc1727c
Benjamin Kramer authored Apr 13, 2013
```
There is a Constant with non-constant operands: blockaddress.

llvm-svn: 179460
```
adc1727c

Fix a scalability issue with complex ConstantExprs. · 89ca4bc6

Benjamin Kramer authored Apr 13, 2013

This is basically the same fix in three different places. We use a set to avoid
walking the whole tree of a big ConstantExpr multiple times.

For example: (select cmp, (add big_expr 1), (add big_expr 2))
We don't want to visit big_expr twice here, as it may consist of thousands of
nodes.

The testcase exercises this by creating an insanely large ConstantExpr out of
a loop. It's questionable if the optimizer should ever create those, but this
can be triggered with real C code. Fixes PR15714.

llvm-svn: 179458

89ca4bc6
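The effect of the visited set can be seen on a small expression DAG in which every node uses the previous node twice. This is a generic sketch of the technique, not the LLVM code touched by the commit; `Expr`, `build_chain`, and both counting functions are hypothetical.

```python
class Expr:
    def __init__(self, *operands):
        self.operands = operands


def count_visits_naive(root):
    # Walks every path, so shared subexpressions are revisited.
    visits = 1
    for op in root.operands:
        visits += count_visits_naive(op)
    return visits


def count_visits_with_set(root):
    # Walks every node once by remembering what was already seen.
    visited = set()
    stack = [root]
    visits = 0
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        visits += 1
        stack.extend(node.operands)
    return visits


def build_chain(depth):
    # Each level uses the previous node as both operands.
    node = Expr()
    for _ in range(depth):
        node = Expr(node, node)
    return node
```

For a chain of depth 10 the naive walk performs 2047 visits while the set-based walk performs 11, one per distinct node; the gap grows exponentially with depth.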

Apr 12, 2013

InstCombine: Check the operand types before merging fcmp ord & fcmp ord. · e89c7050
Benjamin Kramer authored Apr 12, 2013
```
Fixes PR15737.

llvm-svn: 179417
```
e89c7050

SLPVectorizer: add support for vectorization of diamond shaped trees. We now... · 8543ba3e

Nadav Rotem authored Apr 12, 2013

SLPVectorizer: add support for vectorization of diamond shaped trees. We now perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from. 

llvm-svn: 179414

8543ba3e

Add debug prints. · 4da0ab1d
Nadav Rotem authored Apr 12, 2013
```
llvm-svn: 179412
```
4da0ab1d

Simplify (A & ~B) in icmp if A is a power of 2 · 1a08accb

David Majnemer authored Apr 12, 2013

The transform will execute like so:
(A & ~B) == 0 --> (A & B) != 0
(A & ~B) != 0 --> (A & B) == 0

llvm-svn: 179386

1a08accb
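The equivalence relies on A having exactly one bit set: that bit is either in B or in ~B, never both. A brute-force check at 8 bits illustrates this, along with a counterexample when A is not a power of two. The helper names are hypothetical and the width is an assumption for the sketch.

```python
BITS = 8
MASK = (1 << BITS) - 1


def lhs_is_zero(a, b):
    # (A & ~B) == 0
    return (a & ~b & MASK) == 0


def rhs_is_nonzero(a, b):
    # (A & B) != 0
    return (a & b) != 0


def holds_for_all_powers_of_two():
    for shift in range(BITS):
        a = 1 << shift
        for b in range(1 << BITS):
            if lhs_is_zero(a, b) != rhs_is_nonzero(a, b):
                return False
    return True
```

With A = 3 and B = 1 the two sides disagree, which shows why the power-of-two precondition is needed.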

LoopVectorizer: integer division is not a reduction operation · f9cea17f

Arnold Schwaighofer authored Apr 12, 2013

Don't classify idiv/udiv as a reduction operation. Integer division is lossy.
For example: (1 / 2) * 4 != 4 / 2.

Example:

int a[] = {2, 5, 2, 2};
int x = 80;

for (int i = 0; i < 4; ++i)
  x /= a[i];

Scalar:
  x /= 2 // = 40
  x /= 5 // = 8
  x /= 2 // = 4
  x /= 2 // = 2

Vectorized:

 <80, 1> / <2,5> //= <40,0>
 <40, 0> / <2,2> //= <20,0>

 20*0 = 0

radar://13640654

llvm-svn: 179381

f9cea17f
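The scalar and vectorized traces in this commit message can be reproduced with a short simulation. This is a sketch assuming two lanes, with the extra lane initialised to 1 and the lanes multiplied together at the end, as in the trace; the function names are hypothetical. Python's `//` matches C integer division here because all values are non-negative.

```python
def scalar_div(x, divisors):
    for d in divisors:
        x //= d
    return x


def vectorized_div(x, divisors, width=2):
    # Lane 0 starts with x, the remaining lanes with 1.
    # Assumes len(divisors) is a multiple of width.
    lanes = [x] + [1] * (width - 1)
    for i in range(0, len(divisors), width):
        chunk = divisors[i:i + width]
        lanes = [lane // d for lane, d in zip(lanes, chunk)]
    # Combine the lanes by multiplication after the loop.
    result = 1
    for lane in lanes:
        result *= lane
    return result


a = [2, 5, 2, 2]
```

The second lane computes 1 / 5 = 0 in integer arithmetic, and that zero wipes out the final product, so the two results differ.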