Commits · ad0b69fe3ec570779c4e7f4ef833d568d1d41096 · Roger Ferrer / llvm-epi-0.8

Oct 29, 2012

Remove a wrapper around getIntPtrType added to GVN by Hal in commit 166624 (the · 5bdd9dda

Duncan Sands authored Oct 29, 2012

wrapper returns a vector of integers when passed a vector of pointers) by having
getIntPtrType itself return a vector of integers in this case. Outside of this
wrapper, I didn't find anywhere in the codebase that was relying on the old
behaviour for vectors of pointers, so give this a whirl through the buildbots.

llvm-svn: 166939

5bdd9dda

Change the PassManagerBuilder (used by -O3) loop vectorizer flag from... · c59ae207

Nadav Rotem authored Oct 29, 2012

Change the PassManagerBuilder (used by -O3) loop vectorizer flag from -vectorize to -vectorize-loops because we don't want to share the same flag as the bb-vectorizer.

llvm-svn: 166937

c59ae207

llvm-extract changes linkages so that functions on both sides of the · 56183fbe

Rafael Espindola authored Oct 29, 2012

split module can see each other. If it is keeping a symbol that already has
a non-local linkage, it doesn't need to change it.

llvm-svn: 166908

56183fbe

llvm-extract was unable to handle aliases. It would leave a copy on the · 9d30d0fc

Rafael Espindola authored Oct 29, 2012

output of both

llvm-extract foo.ll -func=bar
and
llvm-extract foo.ll -func=bar -delete

so the two new files could not be linked together anymore. With this change
aliases are handled almost like functions and global variables. Almost because
with an alias we cannot just clear the initializer/body; we have to create a new
declaration and replace the alias with it.

The net result is that now the output of the above commands can be linked
even if foo.ll has aliases.

llvm-svn: 166907

9d30d0fc

Oct 27, 2012

LoopIdiom: Add checks to avoid turning memmove into an infinite loop. · 8d2ee55a

Benjamin Kramer authored Oct 27, 2012

I don't think this is possible with the current implementation but that may change eventually.

llvm-svn: 166877

8d2ee55a

LoopIdiom: Recognize memmove loops. · 1c9e5186

Benjamin Kramer authored Oct 27, 2012

This turns loops like
  for (unsigned i = 0; i != n; ++i)
    p[i] = p[i+1];
into memmove, which has a highly optimized implementation in most libcs.

This was really easy with the new DependenceAnalysis :)

llvm-svn: 166875

1c9e5186
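
As a rough illustration of the memmove idiom in the commit above, the sketch below compares the loop with a direct `memmove` call at the source level. The names `shift_with_loop` and `shift_with_memmove` are hypothetical and exist only for this example; this is not the pass's implementation, only the equivalence it relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// The loop from the commit message: copy each element from the slot after it.
// Reads p[1] .. p[n], so p must hold at least n + 1 elements.
void shift_with_loop(std::vector<int>& p, std::size_t n) {
    assert(n == 0 || n < p.size());
    for (std::size_t i = 0; i != n; ++i)
        p[i] = p[i + 1];
}

// The same effect as a single library call. Source and destination overlap,
// which is why this has to be memmove rather than memcpy.
void shift_with_memmove(std::vector<int>& p, std::size_t n) {
    assert(n == 0 || n < p.size());
    if (n != 0)
        std::memmove(p.data(), p.data() + 1, n * sizeof(int));
}
```

Both functions leave the last element read (`p[n]`) in place, so the buffer ends with that value duplicated.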

LoopIdiom: Replace custom dependence analysis with DependenceAnalysis. · d5c9be82

Benjamin Kramer authored Oct 27, 2012

Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481.

Compile time performance seems to be slightly worse, but this is mostly due
to an extra LCSSA run scheduled by the PassManager and should be fixed there.

llvm-svn: 166874

d5c9be82

Update BBVectorize to use the new VTTI instr. cost interfaces. · bad10bb2

Hal Finkel authored Oct 27, 2012

The monolithic interface for instruction costs has been split into
several functions. This is the corresponding change. No functionality
change is intended.

llvm-svn: 166865

bad10bb2

· 859366f9

Nadav Rotem authored Oct 27, 2012

1. Fix a bug in getTypeConversion. When a *simple* type is split, we need to return the type of the split result.
2. Change the maximum vectorization width from 4 to 8.
3. A test for both.

llvm-svn: 166864

859366f9

· afae78ed

Nadav Rotem authored Oct 26, 2012

Refactor the VectorTargetTransformInfo interface.

Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.

Port the LoopVectorizer to the new API.

The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.

llvm-svn: 166836

afae78ed

Oct 26, 2012

Change the internalize pass to internalize all symbols when given an empty · 4253bd8f

Rafael Espindola authored Oct 26, 2012

list of externals. This makes sense since a shared library with no symbols
can still be useful if it has static constructors.

llvm-svn: 166795

4253bd8f

LoopSimplify: Preserve DependenceAnalysis. · 77360858

Benjamin Kramer authored Oct 26, 2012

This is currently true, but may change when DA grows more aggressive caching.
Without this setting it's impossible to use DA from a LoopPass because DA is a
function pass and cannot be properly scheduled in between LoopPasses. The
LoopManager reacts to this with an infinite loop, which made this really annoying
to debug.

llvm-svn: 166788

77360858

Fix SCEV cache invalidation in LCSSA and LoopSimplify. · e3d821a4

Benjamin Kramer authored Oct 26, 2012

The LoopSimplify bug is pretty harmless because the loop goes from unanalyzable
to analyzable, but the LCSSA bug is very nasty. It only comes into play with a
specific order of the LoopPassManager worklist and can cause actual
miscompilations when a SCEV refers to a value that has been replaced with a PHI
node. SCEVExpander may then insert code into the wrong place, either violating
domination or randomly miscompiling stuff.

Comes with an extensive test case reduced from the test-suite with
bugpoint+SCEVValidator.

llvm-svn: 166787

e3d821a4

Use VTTI->getNumberOfParts in BBVectorize. · 4863448d

Hal Finkel authored Oct 26, 2012

This change reflects VTTI refactoring; no functionality change intended.

llvm-svn: 166752

4863448d

Disable generation of pointer vectors by BBVectorize. · 41a6ded4

Hal Finkel authored Oct 26, 2012

Once vector-of-pointer support works, then this can be reverted.

llvm-svn: 166741

41a6ded4

BBVectorize, when using VTTI, should not form types that will be split. · 20a49d6f

Hal Finkel authored Oct 25, 2012

This is needed so that perl's SHA can be compiled (otherwise
BBVectorize takes far too long to find its fixed point).

I'll try to come up with a reduced test case.

llvm-svn: 166738

20a49d6f

Oct 25, 2012

Begin incorporating target information into BBVectorize. · cbf9365f

Hal Finkel authored Oct 25, 2012

This is the first of several steps to incorporate information from the new
TargetTransformInfo infrastructure into BBVectorize. Two things are done here:

1. Target information is used to determine if it is profitable to fuse two
instructions. This means that the cost of the vector operation must not
be more expensive than the cost of the two original operations. Pairs that
are not profitable are no longer considered (because current cost information
is incomplete, for intrinsics for example, equal-cost pairs are still
considered).

2. The 'cost savings' computed for the profitability check are also used to
rank the DAGs that represent the potential vectorization plans. Specifically,
for nodes of non-trivial depth, the cost savings is used as the node
weight.

The next step will be to incorporate the shuffle costs into the DAG weighting;
this will give the edges of the DAG weights as well. Once that is done, when
target information is available, we should be able to dispense with the
depth heuristic.

llvm-svn: 166716

cbf9365f
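
The profitability rule described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not BBVectorize's code: `PairCosts`, `cost_savings`, and `is_candidate` are hypothetical names, and the cost values are plain integers rather than anything queried from the target.

```cpp
#include <cassert>

// Hypothetical cost record; in the pass these numbers would come from the
// target's cost information, which is not modeled here.
struct PairCosts {
    int first_scalar;   // cost of the first original instruction
    int second_scalar;  // cost of the second original instruction
    int fused_vector;   // cost of the single vector instruction replacing both
};

// Savings from fusing the pair; zero means the costs are equal.
int cost_savings(const PairCosts& c) {
    assert(c.first_scalar >= 0 && c.second_scalar >= 0 && c.fused_vector >= 0);
    return c.first_scalar + c.second_scalar - c.fused_vector;
}

// A pair stays a candidate unless the vector form is strictly more expensive,
// so equal-cost pairs are still considered.
bool is_candidate(const PairCosts& c) {
    return cost_savings(c) >= 0;
}
```

The same savings value is what the commit message describes as the node weight for nodes of non-trivial depth; the DAG ranking itself is not modeled here.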

LoopVectorize: Teach the cost model to query scalar costs as scalar types and not vectors of 1. · 579042f7

Nadav Rotem authored Oct 25, 2012

llvm-svn: 166715

579042f7

Also optimize large switch statements. · 977f41a1

Jakob Stoklund Olesen authored Oct 25, 2012

The isValueEqualityComparison() guard at the top of SimplifySwitch()
only applies to some of the possible transformations.

The newer transformations work just fine on large switches, and the
check on predecessor count is nonsensical.

llvm-svn: 166710

977f41a1

Teach SROA how to split whole-alloca integer loads and stores into · 58d05567

Chandler Carruth authored Oct 25, 2012

smaller integer loads and stores.

The high-level motivation is that the frontend sometimes generates
a single whole-alloca integer load or store during ABI lowering of
splittable allocas. We need to be able to break this apart in order to
see the underlying elements and properly promote them to SSA values. The
hope is that this fixes some performance regressions on x86-32 with the
new SROA pass.

Unfortunately, this causes quite a bit of churn in the test cases, and
bloats some IR that comes out. When we see an alloca that consists solely
of bits and bytes being extracted and re-inserted, we now do some
splitting first, before building widened integer "bucket of bits"
representations. These are always well folded by instcombine, however, so
this shouldn't actually result in missed opportunities.

If this splitting of all-integer allocas does cause problems (perhaps
due to smaller SSA values going into the RA), we could potentially go to
some extreme measures to only do this integer splitting trick when there
are non-integer component accesses of an alloca, but discovering this is
quite expensive: it adds yet another complete walk of the recursive use
tree of the alloca.

Either way, I will be watching build bots and LNT bots to see what
fallout there is here. If anyone gets x86-32 numbers before & after this
change, I would be very interested.

llvm-svn: 166662

58d05567
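
A source-level analogy for the splitting described in the commit above: writing a two-field object with one 8-byte integer store has the same effect as two 4-byte stores, one per field. The sketch is only an analogy under that assumption; `Pair`, `store_whole`, and `store_split` are hypothetical names, and SROA itself operates on IR allocas, not on C++ objects.

```cpp
#include <cstdint>
#include <cstring>

struct Pair {
    std::uint32_t first;
    std::uint32_t second;
};
static_assert(sizeof(Pair) == sizeof(std::uint64_t), "Pair must be exactly 8 bytes");

// One whole-object integer store: write all 8 bytes at once.
Pair store_whole(std::uint64_t bits) {
    Pair p;
    std::memcpy(&p, &bits, sizeof p);
    return p;
}

// The same bytes written as two 4-byte stores, one per field.
Pair store_split(std::uint64_t bits) {
    unsigned char raw[sizeof bits];
    std::memcpy(raw, &bits, sizeof bits);
    Pair p;
    std::memcpy(&p.first, raw, sizeof p.first);
    std::memcpy(&p.second, raw + sizeof p.first, sizeof p.second);
    return p;
}
```

Once the store is expressed per field, each field can be tracked as its own value, which is the property the commit message wants in order to promote the elements to SSA values.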

Add support for additional reduction variables: AND, OR, XOR. · 5ffb049a

Nadav Rotem authored Oct 25, 2012

Patch by Paul Redmond <paul.redmond@intel.com>.

llvm-svn: 166649

5ffb049a
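
For readers unfamiliar with the term, the loops below are examples of the reduction shapes named in the commit above: a single accumulator updated with AND, OR, or XOR on every iteration. The function names are hypothetical, and the sketch shows only the source-level pattern, not how the vectorizer transforms it.

```cpp
#include <cstdint>
#include <vector>

// AND reduction: the accumulator starts with all bits set (the identity for AND).
std::uint32_t reduce_and(const std::vector<std::uint32_t>& v) {
    std::uint32_t acc = 0xFFFFFFFFu;
    for (std::uint32_t x : v) acc &= x;
    return acc;
}

// OR reduction: the accumulator starts at zero (the identity for OR).
std::uint32_t reduce_or(const std::vector<std::uint32_t>& v) {
    std::uint32_t acc = 0;
    for (std::uint32_t x : v) acc |= x;
    return acc;
}

// XOR reduction: the accumulator starts at zero (the identity for XOR).
std::uint32_t reduce_xor(const std::vector<std::uint32_t>& v) {
    std::uint32_t acc = 0;
    for (std::uint32_t x : v) acc ^= x;
    return acc;
}
```

All three operations are associative and commutative, so the elements can be combined in any grouping without changing the result.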

revert accidental change · 086ea5c1

Nadav Rotem authored Oct 24, 2012

llvm-svn: 166643

086ea5c1

Implement a basic cost model for vector and scalar instructions. · 4a87683a

Nadav Rotem authored Oct 24, 2012

llvm-svn: 166642

4a87683a

Fix a compiler warning with an unused variable. · f07b9628

Micah Villmow authored Oct 24, 2012

llvm-svn: 166634

f07b9628

Oct 24, 2012
- Update GVN to support vectors of pointers. · 69b07a2c
  Hal Finkel authored Oct 24, 2012
```
GVN will now generate ptrtoint instructions for vectors of pointers.
Fixes PR14166.

llvm-svn: 166624
```
  69b07a2c
- whitespace · e4f491e7
  Nadav Rotem authored Oct 24, 2012
```
llvm-svn: 166622
```
  e4f491e7
- LoopVectorizer: Add a basic cost model which uses the VTTI interface. · a721b21c
  Nadav Rotem authored Oct 24, 2012
```
llvm-svn: 166620
```
  a721b21c
- Add some cleanup to the DataLayout changes requested by Chandler. · bf3eeb2d
  Micah Villmow authored Oct 24, 2012
```
llvm-svn: 166607
```
  bf3eeb2d
- Back out r166591, not sure why this made it through since I cancelled the... · 51e7246c
  Micah Villmow authored Oct 24, 2012
```
Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this!

llvm-svn: 166596
```
  51e7246c
- Delete a directory that wasn't supposed to be checked in yet. · 6a8f3f9e
  Micah Villmow authored Oct 24, 2012
```
llvm-svn: 166591
```
  6a8f3f9e
- Add in support for getIntPtrType to get the pointer type based on the address space. · 12d91278
  Micah Villmow authored Oct 24, 2012
```
This checkin also adds in some tests that utilize these paths and updates some of the
clients.

llvm-svn: 166578
```
  12d91278
Oct 23, 2012

· 5bed7b4f

Nadav Rotem authored Oct 23, 2012

Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and C++ news.

PR14158.

llvm-svn: 166491

5bed7b4f

Fix typo that somehow escaped both testing and code inspection. · 5ed3900d

Duncan Sands authored Oct 23, 2012

llvm-svn: 166475

5ed3900d

Transform code like this · 533c8ae7

Duncan Sands authored Oct 23, 2012

 %V = mul i64 %N, 4
 %t = getelementptr i8* bitcast (i32* %arr to i8*), i64 %V
into
 %t1 = getelementptr i32* %arr, i64 %N
 %t = bitcast i32* %t1 to i8*
incorporating the multiplication into the getelementptr.
This happens all the time in dragonegg, for example for
  int foo(int *A, int N) {
    return A[N];
  }
because gcc turns this into byte pointer arithmetic before it hits the plugin:
  D.1590_2 = (long unsigned int) N_1(D);
  D.1591_3 = D.1590_2 * 4;
  D.1592_5 = A_4(D) + D.1591_3;
  D.1589_6 = *D.1592_5;
  return D.1589_6;
The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
transform fires on.

An analogous transform (with no testcases!) already existed for bitcasts of
arrays, so I rewrote it to share code with this one.

llvm-svn: 166474

533c8ae7
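
The equivalence behind the transform in the commit above can be checked at the source level. In this sketch `load_indexed` and `load_via_bytes` are hypothetical names, and `sizeof(int)` plays the role of the literal 4 in the commit message (assuming a 4-byte `int`); it illustrates the two address computations, not the optimizer's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Typed indexing, as written in the source: A[N].
int load_indexed(const int* A, std::size_t N) {
    assert(A != nullptr);
    return A[N];
}

// Byte-pointer arithmetic, the shape gcc hands to the plugin:
// scale the index by the element size, add it to a byte pointer, then load.
int load_via_bytes(const int* A, std::size_t N) {
    assert(A != nullptr);
    const unsigned char* base = reinterpret_cast<const unsigned char*>(A);
    const unsigned char* addr = base + N * sizeof(int);
    int value;
    std::memcpy(&value, addr, sizeof value);
    return value;
}
```

Both functions read the same bytes, which is why folding the multiplication back into a typed index preserves behaviour.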

Per the C++ standard, we need to include the definition of llvm::Calculate in · 6289a4e8

Richard Smith authored Oct 23, 2012

every TU where it's implicitly instantiated, even if there's an implicit
instantiation for the same types available in another TU.

llvm-svn: 166470

6289a4e8

Fix typo. · a302b6d9

Julien Lerouge authored Oct 23, 2012

llvm-svn: 166456

a302b6d9

Explain why DenseMap is still used here instead of MapVector. · d7fa5e42

Julien Lerouge authored Oct 23, 2012

llvm-svn: 166454

d7fa5e42

Oct 22, 2012
- Iterating over a DenseMap<std::pair<BasicBlock*, unsigned>, PHINode*> is not · 8cf84fa4
  Julien Lerouge authored Oct 22, 2012
```
deterministic, replace it with a DenseMap<std::pair<unsigned, unsigned>,
PHINode*> (we already have a map from BasicBlock to unsigned).

<rdar://problem/12541389>

llvm-svn: 166435
```
  8cf84fa4
- Don't crash if the load/store pointer is not a GEP. · 1c7fc71e
  Nadav Rotem authored Oct 22, 2012
```
Fix by Shivarama Rao <Shivarama.Rao@amd.com>

llvm-svn: 166427
```
  1c7fc71e
- Revert r166407 because it caused analyzer tests to crash and broke self-host bots. · 54ff5e81
  Argyrios Kyrtzidis authored Oct 22, 2012
```
llvm-svn: 166424
```
  54ff5e81
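
The determinism fix in 8cf84fa4 above replaces pointer-based keys with number-based keys. The sketch below shows that re-keying step under loud assumptions: `Block` is a hypothetical stand-in for BasicBlock, a string stands in for `PHINode*`, and `std::map` is swapped in for DenseMap so the example stays self-contained. Unlike DenseMap, `std::map` also sorts its keys; the only point illustrated is that the new keys no longer depend on where blocks happen to be allocated.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Block { std::string name; };  // hypothetical stand-in for BasicBlock

using PointerKey = std::pair<const Block*, unsigned>;
using NumberKey = std::pair<unsigned, unsigned>;
using Entry = std::pair<PointerKey, std::string>;  // string stands in for PHINode*

// Re-key entries by block number so that no key depends on an address.
std::map<NumberKey, std::string> rekey_by_number(
        const std::vector<Entry>& entries,
        const std::map<const Block*, unsigned>& block_number) {
    std::map<NumberKey, std::string> result;
    for (const Entry& entry : entries) {
        auto it = block_number.find(entry.first.first);
        assert(it != block_number.end());  // every block must already be numbered
        result[NumberKey(it->second, entry.first.second)] = entry.second;
    }
    return result;
}
```

Iterating the returned map visits entries in block-number order, regardless of the addresses of the `Block` objects involved.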