Commits · 94faa4d0d4258ab534c1c9642f46ae41b52f07e5 · Roger Ferrer / llvm-epi-0.8

Jul 27, 2013

SLP Vectorizer: Don't vectorize really short chains because they are already... · cfd40da9

Nadav Rotem authored Jul 26, 2013

SLP Vectorizer: Don't vectorize really short chains because they are already handled by the SelectionDAG store-vectorizer, which does a better job of deciding when to vectorize.

llvm-svn: 187267

cfd40da9

SLP Vectorizer: Disable the vectorization of non power of two chains, such as... · 9ce0f779

Nadav Rotem authored Jul 26, 2013

SLP Vectorizer: Disable the vectorization of non-power-of-two chains, such as <3 x float>, because we don't have a good cost model for these types.

llvm-svn: 187265

9ce0f779
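
A minimal sketch of the kind of filter 9ce0f779 describes, written as a Python toy rather than the actual C++ change. The function names are hypothetical and only illustrate rejecting chain lengths that would need a type such as <3 x float>.

```python
def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def should_vectorize_chain(chain_length: int) -> bool:
    # Hypothetical filter: reject lengths like 3 that would need a
    # non-power-of-two vector type such as <3 x float>.
    return is_power_of_two(chain_length)
```

With this check a chain of 4 elements is accepted and a chain of 3 is rejected.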

Fix variable name. · d6d4da09
Owen Anderson authored Jul 26, 2013
```
llvm-svn: 187253
```
d6d4da09

Jul 26, 2013

When InstCombine tries to fold away (fsub x, (fneg y)) into (fadd x, y), it is · e37c2e4d
Owen Anderson authored Jul 26, 2013
```
also worthwhile for it to look through FP extensions and truncations, whose
application commutes with fneg.

llvm-svn: 187249
```
e37c2e4d
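
The claim in e37c2e4d that fneg commutes with FP truncation can be spot-checked numerically. This is a sketch in Python, not InstCombine code: `fptrunc_f32` is a hypothetical helper that models a double-to-float truncation by round-tripping through a 32-bit encoding, and the sample values are arbitrary.

```python
import struct


def fptrunc_f32(x: float) -> float:
    """Model fptrunc double -> float by rounding to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Arbitrary sample values, including one that is subnormal in binary32.
samples = [0.1, 1.0 / 3.0, -2.5, 1e-40, 123456.789]

# fneg commutes with truncation: negating before or after rounding
# yields the same value, because rounding is symmetric around zero.
commutes = all(fptrunc_f32(-y) == -fptrunc_f32(y) for y in samples)

# fsub x, (fneg y) produces the same value as fadd x, y.
x = 7.25
folds = all(x - (-y) == x + y for y in samples)
```

Both `commutes` and `folds` hold for these samples, which is why the fold can look through the conversion to find the fneg.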
Correct case of m_UIToFp to m_UIToFP to match instruction name, add m_SIToFP for consistency. · 4ef13872
Stephen Lin authored Jul 26, 2013
```
llvm-svn: 187225
```
4ef13872

Re-implement the analysis of uses in mem2reg to be significantly more · 9af38fc2

Chandler Carruth authored Jul 26, 2013

robust. It now uses an InstVisitor and worklist to actually walk the
uses of the Alloca transitively and detect the pattern which we can
directly promote: loads & stores of the whole alloca and instructions we
can completely ignore.

Also, with this new implementation teach both the predicate for testing
whether we can promote and the promotion engine itself to use the same
code so we no longer have strange divergence between the two code paths.

I've added some silly test cases to demonstrate that we can handle
slightly more degenerate code patterns now. See below for why this
is even interesting.

Performance impact: roughly 1% regression in the performance of SROA or
ScalarRepl on a large C++-ish test case where most of the allocas are
basically ready for promotion. The cause is silly redundant work that
I've left FIXMEs for and which I'll address in the next
commit. I wanted to separate this commit as it changes the behavior.
Once the redundant work in removing the dead uses of the alloca is
fixed, this code appears to be faster than the old version. =]

So why is this useful? Because the previous requirement for promotion
required a *specific* visit pattern of the uses of the alloca to verify:
we *had* to look for no more than 1 intervening use. The end goal is to
have SROA automatically detect when an alloca is already promotable and
directly hand it to the mem2reg machinery rather than trying to
partition and rewrite it. This is a 25% or more performance improvement
for SROA, and a significant chunk of the delta between it and
ScalarRepl. To get there, we need to make mem2reg actually capable of
promoting allocas which *look* promotable to SROA without having SROA do
tons of work to massage the code into just the right form.

This is actually the tip of the iceberg. There are tremendous potential
savings we can realize here by de-duplicating work between mem2reg and
SROA.

llvm-svn: 187191

9af38fc2
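
A rough sketch of the worklist-based use walk that 9af38fc2 describes, as a Python toy over a made-up IR. The opcode names, the `users` map, and the choice of which instructions are ignorable are all assumptions for illustration; the real pass uses an InstVisitor in C++.

```python
from collections import deque


def is_promotable(alloca, users):
    """Walk the uses of `alloca` transitively with a worklist.

    `users` maps a value name to a list of (opcode, result_name) pairs.
    Toy rules (assumed): direct loads and stores of the alloca are fine,
    "lifetime" markers are ignorable, "bitcast" results are followed,
    and any other use blocks promotion.
    """
    worklist = deque([alloca])
    visited = {alloca}
    while worklist:
        value = worklist.popleft()
        for opcode, result in users.get(value, []):
            if opcode in ("load", "store") and value == alloca:
                continue  # load or store of the whole alloca
            if opcode == "lifetime":
                continue  # instruction we can completely ignore
            if opcode == "bitcast":
                if result not in visited:
                    visited.add(result)
                    worklist.append(result)  # follow the derived pointer
                continue
            return False  # anything else prevents promotion
    return True
```

Because the same predicate can drive both the "can we promote" test and the promotion itself, the two code paths cannot disagree about what counts as promotable.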

[PowerPC] Support powerpc64le as a syntax-checking target. · 0a9170d9

Bill Schmidt authored Jul 26, 2013

This patch provides basic support for powerpc64le as an LLVM target.
However, use of this target will not actually generate little-endian
code.  Instead, use of the target will cause the correct little-endian
built-in defines to be generated, so that code that tests for
__LITTLE_ENDIAN__, for example, will be correctly parsed for
syntax-only testing.  Code generation will otherwise be the same as
powerpc64 (big-endian), for now.

The patch leaves open the possibility of creating a little-endian
PowerPC64 back end, but there is no immediate intent to create such a
thing.

The LLVM portions of this patch simply add ppc64le coverage everywhere
that ppc64 coverage currently exists.  There is nothing of any import
worth testing until such time as little-endian code generation is
implemented.  In the corresponding Clang patch, there is a new test
case variant to ensure that correct built-in defines for little-endian
code are generated.

llvm-svn: 187179

0a9170d9

Jul 25, 2013

Respect llvm.used in Internalize. · 17600e29

Rafael Espindola authored Jul 25, 2013

The language reference says that:

"If a symbol appears in the @llvm.used list, then the compiler,
assembler, and linker are required to treat the symbol as if there is
a reference to the symbol that it cannot see"

Since even the linker cannot see the reference, we must assume that
the reference can be using the symbol table. For example, a user can add
__attribute__((used)) to a debug helper function like dump and use it from
a debugger.

llvm-svn: 187103

17600e29
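
The rule in 17600e29 can be sketched as a set computation. This is a Python toy, not the Internalize pass itself; the function and parameter names are hypothetical.

```python
def symbols_to_internalize(defined, preserved, llvm_used):
    """Return the symbols that may be given internal linkage.

    Assumed model: anything explicitly preserved, or listed in
    @llvm.used, must stay externally visible, because a reference to
    it may exist that nothing in the toolchain can see.
    """
    keep_external = set(preserved) | set(llvm_used)
    return {name for name in defined if name not in keep_external}
```

For example, with `defined = {"main", "dump", "helper"}`, `preserved = {"main"}` and `llvm_used = {"dump"}`, only `helper` is internalized, so a debugger can still find `dump` through the symbol table.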

Check that TD isn't NULL before dereferencing it down this path. · 5b15037f
Nick Lewycky authored Jul 25, 2013
```
llvm-svn: 187099
```
5b15037f
Make these methods const correct. · ec2375fb
Rafael Espindola authored Jul 25, 2013
```
Thanks to Nick Lewycky for noticing it.

llvm-svn: 187098
```
ec2375fb

Jul 24, 2013

TRE: Move class into anonymous namespace. · 328da33d
Benjamin Kramer authored Jul 24, 2013
```
While there, shrink a dangerously large SmallPtrSet.

llvm-svn: 187050
```
328da33d

Fix a problem I introduced in r187029 where we would over-eagerly · 58e25d39

Chandler Carruth authored Jul 24, 2013

schedule an alloca for another iteration in SROA. This only showed up
with a mixture of promotable and unpromotable selects and phis. Added
a test case for this.

llvm-svn: 187031

58e25d39

Fix PR16687 where we were incorrectly promoting an alloca that had · 83ea195d

Chandler Carruth authored Jul 24, 2013

pending speculation for a phi node. The problem here is that we were
using growth of the speculation set as an indicator of whether
speculation would occur, and if the phi node is already in the set we
don't see it grow. This is a symptom of the fact that this signal is
a total hack.

Unfortunately, I couldn't really come up with a non-hacky way of
signaling that promotion remains valid *after* speculation occurs, such
that we only speculate when all else looks good for promotion. In the
end, I went with at least a much more explicit approach of doing the
work of queuing inside the phi and select processing and setting
a preposterously named flag to convey that we're in the special state of
requiring speculating before promotion.

Thanks to Richard Trieu and Nick Lewycky for the excellent work reducing
a testcase for this from a pretty giant, nasty assert in a big
application. =] The testcase was excellent.

llvm-svn: 187029

83ea195d
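
The failure mode behind 83ea195d can be reproduced with a plain set. This is a Python toy with hypothetical names, not the SROA code: it contrasts inferring pending speculation from set growth with recording it in an explicit flag.

```python
def queue_by_growth(speculation_set, phi):
    """Old-style signal (assumed model): infer pending speculation
    from whether the set grew."""
    before = len(speculation_set)
    speculation_set.add(phi)
    return len(speculation_set) > before


class PromotionState:
    """Explicit signal: queuing always records that speculation is pending."""

    def __init__(self):
        self.speculation_set = set()
        self.needs_speculation = False

    def queue(self, phi):
        self.speculation_set.add(phi)
        self.needs_speculation = True


# The phi is already queued, so the set does not grow the second time
# and the growth-based signal wrongly reports nothing pending.
pending = {"phi1"}
growth_signal = queue_by_growth(pending, "phi1")

state = PromotionState()
state.queue("phi1")
state.queue("phi1")
explicit_signal = state.needs_speculation
```

Here `growth_signal` is `False` even though speculation for `phi1` is still pending, while `explicit_signal` is `True`.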

Fix spelling · f64212b2
Matt Arsenault authored Jul 23, 2013
```
llvm-svn: 186997
```
f64212b2

Jul 23, 2013
- Remove extraneous null statement. No functionality change! · 6ab9d936
  Nick Lewycky authored Jul 22, 2013
```
llvm-svn: 186893
```
  6ab9d936
- Use switch instead of if. No functionality change. · d4d94065
  Jakub Staszak authored Jul 22, 2013
```
llvm-svn: 186892
```
  d4d94065
- Remove trailing spaces. · 8e1a6e7d
  Jakub Staszak authored Jul 22, 2013
```
llvm-svn: 186890
```
  8e1a6e7d
- When we vectorize across multiple basic blocks we may vectorize PHINodes that... · cf0dcdc7
  Nadav Rotem authored Jul 22, 2013
```
When we vectorize across multiple basic blocks we may vectorize PHINodes that create a cycle. We already break the cycle on phi-nodes, but arithmetic operations are still duplicated. This patch adds code that checks if the operation that we are vectorizing was vectorized during the visit of the operands and uses this value if it can.

llvm-svn: 186883
```
  cf0dcdc7
- OldPtr is llvm::Instruction. Remove unneeded cast<>. · cb132fac
  Jakub Staszak authored Jul 22, 2013
```
llvm-svn: 186880
```
  cb132fac
Jul 22, 2013

Change tabs to spaces. · 6b36db08
Jakub Staszak authored Jul 22, 2013
```
llvm-svn: 186877
```
6b36db08
Fix spelling and grammar · fb183238
Matt Arsenault authored Jul 22, 2013
```
llvm-svn: 186858
```
fb183238

Fix an obvious typo in the loop vectorizer where the cost model uses the wrong... · 8c45d4b2

Nadav Rotem authored Jul 22, 2013

Fix an obvious typo in the loop vectorizer where the cost model uses the wrong variable. The variable BlockCost is ignored.
We don't have tests for the effect of if-conversion loops because it requires a big test (that includes if-converted loops) and it is difficult to find and balance a loop to do the right thing.

llvm-svn: 186845

8c45d4b2

Delete unused helper functions. · d7ff88a8
Nadav Rotem authored Jul 22, 2013
```
llvm-svn: 186808
```
d7ff88a8

Jul 21, 2013

mem2reg: Minor STL usage cleanup. No functionality change. · 2fdb758c
Benjamin Kramer authored Jul 21, 2013
```
llvm-svn: 186790
```
2fdb758c
Make the mem2reg interface use an ArrayRef as it keeps a copy of these · 7aa9ebb5
Chandler Carruth authored Jul 21, 2013
```
to iterate over.

llvm-svn: 186788
```
7aa9ebb5
Revert a part of r186420. Don't forbid multiple store chains that merge. · f6bb6a46
Nadav Rotem authored Jul 21, 2013
```
llvm-svn: 186786
```
f6bb6a46

Hoist the rest of the logic for promoting single-store allocas into the · b1ca98c4

Chandler Carruth authored Jul 21, 2013

helper function. This leaves both trivial cases handled entirely in
helper functions and merely manages the list of allocas to process in
the run method.

The next step will be to handle all of the trivial promotion work prior
to even creating the core class and the subsequent simplifications that
enables.

llvm-svn: 186784

b1ca98c4

Hoist the rest of the logic for fully promoting allocas with all uses in · f9e7e1dd

Chandler Carruth authored Jul 21, 2013

a single block into the helper routine. This takes advantage of the fact
that we can directly replace uses prior to any store with undef to
simplify matters and unconditionally promote allocas only used within
one block.

I've removed the special handling for the case of no stores existing.
This has no semantic effect but might slow things down. I'll fix that in
a later patch when I refactor this entire thing to make the different
cases easier to manage.

llvm-svn: 186783

f9e7e1dd

Remove a method made dead by the prior refactoring. · e99f9315
Chandler Carruth authored Jul 21, 2013
```
llvm-svn: 186782
```
e99f9315

Hoist the two trivial promotion routines out of the big class that · 420fafef

Chandler Carruth authored Jul 20, 2013

handles the general cases.

The hope is to refactor this so that we don't end up building the entire
class for the trivial cases. I also want to lift a lot of the early
pre-processing in the initial segment of run() into a separate routine,
and really none of it needs to happen inside the primary promotion
class.

These routines in particular used none of the actual state in the
promotion class, so they don't really make sense as members.

llvm-svn: 186781

420fafef

Hoist the AllocaInfo struct to the top of the file. · 48e11fd7

Chandler Carruth authored Jul 20, 2013

This struct is nicely independent of everything else, and we already
needed a forward declaration here. It's simpler to just define it
immediately.

llvm-svn: 186780

48e11fd7

Sink a typedef and comparator down to the function that actually uses them. · 4711793e
Chandler Carruth authored Jul 20, 2013
```
llvm-svn: 186779
```
4711793e

Don't crash when llvm.compiler.used becomes empty. · c2bb73fc

Rafael Espindola authored Jul 20, 2013

GlobalOpt simplifies llvm.compiler.used by removing any members that are also
in the more strict llvm.used. Handle the special case where llvm.compiler.used
becomes empty.

llvm-svn: 186778

c2bb73fc
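
A sketch of the simplification c2bb73fc refers to, as a Python toy. The function name is hypothetical, and returning `None` for an empty result is only one plausible way to model the empty case; the commit message does not say how the real code handles it.

```python
def simplify_compiler_used(compiler_used, used):
    """Drop members of llvm.compiler.used that are also in llvm.used.

    Returns the remaining members in their original order, or None
    when nothing is left (assumed here to mean the list is removed).
    """
    stricter = set(used)
    remaining = [name for name in compiler_used if name not in stricter]
    if not remaining:
        return None  # special case: the list became empty
    return remaining
```

The point of the special case is that code consuming the result must not assume at least one member survives.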

Don't allocate the DIBuilder on the heap and remove all the complexity · f3878f46
Chandler Carruth authored Jul 20, 2013
```
that ensued from that.

llvm-svn: 186777
```
f3878f46
Rename constructor parameters to follow the common member-shadowing · e62f211b
Chandler Carruth authored Jul 20, 2013
```
pattern and conform to the naming conventions.

llvm-svn: 186776
```
e62f211b
Reformat the implementation of mem2reg with clang-format so that my · b3e8e6f1
Chandler Carruth authored Jul 20, 2013
```
subsequent changes don't introduce inconsistencies.

llvm-svn: 186775
```
b3e8e6f1
Remove a DenseMapInfo specialization for std::pair -- we have one of · 985eb0b5
Chandler Carruth authored Jul 20, 2013
```
those baked into DenseMap now.

llvm-svn: 186773
```
985eb0b5
Update mem2reg's comments to conform to the new doxygen standards. No · 01951610
Chandler Carruth authored Jul 20, 2013
```
functionality changed.

llvm-svn: 186772
```
01951610

Jul 20, 2013
- SROA: Microoptimization: Remove dead entries first, then sort. · 08e5070b
  Benjamin Kramer authored Jul 20, 2013
```
While there, replace an explicit struct with std::mem_fun.

llvm-svn: 186761
```
  08e5070b
- InstCombine: call FoldOpIntoSelect for all floating binops, not just fmul · a9b57f6b
  Stephen Lin authored Jul 20, 2013
```
llvm-svn: 186759
```
  a9b57f6b
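
The ordering in the SROA micro-optimization listed above (08e5070b) can be sketched as follows. This is a Python toy with a hypothetical entry layout (`offset`, `dead`), not the SROA data structure: removing dead entries before sorting means the sort only touches entries that will be kept.

```python
def prune_then_sort(entries):
    """Remove dead entries first, then sort the survivors by offset."""
    live = [entry for entry in entries if not entry["dead"]]
    live.sort(key=lambda entry: entry["offset"])
    return live


entries = [
    {"offset": 8, "dead": False},
    {"offset": 0, "dead": True},
    {"offset": 4, "dead": False},
]
offsets = [entry["offset"] for entry in prune_then_sort(entries)]
```

The result is the same as sorting first and pruning afterwards, but the sort runs over fewer elements.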