Commits · fa7278f18f7e91ad2dc4a0e687d0c67e427b608c · Roger Ferrer / llvm-epi

Apr 26, 2014

Trivial test commit. · fa7278f1
Dan Liew authored Apr 26, 2014
```
llvm-svn: 207328
```
fa7278f1
Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>. · 48d114be
Craig Topper authored Apr 26, 2014
```
llvm-svn: 207327
```
48d114be

Remove an unused version of getMemIntrinsicNode and getNode. Additionally,... · 963c5d5e

Craig Topper authored Apr 26, 2014

Remove an unused version of getMemIntrinsicNode and getNode. Additionally, these were calling makeVTList with the pointers passed in, which were unlikely to belong to SelectionDAG and would likely have just been stack pointers.

llvm-svn: 207326

963c5d5e

[Sema] Adjust Sema::getCurBlock()/getCurLambda() to take into account that we may have · ea75aad3
Argyrios Kyrtzidis authored Apr 26, 2014
```
switched CurContext due to class template instantiation.

Fixes crash of the included test case.
rdar://16527205

llvm-svn: 207325
```
ea75aad3
Include C++ source for debug info test case committed in r207323 · 9c34526c
David Blaikie authored Apr 26, 2014
```
llvm-svn: 207324
```
9c34526c

DWARF Type Units: Avoid emitting type units under fission if the type requires an address. · e12b49a6

David Blaikie authored Apr 26, 2014

Since there's no way to ensure the type unit in the .dwo and the type
unit skeleton in the .o are correlated, this cannot work.

This implementation is a bit inefficient for a few reasons, called out
in comments.

llvm-svn: 207323

e12b49a6

Print X86ISD::PMULDQ nodes properly in debug output. · c2ad8f3e
Benjamin Kramer authored Apr 26, 2014
```
llvm-svn: 207322
```
c2ad8f3e

DwarfDebug: Minor refactoring around type unit construction · f3de2ab4

David Blaikie authored Apr 26, 2014

Sink the addition of the declaration attribute down to where the
signature is added, so that if the signature is not added, neither is the
declaration attribute (this will come in handy when aborting type unit
construction to instead emit the type into the CU directly in some
cases).

Pull out type unit identifier hashing just to simplify the function a
little; it'll be getting longer.

llvm-svn: 207321

f3de2ab4

X86TTI: i16/i32 vector div with a constant (splat) divisor are reasonably cheap now. · 7c372272
Benjamin Kramer authored Apr 26, 2014
```
Turn vectorization back on.

llvm-svn: 207320
```
7c372272

libclang: remove 'CXDiagnostic_Remark' · 87d39753

Alp Toker authored Apr 26, 2014

The change was landed without review or test cases.

It trivially broke almost any stable application checking for Severity >=
CXDiagnostic_Error or indeed any other kind of severity comparison upon
encountering a 'remark'.

Mapped to CXDiagnostic_Warning until a workable solution is proposed to the
list that preserves API stability.

(It's also not clear why the rest of r202475 wasn't simply implemented as a
modifier to the existing 'warning' level.)

llvm-svn: 207319

87d39753

X86: Lower SMUL_LOHI of v4i32 to pmuldq when SSE4.1 is available. · 6d2dff61
Benjamin Kramer authored Apr 26, 2014
```
llvm-svn: 207318
```
6d2dff61

X86: Add patterns for MULHU/MULHS of v8i16 and v16i16. · c9827ab1

Benjamin Kramer authored Apr 26, 2014

This gets us pretty code for divs of i16 vectors. Turn the existing
intrinsics into the corresponding nodes.

llvm-svn: 207317

c9827ab1

Rip out X86-specific vector SDIV lowering, make the corresponding DAGCombiner... · ad016870
Benjamin Kramer authored Apr 26, 2014
```
Rip out X86-specific vector SDIV lowering, make the corresponding DAGCombiner transform work on vectors.

llvm-svn: 207316
```
ad016870

DAGCombiner: Turn divs of vector splats into vectorized multiplications. · 4dae598b

Benjamin Kramer authored Apr 26, 2014

Otherwise the legalizer would just scalarize everything. Support for
mulhi in the targets isn't that great yet so on most targets we get
exactly the same scalarized output. Add a test for x86 vector udiv.

I had to disable the mulhi nodes on ARM because there aren't any patterns
for them. As far as I know ARM has instructions for getting the high part of
a multiply so this should be fixed.

llvm-svn: 207315

4dae598b

X86: Custom lower v4i32 UMUL_LOHI into 2 pmuludqs. · 29139d5c
Benjamin Kramer authored Apr 26, 2014
```
Test will follow soon.

llvm-svn: 207314
```
29139d5c
Revert r206749 till a final decision about the intrinsics is made. · 1a97a7bc
Michael Zolotukhin authored Apr 26, 2014
```
llvm-svn: 207313
```
1a97a7bc

[LCG] Rather than removing nodes from the SCC entry set when we process · 90821c2a

Chandler Carruth authored Apr 26, 2014

them, just skip over any DFS-numbered nodes when finding the next root
of a DFS. This allows the entry set to just be a vector as we populate
it from a uniqued source. It also removes the possibility for a linear
scan of the entry set to actually do the removal which can make things
go quadratic if we get unlucky.

llvm-svn: 207312

90821c2a
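
The approach described above can be sketched as follows: the entry set stays a plain list, and any candidate that already has a DFS number is skipped when picking the next root, so nothing is ever removed from the list. The graph representation and names are illustrative assumptions, not the LazyCallGraph data structures.

```python
def dfs_roots(graph, entry_nodes):
    """Number all nodes reachable from entry_nodes in DFS preorder.

    graph: dict mapping node -> list of successor nodes.
    entry_nodes: list of unique candidate roots (never mutated here).
    Returns (roots actually used, dict of node -> DFS number).
    """
    dfs_number = {}
    roots = []
    counter = 0
    for candidate in entry_nodes:
        # Skip instead of erasing: an already-numbered node cannot be a new root.
        if candidate in dfs_number:
            continue
        roots.append(candidate)
        stack = [candidate]
        while stack:
            node = stack.pop()
            if node in dfs_number:
                continue
            counter += 1
            dfs_number[node] = counter
            # Reverse so the first successor is visited first.
            stack.extend(reversed(graph[node]))
    return roots, dfs_number
```

Each candidate costs one dictionary lookup, so choosing roots stays linear in the size of the entry list.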

[LCG] Rotate the full SCC finding algorithm to avoid round-trips through · 5e2d70b9

Chandler Carruth authored Apr 26, 2014

the DFS stack for leaves in the call graph. As mentioned in my previous
commit, this is particularly interesting for graphs which have high fan
out but low connectivity resulting in many leaves. For such graphs, this
can remove a large % of the DFS stack traffic even though it doesn't
make the stack much smaller.

It's a bit easier to formulate this for the full algorithm because that
one stops completely for each SCC. For example, I was able to directly
eliminate the "Recurse" boolean used to continue an outer loop from the
inner loop.

llvm-svn: 207311

5e2d70b9

[LCG] Hoist the main DFS loop out of the edge removal function. This · aca48d04

Chandler Carruth authored Apr 26, 2014

makes working through the worklist much cleaner, and makes it possible
to avoid the 'bool-to-continue-the-outer-loop' hack. Not a huge
difference, but I think this is approaching as polished as I can make
it.

llvm-svn: 207310

aca48d04

RecursivelyDeleteTriviallyDeadInstructions() could remove · af7a87d2

Gerolf Hoflehner authored Apr 26, 2014

more than 1 instruction. The caller needs to be aware of this
and adjust instruction iterators accordingly.

rdar://16679376

Repaired r207302.

llvm-svn: 207309

af7a87d2

Restore CloneFunction.cpp which got accidentally · 1da7cbd5
Gerolf Hoflehner authored Apr 26, 2014
```
overwritten by previous backout of r207303

llvm-svn: 207308
```
1da7cbd5
Fix bug #18350. Add tests for tuples of all the smart pointers (except auto_ptr) · 85d3e7a7
Marshall Clow authored Apr 26, 2014
```
llvm-svn: 207307
```
85d3e7a7

[LCG] In the incremental SCC re-formation, lift the node currently being · 680af7a7

Chandler Carruth authored Apr 26, 2014

processed in the DFS out of the stack completely. Keep it exclusively in
a variable. Re-shuffle some code structure to make this easier. This can
have a very dramatic effect in some cases because call graphs tend to
look like a high fan-out spanning tree. As a consequence, there are
a large number of leaf nodes in the graph, and this technique causes
leaf nodes to never even go into the stack. While this only reduces the
max depth by 1, it may cause the total number of round trips through the
stack to drop by a lot.

Now, most of this isn't really relevant for the incremental version. =]
But I wanted to prototype it first here as this variant is in some ways more
complex. As long as I can get the code factored well here, I'll next
make the primary walk look the same. There are several refactorings this
exposes I think.

llvm-svn: 207306

680af7a7
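
As a rough illustration of the technique described above, the sketch below runs a depth-first walk that keeps the node being processed in a local variable and pushes only the suspended parent when it descends. A node without unvisited successors is therefore never pushed. This is a simplified walk over an assumed adjacency-dict graph; the SCC bookkeeping of the real code is omitted, and all names are hypothetical.

```python
def dfs_current_in_variable(graph, root):
    """Preorder DFS where the active node lives outside the stack.

    graph: dict mapping node -> list of successor nodes.
    Returns (visit order, list of nodes that were pushed, maximum stack depth).
    """
    visited = {root}
    order = [root]
    stack = []        # suspended (node, successor iterator) frames only
    pushed = []
    max_depth = 0
    node, successors = root, iter(graph[root])
    while True:
        child = None
        for candidate in successors:
            if candidate not in visited:
                child = candidate
                break
        if child is not None:
            visited.add(child)
            order.append(child)
            # Suspend the parent; the child becomes the active node.
            stack.append((node, successors))
            pushed.append(node)
            max_depth = max(max_depth, len(stack))
            node, successors = child, iter(graph[child])
        elif stack:
            node, successors = stack.pop()
        else:
            return order, pushed, max_depth
```

On a path of three nodes the stack never holds more than two frames, which matches the "max depth reduced by 1" observation in the message.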

[LCG] Special case the removal of self edges. These don't impact the SCC · a7205b61

Chandler Carruth authored Apr 26, 2014

graph in any way because we don't track edges in the SCC graph, just
nodes. This also lets us add a nice assert about the invariant that
we're working on at least a certain number of nodes within the SCC.

llvm-svn: 207305

a7205b61

[DAG] During DAG legalization keep opaque constants even after expanding. · a6bda8ba

Juergen Ributzka authored Apr 26, 2014

The included test case would return incorrect results, because the expansion
of a shift with a constant shift amount of 0 would generate undefined behavior.

This is because ExpandShiftByConstant assumes that all shifts by constants with
a value of 0 have already been optimized away. This doesn't happen for opaque
constants and usually this isn't a problem, because opaque constants won't take
this code path - they are not supposed to. In the case that the opaque constant
has to be expanded by the legalizer, the legalizer would drop the opaque flag.
In this case we hit the limitations of ExpandShiftByConstant and create incorrect
code.

This commit fixes the legalizer by not dropping the opaque flag when expanding
opaque constants and by adding an assertion to ExpandShiftByConstant to catch
this unsupported case in the future.

This fixes <rdar://problem/16718472>

llvm-svn: 207304

a6bda8ba
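
To illustrate why a shift amount of 0 is dangerous in this kind of expansion, the sketch below splits a 64-bit left shift into two 32-bit halves. The helpers model a 32-bit shift that rejects out-of-range amounts, standing in for undefined behavior. This shows the general failure mode only; it is not the ExpandShiftByConstant code, and the guarded variant is one possible guard, not the fix made in this commit.

```python
MASK32 = 0xFFFFFFFF


def shl32(value, amount):
    """32-bit left shift; amounts outside [0, 32) are treated as an error."""
    if not 0 <= amount < 32:
        raise ValueError("shift amount out of range for a 32-bit value")
    return (value << amount) & MASK32


def lshr32(value, amount):
    """32-bit logical right shift; amounts outside [0, 32) are treated as an error."""
    if not 0 <= amount < 32:
        raise ValueError("shift amount out of range for a 32-bit value")
    return (value & MASK32) >> amount


def expand_shl64_naive(lo, hi, amount):
    """Expand a 64-bit shl into 32-bit halves, assuming 0 < amount < 32."""
    new_hi = shl32(hi, amount) | lshr32(lo, 32 - amount)  # amount == 0 shifts by 32
    new_lo = shl32(lo, amount)
    return new_lo, new_hi


def expand_shl64_guarded(lo, hi, amount):
    """Same expansion, but a zero amount is handled before the split."""
    if amount == 0:
        return lo & MASK32, hi & MASK32
    return expand_shl64_naive(lo, hi, amount)
```

Calling `expand_shl64_naive(lo, hi, 0)` raises because the low half would be shifted right by 32, which is the case the expansion assumes has already been optimized away.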

Revert commit r207302 since build failures · c46e9b04
Gerolf Hoflehner authored Apr 26, 2014
```
have been reported.

llvm-svn: 207303
```
c46e9b04

RecursivelyDeleteTriviallyDeadInstructions() could remove · 34210108

Gerolf Hoflehner authored Apr 26, 2014

more than 1 instruction. The caller needs to be aware of this
and adjust instruction iterators accordingly.

rdar://16679376

llvm-svn: 207302

34210108

[X86] Implement TargetLowering::getScalingFactorCost hook. · ea18933d

Quentin Colombet authored Apr 26, 2014

Scaling factors are not free on X86 because every "complex" addressing mode
breaks the related instruction into 2 allocations instead of 1.

<rdar://problem/16730541>

llvm-svn: 207301

ea18933d

[LCG] Refactor the duplicated code I added in my last commit here into · 8f92d6db

Chandler Carruth authored Apr 26, 2014

a helper function. Also factor the other two places where we did the
same thing into the helper function. =] Much cleaner this way. NFC.

llvm-svn: 207300

8f92d6db

[InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shift · 8cc9059c

Andrea Di Biagio authored Apr 26, 2014

right intrinsics.

A packed logical shift right with a shift count bigger than or equal to the
element size always produces a zero vector. In all other cases, it can be
safely replaced by a 'lshr' instruction.

llvm-svn: 207299

8cc9059c
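
The rule stated above can be modelled in a few lines: a count at or beyond the element size yields all zeros, and any smaller count behaves like an ordinary per-lane logical shift right. This is a sketch of the semantics with a hypothetical function name, not the InstCombine code.

```python
def packed_lshr(lanes, count, elem_bits):
    """Model a packed logical shift right over unsigned lanes of elem_bits bits."""
    lane_mask = (1 << elem_bits) - 1
    if count >= elem_bits:
        # Oversized counts produce a zero vector rather than an undefined result.
        return [0] * len(lanes)
    # In-range counts match a plain logical shift right on every lane.
    return [(x & lane_mask) >> count for x in lanes]
```

The distinction matters because a plain `lshr` by a count that is too large has no defined result, so the fold has to treat that case separately.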

Add missing include guards and missing #include, found by modules build. · 8d039e44
Richard Smith authored Apr 26, 2014
```
llvm-svn: 207298
```
8d039e44

[PECOFF] Allow multiple directives in one module-definition file. · f33946d5

Rui Ueyama authored Apr 26, 2014

I'm a bit surprised that I have not implemented this yet. This is
definitely needed to handle real-world module definition files.
This patch contains a unit test for r207294.

llvm-svn: 207297

f33946d5

Add mangling for attribute enable_if. The demangling patch for libcxxabi is still in review. · 0c2986f7
Nick Lewycky authored Apr 26, 2014
```
llvm-svn: 207296
```
0c2986f7
Appease the almighty buildbots. · d71f110f
Filipe Cabecinhas authored Apr 26, 2014
```
llvm-svn: 207295
```
d71f110f

[PECOFF] Fix off-by-one error in .def file parser. · 637300ea

Rui Ueyama authored Apr 25, 2014

I'm fixing another bug in the parser, and I wanted to submit this
fix as a separate change as it's logically independent from the other.
I'll add a test for this shortly.

llvm-svn: 207294

637300ea

Since one or more Editline instances of the same kind (lldb commands,... · 28432c24

Greg Clayton authored Apr 25, 2014

Since one or more Editline instances of the same kind (lldb commands, expressions, etc) can exist at once, they should all share a ref-counted history object.

Now they do.

llvm-svn: 207293

28432c24

Free the strong reference to a lldb::SBDebugger that the script interpreter... · ed6499fe

Greg Clayton authored Apr 25, 2014

Free the strong reference to a lldb::SBDebugger that the script interpreter was holding onto in the "lldb.debugger" global variable.

llvm-svn: 207292

ed6499fe

Optimization for certain shufflevector by using insertps. · 363b570d

Filipe Cabecinhas authored Apr 25, 2014

Summary:
If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower
certain shufflevectors to an insertps instruction:
When most of the shufflevector result's elements come from one vector (and
keep their index), and one element comes from another vector or a memory
operand.

Added tests for insertps optimizations on shufflevector.
Added support and tests for v4i32 vector optimization.

Reviewers: nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3475

llvm-svn: 207291

363b570d
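
The shuffle shape described in the summary can be recognised with a small mask check. The sketch below assumes a 4-element mask in which indices 0 to 3 select from the first vector and 4 to 7 from the second, and it only handles the case where the first vector is the base. The function name is hypothetical and the real lowering may accept more patterns.

```python
def match_single_insert(mask):
    """Return (dst_lane, src_lane) if the shuffle is 'first vector with one lane
    replaced by a lane of the second vector', otherwise None.

    mask: list of 4 indices in 0..7 (0-3 = first vector, 4-7 = second vector).
    """
    if len(mask) != 4 or any(not 0 <= m < 8 for m in mask):
        raise ValueError("expected 4 indices in the range 0..7")
    # Lanes that do not keep their own index from the first vector.
    moved = [lane for lane, m in enumerate(mask) if m != lane]
    if len(moved) != 1:
        return None
    dst_lane = moved[0]
    src_index = mask[dst_lane]
    if src_index < 4:
        return None  # element moved within the first vector, not an insert
    return dst_lane, src_index - 4
```

For instance, the mask `[0, 1, 6, 3]` keeps lanes 0, 1 and 3 in place and takes lane 2 from the second vector, which is the single-element insert the summary describes.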

Revert "blockfreq: Approximate irreducible control flow" · 42292cea

Duncan P. N. Exon Smith authored Apr 25, 2014

This reverts commit r207286.  It causes an ICE on the
cmake-llvm-x86_64-linux buildbot [1]:

    llvm/lib/Analysis/BlockFrequencyInfo.cpp: In lambda function:
    llvm/lib/Analysis/BlockFrequencyInfo.cpp:182:1: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035

[1]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/12093/steps/build_llvm/logs/stdio

llvm-svn: 207287

42292cea

blockfreq: Approximate irreducible control flow · 384d0e8a

Duncan P. N. Exon Smith authored Apr 25, 2014

Previously, irreducible backedges were ignored.  With this commit,
irreducible SCCs are discovered on the fly, and modelled as loops with
multiple headers.

This approximation specifies the headers of irreducible sub-SCCs as its
entry blocks and all nodes that are targets of a backedge within it
(excluding backedges within true sub-loops).  Block frequency
calculations act as if we insert a new block that intercepts all the
edges to the headers.  All backedges and entries to the irreducible SCC
point to this imaginary block.  This imaginary block has an edge (with
even probability) to each header block.

The result is now reasonable enough that I've added a number of
testcases for irreducible control flow.  I've outlined in
`BlockFrequencyInfoImpl.h` ways to improve the approximation.

<rdar://problem/14292693>

llvm-svn: 207286

384d0e8a