Commits · 071b0fa06a51f44da8190836b84fc3abf222bf5c · Roger Ferrer / llvm-epi

Dec 14, 2016

Revert revision 289721. · 071b0fa0
Jan Sjodin authored Dec 14, 2016
```
llvm-svn: 289723
```
Dummy commit. · 9419021b
Jan Sjodin authored Dec 14, 2016
```
llvm-svn: 289721
```
[LTO] Add the missing datalayout in a test. · ebed410c
Davide Italiano authored Dec 14, 2016
```
llvm-svn: 289720
```

[LTO] Reject modules without datalayout. · 2ceb628f

Davide Italiano authored Dec 14, 2016

Also, update the ~60 failing tests in the tree which did
not contain a valid datalayout.
This fixes PR31123. lld will be updated in a following patch,
immediately after this is committed.

Differential Revision:  https://reviews.llvm.org/D27082

llvm-svn: 289719


[asan] Don't skip instrumentation of masked load/store unless we've seen a... · dd968870

Filipe Cabecinhas authored Dec 14, 2016

[asan] Don't skip instrumentation of masked load/store unless we've seen a full load/store on that pointer.

Reviewers: kcc, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27625

llvm-svn: 289718


[asan] Hook ClInstrumentWrites and ClInstrumentReads to masked operation instrumentation. · 1e69017a
Filipe Cabecinhas authored Dec 14, 2016
```
Reviewers: kcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27548

llvm-svn: 289717
```

Create SampleProfileLoader pass in llvm instead of clang · a99e082e

Dehao Chen authored Dec 14, 2016

Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder.

Reviewers: tejohnson, davidxl, dnovillo

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D27743

llvm-svn: 289714


[ARM] Split 128-bit vectors in BUILD_VECTOR lowering · cbed30c5

Eli Friedman authored Dec 14, 2016

Given that INSERT_VECTOR_ELT operates on D registers anyway, combining
64-bit vectors into a 128-bit vector is basically free. Therefore, try
to split BUILD_VECTOR nodes before giving up and lowering them to a series
of INSERT_VECTOR_ELT instructions. Sometimes this allows dramatically
better lowerings; see testcases for examples. Inspired by similar code
in the x86 backend for AVX.

Differential Revision: https://reviews.llvm.org/D27624

llvm-svn: 289706


fix gcc warning about a superfluous ; · 53816d07
Nico Weber authored Dec 14, 2016
```
llvm-svn: 289705
```

[InstCombine] Folding of a compare with RHS const should merge debug locations · cfd71986

Robert Lougher authored Dec 14, 2016

If all the operands to a phi node are compares that have a RHS constant,
instcombine will try to pull them through the phi node, combining them into
a single operation. When it does this, the debug location of the new op
should be the merged debug locations of the phi node arguments.

Patch 8 of 8 for D26256.  Folding of a compare that has a RHS constant.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289704


[ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently. · 10576e73

Eli Friedman authored Dec 14, 2016

Currently, there are substantial problems forming vld1_dup even if the
VDUP survives legalization. The lack of an actual node
leads to terrible results: not only can we not form post-increment vld1_dup
instructions, but we form scalar pre-increment and post-increment
loads which force the loaded value into a GPR. This patch fixes that
by combining the vdup+load into an ARMISD node before DAGCombine
messes it up.

Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable).

Differential Revision: https://reviews.llvm.org/D27694

llvm-svn: 289703


[DebugInfo] Changed DIBuilder::createCompileUnit() to take DIFile instead of... · 43c8b6b7

Amjad Aboud authored Dec 14, 2016

[DebugInfo] Changed DIBuilder::createCompileUnit() to take DIFile instead of FileName and Directory.
This way it will be easier to expand DIFile (e.g., to contain checksum) without the need to modify the createCompileUnit() API.

Reviewers: llvm-commits, rnk

Differential Revision: https://reviews.llvm.org/D27762

llvm-svn: 289702


Fix build failure due to r289674 on certain systems · 04334b52
Yaxun Liu authored Dec 14, 2016
```
Removed a useless include which caused conflict.

llvm-svn: 289700
```

[InstCombine] Folding of a binop with RHS const should merge the debug locations · c9f73547

Robert Lougher authored Dec 14, 2016

If all the operands to a phi node are a binop with a RHS constant, instcombine
will try to pull them through the phi node, combining them into a single
operation. When it does this, the debug location of the new op should be the
merged debug locations of the phi node arguments.

Patch 7 of 8 for D26256.  Folding of a binop with RHS constant.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289699
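
The fold described in this message can be modelled outside of LLVM. The following is a minimal sketch in plain Python, not the InstCombine implementation: the function names `before_fold` and `after_fold` are hypothetical, and a boolean stands in for which predecessor block control arrived from.

```python
import operator

def before_fold(came_from_left, a, b, op, const):
    """Each predecessor applies the binop, then the phi picks a result."""
    left_result = op(a, const)
    right_result = op(b, const)
    return left_result if came_from_left else right_result

def after_fold(came_from_left, a, b, op, const):
    """The phi picks the operand first; a single binop follows it."""
    merged_operand = a if came_from_left else b
    return op(merged_operand, const)

# Both forms agree on either incoming edge.
SAMPLE_LEFT = after_fold(True, 7, 11, operator.add, 5)
SAMPLE_RIGHT = after_fold(False, 7, 11, operator.add, 5)
```

After the fold, one operation stands in for the operations of both predecessors, so neither original source location describes it on its own. That is why the message asks for a merged location.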


DebugInfo: Improve type safety and simplify some subprogram finalization code · b4614689

David Blaikie authored Dec 14, 2016

This probably ended up this way after the subprogram<>function link
inversion and debug info metadata schema changes.

llvm-svn: 289697


[GVNHoist] Move GVNHoist to function simplification part of pipeline. · ca11a1e1

Geoff Berry authored Dec 14, 2016

Summary:
Move GVNHoist to later in the optimization pipeline, specifically, to
the function simplification part of the pipeline.  The new pipeline
location allows GVNHoist to run on a function after its callees have
been inlined but before the function has been considered for inlining
into its callers, exposing more opportunities for hoisting.

Performance results on AArch64 kryo:
Improvements:
  Benchmarks/CoyoteBench/fftbench  -24.952%
  spec2006/bzip2                    -4.071%
  internal bmark                    -3.177%
  Benchmarks/PAQ8p/paq8p            -1.754%
  spec2000/perlbmk                  -1.328%
  spec2006/h264ref                  -1.140%

Regressions:
  internal bmark                    +1.818%
  Benchmarks/mafft/pairlocalalign   +1.084%

Reviewers: sebpop, dberlin, hiraditya

Subscribers: aemerson, mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D27722

llvm-svn: 289696


[WinEH] Avoid holding references to BlockColor (DenseMap) entries while inserting new elements · ce3bcae6
Andrew Kaylor authored Dec 14, 2016
```
Differential Revision: https://reviews.llvm.org/D27693

llvm-svn: 289694
```

[InstCombine] When folding casts through a phi node merge the debug locations · f02d9b83

Robert Lougher authored Dec 14, 2016

If all the operands to a phi node are a cast, instcombine will try to pull
them through the phi node, combining them into a single cast. When it does
this, the debug location of the new cast should be the merged debug locations
of the phi node arguments.

Patch 6 of 8 for D26256.  Folding of a cast operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289693


Include <cstdarg> in PrettyStackTrace.cpp, fixing the bots. · 62204ad7
Sean Callanan authored Dec 14, 2016
```
llvm-svn: 289691
```

Prepare PrettyStackTrace for LLDB adoption · 032dbf9e

Sean Callanan authored Dec 14, 2016

This patch fixes the linkage for __crashtracer_info__, making it have the proper mangling (extern "C") and linkage (private extern).
It also adds a new PrettyStackTrace type, allowing LLDB to adopt this instead of Host::SetCrashDescriptionWithFormat().

Without this patch, CrashTracer on macOS won't pick up pretty stack traces from any LLVM client.
An LLDB commit adopting this API will follow shortly.

Differential Revision: https://reviews.llvm.org/D27683

llvm-svn: 289689


[InstCombine] Folding loads through a phi node should merge the debug locations · 373e36a4

Robert Lougher authored Dec 14, 2016

If all the operands to a phi node are a load, instcombine will try to pull
them through the phi node, combining them into a single load. When it does
this, the debug location of the new load should be the merged debug locations
of the phi node arguments.

Patch 5 of 8 for D26256.  Folding of a load operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289688


[InstCombine] When folding GEP through a phi node merge the debug locations · 8fc1e89b

Robert Lougher authored Dec 14, 2016

If all the operands to a phi node are getelementptr, instcombine
will try to pull them through the phi node, combining them into a single
operation.  When it does this, the debug location of the new getelementptr
should be the merged debug locations of the phi node arguments.

Patch 4 of 8 for D26256.  Folding of a getelementptr operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289684


This change does two things: · ba1024cf

Eric Christopher authored Dec 14, 2016

1. Adds a "Discriminator" field to struct DILineInfo, which defaults to 0.
2. Fills out the "Discriminator" field in DILineInfo in DWARFDebugLine::LineTable::getFileLineInfoForAddress().

This is done in order to have a slightly nicer interface in getFileLineInfoForAddress.

Patch by Simon Que!

Differential Revision: https://reviews.llvm.org/D27649

llvm-svn: 289683
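
The shape of this change can be sketched in plain Python. This is an illustration only, not the LLVM code: `LineInfo`, `ROWS`, and `line_info_for_address` are hypothetical stand-ins for DILineInfo, a decoded line table, and getFileLineInfoForAddress, and the row values are made up.

```python
from bisect import bisect_right
from dataclasses import dataclass

@dataclass
class LineInfo:
    """Simplified stand-in for DILineInfo (illustrative, not the LLVM type)."""
    file_name: str = "<invalid>"
    line: int = 0
    column: int = 0
    discriminator: int = 0  # the new field; defaults to 0

# Hypothetical rows: (start_address, file, line, column, discriminator),
# sorted by start address.
ROWS = [
    (0x1000, "a.c", 10, 3, 0),
    (0x1008, "a.c", 10, 3, 2),
    (0x1010, "a.c", 12, 1, 0),
]

def line_info_for_address(address):
    """Return the row covering `address`, copying its discriminator too."""
    starts = [row[0] for row in ROWS]
    index = bisect_right(starts, address) - 1
    if index < 0:
        return LineInfo()  # address precedes the table: defaults only
    _, file_name, line, column, discriminator = ROWS[index]
    return LineInfo(file_name, line, column, discriminator)
```

Because the field has a default, existing callers that never look at the discriminator keep working unchanged.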


[InstCombine] Merge debug locations when folding through a phi node · 4b0790d4

Robert Lougher authored Dec 14, 2016

If all the operands to a phi node are of the same operation, instcombine
will try to pull them through the phi node, combining them into a single
operation.  When it does this, the debug location of the operation should
be the merged debug locations of the phi node arguments.

Patch 3 of 8 for D26256.  Folding of a compare operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289681


[libFuzzer] disable msan for one more hook that reads target's data that might be uninitialized · d9d9a545
Kostya Serebryany authored Dec 14, 2016
```
llvm-svn: 289680
```

[InstCombine] Merge debug locations when folding through a phi node · 2428a405

Robert Lougher authored Dec 14, 2016

If all the operands to a phi node are of the same operation, instcombine
will try to pull them through the phi node, combining them into a single
operation.  When it does this, the debug location of the operation should
be the merged debug locations of the phi node arguments.

Patch 2 of 8 for D26256.  Folding of a binary operation.

Differential Revision: https://reviews.llvm.org/D26256

llvm-svn: 289679


revert r289669 which breaks bots · 23025f84
Dehao Chen authored Dec 14, 2016
```
llvm-svn: 289676
```
AMDGPU: Emit runtime metadata version 2 as YAML · 07d659bc
Yaxun Liu authored Dec 14, 2016
```
Differential Revision: https://reviews.llvm.org/D25046

llvm-svn: 289674
```
lit.cfg: Check value of build config rather than converting to boolean · ebd8110a
Derek Schuff authored Dec 14, 2016
```
This is a CMake var which never evaluates to false.

llvm-svn: 289673
```
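
The pitfall behind this fix is easy to reproduce in plain Python, the language lit.cfg is written in. The sketch below assumes the build system substitutes a CMake-style string such as "ON" or "OFF" into the config; the helper name `cmake_bool` is hypothetical and is not taken from the commit.

```python
# Any non-empty string is truthy in Python, so a substituted CMake value
# such as "OFF" or "0" still converts to True.
FALSE_SPELLINGS = {"", "0", "OFF", "NO", "FALSE", "N", "IGNORE", "NOTFOUND"}

def cmake_bool(value):
    """Interpret a CMake-style string by its value (hypothetical helper)."""
    text = value.strip().upper()
    return text not in FALSE_SPELLINGS and not text.endswith("-NOTFOUND")

naive = bool("OFF")          # truthiness of a non-empty string
checked = cmake_bool("OFF")  # value-based check
```

Here `naive` is true while `checked` is false, which is the difference between converting the variable to a boolean and checking its value.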

AMDGPU: Make AllocationPriority of SGPRs higher than VGPRs · bdc0ac0a

Matt Arsenault authored Dec 14, 2016

Since SGPRs should spill to VGPRs, they should be allocated first.
I don't think this is sufficient for SGPRs to always spill to
VGPRs though.

llvm-svn: 289671


Create SampleProfileLoader pass in llvm instead of clang · cb61c94d

Dehao Chen authored Dec 14, 2016

Reviewers: tejohnson, davidxl, dnovillo

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D27743

llvm-svn: 289669


Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." · f5bf03c7
Nirav Dave authored Dec 14, 2016
```
Reverting due to ARM MCJIT and MIPS LLD error.

This reverts commit r289659.

llvm-svn: 289667
```
AMDGPU: Change vintrp printing · ebfba702
Matt Arsenault authored Dec 14, 2016
```
llvm-svn: 289664
```
Revert gold part of change, just liblto · 112b3039
Derek Schuff authored Dec 14, 2016
```
llvm-svn: 289663
```

Disable libLTO tests when libLTO is not built · 0c2796dc

Derek Schuff authored Dec 14, 2016

Summary:
The current test only checks whether ld64 is available, causing tests
to fail when ld64 is available but libLTO is not built.

Reviewers: beanz, mehdi_amini

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D27739

llvm-svn: 289662


New API for merging debug locations. NFC. · 7bd04e3b

Robert Lougher authored Dec 14, 2016

Given two debug locations the function getMergedLocation combines the
locations into a single location (which may be an empty location).
Please see https://reviews.llvm.org/D26256 for the discussion leading
up to this API.

Note the function is currently a stub.  This allows optimisations to
use the API although no location will actually be used.

This is patch 1 out of 8 for D26256.  As suggested by David Blaikie,
each change in D26256 has been broken out into a separate patch.

llvm-svn: 289661
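
To make the intended interface concrete, here is a minimal sketch in plain Python. The commit itself lands only a stub, so the merge policy shown (keep a location only when both inputs agree, otherwise return the empty location) is an assumption for illustration; `DebugLoc` and `get_merged_location` are hypothetical stand-ins, not the LLVM types.

```python
from typing import NamedTuple, Optional

class DebugLoc(NamedTuple):
    """Simplified stand-in for a debug location (illustrative only)."""
    line: int
    column: int

def get_merged_location(a: Optional[DebugLoc],
                        b: Optional[DebugLoc]) -> Optional[DebugLoc]:
    """Assumed policy: keep a location only if both inputs agree on it."""
    if a is not None and a == b:
        return a
    return None  # the empty location
```

Returning an empty location rather than picking one side avoids attributing the merged operation to a source line that only one path actually executed.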


In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. · 8527ab0a

Nirav Dave authored Dec 14, 2016

Retrying after removing load-store factoring through
token factors in favor of improved token factor operand pruning.

Simplify Consecutive Merge Store Candidate Search

Now that address aliasing is much less conservative, push through
simplified store merging search which only checks for parallel stores
through the chain subgraph. This is cleaner, as it separates
non-interfering loads/stores from the store-merging logic.

When merging stores, search up the chain through a single load, and
find all possible stores by looking down through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).

Additional Minor Changes:

1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across
code paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seemed sufficient to not cause regressions in
tests.

This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.

Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations

Noteworthy tests:

CodeGen/AArch64/argument-blocks.ll -
It's not entirely clear what the test_varargs_stackalign test is
supposed to be asserting, but the new code looks right.

CodeGen/AArch64/arm64-memset-inline.lli -
CodeGen/AArch64/arm64-stur.ll -
CodeGen/ARM/memset-inline.ll -

The backend now generates *worse* code due to store merging
succeeding, as we do not do a 16-byte constant-zero store efficiently.

CodeGen/AArch64/merge-store.ll -
Improved, but there still seems to be an extraneous vector insert
from an element to itself?

CodeGen/PowerPC/ppc64-align-long-double.ll -
Worse code emitted in this case, due to the improved store->load
forwarding.

CodeGen/X86/dag-merge-fast-accesses.ll -
CodeGen/X86/MergeConsecutiveStores.ll -
CodeGen/X86/stores-merging.ll -
CodeGen/Mips/load-store-left-right.ll -
Restored correct merging of non-aligned stores

CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
Improved. Correctly merges buffer_store_dword calls

CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
Improved. Sidesteps loading a stored value and
merges two stores

CodeGen/X86/pr18023.ll -
This test has been removed, as it was asserting incorrect
behavior. Non-volatile stores *CAN* be moved past volatile loads,
and now are.

CodeGen/X86/vector-idiv.ll -
CodeGen/X86/vector-lzcnt-128.ll -
It's basically impossible to tell what these tests are actually
testing. But, looks like the code got better due to the memory
operations being recognized as non-aliasing.

CodeGen/X86/win32-eh.ll -
Both loads of the securitycookie are now merged.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

llvm-svn: 289659


Wdocumentation fix · facbd356
Simon Pilgrim authored Dec 14, 2016
```
llvm-svn: 289655
```

[DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of just APInt::isPowerOf2 · 05ab8ffc

Simon Pilgrim authored Dec 14, 2016

Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases.

Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value.

Differential Revision: https://reviews.llvm.org/D27714

llvm-svn: 289654
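
The arithmetic these combines rely on can be checked with a short sketch in plain Python over non-negative integers. It is not the DAGCombiner code: the function names are hypothetical, and `log_base2` only models the kind of value a log2 helper would compute.

```python
def is_power_of_two(value):
    """True for 1, 2, 4, 8, ... (exactly one bit set)."""
    return value > 0 and value & (value - 1) == 0

def log_base2(power):
    """Exponent of a power of two."""
    assert is_power_of_two(power)
    return power.bit_length() - 1

def udiv_pow2(x, power):
    """Unsigned divide by a power of two, done as a right shift."""
    return x >> log_base2(power)

def urem_pow2(x, power):
    """Unsigned remainder by a power of two, done as a mask."""
    return x & (power - 1)
```

The shift and mask forms only need to know that the divisor is some power of two, not which constant it is, which is why a "known to be a power of two" query covers more cases than a check on a literal constant.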


Fix bug 30945- [AVX512] Failure to flip vector comparison to remove not mask instruction · 1ce2a23a

Michael Zuckerman authored Dec 14, 2016

Adds a new optimization by adding a new X86ISelLowering pattern. The test case is shown in https://llvm.org/bugs/show_bug.cgi?id=30945.

Test explanation:
Select takes three arguments: mask, op and op2. In this case, the mask is the result of an ICMP. That ICMP compares (for equality) the zero-initializer vector with the result of the first ICMP.

In general, the result of "cmp eq, op1, zero initializers" is "not(op1)" where op1 is a mask. By swapping the two arguments of the Select instruction, we get the same result without needing the intermediate step ("cmp eq, op1, zero initializers").

Missed optimization opportunity: 
vpcmpled %zmm0, %zmm1, %k0
knotw %k0, %k1

can be combined into
vpcmpgtd %zmm0, %zmm2, %k1

Reviewers: 
1. delena
2. igorb 

Committed after running check-all.
Differential Revision: https://reviews.llvm.org/D27160

llvm-svn: 289653
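
The two identities this message relies on can be checked lane by lane with a small sketch in plain Python. Lists of booleans stand in for mask registers; the helper names are hypothetical and only loosely model the instructions named in their docstrings.

```python
def cmp_le(a, b):
    """Lane-wise a <= b, producing a boolean mask (models a vpcmpled-style compare)."""
    return [x <= y for x, y in zip(a, b)]

def cmp_gt(a, b):
    """Lane-wise a > b (models a vpcmpgtd-style compare)."""
    return [x > y for x, y in zip(a, b)]

def mask_not(mask):
    """Lane-wise negation of a mask (models knotw)."""
    return [not bit for bit in mask]

def select(mask, if_true, if_false):
    """Lane-wise select between two vectors."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]

A = [1, 5, 3, 7]
B = [2, 5, 1, 9]
MASK = cmp_le(A, B)
```

With these definitions, negating the `<=` mask gives the `>` mask, and selecting on a negated mask equals selecting on the original mask with the two value operands swapped. Either rewrite removes the separate mask-negation step.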
