Commits · 2ef18fb41aeb0e71744fd3e97cf25da49bdf6b64 · Roger Ferrer / llvm-epi

Oct 02, 2019

Reland "[utils] Implement the llvm-locstats tool" · 2ef18fb4

Djordje Todorovic authored Oct 02, 2019

The tool reports verbose output for the DWARF debug location coverage.
For each variable or formal parameter DIE, llvm-locstats computes the
percentage of the code section bytes in which the variable is in scope
that are covered by a location description. Line 0 shows the number (and
the percentage) of DIEs with no location information, while line 100 shows
the number (and the percentage) of DIEs that have location information for
all code section bytes where the variable or parameter is in scope. Line
50..59 shows the number (and the percentage) of DIEs whose location
information covers between 50 and 59 percent of their scope.
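
To illustrate the bucketing described above, here is a minimal Python sketch. It is not the tool's actual implementation. The names `coverage_bucket` and `summarize` and the `1..9` label for the lowest partial bucket are assumptions made for this example; the commit message itself only names the `0`, `50..59`, and `100` lines.

```python
from collections import Counter


def coverage_bucket(covered_bytes, scope_bytes):
    """Return the report line label for one variable or parameter DIE.

    covered_bytes: in-scope code bytes that have a location description.
    scope_bytes:   all code bytes where the variable is in scope.
    """
    if scope_bytes <= 0:
        raise ValueError("scope_bytes must be positive")
    if covered_bytes <= 0:
        return "0"      # no location information at all
    if covered_bytes >= scope_bytes:
        return "100"    # covered everywhere it is in scope
    # Partial coverage: integer percentage clamped to 1..99.
    percent = max(1, 100 * covered_bytes // scope_bytes)
    low = (percent // 10) * 10
    if low == 0:
        return "1..9"   # assumed label for the lowest partial bucket
    return f"{low}..{low + 9}"


def summarize(dies):
    """Map each bucket label to (number of DIEs, percentage of all DIEs)."""
    counts = Counter(coverage_bucket(c, s) for c, s in dies)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total)) for label, n in counts.items()}
```

For example, a DIE with 22 covered bytes out of 40 in-scope bytes has 55 percent coverage and is counted on the `50..59` line.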

Differential Revision: https://reviews.llvm.org/D66526

The cause of the test failure was resolved.

llvm-svn: 373427

[llvm-lib] Detect duplicate input files · 60e9df33
Rui Ueyama authored Oct 02, 2019
```
Differential Revision: https://reviews.llvm.org/D68320

llvm-svn: 373426
```

[llvm-lib] Correctly handle .lib input files · 64a362e7

Rui Ueyama authored Oct 02, 2019

If archive files are passed as input files, llvm-lib needs to append
the members of the input archive files to the output file. This patch
implements that behavior.

This patch splits an existing function into smaller functions.
Effectively, the only new code is the
`if (Magic == file_magic::archive) { ... }` part.
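
The behavior described above can be sketched as follows. This is a toy Python model, not llvm-lib's code: `collect_members` and the `read_members` callback are hypothetical names, and parsing the archive is left to the caller. The only format detail used is the standard `!<arch>\n` magic that begins an archive file.

```python
AR_MAGIC = b"!<arch>\n"  # global header at the start of every ar archive


def is_archive(data):
    """Return True if the buffer starts with the ar archive magic."""
    return data.startswith(AR_MAGIC)


def collect_members(inputs, read_members):
    """Build the member list for the output library.

    inputs:       list of (name, data) pairs given on the command line.
    read_members: hypothetical callback that parses an archive buffer
                  into a list of (name, data) pairs.
    """
    members = []
    for name, data in inputs:
        if is_archive(data):
            # An input archive contributes its members, not itself.
            members.extend(read_members(data))
        else:
            members.append((name, data))
    return members
```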

Fixes https://bugs.llvm.org/show_bug.cgi?id=32674

Differential Revision: https://reviews.llvm.org/D68204

llvm-svn: 373424

[X86] Add broadcast load folding patterns to the NoVLX compare patterns. · 8d6a863b

Craig Topper authored Oct 02, 2019

These patterns use zmm registers for 128/256-bit compares when
the VLX instructions aren't available. Previously we only
supported register operands, but as PR36191 notes, we can fold
broadcast loads, though not regular loads.

llvm-svn: 373423

Fix GCC -Wreturn-type warnings. NFC. · c3aab6ea
Michael Liao authored Oct 02, 2019
```
llvm-svn: 373422
```
DebugInfo: Update support for detecting C++ language variants in debug info emission · bfc68885
David Blaikie authored Oct 02, 2019
```
llvm-svn: 373420
```
gn build: (manually) merge r373407 · 9e763e1b
Nico Weber authored Oct 02, 2019
```
llvm-svn: 373419
```
AMDGPU/GlobalISel: Use getIntrinsicID helper · 86f864da
Matt Arsenault authored Oct 02, 2019
```
llvm-svn: 373417
```

AMDGPU/GlobalISel: Assume VGPR for G_FRAME_INDEX · cdfe5efe

Matt Arsenault authored Oct 02, 2019

In principle this should behave as any other constant. However
eliminateFrameIndex currently assumes a VALU use and uses a vector
shift. Work around this by selecting to VGPR for now until
eliminateFrameIndex is fixed.

llvm-svn: 373415

AMDGPU/GlobalISel: Private loads always use VGPRs · bfce0c26
Matt Arsenault authored Oct 02, 2019
```
llvm-svn: 373414
```
AMDGPU/GlobalISel: Legalize 1024-bit G_BUILD_VECTOR · 05aa8a73
Matt Arsenault authored Oct 02, 2019
```
This will be needed to support AGPR operations.

llvm-svn: 373413
```
AMDGPU/GlobalISel: Fix RegBankSelect for 1024-bit values · 3a657afb
Matt Arsenault authored Oct 02, 2019
```
llvm-svn: 373412
```

[AMDGPU] separate accounting for agprs · 075bc48a

Stanislav Mekhanoshin authored Oct 02, 2019

Account and report agprs separately on gfx908. Other targets
do not change the reporting.

Differential Revision: https://reviews.llvm.org/D68307

llvm-svn: 373411

[X86] Add a DAG combine to shrink vXi64 gather/scatter indices that are constant with sufficient sign bits to fit in vXi32 · 8c19925f

Craig Topper authored Oct 01, 2019

The gather/scatter instructions can implicitly sign extend the indices. If we're operating on 32-bit data, a v16i64 index can force a v16i32 gather to be split in two since the index needs 2 registers. If we can shrink the index to i32 we can avoid the split. It should always be safe to shrink the index regardless of the number of elements. We have gather/scatter instructions that can use a v2i32 index stored in a v4i32 register with v2i64 data size.

I've limited this to before legalize types to avoid creating a v2i32 after type legalization. We could check for it, but we'd also need testing. I'm also only handling build_vectors with no bitcasts to be sure the truncate will constant fold.
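
The safety condition can be illustrated with a small Python sketch, under the assumption that indices are modeled as plain signed integers. A 64-bit constant with at least 33 sign bits lies in the signed 32-bit range, so truncating it and letting the instruction sign extend it again gives back the same value. The helper names are hypothetical and are not LLVM APIs.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def truncate_to_i32(value):
    """Keep the low 32 bits of a signed index (as an unsigned bit pattern)."""
    return value & 0xFFFFFFFF


def sign_extend_i32(bits):
    """Interpret a 32-bit pattern as a signed value, as the hardware would."""
    return bits - (1 << 32) if bits & (1 << 31) else bits


def try_shrink_indices(indices):
    """Return the 32-bit patterns if every constant index fits, else None."""
    if all(INT32_MIN <= v <= INT32_MAX for v in indices):
        return [truncate_to_i32(v) for v in indices]
    return None  # at least one index needs more than 32 bits
```

When the shrink succeeds, applying `sign_extend_i32` to each result recovers the original index, which is why the narrower index vector is equivalent.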

Differential Revision: https://reviews.llvm.org/D68247

llvm-svn: 373408

AMDGPU: Fix an out of date assert in addressing FrameIndex · e4ee28d1
Changpeng Fang authored Oct 01, 2019
```
Reviewers:
  arsenm

Differential Revision:
  https://reviews.llvm.org/D67574

llvm-svn: 373404
```

Revert r373172 "[X86] Add custom isel logic to match VPTERNLOG from 2 logic ops." · 0da163a2

Craig Topper authored Oct 01, 2019

This seems to be causing some performance regressions that I'm
trying to investigate.

One thing that stands out is that this transform can increase
the live range of the operands of the earlier logic op. This
can be bad for register allocation. If there are two logic
op inputs we should really combine the one that is closest, but
SelectionDAG doesn't have a good way to do that. Maybe we need
to do this as a basic block transform in Machine IR.

llvm-svn: 373401

Oct 01, 2019

[X86] convertToThreeAddress, make sure second operand of SUB32ri is really an immediate before calling getImm(). · 91287057

Craig Topper authored Oct 01, 2019

It might be a symbol instead. We can't fold those since we can't
negate them.

Similar for other SUB with immediates.
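
A toy Python model of the guard, assuming the three-address form is an `lea`-style add of the negated immediate. The tuple encoding, the `Symbol` class, and `sub_to_three_address` are all hypothetical and only mirror the reasoning in the message: a symbol cannot be negated, so the conversion must be skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """Stand-in for a symbolic operand whose value is unknown until link time."""
    name: str


def sub_to_three_address(dest, src, operand):
    """Rewrite `dest = src - operand` as an add of the negated immediate.

    Returns None when the operand is not a plain immediate, because a
    symbol cannot be negated here.
    """
    if not isinstance(operand, int):
        return None  # leave the original two-address SUB alone
    return ("lea", dest, src, -operand)
```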

Fixes PR43529.

llvm-svn: 373397

[FileCheck] Move private interface to its own header · ed117868

Thomas Preud'homme authored Oct 01, 2019

Summary:
Most of the class definitions in llvm/include/llvm/Support/FileCheck.h
are actually implementation details that should not be relied upon. This
commit moves all of them into a new header file under
llvm/lib/Support/FileCheck. It also takes advantage of the code movement
to put the code into a new llvm::filecheck namespace.

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67649

llvm-svn: 373395

[BypassSlowDivision][CodeGenPrepare] avoid crashing on unused code (PR43514) · 9738fd63
Sanjay Patel authored Oct 01, 2019
```
https://bugs.llvm.org/show_bug.cgi?id=43514

llvm-svn: 373394
```
gn build: Merge r373392 · 081e9df1
LLVM GN Syncbot authored Oct 01, 2019
```
llvm-svn: 373393
```

[ASan][NFC] Address remaining comments for https://reviews.llvm.org/D68287 · 8830975c

Leonard Chan authored Oct 01, 2019

I submitted that patch after I got the LGTM, but the comments didn't
appear until after I submitted the change. This adds `const` to the
constructor argument and makes it a pointer.

llvm-svn: 373391

[ASan] Make GlobalsMD member a const reference. · 63663616

Leonard Chan authored Oct 01, 2019

PR42924 points out that copying the GlobalsMetadata type during
construction of AddressSanitizer can result in extremely long
build times for translation units that have many globals. This can be addressed
by just making the GlobalsMD member in AddressSanitizer a reference to
avoid the copy. The GlobalsMetadata type is already passed to the
constructor as a reference anyway.

Differential Revision: https://reviews.llvm.org/D68287

llvm-svn: 373389

[DDG] Data Dependence Graph - Root Node · 91b62d5c

Bardia Mahjour authored Oct 01, 2019

Summary:
This patch adds a Root Node to the DDG. The purpose of the root node is to create a single entry node that allows graph walk iterators to iterate through all nodes of the graph, making sure that no node is left unvisited during a graph walk (e.g. SCC or DFS). Once the DDG is fully constructed it will have exactly one root node. Every node in the graph is reachable from the root. The algorithm for connecting the root node is based on a depth-first search that keeps track of visited nodes to try to avoid creating unnecessary edges.
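
One way to picture the root-connection step is the Python sketch below. It is an illustration of the idea over a plain adjacency-list dictionary, not the DDG builder's code; `connect_root` and the `"root"` key are hypothetical names.

```python
def connect_root(graph, root="root"):
    """Return a copy of `graph` with a root node that reaches every node.

    graph: dict mapping each node to a list of its successors.
    A root edge is added only to nodes not yet reached from an earlier
    root edge, which keeps the number of new edges low.
    """
    visited = set()
    root_edges = []
    for node in graph:
        if node in visited:
            continue  # already reachable from the root
        root_edges.append(node)
        # Iterative depth-first search marking everything reachable.
        stack = [node]
        while stack:
            current = stack.pop()
            if current in visited:
                continue
            visited.add(current)
            stack.extend(graph.get(current, ()))
    connected = dict(graph)
    connected[root] = root_edges
    return connected
```

With `{"a": ["b"], "b": ["a"], "c": [], "d": ["c"]}` the root gets edges to `a`, `c`, and `d`. The edge to `c` is redundant because `d` also reaches it, which shows why tracking visited nodes only tries to avoid unnecessary edges and does not guarantee a minimal set.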

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D67970

llvm-svn: 373386

[MemorySSA] Check for unreachable blocks when getting last definition. · 890090f7

Alina Sbirlea authored Oct 01, 2019

If a single predecessor is found, still check if the block is
unreachable. The test that found this had a self loop unreachable block.
Resolves PR43493.

llvm-svn: 373383

Add a missing pass in ARM O3 pipeline · 7ed4fb38
Jakub Kuderski authored Oct 01, 2019
```
llvm-svn: 373382
```

[MemorySSA] Update last_access_in_block check. · ae40dfc1

Alina Sbirlea authored Oct 01, 2019

The check for "was there an access in this block" should be: is the last
access in this block and is it not a newly inserted phi.
Resolves new test in PR43438.

Also fix a typo when simplifying trivial Phis to match the comment.

llvm-svn: 373380

[Dominators][CodeGen] Don't mark MachineDominatorTree as preserved in MachineLICM · 856c1cd8
Jakub Kuderski authored Oct 01, 2019
```
llvm-svn: 373378
```

[Dominators][CodeGen] Fix MachineDominatorTree preservation in PHIElimination · 5be08ee9

Jakub Kuderski authored Oct 01, 2019

Summary:
PHIElimination modifies the CFG and marks MachineDominatorTree as preserved. Therefore, if the CFG changes it should also update the MDT, when available. This patch teaches PHIElimination to recalculate the MDT when necessary.

This fixes the `tailmerging_in_mbp.ll` test failure discovered after switching to generic DomTree verification algorithm in MachineDominators in D67976.

Reviewers: arsenm, hliao, alex-t, rampitec, vpykhtin, grosser

Reviewed By: rampitec

Subscribers: MatzeB, wdng, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68154

llvm-svn: 373377

Reapply [Dominators][CodeGen] Clean up MachineDominators · 925c285f

Jakub Kuderski authored Oct 01, 2019

This reverts r373117 (git commit 159ef377)

Phabricator review: https://reviews.llvm.org/D67976.

llvm-svn: 373376

[PGO] Fix typos from r359612. NFC. · e0fa2689
Rong Xu authored Oct 01, 2019
```
llvm-svn: 373369
```
[ARM] Some MVE shuffle plus extend tests. NFC · a3ebcfe5
David Green authored Oct 01, 2019
```
llvm-svn: 373368
```

AMDGPU/SILoadStoreOptimizer: Add helper functions for working with CombineInfo · 004c7915

Tom Stellard authored Oct 01, 2019

Summary:
This is a refactoring that will make future improvements to this pass easier.
This change should not change the behavior of the pass.

Reviewers: arsenm, pendingchaos, rampitec, nhaehnle, vpykhtin

Reviewed By: nhaehnle, vpykhtin

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65496

llvm-svn: 373366

[InstCombine] Deal with -(trunc(X >>u 63)) -> trunc(X >>s 63) · 053014f8

Roman Lebedev authored Oct 01, 2019

Identical to its trunc-less variant; just pretend to hoist the
trunc, and everything else still holds:
https://rise4fun.com/Alive/JRU
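
The equivalence can also be spot-checked with fixed-width arithmetic in Python. The sketch assumes a 64-bit `X` truncated to 32 bits; the destination width is an assumption for this example, since the title only fixes the shift amount. The helper names are illustrative.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def lshr64(x, n):
    """Logical (unsigned) right shift of a 64-bit value."""
    return (x & MASK64) >> n


def ashr64(x, n):
    """Arithmetic (signed) right shift of a 64-bit value."""
    x &= MASK64
    if x >> 63:
        x -= 1 << 64  # reinterpret as a negative number
    return (x >> n) & MASK64


def before(x):
    """-(trunc(X >>u 63)), computed in 32 bits."""
    return (-(lshr64(x, 63) & MASK32)) & MASK32


def after(x):
    """trunc(X >>s 63), computed in 32 bits."""
    return ashr64(x, 63) & MASK32
```

Both forms yield 0 when the sign bit of `X` is clear and an all-ones 32-bit value when it is set.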

llvm-svn: 373364

[InstCombine] Preserve 'exact' in -(X >>u 31) -> (X >>s 31) fold · 65144149
Roman Lebedev authored Oct 01, 2019
```
https://rise4fun.com/Alive/yR4

llvm-svn: 373363
```
[NFC][InstCombine] (Better) tests for sign-bit-smearing pattern · f273fc79
Roman Lebedev authored Oct 01, 2019
```
https://rise4fun.com/Alive/JRU
https://rise4fun.com/Alive/yR4 <- we can preserve 'exact'

llvm-svn: 373362
```
[llvm-mca] Add a -mattr flag · 92929831

David Green authored Oct 01, 2019

This adds a -mattr flag to llvm-mca, for cases where the -mcpu option does not
contain all optional features.

Differential Revision: https://reviews.llvm.org/D68190

llvm-svn: 373358

[ReleaseProcess] Document requirement to set MACOSX_DEPLOYMENT_TARGET · a1e7efaa
Vedant Kumar authored Oct 01, 2019
```
llvm-svn: 373356
```
[IndVars] An implementation of loop predication without a need for speculation · 0200626f

Philip Reames authored Oct 01, 2019

This patch implements a variation of a well-known technique for JIT compilers - we have an implementation in tree as LoopPredication - but with an interesting twist. This version does not assume the ability to execute a path which wasn't taken in the original program (such as a guard or widenable.condition intrinsic). The benefit is that this works for arbitrary IR from any frontend (including C/C++/Fortran). The tradeoff is that it's restricted to read only loops without implicit exits.

This builds on SCEV, and can thus eliminate the loop varying portion of any early exit where all exits are understandable by SCEV. A key advantage is that fixing deficiencies exposed in SCEV - already found one while writing test cases - will also benefit all of full redundancy elimination (and most other loop transforms).
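
As a rough source-level picture of the transform, consider the Python sketch below. It is a hand-written analogy under simplifying assumptions, not what the pass emits: the function names are hypothetical, the loop only reads memory, and the early exit returns a value that does not depend on loop state.

```python
def sum_checked(values, length, n):
    """Original loop: the early-exit test depends on the loop counter."""
    total = 0
    for i in range(n):
        if i >= length:  # loop-varying exit condition
            return None
        total += values[i]
    return total


def sum_predicated(values, length, n):
    """Predicated loop: the exit test is computed once, before the loop."""
    in_bounds = n <= length  # loop-invariant form of the same check
    total = 0
    for i in range(n):
        if not in_bounds:  # no longer varies across iterations
            return None
        total += values[i]
    return total
```

The two functions agree because the loop has no side effects: if the exit would be taken on some iteration, taking it on the first iteration instead is not observable. No path is executed that the original program could not have taken, which corresponds to the "no speculation" property described above.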

I haven't seen anything in the literature which quite matches this. Given that, I'm not entirely sure that keeping the name "loop predication" is helpful. Anyone have suggestions for a better name? This is analogous to partial redundancy elimination - since we remove the condition flowing around the backedge - and has some parallels to our existing transforms which try to make conditions invariant in loops.

Factoring wise, I chose to put this in IndVarSimplify since it's generally applicable to all workloads. I could split this off into its own pass, but we'd then probably want to add that new pass every place we use IndVars. One solid argument for splitting it off into its own pass is that this transform is "too good". It breaks a huge number of existing IndVars test cases as they tend to be simple read only loops. At the moment, I've left it off by default, but if we add this to IndVars and enable it, we'll have to update around 20 test files to add side effects or disable this transform.

The near term plan is to fuzz this extensively while off by default, reflect on and discuss the factoring issue mentioned just above, and then enable by default. I also need to give some thought to supporting widenable conditions in this framing.

Differential Revision: https://reviews.llvm.org/D67408

llvm-svn: 373351

AMDGPU/GlobalISel: Increase max legal size to 1024 · 9dba6037

Matt Arsenault authored Oct 01, 2019

There are 1024 bit register classes defined for AGPRs. Additionally
OpenCL defines vectors up to 16 x i64, and this helps those tests
legalize.

llvm-svn: 373350

[X86] Add a VBROADCAST_LOAD ISD opcode representing a scalar load broadcasted to a vector. · 105e82ed

Craig Topper authored Oct 01, 2019

Summary:
This adds the ISD opcode and a DAG combine to create it. There are
probably some places where we can directly create it, but I'll
leave that for future work.

This updates all of the isel patterns to look for this new node.
I had to add a few additional isel patterns for aligned extloads
which we should probably fix with a DAG combine or something. This
does mean that the broadcast load folding for avx512 can no
longer match a broadcasted aligned extload.

There's still some work to do here for combining a broadcast of
a broadcast_load. We also need to improve extractelement or
demanded vector elements of a broadcast_load. I'll try to get
those done before I submit this patch.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68198

llvm-svn: 373349
