Commits · 4333f9700dc454f5e47706ac5033232f25bb3e50 · Lorenzo Albano / LLVM bpEVL

Apr 11, 2018

Rename *CommandFlags.def to *CommandFlags.inc · 4333f970

David Blaikie authored Apr 11, 2018

These aren't the .def style files used in LLVM that require a macro
defined before their inclusion - they're just basic non-modular includes
to stamp out command line flag variables.

llvm-svn: 329840

[DSE] Regenerate tests with update_test_checks.py (NFC) · 9cfa786f

Daniel Neilson authored Apr 11, 2018

Summary:
In preparation for a future commit, this regenerates the test checks for
test/Transforms/DeadStoreElimination/OverwriteStoreBegin.ll
test/Transforms/DeadStoreElimination/OverwriteStoreEnd.ll

llvm-svn: 329839

CodeGen: Don't try to canonicalize Unix-style paths in CodeView debug info. · cb8a666f

Peter Collingbourne authored Apr 11, 2018

Most importantly, we should not replace slashes with backslashes
because that would invalidate the path.
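
A minimal sketch of the rule described above, not the actual CodeGen change: leave a path alone when it looks Unix-style, and only rewrite separators otherwise. The function names and the "starts with a forward slash" test are assumptions made for illustration.

```python
def looks_like_unix_path(path: str) -> bool:
    """Hypothetical check: treat a path starting with '/' as Unix-style."""
    return path.startswith("/")


def canonicalize_for_codeview(path: str) -> str:
    """Rewrite separators only for paths that are not Unix-style."""
    if looks_like_unix_path(path):
        # Replacing '/' with '\\' here would produce a path that no longer
        # names the file on a Unix host, so return it unchanged.
        return path
    return path.replace("/", "\\")
```

With this sketch, `/usr/src/a.c` is returned untouched, while `C:/src/a.c` has its separators converted to backslashes.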

Differential Revision: https://reviews.llvm.org/D45473

llvm-svn: 329838

[X86][Atom] Convert Atom scheduler model to SchedRW (PR32431) · 8fc2b496

Simon Pilgrim authored Apr 11, 2018

Atom is the only x86 target that still uses schedule itineraries; if we can remove this, we can begin the work on removing x86 itineraries. I've also found that it will help with PR36550.

I've focussed on matching the existing model as closely as possible (relying on the schedule tests). PR36895 indicated a lot of these were incorrect, but we can just as easily fix these after this patch as before. Hopefully we can get llvm-exegesis to help here.

There are a few instructions that rely on itinerary scheduling (mainly push/pop/return) of multiple resource stages, but I don't think any of these are show stoppers.

There are also a few codegen changes that seem related to the post-ra scheduler acting a little differently. I haven't tracked these down, but they don't seem critical.

NOTE: I don't have access to any Atom hardware, so this hasn't been tested in the wild.

Differential Revision: https://reviews.llvm.org/D45486

llvm-svn: 329837

[llvm-mca] Let the Scheduler notify dispatch stall events caused by the lack of scheduling resources. · b24953bb

Andrea Di Biagio authored Apr 11, 2018

This patch moves part of the logic that notifies dispatch stall events from the
DispatchUnit to the Scheduler.

The main goal of this patch is to remove (yet another) dependency between the
DispatchUnit and the Scheduler. Before this patch, the DispatchUnit had to know
about `Scheduler::Event` and how to classify stalls due to the lack of scheduling
resources. This patch removes that knowledge and simplifies the logic in
DispatchUnit::checkScheduler.

This is another change done in preparation for the work to fix PR36663.

No functional change intended.

llvm-svn: 329835

[X86] Generalize X86PadShortFunction to work with TargetSchedModel · 7f321d8c

Simon Pilgrim authored Apr 11, 2018

Pre-commit for D45486: don't rely on the itinerary scheduler model to determine latencies for padding; use the generic TargetSchedModel::computeInstrLatency call instead.

Also, replace the hard-coded (Atom-specific) 2*uop creation per padding cycle with a version based on the scheduler model's issue width.

Differential Revision: https://reviews.llvm.org/D45486

llvm-svn: 329834

[NVPTX] Removed 'satom' feature which is no longer used. · 2f8efcf3

Artem Belevich authored Apr 11, 2018

Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329830

[NVPTX, CUDA] Improved feature constraints on NVPTX target builtins. · 24e8a680

Artem Belevich authored Apr 11, 2018

When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature,
consider those features available if we're compiling for GPU >= sm_XX or have
enabled PTX version >= ptxYY.
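
The availability rule above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: `feature_available` and its parameters are hypothetical names, and the GPU architecture and PTX version are modeled as plain integers.

```python
def feature_available(required: str, gpu_sm: int, ptx_version: int) -> bool:
    """Return True if a builtin's required feature is satisfied.

    required    -- a feature string such as "sm_60" or "ptx60"
    gpu_sm      -- the GPU being compiled for, e.g. 70 for sm_70
    ptx_version -- the enabled PTX version, e.g. 61 for ptx61
    """
    if required.startswith("sm_"):
        # Any GPU at or above the required architecture qualifies.
        return gpu_sm >= int(required[len("sm_"):])
    if required.startswith("ptx"):
        # Any PTX version at or above the required one qualifies.
        return ptx_version >= int(required[len("ptx"):])
    raise ValueError(f"unrecognized feature string: {required!r}")
```

For example, a builtin that requires `sm_60` would be considered available when compiling for sm_70, but one that requires `sm_70` would not be available when compiling for sm_60.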

Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329829

[AMDGPU] Ensure there are enough registers for wave dispatch · fd8d4af3

Tim Renouf authored Apr 11, 2018

Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Re-landed after noticing that the buildbot failure from 329808 seemed to
be unrelated.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329826

[DSE] Regenerate tests with update_test_checks.py (NFC) · 7e2e5c3c

Daniel Neilson authored Apr 11, 2018

Summary:
In preparation for a future commit, this regenerates the test checks for
test/Transforms/DeadStoreElimination/simple.ll
test/Transforms/DeadStoreElimination/memintrinsics.ll

llvm-svn: 329824

[FastISel] Disable local value sinking by default · 08286994

Reid Kleckner authored Apr 11, 2018

This is causing compilation timeouts on code with long sequences of
local values and calls (i.e. foo(1); foo(2); foo(3); ...).  It turns out
that code coverage instrumentation is a great way to create sequences
like this, which is how our users ran into the issue in practice.

Intel has a tool that detects these kinds of non-linear compile time
issues, and Andy Kaylor reported it as PR37010.

The current sinking code scans the whole basic block once per local
value sink, which happens before emitting each call. In theory, local
values should only be introduced to be used by instructions between the
current flush point and the last flush point, so we should only need to
scan those instructions.
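
To make the cost difference concrete, here is a rough counting model. It is a sketch under simplifying assumptions, not FastISel itself: each call is assumed to contribute a fixed number of instructions, and both function names are hypothetical.

```python
def scanned_whole_block(num_calls: int, insts_per_call: int = 2) -> int:
    """Instructions visited if every flush rescans the whole block so far."""
    total = 0
    emitted = 0
    for _ in range(num_calls):
        emitted += insts_per_call
        total += emitted  # the scan covers everything emitted up to now
    return total


def scanned_since_last_flush(num_calls: int, insts_per_call: int = 2) -> int:
    """Instructions visited if each flush only scans back to the last flush."""
    return num_calls * insts_per_call
```

In this model the first function grows quadratically with the number of calls, while the second grows linearly, which matches the kind of non-linear compile time described above.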

llvm-svn: 329822

[InstCombine] limit X - cast(-Y) --> X + cast(Y) with hasOneUse() · ff98682c

Sanjay Patel authored Apr 11, 2018

llvm-svn: 329821

[DWARFv5] Fuss with asm syntax for conveying MD5 checksum. · 0195469a

Paul Robinson authored Apr 11, 2018

Previously the MD5 option of the .file directive provided the checksum
as a quoted hex string; now it's a normal hex number with 0x prefix,
same as the .octa directive accepts.
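
As an illustration of the syntax change only, the helper below converts a checksum written in the old quoted form into the new 0x-prefixed form. The function name is hypothetical and the rest of the .file directive is deliberately not modeled; the only fact relied on is that an MD5 digest is 128 bits, that is, 32 hex digits.

```python
import string


def md5_operand_from_quoted(quoted: str) -> str:
    """Turn '"<32 hex digits>"' into '0x<32 hex digits>'."""
    digits = quoted.strip('"')
    if len(digits) != 32 or any(c not in string.hexdigits for c in digits):
        raise ValueError("expected 32 hexadecimal digits for an MD5 checksum")
    return "0x" + digits
```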

Differential Revision: https://reviews.llvm.org/D45459

llvm-svn: 329820

[MIPS GlobalISel] Select add i32, i32 · 366857a2

Petar Jovanovic authored Apr 11, 2018

Add the minimal support necessary to lower a function that returns the
sum of two i32 values.
Support argument/return lowering of i32 values through registers only.
Add tablegen for regbankselect and instructionselect.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D44304

llvm-svn: 329819

[SLP] update a test case. NFC. · 5ba37955

Haicheng Wu authored Apr 11, 2018

llvm-svn: 329818

[AMDGPU] Fix lowering enqueue_kernel · 9381ae97

Yaxun Liu authored Apr 11, 2018

Two issues were fixed:

1. The runtime has difficulty allocating memory for an external symbol of a
   kernel and setting the address of the external symbol; therefore, make the
   runtime handle of an enqueued kernel an ordinary global variable. The runtime
   only needs to store the address of the loaded kernel to the handle, and this
   approach has been verified to work.

2. Handle the situation where __enqueue_kernel* gets inlined, in which case
   the enqueued kernel may be used through a constant expr instead
   of an instruction.

Differential Revision: https://reviews.llvm.org/D45187

llvm-svn: 329815

Revert "[llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS" · b15737e0

Andrea Di Biagio authored Apr 11, 2018

It caused a buildbot failure (clang-ppc64le-linux-multistage - build #6424)

llvm-svn: 329812

Revert "[AMDGPU] Ensure there are enough registers for wave dispatch" · 8ca33bfc

Tim Renouf authored Apr 11, 2018

This reverts 329808. That change caused a report of a failure in
test/CodeGen/MIR/AMDGPU/mir-canon-multi.mir that I didn't see. I suspect
it is an expensive-check-only error.

Change-Id: I8133f26f15e7d5ec2b09c687c12cd70e918461b0
llvm-svn: 329811

[AArch64][AsmParser] Split index parsing from vector list. · c88f9a1a

Sander de Smalen authored Apr 11, 2018

Summary:
Place parsing of a vector index into a separate function to reduce
duplication, since the code is duplicated in both the parsing of a
Neon vector register operand and a Neon vector list.

This is patch [2/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.

Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro

Reviewed By: rengolin

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45428

llvm-svn: 329809

[AMDGPU] Ensure there are enough registers for wave dispatch · f26b7234

Tim Renouf authored Apr 11, 2018

Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329808

[llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS. · 5782ec29

Andrea Di Biagio authored Apr 11, 2018

llvm-svn: 329807

[X86] Add variable shuffle schedule classes · 89c8a10f

Simon Pilgrim authored Apr 11, 2018

Split variable index shuffles from immediate index shuffles

WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.)
WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.)

WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.)
WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.)

Differential Revision: https://reviews.llvm.org/D45404

llvm-svn: 329806

[AArch64] Add test case for r329797 · 7bcb5720

Francis Visoiu Mistrih authored Apr 11, 2018

Forgot to add a test case in the previous commit.

llvm-svn: 329805

[X86][SSE] Tweak cmpps schedule test so that it works properly with just sse1 · 6f97328b

Simon Pilgrim authored Apr 11, 2018

The movhps/movlps tests are still broken, so we can't disable sse2 yet.

llvm-svn: 329802

[AMDGPU][MC][GFX9] Added v_screen_partition_4se_b32 · fc715551

Dmitry Preobrazhensky authored Apr 11, 2018

See bug 36845: https://bugs.llvm.org/show_bug.cgi?id=36845

Differential Revision: https://reviews.llvm.org/D45443

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329801

[AArch64] Fix regression after r329691 · 6463922e

Francis Visoiu Mistrih authored Apr 11, 2018

In r329691, we would choose FP even if the offset wouldn't fit, just
because the offset is smaller than the one from BP. This made many
accesses through FP need to scavenge a register, which resulted in
slower and bigger code for no good reason.

This patch now always picks the offset that fits first, even if FP is
preferred.
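
The selection rule can be sketched roughly as below. This is an assumption-laden model rather than the AArch64 frame lowering code: `choose_base` is a hypothetical name, `fits` stands in for whatever encodability check the backend performs, and the tie-breaking when both or neither offset fits is simplified to the stated preference.

```python
from typing import Callable


def choose_base(fp_offset: int, bp_offset: int,
                fits: Callable[[int], bool], prefer_fp: bool) -> str:
    """Pick "FP" or "BP" as the base register for a stack access."""
    fp_ok = fits(fp_offset)
    bp_ok = fits(bp_offset)
    if fp_ok != bp_ok:
        # Exactly one offset is encodable: take it, regardless of preference,
        # so that no register has to be scavenged for the access.
        return "FP" if fp_ok else "BP"
    # Both or neither fit: fall back to the preference (simplified here).
    return "FP" if prefer_fp else "BP"
```

For example, with a hypothetical encodable range of -256 to 255, an FP offset of -300 and a BP offset of 8 would select BP even when FP is preferred.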

llvm-svn: 329797

[llvm-mca] Minor code cleanup. NFC · 074ff7c5

Andrea Di Biagio authored Apr 11, 2018

llvm-svn: 329796

[llvm-mca] Renamed BackendStatistics to RetireControlUnitStatistics. · f41ad5c5

Andrea Di Biagio authored Apr 11, 2018

Also, removed flag -verbose in favor of flag -retire-stats.

llvm-svn: 329794

[llvm-mca] Move the logic that prints scheduler statistics from BackendStatistics to its own view. · 1cc29c04

Andrea Di Biagio authored Apr 11, 2018

Added flag -scheduler-stats to print scheduler related statistics.

llvm-svn: 329792

Eliminate a bitwise 'not' op of 'not' min/max by inverting the min/max. · d928201a

Artur Gainullin authored Apr 11, 2018

Bitwise 'not' of the min/max could be eliminated in the pattern:

%notx = xor i32 %x, -1
%cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y
%smax = select i1 %cmp1, i32 %notx, i32 %y
%res = xor i32 %smax, -1

https://rise4fun.com/Alive/lCN
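
The identity behind this fold, not(max(not(x), y)) == min(x, not(y)) and its min counterpart, can be checked exhaustively on a small bit width. The following is an independent sanity check written for illustration, not part of the patch; the helper names are hypothetical and 8-bit integers stand in for i32.

```python
def bit_not(value: int, bits: int, signed: bool) -> int:
    """Bitwise 'not' (xor with -1) on a fixed-width integer."""
    if signed:
        return ~value  # two's complement: ~v == -v - 1, stays in range
    return (1 << bits) - 1 - value


def fold_holds(bits: int = 8, signed: bool = True) -> bool:
    """Check the max and min forms of the fold for every pair of inputs."""
    if signed:
        values = range(-(1 << (bits - 1)), 1 << (bits - 1))
    else:
        values = range(1 << bits)
    for x in values:
        for y in values:
            not_x = bit_not(x, bits, signed)
            not_y = bit_not(y, bits, signed)
            # not(max(not x, y)) should equal min(x, not y)
            if bit_not(max(not_x, y), bits, signed) != min(x, not_y):
                return False
            # not(min(not x, y)) should equal max(x, not y)
            if bit_not(min(not_x, y), bits, signed) != max(x, not_y):
                return False
    return True
```

The check succeeds because bitwise 'not' reverses the ordering of values in both the signed and unsigned interpretations, so it turns a max into a min and vice versa.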

Reviewers: spatel

Reviewed by: spatel

Subscribers: a.elovikov, llvm-commits

Differential Revision: https://reviews.llvm.org/D45317

llvm-svn: 329791

[ARM] FP16 VSEL codegen · ac96d7c4

Sjoerd Meijer authored Apr 11, 2018

This is a follow-up to rL327695 to instruction-select more variants of VSELGT
and VSELGE, for which it is necessary to custom lower SELECT.

More work is required in this area, which will be addressed soon:
- more variants need to be regression tested, but this depends on the next point.
- first, LowerConstantFP needs to be adjusted for fp16 values.

Differential Revision: https://reviews.llvm.org/D45205

llvm-svn: 329788

[Build][NFC] Split off libpfm detection to a separate module. · 33922a51

Clement Courbet authored Apr 11, 2018

llvm-svn: 329783

[AArch64][AsmParser] Unify code for parsing Neon/SVE vectors. · 73937b7c

Sander de Smalen authored Apr 11, 2018

Summary:
Merged 'tryMatchVectorRegister' (specific to Neon) and
'tryParseSVERegister' into a single 'tryParseVectorRegister' function, and
created a generic 'parseVectorKind()' function that returns the #Elements
and ElementWidth of a vector suffix. This reduces the duplication of
this functionality between the two vector implementations.
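
As a rough picture of what a shared suffix parser might return, the sketch below maps a vector suffix to an element count and element width. It is not the AArch64 AsmParser code: the Python name `parse_vector_kind`, the use of 0 for a suffix without an element count (as in SVE's `.s`), and the lack of validation of count/width combinations are all assumptions.

```python
# Element widths in bits for the standard size letters.
ELEMENT_WIDTHS = {"b": 8, "h": 16, "s": 32, "d": 64, "q": 128}


def parse_vector_kind(suffix: str):
    """Return (num_elements, element_width) for a suffix, or None if invalid."""
    if not suffix.startswith("."):
        return None
    body = suffix[1:].lower()
    if not body:
        return None
    count_text, kind = body[:-1], body[-1]
    if kind not in ELEMENT_WIDTHS:
        return None
    width = ELEMENT_WIDTHS[kind]
    if count_text == "":
        # No explicit element count, e.g. an SVE-style suffix such as ".s".
        return (0, width)
    if not (count_text.isascii() and count_text.isdigit()):
        return None
    return (int(count_text), width)
```

A Neon-style suffix such as `.4s` yields four 32-bit elements here, while `.d` yields a 64-bit element width with no fixed count.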

This is patch [1/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.

Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro

Reviewed By: fhahn

Subscribers: tschuett, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D45427

llvm-svn: 329782

[llvm-exegesis] Add a flag to disable libpfm even if present. · 23db1744

Clement Courbet authored Apr 11, 2018

Summary: Fixes PR37053.

Reviewers: uabelho, gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D45436

llvm-svn: 329781

[CMake][runtimes] Process common options in runtimes build · 9b4035a8

Petr Hosek authored Apr 11, 2018

This was removed in D39932, but it turned out to be needed
because runtimes such as compiler-rt and libc++ rely on common options
processing for setting certain flags such as -ffunction-sections and
-fdata-sections.

Differential Revision: https://reviews.llvm.org/D45507

llvm-svn: 329778

[X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace 512-bit masked intrinsic with unmasked intrinsic and a select. · 9507fa35

Craig Topper authored Apr 11, 2018

The 128/256-bit versions were no longer used by clang, which uses the legacy SSE/AVX2 versions and a select instead. The 512-bit version was changed to do the same for consistency.
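
The equivalence between a masked operation and an unmasked operation followed by a select can be modeled on plain lists. This is a scalar sketch for illustration only: the function names are hypothetical, lanes are Python integers, and the mask is an integer whose bit i controls result lane i.

```python
def wrap_i32(value: int) -> int:
    """Wrap an integer to a signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def pmaddwd(a, b):
    """Multiply signed 16-bit lanes pairwise and add adjacent products."""
    if len(a) != len(b) or len(a) % 2 != 0:
        raise ValueError("inputs must have the same even length")
    return [wrap_i32(a[i] * b[i] + a[i + 1] * b[i + 1])
            for i in range(0, len(a), 2)]


def select(mask: int, on_true, on_false):
    """Per-lane select: take on_true[i] where bit i of mask is set."""
    return [t if (mask >> i) & 1 else f
            for i, (t, f) in enumerate(zip(on_true, on_false))]


def masked_pmaddwd(a, b, passthru, mask: int):
    """Masked form expressed as the unmasked operation plus a select."""
    return select(mask, pmaddwd(a, b), passthru)
```

For instance, with inputs `[1, 2, 3, 4]` and `[5, 6, 7, 8]`, a passthru of `[-1, -2]`, and a mask of `0b10`, only the second result lane comes from the multiply-add.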

llvm-svn: 329774

[X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need to emit an explicit MOV8mr instruction. · ee2c1dea

Craig Topper authored Apr 11, 2018

Previously the code only knew how to handle setcc to a register.

This should fix a crash in the chromium build.

llvm-svn: 329771

[X86] Switch a test from grep to FileCheck. NFC · 72fa9f12

Craig Topper authored Apr 11, 2018

llvm-svn: 329769

Simplification of libcall like printf->puts must check for RtLibUseGOT metadata. · 182f2df7

Sriraman Tallam authored Apr 10, 2018

With -fno-plt, for example, calls to printf that get converted to puts
still use the PLT. This patch checks for the metadata "RtLibUseGOT" and
annotates the declaration with the right attributes.

Differential Revision: https://reviews.llvm.org/D45180

llvm-svn: 329768

Use contains_lower() instead of find_lower() != StringRef::npos. NFC. · eb820c3a

Rui Ueyama authored Apr 10, 2018

llvm-svn: 329767