Commits · d0d38df0914fad4d128b178dd26e11bb035e83ae · Lorenzo Albano / LLVM bpEVL

Mar 02, 2020

[LoopVectorizer] Change types of lists from pointers to references. NFC · d0d38df0

David Green authored Mar 02, 2020

getReductionVars, getInductionVars and getFirstOrderRecurrences were all
being returned from LoopVectorizationLegality as pointers to lists. This
just changes them to be references, cleaning up the interface slightly.
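
As a rough illustration of this kind of interface change, the sketch below contrasts a pointer-returning accessor with a reference-returning one. The `Legality` class, `getReductionVarsPtr` and the `std::vector<int>` payload are hypothetical stand-ins, not the actual LoopVectorizationLegality declarations.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a legality class that owns a list.
class Legality {
  std::vector<int> Reductions{1, 2, 3};

public:
  // Before: callers receive a pointer and must dereference it.
  std::vector<int> *getReductionVarsPtr() { return &Reductions; }
  // After: callers receive a reference, which can never be null.
  std::vector<int> &getReductionVars() { return Reductions; }
};

// Old-style call site: explicit dereference of the returned pointer.
inline std::size_t countViaPointer(Legality &L) {
  return L.getReductionVarsPtr()->size();
}

// New-style call site: the list is used directly.
inline std::size_t countViaReference(Legality &L) {
  return L.getReductionVars().size();
}
```

Both accessors expose the same underlying list; only the call-site syntax and the non-null guarantee differ.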

Differential Revision: https://reviews.llvm.org/D75448

d0d38df0

[ARM] Add Cortex-M55 Support for clang and llvm · 7d594cf0

Luke Geeson authored Feb 14, 2020

This patch upstreams support for the Arm Armv8.1-M CPU Cortex-M55.

In detail, it adds support for:

 - mcpu option in clang
 - Arm Target Features in clang
 - llvm Arm TargetParser definitions

Details of the CPU can be found here:
https://developer.arm.com/ip-products/processors/cortex-m/cortex-m55

Reviewers: chill

Reviewed By: chill

Subscribers: dmgreen, kristof.beyls, hiraditya, cfe-commits,
llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74966

7d594cf0

Fix shadow variable warning. NFC. · d20fb7ea
Simon Pilgrim authored Mar 02, 2020

d20fb7ea
[CostModel][X86] Add vXi1 extract/insert cost tests · 174cb7c6
Simon Pilgrim authored Mar 02, 2020

174cb7c6

Reland "[DebugInfo][clang][DWARF5]: Added support for debuginfo generation for defaulted parameters · 7a42babe

Awanish Pandey authored Mar 02, 2020

in C++ templates."

This was reverted in 802b22b5 due to
missing .bc file and a chromium bot failure.
https://bugs.chromium.org/p/chromium/issues/detail?id=1057559#c1
This revision addresses both of them.

Summary:
This patch adds support for debuginfo generation for defaulted
parameters in clang and also extends corresponding DebugMetadata/IR to support this feature.

Reviewers: probinson, aprantl, dblaikie

Reviewed By: aprantl, dblaikie

Differential Revision: https://reviews.llvm.org/D73462

7a42babe

Fix operator precedence warning. NFCI. · e4380b07
Simon Pilgrim authored Mar 02, 2020

e4380b07

[AArch64][SVE] Add intrinsics for non-temporal gather-loads/scatter-stores · 9249f606

Andrzej Warzynski authored Feb 19, 2020

Summary:
This patch adds the following LLVM IR intrinsics for SVE:
1. non-temporal gather loads
  * @llvm.aarch64.sve.ldnt1.gather
  * @llvm.aarch64.sve.ldnt1.gather.uxtw
  * @llvm.aarch64.sve.ldnt1.gather.scalar.offset
2. non-temporal scatter stores
  * @llvm.aarch64.sve.stnt1.scatter
  * @llvm.aarch64.sve.stnt1.scatter.uxtw
  * @llvm.aarch64.sve.stnt1.scatter.scalar.offset
These intrinsics are mapped to the corresponding SVE instructions
(example for half-words, zero-extending):
  * ldnt1h { z0.s }, p0/z, [z0.s, x0]
  * stnt1h { z0.s }, p0, [z0.s, x0]

Note that for non-temporal gathers/scatters, the SVE spec defines only
one instruction type: "vector + scalar". For this reason, we swap the
arguments when processing intrinsics that implement the "scalar +
vector" addressing mode:
  * @llvm.aarch64.sve.ldnt1.gather
  * @llvm.aarch64.sve.ldnt1.gather.uxtw
  * @llvm.aarch64.sve.stnt1.scatter
  * @llvm.aarch64.sve.stnt1.scatter.uxtw
In other words, all intrinsics for gather-loads and scatter-stores
implemented in this patch are mapped to the same load and store
instruction, respectively.

The sve2_mem_gldnt_vs multiclass (and its counterpart for scatter
stores) from SVEInstrFormats.td was split into:
  * sve2_mem_gldnt_vec_vs_32_ptrs (32-bit wide base addresses)
  * sve2_mem_gldnt_vec_vs_64_ptrs (64-bit wide base addresses)
This is consistent with what we did for
@llvm.aarch64.sve.ld1.scalar.offset and highlights the actual split in
the spec and the implementation.

Reviewed by: sdesmalen

Differential Revision: https://reviews.llvm.org/D74858

9249f606

[ARM,MVE] Add ACLE intrinsics for VCVT[ANPM] family. · 1a8cbfa5

Simon Tatham authored Mar 02, 2020

Summary:
These instructions convert a vector of floats to a vector of integers
of the same size, with assorted non-default rounding modes.
Implemented in IR as target-specific intrinsics, because as far as I
can see there are no matches for that functionality in the standard IR
intrinsics list.
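
A scalar sketch of the four rounding behaviours may help here. It assumes the usual Arm meaning of the suffixes (A: to nearest with ties away from zero, N: to nearest with ties to even, P: toward plus infinity, M: toward minus infinity) and models a single lane with standard library calls; the function names are hypothetical and saturation of out-of-range inputs is not modelled.

```cpp
#include <cmath>
#include <cstdint>

// Scalar model of one lane; assumes the input is finite and fits in int32_t.

// VCVTA: round to nearest, ties away from zero.
inline int32_t cvtA(float X) { return static_cast<int32_t>(std::round(X)); }

// VCVTN: round to nearest, ties to even.
// Assumes the default floating-point rounding mode is in effect.
inline int32_t cvtN(float X) { return static_cast<int32_t>(std::nearbyint(X)); }

// VCVTP: round toward plus infinity.
inline int32_t cvtP(float X) { return static_cast<int32_t>(std::ceil(X)); }

// VCVTM: round toward minus infinity.
inline int32_t cvtM(float X) { return static_cast<int32_t>(std::floor(X)); }
```

For an input of 2.5 the four helpers give 3, 2, 3 and 2 respectively, which is why none of them matches a plain truncating float-to-int conversion.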

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75255

1a8cbfa5

[ARM,MVE] Add ACLE intrinsics for VCVT.F32.F16 family. · b08d2ddd

Simon Tatham authored Mar 02, 2020

Summary:
These instructions make a vector of `<4 x float>` by widening every
other lane of a vector of `<8 x half>`.

I wondered about representing these using standard IR, along the lines
of a shufflevector to extract elements of the input into a `<4 x half>`
followed by an `fpext` to turn that into `<4 x float>`. But it looks as
if that would take a lot of work in isel lowering to make it match any
pattern I could sensibly write in Tablegen, and also I haven't been
able to think of any other case where that pattern might be generated
in IR, so there wouldn't be any extra code generation win from doing
it that way.

Therefore, I've just used another target-specific intrinsic. We can
always change it to the other way later if anyone thinks of a good
reason.

(In order to put the intrinsic definition near similar things in
`IntrinsicsARM.td`, I've also lifted the definition of the
`MVEMXPredicated` multiclass higher up the file, without changing it.)

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75254

b08d2ddd

[ARM,MVE] Correct MC operands in VCVT.F32.F16. (NFC) · 69441e53

Simon Tatham authored Mar 02, 2020

Summary:
The two MVE instructions that convert between v4f32 and v8f16 were
implemented as instances of the same class, with the same MC operand
list.

But that's not really appropriate, because the narrowing conversion
only partially overwrites its output register (it only has 4 f16
values to write into a vector of 8), so even when unpredicated, it
needs a $Qd_src input, a constraint tying that to the $Qd output, and
a vpred_n.

The widening conversion is better represented like any other
instruction that completely replaces its output when unpredicated: it
should have no $Qd_src operand, and instead, a vpred_r containing a
$inactive parameter. That's a better match to other similar
instructions, such as its integer analogue, the VMOVL instruction that
makes a v4i32 by sign- or zero-extending every other lane of a v8i16.

This commit brings the widening VCVT.F32.F16 into line with the other
instructions that behave like it. That means you can write isel
patterns that use it unpredicated, without having to add a pointless
undefined $QdSrc operand.

No existing code generation uses that instruction yet, so there should
be no functional change from this fix.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75253

69441e53

[ARM,MVE] Add ACLE intrinsics for VQMOV[U]N family. · a41ecf0e

Simon Tatham authored Mar 02, 2020

Summary:
These instructions work like VMOVN (narrowing a vector of wide values
to half size, and overwriting every other lane of an output register
with the result), except that the narrowing conversion is saturating.
They come in three signedness flavours: signed to signed, unsigned to
unsigned, and signed to unsigned. All are represented in IR by a
target-specific intrinsic that takes two separate 'unsigned' flags.
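
The lane behaviour described above can be sketched in scalar C++. This is only a model under stated assumptions: it narrows 32-bit lanes to 16-bit lanes, the helper names are hypothetical, and the `Top` flag stands in for the choice between writing the even ("bottom") or odd ("top") lanes of the output.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Saturate a signed 32-bit value to the signed 16-bit range.
inline int16_t satS16(int32_t V) {
  if (V > INT16_MAX) return INT16_MAX;
  if (V < INT16_MIN) return INT16_MIN;
  return static_cast<int16_t>(V);
}

// Saturate a signed 32-bit value to the unsigned 16-bit range
// (the signed-to-unsigned flavour).
inline uint16_t satU16(int32_t V) {
  if (V > UINT16_MAX) return UINT16_MAX;
  if (V < 0) return 0;
  return static_cast<uint16_t>(V);
}

// Narrow four wide lanes and overwrite every other lane of Dst.
// Top == false writes lanes 0, 2, 4, 6; Top == true writes lanes 1, 3, 5, 7.
// The lanes that are not written keep their previous contents.
inline std::array<int16_t, 8> narrowSat(std::array<int16_t, 8> Dst,
                                        const std::array<int32_t, 4> &Src,
                                        bool Top) {
  for (std::size_t I = 0; I < Src.size(); ++I)
    Dst[2 * I + (Top ? 1 : 0)] = satS16(Src[I]);
  return Dst;
}
```

Because only half of the output lanes are written, the previous contents of the destination are a genuine input to the operation.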

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75252

a41ecf0e

[DWARF] Use DWARFDataExtractor::getInitialLength to parse debug_names · dba683cc

Pavel Labath authored Feb 25, 2020

Summary:
In this patch I've done a slightly bigger rewrite to also remove the
hardcoded header lengths.

Reviewers: jhenderson, dblaikie, ikudrin

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75119

dba683cc

[DWARF] Use getInitialLength in range list parsing · 164e2c85

Pavel Labath authored Feb 25, 2020

Summary:
This could be considered obvious, but I am putting it up to illustrate
the usefulness/impact of the getInitialLength change.

Reviewers: dblaikie, jhenderson, ikudrin

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75117

164e2c85

[DWARFDebugLine] Use new DWARFDataExtractor::getInitialLength · d978656f

Pavel Labath authored Feb 14, 2020

Summary:
The error messages change somewhat, but I believe the overall
informational value remains unchanged.

Reviewers: jhenderson, dblaikie, ikudrin

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75116

d978656f

Fix Base64Test - for StringRef size · b52355f8

serge-sans-paille authored Mar 02, 2020

Original failures: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15975/steps/test-stage1-compiler/logs/stdio

b52355f8

[ARM][MVE] Restrict allowed types of gather/scatter offsets · 39497411

Anna Welker authored Mar 02, 2020

The MVE gather instructions for types smaller than 32 bits zero-extend the values
in the offset register, as opposed to sign-extending them. We need to
make sure that the code that we select from is suitably extended, which
this patch attempts to fix by tightening up the offset checks.
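
To see why the kind of extension matters, here is a small scalar sketch of widening an 8-bit offset lane both ways. The helper names are hypothetical and the code only models the arithmetic, not the instruction selection logic.

```cpp
#include <cstdint>

// An 8-bit offset lane widened to 32 bits with zero extension,
// which is what the text says the gather instructions do.
inline int32_t zeroExtend(uint8_t Off) { return static_cast<int32_t>(Off); }

// The same bits widened with sign extension, for comparison.
inline int32_t signExtend(uint8_t Off) {
  return static_cast<int32_t>(static_cast<int8_t>(Off));
}
```

The two agree for offsets with the top bit clear, but an offset byte of 0xFF is 255 when zero-extended and -1 when sign-extended, so code written for sign-extended offsets would address the wrong element.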

Differential Revision: https://reviews.llvm.org/D75361

39497411

[NFC][PowerPC] Move some alias definition from PPCInstrInfo.td to PPCInstr64Bit.td · 4962a0b2
Kang Zhang authored Mar 02, 2020
```
Summary:
Some 64-bit instruction alias definitions are in PPCInstrInfo.td; they should be
moved to PPCInstr64Bit.td.
```
4962a0b2
[gn build] Port 5a1958f2 · 8c7c32b4
LLVM GN Syncbot authored Mar 02, 2020

8c7c32b4

Syndicate, test and fix base64 implementation · 5a1958f2

serge-sans-paille authored Feb 24, 2020

Move Base64 implementation from clangd/SemanticHighlighting to
llvm/Support/Base64, fix its implementation and provide a decent test suite.

The previous implementation used the + operator instead of | to combine some
results, which is a problem when shifting signed values. A char holding 0xFF is
implicitly promoted to a (signed) int before the shift, so where char is signed
(c << 16) yields 0xffff0000, which is negative. Combining negative numbers with
+ in that context is not what we want to do.
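
The effect can be reproduced with a small sketch that packs three bytes into 24 bits. The function names are hypothetical and this is not the code from either implementation; the sign extension is made explicit (and the shift done on an unsigned value) so that the example stays well defined.

```cpp
#include <cstdint>

// Sign-extend a byte to 32 bits, then reinterpret it as unsigned so that
// the shifts below are well defined.
inline uint32_t widenSigned(signed char C) {
  return static_cast<uint32_t>(static_cast<int32_t>(C));
}

// Problematic form: sign-extended bytes combined with +.
// A byte with the top bit set contributes set bits above its own field,
// and the addition then corrupts the neighbouring fields.
inline uint32_t packSignedPlus(signed char A, signed char B, signed char C) {
  return (widenSigned(A) << 16) + (widenSigned(B) << 8) + widenSigned(C);
}

// Robust form: bytes widened as unsigned and combined with |.
inline uint32_t packUnsignedOr(unsigned char A, unsigned char B,
                               unsigned char C) {
  return (static_cast<uint32_t>(A) << 16) | (static_cast<uint32_t>(B) << 8) |
         static_cast<uint32_t>(C);
}
```

For the bytes 0x01, 0xFF, 0xFF the unsigned version yields 0x01FFFF, while the signed version yields 0xFEFF; the two agree only when no byte has its top bit set.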

This fixes https://github.com/llvm/llvm-project/issues/149.

Differential Revision: https://reviews.llvm.org/D75057

5a1958f2

Revert "[DebugInfo][clang][DWARF5]: Added support for debuginfo generation for... · 802b22b5

Hans Wennborg authored Mar 02, 2020

Revert "[DebugInfo][clang][DWARF5]: Added support for debuginfo generation for defaulted parameters"

The Bitcode/DITemplateParameter-5.0.ll test is failing:

FAIL: LLVM :: Bitcode/DITemplateParameter-5.0.ll (5894 of 36324)
******************** TEST 'LLVM :: Bitcode/DITemplateParameter-5.0.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/llvm-dis -o - /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll.bc | /usr/local/google/home/thakis/src/llvm-project/out/gn/bin/FileCheck /usr/local/google/home/thakis/src/llvm-project/llvm/test/Bitcode/DITemplateParameter-5.0.ll
--
Exit Code: 2

Command Output (stderr):
--

It looks like the Bitcode/DITemplateParameter-5.0.ll.bc file was never checked in.

This reverts commit c2b437d5.

802b22b5

[DebugInfo][clang][DWARF5]: Added support for debuginfo generation for defaulted parameters · c2b437d5

Awanish Pandey authored Mar 02, 2020

in C++ templates.

Summary:
This patch adds support for debuginfo generation for defaulted
parameters in clang and also extends corresponding DebugMetadata/IR to support this feature.

Reviewers: probinson, aprantl, dblaikie

Reviewed By: aprantl, dblaikie

Differential Revision: https://reviews.llvm.org/D73462

c2b437d5

[PowerPC][test] Improve .got2 and .toc tests · daab6ad5
Fangrui Song authored Mar 01, 2020
```
There is no .got2 test for powerpc32.
There is no comdat variable test for powerpc{32,64}.
```
daab6ad5

[InlineSpiller] Relax re-materialization restriction for statepoint · 496e0a99

Serguei Katkov authored Feb 28, 2020

We should be careful to keep the number of re-materialized operands below the
number of physical registers.

A STATEPOINT instruction has a variable, and potentially very large, number of operands,
so re-materialization is currently disabled for all of its operands if restrict-statepoint-remat is true.

The patch relaxes this restriction for STATEPOINT instructions by allowing re-materialization of
fixed operands, specifically the call target.

Reviewers: reames
Reviewed By: reames
Subscribers: llvm-commits, qcolombet, hiraditya
Differential Revision: https://reviews.llvm.org/D75335

496e0a99

[Sparc] Fix incorrect operand for matching CMPri pattern · bfdb834b

Jim Lin authored Mar 02, 2020

Summary:
The operand should be a normal constant instead of a target constant.
Pattern CMPri is matched if the constant fits into the immediate field;
otherwise, pattern CMPrr is matched.
This fixes bug https://bugs.llvm.org/show_bug.cgi?id=44091.
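
The "fits into the immediate field" condition can be sketched as a range check. The 13-bit signed field width used below is an assumption based on the usual SPARC immediate format, and `fitsSignedImm` and `selectCompare` are hypothetical helpers, not part of the backend.

```cpp
#include <cstdint>
#include <string>

// True if V fits in a signed immediate field that is Bits wide.
// Assumes 1 <= Bits <= 63.
inline bool fitsSignedImm(int64_t V, unsigned Bits) {
  const int64_t Lo = -(int64_t{1} << (Bits - 1));
  const int64_t Hi = (int64_t{1} << (Bits - 1)) - 1;
  return V >= Lo && V <= Hi;
}

// Hypothetical selector: names the compare pattern that would apply to the
// constant V, assuming a 13-bit signed immediate field.
inline std::string selectCompare(int64_t V) {
  return fitsSignedImm(V, 13) ? "CMPri" : "CMPrr";
}
```

Under that assumption, constants from -4096 to 4095 take the register-immediate form and anything outside that range falls back to the register-register form.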

Reviewers: dcederman, jyknight

Reviewed By: jyknight

Subscribers: jonpa, hiraditya, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75227

bfdb834b

[DAGCombiner][X86] Disable narrowExtractedVectorLoad if the element type size isn't byte sized · 0cd6712a

Craig Topper authored Mar 01, 2020

The address calculation for the offset assumes that you can calculate the offset by multiplying the index by the store size of the element. But that only works if the element's store size is exactly its real size since we store vectors tightly packed in memory. There are improvements we could make to this like special casing extracting element 0. I think we could also handle cases where the extracted VT is byte sized and the index is aligned with the extract element count.
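
A short arithmetic sketch shows where the assumption breaks. The helper names are hypothetical; the code only compares the byte offset obtained from "index times store size" with the true bit position of an element in a tightly packed vector.

```cpp
#include <cstdint>

// Store size in bytes of an element that is SizeInBits wide (rounded up).
inline uint64_t storeSizeBytes(uint64_t SizeInBits) {
  return (SizeInBits + 7) / 8;
}

// Offset the transform would compute: index times element store size.
inline uint64_t assumedByteOffset(uint64_t Index, uint64_t EltBits) {
  return Index * storeSizeBytes(EltBits);
}

// Actual bit position of the element in a tightly packed vector.
inline uint64_t packedBitOffset(uint64_t Index, uint64_t EltBits) {
  return Index * EltBits;
}

// The byte-offset assumption is only safe for byte-sized elements.
inline bool isByteSized(uint64_t EltBits) { return EltBits % 8 == 0; }
```

For 32-bit elements, index 3 gives a byte offset of 12, which matches bit position 96. For 1-bit elements, index 3 gives a byte offset of 3 (bit 24), while the element actually lives at bit 3 of the first byte.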

Differential Revision: https://reviews.llvm.org/D75377

0cd6712a

[X86] Not track size of the boundaryalign fragment during the layout · 2ac19feb

Shengchen Kan authored Mar 01, 2020

Summary:
Currently the boundaryalign fragment caches its size during layout, and
each relaxation iteration then updates that cached size. This
behaviour is unnecessary and ugly.

Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight

Reviewed By: MaskRay

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75404

2ac19feb

[X86][TwoAddressInstructionPass] Teach tryInstructionCommute to continue... · b6e27961

Craig Topper authored Feb 27, 2020

[X86][TwoAddressInstructionPass] Teach tryInstructionCommute to continue checking for commutable FMA operands in more cases.

Previously we would only check for another commutable operand if the first commute was an aggressive commute.

But if we have two kill operands and neither is tied to the def at the start, we should consider both operands as the one to use as the new def.

This improves the loop in the fma-commute-loop.ll test. This test is derived from a Discourse post: https://llvm.discourse.group/t/unnecessary-vmovapd-instructions-generated-can-you-hint-in-favor-of-vfmadd231pd/582

Differential Revision: https://reviews.llvm.org/D75016

b6e27961

Mar 01, 2020

[DAGCombiner] Don't emit select_cc from visitSINT_TO_FP/visitUINT_TO_FP. Use plain select instead. · 211fb91f

Craig Topper authored Mar 01, 2020

Select_cc isn't used by all targets. X86 doesn't have optimizations
for it.

Since we already know the input to the sint_to_fp/uint_to_fp is
a setcc we can just emit a plain select using that setcc as the
condition. Other DAG combines can turn that into a select_cc on
targets that support it.

Differential Revision: https://reviews.llvm.org/D75415

211fb91f

[JITLink] Update DEBUG_TYPE string for llvm-jitlink. · 66128c48
Lang Hames authored Mar 01, 2020
```
Apparently LLVM_DEBUG doesn't like dashes in strings.
```
66128c48
Fix [ADT][NFC] SCCIterator: Change hasLoop() to hasCycle() · 6fa0b6dd
Stefanos Baziotis authored Mar 01, 2020

6fa0b6dd
[ADT][NFC] SCCIterator: Change hasLoop() to hasCycle() · 21390eab
Stefanos Baziotis authored Mar 01, 2020

21390eab
Attempt to fix ZLIB CMake logic on Windows · 1079c68a
Reid Kleckner authored Mar 01, 2020
```
CMake doesn't seem to like it when you regex search for "^".
```
1079c68a

[WinEH] Fix inttoptr+phi optimization in presence of catchswitch · 1adbe86d

Reid Kleckner authored Mar 01, 2020

getFirstInsertionPt's return value must be checked for validity before
casting it to Instruction*. Don't attempt to insert casts after a phi in
a catchswitch block.

Fixes PR45033, introduced in D37832.

Reviewed By: davidxl, hfinkel

Differential Revision: https://reviews.llvm.org/D75381

1adbe86d

[DAGCombiner] recognize shuffle (shuffle X, Mask0), Mask --> splat X · 619d7dc3

Sanjay Patel authored Mar 01, 2020

We get the simple cases of this via demanded elements and other folds,
but that doesn't work if the values have >1 use, so add a dedicated
match for the pattern.

We already have this transform in IR, but it doesn't help the
motivating x86 tests (based on PR42024) because the shuffles don't
exist until after legalization and other combines have happened.
The AArch64 test shows a minimal IR example of the problem.
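
The pattern can be modelled on plain index masks. The sketch below is not the DAGCombiner code: it assumes single-input shuffles, uses -1 for an undefined lane, and the function name is hypothetical. It composes the two masks and reports the source lane when every defined output lane reads the same element of X.

```cpp
#include <cstddef>
#include <vector>

// Compose two single-input shuffle masks. Returns the source lane of X if
// every defined output lane reads that same lane, or -1 otherwise (including
// the case where no output lane is defined).
// A mask entry of -1 models an undefined lane; all other entries are assumed
// to be valid indices.
inline int splatIndexOfNestedShuffle(const std::vector<int> &InnerMask,
                                     const std::vector<int> &OuterMask) {
  int SplatIndex = -1;
  for (int OuterElt : OuterMask) {
    if (OuterElt == -1)
      continue; // an undefined output lane places no constraint
    int InnerElt = InnerMask[static_cast<std::size_t>(OuterElt)];
    if (InnerElt == -1)
      continue; // reads an undefined lane of the inner shuffle
    if (SplatIndex != -1 && SplatIndex != InnerElt)
      return -1; // two different source lanes: not a splat
    SplatIndex = InnerElt;
  }
  return SplatIndex;
}
```

With an inner mask of {2, 2, 0, 1} and an outer mask of {0, 1, 0, 1}, every output lane reads X[2], so the pair can be replaced by a splat of lane 2 even if the inner shuffle has other users.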

Differential Revision: https://reviews.llvm.org/D75348

619d7dc3

[Coroutines][New pass manager] Move CoroElide pass to right position · 624dbfcc
Jun Ma authored Mar 01, 2020
```
Differential Revision: https://reviews.llvm.org/D75345
```
624dbfcc
Revert "[Coroutines][new pass manager] Move CoroElide pass to right position" · 44d83671
Jun Ma authored Mar 01, 2020
```
This reverts commit 4c0a133a.
```
44d83671
[Coroutines][new pass manager] Move CoroElide pass to right position · 4c0a133a
Jun Ma authored Mar 01, 2020
```
Differential Revision: https://reviews.llvm.org/D75345
```
4c0a133a

[X86] Don't add DELETED_NODES to DAG combine worklist after calling... · 2f4f8fcf

Craig Topper authored Mar 01, 2020

[X86] Don't add DELETED_NODES to DAG combine worklist after calling SimplifyDemandedBits/SimplifyDemandedVectorElts.

These AddToWorklist calls were added in 84cd968f.
It's possible the SimplifyDemandedBits/SimplifyDemandedVectorElts
triggered CSE that deleted N. Detect that and avoid adding N
to the worklist.

Fixes PR45067.

2f4f8fcf

[PowerPC] Move .got2/.toc logic from PPCLinuxAsmPrinter::doFinalization() to emitEndOfAsmFile() · 9569a147
Fangrui Song authored Feb 29, 2020
```
Delete redundant .p2align 2 and improve tests.
```
9569a147

Feb 29, 2020

[ValueTracking] Let getGuaranteedNonFullPoisonOp consider assume, remove mentioning about br · 644e7476

Juneyoung Lee authored Mar 01, 2020

Summary:
This patch helps getGuaranteedNonFullPoisonOp handle llvm.assume call.
Also, a comment about the semantics of branch is removed to prevent confusion.
As llvm.assume does, branching on poison directly raises UB (as LangRef says), and this allows transformations such as introduction of llvm.assume on branch condition at each successor, or freely replacing values after conditional branch (such as at loop exit).
Handling br is not addressed in this patch. It makes SCEV more accurate, causing existing LoopVectorize/IndVar/etc tests to fail.

Reviewers: spatel, lebedev.ri, nlopes

Reviewed By: nlopes

Subscribers: hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75397

644e7476