Commits · c7d4aa711ab7981358b5e17e56f1fb6f7f585ac1 · Raul Torres / llvm-target-spread

Oct 02, 2020

[libc++] Move the weak symbols list to libc++abi · c7d4aa71

Louis Dionne authored Sep 30, 2020

Those symbols are exported from libc++abi in the first place, so it
makes more sense to have them there.

c7d4aa71

BlockFrequencyInfoImpl.h - use const references to avoid FrequencyData copies. NFCI. · 4edd74a1
Simon Pilgrim authored Oct 02, 2020

4edd74a1
LoopAccessAnalysis.cpp - use const reference in for-range loops. NFCI. · 71b89b14
Simon Pilgrim authored Oct 02, 2020

71b89b14
[SLP] Add test where reduction result is used in PHI. · bb448a24
Florian Hahn authored Oct 02, 2020
```
Test case for PR47670.
```
bb448a24
[InstCombine] Add partial bswap vector test from D88578 · 53fb9d06
Simon Pilgrim authored Oct 02, 2020

53fb9d06

[AArch64] Add CPU Cortex-R82 · 8825fec3

Sjoerd Meijer authored Oct 01, 2020

This adds support for -mcpu=cortex-r82. Some more information about this
core can be found here:

https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r82

One note about the system register: that is a bit of a refactoring because of
small differences between v8.4-A AArch64 and v8-R AArch64.

This is based on patches from Mark Murray and Mikhail Maltsev.

Differential Revision: https://reviews.llvm.org/D88660

8825fec3

[clangd] Make PopulateSwitch a fix. · 57ac47d7

Sam McCall authored Oct 02, 2020

It fixes the -Wswitch warning, though we mark it as a fix even if that is off.
This makes it the "recommended" action on an incomplete switch, which seems OK.

Differential Revision: https://reviews.llvm.org/D88726

57ac47d7

[PhaseOrdering] Add test that requires peeling before vectorization. · 6481a764
Florian Hahn authored Sep 29, 2020
```
Test case for PR47671.
```
6481a764

[GVN LoadPRE] Add test to show an opportunity. · 8ae1369f

Serguei Katkov authored Oct 02, 2020

We can use context to prove that the load can be safely executed
at the point to which it is being hoisted.

8ae1369f

[MLIR][LLVM] Fixed `topologicalSort()` to iterative version · d4568ed7

George Mitenkov authored Oct 02, 2020

Instead of recursive helper method `topologicalSortImpl()`,
sort's implementation is moved to `topologicalSort()` function's
body directly. `llvm::ReversePostOrderTraversal` is used to create
a traversal of blocks in reverse post order.
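The idea of replacing a recursive helper with an iterative reverse post-order walk can be sketched as follows. This is only an illustration, not the MLIR code: the `Graph` alias and the `reversePostOrder` function are hypothetical stand-ins that use plain adjacency lists instead of `llvm::ReversePostOrderTraversal`, and the entry block is assumed to be a valid index.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CFG: block i branches to successors[i].
using Graph = std::vector<std::vector<std::size_t>>;

// Returns the blocks reachable from `entry` in reverse post-order, using an
// explicit stack instead of recursion. Assumes entry < successors.size().
std::vector<std::size_t> reversePostOrder(const Graph &successors,
                                          std::size_t entry) {
  std::vector<bool> visited(successors.size(), false);
  std::vector<std::size_t> postOrder;
  // Each frame holds a block and the index of its next successor to visit.
  std::vector<std::pair<std::size_t, std::size_t>> stack;

  visited[entry] = true;
  stack.emplace_back(entry, 0);
  while (!stack.empty()) {
    const std::size_t block = stack.back().first;
    const std::size_t next = stack.back().second;
    if (next < successors[block].size()) {
      ++stack.back().second; // advance before the stack may reallocate
      const std::size_t succ = successors[block][next];
      if (!visited[succ]) {
        visited[succ] = true;
        stack.emplace_back(succ, 0);
      }
    } else {
      // All successors are done, so the block is finished in post-order.
      postOrder.push_back(block);
      stack.pop_back();
    }
  }
  std::reverse(postOrder.begin(), postOrder.end());
  return postOrder;
}
```

For a diamond-shaped graph `{{1, 2}, {3}, {3}, {}}` with entry 0, the entry block comes first and the join block 3 comes last, which is the property a topological sort of blocks needs.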

Reviewed By: kiranchandramohan, rriddle

Differential Revision: https://reviews.llvm.org/D88544

d4568ed7

[mlir] Add subtensor_insert operation · cf9503c1
Nicolas Vasilache authored Oct 02, 2020
```
Differential revision: https://reviews.llvm.org/D88657
```
cf9503c1
[clangd][lit] Update document-link.test to respect custom resource-dir locations · 54c03d8f
Kadir Cetinkaya authored Oct 02, 2020
```
Differential Revision: https://reviews.llvm.org/D88721
```
54c03d8f
[InstCombine] Add some basic vector bswap tests · ec07ae2a
Simon Pilgrim authored Oct 02, 2020
```
We get the vNi16 cases already via matching as a rotate followed by the fshl -> bswap combines
```
ec07ae2a
[mlir] Add canonicalization for the `subtensor` op · 787bf5e3
Nicolas Vasilache authored Oct 02, 2020
```
Differential revision: https://reviews.llvm.org/D88656
```
787bf5e3

[mlir] Add a subtensor operation · e3de249a

Nicolas Vasilache authored Oct 02, 2020

This revision introduces a `subtensor` op, which is the counterpart of `subview` for a tensor operand. This also refactors the relevant pieces to allow reusing the `subview` implementation where appropriate.

This operation will be used to implement tiling for Linalg on tensors.

e3de249a

[InstCombine] Add partial bswap test from D88578 · 670e60c0
Simon Pilgrim authored Oct 02, 2020

670e60c0

[ARM] Prevent constants from iCmp instruction from being hoisted if part of a min(max()) pattern · f7c0e2b8

Meera Nakrani authored Oct 02, 2020

Marks constants of an ICmp instruction as free if its only user is a select
instruction that is part of a min(max()) pattern. Ensures that in loops, in
particular when loop unrolling is turned on, SSAT will still be correctly generated.
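For readers unfamiliar with the pattern, a min(max()) clamp of the kind that can map to a signed saturate looks roughly like the sketch below. The function name `saturateSigned` and the choice of a 64-bit intermediate are assumptions made for illustration; the commit itself only concerns how the constants of such a pattern are costed.

```cpp
#include <algorithm>
#include <cstdint>

// Clamp x to the signed range representable in `bits` bits, written as
// min(max(x, lo), hi). Assumes 1 <= bits <= 32.
std::int32_t saturateSigned(std::int32_t x, unsigned bits) {
  const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
  const std::int64_t lo = -hi - 1;
  const std::int64_t clamped =
      std::min(std::max(static_cast<std::int64_t>(x), lo), hi);
  return static_cast<std::int32_t>(clamped);
}
```

The two bounds `lo` and `hi` are the constants that feed the compares and selects; if they were hoisted out of a loop, the compare/select pair would be harder to recognise as a single saturating operation.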

Differential Revision: https://reviews.llvm.org/D88662

f7c0e2b8

[RISCV] Support vmsge.vx and vmsgeu.vx pseudo instructions in RVV. · 067add7b

Hsiangkai Wang authored Jul 28, 2020

Implement vmsge{u}.vx pseudo instruction.

According to RISC-V V specification, there are different scenarios for this
pseudo instruction. I list them below.

unmasked va >= x

  pseudoinstruction: vmsge{u}.vx vd, va, x
  expansion: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd

masked va >= x, vd != v0

  pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t
  expansion: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0

masked va >= x, vd == v0

  pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t, vt
  expansion: vmslt{u}.vx vt, va, x;  vmandnot.mm vd, vd, vt

Use a pseudo instruction to model vmsge{u}.vx. The pseudo instruction is converted
to a different expansion depending on the scenario.
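The three expansions above can be checked with a small bit-level model. This is a sketch under stated assumptions: mask registers are modelled as 8-bit integers (one bit per lane), the vector length `kLanes` is hypothetical, only the signed variant is shown, and inactive lanes of a masked `vmslt.vx` are assumed to keep the old destination bit. None of the function names below come from the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 8;     // hypothetical vector length
using Vec = std::array<int, kLanes>;  // one signed element per lane
using Mask = std::uint8_t;            // bit i holds the mask bit of lane i

// vmslt.vx vd, va, x (unmasked): bit i is set where va[i] < x.
Mask vmslt(const Vec &va, int x) {
  Mask result = 0;
  for (std::size_t i = 0; i < kLanes; ++i)
    if (va[i] < x)
      result = static_cast<Mask>(result | (1u << i));
  return result;
}

// Unmasked: vmslt.vx vd, va, x; vmnand.mm vd, vd, vd
Mask vmsgeUnmasked(const Vec &va, int x) {
  const Mask vd = vmslt(va, x);
  return static_cast<Mask>(~(vd & vd));
}

// Masked, vd != v0: vmslt.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0
// Assumption: inactive lanes keep the old destination bit.
Mask vmsgeMasked(const Vec &va, int x, Mask v0, Mask oldVd) {
  const Mask vd = static_cast<Mask>((vmslt(va, x) & v0) | (oldVd & ~v0));
  return static_cast<Mask>(vd ^ v0);
}

// Masked, vd == v0: vmslt.vx vt, va, x; vmandnot.mm vd, vd, vt
Mask vmsgeMaskedIntoV0(const Vec &va, int x, Mask v0) {
  const Mask vt = vmslt(va, x);
  return static_cast<Mask>(v0 & ~vt);
}
```

In the masked case with vd != v0, the XOR with v0 flips only the active lanes, so inactive lanes are left untouched; in the vd == v0 case the inactive lanes are already zero in v0 and stay zero after the and-not.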

Differential Revision: https://reviews.llvm.org/D84732

067add7b

[clangd] Remove Tweak::Intent, use CodeAction kind directly. NFC · 17747d2e

Sam McCall authored Sep 28, 2020

Intent was a nice idea but it ends up being a bit awkward/heavyweight
without adding much.

In particular, it makes it hard to implement `CodeActionParams.only` properly
(there's an inheritance hierarchy for kinds).

Differential Revision: https://reviews.llvm.org/D88427

17747d2e

Fix limit behavior of dynamic alloca · 9573c9f2

serge-sans-paille authored Sep 30, 2020

When the allocation size is 0, we shouldn't probe. Within [1, PAGE_SIZE], we
should probe once, and so on.
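Read literally, the limit behavior described above amounts to a ceiling division. The sketch below is an assumption about what "etc." covers (one additional probe per additional page); the function name `probeCount` is hypothetical and the real probing logic lives in the code generator, not in a helper like this.

```cpp
#include <cstdint>

// Number of probes for a dynamic allocation of allocSize bytes:
// 0 for size 0, 1 for sizes in [1, pageSize], 2 for (pageSize, 2*pageSize], ...
// Assumes pageSize > 0.
constexpr std::uint64_t probeCount(std::uint64_t allocSize,
                                   std::uint64_t pageSize) {
  return allocSize / pageSize + (allocSize % pageSize != 0 ? 1 : 0);
}
```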

This fixes https://bugs.llvm.org/show_bug.cgi?id=47657

Differential Revision: https://reviews.llvm.org/D88548

9573c9f2

[yaml2obj][elf2yaml] - Add a support for the `EntSize` field for `SHT_HASH` sections. · 5829dc92

Georgii Rymar authored Oct 01, 2020

The specification for the SHT_HASH table says (https://refspecs.linuxbase.org/elf/gabi4+/ch5.dynamic.html#hash)
that it contains Elf32_Word entries for both 32-bit and 64-bit objects.

Currently both GNU linkers and LLD set the `sh_entsize` field to `4`.

At the same time, `yaml2obj` ignores the `EntSize` field for SHT_HASH sections.
This patch fixes this and also adds support for obj2yaml: it will not
dump this field when the `sh_entsize` contains the default value (`4`).

Differential revision: https://reviews.llvm.org/D88652

5829dc92

Handle unused variable without asserts · bfd7ee92
Tres Popp authored Oct 02, 2020

bfd7ee92
[clangd] Drop dependence on standard library in check.test · bc18d8d9
Sam McCall authored Oct 02, 2020

bc18d8d9

[WebAssembly] Emulate v128.const efficiently · 542523a6

Thomas Lively authored Oct 02, 2020

v128.const was recently implemented in V8, but until it rolls into Chrome
stable, we can't enable it in the WebAssembly backend without breaking origin
trial users. So far we have been lowering build_vectors that would otherwise
have been lowered to v128.const to splats followed by sequences of replace_lane
instructions to initialize each lane individually. That produces large and
inefficient code, so this patch introduces new logic to lower integer vector
constants to a single i64x2.splat where possible, with at most a single
i64x2.replace_lane following it if necessary.
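The lowering strategy can be illustrated with a small planning function. This is a sketch, not the backend code: the `V128Plan` struct and the `packLane` and `planV128Const` functions are hypothetical, and the constant is assumed to be given as 16 bytes in little-endian lane order.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical description of the emitted sequence.
struct V128Plan {
  std::uint64_t splatValue;  // operand of i64x2.splat
  bool needsReplaceLane;     // whether an i64x2.replace_lane 1 follows
  std::uint64_t lane1Value;  // operand of that replace_lane, if needed
};

// Pack eight bytes of the constant into one 64-bit lane (little-endian).
std::uint64_t packLane(const std::array<std::uint8_t, 16> &bytes,
                       std::size_t lane) {
  std::uint64_t value = 0;
  for (std::size_t i = 8; i-- > 0;)
    value = (value << 8) | bytes[lane * 8 + i];
  return value;
}

// Splat lane 0 across the vector; patch lane 1 only when it differs.
V128Plan planV128Const(const std::array<std::uint8_t, 16> &bytes) {
  const std::uint64_t lane0 = packLane(bytes, 0);
  const std::uint64_t lane1 = packLane(bytes, 1);
  return V128Plan{lane0, lane0 != lane1, lane1};
}
```

Whatever the original lane width of the integer constant, the sequence is at most two instructions, instead of one splat plus a replace_lane per lane.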

Adapted from a patch authored by @omnisip.

Differential Revision: https://reviews.llvm.org/D88591

542523a6

[SVE][CodeGen] Fix implicit TypeSize->uint64_t casts in TypePromotion · b0ce9f0f

David Sherwood authored Sep 30, 2020

The TypePromotion pass only operates on scalar types so I've fixed up
all places where we were relying upon the implicit cast from
TypeSize->uint64_t.

Differential Revision: https://reviews.llvm.org/D88575

b0ce9f0f

[SVE][CodeGen] Add new EVT/MVT getFixedSizeInBits() functions · b8ce6a67

David Sherwood authored Oct 01, 2020

When we know that a particular type is always going to be fixed
width we have so far been writing code like this:

  getSizeInBits().getFixedSize()

Since we are doing this in quite a few places now it seems to make
sense to add a new helper function that allows us to replace
these calls with a single getFixedSizeInBits() call.
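The shape of such a helper can be sketched with simplified stand-in classes. `SketchTypeSize` and `SketchValueType` are hypothetical and much smaller than the real `TypeSize`, `EVT` and `MVT` classes; only the method names `getSizeInBits()`, `getFixedSize()` and `getFixedSizeInBits()` are taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a size that may be fixed or scalable.
class SketchTypeSize {
  std::uint64_t MinSize;
  bool Scalable;

public:
  SketchTypeSize(std::uint64_t MinSize, bool Scalable)
      : MinSize(MinSize), Scalable(Scalable) {}

  bool isScalable() const { return Scalable; }

  // Only meaningful for fixed-width sizes.
  std::uint64_t getFixedSize() const {
    assert(!Scalable && "request for a fixed size on a scalable type");
    return MinSize;
  }
};

// Simplified stand-in for a value type such as EVT or MVT.
class SketchValueType {
  SketchTypeSize Size;

public:
  explicit SketchValueType(SketchTypeSize Size) : Size(Size) {}

  SketchTypeSize getSizeInBits() const { return Size; }

  // Shorthand for getSizeInBits().getFixedSize().
  std::uint64_t getFixedSizeInBits() const {
    return getSizeInBits().getFixedSize();
  }
};
```

Callers that know the type is fixed width can then write `VT.getFixedSizeInBits()` and still hit the assertion if a scalable type slips through.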

Differential Revision: https://reviews.llvm.org/D88649

b8ce6a67

[AArch64] Omit SEH directives for the epilogue if none are needed · afb4e0f2

Martin Storsjö authored Oct 01, 2020

For these cases, we already omit the prologue directives, if
(!AFI->hasStackFrame() && !windowsRequiresStackProbe && !NumBytes).

When writing the epilogue (after the prolog has been written), if
the function doesn't have the WinCFI flag set (i.e. if no prologue
was generated), assume that no epilogue will be needed either,
and don't emit any epilog start pseudo instruction. After completing
the epilogue, make sure that it actually matched the prologue.

Previously, when epilogue start/end was generated, but no prologue,
the unwind info for such functions actually was huge; 12 bytes xdata
(4 bytes header, 4 bytes for one non-folded epilogue header, 4 bytes
for padded opcodes) and 8 bytes pdata. Because the epilog consisted of
one opcode (end) but the prolog was empty (no .seh_endprologue), the
epilogue couldn't be folded into the prologue, and thus couldn't be
considered for packed form either.

On a 6.5 MB DLL with 110 KB pdata and 166 KB xdata, this gets rid of
38 KB pdata and 62 KB xdata.

Differential Revision: https://reviews.llvm.org/D88641

afb4e0f2

[MLIR] Updates around MemRef Normalization · 47df8c57

Stephen Neuendorffer authored Sep 29, 2020

The documentation for the NormalizeMemRefs pass and the associated MemRefsNormalizable
traits was confusing and not on the website. This update clarifies the language
around the difference between a MemRef type and an operation that accesses a value of
MemRef type, and better documents the limitations of the current implementation.
This patch also includes some basic debugging information for the pass so people
might have a chance of figuring out why it doesn't work on their code.

Differential Revision: https://reviews.llvm.org/D88532

47df8c57

[SCEV] Limited support for unsigned preds in isImpliedViaOperations · b8ac19cf

Max Kazantsev authored Oct 02, 2020

The logic there only considers `SLT/SGT` predicates. We can use the same logic
for proving `ULT/UGT` predicates if all involved values are non-negative.
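The underlying fact is that signed and unsigned comparisons agree when both operands are non-negative. A minimal sketch on 32-bit integers follows; the helper names `slt`, `ult` and `predicatesAgree` are hypothetical and unrelated to the SCEV API.

```cpp
#include <cstdint>

// Signed less-than.
bool slt(std::int32_t a, std::int32_t b) { return a < b; }

// Unsigned less-than on the same bit patterns.
bool ult(std::int32_t a, std::int32_t b) {
  return static_cast<std::uint32_t>(a) < static_cast<std::uint32_t>(b);
}

// True when the signed and unsigned predicates give the same answer.
bool predicatesAgree(std::int32_t a, std::int32_t b) {
  return slt(a, b) == ult(a, b);
}
```

With a negative operand the two can differ: `-1` is less than `1` as a signed value, but its bit pattern is the largest unsigned value, which is why the non-negativity condition is required.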

Adding full-scale support for unsigned predicates might be challenging because of the
amount of code involved, so we can consider this in the future.

Differential Revision: https://reviews.llvm.org/D88087
Reviewed By: reames

b8ac19cf

[gvn] Handle a corner case w/vectors of non-integral pointers · f29645e7

Philip Reames authored Oct 01, 2020

If we try to coerce a vector of non-integral pointers to a narrower type (either narrower vector or single pointer), we use inttoptr and violate the semantics of non-integral pointers.  In theory, we can handle many of these cases, we just need to use a different code idiom to convert without going through inttoptr and back.

This shows up as wrong code bugs, and in some cases, crashes due to failed asserts.  Modeled after a change which has lived downstream for a couple years, though completely rewritten to be more idiomatic.

f29645e7

[AMDGPU] SIInsertSkips: Tidy block splitting to use splitAt · 2ef9d21e

Carl Ritson authored Oct 02, 2020

Convert to use new MachineBasicBlock splitAt function.
Place code in splitBlock function for reuse in future changes.
Should yield no functional change.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88537

2ef9d21e

Have kernel binary scanner load dSYMs as binary+dSYM if best thing found · a1e97923

Jason Molenda authored Oct 01, 2020

lldb's PlatformDarwinKernel scans the local filesystem (well known
locations, plus user-specified directories) for kernels and kexts
when doing kernel debugging, and loads them automatically.  Sometimes
kernel developers want to debug with *only* a dSYM, in which case they
give lldb the DWARF binary + the dSYM as a binary and symbol file.
This patch adds code to lldb to do this automatically if that's the
best thing lldb can find.

A few other bits of cleanup in PlatformDarwinKernel that I undertook
at the same time:

1. Remove the 'platform.plugin.darwin-kernel.search-locally-for-kexts'
setting.  When I added the local filesystem index at start of kernel
debugging, I thought people might object to the cost of the search
and want a way to disable it.  No one has.

2. Change the behavior of
'plugin.dynamic-loader.darwin-kernel.load-kexts' setting so it does
not disable the local filesystem scan, or use of the local filesystem
binaries.

3. Split PlatformDarwinKernel::GetSharedModule into GetSharedModuleKext and
GetSharedModuleKernel for easier readability & maintenance.

4. Added accounting of .dSYM.yaa files (an archive format akin to tar)
that I come across during the scan.  I'm not using these for now; it
would be very expensive to expand the archives & see if the UUID matches
what I'm searching for.

<rdar://problem/69774993>
Differential Revision: https://reviews.llvm.org/D88632

a1e97923

CodeGen: Fix livein calculation in MachineBasicBlock splitAt · 5136f474

Carl Ritson authored Oct 02, 2020

Fix and simplify computation of liveins for new block.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D88535

5136f474

[PowerPC] Put the CR field in low bits of GRC during copying CRRC to GRC. · c4690b00

Esme-Yi authored Oct 02, 2020

Summary: Currently, CRRC is copied to GRC with a single MFOCRF, which copies the contents of CR field n (CR bits 4×n+32:4×n+35) into bits 4×n+32:4×n+35 of the GRC register. That is not correct, because we expect the value of the destination register to equal the source, so the contents of the CR field have to be placed in the lowest 4 bits. This patch adds an RLWINM after the MFOCRF to achieve that.
The problem came up when adding builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp, as posted in D88278. We need to move the outputs (in a CR register) to GRC. However, the outputs of these instructions may not be in a fixed CR# register, so we can't directly add a rotation instruction in the .td patterns, but need to wait until the CR register is determined. We then confirmed that this is a bug in the POST-RA PSEUDO PASS.
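The effect of following MFOCRF with a rotate-and-mask can be modelled on a 32-bit image of the condition register. This is a sketch under assumptions: the function name `crFieldToLowBits` is hypothetical, CR field 0 is taken to be the most significant nibble of the 32-bit image, and the rotate amount `4*n + 4` is derived from that layout rather than copied from the patch.

```cpp
#include <cstdint>

// Move CR field n (n in 0..7, field 0 = most significant nibble of the
// 32-bit image) into the lowest 4 bits, as a rotate-left followed by a mask.
std::uint32_t crFieldToLowBits(std::uint32_t crImage, unsigned n) {
  const unsigned sh = (4 * n + 4) % 32;  // rotate-left amount
  const std::uint32_t rotated =
      sh == 0 ? crImage : (crImage << sh) | (crImage >> (32 - sh));
  return rotated & 0xFu;  // keep only the low 4 bits
}
```

Because the rotate amount depends on n, it can only be chosen once the CR register is known, which matches the observation that the fix cannot be expressed directly in the .td patterns.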

Reviewed By: nemanjai, shchenz

Differential Revision: https://reviews.llvm.org/D88274

c4690b00

[OpenMP] Add Missing Runtime Call for Globalization Remarks · 82453e75

Joseph Huber authored Sep 30, 2020

Summary:
Add a missing runtime call to perform data globalization checks.

Reviewers: jdoerfert

Subscribers: guansong hiraditya llvm-commits sstefan1 yaxunl

Tags: #LLVM #OpenMP

Differential Revision: https://reviews.llvm.org/D88621

82453e75

[flang][openacc] Update loop construct lowering · c1dcb573

Valentin Clement authored Oct 01, 2020

Update the loop construct lowering to support multiple occurrences of the same clauses
such as private. Add some utility functions used by other constructs.

Upstreaming part of https://github.com/flang-compiler/f18-llvm-project/pull/438/

Reviewed By: schweitz

Differential Revision: https://reviews.llvm.org/D88253

c1dcb573

[flang] Extend runtime API for PAUSE to allow a stop code · 3261aefc

peter klausler authored Oct 01, 2020

Support integer and default character stop codes on PAUSE
statements.  Add length argument to STOP statement with a
character stop code.

Differential revision: https://reviews.llvm.org/D88692

3261aefc

[flang] Fix actions at end of output record · a94d943f

peter klausler authored Oct 01, 2020

It turns out that unformatted fixed-size output records
do need to be padded out if short, in order to avoid a
spurious EOF crash on a short record at the end of the file.
While here in AdvanceRecord(), move the unformatted
variable-length record header/footer writing code to here
from EndIoStatement().

Differential revision: https://reviews.llvm.org/D88685

a94d943f

[XCOFF] Enable -fdata-sections on AIX · 78a9e62a

jasonliu authored Oct 01, 2020

Summary:
A design decision worth noting:

I noticed a recent mailing list discussion about why string literals are
not affected by -fdata-sections for ELF targets:
http://lists.llvm.org/pipermail/llvm-dev/2020-September/145121.html

But on AIX, our linker cannot split mergeable strings like other targets can.
So I think it makes more sense for us to emit a separate csect for
every mergeable string in -fdata-sections mode,
as there might not be another way for the linker to garbage-collect
unused mergeable strings.

Reviewed By: daltenty, hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D88339

78a9e62a

[flang] Fix buffering read->write transition · 61687f3a

peter klausler authored Sep 30, 2020

The buffer needs to be Reset() after a Flush(), since the
Flush() can be a no-op after a read->write transition.
And record numbers are 1-based, not 0-based.
This fixes a bug with rewrites of records that have been
recently read.

Differential revision: https://reviews.llvm.org/D88612

61687f3a