Commits · 3614df3537f9d699fe0835baf6fc0ddd5c9d699d · Lorenzo Albano / LLVM bpEVL

Mar 18, 2021

Revert "[VPlan] Add plain text (not DOT's digraph) dumps" · 3614df35

Mehdi Amini authored Mar 18, 2021

This reverts commit 6b053c98.
The build is broken:

ld.lld: error: undefined symbol: llvm::VPlan::printDOT(llvm::raw_ostream&) const
>>> referenced by LoopVectorize.cpp
>>>               LoopVectorize.cpp.o:(llvm::LoopVectorizationPlanner::printPlans(llvm::raw_ostream&)) in archive lib/libLLVMVectorize.a

3614df35

[VPlan] Add plain text (not DOT's digraph) dumps · 6b053c98

Andrei Elovikov authored Mar 18, 2021

I foresee two uses for this:
1) It's easier to use those in debugger.
2) Once we start implementing more VPlan-to-VPlan transformations (especially
   inner loop massaging stuff), using the vectorized LLVM IR as CHECK targets in
   LIT test would become too obscure. I can imagine that we'd want to CHECK
   against VPlan dumps after multiple transformations instead. That would be
   easier with plain text dumps than with DOT format.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D96628

6b053c98

[WebAssembly] Finalize SIMD names and opcodes · f5764a86

Thomas Lively authored Mar 18, 2021

Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all
SIMD instructions to match the finalized SIMD spec. Deliberately does not change
the public interface in wasm_simd128.h yet; that will require more care.

Depends on D98466.

Differential Revision: https://reviews.llvm.org/D98676

f5764a86

[WebAssembly] Remove experimental SIMD instructions · 2f2ae08d

Thomas Lively authored Mar 18, 2021

Removes the instruction definitions, intrinsics, and builtins for qfma/qfms,
signselect, and prefetch instructions, which were not included in the final
WebAssembly SIMD spec.

Depends on D98457.

Differential Revision: https://reviews.llvm.org/D98466

2f2ae08d

[docs] Document regular LLVM sync-ups · 64bb3759

Kristof Beyls authored Mar 17, 2021

This documents current regular LLVM sync-ups that are happening in the
Getting Involved section.

I hope this gives a bit more visibility to regular sync-ups that are
happening in the LLVM community, documenting another way communication
in the community happens.
Of course the downside is that this is another location that sync-up
metadata needs to be maintained. That being said, the structure as
proposed means that no changes are needed once a new sync-up is added,
apart from maybe removing the entry once it becomes clear that that
particular sync-up series is completely cancelled.

Documenting a few pointers on how current sync-ups happen may also
encourage others to organize useful sync-ups on specific topics.

I've started with adding the sync-ups I'm aware of. There's a good
chance I've missed some.

If most sync-ups end up having a public google calendar, we could also
create and maintain a public google calendar that shows all events
happening in the LLVM community, including dev meetings, sync-ups,
socials, etc - assuming that would be valuable.

Differential Revision: https://reviews.llvm.org/D98797

64bb3759

[WebAssembly] Remove unimplemented-simd target feature · 8638c897

Thomas Lively authored Mar 18, 2021

Now that the WebAssembly SIMD specification is finalized and engines are
generally up-to-date, there is no need for a separate target feature for gating
SIMD instructions that engines have not implemented. With this change,
v128.const is now enabled by default with the simd128 target feature.

Differential Revision: https://reviews.llvm.org/D98457

8638c897

[llvm][AArch64][SVE] Lower fixed length vector fabs · 0d6482a7

Peter Waller authored Mar 11, 2021

Seemingly straightforward.

Differential Revision: https://reviews.llvm.org/D98434

0d6482a7

[AMDGPU] Support SCC on buffer atomics · 961e4384

Stanislav Mekhanoshin authored Mar 16, 2021

Differential Revision: https://reviews.llvm.org/D98731

961e4384

[SampleFDO] Don't mix up the existing indirect call value profile with the new · 14756b70

Wei Mi authored Mar 17, 2021

value profile annotated after inlining.

In https://reviews.llvm.org/D96806 and https://reviews.llvm.org/D97350, we
use the magic number -1 in the value profile to avoid repeatedly promoting an
indirect call to the same target. The function updateIDTMetaData
marks a target as promoted in the value profile by using the
magic number. updateIDTMetaData is also used to update the value profile
when an indirect call is inlined and the new inline instance profile should be
applied. In this second case, updateIDTMetaData currently mixes up the
existing value profile of the indirect call with the new profile, leading
to the problematic scenario in which a target count is larger than the total
count in the value profile.

The patch fixes the problem. When updateIDTMetaData is used to update the
value profile after inlining, all the values in the existing value profile
will be dropped except the values with the magic number counts.
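
The update rule described above can be sketched as follows. This is a minimal illustration in Python, not the LLVM implementation: the function names, the `(target, count)` pair layout, and the `MAGIC_PROMOTED` constant are assumptions made for the example. Only the -1 magic number and the "drop everything except magic-number entries" behaviour come from the commit message.

```python
# Magic count marking a target as already promoted (the -1 from the commit message).
MAGIC_PROMOTED = -1


def update_value_profile(existing, inlined_profile):
    """Return the value profile to attach after inlining.

    Both arguments are lists of (target, count) pairs (a hypothetical layout).
    Existing entries are dropped unless their count is the magic number, so
    stale counts are never mixed with the new inline-instance profile.
    How the real code treats a target present in both lists is not covered here.
    """
    promoted = [(t, c) for t, c in existing if c == MAGIC_PROMOTED]
    return promoted + list(inlined_profile)


def total_count(profile):
    """Sum the real counts, ignoring magic-number markers."""
    return sum(c for _, c in profile if c != MAGIC_PROMOTED)
```

With this rule, no target count can exceed the total, because every real count in the result comes from the same (new) profile.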

Differential Revision: https://reviews.llvm.org/D98835

14756b70

Reapply "[NPM][CGSCC] FunctionAnalysisManagerCGSCCProxy: do not clear immutable function passes" · 92ccc6cb

Mircea Trofin authored Mar 12, 2021

This reverts commit 11b70b9e.

The bot failure was due to ArgumentPromotion deleting functions
without deleting their analyses. This was separately fixed in 4b1c8070.

92ccc6cb

[NFC][ArgumentPromotion] Clear FAM cached results of erased function. · 4b1c8070

Mircea Trofin authored Mar 18, 2021

Not doing it here can lead to subtle bugs - the analysis results are
associated by the Function object's address. Nothing stops the memory
allocator from allocating new functions at the same address.
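
The hazard can be reproduced with a toy model. The sketch below is in Python and uses hypothetical names throughout (`ToyAllocator`, `ToyAnalysisCache`, `erase_function`); it only illustrates why a cache keyed by address goes stale when an erased object's address is reused, and it does not mirror the FunctionAnalysisManager API.

```python
class ToyAllocator:
    """Hands out integer 'addresses' and reuses freed ones first (hypothetical)."""

    def __init__(self):
        self._next = 0x1000
        self._free = []

    def allocate(self):
        if self._free:
            return self._free.pop()
        addr = self._next
        self._next += 0x10
        return addr

    def free(self, addr):
        self._free.append(addr)


class ToyAnalysisCache:
    """Caches analysis results keyed by the function's address."""

    def __init__(self):
        self._results = {}

    def get(self, addr, compute):
        if addr not in self._results:
            self._results[addr] = compute()
        return self._results[addr]

    def clear(self, addr):
        self._results.pop(addr, None)


def erase_function(alloc, cache, addr, clear_cache):
    """Free a function; optionally drop its cached analysis results as well."""
    if clear_cache:
        cache.clear(addr)
    alloc.free(addr)
```

If `erase_function` is called with `clear_cache=False`, the next function allocated at the same address silently inherits the erased function's results.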

4b1c8070

[OPENMP51]Support for the 'destroy' clause with interop variable. · c2f8e158

Mike Rice authored Mar 17, 2021

Added basic parsing/sema/serialization support to extend the
existing 'destroy' clause for use with the 'interop' directive.

Differential Revision: https://reviews.llvm.org/D98834

c2f8e158

[libsupport] Silence a bogus valgrind warning. · ced72567

Chris Lattner authored Mar 17, 2021

Valgrind is reporting this bogus warning because it doesn't model
pthread_sigmask fully accurately.  This is a valgrind bug, but
silencing it has effectively no cost, so just do it.

==73662== Syscall param __pthread_sigmask(set) points to uninitialised byte(s)
==73662==    at 0x101E9D4C2: __pthread_sigmask (in /usr/lib/system/libsystem_kernel.dylib)
==73662==    by 0x101EFB5EA: pthread_sigmask (in /usr/lib/system/libsystem_pthread.dylib)
==73662==    by 0x1000D9F6D: llvm::sys::Process::SafelyCloseFileDescriptor(int) (in /Users/chrisl/Projects/circt/build/bin/firtool)
==73662==    by 0x100072795: llvm::ErrorOr<std::__1::unique_ptr<llvm::MemoryBuffer, std::__1::default_delete<llvm::MemoryBuffer> > > getFileAux<llvm::MemoryBuffer>(llvm::Twine const&, long long, unsigned long long, unsigned long long, bool, bool) (in /Users/chrisl/Projects/circt/build/bin/firtool)
==73662==    by 0x100072573: llvm::MemoryBuffer::getFileOrSTDIN(llvm::Twine const&, long long, bool) (in /Users/chrisl/Projects/circt/build/bin/firtool)
==73662==    by 0x100282C25: mlir::openInputFile(llvm::StringRef, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >*) (in /Users/chrisl/Projects/circt/build/bin

Differential Revision: https://reviews.llvm.org/D98830

ced72567

[AMDGPU] Remove unused template parameters of MUBUF_Real_AllAddr_vi · 3f37c282

Stanislav Mekhanoshin authored Mar 17, 2021

Differential Revision: https://reviews.llvm.org/D98804

3f37c282

[amdgpu] Update med3 combine to skip i64 · 253f804d

Jon Chesterfield authored Mar 18, 2021

Fixes an assumption that a type which is not i32 will be i16. This asserts
when trying to sign/zero extend an i64 to i32.

Test case was cut down from an openmp application. Variations on it are hit by
other combines before reaching the problematic one, e.g. replacing the
immediate values with other function arguments changes the codegen path and
misses this combine.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D98872

253f804d

[XCore] Remove XFAIL: xcore from passing test. · 1a6ab32f

Nigel Perks authored Mar 10, 2021

The pass can be seen on staging buildbot clang-xcore-ubuntu-20-x64.

Differential Revision: https://reviews.llvm.org/D98352

1a6ab32f

[DAG] Improve folding (sext_in_reg (*_extend_vector_inreg x)) -> (sext_vector_inreg x) · 1ba5c550

Simon Pilgrim authored Mar 18, 2021

Extend this to support ComputeNumSignBits of the (used) source vector elements so that we can handle more than just the case where we're sext_in_reg from the source element signbit.

Noticed while investigating the poor codegen in D98587.

1ba5c550

[Hexagon] Add support for named registers cs0 and cs1 · c539be1d

Sid Manning authored Mar 11, 2021

Allow inline assembly code to reference cs0 and cs1.

c539be1d

[gn build] Port ed8bff13 · 6333ee21

LLVM GN Syncbot authored Mar 18, 2021

6333ee21

[MCA] Ensure that writes occur in-order · e6ce0db3

Andrew Savonichev authored Mar 12, 2021

Delay the issue of a new instruction if that leads to out-of-order
commits of writes.

This patch fixes the problem described in:
https://bugs.llvm.org/show_bug.cgi?id=41796#c3
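
One way to picture the delay is as a stall computed from the write-back cycles of instructions already in flight. The Python sketch below is an assumption-laden illustration, not the llvm-mca code: the function name and parameters are hypothetical, and it assumes that two writes completing in the same cycle count as in-order.

```python
def stall_cycles_for_in_order_writes(current_cycle, latency, pending_write_cycles):
    """Cycles to delay issue so the new write does not complete before older ones.

    current_cycle: cycle at which the instruction could otherwise issue.
    latency: cycles from issue to write-back for the new instruction.
    pending_write_cycles: write-back cycles of instructions already in flight.
    """
    if not pending_write_cycles:
        return 0
    last_pending = max(pending_write_cycles)
    new_write = current_cycle + latency
    # Stall only when the new write would land before the latest pending one.
    return max(0, last_pending - new_write)
```

For example, a 1-cycle instruction ready at cycle 10 behind a write that lands at cycle 15 would be held for 4 cycles, so its own write also lands at cycle 15.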

Differential Revision: https://reviews.llvm.org/D98604

e6ce0db3

[AMDGPU] Add some gfx1010 test coverage. NFC. · 078b338b

Jay Foad authored Mar 18, 2021

078b338b

[X86][SSE] Regenerate PR18054 test case · 758efce3

Simon Pilgrim authored Mar 18, 2021

758efce3

GlobalISel: Preserve source value information for outgoing byval args · b9a03849

Matt Arsenault authored Mar 14, 2021

Pass through the original argument IR value in order to preserve the
aliasing information in the memcpy memory operands.

b9a03849

GlobalISel: Insert memcpy for outgoing byval arguments · 61f834cc

Matt Arsenault authored Mar 12, 2021

byval requires an implicit copy between the caller and callee such
that the callee may write into the stack area without it modifying the
value in the parent. Previously, this was passing through the raw
pointer value which would break if the callee wrote into it.
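
The difference between the two behaviours can be shown with a small model. This is a Python sketch with hypothetical helper names, using a `bytearray` to stand in for the argument's stack storage; it illustrates the semantics only and says nothing about how GlobalISel emits the copy.

```python
def callee_zeroes_argument(arg):
    """Hypothetical callee that writes into its by-value argument."""
    for i in range(len(arg)):
        arg[i] = 0


def call_passing_raw_pointer(callee, value):
    """Previous behaviour: the callee sees the caller's storage directly."""
    callee(value)


def call_with_byval_copy(callee, value):
    """byval semantics: the callee gets a private copy (the implicit memcpy)."""
    callee(bytearray(value))
```

With the raw-pointer call the caller's buffer is overwritten by the callee; with the copy it is left untouched.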

Most of the time, this copy can be optimized out (however we don't
have the optimization SelectionDAG does yet).

This will trigger more fallbacks for AMDGPU now, since we don't have
legalization for memcpy yet (although we should stop using byval
anyway).

61f834cc

[SLP]Fix crash on extending scheduling region. · b3ced985

Alexey Bataev authored Mar 12, 2021

If the SLP vectorizer tries to extend the scheduling region and runs out of
budget too early, but still extends the region to the new ending
instructions (i.e., it was able to extend the region for the first
instruction in the bundle, but not for the second), the compiler needs to
recalculate dependencies in full, just as if the extension had been
successful. Without this, the schedule data chunks may end up with the
wrong number of (unscheduled) dependencies, which may produce an
incorrect function in which the vectorized instruction does not dominate
the extractelement instruction.

Differential Revision: https://reviews.llvm.org/D98531

b3ced985

[llvm-objcopy][NFC][Wasm] Do not use internal buffer while writing into the output. · eb4c85e4

Alexey Lapshin authored Dec 27, 2020

This patch is a follow-up to D91028. It implements direct writing into the
output stream for wasm.

Depends on D91028

Differential Revision: https://reviews.llvm.org/D95478

eb4c85e4

[NFC] One more use case for evaluatePredicate · 26ec76ad

Max Kazantsev authored Mar 18, 2021

26ec76ad

[NFC] Use evaluatePredicate in eliminateComparison · 1067a13c

Max Kazantsev authored Mar 18, 2021

Just makes code simpler.

1067a13c

[SCEV][NFC] API for predicate evaluation · b3a1500e

Max Kazantsev authored Mar 18, 2021

Provides an API that checks whether a predicate is known to be true or
false with a single call. The current implementation is naive and just
calls isKnownPredicate twice, but in the future we can rework this
logic to use one check to prove both facts.
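
The "call the prover twice" approach can be sketched as below. This is a Python illustration under stated assumptions: `evaluate_predicate`, the string predicates, and the toy range-based prover `is_known_on_ranges` are hypothetical stand-ins, not the ScalarEvolution interface. The second call uses the inverse predicate, which is one natural reading of "calls isKnownPredicate twice".

```python
from typing import Callable, Optional

# Inverse of each comparison predicate (a small, illustrative subset).
INVERSE = {"<": ">=", ">=": "<", ">": "<=", "<=": ">", "==": "!=", "!=": "=="}


def evaluate_predicate(is_known: Callable[[str, object, object], bool],
                       pred: str, lhs, rhs) -> Optional[bool]:
    """Return True/False if the predicate is provably true/false, else None."""
    if is_known(pred, lhs, rhs):
        return True
    if is_known(INVERSE[pred], lhs, rhs):
        return False
    return None


def is_known_on_ranges(pred, lhs, rhs):
    """Toy prover: operands are inclusive (lo, hi) integer ranges."""
    (llo, lhi), (rlo, rhi) = lhs, rhs
    if pred == "<":
        return lhi < rlo
    if pred == "<=":
        return lhi <= rlo
    if pred == ">":
        return llo > rhi
    if pred == ">=":
        return llo >= rhi
    if pred == "==":
        return llo == lhi == rlo == rhi
    if pred == "!=":
        return lhi < rlo or llo > rhi
    raise ValueError(f"unknown predicate: {pred}")
```

A caller then gets a three-way answer from one call: provably true, provably false, or unknown (`None`).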

b3a1500e

[test] Fix incorrect use of string variable use · b7904439

Thomas Preud'homme authored Mar 18, 2021

LLVM test CodeGen/AArch64/machine-outliner-retaddr-sign-thunk.ll uses
a string substitution block that contains a regex matching block. This
seems to be a copy/paste from another similar test where the match also
defines a variable, hence the [[]] syntax. In this case however this is
a CHECK-NOT variable so nothing should match. No variable definition is
thus expected and the square brackets can be dropped.

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D98853

b7904439

[LoopVectorize] relax FMF constraint for FP induction · c8893f3b

Sanjay Patel authored Mar 18, 2021

This makes the induction part of the loop vectorizer match the reduction part.
We do not need all of the fast-math-flags. For example, there are some that
clearly are not in play like arcp or afn.

If we want to make FMF constraints consistent across the IR optimizer, we
might want to add nsz too, but that's up for debate (users can't expect
associative FP math and preservation of sign-of-zero at the same time?).

The calling code was fixed to avoid miscompiles with:
1bee5497

Differential Revision: https://reviews.llvm.org/D98708

c8893f3b

[AMDGPU] Regenerate atomic_optimizations_global_pointer.ll tests · 388fbefb

Simon Pilgrim authored Mar 18, 2021

388fbefb

[ARM] Regenerate select-imm.ll tests · d9b5338c

Simon Pilgrim authored Mar 18, 2021

d9b5338c

[llvm-objcopy] remove split dwo file creation from executeObjcopyOnBinary. · f134a715

Alexey Lapshin authored Mar 12, 2021

This patch removes creation of the resulting file from the
executeObjcopyOnBinary() function. For most use cases,
executeObjcopyOnBinary() receives the output file as a parameter
(raw_ostream &Out). Splitting of the .dwo file is implemented differently:
the file containing the .dwo tables is created inside executeObjcopyOnBinary().
Once the objcopy functionality is moved into a separate library,
the current implementation will become inconvenient. The goal of this
refactoring is to separate concerns: it might be convenient
to split the dwo tables but create the resulting file differently.

Differential Revision: https://reviews.llvm.org/D98582

f134a715

[DAG] SelectionDAG::isSplatValue - add ISD::ABS handling · b1afa187

Simon Pilgrim authored Mar 18, 2021

Add ISD::ABS to the existing unary instruction handling for splat detection.

This is similar to D83605, but doesn't appear to need to touch any of the wasm refactoring.
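
The idea of looking through a unary operation when detecting splats can be sketched on a toy expression tree. This Python example is a simplified assumption: the tuple node layout and `UNARY_OPS` set are hypothetical, and the demanded-elements and undef tracking that a real implementation would need are omitted.

```python
# Lane-wise unary ops the toy detector looks through; only "abs" is listed
# because that is the case this commit adds.
UNARY_OPS = {"abs"}


def is_splat_value(node):
    """Return True if every lane of the vector expression `node` is the same.

    Nodes are tuples: ("build_vector", [lanes...]) or (unary_op, operand).
    """
    op = node[0]
    if op == "build_vector":
        lanes = node[1]
        return all(lane == lanes[0] for lane in lanes)
    if op in UNARY_OPS:
        # A lane-wise unary op applied to a splat is still a splat.
        return is_splat_value(node[1])
    return False  # conservatively give up on anything unrecognised
```

The check is conservative: `abs` of a non-splat such as `[-3, 3]` would in fact be a splat, but the sketch reports it as not known to be one.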

Differential Revision: https://reviews.llvm.org/D98778

b1afa187

[RISCV] Support scalable-vector masked scatter operations · 3495031a

Fraser Cormack authored Feb 08, 2021

This patch adds support for masked scatter intrinsics on scalable vector
types. It is mostly an extension of the earlier masked gather support
introduced in D96263, since the addressing mode legalization is the
same.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96486

3495031a

[Test][DebugInfo] Check for backend object emission support. · 251fe986

Nigel Perks authored Sep 14, 2020

The XCore backend does not support object emission. Several tests fail for this
reason when XCore is the default target. See staging buildbot builder:
clang-xcore-ubuntu-20-x64.

So check for backend object emission before running the tests requiring it.

Incorporate isConfigurationSupported functionality in isObjectEmissionSupported,
to avoid calling them both in the same tests.

Differential Revision: https://reviews.llvm.org/D98400

251fe986

[RISCV] Support scalable-vector masked gather operations · 0331399d

Fraser Cormack authored Feb 04, 2021

This patch supports the masked gather intrinsics in RVV.

The RVV indexed load/store instructions only support the "unsigned unscaled"
addressing mode; indices are implicitly zero-extended or truncated to XLEN and
are treated as byte offsets. This ISA supports the intrinsics directly, but not
the majority of various forms of the MGATHER SDNode that LLVM combines to. Any
signed or scaled indexing is extended to the XLEN value type and scaled
accordingly. This is done during DAG combining as widening the index types to
XLEN may produce illegal vectors that require splitting, e.g.
nxv16i8->nxv16i64.
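
The conversion from signed or scaled indices to the "unsigned unscaled" form can be sketched numerically. The Python below is an illustration only: the function names are hypothetical, `XLEN = 64` is an assumption (RV32 would use 32), and the arithmetic shows the extend-then-scale step described above rather than the actual DAG combine.

```python
XLEN = 64  # assumed register width for this example


def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def to_unsigned_unscaled(indices, index_bits, scale, signed):
    """Convert gather indices to XLEN-wide unsigned byte offsets.

    indices: raw index lanes, each `index_bits` wide.
    scale: element size in bytes that a scaled index is multiplied by.
    signed: whether the indices are to be sign-extended (else zero-extended).
    """
    mask = (1 << XLEN) - 1
    offsets = []
    for idx in indices:
        if signed:
            wide = sign_extend(idx, index_bits)
        else:
            wide = idx & ((1 << index_bits) - 1)
        # Scale to a byte offset and wrap to XLEN bits.
        offsets.append((wide * scale) & mask)
    return offsets
```

An 8-bit index of `0xFF` with scale 4 becomes byte offset `0xFFFFFFFFFFFFFFFC` when signed (that is, -4 modulo 2^64) and 1020 when unsigned, which shows why the extension has to happen before the hardware's implicit zero-extension.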

Support for scalable-vector CONCAT_VECTORS was added to avoid spilling via the
stack when lowering split legalized index operands.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D96263

0331399d

[X86][NFC] Pre-commit test case for the fix of ldtilecfg insertion. · 209a626e

Wang, Pengfei authored Mar 18, 2021

209a626e

[X86][AMX][NFC] Give correct Passname for Tile Register Pre-configure · 0002d4bf

Bing1 Yu authored Mar 18, 2021

0002d4bf