- Feb 24, 2022
-
-
Stanislav Mekhanoshin authored
Loads and stores can be out of order in the SILoadStoreOptimizer. When combining the MachineMemOperands of two instructions, the operands are passed to combineKnownAdjacentMMOs in IR order. At the moment it picks the first operand and just replaces its offset and size, which essentially loses alignment information and may result in an incorrect base pointer being used. Use the base pointer that comes first in memory address order instead and only adjust the size. Differential Revision: https://reviews.llvm.org/D120370
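For illustration, a minimal C++ sketch of the idea (hypothetical types, not the actual SILoadStoreOptimizer code): when merging two known-adjacent memory operands, keep the one whose address comes first in memory as the base and only widen the size, so the base pointer and its alignment stay valid.

```cpp
#include <cstdint>

// Hypothetical stand-in for a MachineMemOperand: an address, a size and the
// alignment known for that particular operand.
struct MemOpInfo {
  uint64_t Addr;
  uint64_t Size;
  uint64_t Align;
};

// Combine two adjacent operands into one operand covering both accesses.
MemOpInfo combineAdjacent(const MemOpInfo &A, const MemOpInfo &B) {
  // Order by address, not by the order the instructions appear in the IR.
  const MemOpInfo &Lo = A.Addr <= B.Addr ? A : B;
  const MemOpInfo &Hi = A.Addr <= B.Addr ? B : A;
  MemOpInfo Combined = Lo;                       // keep the lower base pointer
  Combined.Size = (Hi.Addr + Hi.Size) - Lo.Addr; // only the size is adjusted
  return Combined;
}
```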
-
Amir Ayupov authored
Introduce an option to expand all CMOV groups into hammocks, matching GCC's `-fno-if-conversion2` flag. The motivation is to leave CMOV conversion opportunities to a binary optimizer that can make the decision based on branch misprediction rate (available e.g. in Intel's LBR). Reviewed By: MaskRay, skan Differential Revision: https://reviews.llvm.org/D119777
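Roughly what the expansion means at the source level (an assumed example, not BOLT or LLVM code): a conditional move keeps both inputs on the critical path, while the hammock form trades that for a branch whose cost depends on prediction.

```cpp
// Typically lowered to a CMOV on x86: no branch, but both 'a' and 'b' are
// always evaluated on the critical path.
int select_cmov(bool c, int a, int b) { return c ? a : b; }

// The expanded "hammock" form: a short forward branch over a single
// assignment, cheap whenever the branch predicts well.
int select_hammock(bool c, int a, int b) {
  int r = b;
  if (c)
    r = a;
  return r;
}
```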
-
Benjamin Kramer authored
-
Craig Topper authored
-
Momchil Velikov authored
Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D112327
-
Fangrui Song authored
-
Thomas Raoux authored
This transformation is useful for breaking dependencies between consecutive loop iterations by increasing the size of a temporary buffer. It is usually combined with heavy software pipelining. Differential Revision: https://reviews.llvm.org/D119406
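A hand-written C++ analogue of the effect (a hypothetical example, not the MLIR pass itself): rotating between several copies of the temporary buffer removes the write-after-read dependency that otherwise forces iteration i+1 to wait for iteration i, which is what enables aggressive software pipelining.

```cpp
#include <vector>

constexpr int kTile = 128;

// One pipeline stage: produce into a temporary, then consume it.
void stage(std::vector<float> &dst, const std::vector<float> &src, int i,
           float *tmp) {
  for (int j = 0; j < kTile; ++j)
    tmp[j] = src[i * kTile + j] * 2.0f;   // producer
  for (int j = 0; j < kTile; ++j)
    dst[i * kTile + j] = tmp[j] + 1.0f;   // consumer
}

void run(std::vector<float> &dst, const std::vector<float> &src, int iters) {
  // Multi-buffered temporary: iteration i uses buffer i % 2, so consecutive
  // iterations no longer reuse (and thus serialize on) the same storage.
  std::vector<float> tmp(2 * kTile);
  for (int i = 0; i < iters; ++i)
    stage(dst, src, i, tmp.data() + (i % 2) * kTile);
}
```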
-
Craig Topper authored
Trying to reduce the diffs from D118333 for cases where it makes more sense to use an FP ABI. Reviewed By: asb, kito-cheng Differential Revision: https://reviews.llvm.org/D120447
-
Momchil Velikov authored
The PostRA scheduler can reorder non-CFI instructions in a way that makes the unwind info not instruction precise. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D112326
-
Craig Topper authored
Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118974
-
Valentin Clement authored
Add lowering for simple assignment on allocatable scalars. This patch is part of the upstreaming effort from the fir-dev branch. Depends on D120483. Reviewed By: PeteSteinfeld Differential Revision: https://reviews.llvm.org/D120488
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
-
Stanislav Gatev authored
When assigning a value to a storage location of a struct member we need to also update the value in the corresponding `StructValue`. This is part of the implementation of the dataflow analysis framework. See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev. Reviewed-by: ymandel, xazax.hun Differential Revision: https://reviews.llvm.org/D120414
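Conceptually (a simplified sketch with made-up types, not the real Clang dataflow API), the environment maps storage locations to values, and an aggregate value keeps per-field children that have to be updated together with the member's own location:

```cpp
#include <map>
#include <string>

struct Value {};

// Simplified stand-in for StructValue: one child value per field.
struct StructValue {
  std::map<std::string, Value *> Children;
};

// Simplified stand-in for the analysis environment.
struct Environment {
  std::map<int, Value *> LocToVal; // storage location id -> value

  void assignToMember(int MemberLoc, StructValue &Parent,
                      const std::string &Field, Value &NewVal) {
    LocToVal[MemberLoc] = &NewVal;    // the member's own storage location
    Parent.Children[Field] = &NewVal; // keep the enclosing StructValue in sync
  }
};
```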
-
Sven van Haastregt authored
After D120254 some clang-tidy tests started failing on release builds. clang-tidy has been using the `-fdeclare-opencl-builtins` functionality since this became the default in clang, so there is no need to include `opencl-c.h`. Differential Revision: https://reviews.llvm.org/D120470
-
Sanjay Patel authored
This is the SDAG translation of D120253: https://alive2.llvm.org/ce/z/qHpmNn The SDAG nodes can have different operand types than the result value. We can see an example of that with AArch64 - the funnel shift amount is an i64 rather than an i32. We may need to make that match even more flexible to handle post-legalization nodes, but I have not stepped into that yet. Differential Revision: https://reviews.llvm.org/D120264
-
Valentin Clement authored
This patch handles allocatable dummy argument lowering in functions and subroutines. This patch is part of the upstreaming effort from the fir-dev branch. Reviewed By: schweitz Differential Revision: https://reviews.llvm.org/D120483
Co-authored-by: Jean Perier <jperier@nvidia.com>
-
Anton Korobeynikov authored
-
Aaron Ballman authored
-
Joseph Huber authored
Summary: We use a section to embed offloading code into the host for later linking. This is normally unique to the translation unit as it is thrown away during linking. However, if the user performs a relocatable link the sections will be merged and we won't be able to access the files stored inside. This patch changes the section variables to have external linkage and a name defined by the section name, so if two sections are combined during linking we get an error.
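A sketch of the pattern at the source level (assumed section and symbol names, not the actual clang-generated code): with external linkage and a name tied to the section, two relocatable objects merged with `ld -r` produce a duplicate-symbol error instead of silently concatenating payloads that can no longer be told apart.

```cpp
// Every translation unit that embeds offloading code would emit a definition
// like this. With "static" the sections from two objects would merge
// silently; with external linkage the second definition is a link error.
__attribute__((section(".llvm.offloading")))
extern const unsigned char llvm_offloading_entry[] = {0xde, 0xad, 0xbe, 0xef};
```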
-
Sanjay Patel authored
The corner case where 'nsz' needs to be removed is very narrow as discussed here: https://reviews.llvm.org/rG3cdd05e519dd If the select condition is not undef, there's no problem with propagating 'nsz': https://alive2.llvm.org/ce/z/4GWJdq
-
Sanjay Patel authored
-
Jay Foad authored
When parsing MachineMemOperands, MIRParser treated the "align" keyword the same as "basealign". Really "basealign" should specify the alignment of the MachinePointerInfo base value, and "align" should specify the alignment of that base value plus the offset. This worked OK when the specified alignment was no larger than the alignment of the offset, but in cases like this it just caused confusion:
  STW killed %18, 4, %stack.1.ap2.i.i :: (store (s32) into %stack.1.ap2.i.i + 4, align 8)
MIRPrinter would never have printed this, with an offset of 4 but an align of 8, so it must have been written by hand. MIRParser would interpret "align 8" as "basealign 8", but I think it is better to give an error and force the user to write "basealign 8" if that is what they really meant. Differential Revision: https://reviews.llvm.org/D120400 Change-Id: I7eeeefc55c2df3554ba8d89f8809a2f45ada32d8
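For reference, a small stand-alone illustration of the arithmetic involved (an assumed helper, not MIRParser code): the alignment provable at `base + offset` is at most the smaller of the base alignment and the largest power of two dividing the offset, so "offset 4, align 8" is only consistent if "basealign 8" was intended.

```cpp
#include <cstdint>

// Best provable alignment of (base + Offset) given the base alignment.
uint64_t alignAtOffset(uint64_t BaseAlign, uint64_t Offset) {
  if (Offset == 0)
    return BaseAlign;
  uint64_t OffsetAlign = Offset & (~Offset + 1); // lowest set bit of Offset
  return BaseAlign < OffsetAlign ? BaseAlign : OffsetAlign;
}

// alignAtOffset(8, 4) == 4: the operand in the example above is at best
// 4-aligned, hence the new error asking the user to write "basealign 8".
```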
-
Marius Brehler authored
This adds a variable op, emitted as a C/C++ local variable, which can be used if the `emitc.constant` op is not sufficient. As an example, the canonicalization pass would transform

```mlir
%0 = "emitc.constant"() {value = 0 : i32} : () -> i32
%1 = "emitc.constant"() {value = 0 : i32} : () -> i32
%2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
%3 = emitc.apply "&"(%1) : (i32) -> !emitc.ptr<i32>
emitc.call "write"(%2, %3) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> ()
```

into

```mlir
%0 = "emitc.constant"() {value = 0 : i32} : () -> i32
%1 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
%2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
emitc.call "write"(%1, %2) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> ()
```

resulting in pointer aliasing, as %1 and %2 point to the same address. In such a case, the `emitc.variable` operation can be used instead. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D120098
-
Corentin Jabot authored
This adds a diagnostic when an unqualified call is resolved to std::move or std::forward. This follows some C++ committee discussions where some people were concerned that this is a common enough and brittle enough anti-pattern to be worth warning about - both because move is a common name and because these functions accept any value. This warns regardless of whether the current context is in std:: or not, as implementations probably want to always qualify these calls too, to avoid triggering ADL accidentally. Differential Revision: https://reviews.llvm.org/D119670
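A minimal example of the kind of call the new diagnostic flags (hypothetical user code): the unqualified `move` resolves to `std::move` through argument-dependent lookup because the argument is a `std::string`.

```cpp
#include <string>
#include <utility>

std::string take(std::string s) { return s; }

std::string demo(std::string name) {
  // warning: unqualified call to 'std::move' -- write std::move(name) instead
  return take(move(name));
}
```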
-
Sven van Haastregt authored
Until now, any types that had TypeExtensions attached to them were not guarded with those extensions. Extend the OpenCLBuiltinFileEmitter such that all required extensions are emitted for the types of a builtin function. The `clang-tblgen -gen-clang-opencl-builtin-tests` emitter will now produce e.g.:

  #if defined(cl_khr_fp16) && defined(cl_khr_fp64)
  half8 test11802_convert_half8_rtp(double8 arg1) {
    return convert_half8_rtp(arg1);
  }
  #endif // TypeExtension

Differential Revision: https://reviews.llvm.org/D120262
-
Florian Hahn authored
-
Simon Pilgrim authored
We're still better off expanding this once we have PMOVZX
-
Simon Pilgrim authored
-
Shraiysh Vaishay authored
This patch removes binary operator enum which was introduced with `omp.atomic.update`. Now the update operation handles update in a region so this is no longer required. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D120458
-
Benjamin Kramer authored
On some standard library configurations these have a dependency on the complete type of SymbolizableModule. They also do a lot of copying/freeing, so there is no point in inlining them.
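The underlying C++ pattern, sketched with a hypothetical class (only `SymbolizableModule` is taken from the commit): declaring the special members in the header and defining them out of line means includers never need the complete type.

```cpp
#include <memory>
#include <vector>

class SymbolizableModule; // forward declaration is enough for the header

class ModuleCache {
public:
  ModuleCache();
  ModuleCache(ModuleCache &&) noexcept;            // defined out of line in the
  ModuleCache &operator=(ModuleCache &&) noexcept; // .cpp, where the complete
  ~ModuleCache();                                  // type is available

private:
  std::vector<std::unique_ptr<SymbolizableModule>> Modules;
};
```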
-
Alex Zinenko authored
Documentation exists about the details of the API but is missing a description of the overall structure per dialect. Reviewed By: shabalin Differential Revision: https://reviews.llvm.org/D117002
-
Benjamin Kramer authored
This codepath was entirely untested. Differential Revision: https://reviews.llvm.org/D120473
-
Roman Lebedev authored
-
serge-sans-paille authored
Estimation of the impact on preprocessor output: before: 1067349756 after: 1065940348 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120434
-
serge-sans-paille authored
Estimation of the impact on preprocessor output: before: 1067487786 after: 1067349756 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120433
-
Jay Foad authored
-
Florian Hahn authored
The tests show sub-optimal lowering of extend/cmp/select chains starting with v16i8 vectors.
-
Shao-Ce SUN authored
Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120412
-
Sven van Haastregt authored
This simplifies completeness comparisons against OpenCLBuiltins.td and also makes the header no longer "claim" the identifiers "image", "image_array", "coord", "sampler", "sample", "gradientX", "gradientY", "lod", and "color". Continues the direction set out in D119560.
-
Pavel Labath authored
-
Javier Setoain authored
The current implementation of ShuffleVectorOp assumes all vectors are scalable. LLVM IR allows shufflevector operations on scalable vectors, and the current translation between LLVM Dialect and LLVM IR does the right thing when the shuffle mask is all zeroes. This is required to do a splat operation on a scalable vector, but it doesn't make sense for scalable vectors outside of that operation, i.e., with masks that are not all zeroes. Differential Revision: https://reviews.llvm.org/D118371
-