Commits · 1c10d5b175992a9d056a2d763a932e5652386fc1 · Lorenzo Albano / LLVM bpEVL

Feb 08, 2023

[AArch64][GlobalISel] Lower formal arguments of AAPCS & ms_abi variadic functions. · 1c10d5b1

Vladislav Dzhidzhoev authored Aug 01, 2022

Reimplemented SelectionDAG code for GlobalISel.

Fixes https://github.com/llvm/llvm-project/issues/54079

Differential Revision: https://reviews.llvm.org/D130903

1c10d5b1

[ARM] Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFC. · 2c580884
Simon Pilgrim authored Feb 08, 2023
```
Use APInt::setBit() method instead of OR'ing individual bits.
```
2c580884

[hexagon] Turning off sign mismatch warning by default. · f1a87d47

Brian Cain authored Feb 07, 2023



Patch-by: Colin Lemahieu <colinl@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D143531

f1a87d47

[flang] Support polymorphic inputs for UNPACK intrinsic · b5bffb72

Valentin Clement authored Feb 08, 2023

Result must carry the polymorphic type information
from the vector.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D143575

b5bffb72

Recommit "[ConstraintElimination] Move Value2Index map to ConstraintSystem (NFC)" · a9d6a86b
Zain Jaffal authored Feb 08, 2023
```
This reverts commit 665ee0cd.

Fix comments and formatting style.
```
a9d6a86b

[libc] Don't try to use MPFR with the GPU build for now · 22a5593b

Joseph Huber authored Feb 08, 2023

Summary:
We don't have the infrastructure to support MPFR on the GPU. We should
disable this categorically on GPU builds for now.

22a5593b

[libc][bazel] Add missing libc_root dep · 6064742b
Guillaume Chatelet authored Feb 08, 2023

6064742b
[InstSimplify] add tests for strict fadd with SNaN operand; NFC · ed8dae9a
Sanjay Patel authored Feb 07, 2023

ed8dae9a
[DSE] Add test with llvm.memcpy & memcpy_chk. · 64233ae3
Florian Hahn authored Feb 08, 2023
```
This adds test coverage to avoid crashes with further changes.
```
64233ae3

[AArch64] Fix creation of invalid instructions with XZR register · b134c62f

David Green authored Feb 08, 2023

A combination of GlobalISel and MachineCombiner can end up creating
`SUB xzr, (MOVI -2105098)` instructions which have not been constant
folded. The AArch64MIPeepholeOpt pass will then attempt to create
`ADD xzr, 513, lsl 12`, which is not a valid instruction. This adds
a bail out of the transform if the register is xzr/wzr.
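
As a rough illustration of the immediate split behind this message (a sketch only, not the pass's actual code; the function names are hypothetical), the value 2105098 breaks into a high 12-bit part of 513, which is the `513, lsl 12` operand quoted above, and a low 12-bit part. In the A64 ADD/SUB immediate encodings, register number 31 names the stack pointer rather than the zero register, which is why the sketch refuses to split when the register is xzr/wzr.

```python
def split_add_sub_imm24(imm: int):
    """Split a positive 24-bit immediate into (high, low) 12-bit parts.

    Returns None when no two-instruction split applies under the
    assumptions of this sketch.
    """
    if imm <= 0 or imm >> 24:
        return None  # does not fit in two 12-bit chunks
    high, low = imm >> 12, imm & 0xFFF
    if high == 0 or low == 0:
        return None  # a single ADD/SUB immediate already covers it
    return high, low


def can_split(reg: str, imm: int) -> bool:
    """Bail out for the zero register, mirroring the fix described above."""
    if reg.lower() in ("xzr", "wzr"):
        return False
    return split_add_sub_imm24(imm) is not None


print(split_add_sub_imm24(2105098))  # (513, 3850)
```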

Fixes #60528

Differential Revision: https://reviews.llvm.org/D143475

b134c62f

[NVPTX] Increase inline threshold multiplier to 11 in nvptx backend. · 22d98280

JackAKirk authored Jan 20, 2023

I used https://github.com/zjin-lcf/HeCBench (with nvcc usage swapped to
clang++), which is an adaptation of the classic Rodinia benchmarks aimed
at CUDA and SYCL programming models, to compare different values of the
multiplier using both clang++ cuda and clang++ sycl nvptx backends. I
find that the value is currently too low for both cases. Qualitatively
(and in most cases there is a very close quantitative agreement across
both cases) the change in code execution time for a range of values from
5 to 1000 matches in both variations (CUDA clang++ vs SYCL (with cuda
backend) using the intel/llvm clang++ compiler) of the HeCbench samples.
This value of 11 is optimal for clang++ cuda for all cases I've
investigated. I have not found a single case where performance is
degraded by this change of the value from 5 to 11. For one sample the
sycl cuda backend preferred a higher value. However we are happy to
prioritize clang++ cuda, and we find that this value is close to ideal
for both cases anyway. It would be good to do some further investigation
using clang++ openmp cuda offload. However since I do not know of an
appropriate set of benchmarks for this case, and the fact that we are
now getting complaints about register spills related to insufficient
inlining on a weekly basis, we have decided to propose this change and
potentially seek some more input from someone who may have more
expertise in the openmp case. Incidentally this value coincides with the
value used for the amd-gcn backend. We have also been able to use the
amd backend of the intel/llvm "dpc++" compiler to compare the inlining
behaviour of an identical code when targeting amd (compared to nvptx).
Unsurprisingly the amd backend with a multiplier value of 11 was
performing better (with regard to inlining) than the nvptx case when the
value of 5 was used. When the two backends use the same multiplier value
the inlining behaviors appear to align closely.

This also considerably improves the performance of at least one of the
most popular HPC applications: NWCHEMX.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>

Reviewed by: tra
Differential Revision: https://reviews.llvm.org/D142232

22d98280

[SanitizerBinaryMetadata] Emit constants as ULEB128 · bf9814b7

Marco Elver authored Feb 08, 2023

Emit all constant integers produced by SanitizerBinaryMetadata as
ULEB128 to further reduce binary space used. Increasing the version is
not necessary, given that this change depends on (and will land along
with) the bump to v2.
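
For readers unfamiliar with the encoding, here is a minimal sketch of generic ULEB128 encoding and decoding. It is not the code from this patch, and the function names are illustrative. Each output byte carries 7 payload bits, and the high bit marks that more bytes follow, so small constants take a single byte.

```python
from typing import Tuple


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128."""
    if value < 0:
        raise ValueError("ULEB128 encodes unsigned values only")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits of the remaining value
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)         # last byte has the high bit clear
            return bytes(out)


def decode_uleb128(data: bytes) -> Tuple[int, int]:
    """Decode one ULEB128 value; return (value, bytes consumed)."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, index + 1
        shift += 7
    raise ValueError("truncated ULEB128 value")
```

Under this scheme any value below 128 occupies one byte, compared with a fixed 4-byte field.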

To support this, the !pcsections metadata format is extended to allow
for per-section options, encoded in the first MD operator which must
always be a string and contain the section: "<section>!<options>".
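
A consumer of the `"<section>!<options>"` string could separate the two parts as sketched below. This is only an assumption about how such a string might be split; the section name and option text in the usage line are made up, and the meaning of individual options is not specified here.

```python
def split_section_options(operand: str):
    """Split '<section>!<options>' into (section, options).

    If no '!' is present, the options part is the empty string.
    """
    section, _, options = operand.partition("!")
    return section, options


# Hypothetical operand, for illustration only.
print(split_section_options("example_section!X"))
```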

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D143484

bf9814b7

[SanitizerBinaryMetadata] Optimize used space for features and UAR stack args · 3d53b527

Marco Elver authored Feb 08, 2023

Optimize the encoding of "covered" metadata by:

 1. Reducing feature mask from 4 bytes to 1 byte (needs increase once we
    reach more than 8 features).

 2. Only emitting UAR stack args size if it is non-zero, saving 4 bytes
    in the common case.

One caveat is that the emitted metadata for function PC (offset), size,
and UAR size (if enabled) are no longer aligned to 4 bytes.

SanitizerBinaryMetadata version base is increased to 2, since the change
is backwards incompatible.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D143482

3d53b527

[bazel] Actually put Importer in the right library · 938fdfc4
Benjamin Kramer authored Feb 08, 2023
```
Fixes a81136c3
```
938fdfc4
[bazel] Port b83caa32 · a81136c3
Benjamin Kramer authored Feb 08, 2023

a81136c3

[xxHash] Don't trigger UB on empty StringRef · 72eac42f

Benjamin Kramer authored Feb 08, 2023

This is quite silly, but casting to uintptr_t seems like the easiest
option to quiet ubsan.

llvm/lib/Support/xxhash.cpp:107:12: runtime error: applying non-zero offset 8 to null pointer
    #0 0x7fe3660404c0 in llvm::xxHash64(llvm::StringRef) llvm/lib/Support/xxhash.cpp:107:12

72eac42f

[flang][NFC] Move Procedure designator lowering in its own file · cfc48600

Jean Perier authored Feb 08, 2023

Code move without any change. The goal is to re-use this piece of
code for procedure designator lowering in HLFIR, since there are no
significant changes in the way procedure designators will be
lowered.

Differential Revision: https://reviews.llvm.org/D143563

cfc48600

[DAG] Fold Op(vecreduce(a), vecreduce(b)) into vecreduce(Op(a,b)) · 1af3f596

David Green authored Feb 08, 2023

So long as the operation is reassociative, we can reassociate the double
vecreduce from for example fadd(vecreduce(a), vecreduce(b)) to
vecreduce(fadd(a,b)). This will in general save a few instructions, but some
architectures (MVE) require the opposite fold, so a shouldExpandReduction is
added to account for it. Only targets that use shouldExpandReduction will be
affected.
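
The identity behind the fold can be checked with a small model of the reductions (a sketch only; `vecreduce`, `fold_before` and `fold_after` are illustrative helpers, not LLVM APIs). With integer add or multiply both forms agree exactly. With floating point they agree in general only when reassociation is permitted, which is the precondition stated above.

```python
from functools import reduce
import operator


def vecreduce(op, vec):
    """Reduce all lanes of a vector with a binary operation."""
    return reduce(op, vec)


def fold_before(op, a, b):
    """Op(vecreduce(a), vecreduce(b)): two reductions, then one scalar op."""
    return op(vecreduce(op, a), vecreduce(op, b))


def fold_after(op, a, b):
    """vecreduce(Op(a, b)): one lane-wise op, then a single reduction."""
    return vecreduce(op, [op(x, y) for x, y in zip(a, b)])


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
print(fold_before(operator.add, a, b), fold_after(operator.add, a, b))  # 110 110
```

The second form needs one vector operation and one reduction instead of two reductions, which is where the instruction savings come from.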

Differential Revision: https://reviews.llvm.org/D141870

1af3f596

Revert "[ConstraintElimination] Move Value2Index map to ConstraintSystem (NFC)" · 665ee0cd
Zain Jaffal authored Feb 08, 2023
```
This reverts commit 40ffe9c1.

Reverted because some comments were missed in the review https://reviews.llvm.org/D142647
```
665ee0cd

[mlir][llvm] Add MD_prof import error handling · c6ac7e9d

Christian Ulmann authored Feb 08, 2023

This commit adds additional checks and warning messages to the MD_prof
import. As LLVM does not verify most metadata, the import has to be
resilient towards ill-formatted inputs.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D143492

c6ac7e9d

[ConstraintElimination] Move Value2Index map to ConstraintSystem (NFC) · 40ffe9c1
Zain Jaffal authored Feb 07, 2023
```
Differential Revision: https://reviews.llvm.org/D142647
```
40ffe9c1

[mlir][llvm] Add support for loop metadata import · b83caa32

Christian Ulmann authored Feb 08, 2023

This commit introduces functionality to import loop metadata. Loop
metadata nodes are transformed into LoopAnnotationAttrs and attached to
the corresponding branch operations.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D143376

b83caa32

[flang] Use clang sysroot image to test fastmath linking · 09216cfd

Tom Eccles authored Jan 28, 2023

This test has been very unreliable across different machines. Update it
to use clang's sysroot image so that the fastmath object file name is
stable across different distributions and distro types.

Based on clang/test/Driver/linux-ld.c

Thanks to mnadeem for pointing this out at https://reviews.llvm.org/D138675

Differential Revision: https://reviews.llvm.org/D142807

09216cfd

[flang][NFC] add convertToX functions to HLFIRTools · 61c5c597

Tom Eccles authored Feb 07, 2023

These will be useful for sharing code with intrinsic argument processing
when lowering hlfir transformational intrinsic operations to FIR in
the BufferizeHLFIR pass.

Differential Revision: https://reviews.llvm.org/D143503

61c5c597

[clang][AIX] Remove test for the default OpenMP runtime · 5ae99be0

wangpc authored Feb 08, 2023

The default OpenMP runtime may not be libomp, since it can be changed
by specifying `CLANG_DEFAULT_OPENMP_RUNTIME`. This test will fail if
we change the default OpenMP runtime.

This patch removes the test for the default OpenMP runtime and moves the
CHECKs downward.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D143549

5ae99be0

[X86] Add ISD::ABDS/ABDU vXi64 support on SSE41+ targets · d3188c7f
Simon Pilgrim authored Feb 08, 2023
```
If IMINMAX ops aren't legal, we can lower to the select(icmp(x,y),sub(x,y),sub(y,x)) pattern
```
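
The select pattern can be modelled on 64-bit lanes as follows. This is a scalar sketch of the arithmetic only, not the X86 lowering code, and the helper names are illustrative. The signed and unsigned variants differ only in how the comparison interprets the bits; the subtraction wraps modulo 2^64 in both.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value: int) -> int:
    """Interpret the low 64 bits of value as a two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def abds_i64(x: int, y: int) -> int:
    """Signed absolute difference: select(icmp sgt x, y; x - y; y - x)."""
    x, y = to_signed64(x), to_signed64(y)
    return ((x - y) if x > y else (y - x)) & MASK64


def abdu_i64(x: int, y: int) -> int:
    """Unsigned absolute difference: select(icmp ugt x, y; x - y; y - x)."""
    x, y = x & MASK64, y & MASK64
    return ((x - y) if x > y else (y - x)) & MASK64


print(abds_i64(-5, 5), abdu_i64(3, 10))  # 10 7
```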
d3188c7f

[flang] Add a proper TODO for polymorphic array lowering with vector subscript · 39e6bd9c

Valentin Clement authored Feb 08, 2023

Creation of a polymorphic array temporary cannot be done inline.
Add a TODO so the current code exits in a clean way when lowering
reaches it. A solution involving the runtime will be put in place.

Depends on D143490

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D143491

39e6bd9c

[flang][NFC] Centralize fir.class addition in ConvertType · 1e413b90

Valentin Clement authored Feb 08, 2023

fir.class type is always needed for polymorphic and unlimited
polymorphic entities. Wrapping the element type with a fir.class
type was done in ConvertType for some cases and elsewhere in the
code for others. Centralize this in ConvertType when converting
from expr or symbol.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D143490

1e413b90

[mlir][MemRef] Add option to `-finalize-memref-to-llvm` to emit opaque pointers · 50ea17b8

Markus Böck authored Feb 06, 2023

This is the first patch in a series of patches part of this RFC: https://discourse.llvm.org/t/rfc-switching-the-llvm-dialect-and-dialect-lowerings-to-opaque-pointers/68179

This patch adds the ability to lower the memref dialect to the LLVM Dialect with the use of opaque pointers instead of typed pointers. The latter are being phased out of LLVM and this patch is part of an effort to phase them out of MLIR as well. To do this, we'll need to support both typed and opaque pointers in lowering passes, to allow downstream projects to change without breakage.

The gist of changes required to change a conversion pass are:
* Change any `LLVM::LLVMPointerType::get` calls to NOT use an element type if opaque pointers are to be used.
* Use the `build` method of `llvm.load` with the explicit result type. Since the pointer does not have an element type anymore it has to be specified explicitly.
* Use the `build` method of `llvm.getelementptr` with the explicit `basePtrType`. Ditto to above, we have to now specify what the element type is so that GEP can do its indexing calculations
* Use the `build` method of `llvm.alloca` with the explicit `elementType`. Ditto to the above, alloca needs to know how many bytes to allocate through the element type.
* Get rid of any `llvm.bitcast`s
* Adapt the tests to the above. Note that `llvm.store` changes syntax as well when using opaque pointers

I'd like to note that the 3 `build` method changes work for both opaque and typed pointers, so unconditionally using the explicit element type form is always correct.

For the testsuite a practical approach suggested by @ftynse was taken: I created a separate test file for testing the typed pointer lowering of Ops. This mostly comes down to checking that bitcasts have been created at the appropriate places, since these are required for typed pointer support.

Differential Revision: https://reviews.llvm.org/D143268

50ea17b8

[lit] Pass LLVM_PROFILE_FILE environment · 3ecaf27c

Tobias Hieta authored Feb 08, 2023

When building a PGO version of LLVM you might want to customize
the output profile file when building tests. For this to work
we need to pass the LLVM_PROFILE_FILE environment variable.
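
The general idea of passing selected variables through to a test environment can be sketched as below. This is not lit's actual implementation: `build_test_environment` and the allow-list contents are hypothetical, and the sketch only illustrates why a variable that is not on the list never reaches the tests.

```python
def build_test_environment(host_env, pass_vars):
    """Copy only the allow-listed variables from the host environment."""
    return {name: host_env[name] for name in pass_vars if name in host_env}


# Hypothetical host environment and allow-list, for illustration only.
host_env = {
    "PATH": "/usr/bin",
    "LLVM_PROFILE_FILE": "custom-%p.profraw",
    "UNRELATED": "dropped",
}
pass_vars = ["PATH", "LLVM_PROFILE_FILE"]
print(build_test_environment(host_env, pass_vars))
```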

Reviewed By: abrachet

Differential Revision: https://reviews.llvm.org/D143556

3ecaf27c

[SanitizerBinaryMetadata] Make module_[cd]tor external · 6ce8e716

Fangrui Song authored Feb 08, 2023

If a COMDAT key has a local linkage, it behaves as `comdat nodeduplicate` and
llvm/lib/Linker/LinkModules.cpp does not deduplicate its members.
This is not intended. Switch to an external linkage to allow deduplication.

See also https://maskray.me/blog/2021-07-25-comdat-and-section-group#grp_comdat

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D143530

6ce8e716

[LoongArch] Merge the 12bit constant address into the offset field of the instruction · 653d823a

gonglingqin authored Feb 08, 2023

There are 12bit offset fields in the ld.[b/h/w/d] and st.[b/h/w/d].
When the constant address is less than 12 bits, the address
calculation is incorporated into the offset field of the instruction.
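
A minimal sketch of the range check this implies, assuming the offset field is a signed 12-bit immediate and that the base can then be the zero register (both are assumptions of this sketch; the function names and the `$zero` spelling are illustrative, and the actual patch may handle further cases):

```python
def fits_simm12(value: int) -> bool:
    """True if value fits in a signed 12-bit immediate field."""
    return -(1 << 11) <= value < (1 << 11)


def select_const_address(addr: int):
    """Return (base, offset) if the constant address folds into the offset.

    Returns None when the address needs a separate materialization.
    """
    if fits_simm12(addr):
        return ("$zero", addr)
    return None


print(select_const_address(2047))  # ('$zero', 2047)
print(select_const_address(2048))  # None
```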

Differential Revision: https://reviews.llvm.org/D143470

653d823a

[C++20] [Modules] Allow -fmodule-file=<module-name>=<BMI-Path> for... · 1782e8f9

Chuanqi Xu authored Feb 08, 2023

[C++20] [Modules] Allow -fmodule-file=<module-name>=<BMI-Path> for implementation unit and document the behavior

Close https://github.com/llvm/llvm-project/issues/57293.

Previously we couldn't use `-fmodule-file=<module-name>=<BMI-Path>` for
implementation units, which is a bug. Also, the behavior of the above option
was neither tested nor documented for C++20 Modules. This patch addresses
both problems.

1782e8f9

[mlir][bufferize][NFC] OneShotAnalysis: Expose analysis hooks from AnalysisState · 6d14b110

Matthias Springer authored Feb 08, 2023

This is in preparation of reusing the same AnalysisState for tensor.empty elimination and One-Shot Bufferize (to address performance bottlenecks).

Differential Revision: https://reviews.llvm.org/D143379

6d14b110

[flang] Carry over the derived type from MOLD · b37e3597

Valentin Clement authored Feb 08, 2023

The derived type from the MOLD was not carried over
to the newly allocated pointer or allocatable.
This may lead to a wrong dynamic type when the pointer or allocatable
is polymorphic, as shown in the example below:

```
type :: p1
  integer :: a
end type

type, extends(p1) :: p2
  integer :: b
end type

class(p1), pointer :: p(:)

allocate(p(5), MOLD=p2(1,2))
```

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D143525

b37e3597

[mlir][bufferize][NFC] Merge AnalysisState and BufferizationAliasInfo · cf2d374e

Matthias Springer authored Feb 08, 2023

There is no longer a need to keep the two separate. This is in preparation of reusing the same AnalysisState for tensor.empty elimination and One-Shot Bufferize (to address performance bottlenecks).

Differential Revision: https://reviews.llvm.org/D143313

cf2d374e

[compiler-rt][macOS]: Disable iOS support if iOS SDK is not found · 78fb0210

Tobias Hieta authored Feb 08, 2023

If you are missing the iOS SDK on your macOS (for example you don't have
full Xcode but just CommandLineTools) then CMake currently errors
out without a helpful message. This patch disables iOS support in
compiler-rt if the iOS SDK is not found. This can be overridden by
passing -DCOMPILER_RT_ENABLE_IOS=ON.

Reviewed By: delcypher, thetruestblue

Differential Revision: https://reviews.llvm.org/D133273

78fb0210

Revert "[RISCV] Add performMULcombine to perform strength-reduction" · b4431b2d
Philipp Tomsich authored Feb 08, 2023
```
This reverts commit 3304d51b.
```
b4431b2d
Revert "[RISCV] Add vendor-defined XTHeadBs (single-bit) extension" · 0bda1992
Philipp Tomsich authored Feb 08, 2023
```
This reverts commit 656188dd.
```
0bda1992
Revert "[RISCV] Add vendor-defined XTheadBb (basic bit-manipulation) extension" · b0c31322
Philipp Tomsich authored Feb 08, 2023
```
This reverts commit 19a59099.
```
b0c31322