Commits · 58db5f6e959419aaca20798835d75d3646b99293 · Lorenzo Albano / LLVM bpEVL

Sep 07, 2021

[ConstFold] Support opaque pointers in constexpr GEPs · 58db5f6e

Nikita Popov authored Jul 20, 2021

Support opaque pointers in SymbolicallyEvaluateGEP() by using the
value type of a GlobalValue base or falling back to i8 if there
isn't one. We don't unconditionally generate i8 GEPs here because
that would lose inrange attributes, and because some optimizations
on globals currently rely on GEP types (e.g. the globals SROA
mentioned in the comment).

Differential Revision: https://reviews.llvm.org/D109297

58db5f6e

Copy Elementtype Attribute to IR at Link step · 34528c32

Andy Kaylor authored Sep 03, 2021

Copying IR during linking causes a type mismatch due to the field being missing in IRMover/ValueMapper. This adds the full range of typed attributes, including the elementtype attribute, to the copy functions.

Patch by Chenyang Liu

Differential Revision: https://reviews.llvm.org/D108796

34528c32

[ELF][test] Improve getBitcodeMachineKind tests · abd80ecf

Fangrui Song authored Sep 07, 2021

abd80ecf

[llvm-objdump] Fix 'llvm-objdump -dr' for executables with relocations · 6300e4ac

Maksim Panchenko authored Aug 31, 2021

Print relocations interleaved with disassembled instructions for
executables with relocatable sections, e.g. those built with "-Wl,-q".

Differential Revision: https://reviews.llvm.org/D109016

6300e4ac

[NFC][InstCombine] Make check for sret in a vararg function clearer · b81fc14f

Arthur Eubanks authored Sep 06, 2021

We're trying to get the parameter index of sret and see if it's part of
a function's varargs.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D109335

b81fc14f

Reland "[InstCombine] Recognize `((x * y) s/ x) !=/== y` as an signed... · 35fa7b8a

Roman Lebedev authored Sep 07, 2021

Reland "[InstCombine] Recognize `((x * y) s/ x) !=/== y` as an signed multiplication overflow check (PR48769)"

This reverts commit 91f7a4ff,
relanding commit 13ec913b.

The original commit was reverted because of (essentially)
https://bugs.llvm.org/show_bug.cgi?id=35922
which has now been addressed by d0eeb64b.
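
As a rough illustration of why `((x * y) s/ x) != y` works as an overflow check, here is a sketch that brute-forces the idiom over 8-bit integers. It is not the InstCombine code; the function names are hypothetical, the product is wrapped to 8 bits by hand, and the division is done in `int` so that `-128 / -1` stays well defined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Ground truth: the exact product does not fit in a signed 8-bit value.
inline bool mulOverflowsI8(int8_t x, int8_t y) {
  int wide = int(x) * int(y);
  return wide < INT8_MIN || wide > INT8_MAX;
}

// The idiom from the commit title, modelled on 8-bit values (requires x != 0).
inline bool idiomSaysOverflowI8(int8_t x, int8_t y) {
  int wide = int(x) * int(y);
  // Wrap the product to 8 bits, as a wrapping 8-bit multiply would.
  // (The unsigned-to-signed conversion is modular on two's-complement
  // targets; it is only formally guaranteed from C++20 on.)
  int8_t wrapped = static_cast<int8_t>(static_cast<uint8_t>(wide));
  // Divide in int so that -128 / -1 is well defined here.
  return (int(wrapped) / int(x)) != int(y);
}

// Counts the (x, y) pairs with x != 0 where idiom and ground truth disagree.
inline int countMismatchesI8() {
  int mismatches = 0;
  for (int x = -128; x <= 127; ++x) {
    if (x == 0)
      continue;
    for (int y = -128; y <= 127; ++y)
      if (mulOverflowsI8(int8_t(x), int8_t(y)) !=
          idiomSaysOverflowI8(int8_t(x), int8_t(y)))
        ++mismatches;
  }
  return mismatches;
}

// Small demonstration: the two predicates agree on every pair.
inline void demoMulOverflowIdiom() {
  assert(countMismatchesI8() == 0);
}
```

The agreement follows because a wrapped product differs from the exact one by a nonzero multiple of 256, which is larger than any 8-bit divisor can absorb.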

35fa7b8a

[libc++] Remove a stray `const` on ranges::data and ranges::ssize. NFCI. · 0a5ebc69

Arthur O'Dwyer authored Sep 06, 2021

These are specced as `inline constexpr auto`; the extra `const`
isn't doing anything except being inconsistent with the other CPOs.
Now all the implemented CPOs can be detected by
    git grep 'inline constexpr auto.*fn' ../libcxx/include/
and I think that's beautiful.

0a5ebc69

[libc++] Fix std::to_address(array). · dadbe88a

Arthur O'Dwyer authored Sep 06, 2021

There were basically two bugs here:

When C++20 `to_address` is called on `int arr[10]`, then `const _Ptr&` becomes
a reference to a const array, and then we dispatch to `__to_address<const int(&)[10]>`,
which, oops, gives us a `const int*` result instead of an `int*` result.
Solution: We need to provide the two standard-specified overloads of
`std::to_address` in exactly the same way that we provide two overloads
of `__to_address`.

When `__to_address` is called on a pointer type, `__to_address(const _Ptr&)`
is disabled so we successfully avoid trying to instantiate pointer_traits of
that pointer type. But when it's called on an array type, it's not disabled
for array types, so we go ahead and instantiate pointer_traits<int[10]>,
which goes boom. Solution: We need to disable `__to_address(const _Ptr&)`
for both pointer and array types. Also disable it for function types,
so that they get the nice error message; and put a test on it.

Differential Revision: https://reviews.llvm.org/D109331

dadbe88a

[libc++][NFC] Test span is nothrow trivially destructible · 84169fb6

Joe Loser authored Sep 07, 2021

Add tests showing `span` is trivially_destructible and nothrow_destructible.
Note that we do not need to explicitly default the destructor in `span`.

Reviewed By: ldionne, Mordante, #libc

Differential Revision: https://reviews.llvm.org/D109286

84169fb6

[X86ISelLowering] avoid emitting libcalls to __mulodi4() · d0eeb64b

Nick Desaulniers authored Sep 07, 2021

Similar to D108842, D108844, and D108926.

__has_builtin(__builtin_mul_overflow) returns true for 32b x86 targets,
but Clang is deferring to compiler RT when encountering long long types.
This breaks ARCH=i386 + CONFIG_BLK_DEV_NBD=y builds of the Linux kernel
that are using __builtin_mul_overflow with these types for these targets.

If the semantics of __has_builtin mean "the compiler resolves these,
always" then we shouldn't conditionally emit a libcall.

This will still need to be worked around in the Linux kernel in order to
continue to support these builds of the Linux kernel for this
target with older releases of clang.
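
For context, a minimal sketch of the kind of caller affected: a checked multiply on `long long` using the GCC/Clang overflow builtin. The wrapper name `checkedMul` is hypothetical and this is not the kernel code in question.

```cpp
#include <cassert>
#include <climits>

// Hypothetical wrapper: returns true and stores the product in `out` when
// a * b fits in long long; returns false on overflow.
inline bool checkedMul(long long a, long long b, long long &out) {
  // The builtin returns true when the multiplication overflowed.
  return !__builtin_mul_overflow(a, b, &out);
}

// Small demonstration of both outcomes.
inline void demoCheckedMul() {
  long long r = 0;
  assert(checkedMul(6, 7, r) && r == 42);
  assert(!checkedMul(LLONG_MAX, 2, r));
}
```

On a 32-bit x86 target this 64-bit case is the one that previously could turn into a call to `__mulodi4()`.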

Link: https://bugs.llvm.org/show_bug.cgi?id=28629
Link: https://bugs.llvm.org/show_bug.cgi?id=35922
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

Reviewed By: lebedev.ri, RKSimon

Differential Revision: https://reviews.llvm.org/D108928

d0eeb64b

[flang] evaluate: Fold SQRT, HYPOT, & CABS · c9e9635f

peter klausler authored Aug 30, 2021

Implement IEEE Real::SQRT() operation, then use it to
also implement Real::HYPOT(), which can then be used directly
to implement Complex::ABS().
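
The layering described above (SQRT, then HYPOT, then complex ABS) can be sketched at run time as follows. This is only an illustration under simplifying assumptions: it uses `double` and `std::sqrt` rather than flang's own Real type, the helper names are hypothetical, and NaN/infinity handling is omitted.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// hypot built on sqrt. The larger magnitude is factored out so that
// squaring does not overflow for large inputs.
inline double hypotViaSqrt(double x, double y) {
  x = std::fabs(x);
  y = std::fabs(y);
  double big = x > y ? x : y;
  double small = x > y ? y : x;
  if (big == 0.0)
    return 0.0;
  double ratio = small / big;
  return big * std::sqrt(1.0 + ratio * ratio);
}

// |z| expressed directly in terms of hypot.
inline double complexAbs(const std::complex<double> &z) {
  return hypotViaSqrt(z.real(), z.imag());
}

// Small demonstration on a 3-4-5 triangle.
inline void demoHypot() {
  assert(std::fabs(complexAbs(std::complex<double>(3.0, 4.0)) - 5.0) < 1e-12);
}
```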

Differential Revision: https://reviews.llvm.org/D109250

c9e9635f

[lldb] Alphabetize some CMake files a bit better · ea04bf30

Nico Weber authored Sep 07, 2021

No observable behavior change, but makes the generated Plugins.def a bit easier
to read.

Differential Revision: https://reviews.llvm.org/D109367

ea04bf30

[mlir] Fix SplatOp lowering to the LLVM dialect · b841ae55

Alex Zinenko authored Sep 07, 2021

The lowering has been incorrectly using the operands of the original op instead
of rewritten operands provided to matchAndRewrite call. This may lead to
spurious materializations and generally invalid IR.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D109355

b841ae55

[Support] Automatically support `hash_value` when `HashBuilder` support is available. · c3c9312f

Alexandre Rames authored Sep 07, 2021

Use the `HBuilder` interface to provide default implementations of `llvm::hash_value`.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D109024

c3c9312f

Greedy set cover implementation of `Merger::Merge` · e6597dba

aristotelis authored Sep 07, 2021

Extend the existing single-pass algorithm for `Merger::Merge` with an algorithm that gives better results. This new implementation can be used with a new **set_cover_merge=1** flag.

This greedy set cover implementation gives a substantially smaller final corpus (40%-80% fewer test cases) while preserving the same features/coverage. At the same time, the execution time penalty is not that significant (+50% for ~1M corpus files and far less for smaller corpora). These results were obtained by comparing several targets with varying size corpora.

Change `Merger::CrashResistantMergeInternalStep` to collect all features from each file and not just unique ones. This is needed for the set cover algorithm to work correctly. The implementation of the algorithm in `Merger::SetCoverMerge` uses a bitvector to store features that are covered by a file while performing the pass. Collisions while indexing the bitvector are ignored similarly to the fuzzer.
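
A minimal sketch of the greedy set cover idea, for readers unfamiliar with it. This is not the `Merger::SetCoverMerge` code: the function name and input shape are hypothetical, and a `std::set` stands in for the bitvector mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// fileFeatures[i] lists the features covered by file i. Returns the indices
// of the chosen files, in the order they were picked.
inline std::vector<size_t>
greedySetCover(const std::vector<std::vector<uint32_t>> &fileFeatures) {
  std::set<uint32_t> covered;
  std::vector<bool> used(fileFeatures.size(), false);
  std::vector<size_t> chosen;
  for (;;) {
    size_t best = fileFeatures.size();
    size_t bestGain = 0;
    // Find the unused file that adds the most not-yet-covered features.
    for (size_t i = 0; i < fileFeatures.size(); ++i) {
      if (used[i])
        continue;
      std::set<uint32_t> fresh;
      for (uint32_t f : fileFeatures[i])
        if (!covered.count(f))
          fresh.insert(f);
      if (fresh.size() > bestGain) {
        bestGain = fresh.size();
        best = i;
      }
    }
    if (bestGain == 0)
      break; // no remaining file adds anything new
    used[best] = true;
    chosen.push_back(best);
    covered.insert(fileFeatures[best].begin(), fileFeatures[best].end());
  }
  return chosen;
}

// Small demonstration: one file already covers everything the others do.
inline void demoGreedySetCover() {
  std::vector<std::vector<uint32_t>> files = {{1, 2}, {2, 3}, {1, 2, 3, 4}, {4}};
  std::vector<size_t> chosen = greedySetCover(files);
  assert(chosen.size() == 1 && chosen[0] == 2);
}
```

Each round needs every file's full feature list, not just the features that were new when the file was first seen, which is why the collection step has to change.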

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D105284

e6597dba

[NFC][support] Extract `IsHashableData` out of class · 0e627c93

Alexandre Rames authored Sep 02, 2021

Extract `HashBuilder::IsHashableData` out of class; it does not depend on
template parameters.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D109205

0e627c93

[X86] X86InstrAVX512.td - remove unused template parameters. NFC. · 9eda4721

Simon Pilgrim authored Sep 07, 2021

Identified in D109359

9eda4721

[OpenMP] Add interface for 5.1 scope construct · 224f51d8

Hansang Bae authored Jun 11, 2021

The new interface only marks begin/end of a scope construct for
corresponding OMPT events, and we can use existing interfaces for
reduction operations.

Differential Revision: https://reviews.llvm.org/D108062

224f51d8

[Analysis, Target, Transforms] Construct SmallVector with iterator ranges (NFC) · 5648f717

Kazu Hirata authored Sep 07, 2021

5648f717

[RISCV] Fix "set but not used" warnings · 5c6338de

Kazu Hirata authored Sep 07, 2021

5c6338de

[flang] Fix GetHostProcedure() for main program · f348f30d

peter klausler authored Sep 03, 2021

It only worked for internal procedures of subprograms,
but must also allow for internal procedures of the
main program.  This broke the use of host-associated
implicitly-typed symbols in specification expressions
of internal procedures.

Differential Revision: https://reviews.llvm.org/D109262

f348f30d

[InstCombine] ror/rol(X, RotAmt) == C --> X == rol/ror(C, RotAmt) (PR51567) · 3b5f318f

Dávid Bolvanský authored Sep 07, 2021

```
----------------------------------------
define i1 @src(i32 %0) {
%1:
  %2 = fshl i32 %0, i32 %0, i32 25
  %3 = icmp eq i32 %2, 5
  ret i1 %3
}
=>
define i1 @tgt(i32 %0) {
%1:
  %2 = icmp eq i32 %0, 640
  ret i1 %2
}
Transformation seems to be correct!
```

https://alive2.llvm.org/ce/z/GdY8Jm

Solves PR51567
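
The fold in the Alive2 example can also be sanity-checked outside the compiler. The sketch below uses hypothetical helper names and plain C++ rather than LLVM code. It relies on `fshl(x, x, 25)` being a rotate-left by 25, so the comparison constant is rotated right by 25, turning 5 into 640.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value left by amt bits (amt taken modulo 32).
inline uint32_t rotl32(uint32_t x, unsigned amt) {
  amt &= 31u;
  return amt == 0 ? x : (x << amt) | (x >> (32u - amt));
}

// Rotate a 32-bit value right by amt bits (amt taken modulo 32).
inline uint32_t rotr32(uint32_t x, unsigned amt) {
  amt &= 31u;
  return amt == 0 ? x : (x >> amt) | (x << (32u - amt));
}

// Source form: rotate the variable, then compare with the constant.
inline bool srcForm(uint32_t x) { return rotl32(x, 25) == 5u; }

// Target form: rotate the constant the other way, compare directly.
inline bool tgtForm(uint32_t x) { return x == rotr32(5u, 25); }

// Small demonstration using the constants from the example.
inline void demoRotateFold() {
  assert(rotr32(5u, 25) == 640u);
  assert(srcForm(640u) && tgtForm(640u));
}
```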

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D109283

3b5f318f

[IROutliner] Adding outlining for single entry/single exit multiblock regions · 81d3ac0c

Andrew Litteken authored Jul 28, 2021

Using the similarity found from the IRSimilarity Identifier, we take regions with structural similarity, and deduplicate them into a separate function. The Code Extractor is able to provide most of this functionality.

For simplicity, we start by only outlining regions with a single entry and single exit branch. This reduces the complexity in handling phi nodes outside the region, and handling many sets of outputs for each of the different exit blocks.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D106990

81d3ac0c

[PowerPC] Fixed the crash due to early if conversion with fixed CR fields · 4a226529

Victor Huang authored Aug 30, 2021

This patch restricts early if-conversion to select to cases where the
conditional branch does not use a physical register, which prevents the
crash when expanding the ISEL instruction.

Reviewed By: lei, kamaub, PowerPC

Differential revision: https://reviews.llvm.org/D108302

4a226529

[libc++] Provide 'buildhost=<platform>' feature for the tests. · 621e437e

Vladimir Vereschaka authored Sep 03, 2021

The target platform can differ from the host platform in cross-platform
builds. Some tests depend on build host features and
need to determine the proper platform environment.

This commit adds a build host platform name feature for the libc++ tests
in format `buildhost=<platform>`, such as `buildhost=linux`, `buildhost=darwin`,
`buildhost=windows`, etc.

The Windows host gets two features: one `buildhost=windows` and another based
on Windows "sub-system", such as `buildhost=win32`, `buildhost=cygwin`, etc.

Differential Revision: https://reviews.llvm.org/D102045

621e437e

[lldb] Add missing newline to stderr output on failed attach · a97efde5

David Spickett authored Sep 07, 2021

a97efde5

[InstCombine] add tests for smear-a-set-bit; NFC · 76183552

Sanjay Patel authored Sep 07, 2021

Possible follow-ups from patterns discussed in D109155.

76183552

[lldb] Update crashlog.py to accept multiple results from mdfind · 4da5a446

Jonas Devlieghere authored Sep 07, 2021

mdfind can return multiple results, some of which are not even dSYM
bundles, but Xcode archives (.xcarchive).

Currently, we end up concatenating the paths, which is obviously bogus.
This patch not only fixes that, but now also skips paths that don't have
a Contents/Resources/DWARF subdirectory.

rdar://81270312

Differential revision: https://reviews.llvm.org/D109263

4da5a446

[X86] Add missing domain to avx512_ord_cmp_sae comis sae patterns · f8d2cd14

Simon Pilgrim authored Sep 07, 2021

It doesn't appear to be possible to generate this from tests atm, but it matches what we do in sse12_ord_cmp

Fixes unused template arg identified in D109359

f8d2cd14

[PowerPC] Guard XSRSP in P8 for FastISel · 042a6564

Jinsong Ji authored Sep 07, 2021

This is exposed by enabling FastIsel on 64bit AIX.
We are generating XSRSP regardless of the arch,
which may be wrong when -mcpu=pwr7.

The fix is to guard the generation in P8 only.

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D109365

042a6564

[test] precommit a test for D109354 · 61d8e271

Jingu Kang authored Sep 07, 2021

61d8e271

Add llvm-ml to LLVM_TOOLCHAIN_TOOLS (PR50536) · c364dcbf

Hans Wennborg authored Sep 07, 2021

so that it gets installed in LLVM_INSTALL_TOOLCHAIN_ONLY builds,
such as used by the Windows installer.

Differential revision: https://reviews.llvm.org/D109358

c364dcbf

[Exegesis] Native clusterization: sub-partition by sched class id · e030f808

Roman Lebedev authored Sep 07, 2021

Currently native clusterization simply groups all benchmarks
by the opcode of key instruction, but that is suboptimal in certain cases,
e.g. where we can already tell that the particular instructions
already resolve into different sched classes.

e030f808

[NFC][exegesis] Add test for the following patch · b3b9b297

Roman Lebedev authored Sep 07, 2021

b3b9b297

[mlir] Fix GPU LaunchFunc conversion to the LLVM dialect · 821262ee

Alex Zinenko authored Sep 07, 2021

The conversion has been incorrectly using the operands of the original
operation instead of the converted operands provided to the matchAndRewrite
call. This may lead to spurious materializations and generally invalid IR if
the producer of the original operands is deleted in the process of conversion.

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D109356

821262ee

[AArch64][SVE] Improve extract_subvector for predicates. · bd576e5a

Sander de Smalen authored Sep 07, 2021

Using PUNPKLO/HI instead of ZIP1/ZIP2, because that avoids
having to generate a predicate with all lanes inactive (PFALSE).

Reviewed By: CarolineConcatto

Differential Revision: https://reviews.llvm.org/D109312

bd576e5a

[MC] Use local MCSubtargetInfo in writeNops · e63455d5

Peter Smith authored Aug 09, 2021

On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.

On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.

For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.

This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.

I've attempted to take into account the in tree experimental backends.

Differential Revision: https://reviews.llvm.org/D45962

e63455d5

[MC] Add MCSubtargetInfo to MCAlignFragment · 5e71839f

Peter Smith authored Aug 06, 2021

In preparation for passing the MCSubtargetInfo (STI) through to writeNops
so that it can use the STI in operation at the time, we need to record the
STI in operation when a MCAlignFragment may write nops as padding. The
STI is currently unused, a further patch will pass it through to
writeNops.

There are many places that can create an MCAlignFragment, in most cases
we can find out the STI in operation at the time. In a few places this
isn't possible as we are in initialisation or finalisation, or are
emitting constant pools. When possible I've tried to find the most
appropriate existing fragment to obtain the STI from, when none is
available use the per module STI.

For constant pools we don't actually need to use EmitCodeAlign as the
constant pools are data anyway so falling through into it via an
executable NOP is no better than falling through into data padding.

This is a prerequisite for D45962, which uses the STI to emit the
appropriate NOP for the STI; this can differ per fragment.

Note that this involves an interface change to InitSections. It is now
called initSections and requires a SubtargetInfo as a parameter.

Differential Revision: https://reviews.llvm.org/D45961

5e71839f

[amdgpu] Enable selection of `s_cselect_b64`. · 640beb38

Michael Liao authored Aug 30, 2021

Differential Revision: https://reviews.llvm.org/D109159

640beb38

[AMDGPU][GlobalISel] Legalize G_MUL for non-standard types · 6c4b634d

Mirko Brkusanin authored Sep 07, 2021

Legalizing G_MUL for non-standard types (like i33) generated an error. Use
minScalar and maxScalar instead of clampScalar. Also use a new rule: instead
of widening to the next power of 2, widen to the next multiple of the passed
argument (32 in this case), so i65 is widened to i96 rather than i128.
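
The difference between the two widening rules is just integer arithmetic, sketched below. The helper names are hypothetical and are not the LegalizerInfo API; they only reproduce the bit widths quoted above.

```cpp
#include <cassert>

// Smallest power of two that is >= bits.
inline unsigned widenToNextPow2(unsigned bits) {
  unsigned p = 1;
  while (p < bits)
    p <<= 1;
  return p;
}

// Smallest multiple of size that is >= bits.
inline unsigned widenToNextMultipleOf(unsigned bits, unsigned size) {
  return ((bits + size - 1) / size) * size;
}

// Small demonstration with the widths from the commit message.
inline void demoWidening() {
  assert(widenToNextPow2(65) == 128);
  assert(widenToNextMultipleOf(65, 32) == 96);
}
```

For i33 both rules give 64 bits; they only diverge once the width passes 64.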

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D109228

6c4b634d