- Dec 10, 2021
-
-
Florian Hahn authored
This allows easier access to the induction descriptor from VPlan, without needing to go through Legal. VPReductionPHIRecipe already contains a RecurrenceDescriptor in a similar fashion. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D115111
-
Alexander Potapenko authored
To ease the deployment of KMSAN, we need a way to apply __attribute__((no_sanitize("kernel-memory"))) to the whole source file. Passing -msan-disable-checks=1 to the compiler will make it treat every function in the file as if it was lacking the sanitize_memory attribute. Differential Revision: https://reviews.llvm.org/D115236
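As a rough sketch (the function below is invented for illustration), the flag is meant to behave as if every function in the file were annotated like this:
```
// Hypothetical kernel-style snippet: passing -msan-disable-checks=1 is intended
// to have the same effect as marking every function in the file with this
// attribute, i.e. treating it as if it lacked the sanitize_memory attribute.
__attribute__((no_sanitize("kernel-memory")))
static int read_word(const int *p) {
  return *p;
}
```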
-
LLVM GN Syncbot authored
-
Sameer Sahasrabuddhe authored
Reverts 02940d6d. Fixes breakage in the modules build. LLVM loops cannot represent irreducible structures in the CFG. This change introduces the concept of cycles as a generalization of loops, along with a CycleInfo analysis that discovers a nested hierarchy of such cycles. This is based on Havlak (1997), Nesting of Reducible and Irreducible Loops. The cycle analysis is implemented as a generic template and then instantiated for LLVM IR and Machine IR. The template relies on a new GenericSSAContext template which must be specialized when used for each IR. This review is a restart of an older review request: https://reviews.llvm.org/D83094 Original implementation by Nicolai Hähnle <nicolai.haehnle@amd.com>, with recent refactoring by Sameer Sahasrabuddhe <sameer.sahasrabuddhe@amd.com> Differential Revision: https://reviews.llvm.org/D112696
-
Christudasan Devadasan authored
While enabling vector superclasses with D109301, the AV spills are converted into VGPR spills by introducing appropriate copies. The whole thing ended up adding two instructions per spill (a copy + VGPR spill pseudo) and caused an incorrect live-range update during inline spilling. This patch adds pseudo instructions for all AV spills from 32b to 1024b and handles them the way all other spills are lowered. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D115439
-
Nikita Popov authored
This shows a case where deferred inlining produces an exponential result. The test case demonstrates the basic exponential behavior, but is nowhere close to the worst case. For example, the file at https://gist.github.com/nikic/1262b5f7d27278e1b34a190ae10947f5 currently gets expanded from <100 lines to nearly 500000 lines of IR by opt -inline.
-
Konstantin Schwarz authored
The existing code assumed fcmp to always be an Instruction, but it can also be a ConstExpr. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D115450
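A hedged sketch of the pattern (the helper below is invented, not the actual combiner code): read the fcmp predicate whether the value is an FCmpInst or an fcmp ConstantExpr, instead of assuming it is always an Instruction.
```
#include "llvm/IR/Constants.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Sketch only: handle both the instruction and the constant-expression form.
static CmpInst::Predicate getFCmpPredicate(const Value *V) {
  if (const auto *CI = dyn_cast<FCmpInst>(V))
    return CI->getPredicate();
  if (const auto *CE = dyn_cast<ConstantExpr>(V))
    if (CE->getOpcode() == Instruction::FCmp)
      return static_cast<CmpInst::Predicate>(CE->getPredicate());
  return CmpInst::BAD_FCMP_PREDICATE;
}
```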
-
Chuanqi Xu authored
The intention should be formatted in two lines instead of one.
-
eopXD authored
Originally there were two places that did parsing - `parseArchString` and `parseFeatures` - each with its own code for dependency checking and implication. This patch extracts the common parts into functions of `RISCVISAInfo` and lets both of them use those. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D112359
-
eopXD authored
The current implementation can't correctly parse extension names that contain digits (e.g. `zvl128b`). This patch fixes that. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D109215
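A self-contained sketch of the parsing concern under assumed splitting rules (this is not the actual RISCVISAInfo code): the version suffix has to be recognized from the end of the word so that digits embedded in a name like `zvl128b` are not misread as a version.
```
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Sketch only: the version is the trailing "<major>" or "<major>p<minor>"
// digit run; anything before it, including embedded digits, belongs to the
// name. "zvl128b" is therefore a complete name with no explicit version,
// while "zvl128b1p0" splits into name "zvl128b" and version "1p0".
static std::pair<std::string, std::string> splitExtWord(const std::string &Word) {
  auto DigitsBefore = [&](size_t End) {
    size_t Begin = End;
    while (Begin > 0 && std::isdigit(static_cast<unsigned char>(Word[Begin - 1])))
      --Begin;
    return Begin;
  };
  size_t End = Word.size();
  size_t Major = DigitsBefore(End);
  if (Major == End) // no trailing digits -> no version
    return {Word, std::string()};
  if (Major > 0 && Word[Major - 1] == 'p') {
    size_t Outer = DigitsBefore(Major - 1);
    if (Outer != Major - 1) // "<major>p<minor>" form
      return {Word.substr(0, Outer), Word.substr(Outer)};
  }
  return {Word.substr(0, Major), Word.substr(Major)};
}

int main() {
  assert(splitExtWord("zvl128b").second.empty());
  assert(splitExtWord("zvl128b1p0").first == "zvl128b");
  assert(splitExtWord("zvl128b1p0").second == "1p0");
  return 0;
}
```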
-
Kazu Hirata authored
-
Amara Emerson authored
This results in a very minor improvement in most cases, generating stores of xzr instead of moving zero to a vector register. Differential Revision: https://reviews.llvm.org/D115479
-
Taewook Oh authored
A test for the new pass manager was missing from the original diff D115317. Reviewed By: browneee Differential Revision: https://reviews.llvm.org/D115477
-
Duncan P. N. Exon Smith authored
Update getMemoryBufferForStream() to use `resize_for_overwrite()` and `truncate()` instead of `reserve()` and `set_size()`. Differential Revision: https://reviews.llvm.org/D115384
-
Noah Shutty authored
Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`. Fixed a cast of Error::success() to Expected<> in the debuginfod library. Added Debuginfod to Symbolize deps in gn. Updates compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh to include the Debuginfod library to fix the sanitizer-x86_64-linux breakage. Reviewed By: jhenderson, vitalybuka Differential Revision: https://reviews.llvm.org/D113717
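Regarding the Error::success() fix mentioned above, a minimal hedged sketch of the misuse (function names invented; not the actual debuginfod code):
```
#include "llvm/Support/Error.h"

using namespace llvm;

// Sketch only: Expected<T> may be constructed from an Error only when that
// Error is in the failure state, so returning Error::success() through an
// Expected<> return type trips an assertion at runtime.
Expected<int> bad() {
  return Error::success(); // asserts: Expected<T> from a success Error
}

Expected<int> good() {
  return 42; // a success result carries a value, not an Error
}
```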
-
Phoebe Wang authored
D113096 solved the "undefined reference to xxx" issue by adding the constraint *m for the global var. But it has a strong side effect: the symbol in the assembly is replaced with the constraint variable, which leads to some lowering failures. https://godbolt.org/z/h3nWoerPe This patch fixes the problem by using the *m constraint as a placeholder rather than a real constraint. It has a negligible effect on existing code generation. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D115225
-
Noah Shutty authored
This reverts commit e2ad4f17 because it does not correctly fix the sanitizer buildbot breakage.
-
Jon Chesterfield authored
-
Noah Shutty authored
Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`. Fixed a cast of Error::success() to Expected<> in the debuginfod library. Added Debuginfod to Symbolize deps in gn. Adds new symbolizer symbols to `global_symbols.txt`. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D113717
-
- Dec 09, 2021
-
-
Rong Xu authored
Slightly changed the default option values. Also avoided some bogus output.
-
Hasyimi Bahrudin authored
Refer to https://llvm.org/PR52546. Simplifies the following cases:
not(X) == 0 -> X != 0 -> X
not(X) <=u 0 -> X >u 0 -> X
not(X) >=s 0 -> X <s 0 -> X
not(X) != 1 -> X == 1 -> X
not(X) <=u 1 -> X >=u 1 -> X
not(X) >s 1 -> X <=s -1 -> X
Differential Revision: https://reviews.llvm.org/D114666
-
Arthur Eubanks authored
getAlignment() is deprecated.
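A hedged sketch of the usual replacement (LoadInst is used only as an example; the commit doesn't say which callers were updated):
```
#include "llvm/IR/Instructions.h"
#include <cstdint>

// Sketch only: prefer the Align-returning accessor over the deprecated
// unsigned-returning getAlignment().
uint64_t loadAlignment(const llvm::LoadInst &LI) {
  // return LI.getAlignment();   // deprecated
  return LI.getAlign().value();  // preferred
}
```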
-
LLVM GN Syncbot authored
-
Jessica Paquette authored
These always use FPRs only. Differential Revision: https://reviews.llvm.org/D115376
-
Craig Topper authored
There are two signatures of setSpecialOperandAttr in TargetInstrInfo. One of them is only called from PPCInstrInfo which has an override of it. Remove it from TargetInstrInfo and make it a non-virtual method in PPCInstrInfo. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D115404
-
Daniel Thornburgh authored
G_PTR_ADD takes arguments of two different types, so it probably shouldn't be considered commutative just on that basis. A recent G_PTR_ADD reassociation optimization (https://reviews.llvm.org/D109528) can emit erroneous code if the pattern matcher commutes the arguments; this can happen when the base pointer was created by G_INTTOPTR of a G_CONSTANT and the offset register is variable. This was discovered on the llvm-mos fork, but I added a failing test case that should apply to AArch64 (and more generally). Differential Revision: https://reviews.llvm.org/D114655
-
Haowei Wu authored
This change adds options to llvm-ifs that allow it to generate multiple types of stub files in a single invocation. Differential Revision: https://reviews.llvm.org/D115024
-
Jessica Paquette authored
Necessary for implementing some combines on floating point selects. Differential Revision: https://reviews.llvm.org/D115372
-
Alexey Bataev authored
The comparator for the sort functions should provide a strict weak ordering relation between parameters. The current solution causes a compiler crash with some standard C++ library implementations, because it does not meet this criterion. Tried to fix it; this also improves the overall vectorization result. Differential Revision: https://reviews.llvm.org/D115268
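For reference, a generic C++ illustration of the strict-weak-ordering requirement (not the SLP vectorizer code):
```
#include <algorithm>
#include <vector>

int main() {
  std::vector<int> V{3, 1, 2, 2, 5, 4};
  // Not a strict weak ordering: comp(a, a) is true and equal elements compare
  // "less than" each other; std::sort/llvm::sort may crash or read out of
  // bounds with such a comparator.
  // std::sort(V.begin(), V.end(), [](int A, int B) { return A <= B; });

  // Strict weak ordering: irreflexive, asymmetric, transitive.
  std::sort(V.begin(), V.end(), [](int A, int B) { return A < B; });
  return 0;
}
```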
-
Philip Reames authored
The code claimed to handle nsw/nuw, but those aren't passed via builder state and the explicit IR construction just above never sets them. The only case this bit of code is actually relevant for is FMF flags. However, dropPoisonGeneratingFlags currently doesn't know about FMF at all, so this was a noop. It's also unneeded, as the caller explicitly configures the flags on the builder before this call, and the flags on the individual ops should be controlled by the intrinsic flags anyways. If any of the flags aren't safe to propagate, the caller needs to make that change.
-
Ellis Hoag authored
Just a simple typo fix that allows me to test landing a commit now that I have commit access. Reviewed By: xgupta Differential Revision: https://reviews.llvm.org/D115414
-
Philip Reames authored
The recurrence lowering code has handling which claims to be about flag intersection, but all the callers pass empty arrays for the arguments. The sole exception is a caller of a method which has the argument, but no implementation. I don't know what the intent was here, but it certainly doesn't actually do anything today.
-
Philip Reames authored
This reorders existing transforms to put demanded elements last. The reasoning here is that when we have an example which can be scalarized or handled via demanded bits, we should prefer scalarization as that doesn't require dropping flags on arithmetic instructions. This doesn't show major changes in the tests today, but once I add support for fast math flags to dropPoisonGeneratingFlags this becomes glaringly obvious. Differential Revision: https://reviews.llvm.org/D115394
-
Philip Reames authored
This change allows us to estimate the trip count from profile metadata for all multiple-exit loops. We still do the estimate only from the latch, but that's fine as it causes us to overestimate the trip count at worst. Reviewing the uses of the API, all but one are cases where we restrict a loop transformation (unroll, and vectorize respectively) when we know the trip count is short enough. So, as a result, the change makes these passes strictly less aggressive. The test change illustrates a case where we'd previously have runtime unrolled a loop which ran fewer iterations than the unroll factor. This is definitely unprofitable. The one case where an upper bound on the estimated trip count could drive a more aggressive transform is peeling, and I duplicated the logic being removed from the generic estimation there to keep it the same. The resulting heuristic makes no sense and should probably be immediately removed, but we can do that in a separate change. This was noticed when analyzing regressions on D113939. I plan to come back and incorporate estimated trip counts from other exits, but that's a minor improvement which can follow separately. Differential Revision: https://reviews.llvm.org/D115362
-
Stanislav Mekhanoshin authored
Transform
```
(~a & b & c) | ~(a | b | c) -> ~(a | (b ^ c))
```
And swapped case:
```
(~a | b | c) & ~(a & b & c) -> ~a | (b ^ c)
```
```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %or1 = or i4 %b, %a
  %or2 = or i4 %or1, %c
  %not1 = xor i4 %or2, 15
  %not2 = xor i4 %a, 15
  %and1 = and i4 %b, %not2
  %and2 = and i4 %and1, %c
  %or3 = or i4 %and2, %not1
  ret i4 %or3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %1 = xor i4 %c, %b
  %2 = or i4 %1, %a
  %or3 = xor i4 %2, 15
  ret i4 %or3
}
Transformation seems to be correct!
```
```
----------------------------------------
define i4 @src(i4 %a, i4 %b, i4 %c) {
%0:
  %and1 = and i4 %b, %a
  %and2 = and i4 %and1, %c
  %not1 = xor i4 %and2, 15
  %not2 = xor i4 %a, 15
  %or1 = or i4 %not2, %b
  %or2 = or i4 %or1, %c
  %and3 = and i4 %or2, %not1
  ret i4 %and3
}
=>
define i4 @tgt(i4 %a, i4 %b, i4 %c) {
%0:
  %xor = xor i4 %b, %c
  %not = xor i4 %a, 15
  %or = or i4 %xor, %not
  ret i4 %or
}
Transformation seems to be correct!
```
Differential Revision: https://reviews.llvm.org/D112966
-
Kazu Hirata authored
-
Craig Topper authored
D113805 improved handling of i32 divu/remu on RV64. The basic idea from that can be extended to (mul (and X, C2), C1) where C2 is any mask constant. We can replace the and with an SLLI by shifting by the number of leading zeros in C2 if we also shift C1 left by XLen - lzcnt(C2) bits. This will give the full product XLen additional trailing zeros, putting the result in the output of MULHU. If we can't use ANDI, ZEXT.H, or ZEXT.W, this will avoid materializing C2 in a register. The downside is it may take 1 additional instruction to create C1. But since that's not on the critical path, it can hopefully be interleaved with other operations. The previous tablegen pattern is replaced by custom isel code. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D115310
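A small worked check of the arithmetic behind the transform (constants chosen only for illustration; unsigned __int128 models the double-width product whose high half MULHU returns):
```
#include <cassert>
#include <cstdint>

int main() {
  // XLen = 64, C2 = 0xffffffff (32 leading zeros), C1 = 3 (fits in 32 bits).
  uint64_t X = 0x123456789abcdef0ULL;
  uint64_t Expected = (X & 0xffffffffULL) * 3; // (mul (and X, C2), C1)
  // Shift X left by lzcnt(C2) and C1 left by XLen - lzcnt(C2); the 128-bit
  // product then has 64 trailing zeros, so its high half is the result,
  // which is what MULHU would produce.
  unsigned __int128 Prod =
      (unsigned __int128)(X << 32) * (unsigned __int128)(3ULL << 32);
  assert((uint64_t)(Prod >> 64) == Expected);
  return 0;
}
```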
-
Jon Chesterfield authored
-
Arthur Eubanks authored
Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D115370
-
Craig Topper authored
- Remove feq, fle, flt tests from *-arith.ll in favor of *-fcmp.ll, which tests all predicates. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D113703
-