Commits · a6b06b785cda1bb94dd05f29d66892ccb44cf0cd · Lorenzo Albano / LLVM bpEVL

Apr 06, 2021

[VPlan] Print VPValue operands for VPWidenPHI if possible. · a6b06b78

Florian Hahn authored Apr 06, 2021

For VPWidenPHIRecipes that model all incoming values as VPValue
operands, print those operands instead of printing the original PHI.

D99294 updates recipes of reduction PHIs to use the VPValue for the
incoming value from the loop backedge, making use of this new printing.

a6b06b78

[AMDGPU][MC][GFX9] Corrected SMEM decoding · 3eadcb86

Dmitry Preobrazhensky authored Apr 06, 2021

Corrected SMEM decoding when IMM=0 and OFFSET>127

Fixed bug 49819 (https://bugs.llvm.org/show_bug.cgi?id=49819)

Differential Revision: https://reviews.llvm.org/D99804

3eadcb86

[CostModel][X86] Improve accuracy of vXi8 multiply reduction costs · 201877d5

Simon Pilgrim authored Apr 06, 2021

After rG47321c311bdbe0145b9bf45d822185c37b19fa50 we promote vXi8 reductions to vXi16 to create a much faster PMULLW mul reduction, followed by a (free) truncation. This avoids the high cost of repeated vXi8 multiplications, each of which extends to vXi16, multiplies, and truncates back to vXi8.

Fixes the missing vXi8 mul reduction vectorization in PR42674 (Comment #20) 'mul16' test case.
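Why a 16-bit reduction followed by one truncation matches an 8-bit reduction can be seen in a small scalar model. This is an illustrative sketch only, not the cost-model code: the function names are hypothetical, and Python integers masked to 8 or 16 bits stand in for vector lanes.

```python
def mul_reduce_i8(values):
    """Reference reduction: wrap to 8 bits after every multiply."""
    acc = 1
    for v in values:
        acc = (acc * (v & 0xFF)) & 0xFF
    return acc


def mul_reduce_via_i16(values):
    """Extend each element to 16 bits, reduce in 16 bits, truncate once."""
    acc = 1
    for v in values:
        acc = (acc * (v & 0xFF)) & 0xFFFF
    return acc & 0xFF  # the single truncation back to 8 bits


lanes = [3, 7, 200, 255, 13, 5, 9, 2]
assert mul_reduce_i8(lanes) == mul_reduce_via_i16(lanes)
```

The two functions agree because the low 8 bits of a product depend only on the low 8 bits of its operands, so the intermediate truncations can be deferred to the end.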

201877d5

[AMDGPU] Regenerate checks to fix prefixes broken in D96340. NFC. · 6eb5b06e
Jay Foad authored Apr 06, 2021

6eb5b06e

[test, AArch64] Fix use of var defined in CHECK-NOT · 638d70be

Thomas Preud'homme authored Mar 28, 2021

LLVM test CodeGen/AArch64/aarch64-tbz.ll tries to check for the absence
of a sequence of instructions with several CHECK-NOT directives, one of
which uses a variable defined in another. However, CHECK-NOT directives
are checked independently, so the test uses a variable defined in a
pattern that should not occur in the input.

This commit removes the definition and uses of the variable so that each
line is checked independently, making the check stronger than the current
one. It also removes an unnecessary regex match for labels.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D99602

638d70be

[PhaseOrdering] Add PR45687 test coverage · 2fc761aa

Simon Pilgrim authored Apr 06, 2021

This is a mixture of instcombine/simplifycfg/instcombine to recognise and then remove the abs pattern.

2fc761aa

[IR] Ignore bitcasts of function pointers which are only used as callees in callbase instruction · 167ea67d

madhur13490 authored Mar 18, 2021

This patch enhances hasAddressTaken() to ignore bitcasts that are used
only as the callee of a callbase instruction. Such a bitcast doesn't
take the address in a meaningful way.
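As a rough illustration of the rule, the toy model below treats a function's uses as a list of `(kind, nested_uses)` pairs. The representation and this Python `has_address_taken` are hypothetical and far simpler than LLVM's real `hasAddressTaken()`; only the bitcast-as-callee rule from the commit message is modelled.

```python
def has_address_taken(uses):
    """Return True if any use does something other than call the function.

    Each use is a (kind, nested_uses) pair. kind is "callee" for a direct
    call, "bitcast" for a cast whose own uses are listed in nested_uses,
    or any other string for a use that takes the address.
    """
    for kind, nested_uses in uses:
        if kind == "callee":
            continue
        if kind == "bitcast" and all(k == "callee" for k, _ in nested_uses):
            continue  # the cast is only ever called, so it is ignored
        return True
    return False
```

Under this model, a function that is called directly and through a bitcast is not address-taken, while one whose bitcast is stored somewhere still is.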

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D98884

167ea67d

[KnownBits] Rename KnownBits::computeForMul to KnownBits::mul. NFCI. · ddbb5873
Simon Pilgrim authored Mar 25, 2021
```
As promised in D98866
```
ddbb5873

[AArch64] Default to zero-cycle-zeroing FP registers · d5f1131c

Sjoerd Meijer authored Apr 06, 2021

It is generally beneficial to prefer "movi d0, #0" over "fmov s0, wzr" as this
is most efficient across all cores; it is recognised as a zeroing idiom. On
newer cores, fmov instructions can also be eliminated early, so there is no
difference from movi, but some implementations lack this, so it does not hold
for other/older cores. Thus this standardises on movi, as it should always
give the same or better performance than fmov with wzr.

Differential Revision: https://reviews.llvm.org/D99586

d5f1131c

[NFC][WebAssembly] Removed mangled name from test. · f1313b3b
Sam Parker authored Apr 06, 2021

f1313b3b

[AArch64] Use 64-bit movi for zeroing halfs/floats · ef05b08c

Sjoerd Meijer authored Apr 01, 2021

This was using the .2d variant which zeros 128 bits, but using the .2s variant
that zeros 64 bits is faster on some cores.

This is a prep step for D99586 to always use movi for zeroing floats.

Differential Revision: https://reviews.llvm.org/D99710

ef05b08c

[AMDGPU] Add some missing testing for new subtargets gfx90a and gfx90c · 94d0fc32
Jay Foad authored Mar 31, 2021
```
Differential Revision: https://reviews.llvm.org/D99647
```
94d0fc32
[NewPM] Fix unused lambda capture build error · 98742e42
Yevgeny Rouban authored Apr 06, 2021
```
Fixes commit 39e3e3aa: Redesign of PreserveCFG Checker
```
98742e42

[NewPM] Redesign of PreserveCFG Checker · 39e3e3aa

Yevgeny Rouban authored Apr 06, 2021

The reason for the NewPM redesign is described in the commit
  cba3e783: [NewPM] Disable PreservedCFGChecker ...

The checker introduces an internal custom CFG analysis that tracks the
current, up-to-date CFG snapshot. The analysis is invalidated along with
any other CFG-related analysis (the key is CFGAnalyses). If the CFG
analysis is not invalidated at a function pass exit, then the checker
asserts that the CFG snapshot taken from this analysis is equal to
a snapshot of the current CFG.
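The comparison described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the NewPM implementation: the CFG is a plain dictionary from block name to successor list, and `snapshot`, `cfg_check_passes`, and `run_pass` are hypothetical names.

```python
def snapshot(cfg):
    """Freeze a CFG given as {block_name: [successor_names]}."""
    return {block: tuple(succs) for block, succs in cfg.items()}


def cfg_check_passes(cfg, run_pass):
    """Run a pass and report whether its CFG-preservation claim holds.

    run_pass mutates cfg in place and returns True when it claims to
    preserve the CFG (that is, the CFG analysis is not invalidated).
    """
    before = snapshot(cfg)
    claims_preserved = run_pass(cfg)
    if not claims_preserved:
        return True  # analysis invalidated: nothing to compare
    return snapshot(cfg) == before
```

A pass that rewires a branch while claiming to preserve the CFG makes `cfg_check_passes` return False; the same pass is accepted once it reports the CFG as invalidated.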

Along the way:
- the function CFG::printDiff() is simplified by removing function
  name calculation. The name is printed by the caller;
- fixed CFG invalidated condition (see CFG::invalidate());
- StandardInstrumentations::registerCallbacks() gets additional
  optional parameter of type FunctionAnalysisManager*, which is
  needed by the checker to get the custom CFG analysis;
- several PM-related tests are updated to explicitly set
  -verify-cfg-preserved=1 where they need it.

This patch is safe to land as the CFGChecker is left switched off
(the option -verify-cfg-preserved is false by default). It will be
switched on by a separate patch to minimize possible reverts.

Reviewed By: skatkov, kuhar

Differential Revision: https://reviews.llvm.org/D91327

39e3e3aa

[Statepoint] Factor-out utility function to get non-foldable area of... · 0057ec80

Serguei Katkov authored Apr 05, 2021

[Statepoint] Factor-out utility function to get non-foldable area of STATEPOINT like instructions. NFC

Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D99875

0057ec80

[NewPM] Change tests to run them without PreserveCFGChecker. NFC · 872c57c9

Yevgeny Rouban authored Apr 06, 2021

Change several pass-sequence-sensitive tests to be indifferent
to the PreserveCFGChecker by explicitly setting the option
-verify-cfg-preserved=0. It is a preparation step that allows
a redesign of PreserveCFGChecker.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D99878

872c57c9

[RISCV] When custom iseling masked stores, copy the mask into V0 instead of a virtual register. · cb1028a0

Craig Topper authored Apr 05, 2021

I missed a few intrinsics in 3dd4aa7d
when I did this for masked loads and masked segment loads/stores.

Found while trying to share more code between these custom isel
functions.

cb1028a0

Comment adjustments for a rename · 58ccbd0d
Philip Reames authored Apr 05, 2021

58ccbd0d

[SROA] Allow SROA on pointers with invariant group intrinsic uses · ea0e2ca1

Arthur Eubanks authored Mar 29, 2021

When we are able to SROA an alloca, we know all uses of it, meaning we
don't have to preserve the invariant group intrinsics and metadata.

It's possible that we could lose information regarding redundant
loads/stores, but that's unlikely to have any real impact since right
now the only user is Clang and vtables.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D99760

ea0e2ca1

Exact ashr/lshr don't lose any set bits and are thus trivially invertible · 13deb6aa
Philip Reames authored Apr 05, 2021
```
Use that fact to improve isKnownNonEqual.
```
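A small sketch of why exactness makes the shift invertible follows. It is not the isKnownNonEqual code: `exact_shr` is a hypothetical helper, and Python's `>>` on non-negative integers stands in for lshr (on negative integers it behaves like ashr).

```python
def exact_shr(x, s):
    """Shift right by s, rejecting shifts that would drop a set bit."""
    if x & ((1 << s) - 1):
        raise ValueError("not an exact shift: set bits would be lost")
    return x >> s


# An exact shift is undone by shifting left again, so it is injective.
assert exact_shr(48, 4) << 4 == 48
assert exact_shr(48, 4) != exact_shr(32, 4)
# Without the exact guarantee, distinct inputs can collide.
assert (49 >> 4) == (48 >> 4)
```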
13deb6aa
Address minor post commit feedback on 0e59dd · dc8d864e
Philip Reames authored Apr 05, 2021

dc8d864e
Copy syncscope when expanding atomicrmw into cmpxchg loop · 30b3aab3
Stanislav Mekhanoshin authored Apr 05, 2021
```
Fixes: SWDEV-280070

Differential Revision: https://reviews.llvm.org/D99902
```
30b3aab3

[RISCV] Add more RV32 vslide1up intrinsic test cases. NFC · 39151443

Craig Topper authored Apr 05, 2021

For some reason we only had 1 test case. This synchronizes the
test with vslide1down so we have the same number of tests for both.

39151443

Apr 05, 2021

[InstSimplify] fix potential miscompile in select value equivalence · e2a0f512

Sanjay Patel authored Apr 05, 2021

This is the sibling fix to c590a988 -
as there, we can't substitute a vector value because the equality
compare replacement that we are trying requires that the
comparison is true for the entire value. Vector select
can be partly true/false.
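The hazard can be illustrated with a toy lane-wise select. The example is constructed for illustration and is not taken from the commit or from PR49832; `vector_select` and `reverse_lanes` are hypothetical helpers, with lane reversal standing in for any operation that moves values between lanes.

```python
def vector_select(cond, if_true, if_false):
    """Lane-wise select: if_true[i] where cond[i] holds, else if_false[i]."""
    return [t if c else f for c, t, f in zip(cond, if_true, if_false)]


def reverse_lanes(v):
    """A lane-crossing operation standing in for e.g. a shuffle."""
    return v[::-1]


x, y = [1, 2], [1, 3]
cond = [a == b for a, b in zip(x, y)]           # [True, False]
selected = vector_select(cond, reverse_lanes(x), reverse_lanes(y))
folded = reverse_lanes(y)                       # result of substituting y for x
assert selected == [2, 1] and folded == [3, 1]  # the fold changes lane 0
```

Lane 0 compares equal, yet the true operand's lane 0 is computed from lane 1 of `x`, where `x` and `y` differ. Substituting `y` for `x` is therefore only sound when the comparison holds for the entire value.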

e2a0f512

[InstSimplify] add test for vector select with operand replacement; NFC · 78e5cf66
Sanjay Patel authored Apr 05, 2021
```
We need a sibling fix to c590a988
( https://llvm.org/PR49832 ) to avoid miscompiling.
```
78e5cf66
[RISCV] Add SDTCisInt to the SDTRVVSlide1 since it is only used for vslide1up.vx/vslide1down.vx. · 780a4728
Craig Topper authored Apr 05, 2021
```
The scalar type is already marked as XLenVT. The floating point
version would need a different rule.
```
780a4728

[RISCV] Split RISCVISD::VMV_S_XF_VL into separate integer and FP. · af283767

Craig Topper authored Apr 05, 2021

It's a bit silly, but it allows us to write stricter type
constraints for isel. There are still some extra type checks in
the generated table due to some type inference limitations
around HWMode.

af283767

Fix copy paste errors in tests from be11bd1e · 1d4c7429
Philip Reames authored Apr 05, 2021
```
Several of these weren't testing what was intended.
```
1d4c7429

Extract a helper for figuring out if an operator is invertible [nfc] · b0e59dd6

Philip Reames authored Apr 05, 2021

For use in an upcoming patch. Left out the phi case (which could otherwise fit in this framework) as it would cause infinite recursion in said patch. We can probably also leverage this in instcombine to ensure we keep the two sets of related analysis and transforms in sync.

b0e59dd6

[tests] Precommit tests for reasoning about equality of recurrences · be11bd1e
Philip Reames authored Apr 05, 2021

be11bd1e
[RISCV] Move VSLIDE1UP_VX pattern out of a loop that includes FP types. · 7edda698
Craig Topper authored Apr 05, 2021
```
FP would need VFSLIDE1UP_VF which uses an FP register.
```
7edda698

[M68k] Add support for Motorola literal syntax to AsmParser · 4db18d62

Ricky Taylor authored Jan 26, 2021

These look like $00A0cf for hex and %001010101 for binary. They are used in Motorola assembly syntax.
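For readers unfamiliar with the notation, the sketch below decodes the two prefixes. It is an illustration only and not the M68k AsmParser code: `parse_motorola_literal` is a hypothetical helper, and Python's `int()` is more permissive than an assembler lexer would be.

```python
def parse_motorola_literal(text):
    """Parse '$' (hex), '%' (binary), or plain decimal integer text."""
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.startswith("%"):
        return int(text[1:], 2)
    return int(text, 10)
```

With this helper, `parse_motorola_literal("$00A0cf")` yields the same value as the literal `0x00A0CF`, and `parse_motorola_literal("%001010101")` the same value as `0b001010101`.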

Differential Revision: https://reviews.llvm.org/D98519

4db18d62

[OPENMP51]Initial support for nocontext clause. · 7078ef47

Jennifer Yu authored Apr 03, 2021

Added basic parsing/sema/serialization support for the 'nocontext' clause.

Differential Revision: https://reviews.llvm.org/D99848

7078ef47

[gn build] (manually) port 0116d04d · 6103f3f3
Nico Weber authored Apr 05, 2021

6103f3f3
Revert "llvm-shlib: Create object libraries for each component and link against them" · e07e08f3
Tom Stellard authored Apr 05, 2021
```
This reverts commit 43ceb74e.

This caused some build failures: https://bugs.llvm.org/show_bug.cgi?id=49818
```
e07e08f3
Revert "Fix build rules for LLVM_WITH_Z3 after D95727" · 982396dd
Tom Stellard authored Apr 05, 2021
```
This reverts commit d66f9c4f.

This was a follow up fix for 43ceb74e, which
will be reverted.
```
982396dd

[TextAPI] move source code files out of subdirectory, NFC · 0116d04d

Cyndy Ishida authored Apr 05, 2021

TextAPI/ELF has moved out into InterfaceStubs, so there's no longer a
need to separate out TextAPI between formats.

Reviewed By: ributzka, int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D99811

0116d04d

[gn build] Port 9b3df78b · 5abc7250
LLVM GN Syncbot authored Apr 05, 2021

5abc7250

[LoopFusion] Bails out if only the second candidate is guarded (PR48060) · 6a82ace5

Ta-Wei Tu authored Apr 06, 2021

If only the second candidate loop is guarded while the first one is not, fusing
the two loops might not be valid, but this check is currently missing.

Fixes https://bugs.llvm.org/show_bug.cgi?id=48060

Reviewed By: sidbav

Differential Revision: https://reviews.llvm.org/D99716

6a82ace5

[RISCV] Add support for bitcasts between scalars and fixed-length vectors · af3a839c

Fraser Cormack authored Mar 31, 2021

This patch supports bitcasts from scalar types to fixed-length vectors
and vice versa. It custom-lowers and custom-legalizes them to
EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT operations, using single-element
vectors to hold the scalar where appropriate.
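The value-level meaning of such a bitcast can be sketched as a reinterpretation of the same bits. This models only the semantics on a little-endian target such as RISC-V, not the lowering in the patch; the function names are hypothetical and an i32 / 4 x i8 pair is chosen as one example.

```python
import struct


def bitcast_i32_to_v4i8(value):
    """Reinterpret a 32-bit scalar as four 8-bit lanes (little-endian)."""
    return list(struct.pack("<I", value & 0xFFFFFFFF))


def bitcast_v4i8_to_i32(lanes):
    """Reinterpret four 8-bit lanes as one 32-bit scalar (little-endian)."""
    return struct.unpack("<I", bytes(lanes))[0]
```

Lane 0 holds the least significant byte, so `bitcast_i32_to_v4i8(0x11223344)` gives `[0x44, 0x33, 0x22, 0x11]`, and converting back recovers the original scalar.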

Previously, some of these would fail to select, others would be expanded
through stack loads and stores. Effort was made to ensure the codegen
avoids the stack for both legal and illegal scalar types.

Some of the codegen could be improved, but at first glance it looks like
a general optimization of EXTRACT_VECTOR_ELT when extracting an i64
element on RV32.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99667

af3a839c