Commits · a2223b09b10a4cc87b5e9c4a36ab9401c46610f6 · Lorenzo Albano / LLVM bpEVL

Jan 20, 2021

[NFC] Move ImportedFunctionsInliningStatistics to Analysis · 95ce32c7

Mircea Trofin authored Jan 20, 2021

This is related to D94982. We want to call these APIs from the Analysis
component, so we can't leave them under Transforms.

Differential Revision: https://reviews.llvm.org/D95079

95ce32c7

[PredicateInfo] Handle logical and/or · 1c6d1e57

Nikita Popov authored Jan 20, 2021

Teach PredicateInfo to handle logical and/or the same way as
bitwise and/or. This allows handling logical and/or inside IPSCCP
and NewGVN.

1c6d1e57

Reland "[PDB] Defer relocating .debug$S until commit time and parallelize it" · 1a9bd5b8

Reid Kleckner authored Jan 20, 2021

This reverts commit 5b7aef6e and relands
6529d7c5.

The ASan error was debugged and determined to be the fault of an invalid
object file input in our test suite, which was fixed by my last change.
LLD's project policy is that it assumes input objects are valid, so I
have added a comment about this assumption to the relocation bounds
check.

1a9bd5b8

[PredicateInfo] Generalize processing of conditions · ca4ed1e7

Nikita Popov authored Jan 11, 2021

Branch/assume conditions in PredicateInfo are currently handled in
a rather ad-hoc manner, with some arbitrary limitations. For example,
an `and` of two `icmp`s will be handled, but an `and` of an `icmp`
and some other condition will not. That also includes the case where
more than two conditions and and'ed together.

This patch makes the handling more general by looking through and/ors
up to a limit and considering all kinds of conditions (though operands
will only be taken for cmps of course).

Differential Revision: https://reviews.llvm.org/D94447

ca4ed1e7

[WebAssembly] Prototype new f64x2 conversions · 11802ece

Thomas Lively authored Jan 20, 2021

As proposed in https://github.com/WebAssembly/simd/pull/383.

Differential Revision: https://reviews.llvm.org/D95012

11802ece

[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets · 560d7e04

dfukalov authored Jan 20, 2021

... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036

560d7e04

[lld-macho] Run ObjCContractPass during LTO · 697f4e42

Jez Ng authored Jan 12, 2021

Run the ObjCARCContractPass during LTO. The legacy LTO backend (under
LTO/ThinLTOCodeGenerator.cpp) already does this; this diff just adds that
behavior to the new LTO backend. Without that pass, the objc.clang.arc.use
intrinsic will get passed to the instruction selector, which doesn't know how to
handle it.

In order to test both the new and old pass managers, I've also added support for
the `--[no-]lto-legacy-pass-manager` flags.

P.S. Not sure if the ordering of the pass within the pipeline matters...

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94547

697f4e42

Revert "[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor" · d97f776b
Mircea Trofin authored Jan 20, 2021
```
This reverts commit e8aec763.
```
d97f776b

[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor · e8aec763

Mircea Trofin authored Jan 19, 2021

When using 2 InlinePass instances in the same CGSCC - one for other
mandatory inlinings, the other for the heuristic-driven ones - the order
in which the ImportedFunctionStats would be output-ed would depend on
the destruction order of the inline passes, which is not deterministic.

This patch moves the ImportedFunctionStats responsibility to the
InlineAdvisor to address this problem.

Differential Revision: https://reviews.llvm.org/D94982

e8aec763

Revert "[DAGCombiner] Enable SimplifyDemandedBits vector support for TRUNCATE" · a5122605

Hans Wennborg authored Jan 20, 2021

It caused "Vector shift amounts must be in the same as their first arg"
asserts in Chromium builds. See the code review for repro instructions.

> Add DemandedElts support inside the TRUNCATE analysis.
>
> Differential Revision: https://reviews.llvm.org/D56387

This reverts commit cad4275d.

a5122605

[BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblemem_or_argmemonly · 16d6e852
Dávid Bolvanský authored Jan 20, 2021
```
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94850
```
16d6e852
[RISCV] Remove unnecessary APInt copy. NFC · 9d792fef
Craig Topper authored Jan 20, 2021
```
getAPIntValue returns a const APInt& so keep it as a reference.
```
9d792fef

[X86][AVX] Handle vperm2x128 shuffling of a subvector splat. · b8b5e87e

Simon Pilgrim authored Jan 20, 2021

We already handle "vperm2x128 (ins ?, X, C1), (ins ?, X, C1), 0x31" for shuffling of the upper subvectors, but we weren't dealing with the case when we were splatting the upper subvector from a single source.

b8b5e87e

[AArch64] Fix -Wunused-but-set-variable in GCC -DLLVM_ENABLE_ASSERTIONS=off build · 36e62b1f
Fangrui Song authored Jan 20, 2021

36e62b1f

[PowerPC][Power10] Exploit splat instruction xxsplti32dx in Power10 · 719b563e

Albion Fung authored Jan 20, 2021

Exploits the instruction xxsplti32dx.

It can be used to materialize any 64 bit scalar/vector splat by using two instances, one for the upper 32 bits and the other for the lower 32 bits. It should not materialize the cases which can be materialized by using the instruction xxspltidp.

Differential Revision: https://https://reviews.llvm.org/D90173

719b563e

[RISCV] Add way to mark CompressPats that should only be used for compressing. · b11b6ab3

Craig Topper authored Jan 20, 2021

There can be muliple patterns that map to the same compressed
instruction. Reversing those leads to multiple ways to uncompress
an instruction, but its not easily controllable which one will
be chosen by the tablegen backend.

This patch adds a flag to mark patterns that should only be used
for compressing. This allows us to leave one canonical pattern
for uncompressing.

The obvious benefit of this is getting c.mv to uncompress to
the addi patern that is aliased to the mv pseudoinstruction. For
the add/and/or/xor/li patterns it just removes some unreachable
code from the generated code.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D94894

b11b6ab3

[SLP] reduce reduction code for checking vectorizable ops; NFC · c09be0d2

Sanjay Patel authored Jan 20, 2021

This is another step towards removing `OperationData` and
fixing FMF matching/propagation bugs when forming reductions.

c09be0d2

[SLP] refactor more reduction functions; NFC · 1c54112a

Sanjay Patel authored Jan 20, 2021

We were able to remove almost all of the state from
OperationData, so these don't make sense as members
of that class - just pass the RecurKind in as a param.

More streamlining is possible, but I'm trying to avoid
logic/typo bugs while fixing this. Eventually, we should
not need the `OperationData` class.

1c54112a

[SLP] move reduction createOp functions; NFC · 8590d245

Sanjay Patel authored Jan 19, 2021

We were able to remove almost all of the state from
OperationData, so these don't make sense as members
of that class - just pass the RecurKind in as a param.

8590d245

Loop peeling: check that latch is conditional branch · 40cd262c

Joseph Tremoulet authored Jan 20, 2021

Loop peeling assumes that the loop's latch is a conditional branch.  Add
a check to canPeel that explicitly checks for this, and testcases that
otherwise fail an assertion when trying to peel a loop whose back-edge
is a switch case or the non-unwind edge of an invoke.

Reviewed By: skatkov, fhahn

Differential Revision: https://reviews.llvm.org/D94995

40cd262c

[DAGCombiner] Enable SimplifyDemandedBits vector support for TRUNCATE · cad4275d
Simon Pilgrim authored Jan 20, 2021
```
Add DemandedElts support inside the TRUNCATE analysis.

Differential Revision: https://reviews.llvm.org/D56387
```
cad4275d
Revert "[TableGen] Improve algorithm for inheriting class template args and fields" · 4f5f29d4
Paul C. Anagnostopoulos authored Jan 20, 2021
```
This reverts commit c056f824.

That commit causes build failures.
```
4f5f29d4

[X86][AVX] Fold extract_subvector(VSRLI/VSHLI(x,32)) -> VSRLI/VSHLI(extract_subvector(x),32) · 19d02842

Simon Pilgrim authored Jan 20, 2021

As discussed on D56387, if we're shifting to extract the upper/lower half of a vXi64 vector then we're actually better off performing this at the subvector level as its very likely to fold into something.

combineConcatVectorOps can perform this in reverse if necessary.

19d02842

[TableGen] Improve algorithm for inheriting class template args and fields · c056f824
Paul C. Anagnostopoulos authored Jan 08, 2021
```
Differential Revision: https://reviews.llvm.org/D94822
```
c056f824

[AArch64] Add support for the GNU ILP32 ABI · 21bfd068

Amanieu d'Antras authored Jan 20, 2021

Add the aarch64[_be]-*-gnu_ilp32 targets to support the GNU ILP32 ABI for AArch64.

The needed codegen changes were mostly already implemented in D61259, which added support for the watchOS ILP32 ABI. The main changes are:
- Wiring up the new target to enable ILP32 codegen and MC.
- ILP32 va_list support.
- ILP32 TLSDESC relocation support.

There was existing MC support for ELF ILP32 relocations from D25159 which could be enabled by passing "-target-abi ilp32" to llvm-mc. This was changed to check for "gnu_ilp32" in the target triple instead. This shouldn't cause any issues since the existing support was slightly broken: it was generating ELF64 objects instead of the ELF32 object files expected by the GNU ILP32 toolchain.

This target has been tested by running the full rustc testsuite on a big-endian ILP32 system based on the GCC ILP32 toolchain.

Reviewed By: kristof.beyls

Differential Revision: https://reviews.llvm.org/D94143

21bfd068

[PM] Avoid duplicates in the Used/Preserved/Required sets · 985b9b7e

Bjorn Pettersson authored Jan 11, 2021

The pass analysis uses "sets" implemented using a SmallVector type
to keep track of Used, Preserved, Required and RequiredTransitive
passes. When having nested analyses we could end up with duplicates
in those sets, as there was no checks to see if a pass already
existed in the "set" before pushing to the vectors. This idea with
this patch is to avoid such duplicates by avoiding pushing elements
that already is contained when adding elements to those sets.

To align with the above PMDataManager::collectRequiredAndUsedAnalyses
is changed to skip adding both the Required and RequiredTransitive
passes to its result vectors (since RequiredTransitive always is
a subset of Required we ended up with duplicates when traversing
both sets).

Main goal with this is to avoid spending time verifying the same
analysis mulitple times in PMDataManager::verifyPreservedAnalysis
when iterating over the Preserved "set". It is assumed that removing
duplicates from a "set" shouldn't have any other negative impact
(I have not seen any problems so far). If this ends up causing
problems one could do some uniqueness filtering of the vector being
traversed in verifyPreservedAnalysis instead.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D94416

985b9b7e

[AArch64] Add missing "flagm" feature to the .arch_extension directive. · cab20f61
Mark Murray authored Jan 19, 2021
```
Depends on D94970

Differential Revision: https://reviews.llvm.org/D94971
```
cab20f61
[AArch64] Add missing "pauth" feature to the .arch_extension directive. · f344c028
Mark Murray authored Jan 18, 2021
```
Differential Revision: https://reviews.llvm.org/D94970
```
f344c028

[Coroutine] Remain alignment information when merging frame variables · c1bc7981

Chuanqi Xu authored Jan 20, 2021

Summary: This is to address bug48712.
The solution in this patch is that when we want to merge two variable a
into the storage frame of variable b only if the alignment of a is
multiple of b.
There may be other strategies. But now I think they are hard to handle
and benefit little. Or we can implement them in the future.

Test-plan: check-llvm

Reviewers: jmorse, lxfind, junparser

Differential Revision: https://reviews.llvm.org/D94891

c1bc7981

[AMDGPU][GlobalISel] Avoid selecting S_PACK with constants · a6a72dfd

Mirko Brkusanin authored Jan 20, 2021

If constants are hidden behind G_ANYEXT we can treat them same way as G_SEXT.
For that purpose we extend getConstantVRegValWithLookThrough with option
to handle G_ANYEXT same way as G_SEXT.

Differential Revision: https://reviews.llvm.org/D92219

a6a72dfd

[AMDGPU][MC] Add tfe disassembler support MIMG opcodes · 4ab704d6

Petar Avramovic authored Jan 20, 2021

With tfe on there can be a vgpr write to vdata+1.
Add tablegen support for 5 register vdata store.
This is required for 4 register vdata store with tfe.

Differential Revision: https://reviews.llvm.org/D94960

4ab704d6

[GlobalISel] Add missing operand update when copy is required · 2aeaaf84

Gabriel Hjort Åkerlund authored Jan 20, 2021

When constraining an operand register using constrainOperandRegClass(),
the function may emit a COPY in case the provided register class does
not match the current operand register class. However, the operand
itself is not updated to make use of the COPY, thereby resulting in
incorrect code. This patch fixes that bug by updating the machine
operand accordingly.

Reviewed By: dsanders

Differential Revision: https://reviews.llvm.org/D91244

2aeaaf84

[NFC][InstructionCost] Use InstructionCost in lib/Transforms/IPO/IROutliner.cpp · 255a5077

David Sherwood authored Jan 11, 2021

In places where we call a TTI.getXXCost() function I have changed
the code to use InstructionCost instead of unsigned. This is in
preparation for later on when we will change the TTI interfaces
to return InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential Revision: https://reviews.llvm.org/D94427

255a5077

[X86] Add segment and address-size override prefixes · e2229538

Bill Wendling authored Jan 14, 2021

X86 allows for the "addr32" and "addr16" address size override prefixes.
Also, these and the segment override prefixes should be recognized as
valid prefixes.

Differential Revision: https://reviews.llvm.org/D94726

e2229538

[RISCV] Implement vlseg intrinsics. · 8ca4b174

Hsiangkai Wang authored Dec 31, 2020

For Zvlsseg, we need continuous vector registers for the values. We need
to define new register classes for the different combinations of (number
of fields and LMUL). For example,

when the number of fields(NF) = 3, LMUL = 2, the values will be assigned
to (V0M2, V2M2, V4M2), (V2M2, V4M2, V6M2), (V4M2, V6M2, V8M2), ...

We define the vlseg intrinsics with multiple outputs. There is no way to
describe the codegen patterns with multiple outputs in the tablegen
files. We do the codegen in RISCVISelDAGToDAG and use EXTRACT_SUBREG to
extract the values of output.

The multiple scalable vector values will be put into a struct. This
patch is depended on the support for scalable vector struct.

Differential Revision: https://reviews.llvm.org/D94229

8ca4b174

[llvm] Use llvm::all_of (NFC) · b023cdea
Kazu Hirata authored Jan 19, 2021

b023cdea
[llvm] Use llvm::any_of (NFC) · 978c7540
Kazu Hirata authored Jan 19, 2021

978c7540
[llvm] Use llvm::find (NFC) · 88572024
Kazu Hirata authored Jan 19, 2021

88572024

[RISCV] refactor VPatBinary (NFC) · 4dae2247

ShihPo Hung authored Jan 18, 2021

Make it easier to reuse for intrinsic vrgatherei16
which needs to encode both LMUL & EMUL in the instruction name,
like PseudoVRGATHEREI16_VV_M1_M1 and PseudoVRGATHEREI16_VV_M1_M2.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94951

4dae2247

Allow nonnull/align attribute to accept poison · 4479c0c2

Juneyoung Lee authored Jan 13, 2021

Currently LLVM is relying on ValueTracking's `isKnownNonZero` to attach `nonnull`, which can return true when the value is poison.
To make the semantics of `nonnull` consistent with the behavior of `isKnownNonZero`, this makes the semantics of `nonnull` to accept poison, and return poison if the input pointer isn't null.
This makes many transformations like below legal:

```
%p = gep inbounds %x, 1 ; % p is non-null pointer or poison
call void @f(%p)        ; instcombine converts this to call void @f(nonnull %p)
```

Instead, this semantics makes propagation of `nonnull` to caller illegal.
The reason is that, passing poison to `nonnull` does not immediately raise UB anymore, so such program is still well defined, if the callee does not use the argument.
Having `noundef` attribute there re-allows this.

```
define void @f(i8* %p) {       ; functionattr cannot mark %p nonnull here anymore
  call void @g(i8* nonnull %p) ; .. because @g never raises UB if it never uses %p.
  ret void
}
```

Another attribute that needs to be updated is `align`. This patch updates the semantics of align to accept poison as well.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D90529

4479c0c2