Commits · a80d5c34e4b99f21fa371160ac7eb7e9db093997 · Lorenzo Albano / LLVM bpEVL

Jan 31, 2022

[gn build] Port f3514af4 · 6d22f049
LLVM GN Syncbot authored Jan 31, 2022

6d22f049

Revert "[Local] invertCondition: try modifying an existing ICmpInst" · 8faad296

Jay Foad authored Jan 31, 2022

This reverts commit a6b54dda.

Apparently it is not safe to modify the condition even if it passes the
hasOneUse test, because StructurizeCFG might have other references to
the condition that are not manifest in the IR use-def chains.

8faad296

[SVE] Fix TypeSize->uint64_t implicit conversion in visitAlloca() · 002b944d

Kerry McLaughlin authored Jan 31, 2022

Fixes a crash ('Invalid size request on a scalable vector') in visitAlloca()
when we call this function for a scalable alloca instruction, caused
by the implicit conversion of TySize to uint64_t.
This patch changes TySize to a TypeSize as returned by getTypeAllocSize()
and ensures the allocation size is multiplied by vscale for scalable vectors.
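
As a rough illustration of why the implicit conversion loses information, the sketch below models a size that may be scalable. `SketchTypeSize` and `allocaBytes` are hypothetical names invented here, not LLVM's API, and `VScale` is passed as an ordinary parameter even though it is only known at run time on real hardware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a possibly scalable size: a known minimum plus a
// flag saying whether the real size is that minimum multiplied by vscale.
struct SketchTypeSize {
  uint64_t KnownMinBytes;
  bool Scalable;
};

// Total bytes for an allocation of NumElements objects of the given size.
// Dropping the Scalable flag (as a plain integer would) gives the wrong
// answer whenever VScale is greater than one.
uint64_t allocaBytes(SketchTypeSize Ty, uint64_t NumElements, uint64_t VScale) {
  assert(VScale >= 1 && "vscale is a positive multiplier");
  uint64_t PerElement =
      Ty.Scalable ? Ty.KnownMinBytes * VScale : Ty.KnownMinBytes;
  return PerElement * NumElements;
}
```

With a 16-byte minimum and a vscale of 4, the scalable case yields 64 bytes per element while the fixed case stays at 16.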

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D118372

002b944d

[ARM] Add Cortex-X1C Support for Clang and LLVM · 6b1e844b

Ties Stuij authored Jan 31, 2022

This patch upstreams support for the Arm-v8 Cortex-X1C processor for AArch64 and
ARM.

For more information, see:
- https://community.arm.com/arm-community-blogs/b/announcements/posts/arm-cortex-x1c
- https://developer.arm.com/documentation/101968/0002/Functional-description/Technical-overview/Components

The following people contributed to this patch:
- Simon Tatham
- Ties Stuij

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D117202

6b1e844b

[Analysis] Attribute noundef should not prevent tail call optimization · ae990a3c
Dávid Bolvanský authored Jan 31, 2022
```
Very similar to https://reviews.llvm.org/D101230
Fixes https://github.com/llvm/llvm-project/issues/53501
```
ae990a3c

[X86] combineAnd() - per-element simplification - call SimplifyDemandedBits... · 7ec8fc29

Simon Pilgrim authored Jan 31, 2022

[X86] combineAnd() - per-element simplification - call SimplifyDemandedBits using mask demanded bits if SimplifyDemandedVectorElts fails

We already call SimplifyDemandedVectorElts using whether each vector mask element is zero/nonzero; this just extends this to also try SimplifyDemandedBits using the demanded bits mask generated from the nonzero elements.

This also requires an additional TargetLowering::SimplifyDemandedBits DemandedBits/DemandedElts wrapper.

7ec8fc29

[DebugInfo][InstrRef] Don't fully propagate single assigned variables · c703d77a

Jeremy Morse authored Jan 31, 2022

If we only assign a variable value a single time, we can take a short-cut
when computing its location: the variable value is only valid up to the
dominance frontier of where the assignment happens. Past that point, there
are other predecessors from where the variable has no value, meaning the
variable has no location past that point.

This patch recognises this scenario, and avoids expensive SSA computation,
to improve compile-time performance.

Differential Revision: https://reviews.llvm.org/D117877

c703d77a

Revert "[gn build] (manually) port 36892727" · da01fb74
Nico Weber authored Jan 31, 2022
```
This reverts commit 7b2dfe1c.

Matches ab3b8985.
```
da01fb74
Save some `std::string` allocations/deallocations when formatting attributes (NFC) · 5a90b1e4
Momchil Velikov authored Jan 31, 2022
```
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D118451
```
5a90b1e4
[DAG] SimplifyDemandedBits - mul(x,x) - if only demand bit[1] then fold to zero · 2d1390ef
Simon Pilgrim authored Jan 31, 2022

2d1390ef
[X86] Limit mul(x,x) knownbits tests with not undef/poison check · 48f45f6b
Simon Pilgrim authored Jan 31, 2022
```
We can only assume bit[1] == zero if it's the only demanded bit or the source is not undef/poison
```
48f45f6b
[AMDGPU] AMDGPUAnnotateUniformValues: inline a single-use lambda. NFC. · 0dcc8b86
Jay Foad authored Jan 31, 2022

0dcc8b86

[AMDGPU] Add test for a problem with noclobber metadata · ae68b3a4

Jay Foad authored Jan 28, 2022

If AMDGPUAnnotateUniformValues finds a load from a uniform pointer with
no potentially clobbering stores between the kernel entry point and the
load instruction, it adds noclobber metadata to the *address*. This is
unsafe because it can get applied to other loads from the same address which do
have aliasing stores.

Differential Revision: https://reviews.llvm.org/D118458

ae68b3a4

[X86] Add mul(x,x) tests showing miscompile · ffd0e464

Simon Pilgrim authored Jan 31, 2022

As raised by @efriedma on D117995 - the source must not be undef/poison to demand any bits in mul(x,x) other than bit[1]

https://alive2.llvm.org/ce/z/Cxkjen
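
The arithmetic fact behind the bit[1] fold can be checked by brute force for well-defined inputs. The sketch below is only an illustration with a hypothetical helper name; it does not model undef or poison, which is exactly the case the tests above are about.

```cpp
#include <cassert>
#include <cstdint>

// Counts the 16-bit values whose square has bit 1 set. A square is congruent
// to 0 or 1 modulo 4, so for ordinary (non-undef) inputs the count is zero.
uint32_t countSquaresWithBit1Set() {
  uint32_t Count = 0;
  for (uint32_t X = 0; X <= 0xFFFFu; ++X) {
    // X * X fits in 32 bits for every 16-bit X; truncation keeps the low bits.
    uint16_t Square = static_cast<uint16_t>(X * X);
    if (Square & 0x2u)
      ++Count;
  }
  return Count;
}
```

The function returns 0, matching the claim that bit[1] of mul(x,x) is zero whenever both operands really are the same concrete value.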

ffd0e464

[Local] invertCondition: try modifying an existing ICmpInst · a6b54dda

Jay Foad authored Jan 28, 2022

This avoids various cases where StructurizeCFG would otherwise insert an
xor i1 instruction, and since it generally runs late in the pipeline,
instcombine does not clean up the xor-of-cmp pattern.

Differential Revision: https://reviews.llvm.org/D118478

a6b54dda

[WebAssembly] Refactor and fix emission of external IR global decls · 00bf4755

Paulo Matos authored Jan 31, 2022

This patch fixes the visibility and linkage information of symbols
referring to IR globals.

Emission of external declarations is now done in the first execution
of emitConstantPool rather than in emitLinkage (and a few other
places). This is the point where we have already gathered information
about used symbols (by running the MC Lower PrePass) and not yet
started emitting any functions so that any declarations that need to
be emitted are done so at the top of the file before any functions.

This changes the order of a few directives in the final asm file which
required an update to a few tests.

Reviewed By: sbc100

Differential Revision: https://reviews.llvm.org/D118122

00bf4755

[AArch64] Fix costs of float vector compare/selects pairs. · 17ebd68a

Florian Hahn authored Jan 31, 2022

The current cost-model overestimates the cost of vector compares &
selects for ordered floating point compares. This patch fixes that by
extending the existing logic for integer predicates.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D118256

17ebd68a

Cleanup LLVMRemarks includes · 25991aad

serge-sans-paille authored Jan 28, 2022

Based on the output of include-what-you-use.

Most notably, llvm/Remarks/Remark.h is no longer automatically included by
llvm/Remarks/RemarkParser.h, so client code may need to include it explicitly.

clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/Remarks/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 770253
after:  759347

Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup

Differential Revision: https://reviews.llvm.org/D118506

25991aad

Cleanup llvm/utils/TableGen headers · 2dde5c97

serge-sans-paille authored Jan 28, 2022

Based on the output of include-what-you-use.
It's a utility directory, so not much impact on other code areas.

clang++ -E  -Iinclude -I../llvm/include ../llvm/utils/TableGen/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 4327274
after:  4316190

Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118466

2dde5c97

[Inline][Cloning] Reliably remove unreachable blocks during cloning (PR53206) · 4810051a

Nikita Popov authored Jan 28, 2022

The pruning cloner already tries to remove unreachable blocks. The
original cloning process will simplify instructions and constant
terminators, and only clone blocks that are reachable at that point.
However, phi nodes can only be simplified after everything has been
cloned. For that reason, additional blocks may become unreachable
after phi simplification.

The code does try to handle this as well, but only removes blocks
that don't have predecessors. It misses unreachable cycles. This
can cause issues if SEH exception handling code is part of an
unreachable cycle, as the inliner is not prepared to deal with that.

This patch instead performs an explicit scan for reachable blocks,
and drops everything else.
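
The difference between the two strategies can be sketched on a toy control-flow graph. The code below is a minimal illustration with made-up names and integer block ids, not the cloner's actual implementation: a two-block cycle that the entry never reaches still has predecessors, so only the reachability scan flags it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Successor lists indexed by block id; block 0 is the entry block.
using Graph = std::vector<std::vector<std::size_t>>;

// Explicit scan: mark every block reachable from the entry via a worklist.
std::vector<bool> reachableFromEntry(const Graph &Succs) {
  std::vector<bool> Seen(Succs.size(), false);
  if (Succs.empty())
    return Seen;
  std::vector<std::size_t> Worklist{0};
  Seen[0] = true;
  while (!Worklist.empty()) {
    std::size_t BB = Worklist.back();
    Worklist.pop_back();
    for (std::size_t S : Succs[BB]) {
      assert(S < Succs.size() && "successor id out of range");
      if (!Seen[S]) {
        Seen[S] = true;
        Worklist.push_back(S);
      }
    }
  }
  return Seen;
}

// Weaker check: a non-entry block is considered dead only if no block
// branches to it. Blocks in an unreachable cycle branch to each other.
std::vector<bool> hasNoPredecessors(const Graph &Succs) {
  std::vector<bool> NoPreds(Succs.size(), true);
  for (const auto &Out : Succs)
    for (std::size_t S : Out)
      NoPreds[S] = false;
  if (!NoPreds.empty())
    NoPreds[0] = false; // the entry block is never treated as dead
  return NoPreds;
}
```

For the graph `{{1}, {}, {3}, {2}}`, blocks 2 and 3 form a cycle: `hasNoPredecessors` reports neither as dead, while `reachableFromEntry` leaves both unmarked.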

Fixes https://github.com/llvm/llvm-project/issues/53206.

Differential Revision: https://reviews.llvm.org/D118449

4810051a

[RISCV] Avoid pointer element type access for masked atomicrmw intrinsics · 0801940c

Nikita Popov authored Jan 27, 2022

masked.atomicrmw.*.i32 intrinsics access an i32 (and then possibly
mask it), so hardcode MVT::i32 as the access type here, rather than
determining it from the pointer element type.

Differential Revision: https://reviews.llvm.org/D118336

0801940c

[llvm] Remove redundant `;` (NFC) · f38767d7
Amir Ayupov authored Jan 30, 2022

f38767d7

[InstCombine] Generalize and-reduce pattern to handle `ne` case as well as `eq` · 70b3beb0

Max Kazantsev authored Jan 31, 2022

Following Sanjay's proposal from discussion in D118317, this patch
generalizes and-reduce handling to fold the following pattern
```
  icmp ne (bitcast(icmp ne (lhs, rhs)), 0)
```
into
```
  icmp ne (bitcast(lhs), bitcast(rhs))
```

https://alive2.llvm.org/ce/z/WDcuJ_
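
The equivalence can be pictured on a small fixed-width vector. The sketch below is an illustration only, using a hypothetical four-lane byte vector: the per-lane `ne` results packed into a mask are nonzero exactly when the two vectors, viewed as one wide integer, differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Vec4 = std::array<uint8_t, 4>;

// Original pattern: compare lane by lane, pack the i1 results into an
// integer mask, then test the mask against zero.
bool anyLaneDiffers(const Vec4 &L, const Vec4 &R) {
  unsigned Mask = 0;
  for (std::size_t I = 0; I < L.size(); ++I)
    if (L[I] != R[I])
      Mask |= 1u << I;
  return Mask != 0;
}

// Folded pattern: reinterpret each vector as one 32-bit integer and compare.
bool wideCompareDiffers(const Vec4 &L, const Vec4 &R) {
  uint32_t LBits, RBits;
  std::memcpy(&LBits, L.data(), sizeof LBits);
  std::memcpy(&RBits, R.data(), sizeof RBits);
  return LBits != RBits;
}

// Asserts that both forms agree on one pair of inputs.
void checkFold(const Vec4 &L, const Vec4 &R) {
  assert(anyLaneDiffers(L, R) == wideCompareDiffers(L, R));
}
```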

Differential Revision: https://reviews.llvm.org/D118431
Reviewed By: lebedev.ri

70b3beb0

[RISCV] Use existing variable instead of calling getOperand again. NFCI · 5fbc3cda

Craig Topper authored Jan 30, 2022

This is a slight change because I'm using the ANY_EXTEND result
instead of the original operand, but getNode should constant fold.

While there, add a comment about why the code specifically checks
for a ConstantSDNode.

5fbc3cda

[RISCV] Add more pack and packw test case for Zbkb. NFC · 175145e3
Craig Topper authored Jan 30, 2022
```
Make sure we cover the encodings used for zext.h and other encodings
not used for zext.h.
```
175145e3

[RISCV] Merge rv64zbkb-valid.s and rv64zbkb-only-valid.s. NFC · bb495810

Craig Topper authored Jan 30, 2022

Based on the existing naming "only" tests are used for rv32 instructions
that don't exist in rv64. rv32 tests without "only" are for instructions
that are in both rv32 and rv64. The rv64 tests are for instructions
that are only in rv64.

Both of these test files have instruction encodings that are only
valid in rv64 so they can be the same file.

bb495810

[RISCV] Rename rv64-zbkb-valid.s to rv64zbkb-valid.s. NFC · 3931faa5
Craig Topper authored Jan 30, 2022

3931faa5

[RISCV] Fix bad CHECK prefix in rv32zbkb-valid.s. · 491403c1

Craig Topper authored Jan 30, 2022

"pack t0, t1, zero" disassembles to "pack t0, t1, zero" with Zbkb
not "zext.h t0, t1"

Part of the test was using a CHECK prefix that doesn't appear on
the RUN line.

491403c1

[Analysis] Drop an unnecessary const from a return type (NFC) · cda7b6aa
Kazu Hirata authored Jan 30, 2022
```
Identified with readability-const-return-type.
```
cda7b6aa
[llvm] Use = default (NFC) · 152d61a8
Kazu Hirata authored Jan 30, 2022

152d61a8
[mlgo][regalloc] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds... · 0e691aed
Fangrui Song authored Jan 30, 2022
```
[mlgo][regalloc] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after a8a7bf92
```
0e691aed

Jan 30, 2022

[mlgo][regalloc] Fix register masking · a8a7bf92

Mircea Trofin authored Jan 30, 2022

If AllocationOrder has fewer than 32 elements, we were treating the extra
positions as if they were valid. This was detected by a subsequent
assert. The fix also tightens the asserts.
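
One way to picture the bug is as a bitmask over a fixed number of candidate positions. The helper below is a hypothetical sketch, not the code from the patch: it marks only the first `NumCandidates` of 32 positions as valid, so positions past the end of a short allocation order stay masked off.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bit I is set only if position I holds a real candidate.
// The limit of 32 positions comes from the commit message; the rest is assumed.
uint32_t validPositionMask(std::size_t NumCandidates) {
  constexpr std::size_t MaxPositions = 32;
  assert(NumCandidates <= MaxPositions && "more candidates than positions");
  if (NumCandidates == MaxPositions)
    return ~uint32_t{0}; // avoid shifting a 32-bit value by 32
  return (uint32_t{1} << NumCandidates) - 1;
}
```

A caller would then consult the mask before treating a position as a real candidate.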

a8a7bf92

[RISCV] Lower riscv_zip/unzip intrinsic to RISCVISD::SHFL/UNSHFL. · 744be8c5

Craig Topper authored Jan 30, 2022

These are special versions of the more general shfli/unshfli
instructions. We can use the general ISD opcodes with the correct
immediates.

744be8c5

[Support][NFC] Fix generic `ChildrenGetterTy` of `IDFCalculatorBase` · e0b11c76

Markus Böck authored Jan 30, 2022

Both IDFCalculatorBase and its accompanying DominatorTreeBase only support pointer nodes. The template argument is the block type itself, and any use of GraphTraits is therefore done via a pointer to the node type.
However, the ChildrenGetterTy type of IDFCalculatorBase uses just the node type instead of a pointer to the node type. Various parts of the monorepo have worked around this issue by providing specializations of GraphTraits for the node type directly, or have not been affected because they use specializations instead of the generic case. These workarounds are unnecessary; the generic code should be fixed instead.

An in-tree example is the use of IDFCalculatorBase in InstrRefBasedImpl.cpp. It basically instantiates an IDFCalculatorBase<MachineBasicBlock, false>, but due to the bug above it then goes on to specialize GraphTraits<MachineBasicBlock>, although GraphTraits<MachineBasicBlock*> exists (and should be used instead).

Similar dead code exists in clang which defines redundant GraphTraits to work around this bug.

This patch both fixes the original issue and removes the dead code that was used to work around it.

Differential Revision: https://reviews.llvm.org/D118386

e0b11c76

[RISCV] Custom lower brev8 intrinsic to RISCVISD::GREV. · e1075186

Craig Topper authored Jan 30, 2022

We can use the RISCVISD::GREV encoding that swaps the bits in
each byte.  This allows it to use the existing computeKnownBits
support for RISCVISD::GREV.
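
For readers unfamiliar with generalized reverse, the sketch below models a 32-bit GREV in plain C++ and compares it with a straightforward per-byte bit reversal. The function names are hypothetical, and the control value 7 (enabling the 1-, 2- and 4-bit swap stages) is taken from the draft bitmanip specification rather than from this commit.

```cpp
#include <cassert>
#include <cstdint>

// Generalized bit reverse on 32 bits: each set bit of ShAmt enables one
// stage that swaps neighbouring groups of 1, 2, 4, 8 or 16 bits.
uint32_t grev32(uint32_t X, unsigned ShAmt) {
  assert(ShAmt < 32 && "only five stages exist for a 32-bit value");
  if (ShAmt & 1)
    X = ((X & 0x55555555u) << 1) | ((X & 0xAAAAAAAAu) >> 1);
  if (ShAmt & 2)
    X = ((X & 0x33333333u) << 2) | ((X & 0xCCCCCCCCu) >> 2);
  if (ShAmt & 4)
    X = ((X & 0x0F0F0F0Fu) << 4) | ((X & 0xF0F0F0F0u) >> 4);
  if (ShAmt & 8)
    X = ((X & 0x00FF00FFu) << 8) | ((X & 0xFF00FF00u) >> 8);
  if (ShAmt & 16)
    X = ((X & 0x0000FFFFu) << 16) | ((X & 0xFFFF0000u) >> 16);
  return X;
}

// Reference brev8: reverse the bit order inside every byte independently.
uint32_t brev8Reference(uint32_t X) {
  uint32_t Result = 0;
  for (unsigned Bit = 0; Bit < 32; ++Bit) {
    unsigned Byte = Bit / 8, Pos = Bit % 8;
    if (X & (uint32_t{1} << Bit))
      Result |= uint32_t{1} << (Byte * 8 + (7 - Pos));
  }
  return Result;
}
```

With the three low stages enabled, `grev32(X, 7)` agrees with `brev8Reference(X)`; for example, 0x01020380 maps to 0x8040C001 under both.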

e1075186

[OpenMP] Use nullptr instead of NULL (NFC) · 780f8a00
Kazu Hirata authored Jan 30, 2022
```
Identified with modernize-use-nullptr.
```
780f8a00
[Analysis] Use != to compare strings (NFC) · 49fdee13
Kazu Hirata authored Jan 30, 2022
```
Identified with readability-string-compare.
```
49fdee13
[CodeGen] Use default member initialization (NFC) · 2bea207d
Kazu Hirata authored Jan 30, 2022
```
Identified with modernize-use-default-member-init.
```
2bea207d

[X86] combineVectorTruncation - use PACKUSDW(BLENDW(X,0),BLENDW(Y,0)) for v8i32->v8i16 truncation · 156f83ad

Simon Pilgrim authored Jan 30, 2022

Limit this to SSE41 - AVX1 targets to avoid UNPCKL(PSHUFB,PSHUFB): pre-SSE41 we don't have PACKUSDW/BLENDW, and with AVX2 we can perform this as PERMQ(PSHUFB()).
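
A scalar model of one lane may help show why the blend with zero is needed before the pack. The sketch below uses hypothetical helper names and is not the DAG combine itself: PACKUSDW saturates a signed 32-bit lane into an unsigned 16-bit one, so clearing the upper 16 bits first keeps every lane in range and turns the saturating pack into a plain truncation.

```cpp
#include <cassert>
#include <cstdint>

// One lane of a PACKUSDW-style pack: signed 32-bit input, unsigned 16-bit
// output with saturation at both ends.
uint16_t packusLane(int32_t V) {
  if (V < 0)
    return 0;
  if (V > 0xFFFF)
    return 0xFFFF;
  return static_cast<uint16_t>(V);
}

// Blend-with-zero then pack: zeroing the upper half of the lane first means
// the saturation above never fires, so the result equals truncation.
uint16_t truncViaBlendAndPack(uint32_t X) {
  uint32_t Blended = X & 0xFFFFu; // models BLENDW(X, 0) for this lane
  assert(Blended <= 0xFFFFu);
  return packusLane(static_cast<int32_t>(Blended));
}
```

Without the blend, a lane such as 0x12345678 would saturate to 0xFFFF instead of truncating to 0x5678.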

156f83ad

[X86][AVX] matchUnaryShuffle - avoid creation of on-the-fly nodes (PR45974) · b7e04ccd
Simon Pilgrim authored Jan 30, 2022
```
Don't extract the ANY/ZERO_EXTEND_VECTOR_INREG subvector source until we're definitely combining to a new node.
```
b7e04ccd