Commits · 3a506b31a341585a21b21c42253ea9fc54c55b37 · Lorenzo Albano / LLVM bpEVL

Mar 21, 2021

Change OwningRewritePatternList to carry an MLIRContext with it. · 3a506b31

Chris Lattner authored Mar 20, 2021

This updates the codebase to pass the context when creating an instance of
OwningRewritePatternList, and starts removing extraneous MLIRContext
parameters. There are many many more to be removed.

Differential Revision: https://reviews.llvm.org/D99028

3a506b31

Reapply [ConstantFold] Handle vectors in ConstantFoldLoadThroughBitcast() · 9f864d20

Nikita Popov authored Mar 06, 2021

There seems to be an impedance mismatch between what the type
system considers an aggregate (structs and arrays) and what
constants consider an aggregate (structs, arrays and vectors).

Adjust the type check to consider vectors as well. The previous
version of the patch dropped the type check entirely, but it
turns out that getAggregateElement() does require the constant
to be an aggregate in some edge cases: For Poison/Undef the
getNumElements() API is called, without checking in advance that
we're dealing with an aggregate. Possibly the implementation should
avoid doing that, but for now I'm adding an assert so the next
person doesn't fall into this trap.

9f864d20

[InstSimplify] Add load of undef aggregate test (NFC) · 59dbf4d5
Nikita Popov authored Mar 21, 2021
```
To make sure this doesn't crash the following commit.
```
59dbf4d5
[InstSimplify] Regenerate test checks (NFC) · b32f5d50
Nikita Popov authored Mar 21, 2021

b32f5d50
[InstSimplify] Add additional select operand replacement tests (NFC) · ece1403a
Nikita Popov authored Mar 21, 2021
```
This tests for binops with identity elements.
```
ece1403a

[InstSimplify] Clean up SimplifyReplacedWithOp implementation (NFCI) · daae927f

Nikita Popov authored Mar 21, 2021

Replace Op with RepOp up-front, and then always work with the new
operands, rather than checking for replacement in various places.

daae927f

GlobalISel: Avoid unnecessary truncation to i64 · 1098acd4
Matt Arsenault authored Mar 20, 2021
```
We can just directly pass through the APInt to create a new constant.
```
1098acd4
AMDGPU/GlobalISel: Enable CSE in pre-legalizer combiner · 6314a727
Matt Arsenault authored Mar 20, 2021

6314a727

[DAG] Limit (sext_in_reg (zero_extend_vector_inreg x)) to exact sign extension · 64c2641c

Simon Pilgrim authored Mar 21, 2021

As commented by @craig.topper on rG1ba5c550d418, we can't guarantee that we'll be extending zero bits, only the sign bit. So, revert to the old code for zero_extend_vector_inreg cases.

64c2641c

[lld-macho][nfc] Format Options.td · 8757616d

Jez Ng authored Mar 21, 2021

Summary: A good chunk of it was mis-indented. Fixed by using the
formatting settings from llvm/utils/vim.

8757616d

[X86][AVX] ComputeNumSignBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources · 31795889

Simon Pilgrim authored Mar 21, 2021

The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting.

Added as an extension to PR49658

31795889

[X86] Add 'mulhs' variant of PR49658 test case · dc51cc32
Simon Pilgrim authored Mar 21, 2021

dc51cc32

[ARM] VINS f16 pattern · 6d9d2049

David Green authored Mar 21, 2021

This adds an extra pattern for inserting an f16 into an odd vector lane
via a VINS. If the dual-insert-lane pattern does not happen to apply,
this can help with some simple cases.

Differential Revision: https://reviews.llvm.org/D95471

6d9d2049

[RISCV] remove redundant instruction when eliminate frame index · 02ffbac8

luxufan authored Mar 19, 2021

The mv a0, a0 instruction is generated when the stack object offset does not
fit in int<12>. To handle this case, the eliminateFrameIndex function creates
a virtual register, which the register scavenger then has to scavenge. If the
machine instruction that contains the stack object has the opcode ADDI (the
ADDI was generated by the frame index node) and its destination register is
the same as the register chosen by the register scavenger, then mv a0, a0 is
generated. To eliminate this instruction, eliminateFrameIndex no longer
creates the virtual register when the instruction opcode is ADDI.
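
The offset check described above can be sketched as follows. This is a minimal illustration, assuming int<12> means a signed 12-bit immediate (the range of the RISC-V ADDI immediate field); the helper names are hypothetical and are not taken from the LLVM sources.

```python
def fits_simm12(offset: int) -> bool:
    """Return True if offset is encodable as a signed 12-bit immediate."""
    return -(1 << 11) <= offset <= (1 << 11) - 1


def needs_scratch_register(offset: int) -> bool:
    """Hypothetical helper: an offset that does not fit must be
    materialized in a separate (scratch) register first."""
    return not fits_simm12(offset)


for offset in (-2048, 2047, 2048, 4096):
    print(offset, "needs scratch register:", needs_scratch_register(offset))
```

Only the offsets outside [-2048, 2047] take the scratch-register path, which is the path the commit message is concerned with.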

Differential Revision: https://reviews.llvm.org/D92479

02ffbac8

[X86][AVX] computeKnownBitsForTargetNode - add X86ISD::VBROADCAST handling for scalar sources · 297b9bc3

Simon Pilgrim authored Mar 21, 2021

The target shuffle code handles vector sources, but X86ISD::VBROADCAST can also accept a scalar source for splatting.

Suggested by @craig.topper on PR49658

297b9bc3

[X86] Add PR49658 test case · 613157dd
Simon Pilgrim authored Mar 21, 2021

613157dd
[X86] computeKnownBitsForTargetNode - add X86ISD::PMULUDQ handling · 54a05f2e
Simon Pilgrim authored Mar 21, 2021
```
Reuse the existing KnownBits multiplication code to handle what is effectively an ISD::UMUL_LOHI variant
```
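
As a rough illustration of why multiplication known-bits reasoning applies here, the sketch below models one 64-bit lane of PMULUDQ (the unsigned product of the low 32 bits of each operand) and checks one simple fact a known-bits analysis can derive: the product has at least as many trailing zeros as the two inputs combined. The function names are made up for this example, and LLVM's KnownBits code derives more than this single bound.

```python
MASK32 = 0xFFFFFFFF


def pmuludq_lane(a: int, b: int) -> int:
    """One 64-bit lane: unsigned product of the low 32 bits of a and b."""
    return (a & MASK32) * (b & MASK32)


def trailing_zeros(value: int, width: int) -> int:
    """Number of trailing zero bits; `width` if value is zero."""
    if value == 0:
        return width
    return (value & -value).bit_length() - 1


def min_product_trailing_zeros(tz_a: int, tz_b: int, width: int = 64) -> int:
    """Lower bound on the trailing zeros of a product of two values."""
    return min(tz_a + tz_b, width)


a = 0xDEADBEEF00000010  # only the low 32 bits (0x10) take part
b = 0x0000000100000008  # only the low 32 bits (0x08) take part
product = pmuludq_lane(a, b)
bound = min_product_trailing_zeros(
    trailing_zeros(a & MASK32, 32), trailing_zeros(b & MASK32, 32)
)
print(hex(product), trailing_zeros(product, 64), bound)
```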
54a05f2e

[Driver] Linux.cpp: add -internal-isystem lib/../$triple/include · 2288a75d

Fangrui Song authored Mar 21, 2021

With this change, for `#include <ar.h>`, `clang --target=aarch64-linux-gnu`
will read `/usr/lib/gcc/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/ar.h`
(on Debian gcc->gcc-cross)
instead of `/usr/include/ar.h`. Some glibc headers (e.g. gnu/stubs.h) are different across architectures.
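
To see which directory the long path above actually names, it can be normalized lexically. This is only a sketch: `posixpath.normpath` collapses the `..` components textually and does not follow symlinks, so it is not necessarily what the driver or the filesystem does.

```python
import posixpath

raw = "/usr/lib/gcc/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/ar.h"

# Collapse the four ".." components purely textually.
resolved = posixpath.normpath(raw)
print(resolved)
```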

2288a75d

[Driver] Gnu.cpp: drop an unneeded special rule related to sysroot · c2f9086b
Fangrui Song authored Mar 20, 2021

c2f9086b

[Driver] Gnu.cpp: drop an unneeded special rule related to sysroot · 56700e93

Fangrui Song authored Mar 20, 2021

Seems unnecessary to diverge from GCC here.
Besides, lib/../$OSLibDir can be considered closer to the GCC
installation than the system root. The comment should not apply.

56700e93

[Driver] Gnu.cpp: remove unneeded -L detection hack for -mx32 · 0ad0c476
Fangrui Song authored Mar 20, 2021
```
Removing the hack actually improves our compatibility with gcc -mx32.
```
0ad0c476

[Driver] Gnu.cpp: remove unneeded -L detection for libc++ · 775a2948

Fangrui Song authored Mar 20, 2021

If clang is installed in the system, the other -L entries suffice;
otherwise $ccc_install_dir/../lib below suffices.

775a2948

[Driver] Gnu.cpp: remove unneeded -L lib/gcc/$triple/$version/../../../$triple · 06d6b147

Fangrui Song authored Mar 20, 2021

After path resolution, it duplicates a subsequent -L entry. The entry below
(lib/gcc/$triple/$version/../../../../$OSLibDir) usually does not exist (e.g.
Arch Linux; Debian cross gcc). When it exists, it typically just has ld.so (e.g.
Debian native gcc) which cannot cause collision. Removing the -L (similar to
reordering it) is therefore justified.

06d6b147

[RISCV] Add test case to show a case where (mul (and X, 0xffffffff), (and Y,... · 27bc30c3

Craig Topper authored Mar 20, 2021

[RISCV] Add test case to show a case where (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization does not improve code.

If the mul had two users, one of which was a sext.w, the mul
would also be selected to a MULW before our pattern runs. This
causes the ANDs to now be used by the already selected MULW and
the mul we still need to select. They are unneeded on the MULW
since MULW only reads the lower bits. So they get selected to
SLLI+SRLI for the MULW use. The use for the
(mul (and X, 0xffffffff), (and Y, 0xffffffff)) manages to reuse
the SLLI.

The end result is increased register pressure and no improvement
to how soon we can start the MULW.

27bc30c3

[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants. · 361b7d12

Chris Lattner authored Mar 19, 2021

This reapplies b5d9a3c9 / https://reviews.llvm.org/D98609 with a one line fix in
processExistingConstants to skip() when erasing a constant we've already seen.

Original commit message:

1) Change the canonicalizer to walk the function in top-down order instead of
bottom-up order. This composes well with the "top down" nature of constant
folding and simplification, reducing iterations and re-evaluation of ops in
simple cases.
2) Explicitly enter existing constants into the OperationFolder table before
canonicalizing. Previously we would "constant fold" them and rematerialize
them, wastefully recreating a bunch of constants, which led to pointless
memory traffic.
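
The benefit of the top-down order in point 1 can be illustrated with a toy model. This is not MLIR's worklist-driven driver: it assumes a straight-line chain of additions and simply counts how many full sweeps over the ops are needed before everything is folded. All names here are invented for the example.

```python
def count_folding_sweeps(ops, top_down):
    """Fold a list of (name, lhs, rhs) additions by repeated sweeps.

    Operands are either ints or names of ops defined earlier in the list.
    Returns the folded values and the number of sweeps that made progress.
    """
    known = {}
    sweeps = 0
    order = ops if top_down else list(reversed(ops))
    while True:
        progress = False
        for name, lhs, rhs in order:
            if name in known:
                continue
            left = lhs if isinstance(lhs, int) else known.get(lhs)
            right = rhs if isinstance(rhs, int) else known.get(rhs)
            if left is not None and right is not None:
                known[name] = left + right  # both operands constant: fold
                progress = True
        if not progress:
            break
        sweeps += 1
    return known, sweeps


CHAIN = [("a", 1, 2), ("b", "a", 3), ("c", "b", 4)]
print(count_folding_sweeps(CHAIN, top_down=True))
print(count_folding_sweeps(CHAIN, top_down=False))
```

In this toy, visiting definitions before uses folds the whole chain in one sweep, while the reverse order folds only one op per sweep.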

Both changes together provide a 33% speedup for canonicalize on some mid-size
CIRCT examples.

One artifact of this change is that the constants generated in normal pattern
application get inserted at the top of the function as the patterns are applied.
Because of this, we get "inverted" constants more often, which is an aesthetic
change to the IR but does permute some testcases.

Differential Revision: https://reviews.llvm.org/D99006

361b7d12

Revert "[IRSim] Adding basic implementation of llvm-sim." · 0776eca7
Andrew Litteken authored Mar 20, 2021
```
Causing build errors on the Windows Buildbots.

This reverts commit 5155dff2.
```
0776eca7

Mar 20, 2021

[RISCV] Update comment in RISCVInstrInfoM.td · b2bb0037
Jessica Clarke authored Mar 20, 2021
```
Missed in 07ed62b7.
```
b2bb0037

[RISCV] Disable (mul (and X, 0xffffffff), (and Y, 0xffffffff)) optimization when Zba is enabled. · 07ed62b7

Craig Topper authored Mar 20, 2021

This optimization is trying to save SRLI instructions needed to
implement the ANDs. If we have zext.w we won't save anything.
Because we don't check that the multiply is the only user of the
AND we might even increase instruction count.

07ed62b7

[RISCV] Add Zba command lines to xaluo.ll. NFC · 0874281d

Craig Topper authored Mar 20, 2021

Some of the patterns end up with 32 to 64 bit zero extends on RV64
which can be handled by zext.w.

0874281d

[test] Delete "-internal-isystem" "/usr/local/include" · 1fe1e996
Fangrui Song authored Mar 20, 2021

1fe1e996

[RISCV] Add isel pattern to optimize (mul (and X, 0xffffffff), (and Y, 0xffffffff)) on RV64 · b0d8823a

Craig Topper authored Mar 20, 2021

This pattern computes the full 64 bit product of a 32x32 unsigned
multiply. This requires two pairs of SLLI+SRLI to zero the
upper 32 bits of the inputs.

We can do better than this by using two SLLI to move the lower
bits to the upper bits then use MULHU to compute the product. This
is the high half of a full 64x64 product. Since we put 32 0s in the lower
bits of the inputs we know the 128-bit product will have zeros in the
lower 64 bits. So the upper 64 bits, which MULHU computes, will contain
the original 64 bit product we were after.

The same trick would work for (mul (sext_inreg X, i32), (sext_inreg Y, i32))
using MULHS, but sext_inreg is sext.w which is already one instruction so we
wouldn't save anything.
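
The SLLI+MULHU trick can be checked with plain integer arithmetic. The sketch below models the RV64 instructions with Python integers (the helper names are made up for this example) and compares the result against masking both inputs and multiplying directly.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def slli(value: int, shamt: int) -> int:
    """64-bit logical left shift (models RV64 SLLI)."""
    return (value << shamt) & MASK64


def mulhu(a: int, b: int) -> int:
    """Upper 64 bits of the unsigned 64x64 -> 128-bit product (models MULHU)."""
    return ((a & MASK64) * (b & MASK64)) >> 64


def mul_masked_reference(x: int, y: int) -> int:
    """(mul (and X, 0xffffffff), (and Y, 0xffffffff)) computed directly."""
    return ((x & MASK32) * (y & MASK32)) & MASK64


def mul_masked_via_mulhu(x: int, y: int) -> int:
    """Shift both inputs into the upper 32 bits, then take the high half."""
    return mulhu(slli(x, 32), slli(y, 32))


for x, y in [(3, 5), (0x1234567890ABCDEF, 0xFEDCBA0987654321), (MASK64, MASK64)]:
    assert mul_masked_via_mulhu(x, y) == mul_masked_reference(x, y)
```

Each shifted input is its low 32 bits times 2^32, so the 128-bit product is the wanted 64-bit product times 2^64, and the high half is exactly that product.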

Differential Revision: https://reviews.llvm.org/D99026

b0d8823a

[IRSim] Adding basic implementation of llvm-sim. · 5155dff2

Andrew Litteken authored Sep 17, 2020

This is a similarity visualization tool that accepts a Module and
passes it to the IRSimilarityIdentifier.  The resulting SimilarityGroups
are output in a JSON file.

Tests are found in test/tools/llvm-sim and check for the file not found,
a bad module, and that the JSON is created correctly.

Reviewers: paquette, jroelofs, MaskRay

Recommit of: 15645d04 to fix linking
errors.

Differential Revision: https://reviews.llvm.org/D86974

5155dff2

[AIX] Update rpath for BUILD_SHARED_LIBS · 14696baa

Jinsong Ji authored Mar 20, 2021

BUILD_SHARED_LIBS builds LLVM components as shared libraries,
which can reduce the size a lot.

Normally, the binary uses ORIGIN../lib to load component libraries;
unfortunately, ORIGIN is not supported by AIX ld.

For now, we hardcode the build lib and install lib paths in rpath
to enable BUILD_SHARED_LIBS builds.

We understand that this is not a perfect solution;
we can update it when we find a better one.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D98901

14696baa

[test] Fix Driver/gcc-toolchain.cpp if CLANG_DEFAULT_RTLIB is compiler-rt · f628ba0b
Fangrui Song authored Mar 20, 2021

f628ba0b

[BranchProbability] move options for 'likely' and 'unlikely' · ee8b5381

Sanjay Patel authored Mar 20, 2021

This makes the settings available for use in other passes by housing
them within the Support lib, but NFC otherwise.

See D98898 for the proposed usage in SimplifyCFG
(where this change was originally included).

Differential Revision: https://reviews.llvm.org/D98945

ee8b5381

[lld-macho] Minor touch-up to objc.s · 47fdaa32
Jez Ng authored Mar 20, 2021

47fdaa32
[AST] Ensure that an empty json file is generated if compile errors · 188405bc
Stephen Kelly authored Mar 17, 2021
```
Differential Revision: https://reviews.llvm.org/D98827
```
188405bc
[test] Fix Driver/gcc-toolchain.cpp if CLANG_DEFAULT_CXX_STDLIB is libc++ · e92faa77
Fangrui Song authored Mar 20, 2021

e92faa77
[VE] Fix types of multiclass template arguments in TableGen files · 879760c2
Fangrui Song authored Mar 20, 2021
```
These were not properly checked before `[TableGen] Improve handling of template arguments`.
```
879760c2
Revert "Revert "[Driver] Drop obsoleted Ubuntu 11.04 gcc detection"" · dc3b438c
Fangrui Song authored Mar 20, 2021
```
This reverts commit 243333ef.
```
dc3b438c