- Oct 07, 2020
-
-
Simon Pilgrim authored
Prep work before some cleanup in narrowMaskedBinOp
-
Amara Emerson authored
This change renames the intrinsics to not have "experimental" in the name. The autoupgrader will handle legacy intrinsics.
Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html
Differential Revision: https://reviews.llvm.org/D88787
-
Roman Lebedev authored
-
Philip Reames authored
-
Roman Lebedev authored
In some cases, we can negate an instruction if only one of its operands negates. Previously, we assumed that constants would already have been canonicalized to the RHS, but that isn't guaranteed to happen because of the InstCombine worklist visitation order, as the added (previously-hanging) test shows.
So if we only need to negate a single operand, we should ensure that we try the constant operand first. Do that by re-doing the complexity sorting ourselves when we actually care about it.
Fixes https://bugs.llvm.org/show_bug.cgi?id=47752
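A hypothetical illustration of the kind of pattern involved (not taken from the patch): negating a multiply only requires negating one operand, and trying the constant operand first gives the negation for free even when the constant has not yet been canonicalized to the RHS.

```llvm
define i8 @negate_mul(i8 %x) {
  %t = mul i8 42, %x      ; constant not yet canonicalized to the RHS
  %n = sub i8 0, %t       ; -(42 * %x)
  ret i8 %n
}

; negating only the constant operand is enough:
define i8 @negate_mul_neg(i8 %x) {
  %n = mul i8 -42, %x
  ret i8 %n
}
```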
-
Simon Pilgrim authored
-
- Oct 06, 2020
-
-
Dávid Bolvanský authored
-
Arthur Eubanks authored
Some of these depended on analyses being present that aren't provided automatically in NPM.
early_dce_clobbers_callgraph.ll was previously inlining a noinline function?
cast-call-combine.ll relied on the legacy always-inline pass being a CGSCC pass and getting rerun.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D88187
-
- Oct 05, 2020
-
-
Roman Lebedev authored
[InstCombine] Revert rL226781 "Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available." (PR47592)
(it was introduced in https://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html)
This canonicalization seems dubious. Most importantly, while it does not create `inttoptr` casts by itself, it may cause them to appear later; see e.g. D88788. I think it's pretty obvious that this is an undesirable outcome: by now we've established that seemingly no-op `inttoptr`/`ptrtoint` casts are not no-ops, and we are no longer eager to look past them. Which means, for example, that given
```
%a = load i32
%b = inttoptr %a
%c = inttoptr %a
```
we likely won't be able to tell that `%b` and `%c` are the same thing.
As we can see in D88789 / D88788 / D88806 / D75505, we can't really teach SCEV about this (not without https://bugs.llvm.org/show_bug.cgi?id=47592 at least), and we can't recover the situation post-inlining in InstCombine. So it really does look like this fold is actively breaking otherwise-good IR, in a way that is not recoverable. And that means this fold isn't helpful in exposing the patterns it produces to passes that are otherwise unaware of them.
Thus, I propose to simply not perform such a canonicalization. The original motivational RFC does not state what larger problem the canonicalization was trying to solve, so I'm not sure how this plays out in the larger picture.
On vanilla llvm test-suite + RawSpeed, this results in an increase of asm instructions and final object size by ~+0.05%, and decreases the final count of bitcasts by -4.79% (-28990), of ptrtoint casts by -15.41% (-3423), and of inttoptr casts by -25.59% (-6919, *sic*). Overall, there are -0.04% fewer IR blocks and -0.39% fewer instructions.
See https://bugs.llvm.org/show_bug.cgi?id=47592
Differential Revision: https://reviews.llvm.org/D88789
-
Dávid Bolvanský authored
This reverts commit 3f1fd59d.
-
Dávid Bolvanský authored
As reported in PR46735:

    void* f(void *d, const void *s, size_t l) {
      return __builtin___mempcpy_chk(d, s, l, __builtin_object_size(d, 0));
    }

This can be optimized to `return mempcpy(d, s, l);`.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D86019
-
Nikita Popov authored
When retrying the "simplify with operand replaced" select optimization without poison flags, also handle inbounds on GEPs. Of course, this particular example would also be safe to transform while keeping inbounds, but the underlying machinery does not know this (yet).
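A hedged sketch of the shape of IR this targets (function, names, and types are hypothetical, not the patch's actual test): on the %n == 0 path the inbounds GEP is just %base, so the whole select can be replaced by the GEP once the inbounds flag is dropped for the substitution reasoning.

```llvm
define i32* @sel_gep(i32* %base, i64 %n) {
  %cond = icmp eq i64 %n, 0
  %gep = getelementptr inbounds i32, i32* %base, i64 %n
  %sel = select i1 %cond, i32* %base, i32* %gep
  ; when %n == 0 the GEP equals %base, so %sel can be folded to the GEP
  ; (with inbounds dropped, per the conservative poison-flag handling)
  ret i32* %sel
}
```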
-
Nikita Popov authored
-
Simon Pilgrim authored
Added missing test coverage for the shl(add(and(lshr(x,c1),c2),y),c1) -> add(and(x,c2<<c1),shl(y,c1)) combine.
Renamed tests, as 'foo' and 'bar' aren't very extensible.
Added vector tests with undefs and nonuniform constants.
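For reference, a hand-worked instance of that combine, with c1 = 3 and c2 = 15 chosen purely for illustration:

```llvm
; before: shl(add(and(lshr(x,3),15),y),3)
define i32 @before(i32 %x, i32 %y) {
  %l = lshr i32 %x, 3
  %a = and i32 %l, 15
  %s = add i32 %a, %y
  %r = shl i32 %s, 3
  ret i32 %r
}

; after: add(and(x,15<<3),shl(y,3))
define i32 @after(i32 %x, i32 %y) {
  %m = and i32 %x, 120     ; 15 << 3
  %t = shl i32 %y, 3
  %r = add i32 %m, %t
  ret i32 %r
}
```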
-
Simon Pilgrim authored
If we know the shift amount is less than the bitwidth, we should be able to convert this to a funnel shift.
-
Simon Pilgrim authored
Fixes OSS Fuzz #26135
-
- Oct 03, 2020
-
-
Roman Lebedev authored
-
Simon Pilgrim authored
Some initial test coverage toward fixing PR46896 - these are just copied from rotate.ll
-
Simon Pilgrim authored
If we know the shift amount is less than the bitwidth, we should be able to convert this to a rotate/funnel shift.
-
Simon Pilgrim authored
Add basic vector handling to recognizeBSwapOrBitReverseIdiom/collectBitParts - this works at the element level: all vector element operations must match (splat constants etc.), and there is no cross-element support (insert/extract/shuffle etc.).
-
Simon Pilgrim authored
[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) (Reapplied)
If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask, which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.
Reapplied with an early-out if recognizeBSwapOrBitReverseIdiom collects a source wider than the result type.
Differential Revision: https://reviews.llvm.org/D88578
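A hypothetical example of the "bswap some bytes, zero the remainder" shape (function name and constants chosen for illustration): only the low two bytes of %x are byte-reversed into the high half, so the whole thing is a full bswap followed by a mask.

```llvm
define i32 @partial_bswap(i32 %x) {
  %b0 = and i32 %x, 255          ; byte 0
  %b1 = and i32 %x, 65280        ; byte 1
  %s0 = shl i32 %b0, 24
  %s1 = shl i32 %b1, 8
  %r  = or i32 %s0, %s1
  ; can be rewritten as a full bswap followed by a mask:
  ;   %bs = call i32 @llvm.bswap.i32(i32 %x)
  ;   %r  = and i32 %bs, 4294901760   ; 0xFFFF0000
  ret i32 %r
}
```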
-
- Oct 02, 2020
-
-
Simon Pilgrim authored
Revert rG3d14a1e982ad27 - "[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191)"
This reverts commit 3d14a1e9. This is breaking on some 2-stage clang buildbots.
-
Simon Pilgrim authored
-
Simon Pilgrim authored
If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask, which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.
Differential Revision: https://reviews.llvm.org/D88578
-
Simon Pilgrim authored
-
Simon Pilgrim authored
We get the vNi16 cases already via matching as a rotate followed by the fshl -> bswap combines
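A minimal sketch of why the vNi16 case already falls out (types chosen for illustration): rotating each i16 element by 8 swaps its two bytes, so the rotate match plus the existing fshl -> bswap fold covers it.

```llvm
declare <4 x i16> @llvm.fshl.v4i16(<4 x i16>, <4 x i16>, <4 x i16>)

define <4 x i16> @rot8(<4 x i16> %x) {
  ; rotate-left by 8 on an i16 element is a per-element byte swap
  %r = call <4 x i16> @llvm.fshl.v4i16(<4 x i16> %x, <4 x i16> %x, <4 x i16> <i16 8, i16 8, i16 8, i16 8>)
  ; equivalent to: call <4 x i16> @llvm.bswap.v4i16(<4 x i16> %x)
  ret <4 x i16> %r
}
```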
-
Simon Pilgrim authored
-
- Oct 01, 2020
-
-
Nikita Popov authored
When replacing X == Y ? f(X) : Z with X == Y ? f(Y) : Z, make sure that Y cannot be undef. If it may be undef, we might end up picking different values for the undef in the comparison and in the select operand.
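A hypothetical illustration of the hazard (names made up): if %y may be undef, the icmp and a substituted use of %y could each choose a different concrete value for the undef, so rewriting the true arm to use %y instead of %x would be unsound.

```llvm
define i32 @sel(i32 %x, i32 %y, i32 %z) {
  %cmp = icmp eq i32 %x, %y
  %add = add i32 %x, 1              ; f(%x)
  ; replacing %x with %y inside %add is only safe if %y is not undef
  %sel = select i1 %cmp, i32 %add, i32 %z
  ret i32 %sel
}
```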
-
Sanjay Patel authored
-
- Sep 30, 2020
-
-
Simon Pilgrim authored
As mentioned on PR47191, if we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask, which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.
-
Simon Pilgrim authored
Use getScalarSizeInBits, not getPrimitiveSizeInBits, to determine the shift value at the element level.
-
Simon Pilgrim authored
Add tests showing failure to correctly fold vector bswap(trunc(bswap(x))) intrinsic patterns
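For context, a scalar instance of the bswap(trunc(bswap(x))) pattern (i32 -> i16, names chosen for illustration); mathematically it is equivalent to truncating x >> 16, and the tests cover the vector equivalent that was not yet folded.

```llvm
declare i32 @llvm.bswap.i32(i32)
declare i16 @llvm.bswap.i16(i16)

define i16 @bswap_trunc_bswap(i32 %x) {
  %b = call i32 @llvm.bswap.i32(i32 %x)
  %t = trunc i32 %b to i16
  %r = call i16 @llvm.bswap.i16(i16 %t)
  ; equivalent to: trunc (lshr i32 %x, 16) to i16
  ret i16 %r
}
```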
-
Simon Pilgrim authored
Appease the update_test_checks script, which was complaining about potential %TMP clashes.
-
Simon Pilgrim authored
Appease the update_test_checks script, which was complaining about potential %TMP clashes.
-
Simon Pilgrim authored
-
Simon Pilgrim authored
PR39793 demonstrated an issue where we fail to recognize 'partial' bswap patterns of the lower bytes of an integer source.
In fact, most of this is already in place: collectBitParts suitably tags zero bits, so we just need to handle this case correctly by finding the zero'd upper bits and reducing the bswap pattern to just the active demanded bits.
Differential Revision: https://reviews.llvm.org/D88316
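A hypothetical PR39793-style pattern (function name and width chosen for illustration): a byte swap of just the low 16 bits performed in i32, where the zero'd upper bits mean only the low half is demanded.

```llvm
define i32 @low_half_bswap(i32 %x) {
  %lo  = and i32 %x, 255         ; byte 0
  %hi  = lshr i32 %x, 8
  %hib = and i32 %hi, 255        ; byte 1
  %shl = shl i32 %lo, 8
  %r   = or i32 %shl, %hib
  ; equivalent to: zext (bswap (trunc %x to i16)) to i32 --
  ; the upper 16 bits of the result are known zero
  ret i32 %r
}
```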
-
- Sep 29, 2020
-
-
Sanjay Patel authored
I think we initially made this fold conservative to be safer, but we do not need the alignment attribute/metadata limitation because the masked load intrinsic itself specifies the alignment. A normal vector load is better for IR transforms and should be no worse in codegen than the masked alternative. If it is worse for some target, the backend can reverse this transform.
Differential Revision: https://reviews.llvm.org/D88505
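A minimal sketch, assuming the case in question is a masked load whose mask permits loading the whole vector (e.g. an all-ones mask); the alignment comes from the intrinsic's own alignment operand rather than any attribute or metadata.

```llvm
declare <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>*, i32, <4 x i1>, <4 x i32>)

define <4 x i32> @all_true_mask(<4 x i32>* %p) {
  ; the alignment (4) is carried by the intrinsic's alignment operand
  %v = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %p, i32 4, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x i32> undef)
  ; would fold to: %v = load <4 x i32>, <4 x i32>* %p, align 4
  ret <4 x i32> %v
}
```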
-
Sanjay Patel authored
The test after the changed test was checking exactly the same dereferenceable bytes.
-
Simon Pilgrim authored
Attempt to fold trunc (*shr (trunc A), C) --> trunc(*shr A, C) iff the shift amount is small enough that all zero/sign bits created by the shift are removed by the last trunc.
Helps fix the regressions encountered in D88316.
I've tweaked a couple of shift values, as suggested by @lebedev.ri, to ensure we have coverage of shift values close to (above/below) the max limit.
Differential Revision: https://reviews.llvm.org/D88429
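A worked instance of that fold, with the types (i32 -> i16 -> i8) and the shift amount (4) chosen purely for illustration; the zero bits the lshr shifts in land in bits 8-15 and are discarded by the final trunc, so the shift can be done in the wide type.

```llvm
; before: trunc (lshr (trunc A), 4)
define i8 @before(i32 %a) {
  %t1 = trunc i32 %a to i16
  %s  = lshr i16 %t1, 4
  %t2 = trunc i16 %s to i8
  ret i8 %t2
}

; after: trunc (lshr A, 4)
define i8 @after(i32 %a) {
  %s = lshr i32 %a, 4
  %t = trunc i32 %s to i8
  ret i8 %t
}
```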
-
Sanjay Patel authored
It apparently didn't cause trouble for the parser or FileCheck, but it was confusing to see a function def split by asserts.
-