Commits · af1c5312d76000bf134d8b81cdb7343607c6ee64 · Lorenzo Albano / LLVM bpEVL

Sep 21, 2021
- [InstCombine] add tests for mask-shift with trunc; NFC · af1c5312
  Sanjay Patel authored Sep 20, 2021
  
  af1c5312
- [InstCombine] foldConstantInsEltIntoShuffle - bail if we fail to find constant element (PR51824) · fc8f1e44
  Simon Pilgrim authored Sep 21, 2021
```
If getAggregateElement() returns null for any element, bail out early; otherwise we will assert when creating a new constant vector.

Fixes PR51824 and OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38057
```
  fc8f1e44
- [InstCombine] Eliminate vector reverse if all inputs/outputs to an instruction are reverses · f417d9d8
  Usman Nadeem authored Sep 20, 2021
```
Differential Revision: https://reviews.llvm.org/D109808

Change-Id: I1a10d2bc33acbe0ea353c6cb3d077851391fe73e
```
  f417d9d8
Sep 20, 2021

[IR] Add helper to convert offset to GEP indices · dd022656

Nikita Popov authored Sep 19, 2021

We implement logic to convert a byte offset into a sequence of GEP
indices for that offset in a number of places. This patch adds a
DataLayout::getGEPIndicesForOffset() method, which implements the
core logic. I've updated SROA, ConstantFolding and InstCombine to
use it, and there are a few more places where it looks relevant.

Differential Revision: https://reviews.llvm.org/D110043

dd022656
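
To make the offset-to-indices conversion described for dd022656 concrete, here is a minimal Python sketch. It is not the LLVM implementation: the `IntTy`, `ArrayTy` and `StructTy` classes and the function `gep_indices_for_offset` are hypothetical names, structs are assumed to be packed (no padding), and every type is assumed to have a non-zero size.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class IntTy:
    size: int  # size in bytes


@dataclass(frozen=True)
class ArrayTy:
    elem: "Ty"
    count: int


@dataclass(frozen=True)
class StructTy:
    fields: Tuple["Ty", ...]  # assumption: packed layout, no padding


Ty = Union[IntTy, ArrayTy, StructTy]


def size_of(ty: Ty) -> int:
    """Size of a type in bytes under the simplified layout above."""
    if isinstance(ty, IntTy):
        return ty.size
    if isinstance(ty, ArrayTy):
        return ty.count * size_of(ty.elem)
    return sum(size_of(f) for f in ty.fields)


def gep_indices_for_offset(ty: Ty, offset: int) -> Tuple[List[int], int]:
    """Return (indices, leftover byte offset inside the final scalar)."""
    # The first index steps over whole objects of type `ty`.
    # Floor division keeps the remaining offset non-negative.
    size = size_of(ty)
    first = offset // size
    indices = [first]
    offset -= first * size
    # Invariant: 0 <= offset < size_of(ty) on every iteration.
    while not isinstance(ty, IntTy):
        if isinstance(ty, ArrayTy):
            elem_size = size_of(ty.elem)
            index = offset // elem_size
            indices.append(index)
            offset -= index * elem_size
            ty = ty.elem
        else:
            # Find the struct field that contains the offset.
            start = 0
            for index, field_ty in enumerate(ty.fields):
                field_size = size_of(field_ty)
                if offset < start + field_size:
                    break
                start += field_size
            indices.append(index)
            offset -= start
            ty = field_ty
    return indices, offset
```

For a struct of a 4-byte integer followed by four 2-byte integers, byte offset 8 maps to indices `[0, 1, 2]` with nothing left over. A non-zero leftover means the offset points into the middle of a scalar.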

[Analysis] Add support for vscale in computeKnownBitsFromOperator · f988f680

David Sherwood authored Sep 16, 2021

In ValueTracking.cpp we use a function called
computeKnownBitsFromOperator to determine the known bits of a value.
For the vscale intrinsic, if the function carries the vscale_range
attribute, we can use the minimum and maximum values of vscale to
determine some known zero and one bits. This should improve code
quality by allowing certain optimisations to take place.

Tests added here:

  Transforms/InstCombine/icmp-vscale.ll

Differential Revision: https://reviews.llvm.org/D109883

f988f680
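
The idea in f988f680 of deriving known bits from a minimum and maximum can be illustrated with a small Python sketch. It uses a generic technique (every value in an unsigned range shares the common high-bit prefix of the two bounds) and is not claimed to match what computeKnownBitsFromOperator does; `known_bits_from_range` is a hypothetical helper name.

```python
from typing import Tuple


def known_bits_from_range(lo: int, hi: int, width: int) -> Tuple[int, int]:
    """Return (known_zero, known_one) masks valid for every v in [lo, hi]."""
    assert 0 <= lo <= hi < (1 << width)
    mask = (1 << width) - 1
    # Bits at or below the highest bit where lo and hi differ are unknown.
    unknown = (1 << (lo ^ hi).bit_length()) - 1
    # Above that position every value in the range agrees with lo.
    known_one = lo & ~unknown & mask
    known_zero = ~lo & ~unknown & mask
    return known_zero, known_one
```

With a range of 1 to 16 on a 32-bit value, the low five bits stay unknown and the upper 27 bits are known zero. When the minimum equals the maximum, every bit is known.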

Sep 19, 2021
- [InstCombine] add/adjust tests for min/max intrinsics; NFC · 9555d1ed
  Sanjay Patel authored Sep 19, 2021
```
If we transform these, we have to propagate no-wrap/undef carefully.
```
  9555d1ed
Sep 18, 2021

[Tests] Fix noalias metadata in one more test · abe21da6

Nikita Popov authored Sep 18, 2021

Missed this one in 80110aaf. This is another test mixing up alias
scopes and alias scope lists.

abe21da6

[Tests] Fix incorrect noalias metadata · 80110aaf

Nikita Popov authored Sep 16, 2021

Mostly this fixes cases where !noalias or !alias.scope were passed
a scope rather than a scope list. In some cases I opted to drop
the metadata entirely instead, because it is not really relevant
to the test.

80110aaf

Precommit tests for D109807 "[InstCombine] Narrow type of logical operation chains when possible" · d841c72e
Usman Nadeem authored Sep 18, 2021
```
Change-Id: Iae9bf18619e4926301a866c7e2bd38ced524804e
```
d841c72e

[AArch64][SVE][InstCombine] Fold redundant zip1/2(uzp1/2) operations · 757384ab

Usman Nadeem authored Sep 12, 2021

    zip1(uzp1(A, B), uzp2(A, B)) --> A
    zip2(uzp1(A, B), uzp2(A, B)) --> B

Differential Revision: https://reviews.llvm.org/D109666

Change-Id: I4a6578db2fcef9ff71ad0e77b9fe08354e6dbfcd

757384ab
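
The two folds in 757384ab can be checked on a toy model. The sketch below models fixed-length vectors as Python lists of even length and assumes the usual element-wise meaning of the operations: `uzp1`/`uzp2` take the even/odd elements of the concatenated inputs, and `zip1`/`zip2` interleave the low/high halves. The function names mirror the intrinsics for readability only; none of this is LLVM API.

```python
from typing import List


def uzp1(a: List[int], b: List[int]) -> List[int]:
    """Even-indexed elements of the concatenation a ++ b."""
    return (a + b)[0::2]


def uzp2(a: List[int], b: List[int]) -> List[int]:
    """Odd-indexed elements of the concatenation a ++ b."""
    return (a + b)[1::2]


def _interleave(x: List[int], y: List[int]) -> List[int]:
    out: List[int] = []
    for lhs, rhs in zip(x, y):
        out += [lhs, rhs]
    return out


def zip1(a: List[int], b: List[int]) -> List[int]:
    """Interleave the low halves of a and b."""
    half = len(a) // 2
    return _interleave(a[:half], b[:half])


def zip2(a: List[int], b: List[int]) -> List[int]:
    """Interleave the high halves of a and b."""
    half = len(a) // 2
    return _interleave(a[half:], b[half:])
```

Unzipping splits A and B into even and odd elements, with A's elements in the low halves and B's in the high halves, so re-zipping the low halves rebuilds A and re-zipping the high halves rebuilds B.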

Sep 17, 2021
- [InstCombine] add tests for min/max intrinsics with offset operand; NFC · 6da35036
  Sanjay Patel authored Sep 17, 2021
  
  6da35036
- [NFC] Precommit tests for D109954 · d01e0c8c
  Dávid Bolvanský authored Sep 17, 2021
  
  d01e0c8c
- [InstCombine] allow splat vectors for narrowing masked fold · 41ff7612
  Sanjay Patel authored Sep 17, 2021
```
Mostly cosmetic diffs, but the use of m_APInt matches splat constants.
```
  41ff7612
- [InstCombine] add vector tests for 'and' folds; NFC · 3a587ed2
  Sanjay Patel authored Sep 17, 2021
  
  3a587ed2
Sep 16, 2021
- [InstCombine] Added llvm.powi optimizations · a4a426c9
  Dávid Bolvanský authored Sep 16, 2021
```
If power is even:
powi(-x, p) -> powi(x, p)
powi(fabs(x), p) -> powi(x, p)
powi(copysign(x, y), p) -> powi(x, p)
```
  a4a426c9
- [NFC] Added tests for llvm.powi optimizations · c0afb009
  Dávid Bolvanský authored Sep 16, 2021
  
  c0afb009
- Revert "[InstCombine] Improve TryToSinkInstruction with multiple uses" · f9e4aebe
  Anna Thomas authored Sep 15, 2021
```
This reverts commit 4ac4e521.
There are a couple of test failures, which need the test cases to be updated.

Doing a clean revert; will recommit the change along with the fixed
test cases.
```
  f9e4aebe
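
The llvm.powi folds listed above for a4a426c9 rely on an even power discarding the sign of the base. The following Python sketch checks that numerically. The `powi` function here is a hypothetical stand-in implemented with repeated squaring, not the intrinsic's actual lowering.

```python
import math


def powi(x: float, p: int) -> float:
    """Raise x to an integer power by repeated squaring (illustrative only)."""
    if p < 0:
        return 1.0 / powi(x, -p)
    result = 1.0
    base = x
    while p:
        if p & 1:
            result *= base
        base *= base
        p >>= 1
    return result


def folds_hold(x: float, y: float, p: int) -> bool:
    """Check the three sign-dropping identities for one input triple."""
    expected = powi(x, p)
    return (
        powi(-x, p) == expected
        and powi(math.fabs(x), p) == expected
        and powi(math.copysign(x, y), p) == expected
    )
```

For an even `p`, the lowest bit of the exponent is clear, so only squared (non-negative) factors enter the product and the sign of `x` cannot affect the result. For an odd `p` the identities fail, for example `powi(-2.0, 3)` differs from `powi(2.0, 3)`.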
Sep 15, 2021

[InstCombine] Improve TryToSinkInstruction with multiple uses · 4ac4e521

Anna Thomas authored Sep 15, 2021

This patch allows sinking an instruction which can have multiple uses in a
single user. We were previously over-restrictive by looking for exactly one use,
rather than one user.

Also, the API for retrieving the undroppable user has been updated accordingly,
since in both use cases (Attributor and InstCombine) we seem to care about the
user rather than the use.

Reviewed-By: nikic

Differential Revision: https://reviews.llvm.org/D109700

4ac4e521
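
The use-versus-user distinction in 4ac4e521 can be shown with a toy IR model. This Python sketch is only an illustration: the `Inst` class and the `has_one_use` and `has_one_user` helpers are hypothetical and ignore droppable uses entirely.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(eq=False)
class Inst:
    name: str
    operands: List["Inst"] = field(default_factory=list)


def uses_of(value: Inst, insts: List[Inst]) -> List[Tuple[Inst, int]]:
    """Every (user, operand index) slot that refers to `value`."""
    return [
        (user, idx)
        for user in insts
        for idx, op in enumerate(user.operands)
        if op is value
    ]


def has_one_use(value: Inst, insts: List[Inst]) -> bool:
    """The stricter check: exactly one operand slot refers to the value."""
    return len(uses_of(value, insts)) == 1


def has_one_user(value: Inst, insts: List[Inst]) -> bool:
    """The relaxed check: all uses belong to a single instruction."""
    return len({id(user) for user, _ in uses_of(value, insts)}) == 1


# x is used twice, but only by one instruction: square = mul x, x
x = Inst("x")
square = Inst("mul", [x, x])
program = [x, square]
```

Here `has_one_use(x, program)` is false while `has_one_user(x, program)` is true, which is the kind of case the one-use check rejected.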

[InstCombine] move extend after insertelement if both operands are extended · e5a32d72

Sanjay Patel authored Sep 15, 2021

I was wondering how instcombine does on the examples in D109236,
and we're missing a basic transform:

inselt (ext X), (ext Y), Index --> ext (inselt X, Y, Index)

https://alive2.llvm.org/ce/z/z2aBu9

Note that there are several possible extensions of this fold
(see TODO comments).

Differential Revision: https://reviews.llvm.org/D109537

e5a32d72
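
The fold in e5a32d72 says that extending before or after the insert gives the same vector. A small Python sketch makes that visible, using sign extension from 8 to 32 bits as the example cast. Vectors are modelled as lists of bit patterns, and `sext`, `ext_vec` and `insert_element` are hypothetical helpers, not LLVM API.

```python
from typing import List

FROM_BITS = 8
TO_BITS = 32


def sext(value: int) -> int:
    """Sign-extend an 8-bit pattern to a 32-bit pattern."""
    sign = 1 << (FROM_BITS - 1)
    signed = (value ^ sign) - sign
    return signed & ((1 << TO_BITS) - 1)


def ext_vec(vec: List[int]) -> List[int]:
    """Apply the extension to every element."""
    return [sext(elem) for elem in vec]


def insert_element(vec: List[int], scalar: int, index: int) -> List[int]:
    """Return a copy of vec with one lane replaced."""
    out = list(vec)
    out[index] = scalar
    return out


def before(x: List[int], y: int, index: int) -> List[int]:
    # inselt (ext X), (ext Y), Index
    return insert_element(ext_vec(x), sext(y), index)


def after(x: List[int], y: int, index: int) -> List[int]:
    # ext (inselt X, Y, Index)
    return ext_vec(insert_element(x, y, index))
```

Both forms produce the same lanes for every index, but the second needs one extension instead of two.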

[InstCombine] Update test checks through autogeneration, add more tests. NFC · 36ef65ad
Anna Thomas authored Sep 15, 2021
```
Updated check lines.
Tests precommitted from D109700.
```
36ef65ad

[InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y) · f5d89523

Filipp Zhinkin authored Sep 15, 2021

Re-enabled the mul folding optimization that was previously disabled
because it was incorrect.
To preserve correctness, the mul operand that is not compared
with zero in the select's condition is now frozen.

Related bug: https://bugs.llvm.org/show_bug.cgi?id=51286

Correctness:
https://alive2.llvm.org/ce/z/bHef7J
https://alive2.llvm.org/ce/z/QcR7sf
https://alive2.llvm.org/ce/z/vvBLzt
https://alive2.llvm.org/ce/z/jGDXgq
https://alive2.llvm.org/ce/z/3Pe8Z4
https://alive2.llvm.org/ce/z/LGga8M
https://alive2.llvm.org/ce/z/CTG5fs

Differential Revision: https://reviews.llvm.org/D108408

f5d89523
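
Why f5d89523 needs the freeze can be seen in a toy poison model. The Python sketch below is a simplification under stated assumptions: poison is a sentinel object, `mul` and `icmp_eq` propagate it, `select` is poison only when its condition is, and `freeze` replaces poison with an arbitrary caller-chosen value. All names are hypothetical and the integer width is 2 bits to keep the search space tiny. The Alive2 links in the commit remain the authoritative proofs.

```python
POISON = object()
BITS = 2
MASK = (1 << BITS) - 1


def mul(a, b):
    if a is POISON or b is POISON:
        return POISON
    return (a * b) & MASK


def icmp_eq(a, b):
    if a is POISON or b is POISON:
        return POISON
    return a == b


def select(cond, if_true, if_false):
    if cond is POISON:
        return POISON
    return if_true if cond else if_false


def freeze(value, pick):
    """Poison becomes an arbitrary but fixed value, modelled by `pick`."""
    return pick if value is POISON else value


def source(x, y):
    # X == 0 ? 0 : X * Y
    return select(icmp_eq(x, 0), 0, mul(x, y))


def target(x, y, pick):
    # X * freeze(Y)
    return mul(x, freeze(y, pick))


def refines(src, tgt) -> bool:
    """A target result is acceptable if the source was poison or they match."""
    return src is POISON or tgt == src
```

With `X = 0` and `Y = poison`, the source yields 0 because the select never picks the multiply. A plain `X * Y` would yield poison there, while `X * freeze(Y)` yields 0 for any frozen value.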

Sep 14, 2021

[ARM] Teach DemandedVectorElts about VMOVN lanes · 5a6dfbb8

David Green authored Sep 14, 2021

Instructions that write only to the narrow top or bottom lanes demand
only the even or odd elements of the input lanes, which means that a
VMOVNT; VMOVNB pair demands no lanes from the original input. This
patch teaches that to instcombine via the target hooks available
through ARMTTIImpl.

Differential Revision: https://reviews.llvm.org/D109325

5a6dfbb8

Sep 13, 2021
- [InstCombine] Add PR51784 test cases · 6d970e83
  Simon Pilgrim authored Sep 11, 2021
  
  6d970e83
Sep 12, 2021

[InstCombine] remove casts from splat-a-bit pattern · 3a126134

Sanjay Patel authored Sep 12, 2021

https://alive2.llvm.org/ce/z/_AivbM

This case seems clear since we can reduce instruction count
and avoid an intermediate type change, but we might want to
use mask-and-compare for other sequences.

Currently, we can generate more instructions on some related
patterns by trying to use bit-hacks instead of mask+cmp, so
something is not behaving as expected.

3a126134

Sep 11, 2021

[InstCombine] update code/test comments; NFC · 75e8eb2b

Sanjay Patel authored Sep 11, 2021

Follow-up for post-commit suggestion on:
28afaed6

The comments were partly copied from the original
code, but not updated to match the new code.

75e8eb2b

[InstCombine] fold sub of min/max intrinsics with invertible ops · 28afaed6

Sanjay Patel authored Sep 11, 2021

This is a translation of the existing code to handle the intrinsics
and another step towards D98152.

https://alive2.llvm.org/ce/z/jA7eBC

This pattern is already handled by underlying folds if there are
fewer uses, so the minimal tests in this case have extra uses.

The larger cmyk tests show the motivation - when combined with
other folds, we invert a larger sequence and eliminate 'not' ops.

28afaed6

Revert "Revert "[AArch64][SVE][InstCombine] Canonicalize aarch64_sve_dup_x... · ab111e98

Usman Nadeem authored Sep 10, 2021

Revert "Revert "[AArch64][SVE][InstCombine] Canonicalize aarch64_sve_dup_x intrinsic to IR splat operation""

This reverts commit eee7d225.
Effectively relanding 98c37247
after fixing the failing tests.

Change-Id: I5d7461aeb820a2d5f1895457d824a8de4d316ee5

ab111e98

Sep 10, 2021

Revert "[AArch64][SVE][InstCombine] Canonicalize aarch64_sve_dup_x intrinsic to IR splat operation" · eee7d225
Usman Nadeem authored Sep 10, 2021
```
This reverts commit 98c37247.
```
eee7d225

[AArch64][SVE][InstCombine] Canonicalize aarch64_sve_dup_x intrinsic to IR splat operation · 98c37247
Usman Nadeem authored Sep 10, 2021
```
Differential Revision: https://reviews.llvm.org/D109118

Change-Id: I47adc1984a54bea02bf5a0a767b765afe7e16aa3
```
98c37247

[InstCombine] add tests for sub of min/max intrinsics; NFC · 188375f4
Sanjay Patel authored Sep 10, 2021

188375f4

[OpaquePtr] Forbid mixing typed and opaque pointers · 90ec6dff

Nikita Popov authored Sep 04, 2021

Currently, opaque pointers are supported in two forms: the
-force-opaque-pointers mode, where all pointers are opaque and
typed pointers do not exist, and a simple ptr type that can
coexist with typed pointers.

This patch removes support for the mixed mode. You either get
typed pointers, or you get opaque pointers, but not both. In the
(current) default mode, using ptr is forbidden. In -opaque-pointers
mode, all pointers are opaque.

The motivation here is that the mixed mode introduces additional
issues that don't exist in fully opaque mode. D105155 is an example
of a design problem. Looking at D109259, it would probably need
additional work to support mixed mode (e.g. to generate GEPs for
typed base but opaque result). Mixed mode will also end up
inserting many casts between i8* and ptr, which would require
significant additional work to consistently avoid.

I don't think the mixed mode is particularly valuable, as it
doesn't align with our end goal. The only thing I've found it to
be moderately useful for is adding some opaque pointer tests in
between typed pointer tests, but I think we can live without that.

Differential Revision: https://reviews.llvm.org/D109290

90ec6dff

[InstCombine] add tests for X == 0 ? 0 : X * Y ; NFC · 745f82b8
Filipp Zhinkin authored Sep 10, 2021
```
These are the tests for D108408 with current baseline results.
```
745f82b8

Sep 09, 2021

[InstCombine] add tests for insertelement with cast ops; NFC · 3cb5aa86
Sanjay Patel authored Sep 09, 2021

3cb5aa86

[InstCombine] remove a buggy set of zext-icmp transforms · 97a4e7b7

Sanjay Patel authored Sep 09, 2021

The motivating case is an infinite loop shown with a reduced test from:
https://llvm.org/PR51762

To solve this, I'm proposing we delete the most obviously broken part of this code.

The bug example shows a fundamental problem: we ask computeKnownBits if a transform
will be profitable, alter the code by creating new instructions, then rely on
computeKnownBits to return the same answer to actually eliminate instructions.

But there's no guarantee that the results will be the same between the 1st and 2nd
calls. In the infinite loop example, we get different answers, so we add
instructions that conflict with some other transform, and we're stuck.

There's at least one other problem visible in the test diff for
`@zext_or_masked_bit_test_uses`: the code doesn't check uses properly, so we can
end up with extra instructions created.

Last, it's not clear if this set of transforms actually improves analysis or
codegen. I spot-checked a few targets and don't see a clear win:
https://godbolt.org/z/x87EWovso

If we do see a regression from this change, codegen seems like the right place to
add a cmp -> bit-hack fold.

If this is too big of a step, we could limit the computeKnownBits calls by not
passing a context instruction and/or limiting the recursion. I checked that those
would stop the infinite loop for PR51762, but that won't guarantee that some other
example does not fall into the same loop.

Differential Revision: https://reviews.llvm.org/D109440

97a4e7b7

Sep 08, 2021
- [InstCombine] add test for zext with 'or' op; NFC · b041b613
  Sanjay Patel authored Sep 08, 2021
  
  b041b613
- [InstCombine] remove unnecessary instructions from test; NFC · 5639946d
  Sanjay Patel authored Sep 08, 2021
  
  5639946d
Sep 07, 2021

[InstCombine] fold icmp equality with 'or' mask ops · a3c1669b

Sanjay Patel authored Sep 07, 2021

This could go either direction since the instruction
count is the same either way, but there are a few
reasons to prefer this:
1. We already do the related transform with 'and'
   (see just above the new code).
2. We try (too hard) to compensate for not having this
   and possibly other folds in transformZExtICmp(),
   and that leads to bugs like https://llvm.org/PR51762 .
3. Codegen looks better across a variety of targets.

https://alive2.llvm.org/ce/z/uEgn4P

a3c1669b

[InstCombine] add tests for icmp with 'or' ops; NFC · 9565457a
Sanjay Patel authored Sep 07, 2021

9565457a

[ConstFold] Support opaque pointers in constexpr GEPs · 58db5f6e

Nikita Popov authored Jul 20, 2021

Support opaque pointers in SymbolicallyEvaluateGEP() by using the
value type of a GlobalValue base or falling back to i8 if there
isn't one. We don't unconditionally generate i8 GEPs here because
that would lose inrange attributes, and because some optimizations
on globals currently rely on GEP types (e.g. the globals SROA
mentioned in the comment).

Differential Revision: https://reviews.llvm.org/D109297

58db5f6e

Reland "[InstCombine] Recognize `((x * y) s/ x) !=/== y` as an signed... · 35fa7b8a

Roman Lebedev authored Sep 07, 2021

Reland "[InstCombine] Recognize `((x * y) s/ x) !=/== y` as an signed multiplication overflow check (PR48769)"

This reverts commit 91f7a4ff,
relanding commit 13ec913b.

The original commit was reverted because of (essentially)
https://bugs.llvm.org/show_bug.cgi?id=35922
which has now been addressed by d0eeb64b.

35fa7b8a
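
The pattern recognised in 35fa7b8a can be checked exhaustively at a small bit width. The Python sketch below models 8-bit two's complement arithmetic and compares the divide-and-compare idiom against a direct overflow test. The helper names are hypothetical. Inputs where the source division is undefined (a zero divisor, or the minimum value divided by -1) are skipped, on the assumption that a fold may produce any result for those.

```python
BITS = 8
LO = -(1 << (BITS - 1))
HI = (1 << (BITS - 1)) - 1


def wrap(value: int) -> int:
    """Wrap an unbounded integer to signed BITS-wide two's complement."""
    value &= (1 << BITS) - 1
    return value - (1 << BITS) if value > HI else value


def sdiv(a: int, b: int) -> int:
    """Signed division that truncates toward zero."""
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient


def div_check(x: int, y: int) -> bool:
    # ((x * y) s/ x) != y
    return sdiv(wrap(x * y), x) != y


def smul_overflows(x: int, y: int) -> bool:
    """True when the exact product does not fit in BITS signed bits."""
    return not (LO <= x * y <= HI)


def is_defined(x: int, y: int) -> bool:
    """The idiom is only meaningful when the division itself is defined."""
    return x != 0 and not (x == -1 and wrap(x * y) == LO)
```

When the product wraps, it differs from the exact product by a multiple of 256, which is larger in magnitude than any 8-bit divisor, so the truncated quotient can never equal `y` again.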