Commits · 0eee844539e406dfa8010a129ea3655d2298ac10 · Lorenzo Albano / LLVM bpEVL

Nov 30, 2021

[DebugInfo][InstrRef] Terminate overlapping variable fragments · 0eee8445

Jeremy Morse authored Nov 29, 2021

If we have a variable where its fragments are split into overlapping
segments:

    DBG_VALUE $ax, $noreg, !123, !DIExpression(DW_OP_LLVM_fragment, 0, 16)
    ...
    DBG_VALUE $eax, $noreg, !123, !DIExpression(DW_OP_LLVM_fragment, 0, 32)

we should only propagate the most recently assigned fragment out of a
block. LiveDebugValues only deals with live-in variable locations, as
overlaps within blocks are DbgEntityHistoryCalculator's domain.

InstrRefBasedLDV has kept the accumulateFragmentMap method from
VarLocBasedLDV; we just need it to recognise DBG_INSTR_REFs. Once it's
produced a mapping of variable / fragments to the overlapped variable /
fragments, VLocTracker uses it to identify when a debug instruction needs
to terminate the other parts it overlaps with. The test is updated for
some standard "InstrRef picks different registers" variation, and the
order of some unrelated DBG_VALUEs changes.
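
As a rough illustration of the overlap bookkeeping described above (this is not
the InstrRefBasedLDV code): fragments are modelled as hypothetical
(offset_bits, size_bits) pairs, and `overlaps`, `build_overlap_map` and `assign`
are made-up helper names. Assigning one fragment terminates the location of
every fragment it overlaps.

```python
from itertools import combinations

def overlaps(a, b):
    """True if two (offset_bits, size_bits) fragments share at least one bit."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]

def build_overlap_map(fragments):
    """Map each fragment to the set of other fragments it overlaps."""
    result = {f: set() for f in fragments}
    for a, b in combinations(fragments, 2):
        if overlaps(a, b):
            result[a].add(b)
            result[b].add(a)
    return result

def assign(live, overlap_map, fragment, location):
    """Record a new location and terminate every overlapped fragment."""
    for other in overlap_map[fragment]:
        live[other] = None  # None stands for "no known location"
    live[fragment] = location

# Fragments from the example above: bits [0, 16) in $ax, bits [0, 32) in $eax.
ax, eax = (0, 16), (0, 32)
overlap_map = build_overlap_map([ax, eax])
live = {}
assign(live, overlap_map, ax, "$ax")
assign(live, overlap_map, eax, "$eax")
```

After both assignments only the 32-bit fragment still has a location in `live`,
which mirrors "propagate only the most recently assigned fragment".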

Differential Revision: https://reviews.llvm.org/D114603

0eee8445

[CVP] Remove ashr of -1 or 0 · 45ecfed6

Fabian Wolff authored Nov 29, 2021

Fixes PR#52190. There is already a check for converting ashr instructions with non-negative left-hand sides into lshr; this patch adds an optimization to remove ashr altogether if the left-hand side is known to be in the range [-1, 1).
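
A small sanity check of the arithmetic behind this, as a sketch rather than the CVP implementation: the helper name `ashr` and the 8-bit width are assumptions chosen for illustration. The only integers in [-1, 1) are -1 and 0, and an arithmetic shift right leaves both unchanged.

```python
def ashr(value, shift, bits=8):
    """Arithmetic shift right of a signed `bits`-wide integer."""
    assert -(1 << (bits - 1)) <= value < (1 << (bits - 1))
    assert 0 <= shift < bits
    return value >> shift  # Python's >> on ints is an arithmetic shift

# Every in-range shift leaves -1 and 0 unchanged, so the ashr is redundant.
unchanged = all(ashr(x, s) == x for x in (-1, 0) for s in range(8))

# A value outside [-1, 1) does change, so the range check is needed.
counterexample = ashr(1, 1)
```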

Differential Revision: https://reviews.llvm.org/D113835

45ecfed6

[SCEVExpander] Drop poison generating flags when reusing instructions · 8906a0fe

Philip Reames authored Nov 29, 2021

The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple such instructions (potentially with different flags), this is analogous to our need to drop flags when performing CSE. A trivial implementation would simply drop flags on any instruction we decided to reuse, and that would be correct.

This patch is almost that trivial patch except that we preserve flags on the reused instruction when existing users would imply UB on overflow already. Adding new users can, at most, refine this program to one which doesn't execute UB, which is valid.

In practice, this fixes two conceptual problems with the previous code: 1) a binop could have been canonicalized into a form with different opcode or operands, or 2) the inbounds GEP case which was simply unhandled.

On the test changes, most are pretty straightforward. We lose some flags (in some cases, they'd have been dropped on the next CSE pass anyways). The one that took me the longest to understand was the ashr-expansion test. What's happening there is that we're considering reuse of the mul; previously we disallowed it entirely, now we allow it with no flags. The surrounding diffs are all effects of generating the same mul with a different operand order, and then doing simple DCE.

The loss of the inbounds is unfortunate, but even there, we can recover most of those once we actually treat branch-on-poison as immediate UB.

Differential Revision: https://reviews.llvm.org/D112734

8906a0fe

Nov 29, 2021

[DebugInfo][InstrRef][NFC] "Final" x86 test cleanup · fc9dae42

Jeremy Morse authored Nov 29, 2021

These are some final test changes for using instruction referencing on X86:
 * Most of these tests just have the flag switched so that they run with
   instr-ref, and just work: these tests were fixed by earlier patches.
 * There are some spurious differences in textual outputs.
 * A few have different temporary labels in the output because more
   MCSymbols are printed to the output.

Differential Revision: https://reviews.llvm.org/D114588

fc9dae42

[unroll] Use early return in shouldPartialUnroll [nfc] · f50207c0
Philip Reames authored Nov 29, 2021

f50207c0

[unroll] Reduce scope of UnrollFactor variable in computeUnrollCount [NFC] · a655e0f9

Philip Reames authored Nov 29, 2021

Suggested in review of D114453, done as a separate change to get all uses at once.

a655e0f9

[DebugInfo][InstrRef] Add indirection from dbg.declare in SelectionDAG · a20987ad

Jeremy Morse authored Nov 29, 2021

Usually dbg.declares get translated into either entries in an MF
side-table, or a DBG_VALUE on entry to the function with IsIndirect set
(including in instruction referencing mode). Much rarer is a dbg.declare
attached to a non-argument value, such as in the test added in this patch
where there's a variable-length-array. Such dbg.declares become SDDbgValue
nodes with IsIndirect=true.

As it happens, we weren't correctly emitting DBG_INSTR_REFs with the
additional indirection. This patch adds the extra indirection, encoded as
adding an additional DW_OP_deref to the expression.

Differential Revision: https://reviews.llvm.org/D114440

a20987ad

[unroll] Split full exact and full bound unroll costing [NFC] · 829b62ad

Philip Reames authored Nov 29, 2021

This change should be NFC. It's posted for review mostly to make sure others are happy with the names I'm introducing for "exact full unroll" and "bounded full unroll". The motivation here is that our cost model for bounded unrolling is too aggressive - it gives benefits for exits we aren't going to prune - but I also just think the new version of the code is a lot easier to follow.

Differential Revision: https://reviews.llvm.org/D114453

829b62ad

[ELF] --cref: If -Map is specified, print to the map file · 1ce51a5f

Fangrui Song authored Nov 29, 2021

PR48282: This behavior matches GNU ld and gold.

Reviewed By: markj

Differential Revision: https://reviews.llvm.org/D114663

1ce51a5f

[InstCombine] try to fold 'or' into 'mul' operand · 99f8b795

Sanjay Patel authored Nov 29, 2021

or (mul X, Y), X --> mul X, (add Y, 1) (when the multiply has no common bits with X)

We already have this fold if the pattern ends in 'add', but we can miss it if the
'add' becomes 'or' via another no-common-bits transform.
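
The fold can be checked exhaustively at a small bit width. This is only a sketch: the 6-bit width and the name `fold_holds` are assumptions for illustration, and the linked Alive2 proof is the authoritative check. When the product shares no bits with X, `or` behaves like `add`, and X*Y + X equals X*(Y + 1) under wrapping arithmetic.

```python
BITS = 6
MASK = (1 << BITS) - 1

def fold_holds(x, y):
    """Check or (mul X, Y), X == mul X, (add Y, 1) for wrapping BITS-wide ints."""
    mul = (x * y) & MASK
    if mul & x:
        # Precondition violated: the product has common bits with X,
        # so the fold does not apply and there is nothing to check.
        return True
    before = mul | x
    after = (x * ((y + 1) & MASK)) & MASK
    return before == after

all_ok = all(fold_holds(x, y)
             for x in range(1 << BITS)
             for y in range(1 << BITS))
```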

This is part of fixing:
http://llvm.org/PR49055
...but it won't make a difference on that example yet.

https://alive2.llvm.org/ce/z/Vrmoeb

Differential Revision: https://reviews.llvm.org/D114729

99f8b795

[DebugInfo][InstrRef] Preserve properties of restored variables · 9cf31b8d

Jeremy Morse authored Nov 29, 2021

InstrRefBasedLDV observes when variable locations are clobbered, scans what
values are available in the machine, and re-issues a DBG_VALUE for the
variable if it can find another location. Unfortunately, I hadn't joined up
the Indirectness flag, so if it did this to an Indirect Value, the
indirectness would be dropped.

Fix this, and add a test that if we clobber a variable value (on the stack
in this case), then the recovered variable location keeps the Indirect
flag.

Differential Revision: https://reviews.llvm.org/D114378

9cf31b8d

[DAG] Add tests for fpsti.sat for various architectures. NFC · 410d2764
David Green authored Nov 29, 2021

410d2764

OpenMP: Correctly query location for amdgpu-arch · 935abeaa

Matt Arsenault authored Nov 19, 2021

This was trying to figure out the build path for amdgpu-arch, and
making assumptions about where it is which were not working on my
system. Whether a standalone build or not, we should have a proper
imported target to get the location from.

935abeaa

Update unit test API usage (NFC) · 4f215bfa
Adrian Prantl authored Nov 29, 2021

4f215bfa

[DebugInfo][InstrRef][NFC] Test changes: DBG_VALUE to DBG_INSTR_REF · 32815bc9

Jeremy Morse authored Nov 29, 2021

This patch contains a bunch of replacements of:

    DBG_VALUE $somereg

with,

    SOMEINST debug-instr-number1
    DBG_INSTR_REF 1, 0, ...

It's mostly SelectionDAG tests that are making sure that the variable
location assignment is placed in the correct position in the instructions.

To avoid a loss in test coverage of SelectionDAG, which is used by a lot
of different backends, all these tests now have two modes and sets of RUN
lines, one for DBG_VALUE mode, the other for instruction referencing.

Differential Revision: https://reviews.llvm.org/D114258

32815bc9

Revert "OpenMP: Start calling setTargetAttributes for generated kernels" · 25eb7fa0

Matt Arsenault authored Nov 29, 2021

This reverts commit 6c27d389.

This is failing on the buildbots.

25eb7fa0

[mlir][sparse] some leftover cleanup from migration to bufferization dialect · 52668355

Aart Bik authored Nov 29, 2021

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D114730

52668355

[LICM] Regenerate test checks (NFC) · eee03523
Nikita Popov authored Nov 29, 2021

eee03523

[InstCombine] add tests for or with mul operand; NFC · 142044a0
Sanjay Patel authored Nov 29, 2021

142044a0

[InstCombine] (~(a | b) & c) | ~(c | (a ^ b)) -> ~((a | b) & (c | (b ^ a))) · 5c6b9e16

Stanislav Mekhanoshin authored Nov 24, 2021

```
----------------------------------------
define i3 @src(i3 %a, i3 %b, i3 %c) {
%0:
  %or1 = or i3 %b, %c
  %not1 = xor i3 %or1, 7
  %and1 = and i3 %a, %not1
  %xor1 = xor i3 %b, %c
  %or2 = or i3 %xor1, %a
  %not2 = xor i3 %or2, 7
  %or3 = or i3 %and1, %not2
  ret i3 %or3
}
=>
define i3 @tgt(i3 %a, i3 %b, i3 %c) {
%0:
  %obc = or i3 %b, %c
  %xbc = xor i3 %b, %c
  %o = or i3 %a, %xbc
  %and = and i3 %obc, %o
  %r = xor i3 %and, 7
  ret i3 %r
}
Transformation seems to be correct!
```
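
The same equivalence can be reproduced by brute force over all i3 inputs. This is a sketch that mirrors the `src` and `tgt` functions of the proof above with Python integers; `MASK` and `equivalent` are names introduced here for illustration.

```python
MASK = 0b111  # i3, as in the proof above

def src(a, b, c):
    """(a & ~(b | c)) | ~((b ^ c) | a), truncated to 3 bits."""
    return ((a & ~(b | c)) | ~((b ^ c) | a)) & MASK

def tgt(a, b, c):
    """~((b | c) & (a | (b ^ c))), truncated to 3 bits."""
    return ~((b | c) & (a | (b ^ c))) & MASK

values = range(MASK + 1)
equivalent = all(src(a, b, c) == tgt(a, b, c)
                 for a in values for b in values for c in values)
```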

Differential Revision: https://reviews.llvm.org/D112955

5c6b9e16

[NFC][clang]Increase the number of driver diagnostics · 3c32c568

Steven Wan authored Nov 29, 2021

We're close to hitting the limit on the number of driver diagnostics; increase `DIAG_SIZE_DRIVER` to accommodate more.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D114615

3c32c568

[HIP] Add atomic load, atomic store and atomic cmpxchng_weak builtin support in HIP-clang · df0560ca

Anshil Gandhi authored Nov 29, 2021

Introduce `__hip_atomic_load`, `__hip_atomic_store` and `__hip_atomic_compare_exchange_weak`
builtins in HIP.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D114553

df0560ca

[LLDB][NativePDB] fix find-functions.cpp failure on windows bots (2) · fe270ab0
Zequan Wu authored Nov 29, 2021

fe270ab0

OpenMP: Start calling setTargetAttributes for generated kernels · 6c27d389

Matt Arsenault authored Nov 09, 2021

This wasn't setting any of the attributes the target would expect to
emit for kernels.

6c27d389

[libc++] Fix incorrect REQUIRES on a locale-dependent test · a8278a74

Louis Dionne authored Nov 29, 2021

The test doesn't depend specifically on the en_US.UTF-8 locale; instead,
it depends on whether localization support exists, period.

Differential Revision: https://reviews.llvm.org/D114708

a8278a74

[NFC][AIX]Disable unsupported hip test on AIX · 23dc8862

Steven Wan authored Nov 29, 2021

AIX doesn't support GPU. There is no point testing HIP on it.

Reviewed By: Jake-Egan

Differential Revision: https://reviews.llvm.org/D114484

23dc8862

[LLDB][NativePDB] fix find-functions.cpp failure on windows bots · 34d02fad
Zequan Wu authored Nov 29, 2021

34d02fad

[mlir] Handle an edge case when folding reshapes with multiple trailing 1 dimensions · 8d474f1d

Benjamin Kramer authored Nov 29, 2021

We would exit early and miss this case.

Differential Revision: https://reviews.llvm.org/D114711

8d474f1d

[llvm] Use range-based for loops (NFC) · f240e528
Kazu Hirata authored Nov 29, 2021

f240e528

[InstCombine] Fold (~A | B) ^ A --> ~(A & B) · c572eb1a

Mehrnoosh Heidarpour authored Nov 29, 2021

https://alive2.llvm.org/ce/z/gLrYPk
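
A brute-force check of the identity in the title, as a sketch only: the 4-bit width and the helper names `before` and `after` are assumptions for illustration, and the Alive2 link above is the actual proof. Bit by bit, where A is 1 the left side gives ~B, and where A is 0 it gives 1, which is exactly ~(A & B).

```python
MASK = 0xF  # 4-bit values keep the search tiny

def before(a, b):
    """(~A | B) ^ A, truncated to 4 bits."""
    return ((~a | b) ^ a) & MASK

def after(a, b):
    """~(A & B), truncated to 4 bits."""
    return ~(a & b) & MASK

equivalent = all(before(a, b) == after(a, b)
                 for a in range(MASK + 1)
                 for b in range(MASK + 1))
```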

Fixes:
https://llvm.org/PR52518

Reviewed by: spatel

Differential revision: https://reviews.llvm.org/D114339

c572eb1a

[SCEV] Remove incorrect assert · 77dd5798

Nikita Popov authored Nov 29, 2021

Fix assertion failure reported on D113349 by removing the assert.
While the produced expression should be equivalent, it may not
be strictly the same, e.g. due to lazy nowrap flag updates. Similar
to what the main createSCEV() code does, simply retain the old
value map entry if one already exists.

77dd5798

[HWASan] Disable LTO test on aarch64. · 2022e2fc

Matt Morehouse authored Nov 29, 2021

It fails for non-Android aarch64 bots as well.

2022e2fc

[fir] Get rid of the global option in FIRBuilder · 1cc3b135

Valentin Clement authored Nov 29, 2021

Replace the global option `nameLengthHashSize` with a constexpr
with the same name. The option was not used in fir-dev so switching
to a constexpr is fine.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D114630

1cc3b135

[X86][Costmodel] `getInterleavedMemoryOpCostAVX512()`: masked load can not be folded into a shuffle · 7e73c2a6

Roman Lebedev authored Nov 29, 2021

The mask on the shuffle is for the output, not the input.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D114697

7e73c2a6

[AMDGPU][GlobalISel] Transform (fsub (fpext (fneg (fmul x, y))), z) -> (fneg... · 0dd570ff

Mirko Brkusanin authored Nov 26, 2021

[AMDGPU][GlobalISel] Transform (fsub (fpext (fneg (fmul x, y))), z) -> (fneg (fma (fpext x), (fpext y), z))

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D98050

0dd570ff

[AMDGPU][GlobalISel] Transform (fsub (fpext (fmul x, y)), z) -> (fma (fpext... · 37c2a220

Mirko Brkusanin authored Nov 26, 2021

[AMDGPU][GlobalISel] Transform (fsub (fpext (fmul x, y)), z) -> (fma (fpext x), (fpext y), (fneg z))

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D98049

37c2a220

[AMDGPU][GlobalISel] Transform (fsub (fneg (fmul x, y)), z) -> (fma (fneg x), y, (fneg z)) · 5fe7fcd2

Mirko Brkusanin authored Nov 26, 2021

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D98048

5fe7fcd2

[AMDGPU][GlobalISel] Transform (fsub (fmul x, y), z) -> (fma x, y, -z) · a7821692

Mirko Brkusanin authored Nov 26, 2021

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D96614

a7821692
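
The (fsub (fmul x, y), z) -> (fma x, y, -z) rewrite is an identity in exact arithmetic, but a fused multiply-add rounds only once, so the two forms can differ in floating point. The sketch below shows that with Python doubles. The helper names are made up for illustration, the fused result is emulated with exact rationals rather than a hardware fma, and nothing here says under which conditions the AMDGPU combine actually fires, since the commit titles do not state them.

```python
from fractions import Fraction

def fused_mul_sub(x, y, z):
    """Emulate fma(x, y, -z): compute x*y - z exactly, then round once."""
    return float(Fraction(x) * Fraction(y) - Fraction(z))

def unfused_mul_sub(x, y, z):
    """fsub (fmul x, y), z: the product is rounded before the subtraction."""
    return x * y - z

# x * y is exactly 1 - 2**-60, which rounds to 1.0 as a double.
x, y, z = 1.0 + 2.0**-30, 1.0 - 2.0**-30, 1.0
unfused = unfused_mul_sub(x, y, z)  # the rounded product cancels z completely
fused = fused_mul_sub(x, y, z)      # the tiny exact difference survives
```

Here the unfused form loses the low bits of the product before subtracting, while the fused form keeps them.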

[AMDGPU][GlobalISel] Transform (fadd (fma x, y, (fpext (fmul u, v))), z) ->... · e5e49a08

Mirko Brkusanin authored Nov 26, 2021

[AMDGPU][GlobalISel] Transform (fadd (fma x, y, (fpext (fmul u, v))), z) -> (fma x, y, (fma (fpext u), (fpext v), z))

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D98047

e5e49a08

[AMDGPU][GlobalISel] Transform (fadd (fma x, y, (fmul u, v)), z) -> (fma x, y, (fma u, v, z)) · f7322925

Mirko Brkusanin authored Nov 26, 2021

Patch by: Mateja Marjanovic

Differential Revision: https://reviews.llvm.org/D97938

f7322925