- Mar 18, 2020
-
Florian Hahn authored
The latest improvements to VPValue printing make this mapping clear when printing the operand. Printing the mapping separately is not required any longer. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76375
-
Florian Hahn authored
Now that printing VPValues uses the underlying IR value name, if available, recording the underlying value here improves printing. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76374
-
Sanjay Patel authored
-
Sanjay Patel authored
This is copied from the suggested text by @regehr in: https://bugs.llvm.org/show_bug.cgi?id=20895 The way forward was not clear for several years, but now that we have 'freeze' and Alive2, the behavior should be documented. Also see comments in D76332.
-
Simon Pilgrim authored
-
Eli Friedman authored
The existence of the class is more confusing than helpful, I think; the commonality is mostly just "GEP is legal", which can be queried using APIs on GetElementPtrInst. Differential Revision: https://reviews.llvm.org/D75660
-
Sanjay Patel authored
-
Craig Topper authored
[SelectionDAGBuilder][FPEnv] Take into account SelectionDAG continuous CSE when setting the nofpexcept flag for constrained intrinsics
SelectionDAG CSEs nodes based on their result type and operands, but not their flags. The flags are expected to be intersected when nodes are CSEd. In SelectionDAGBuilder, for FP nodes we manage both the fast math flags and the nofpexcept flag after the nodes have already been CSEd when they were created with getNode. Managing the fast math flags before the constrained nodes prevents the nofpexcept management from working correctly. This commit moves the FMF handling for constrained intrinsics into their visitor and disables the common FMF handling for these nodes. Differential Revision: https://reviews.llvm.org/D75224
-
Simon Pilgrim authored
-
Simon Pilgrim authored
-
Simon Pilgrim authored
-
lewis-revill authored
This patch generates TableGen descriptions for the specified register banks, which contain a list of register sizes corresponding to the available HwModes. The appropriate size is used during codegen according to the current HwMode. As the HwMode is not known at TableGen time, it is set upon construction of the RegisterBankInfo class. Targets simply need to provide the HwMode argument to the <target>GenRegisterBankInfo constructor. The RISC-V RegisterBankInfo constructor has been updated accordingly (plus an unused argument removed). Differential Revision: https://reviews.llvm.org/D76007
-
lewis-revill authored
This patch rewrites the RegisterBankEmitter class to derive RegisterClassHierarchy from CodeGenTarget::getRegBank() rather than constructing our own copy. All are now accessed through a const reference. Differential Revision: https://reviews.llvm.org/D76006
-
Eli Friedman authored
This is fixing up various places that use the implicit TypeSize->uint64_t conversion. The new overloads in MemoryLocation.h are already used in various places that construct a MemoryLocation from a TypeSize, including MemorySSA. (They were using the implicit conversion before.) Differential Revision: https://reviews.llvm.org/D76249
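A simplified illustration of why the implicit conversion is being removed (the toy type and method here are illustrative only, not LLVM's real TypeSize API): once scalable vectors exist, converting a size to uint64_t silently discards the fact that the value is only a compile-time minimum.

    #include <cassert>
    #include <cstdint>

    struct ToySize {
      uint64_t MinBytes;   // size known at compile time
      bool Scalable;       // true for scalable vectors: real size = MinBytes * vscale
      // The kind of implicit conversion being phased out: it is only
      // meaningful when the size is fixed.
      operator uint64_t() const {
        assert(!Scalable && "runtime-scaled size treated as a fixed size");
        return MinBytes;
      }
    };

    uint64_t fixedSizeOnly(ToySize TS) {
      return TS;   // compiles silently, but asserts for scalable sizes
    }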
-
Simon Pilgrim authored
[ValueTracking] Add computeKnownBits DemandedElts support to EXTRACTELEMENT/OR/BSWAP/BITREVERSE instructions (PR36319)
These are all covered by the bswap/bitreverse vector tests.
-
Nemanja Ivanovic authored
As pointed out in https://bugs.llvm.org/show_bug.cgi?id=45232, this code can end up shifting a 64-bit unsigned value left by 64 bits. Although this works as expected on some platforms, it is definitely UB. This patch removes the UB and adds the associated test case. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45232
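A minimal sketch of the kind of fix involved (the helper name is made up, not taken from the patch): in C++, shifting a 64-bit value left by 64 or more bits is undefined behaviour, so that case has to be handled separately.

    #include <cstdint>

    // Hypothetical helper: NumBits may legitimately reach 64 here.
    uint64_t lowBitsSet(unsigned NumBits) {
      if (NumBits >= 64)
        return ~UINT64_C(0);                 // avoid 'x << 64', which is UB
      return (UINT64_C(1) << NumBits) - 1;   // safe: NumBits is 0..63
    }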
-
Simon Pilgrim authored
Fixes deprecation warning in EXPENSIVE_CHECKS builds.
-
Simon Pilgrim authored
Shows missing DemandedElts support (PR36319)
-
Jessica Paquette authored
This ports some combines from DAGCombiner.cpp which perform some trivial transformations on instructions with undef operands. Not having these can make it extremely annoying to find out where we differ from SelectionDAG by looking at existing lit tests. Without them, we tend to produce pretty bad code generation when we run into instructions which use undef operands. Also remove the nonpow2_store_narrowing testcase from arm64-fallback.ll, since we no longer fall back on the add. Differential Revision: https://reviews.llvm.org/D76339
-
Jakub Kuderski authored
Reviewers: asbirlea, brzycki, NutshellySima, grosser Reviewed By: asbirlea, NutshellySima Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76340
-
Jin Lin authored
Summary: This change allows machine outlining to be applied N times, where N is specified by a compiler option; by default N is 1. The motivation is that repeated machine outlining can further reduce code size. Please refer to the presentation "Improving Swift Binary Size via Link Time Optimization" from the 2019 LLVM Developers' Meeting. Reviewers: aschwaighofer, tellenbach, paquette Reviewed By: paquette Subscribers: tellenbach, hiraditya, llvm-commits, jinlin Tags: #llvm Differential Revision: https://reviews.llvm.org/D71027
-
Florian Hahn authored
When an underlying value is available, we can use its name for printing, as discussed in D73078. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76200
-
Simon Tatham authored
Summary: This is another set of instructions too complicated to be sensibly expressed in IR by anything short of a target-specific intrinsic. Given input vectors a,b, the instruction generates intermediate values 2*(a[0]*b[0]+a[1]*b[1]), 2*(a[2]*b[2]+a[3]*b[3]), etc; takes the high half of each double-width value, and overwrites half the lanes in the output vector c, which you therefore have to provide the input value of. Optionally you can swap the elements of b so that the intermediates are things like a[0]*b[1]+a[1]*b[0]; optionally you can round to nearest when taking the high half; and optionally you can take the difference rather than the sum of the two products. Finally, saturation is applied when converting back to a single-width vector lane. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: miyuki Subscribers: kristof.beyls, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76359
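A rough scalar sketch of the arithmetic described above for one output lane of 16-bit elements, ignoring the optional exchange, rounding, and subtract variants (the function and variable names are illustrative, not the intrinsic itself):

    #include <algorithm>
    #include <cstdint>

    int16_t qdmladhLane(int16_t a0, int16_t a1, int16_t b0, int16_t b1) {
      // Double-width intermediate: 2 * (a0*b0 + a1*b1).
      int64_t Wide = 2 * (int64_t(a0) * b0 + int64_t(a1) * b1);
      // Saturate to the double-width range, then keep only the high half,
      // which is what lands in the single-width output lane.
      Wide = std::min<int64_t>(std::max<int64_t>(Wide, INT32_MIN), INT32_MAX);
      return static_cast<int16_t>(Wide >> 16);
    }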
-
Nico Weber authored
-
Sam Parker authored
Run the update script on one of the loop unroll tests.
-
Matt Arsenault authored
This isn't really usable, and requires using the -amdgpu-fixed-function-abi flag to work. Assumes a uniform call target, and will hit a verifier error if the call target ends up in a VGPR. Also doesn't attempt to do anything sensible for the reported register/stack usage.
-
Matt Arsenault authored
This reverts commit 9bca8fc4. Rearrange handling to avoid changing the instruction in the case where it's going to be erased and replaced with undef.
-
Piotr Sobczak authored
Summary: For the case where "done" bits on existing exports are removed by unifyReturnBlockSet(), unify all return blocks - even the uniformly reached ones. We do not want to end up with a non-unified, uniformly reached block containing a normal export with the "done" bit cleared. That case is believed to be rare - possible with infinite loops in pixel shaders. This is a fix for D71192. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76364
-
Nico Weber authored
-
Simon Pilgrim authored
-
Nico Weber authored
This reverts commit 4060016f and re-merges c5b81466.
-
Sander de Smalen authored
Avoid transforming:
    %0 = bitcast i8* %base to <vscale x 16 x i8>*
    %1 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %0, i64 1
into:
    %0 = getelementptr i8, i8* %base, i64 16
    %1 = bitcast i8* %0 to <vscale x 16 x i8>*
Reviewers: efriedma, ctetreau Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D76236
-
Chris Bowler authored
This is the first of a series of patches that add caller support for by-value arguments. This patch adds support for arguments that are passed in a single GPR. There are 3 cases not yet handled:
- The by-value argument is larger than a single register.
- There are no remaining GPRs, even though the by-value argument would otherwise fit in a single GPR.
- The by-value argument requires alignment greater than the register width.
Future patches will add support for these cases, as well as for the corresponding callee handling (in LowerFormalArguments_AIX). Differential Revision: https://reviews.llvm.org/D75863
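A minimal source-level sketch of the case this patch handles (the struct and function names are made up): an aggregate small enough to fit in one GPR, passed by value at a call site.

    // Fits in a single 32-bit GPR.
    struct Small { char Bytes[4]; };

    // Callee side (LowerFormalArguments_AIX) is left to future patches.
    int useSmall(Small S) { return S.Bytes[0]; }

    // The caller side lowered by this patch: S is passed by value in one GPR.
    int caller(Small S) { return useSmall(S); }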
-
Simon Pilgrim authored
[InstCombine][X86] Add additional demandedelts style test for in-range variable per-element shift amounts (PR40391)
If we've shuffled the shift amount, some of the (undemanded) elements may have become undef - this should be handled by the missing support in PR36319.
-
Mehdi Amini authored
-
Mehdi Amini authored
The constructor of Expected<T> expects a T&&, but gcc-7.5 apparently does not infer an rvalue in this context.
-
Roman Lebedev authored
Summary: As noted in [[ https://bugs.llvm.org/show_bug.cgi?id=45201 | PR45201 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=10090 | PR10090 ]], SCEV doesn't always avoid recursive algorithms, and that causes issues with large expression depths and/or smaller stack sizes. In the `SCEVExpander::isHighCostExpansion*()` case, the refactoring to avoid recursion is rather idiomatic. We simply need to place the root expr into a vector and iterate over the vector's elements, accounting for the cost of each one and adding new exprs at the end of the vector, thus achieving recursion-less traversal. The order in which we visit exprs doesn't matter here, so we are fine with the most basic approach of using SmallVector and inserting/extracting from the back, which incidentally is the same depth-first traversal that we were doing recursively before. Reviewers: mkazantsev, reames, wmi, ekatz Reviewed By: mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76273
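A simplified sketch of the worklist pattern described above (not the actual SCEVExpander code; the toy Expr type is made up): seed the root into a vector, then keep popping from the back while pushing operands, which gives an iterative depth-first walk with no recursion.

    #include <vector>

    struct Expr {
      unsigned Cost = 1;
      std::vector<const Expr *> Operands;
    };

    unsigned totalCost(const Expr *Root) {
      unsigned Cost = 0;
      std::vector<const Expr *> Worklist{Root};   // SmallVector in LLVM proper
      while (!Worklist.empty()) {
        const Expr *E = Worklist.back();
        Worklist.pop_back();
        Cost += E->Cost;   // account for this expression
        Worklist.insert(Worklist.end(), E->Operands.begin(), E->Operands.end());
      }
      return Cost;
    }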
-
Oliver Stannard authored
When optimising for code size at the expense of performance, it is often worth saving and restoring some of r0-r3, if IPRA will be able to take advantage of them. This doesn't cost any extra code size if we already have a PUSH/POP pair, and increases the number of available registers across any calls to the function. We already have an optimisation which tries to fold the subtract/add of the SP into the PUSH/POP by using extra registers, which somewhat conflicts with this. I've made the new optimisation less aggressive in cases where the existing one is likely to trigger, which gives better results than either of these optimisations by themselves. Differential revision: https://reviews.llvm.org/D69936
-
Guillaume Chatelet authored
Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76348
-
Kang Zhang authored
-