- Jun 19, 2020
-
-
Kristof Beyls authored
A "BTI c" instruction only allows jumping/calling to it using a BLR* instruction. However, the SLSBLR mitigation changes a BLR to a BR to implement the function call. Therefore, a "BTI c" check that passed before could trigger after the BLR->BL change done by the SLSBLR mitigation. However, if the register used in BR is X16 or X17, this trigger will not fire (see ArmARM for further details). Therefore, this patch simply changes the function stubs for the SLSBLR mitigation from

  __llvm_slsblr_thunk_x<N>:
      br x<N>
      SpeculationBarrier

to

  __llvm_slsblr_thunk_x<N>:
      mov x16, x<N>
      br x16
      SpeculationBarrier

Differential Revision: https://reviews.llvm.org/D81405
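The X16/X17 exemption described above can be modeled as a tiny predicate (an illustrative sketch of the rule as described here, not LLVM code; the function name is hypothetical):

```python
def bti_c_traps(branch_kind, target_reg):
    # "BTI c" landing pad: indirect calls (BLR*) are accepted, and an
    # indirect branch (BR) is accepted only via x16 or x17 -- which is
    # why routing the thunk's BR through x16 keeps the check passing.
    if branch_kind == "blr":
        return False  # call-style entry: allowed
    if branch_kind == "br":
        return target_reg not in ("x16", "x17")
    return True  # any other indirect transfer would trap
```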
-
Ronak Chauhan authored
Summary: This allows targets to also consider the symbol's type and/or address if needed. Reviewers: scott.linder, jhenderson, MaskRay, aardappel Reviewed By: scott.linder, MaskRay Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82090
-
Francesco Petrogalli authored
Reviewers: efriedma, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80741
-
Nemanja Ivanovic authored
We currently miss a number of opportunities to emit single-instruction VMRG[LH][BHW] instructions for shuffles on little endian subtargets. Although this in itself is not a huge performance opportunity since loading the permute vector for a VPERM can always be pulled out of loops, producing such merge instructions is useful to downstream optimizations. Since VPERM is essentially opaque to all subsequent optimizations, we want to avoid it as much as possible. Other permute instructions have semantics that can be reasoned about much more easily in later optimizations. This patch does the following:
- Canonicalize shuffles so that the first element comes from the first vector (since that's what most of the mask matching functions want)
- Switch the elements that come from splat vectors so that they match the corresponding elements from the other vector (to allow for merges)
- Adds debugging messages for when a shuffle is matched to a VPERM so that anyone interested in improving this further can get the info for their code
Differential revision: https://reviews.llvm.org/D77448
-
Carl Ritson authored
Add code to respect mad-mac-f32-insts target feature. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D81990
-
Vitaly Buka authored
Summary: Extend StackLifetime with an option to calculate liveness where an alloca is only considered alive on basic block entry if all non-dead predecessors had it alive at their terminators. Depends on D82043. Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82124
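The "all non-dead predecessors" rule above is a must-style forward dataflow; a minimal sketch (the CFG representation and names are hypothetical, not LLVM's actual StackLifetime code):

```python
def must_alive_at_entry(preds, starts, ends, entry):
    # An alloca is alive at a block's entry only if EVERY predecessor
    # leaves it alive at its terminator; iterate to a fixpoint.
    alive_in = {b: False for b in preds}
    changed = True
    while changed:
        changed = False
        for b in preds:
            if b == entry:
                continue
            ps = preds[b]
            new = bool(ps) and all(
                (alive_in[p] or p in starts) and p not in ends for p in ps
            )
            if new != alive_in[b]:
                alive_in[b] = new
                changed = True
    return alive_in
```

For a diamond CFG where only one arm starts the lifetime, the join block is not considered alive on entry; it becomes alive only when both arms start it.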
-
Nathan James authored
-
Vitaly Buka authored
Summary: lifetime.ll is a copy of SafeStack/X86/coloring2.ll Reviewers: eugenis Reviewed By: eugenis Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82043
-
- Jun 18, 2020
-
-
Matt Arsenault authored
Don't do this in the MachineFunctionInfo constructor. Also, ensure the alignment rather than overwriting it outright. I vaguely remember there was another place to enforce the target minimum alignment, but I couldn't find it (it's there for instructions).
-
Matt Arsenault authored
We probably need to move where intrinsics are lowered to copies to make this useful.
-
Matt Arsenault authored
I don't know anything about debug info, but this seems like more work should be necessary. This constructs a new IRBuilder and reconstructs the original divides rather than moving the original. One problem this has is if a div/rem pair are handled, both end up with the same debugloc. I'm not sure how to fix this, since this uses a cache when it sees the same input operands again, which will have the first instance's location attached.
-
Amy Kwan authored
This patch implements builtins for the following prototypes:

  vector unsigned long long vec_pdep(vector unsigned long long, vector unsigned long long);
  vector unsigned long long vec_pext(vector unsigned long long, vector unsigned long long __b);
  unsigned long long __builtin_pdepd (unsigned long long, unsigned long long);
  unsigned long long __builtin_pextd (unsigned long long, unsigned long long);

Revision Depends on D80758
Differential Revision: https://reviews.llvm.org/D80935
-
Matt Arsenault authored
This was passing in all the parameters needed to construct a LegalizerHelper in the custom legalization, when it's simpler to just pass in the existing helper. This is slightly more annoying to use in the common case where you don't need the legalizer helper, but we could add the common parameters back in addition to the helper. I didn't propagate this to all the internal target changes that this logically implies, but did update a sample one for legalizeMinNumMaxNum. This is in preparation for moving AMDGPU load/store legalization entirely into custom lowering. The current set of legalization actions is really constraining and not really capable of expressing all the actions needed to legalize loads/stores. In particular there's no way to express when the memory access itself needs to change size vs. the result type. There's also a lot of redundancy since the same split/widen actions need to be applied in both vector and scalar cases. All of the sub-cases logically belong as steps in the legalizer helper, but it will be easier to consider everything at once in custom lowering.
-
Christopher Tetreault authored
Reviewers: efriedma, c-rhodes, david-arm, Tyker, asbirlea Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82057
-
Alexandre Ganea authored
[CodeView] Revert 8374bf43 and 403f9537

This reverts:
  8374bf43 [CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string.
  403f9537 [CodeView] Add full repro to LF_BUILDINFO record

This is causing the lld/test/COFF/pdb-relative-source-lines.test to fail: http://lab.llvm.org:8011/builders/lld-x86_64-win/builds/1096/steps/test-check-all/logs/FAIL%3A%20lld%3A%3Apdb-relative-source-lines.test
And clang/test/CodeGen/debug-info-codeview-buildinfo.c fails as well: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/33346/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Adebug-info-codeview-buildinfo.c
-
Kirill Naumov authored
This functionality is very similar to Function compatibility with AnnotationWriter. This change allows us to use AnnotationWriter with BasicBlock through the BB.print() method. Reviewed-By: apilipenko Differential Revision: https://reviews.llvm.org/D81321
-
Sanjay Patel authored
The predicate can always be used to distinguish between icmp and fcmp, so we don't need to keep repeating this check in the callers.
-
Davide Italiano authored
Sometimes a dead block gets folded and the debug information is still retained. This manifests as jumpy stepping in lldb, see the bugzilla PR for an end-to-end C testcase. Fixes https://bugs.llvm.org/show_bug.cgi?id=46008 Differential Revision: https://reviews.llvm.org/D82062
-
Michael Liao authored
Reviewers: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82025
-
serge-sans-paille authored
Move code that may update the IR after the precondition, so that if the precondition fails, the IR isn't modified. Differential Revision: https://reviews.llvm.org/D81225
-
Matt Arsenault authored
These don't really modify any memory, and should not expect memory operands.
-
Stanislav Mekhanoshin authored
Nothing breaks yet, but all encodings shall be in the map. Differential Revision: https://reviews.llvm.org/D81974
-
Arthur Eubanks authored
When possible (e.g. internal linkage), strip preallocated attribute off parameters/arguments. This requires removing the "preallocated" operand bundle from the call site, replacing @llvm.call.preallocated.arg() with an alloca and a bitcast to i8*, and removing the @llvm.call.preallocated.setup(). Since @llvm.call.preallocated.arg() can be called multiple times with the same arg index, we create an alloca per arg index. We add a @llvm.stacksave() where the @llvm.call.preallocated.setup() was and a @llvm.stackrestore() after the preallocated call to prevent the stack from blowing up. This is valid because the argument would normally not exist on the stack after the call before the transformation. This does not currently handle all possible preallocated calls. We will need to figure out where to put @llvm.stackrestore() in the cases where there is no obvious place to put it, for example conditional preallocated calls, invokes. This sort of transformation may need to be moved to somewhere more accessible to accommodate similar transformations (like inlining) in the future. Reviewers: efriedma, hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80951
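The rewrite described above can be pictured on a single-argument call (a hand-written sketch based on the description; the attribute and operand-bundle syntax is abbreviated and the struct type is hypothetical):

```llvm
; before: token-based preallocated call
%t = call token @llvm.call.preallocated.setup(i32 1)
%p = call i8* @llvm.call.preallocated.arg(token %t, i32 0)
call void @foo(i8* %p) [ "preallocated"(token %t) ]

; after: an alloca per arg index, bracketed by stack save/restore
%ss = call i8* @llvm.stacksave()
%a  = alloca %struct.S
%p2 = bitcast %struct.S* %a to i8*
call void @foo(i8* %p2)
call void @llvm.stackrestore(i8* %ss)
```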
-
Alexandros Lamprineas authored
This patch adds basic support for BFloat in the Arm backend. For now the code generation relies on fullfp16 being present. Briefly:
* adds the bfloat scalar and vector types in the necessary register classes,
* adjusts the calling convention to cope with bfloat argument passing and return,
* adds codegen patterns for moves, loads and stores.
It's tested mostly by the intrinsic patches that depend on it (load/store, convert/copy). The following people contributed to this patch:
* Alexandros Lamprineas
* Ties Stuij
Differential Revision: https://reviews.llvm.org/D81373
-
Simon Pilgrim authored
[TargetLowering] SimplifyMultipleUseDemandedBits - drop already extended ISD::SIGN_EXTEND_INREG nodes. If the source of the SIGN_EXTEND_INREG node is already sign extended, use the source directly.
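The fold above relies on SIGN_EXTEND_INREG being idempotent; a quick model (illustrative Python, not the DAG combiner itself):

```python
def sext_inreg(x, bits, width=32):
    # Sign-extend the low `bits` of x to a signed `width`-bit value,
    # mirroring what ISD::SIGN_EXTEND_INREG computes.
    x &= (1 << width) - 1
    sign = 1 << (bits - 1)
    return ((x & ((1 << bits) - 1)) ^ sign) - sign

# Applying it to an already sign-extended value changes nothing,
# so the combiner can use the source operand directly.
```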
-
Matt Arsenault authored
-
Ayke van Laethem authored
Code like the following:

  define i32 @foo(i32 %a, i1 zeroext %b) addrspace(1) {
  entry:
    %conv = zext i1 %b to i32
    %add = add nsw i32 %conv, %a
    ret i32 %add
  }

Would compile to the following (incorrect) code:

  foo:
    mov r18, r20
    clr r19
    add r22, r18
    adc r23, r19
    sbci r24, 0
    sbci r25, 0
    ret

Those sbci instructions are clearly wrong, they should have been adc instructions. This commit improves codegen to use adc instead:

  foo:
    mov r18, r20
    clr r19
    ldi r20, 0
    ldi r21, 0
    add r22, r18
    adc r23, r19
    adc r24, r20
    adc r25, r21
    ret

This code is not optimal (it could be just 5 instructions instead of the current 9) but at least it doesn't miscompile. Differential Revision: https://reviews.llvm.org/D78439
-
Matt Arsenault authored
This was depending on the MachineFunction at MachineFunctionInfo construction, which will soon be disallowed.
-
Simon Pilgrim authored
Allow combineSetCCMOVMSK to handle 'allof' X == 0 patterns so they can be replaced with PTESTZ. This is a preliminary patch before properly handling PR35129.
-
Alexandre Ganea authored
[CodeView] Fix generated command-line expansion in LF_BUILDINFO. Fix the 'pdb' entry which was previously a null reference, now an empty string. Previously, the DIA SDK didn't like the empty reference in the 'pdb' entry.
-
Kamlesh Kumar authored
Since i32 is not legal on riscv64, it is always promoted to i64 before emitting a lib call, and for conversions like float/double to int and float/double to unsigned int the wrong lib call was emitted. This commit fixes it using custom lowering. Differential Revision: https://reviews.llvm.org/D80526
-
Igor Kudrin authored
The patch renames MakeStartMinusEndExpr() to makeEndMinusStartExpr() to better reflect the expression it creates and to fix a naming style issue. Differential Revision: https://reviews.llvm.org/D82079
-
Alexandre Ganea authored
This patch adds some missing information to the LF_BUILDINFO which allows for rebuilding an .OBJ without any external dependency but the .OBJ itself (other than the compiler executable). Some tools need this information to reproduce a build without any knowledge of the build system. The LF_BUILDINFO therefore stores a full path to the compiler, the PWD (which is the CWD at program startup), a relative or absolute path to the TU, and the full CC1 command line. The command line needs to be freestanding (not depend on any environment variable). In the same way, MSVC doesn't store the provided command-line, but an expanded version (somehow their equivalent of CC1) which is also freestanding. For more information see PR36198 and D43002. Differential Revision: https://reviews.llvm.org/D80833
-
Alexandre Ganea authored
The API is not called in this patch. This is to simplify/support https://reviews.llvm.org/D80833
-
Alexandre Ganea authored
This patch is to support/simplify https://reviews.llvm.org/D80833
-
Florian Hahn authored
This patch updates LowerMatrixIntrinsics to preserve the alignment specified at the original load/stores and the align attribute for the pointer argument of the column.major.load/store intrinsics. We can always use the specified alignment for the load of the first column. For subsequent columns, the alignment may need to be reduced. For ConstantInt strides, compute the offset for the start of the column in bytes and use commonAlignment to get the largest valid alignment. For non-ConstantInt strides, we need to take the common alignment of the initial alignment and the element size in bytes. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke, rjmccall Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D81960
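For the ConstantInt-stride case, the "largest valid alignment" step can be sketched as follows (an illustrative model of llvm::commonAlignment, not the exact API):

```python
def common_alignment(align, offset_bytes):
    # Largest power-of-two alignment (at most `align`) that still
    # divides the column's byte offset; offset 0 keeps full alignment.
    while offset_bytes % align:
        align //= 2
    return align
```

So with a load aligned to 16 bytes and a column starting 8 bytes in, that column can only be assumed 8-byte aligned.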
-
Lucas Prates authored
Summary: As half-precision floating point arguments and returns were previously coerced to either float or int32 by clang's codegen, the CMSE handling of those was also performed in clang's side by zeroing the unused MSBs of the coerced values. This patch moves this handling to the backend's calling convention lowering, making sure the high bits of the registers used by half-precision arguments and returns are zeroed. Reviewers: chill, rjmccall, ostannard Reviewed By: ostannard Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81428
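The zeroing itself is trivial; what this patch moves is where it happens. As a sketch of the bit manipulation only (the helper name is hypothetical):

```python
def clear_unused_msbs(reg32):
    # A half-precision value occupies the low 16 bits of its 32-bit
    # register; CMSE requires the unused upper bits to be zeroed so
    # no secure-state data leaks through them.
    return reg32 & 0xFFFF
```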
-
Lucas Prates authored
Summary: Half-precision floating point arguments and returns are currently promoted to either float or int32 in clang's CodeGen and there's no existing support for the lowering of `half` arguments and returns from IR in AArch32's backend. Such frontend coercions, implemented as coercion through memory in clang, can cause a series of issues in argument lowering, such as causing arguments to be stored in the wrong bits on big-endian architectures and incurring missing overflow detections in the return of certain functions. This patch introduces the handling of half-precision arguments and returns in the backend using the actual "half" type on the IR. Using the "half" type the backend is able to properly enforce the AAPCS' directions for those arguments, making sure they are stored in the proper bits of the registers and performing the necessary floating point conversions. Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer Reviewed By: ostannard Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75169
-
Paul Walker authored
Adds aarch64-sve-vector-bits-{min,max} to allow the size of SVE data registers (in bits) to be specified. This allows the code generator to make assumptions it normally couldn't. As a starting point this information is used to mark fixed length vector types that can fit within the specified size as legal. Reviewers: rengolin, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80384
-
Sameer Sahasrabuddhe authored
For a loop, a join block is a block that is reachable along multiple disjoint paths from the exiting block of a loop. If the exit condition of the loop is divergent, then such join blocks must also be marked divergent. This currently fails in some cases because not all join blocks are identified correctly. The workaround is to conservatively mark every join block of any branch (not necessarily the exiting block of a loop) as divergent. https://bugs.llvm.org/show_bug.cgi?id=46372 Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D81806
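The conservative workaround can be sketched as: treat any block reachable from more than one successor of a branch as a join (illustrative Python with a hypothetical CFG encoding; the precise analysis additionally requires the paths to be disjoint):

```python
from collections import deque

def join_blocks(succs, branch):
    # Blocks reachable from two or more successors of `branch` are
    # conservatively marked as joins (and hence divergent when the
    # branch condition is divergent).
    def reachable(start):
        seen, work = set(), deque([start])
        while work:
            b = work.popleft()
            if b not in seen:
                seen.add(b)
                work.extend(succs.get(b, ()))
        return seen
    hits = {}
    for s in succs[branch]:
        for b in reachable(s):
            hits[b] = hits.get(b, 0) + 1
    return {b for b, n in hits.items() if n > 1}
```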
-