Commits · 784929d0454c4df6a98ef6fbbd1d30a6f71f9c16 · Lorenzo Albano / LLVM bpEVL

Feb 08, 2019

Implementation of asm-goto support in LLVM · 784929d0

Craig Topper authored Feb 08, 2019

This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html

This patch adds a new CallBr IR instruction to support asm-goto
inline assembly, as in GCC and as used by the Linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.

This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.

There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.

Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii

Differential Revision: https://reviews.llvm.org/D53765

llvm-svn: 353563

[CodeExtractor] Restore outputs after creating exit stubs · 0e5dd512

Vedant Kumar authored Feb 08, 2019

When CodeExtractor saves the result of an InvokeInst at the first insertion
point of the 'normal destination' basic block, that block can be omitted
from the outlined region, so the store is placed outside of the function. The
suggested solution is to save outputs after creating exit stubs for the
new function; in this case the stores are placed in those blocks, before
the return.

Patch by Sergei Kachkov!

Fixes llvm.org/PR40455.

Differential Revision: https://reviews.llvm.org/D57919

llvm-svn: 353562

AMDGPU: Eliminate GPU specific SubtargetFeatures · 564f0f83

Matt Arsenault authored Feb 08, 2019

Inline compatibility is determined from the individual feature
bits. These are just sets of the separate features, but will always be
treated as incompatible unless they are specifically ignored.

Defining the ISA version number here in tablegen would be nice, but it
turns out this wasn't actually used.

llvm-svn: 353558

[DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X)) · 92a8c367

Nemanja Ivanovic authored Feb 08, 2019

The sqrt case is faster and we already do this for the case where
the exponent is 0.25. This adds the 0.75 case which is also not
sensitive to signed zeros.
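
As a rough numerical illustration of the rewrite named in the title (a Python sketch, not the DAGCombine code; the function name `pow_075_via_sqrt` is made up for this example):

```python
import math


def pow_075_via_sqrt(x: float) -> float:
    """Compute x**0.75 as sqrt(x) * sqrt(sqrt(x))."""
    root = math.sqrt(x)            # x**0.5
    return root * math.sqrt(root)  # x**0.5 * x**0.25


# Compare against the library pow for a few non-negative inputs.
for value in (0.0, 1.0, 2.5, 16.0, 1e6):
    assert math.isclose(pow_075_via_sqrt(value), math.pow(value, 0.75), rel_tol=1e-12)
```

For an input of -0.0 both forms return +0.0, because the two negative-zero square roots are multiplied together, which is consistent with the remark about signed zeros.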

Patch by Whitney Tsang (Whitney)

Differential revision: https://reviews.llvm.org/D57434

llvm-svn: 353557

[GISel][NFC]: Add missing call to record CSE hits in the CSEMIRBuilder · 01e818a9

Aditya Nandakumar authored Feb 08, 2019

https://reviews.llvm.org/D57932

Add some logging + tests to make sure CSEInfo prints debug output.

reviewed by: arsenm

llvm-svn: 353553

AMDGPU: Remove GCN features and predicates · d7047276

Matt Arsenault authored Feb 08, 2019

These are no longer necessary since the R600 tablegen files are split
out now.

llvm-svn: 353548

[InstrProf] Implement static profdata registration · 987d331f

Reid Kleckner authored Feb 08, 2019

Summary:
The motivating use case is eliminating duplicate profile data registered
for the same inline function in two object files. Before this change,
users would observe multiple symbol definition errors with VC link, but
links with LLD would succeed.

Users (Mozilla) have reported that PGO works well with clang-cl and LLD,
but when using LLD without this static registration, we would get into a
"relocation against a discarded section" situation. I'm not sure what
happens in that situation, but I suspect that duplicate, unused profile
information was retained. If so, this change will reduce the size of
such binaries with LLD.

Now, Windows uses static registration and is in line with all the other
platforms.

Reviewers: davidxl, wmi, inglorion, void, calixte

Subscribers: mgorny, krytarowski, eraman, fedor.sergeev, hiraditya, #sanitizers, dmajor, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D57929

llvm-svn: 353547

[TargetLowering] Use ISD::FSHR in expandFixedPointMul · eb6a47a4

Simon Pilgrim authored Feb 08, 2019

Replace OR(SHL,SRL) pattern with ISD::FSHR (legalization expands this later if necessary) - this helps with the scale == 0 'undefined' drop-through case that was discussed on D55720.
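
A small Python model may help show why a funnel shift avoids the `scale == 0` problem. It assumes 32-bit values and uses the hypothetical names `fshr` and `or_shl_srl`; it models the arithmetic only, not the SelectionDAG code. With the OR(SHL,SRL) form, a scale of 0 requires a left shift by the full bit width, which is out of range for a fixed-width shift, while the funnel shift takes the amount modulo the width and simply returns the low operand.

```python
def fshr(hi: int, lo: int, shift: int, bits: int = 32) -> int:
    """Funnel shift right: concatenate hi:lo, shift right by shift % bits, keep the low bits."""
    mask = (1 << bits) - 1
    amount = shift % bits
    wide = ((hi & mask) << bits) | (lo & mask)
    return (wide >> amount) & mask


def or_shl_srl(hi: int, lo: int, shift: int, bits: int = 32) -> int:
    """OR(SHL(hi, bits - shift), SRL(lo, shift)); only valid for 0 < shift < bits."""
    if not 0 < shift < bits:
        raise ValueError("shift by the full width is out of range for a fixed-width shift")
    mask = (1 << bits) - 1
    return ((hi << (bits - shift)) & mask) | ((lo & mask) >> shift)
```

For every shift amount from 1 to 31 the two functions agree; only `fshr` is defined for a shift of 0.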

llvm-svn: 353546

[TargetLowering] Add SimplifyDemandedBits funnel shift support · 478bb907

Simon Pilgrim authored Feb 08, 2019

llvm-svn: 353539

ArgumentPromotion should copy all metadata to new Function · 3ce8112d

Teresa Johnson authored Feb 08, 2019

Summary:
ArgumentPromotion had code to specifically move the dbg metadata over to
the new function, but other metadata such as the function_entry_count
!prof metadata was not. Replace code that moved dbg metadata with a call
to copyMetadata. The old metadata is automatically removed when the old
Function is removed.

Reviewers: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57846

llvm-svn: 353537

[X86] Remove isReMaterializable from X87 floating point constant loads and constant pool loads. · 41a1792b

Craig Topper authored Feb 08, 2019

Summary: These instructions update FPSW so they aren't generically safe to rematerialize into any location if FPSW is live for a comparison result. They also use FPCW for exception masking control. Though the only exception they can generate is stack overflow and we manage the stack ourselves so that's not really going to occur.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57934

llvm-svn: 353536

[x86] fix formatting; NFC · e9cc26a5

Sanjay Patel authored Feb 08, 2019

(test commit #2 migrating to git)

llvm-svn: 353533

[AMDGPU] Fix CS scratch setup on pre-GCN3 ASICs · 494b8ac9

Carl Ritson authored Feb 08, 2019

Summary:
Prior to GCN3 s_load_dword offsets are in dwords rather than bytes.
Thus the scratch buffer descriptor offset must be adjusted for pre-GCN3 ASICs.
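
The unit change can be sketched as follows. This is an illustrative Python helper with a hypothetical name (`s_load_dword_offset`) and a boolean flag standing in for the subtarget check; it is not the code from the patch.

```python
def s_load_dword_offset(byte_offset: int, pre_gcn3: bool) -> int:
    """Return the s_load_dword offset operand in the unit the target expects."""
    if not pre_gcn3:
        return byte_offset  # GCN3 and later: offset is in bytes
    if byte_offset % 4 != 0:
        raise ValueError("byte offset must be dword-aligned")
    return byte_offset // 4  # pre-GCN3: offset is in dwords
```

For example, a descriptor at byte offset 16 would be addressed with an operand of 4 before GCN3 and 16 afterwards.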

Reviewers: nhaehnle, tpr

Reviewed By: nhaehnle

Subscribers: sheredom, arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D56496

llvm-svn: 353530

Revert r353416 "[DAG] Cleanup unused nodes on failed store-to-load forward combine." · 97011ccc

Nirav Dave authored Feb 08, 2019

This cleanup causes out-of-tree crashes.

llvm-svn: 353527

AMDGPU/GlobalISel: Fix shift legalization for non-power-of-2 · b0a22704

Matt Arsenault authored Feb 08, 2019

clampScalar doesn't do anything for non-power-of-2 in range.
There should probably be a combination rule to reduce the number
of matching rules.

llvm-svn: 353526

[AMDGPU][MC] Added support of lds_direct operand · 942c273d

Dmitry Preobrazhensky authored Feb 08, 2019

See bug 39293: https://bugs.llvm.org/show_bug.cgi?id=39293

Reviewers: artem.tamazov, rampitec

Differential Revision: https://reviews.llvm.org/D57889

llvm-svn: 353524

AMDGPU/GlobalISel: Fix non-power-of-2 implicit_def · 0f2debb1

Matt Arsenault authored Feb 08, 2019

llvm-svn: 353522

[MIPS GlobalISel] Select any extending load and truncating store · c98b26d3

Petar Avramovic authored Feb 08, 2019

Make the behavior of G_LOAD in widenScalar the same as for G_ZEXTLOAD and
G_SEXTLOAD. That is, perform widenScalarDst to the size given by the target
and avoid additional checks in common code. Targets can reorder or add
additional rules in LegalizeRuleSet for the opcode to achieve the desired
behavior.

Select an extending load that does not specify a type of extension
as a zero-extending load.

Select a truncating store that stores the number of bytes indicated by the
size in MachineMemOperand.

Differential Revision: https://reviews.llvm.org/D57454

llvm-svn: 353520

AMDGPU/GlobalISel: Don't use a copy in addrspacecast lowering · dc88a2ce

Matt Arsenault authored Feb 08, 2019

llvm-svn: 353516

[AMDGPU][MC][CODEOBJECT] Added predefined symbols to access GPU minor and stepping numbers · 62a0318d

Dmitry Preobrazhensky authored Feb 08, 2019

Added the following Code Object v3 symbols:
    .amdgcn.gfx_generation_minor
    .amdgcn.gfx_generation_stepping

Reviewers: artem.tamazov, kzhuravl

Differential Revision: https://reviews.llvm.org/D57826

llvm-svn: 353515

[AMDGPU] Fix DPP combiner · 7fe97f8c

Valery Pykhtin authored Feb 08, 2019

Differential revision: https://reviews.llvm.org/D55444

dpp move with uses and old reg initializer should be in the same BB.
bound_ctrl:0 is only considered when bank_mask and row_mask are fully enabled (0xF). Otherwise the old register value is checked for identity.
Added add, subrev, and, or instructions to the old folding function.
Kill flag is cleared for the src0 (DPP register) as it may be copied into more than one user.

The pass is still disabled by default.

llvm-svn: 353513

[DWARF] LLVM ERROR: Broken function found, while removing Debug Intrinsics. · 08dc50f2

Carlos Alberto Enciso authored Feb 08, 2019

Check that when SimplifyCFG is flattening a 'br', all of its debug intrinsic instructions are removed, including any dbg.label referencing a label associated with the basic blocks being removed.

Differential Revision: https://reviews.llvm.org/D57444

llvm-svn: 353511

Revert r353424 "[llvm-ar][libObject] Fix relative paths when nesting thin archives." · f5db7158

Hans Wennborg authored Feb 08, 2019

This broke the Chromium build on Windows, see https://crbug.com/930058

> Summary:
> When adding one thin archive to another, we currently chop off the relative path to the flattened members. For instance, when adding `foo/child.a` (which contains `x.txt`) to `parent.a`, when flattening it we should add it as `foo/x.txt` (which exists) instead of `x.txt` (which does not exist).
>
> As a note, this also undoes the `IsNew` parameter of handling relative paths in r288280. The unit test there still passes.
>
> This was reported as part of testing the kernel build with llvm-ar: https://patchwork.kernel.org/patch/10767545/ (see the second point).
>
> Reviewers: mstorsjo, pcc, ruiu, davide, david2050
>
> Subscribers: hiraditya, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D57842
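
The path handling described in the quoted summary of the reverted change can be sketched in Python. The helper name `flattened_member_path` is hypothetical and the sketch only models the string manipulation with POSIX-style paths, not llvm-ar itself.

```python
import posixpath


def flattened_member_path(child_archive: str, member: str) -> str:
    """Resolve a member of a nested thin archive relative to that archive's directory."""
    child_dir = posixpath.dirname(child_archive)
    return posixpath.normpath(posixpath.join(child_dir, member))
```

With the example from the summary, `flattened_member_path("foo/child.a", "x.txt")` gives `foo/x.txt` rather than `x.txt`.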

This reverts commit bf990ab5.

llvm-svn: 353507

[MIPS GlobalISel] Select mul · 56dc218d

Petar Avramovic authored Feb 08, 2019

Legalize and select G_MUL for s32 and smaller types for MIPS32.

Differential Revision: https://reviews.llvm.org/D57816

llvm-svn: 353506

[LoopSimplifyCFG] Use DTU.applyUpdates instead of insert/deleteEdge · 6b63d3a2

Max Kazantsev authored Feb 08, 2019

The `insert/deleteEdge` methods in DTU can make updates incorrectly in some cases
(see https://bugs.llvm.org/show_bug.cgi?id=40528), so it is recommended to
use the `applyUpdates` methods instead when making a mass update to the CFG.

Differential Revision: https://reviews.llvm.org/D57316
Reviewed By: kuhar

llvm-svn: 353502

[ARM] Add OptMinSize to ARMSubtarget · 5b09834b

Sam Parker authored Feb 08, 2019

In many places in the backend, we want to know whether we're
optimising for code size, which is currently done by checking the
current machine function's attributes. A subtarget is created on a
per-function basis, so this is already known at construction time;
record it in the new object.

Differential Revision: https://reviews.llvm.org/D57812

llvm-svn: 353501

[CodeExtractor] Update function's assumption cache after extracting blocks from it · 807960e6

Sergey Dmitriev authored Feb 08, 2019

Summary: Assumption cache's self-updating mechanism does not correctly handle the case when blocks are extracted from the function by the CodeExtractor. As a result function's assumption cache may have stale references to the llvm.assume calls that were moved to the outlined function. This patch fixes this problem by removing extracted llvm.assume calls from the function’s assumption cache.

Reviewers: hfinkel, vsk, fhahn, davidxl, sanjoy

Reviewed By: hfinkel, vsk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57215

llvm-svn: 353500

[WebAssembly] Fix parseImmediate's memory alignment requirement · df6770f0

Heejin Ahn authored Feb 08, 2019

This fixes the current failure in the x86-64 ubsan bot caused by
r353496.

llvm-svn: 353499

AMDGPU/GlobalISel: Legalize addrspacecast · a8b4339c

Matt Arsenault authored Feb 08, 2019

Use a placeholder constant for now on targets
that need the load from the queue ptr.

llvm-svn: 353497

[WebAssembly] Fixed Disassembler ignoring endian swap on big endian. · 0d9f3f7f

Wouter van Oortmerssen authored Feb 08, 2019

Summary: This fixes: https://bugs.llvm.org/show_bug.cgi?id=40620

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57933

llvm-svn: 353496

Fix the lowering issue of intrinsics llvm.localaddress on X86 · 738180cc

Craig Topper authored Feb 08, 2019

Patch by Yuanke Luo

Reviewers: craig.topper, annita.zhang, smaslov, rnk, wxiao3

Reviewed By: rnk

Subscribers: efriedma, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57501

llvm-svn: 353492

[X86] Add FPCW as a register and start using it as an implicit use on floating point instructions. · c782f188

Craig Topper authored Feb 08, 2019

Summary:
FPCW contains the rounding mode control which we manipulate to implement fp to integer conversion by changing the rounding mode, storing the value to the stack, and then changing the rounding mode back. Because we didn't model FPCW and its dependency chain, other instructions could be scheduled into the middle of the sequence.

This patch introduces the register and adds it as an implicit def of FLDCW and implicit use of the FP binary arithmetic instructions and store instructions. There are more instructions that need to be updated, but this is a good start. I believe this fixes at least the reduced test case from PR40529.

Reviewers: RKSimon, spatel, rnk, efriedma, andrew.w.kaylor

Subscribers: dim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57735

llvm-svn: 353489

[AArch64] Fix condition for "high-vector" DUP optimizations. · 29c06093

Eli Friedman authored Feb 08, 2019

AArch64 NEON has a bunch of instructions with a "2" suffix that extract
the top half of the source vectors, instead of the bottom half.  We have
some DAGCombines to try to take advantage of that.  However, they
assumed that any EXTRACT_VECTOR was extracting the high half of the
vector in question.
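
The missing condition can be illustrated with a small predicate. This is a Python sketch with a hypothetical name and parameters (`extracts_high_half`, element counts and a start index), not the AArch64 DAGCombine code: an extract only qualifies when it covers exactly the upper half of the source vector.

```python
def extracts_high_half(num_elems: int, start: int, count: int) -> bool:
    """True only if elements [start, start + count) are exactly the top half of the vector."""
    if num_elems % 2 != 0:
        return False
    half = num_elems // 2
    return start == half and count == half
```

For an 8-element source, an extract of 4 elements starting at index 4 qualifies, while one starting at index 0 (the low half) does not.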

This issue has apparently existed since the AArch64 backend was merged.

Fixes https://bugs.llvm.org/show_bug.cgi?id=40632 .

Differential Revision: https://reviews.llvm.org/D57862

llvm-svn: 353486

Feb 07, 2019

[mips][micromips] Fix how values in .gcc_except_table are calculated · 3cfcd754

Petar Jovanovic authored Feb 07, 2019

When a landing pad is calculated in a program that is compiled for micromips
with -fPIC flag, it will point to an even address.
Such an error will cause a segmentation fault, as the instructions in
micromips are aligned on odd addresses. This patch sets the last bit of the
offset where a landing pad is, to 1, which will effectively be an odd
address and point to the instruction exactly.
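
The adjustment itself amounts to setting the least significant bit of the landing pad offset. A minimal Python sketch, with a hypothetical function name and no connection to the actual emission code:

```python
def micromips_landing_pad_offset(offset: int) -> int:
    """Set bit 0 of the landing pad offset so the resulting address is odd."""
    return offset | 1
```

An even offset such as 0x40 becomes 0x41, and an offset that is already odd is left unchanged.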

r344591 fixed this issue for -static compilation.

Patch by Aleksandar Beserminji.

Differential Revision: https://reviews.llvm.org/D57677

llvm-svn: 353480

[x86] fix formatting; NFC · 81f859d1

Sanjay Patel authored Feb 07, 2019

llvm-svn: 353477

[WebAssembly] Fix imported function symbol names that differ from their import names in the .o format · 29874cea

Dan Gohman authored Feb 07, 2019

Add a flag to allow symbols to have a wasm import name which differs from the
linker symbol name, allowing the linker to link code using the import_module
attribute.

This is the MC/Object portion of the patch.

Differential Revision: https://reviews.llvm.org/D57632

llvm-svn: 353474

[InstCombine] Optimize `atomicrmw <op>, 0` into `load atomic` when possible · 96f54de8

Quentin Colombet authored Feb 07, 2019

This commit teaches InstCombine how to replace an atomicrmw operation
with a simple load atomic.
For a given `atomicrmw <op>`, this is possible when:
1. The ordering of that operation is compatible with a load (i.e.,
   anything that doesn't have a release semantic).
2. <op> does not modify the value being stored
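
The two conditions above can be modelled roughly as follows. This Python sketch is not the InstCombine implementation: the set of load-compatible orderings and the list of operations are assumptions made for illustration, and only operations for which an operand of 0 leaves the stored value unchanged are covered.

```python
# Assumption: orderings without release semantics that an atomicrmw can carry.
LOAD_COMPATIBLE_ORDERINGS = {"monotonic", "acquire"}

# Operations for which an operand of 0 leaves the stored value unchanged.
ZERO_IS_NOOP = {"add", "sub", "or", "xor"}


def leaves_value_unchanged(op: str, operand: int) -> bool:
    """Condition 2: the operation does not modify the value being stored."""
    return op in ZERO_IS_NOOP and operand == 0


def can_become_atomic_load(op: str, operand: int, ordering: str) -> bool:
    """Both conditions must hold for the atomicrmw to be replaced by a load atomic."""
    return ordering in LOAD_COMPATIBLE_ORDERINGS and leaves_value_unchanged(op, operand)
```

Under these assumptions `atomicrmw add ..., 0 monotonic` qualifies, while the same operation with `release` ordering, or `atomicrmw and ..., 0` (which clears the value), does not.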

Differential Revision: https://reviews.llvm.org/D57854

llvm-svn: 353471

[LV] Remove unnecessary assignment to UserIC. · f557a94a

Florian Hahn authored Feb 07, 2019

llvm-svn: 353469

[InstCombine] Fix crashing from (icmp (bitcast ([su]itofp X)), Y) · 781d8838

Sanjay Patel authored Feb 07, 2019

This fixes a class of bugs introduced by D44367,
which transforms various cases of icmp (bitcast ([su]itofp X)), Y to icmp X, Y.
If the bitcast is between vector types with a different number of elements,
the current code will produce bad IR along the lines of: icmp <N x i32> ..., <M x i32> <...>.

This patch suppresses the transform if the bitcast changes the number of vector elements.

Patch by: @AndrewScheidecker (Andrew Scheidecker)

Differential Revision: https://reviews.llvm.org/D57871

llvm-svn: 353467

Move SMTSolver dump() methods out-of-line. · e794db88

Adrian Prantl authored Feb 07, 2019

This broke modularized non-local-submodule-visibility builds because
the function bodies pulled in extra dependencies.

llvm-svn: 353465
