Commits · 887ec751732efa1cc4508ce524f7e72b4e597e45 · Lorenzo Albano / LLVM bpEVL

Aug 22, 2018

[MS Demangler] Print template constructor args. · ee09170d

Zachary Turner authored Aug 21, 2018

Previously if you had something like this:

template<typename T>
struct Foo {
  template<typename U>
  Foo(U);
};

Foo F(3.7);

this would mangle as ??$?0N@?$Foo@H@@QEAA@N@Z

and this would be demangled as:

undname:      __cdecl Foo<int>::Foo<int><double>(double)
llvm-undname: __cdecl Foo<int>::Foo<int>(double)

Note the lack of the constructor template parameter in our
demangling.

This patch makes it so we print the constructor argument list.

llvm-svn: 340356

ee09170d

Aug 21, 2018

MachineScheduler: Refactor setPolicy() to limit computing remaining latency · ecd6aa5b

Tom Stellard authored Aug 21, 2018

Summary:
Computing the remaining latency can be very expensive especially
on graphs of N nodes where the number of edges approaches N^2.

This reduces the compile time of a pathological case with the
AMDGPU backend from ~7.5 seconds to ~3 seconds.  This test case has
a basic block with 2655 stores, each with somewhere between 500
and 1500 successors and predecessors.

Reviewers: atrick, MatzeB, airlied, mareko

Reviewed By: mareko

Subscribers: tpr, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D50486

llvm-svn: 340346

ecd6aa5b

[AMDGPU] Consider loads from flat addrspace to be potentially divergent · 72855e36

Scott Linder authored Aug 21, 2018

In general we can't assume flat loads are uniform, and cases where we can prove
they are should be handled through infer-address-spaces.

Differential Revision: https://reviews.llvm.org/D50991

llvm-svn: 340343

72855e36

[MS Demangler] Fix a few more edge cases. · df4cd7cb

Zachary Turner authored Aug 21, 2018

I found these by running llvm-undname over a couple hundred
megabytes of object files generated as part of building chromium.
The issues fixed in this patch are:

  1) decltype-auto return types.
  2) Indirect vtables (e.g. const A::`vftable'{for `B'})
  3) Pointers, references, and rvalue-references to member pointers.

I have exactly one remaining symbol out of a few hundred MB of object
files that produces a name we can't demangle, and it's related to
back-referencing.
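
For illustration, source constructs of kinds 1) and 3) look roughly like the sketch below. The struct and function names are made up for this example, and the sketch makes no claim about the exact symbols they mangle to.

```cpp
#include <type_traits>

struct A {
  int Field = 7;
};

// 1) A function whose return type is deduced with decltype(auto).
inline decltype(auto) getAnswer() { return 42; }
static_assert(std::is_same<decltype(getAnswer()), int>::value,
              "decltype(auto) deduces int for a prvalue int");

// 3) A pointer, a reference, and an rvalue reference to a member pointer.
using MemPtr = int A::*;
inline int readViaPointer(const A &Obj, MemPtr *P) { return Obj.*(*P); }
inline int readViaReference(const A &Obj, MemPtr &R) { return Obj.*R; }
inline int readViaRvalueReference(const A &Obj, MemPtr &&R) { return Obj.*R; }
```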

llvm-svn: 340341

df4cd7cb

[WebAssembly] Restore __stack_pointer after catch instructions · 78d19108

Heejin Ahn authored Aug 21, 2018

Summary:
After the stack is unwound due to a thrown exception, the
`__stack_pointer` global can point to an invalid address. This patch inserts
instructions that restore the `__stack_pointer` global.

Reviewers: jgravelle-google, dschuff

Subscribers: mgorny, sbc100, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50980

llvm-svn: 340339

78d19108

[WebAssembly] v128.const · 22442924

Thomas Lively authored Aug 21, 2018

Summary:
This CL implements v128.const for each vector type. New operand types
are added to ensure the vector contents can be serialized without LEB
encoding. Tests are added for instruction selection, encoding,
assembly and disassembly.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50873

llvm-svn: 340336

22442924

[LICM] Refactor some AliasSetTracker code to get rid of new/deletes. NFC · 883fe455

Marcello Maggioni authored Aug 21, 2018

Differential Revision: https://reviews.llvm.org/D51024

llvm-svn: 340333

883fe455

[CodeExtractor] Use 'normal destination' BB as insert point to store invoke results. · 7cdf52e4

Florian Hahn authored Aug 21, 2018

Currently CodeExtractor tries to use the next node after an invoke to
place the store for the result of the invoke, if it is an out parameter
of the region. This fails, as the invoke terminates the current BB.
In that case, we can place the store in the 'normal destination' BB, as
the result will only be available in that case.


Reviewers: davidxl, davide, efriedma

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D51037

llvm-svn: 340331

7cdf52e4

[WebAssembly] Don't make wasm cleanuppads into funclet entries · 9cd7f88a

Heejin Ahn authored Aug 21, 2018

Summary:
Catchpads and cleanuppads are not funclet entries; they are only EH
scope entries. We already don't set `isEHFuncletEntry` for catchpads.
This patch does the same thing for cleanuppads.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50654

llvm-svn: 340330

9cd7f88a

[WebAssembly] Change writeSPToMemory to writeSPToGlobal (NFC) · 20c9c443

Heejin Ahn authored Aug 21, 2018

Summary: SP is now a __stack_pointer global and not a memory address anymore.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D51046

llvm-svn: 340328

20c9c443

[RegisterCoalescer] Use substPhysReg in reMaterializeTrivialDef · e0632138

Bjorn Pettersson authored Aug 21, 2018

Summary:
When RegisterCoalescer::reMaterializeTrivialDef is substituting
a register use in a DBG_VALUE instruction, and the old register
is a subreg, and the new register is a physical register,
then we need to use substPhysReg in order to extract the correct
subreg.

Reviewers: wmi, aprantl

Reviewed By: wmi

Subscribers: hiraditya, MatzeB, qcolombet, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D50844

llvm-svn: 340326

e0632138

[WebAssembly] Add isEHScopeReturn instruction property · ed5e06b0

Heejin Ahn authored Aug 21, 2018

Summary:
So far, the `isReturn` property has been used to mean both a return instruction
from a function and the end of an EH scope, a scope that starts with an EH
scope entry BB and ends with a catchret or a cleanupret instruction.
Because WinEH uses funclets, all EH-scope-ending instructions are also
real return instructions from a function. But for wasm, they only serve
as the end marker of an EH scope, not as a return instruction that
exits a function. This mismatch caused incorrect prolog and epilog
generation in wasm EH scopes. This patch fixes this.

This patch is in the same vein as rL333045, which splits
`MachineBasicBlock::isEHFuncletEntry` into `isEHFuncletEntry` and
`isEHScopeEntry`.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50653

llvm-svn: 340325

ed5e06b0

[InstCombine] Pull simple checks above a more complicated one. NFCI · 3d8fe39c

Craig Topper authored Aug 21, 2018

I'm assuming it's easier to make sure the RHS of an XOR is all ones than it is to check for the many select patterns we have. So let's check that first. Same with the one-use check.

llvm-svn: 340321

3d8fe39c

[GVN] Assign new value number to calls reading memory, if there is no MemDep info. · 9583d4fa

Florian Hahn authored Aug 21, 2018

Currently we assign the same value number to two calls reading the same
memory location if we do not have MemoryDependence info. Without MemDep
Info we cannot guarantee that there is no store between the two calls, so we
have to assign a new number to the second call.
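
The rule can be sketched with a toy value table. Everything here (`ToyValueTable`, `lookupOrAddCall`, keying calls by callee name) is hypothetical and much simpler than the real GVN code; it only illustrates when a fresh number is handed out.

```cpp
#include <map>
#include <string>

// Toy value table; the class and its members are hypothetical, not GVN's API.
class ToyValueTable {
  std::map<std::string, unsigned> CallNumbers;
  unsigned NextNumber = 1;
  bool HasMemDep;

public:
  explicit ToyValueTable(bool MemDepAvailable) : HasMemDep(MemDepAvailable) {}

  unsigned lookupOrAddCall(const std::string &Callee, bool ReadsMemory) {
    // Without dependence info a store may sit between two reading calls,
    // so each such call gets a fresh number and is never treated as redundant.
    if (ReadsMemory && !HasMemDep)
      return NextNumber++;
    // Otherwise reuse the number of an identical earlier call. A real pass
    // would first query the analysis for intervening stores; that step is
    // omitted in this sketch.
    auto It = CallNumbers.find(Callee);
    if (It != CallNumbers.end())
      return It->second;
    unsigned N = NextNumber++;
    CallNumbers[Callee] = N;
    return N;
  }
};
```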

This patch also adds a new option, EnableMemDep, to enable/disable running
MemoryDependenceAnalysis, and renames NoLoads to NoMemDepAnalysis to be
more explicit about what it does. Because the option also affects calls that
read memory, the name NoLoads was a bit confusing.

Reviewers: efriedma, sebpop, john.brawn, wmi

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D50893

llvm-svn: 340319

9583d4fa

[RegisterCoalscer] Manually remove leftover segments when commuting def · b211434a

Krzysztof Parzyszek authored Aug 21, 2018

In removeCopyByCommutingDef, segments from the source live range are
copied into (and merged with) the segments of the target live range.
This is performed for all subranges of the source interval. It can
happen that there will be subranges of the target interval that had
no corresponding subranges in the source interval, and in such cases
these subranges will not be updated. Since the copy being coalesced
is about to be removed, these ranges need to be updated by removing
the segments that are started by the copy.

llvm-svn: 340318

b211434a

[NVPTX] Remove ftz variants of cvt with rounding mode · d66dde5a

Benjamin Kramer authored Aug 21, 2018

These do not exist in ptxas, it refuses to compile them.

Differential Revision: https://reviews.llvm.org/D51042

llvm-svn: 340317

d66dde5a

Temporarily Revert "[PowerPC] Generate Power9 extswsli extend sign and shift... · 3dc594c1

Eric Christopher authored Aug 21, 2018

Temporarily Revert "[PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction" due to it causing a compiler crash on valid.

This reverts commit r340016, testcase forthcoming.

llvm-svn: 340315

3dc594c1

[AST] Remove notion of volatile from alias sets [NFCI] · c3c23e8c

Philip Reames authored Aug 21, 2018

Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet.

It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two removed asserts.)

There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for.

llvm-svn: 340312

c3c23e8c

Update DBG_VALUE register operand during LiveInterval operations · 132fc5a8

Yury Delendik authored Aug 21, 2018

Summary:
Handling of DBG_VALUE in ConnectedVNInfoEqClasses::Distribute() was fixed in
PR16110. However DBG_VALUE register operands are not getting updated. This
patch properly resolves the value location.

Reviewers: MatzeB, vsk

Reviewed By: MatzeB

Subscribers: kparzysz, thegameg, vsk, MatzeB, dschuff, sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D48994

llvm-svn: 340310

132fc5a8

Revert "Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting operations"" · c0333f71

Aditya Nandakumar authored Aug 21, 2018

This reverts commit d1341152d91398e9a882ba2ee924147ea2f9b589.

This patch originally made use of Nested MachineIRBuilder buildInstr
calls, and since order of argument processing is not well defined, the
instructions were built slightly in a different order (still correct).
I've removed the nested buildInstr calls to have a defined order now.
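
The ordering issue comes from C++ itself: the order in which function arguments are evaluated is unspecified. A minimal sketch, using a hypothetical `ToyBuilder` that stands in for MachineIRBuilder and simply logs each instruction as it is created:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an IR builder: records instructions in the
// order they are created and returns the index of the new instruction.
struct ToyBuilder {
  std::vector<std::string> Log;

  int build(const std::string &Name) {
    Log.push_back(Name);
    return static_cast<int>(Log.size()) - 1;
  }
  int build(const std::string &Name, int /*LHS*/, int /*RHS*/) {
    return build(Name);
  }
};

// Nested form: the compiler may evaluate the two inner build() calls in
// either order, so "lhs" and "rhs" can swap places between compilers.
inline int buildNested(ToyBuilder &B) {
  return B.build("add", B.build("lhs"), B.build("rhs"));
}

// Sequenced form: named locals pin the creation order to lhs, rhs, add.
inline int buildSequenced(ToyBuilder &B) {
  int LHS = B.build("lhs");
  int RHS = B.build("rhs");
  return B.build("add", LHS, RHS);
}
```

Both forms create the same three instructions; only the sequenced form guarantees the same order on every compiler.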

Patch was tested by Mikael.

llvm-svn: 340309

c0333f71

[X86][SSE] Lower vXi8 general shifts to SSE shifts directly. NFCI. · 50eba6b3

Simon Pilgrim authored Aug 21, 2018

Most of these shifts are extended to vXi16 so we don't gain anything from forcing another round of generic shift lowering - we know these extended cases are legal constant splat shifts.

llvm-svn: 340307

50eba6b3

[BypassSlowDivision] Teach bypass slow division not to interfere with div by... · b172b888

Craig Topper authored Aug 21, 2018

[BypassSlowDivision] Teach bypass slow division not to interfere with div by constant where constants have been constant hoisted, but not moved from their basic block

DAGCombiner doesn't pay attention to whether constants are opaque before doing the div by constant optimization. So BypassSlowDivision shouldn't introduce control flow that would make DAGCombiner unable to see an opaque constant. This can occur when a div and rem of the same constant are used in the same basic block: the constant will be hoisted, but it will not leave the block.

Longer term we probably need to look into the X86 immediate cost model used by constant hoisting and maybe not mark div/rem immediates for hoisting at all.

This fixes the case from PR38649.
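
For context, the control flow that bypassing introduces looks roughly like the following source-level sketch (the function name is hypothetical, and the real pass works on IR, not C++). The branch is what can hide a hoisted constant divisor from DAGCombiner.

```cpp
#include <cstdint>

// Sketch of the bypass idea: when both operands fit in 32 bits, take a
// cheaper 32-bit divide; otherwise fall back to the full 64-bit divide.
// Precondition: B != 0.
inline uint64_t bypassedUDiv(uint64_t A, uint64_t B) {
  if (((A | B) >> 32) == 0)
    return static_cast<uint32_t>(A) / static_cast<uint32_t>(B);
  return A / B;
}
```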

Differential Revision: https://reviews.llvm.org/D51000

llvm-svn: 340303

b172b888

[X86][SSE] Lower v8i16 general shifts to SSE shifts directly. NFCI. · 98eb4ae4

Simon Pilgrim authored Aug 21, 2018

We don't gain anything from forcing another round of generic shift lowering - we know these are legal constant splat shifts.

llvm-svn: 340302

98eb4ae4

[X86][SSE] Lower directly to SSE shifts in the BLEND(SHIFT, SHIFT) combine. NFCI. · dbe4e9e3

Simon Pilgrim authored Aug 21, 2018

We don't gain anything from forcing another round of generic shift lowering - we know these are legal constant splat shifts.

llvm-svn: 340300

dbe4e9e3

Try to fix bot build failure · 182bab8d

Matt Arsenault authored Aug 21, 2018

llvm-svn: 340296

182bab8d

[AMDGPU] Support idot2 pattern. · 3528c803

Farhana Aleen authored Aug 21, 2018

Summary: Transform

  add (mul ((i32)S0.x, (i32)S1.x), add (mul ((i32)S0.y, (i32)S1.y), (i32)S3))
    => i/udot2((v2i16)S0, (v2i16)S1, (i32)S3)
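
As a scalar reference for the signed form of the pattern, a minimal sketch (the `V2I16` struct and `sdot2Reference` name are hypothetical; wrap-around on overflow is an assumption of this sketch, not a statement about the hardware instruction):

```cpp
#include <cstdint>

// Two 16-bit lanes, named as in the pattern above.
struct V2I16 {
  int16_t X, Y;
};

// Computes S0.x * S1.x + S0.y * S1.y + S3 with 32-bit wrap-around.
inline int32_t sdot2Reference(V2I16 S0, V2I16 S1, int32_t S3) {
  // Each product of two sign-extended i16 values fits in i32; the additions
  // are done in uint32_t so that overflow wraps instead of being undefined.
  uint32_t Sum = static_cast<uint32_t>(int32_t(S0.X) * int32_t(S1.X)) +
                 static_cast<uint32_t>(int32_t(S0.Y) * int32_t(S1.Y)) +
                 static_cast<uint32_t>(S3);
  return static_cast<int32_t>(Sum);
}
```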

Author: FarhanaAleen

Reviewed By: arsenm

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D50024

llvm-svn: 340295

3528c803

AMDGPU: Partially move target handling code from clang to TargetParser · 7dd9d58c

Matt Arsenault authored Aug 21, 2018

A future change in clang necessitates access of this information
from the driver, so move this into a common place.

Try to mimic something resembling the API the other targets are
using here.

One thing I'm uncertain about is how to split amdgcn and r600
handling. Here I've mostly duplicated the functions for each,
while keeping the same enums. I think this is a bit awkward
for the features which don't matter for amdgcn.

It's also a bit messy that this isn't a complete set of
subtarget features. This is just the minimum set needed
for the driver code. For example building the list of
subtarget feature names is still in clang.

llvm-svn: 340291

7dd9d58c

[X86][SSE] Add helper function to convert to/between the SSE vector shift opcodes. NFCI. · 5a83a1fd

Simon Pilgrim authored Aug 21, 2018

Also remove some more getOpcode calls from LowerShift when we already have Opc.

llvm-svn: 340290

5a83a1fd

[aarch64][mc] Don't lookup symbols when there is no symbol lookup callback · 6a943fb1

Daniel Sanders authored Aug 21, 2018

Summary: When run under llvm-mc-disassemble-fuzzer, there is no symbol lookup callback so tryAddingSymbolicOperand() must fail gracefully instead of crashing
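
The shape of the fix is a guard before the callback is used. A minimal sketch with hypothetical names (`SymbolLookupFn`, `tryAddSymbol`); the real disassembler callback has a different signature:

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical callback type: maps an address to a symbol name, or nullptr.
using SymbolLookupFn = std::function<const char *(uint64_t)>;

// Returns false (instead of crashing) when no callback is installed or the
// callback has no name for the address.
inline bool tryAddSymbol(const SymbolLookupFn &Lookup, uint64_t Addr,
                         std::string &Out) {
  if (!Lookup)
    return false;
  const char *Name = Lookup(Addr);
  if (!Name)
    return false;
  Out = Name;
  return true;
}
```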

Reviewers: aemerson, javed.absar

Reviewed By: aemerson

Subscribers: lhames, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D51005

llvm-svn: 340287

6a943fb1

[InstSimplify] use isKnownNeverNaN to fold more fcmp ord/uno · f3ae9cc3

Sanjay Patel authored Aug 21, 2018

Remove duplicate tests from InstCombine that were added with
D50582. I left negative tests there to verify that nothing
in InstCombine tries to go overboard. If isKnownNeverNaN is
improved to handle the FP binops or other cases, we should
have coverage under InstSimplify, so we could remove more
duplicate tests from InstCombine at that time.
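
For reference, the two predicates being folded behave like the sketch below (function names are made up for this example). When neither operand can be NaN, `ord` is always true and `uno` is always false, which is what makes the fold possible.

```cpp
#include <cmath>

// fcmp ord: true when neither operand is NaN.
inline bool fcmpOrd(double X, double Y) {
  return !std::isnan(X) && !std::isnan(Y);
}

// fcmp uno: true when at least one operand is NaN.
inline bool fcmpUno(double X, double Y) {
  return std::isnan(X) || std::isnan(Y);
}
```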

llvm-svn: 340279

f3ae9cc3

[LV] Vectorize loops where non-phi instructions used outside loop · b02b0ad8

Anna Thomas authored Aug 21, 2018

Summary:
Follow up change to rL339703, where we now vectorize loops with non-phi
instructions used outside the loop. Note that the cyclic dependency
identification occurs when identifying reduction/induction vars.

We also need to make sure we do not allow users where the PSCEV information
within and outside the loop is different. This was the fix added in rL307837
for PR33706.

Reviewers: Ayal, mkuper, fhahn

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D50778

llvm-svn: 340278

b02b0ad8

[AMDGPU] Allow int types for MUBUF vdata · bb5ee41a

Tim Renouf authored Aug 21, 2018

Summary:
Previously the new llvm.amdgcn.raw/struct.buffer.load/store intrinsics
only allowed float types for the data to be loaded or stored, which
sometimes meant the frontend needed to generate a bitcast. In this, the
new intrinsics copied the old buffer intrinsics.

This commit extends the new intrinsics to allow int types as well.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D50315

Change-Id: I8202af2d036455553681dcbb3d7d32ae273f8f85
llvm-svn: 340270

bb5ee41a

[AMDGPU] New buffer intrinsics · 4f703f5e

Tim Renouf authored Aug 21, 2018

Summary:
This commit adds new intrinsics
  llvm.amdgcn.raw.buffer.load
  llvm.amdgcn.raw.buffer.load.format
  llvm.amdgcn.raw.buffer.load.format.d16
  llvm.amdgcn.struct.buffer.load
  llvm.amdgcn.struct.buffer.load.format
  llvm.amdgcn.struct.buffer.load.format.d16
  llvm.amdgcn.raw.buffer.store
  llvm.amdgcn.raw.buffer.store.format
  llvm.amdgcn.raw.buffer.store.format.d16
  llvm.amdgcn.struct.buffer.store
  llvm.amdgcn.struct.buffer.store.format
  llvm.amdgcn.struct.buffer.store.format.d16
  llvm.amdgcn.raw.buffer.atomic.*
  llvm.amdgcn.struct.buffer.atomic.*

with the following changes from the llvm.amdgcn.buffer.*
intrinsics:

* there are separate raw and struct versions: raw does not have an
  index arg and sets idxen=0 in the instruction, and struct always sets
  idxen=1 in the instruction even if the index is 0, to allow for the
  fact that gfx9 does bounds checking differently depending on whether
  idxen is set;

* there is a combined cachepolicy arg (glc+slc)

* there are now only two offset args: one for the offset that is
  included in bounds checking and swizzling, to be split between the
  instruction's voffset and immoffset fields, and one for the offset
  that is excluded from bounds checking and swizzling, to go into the
  instruction's soffset field.
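
A sketch of how a combined cachepolicy value can be packed and unpacked. The bit positions used here (bit 0 = glc, bit 1 = slc) and the helper names are assumptions of this example; check the intrinsic definitions before relying on them.

```cpp
#include <cstdint>

// Assumed layout: bit 0 = glc, bit 1 = slc.
constexpr uint32_t GLCBit = 1u << 0;
constexpr uint32_t SLCBit = 1u << 1;

constexpr uint32_t packCachePolicy(bool GLC, bool SLC) {
  return (GLC ? GLCBit : 0u) | (SLC ? SLCBit : 0u);
}

constexpr bool extractGLC(uint32_t Policy) { return (Policy & GLCBit) != 0; }
constexpr bool extractSLC(uint32_t Policy) { return (Policy & SLCBit) != 0; }
```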

The AMDISD::BUFFER_* SD nodes always have an index operand, all three
offset operands, combined cachepolicy operand, and an extra idxen
operand.

The obsolescent llvm.amdgcn.buffer.* intrinsics continue to work.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50306

Change-Id: If897ea7dc34fcbf4d5496e98cc99a934f62fc205
llvm-svn: 340269

4f703f5e

[AMDGPU] New tbuffer intrinsics · 35484c9d

Tim Renouf authored Aug 21, 2018

Summary:
This commit adds new intrinsics
  llvm.amdgcn.raw.tbuffer.load
  llvm.amdgcn.struct.tbuffer.load
  llvm.amdgcn.raw.tbuffer.store
  llvm.amdgcn.struct.tbuffer.store

with the following changes from the llvm.amdgcn.tbuffer.* intrinsics:

* there are separate raw and struct versions: raw does not have an index
  arg and sets idxen=0 in the instruction, and struct always sets
  idxen=1 in the instruction even if the index is 0, to allow for the
  fact that gfx9 does bounds checking differently depending on whether
  idxen is set;

* there is a combined format arg (dfmt+nfmt)

* there is a combined cachepolicy arg (glc+slc)

* there are now only two offset args: one for the offset that is
  included in bounds checking and swizzling, to be split between the
  instruction's voffset and immoffset fields, and one for the offset
  that is excluded from bounds checking and swizzling, to go into the
  instruction's soffset field.
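
A sketch of the combined format value. The layout assumed here (dfmt in bits 3..0, nfmt in bits 6..4) and the helper names are assumptions of this example rather than something stated in this commit message.

```cpp
#include <cstdint>

// Assumed layout: bits 3..0 hold dfmt, bits 6..4 hold nfmt.
constexpr uint32_t packFormat(uint32_t DFmt, uint32_t NFmt) {
  return (DFmt & 0xFu) | ((NFmt & 0x7u) << 4);
}

constexpr uint32_t extractDFmt(uint32_t Format) { return Format & 0xFu; }
constexpr uint32_t extractNFmt(uint32_t Format) { return (Format >> 4) & 0x7u; }
```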

The AMDISD::TBUFFER_* SD nodes always have an index operand, all three
offset operands, combined format operand, combined cachepolicy operand,
and an extra idxen operand.

The tbuffer pseudo- and real instructions now also have a combined
format operand.

The obsolescent llvm.amdgcn.tbuffer.* and llvm.SI.tbuffer.store
intrinsics continue to work.

V2: Separate raw and struct intrinsics.
V3: Moved extract_glc and extract_slc defs to a more sensible place.
V4: Rebased on D49995.
V5: Only two separate offset args instead of three.
V6: Pseudo- and real instructions have joint format operand.
V7: Restored optionality of dfmt and nfmt in assembler.
V8: Addressed minor review comments.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D49026

Change-Id: If22ad77e349fac3a5d2f72dda53c010377d470d4
llvm-svn: 340268

35484c9d

Change how finalizeBundle selects debug location for the BUNDLE instruction · d378a396

Bjorn Pettersson authored Aug 21, 2018

Summary:
Previously a BUNDLE instruction inherited the DebugLoc from the
first instruction in the bundle, even if that DebugLoc had no
DILocation. With this commit this is changed into selecting the
first DebugLoc that has a DILocation, by searching among the
bundled instructions.
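
The new selection rule amounts to a first-match search. A toy sketch, where `ToyDebugLoc` is a hypothetical stand-in for DebugLoc and a line number of 0 plays the role of "no DILocation":

```cpp
#include <vector>

// Hypothetical stand-in for DebugLoc: Line == 0 means "no location".
struct ToyDebugLoc {
  unsigned Line;
  explicit ToyDebugLoc(unsigned L = 0) : Line(L) {}
  bool hasLocation() const { return Line != 0; }
};

// Returns the first location in the bundle that is actually set, or an
// empty location if none of the bundled instructions has one.
inline ToyDebugLoc pickBundleDebugLoc(const std::vector<ToyDebugLoc> &Locs) {
  for (const ToyDebugLoc &DL : Locs)
    if (DL.hasLocation())
      return DL;
  return ToyDebugLoc();
}
```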

The idea is to reduce the number of bundles that lack
debug locations.

Reviewers: #debug-info, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: JDevlieghere, mattd, llvm-commits

Differential Revision: https://reviews.llvm.org/D50639

llvm-svn: 340267

d378a396

[DAGCombiner] Reduce load widths of shifted masks · 597811e7

Sam Parker authored Aug 21, 2018

During combining, ReduceLoadWidth is used to combine AND nodes that
mask loads into narrow loads. This patch allows the mask to be a
shifted constant. This results in a narrow load which is then left
shifted to compensate for the new offset.
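
The equivalence behind the transform can be checked at source level. This sketch assumes a little-endian byte layout and uses the mask 0x0000FF00 as an example; the function names are made up.

```cpp
#include <cstdint>

// Assemble a 32-bit value from four bytes in little-endian order.
inline uint32_t loadLE32(const uint8_t *P) {
  return static_cast<uint32_t>(P[0]) | (static_cast<uint32_t>(P[1]) << 8) |
         (static_cast<uint32_t>(P[2]) << 16) |
         (static_cast<uint32_t>(P[3]) << 24);
}

// Before: wide load followed by an AND with a shifted mask.
inline uint32_t wideLoadMasked(const uint8_t *P) {
  return loadLE32(P) & 0x0000FF00u;
}

// After: one-byte load at offset 1, zero-extended and shifted left by 8.
inline uint32_t narrowLoadShifted(const uint8_t *P) {
  return static_cast<uint32_t>(P[1]) << 8;
}
```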

Differential Revision: https://reviews.llvm.org/D50432

llvm-svn: 340261

597811e7

[TargetLowering] Add BuildSDiv support for division by one or negone. · 72b324de

Simon Pilgrim authored Aug 21, 2018

This reduces most of the sdiv stages (the MULHS, shifts etc.) to just zero/identity values and use the numerator scale factor to multiply by +1/-1.

llvm-svn: 340260

72b324de

[MIPS GlobalISel] Select bitwise instructions · 3b953c37

Petar Jovanovic authored Aug 21, 2018

Select bitwise instructions for i32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D50183

llvm-svn: 340258

3b953c37

[LICM] Hoist guards with invariant conditions · 097ef691

Max Kazantsev authored Aug 21, 2018

This patch teaches LICM to hoist guards from the loop if they are guaranteed to execute and
if there are no side effects that could prevent that.

Differential Revision: https://reviews.llvm.org/D50501
Reviewed By: reames

llvm-svn: 340256

097ef691

[RegisterCoalescer] Do not assert when trying to remat dead values · 880f2915

Bjorn Pettersson authored Aug 21, 2018

Summary:
RegisterCoalescer::reMaterializeTrivialDef used to assert that
the input register was live in. But as shown by the new
coalesce-dead-lanes.mir test case that seems to be a valid
scenario. We now return false instead of asserting, simply
declining to remat the dead def.

Normally a COPY of an undef value is eliminated by
eliminateUndefCopy(), but we only do that when the
destination isn't a physical register. So the situation
above should be limited to the case when we copy an undef
value to a physical register.

Reviewers: kparzysz, wmi, tpr

Reviewed By: kparzysz

Subscribers: MatzeB, qcolombet, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D50842

llvm-svn: 340255

880f2915