Commits · 3385caaafd2ca0a85fadc589c0f7b63c9815c911 · Lorenzo Albano / LLVM bpEVL

Jun 18, 2018

[VPlan] Add VPInstruction to VPRecipe transformation. · 3385caaa

Florian Hahn authored Jun 18, 2018

This patch introduces a VPInstructionToVPRecipe transformation, which
allows us to generate code for a VPInstruction based VPlan re-using the
existing infrastructure.

Reviewers: dcaballe, hsaito, mssimpso, hfinkel, rengolin, mkuper, javed.absar, sguggill

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D46827

llvm-svn: 334969

3385caaa

[ORC] Add an initial implementation of a replacement CompileOnDemandLayer. · 68c9b8d6

Lang Hames authored Jun 18, 2018

CompileOnDemandLayer2 is a replacement for CompileOnDemandLayer built on the ORC
Core APIs. Functions in added modules are extracted and compiled lazily.
CompileOnDemandLayer2 supports multithreaded JIT'd code, and compilation on
multiple threads.

llvm-svn: 334967

68c9b8d6

[ORC] Keep weak flag on VSO symbol tables during materialization, but treat · 2e96114c

Lang Hames authored Jun 18, 2018

materializing weak symbols as strong.

This removes some elaborate flag tweaking and plays nicer with RuntimeDyld,
which relies on weak/common flags to determine whether it should emit a given
weak definition. (Switching to strong up-front makes it appear as if there is
already an overriding definition, which would require an extra back-channel to
override).

llvm-svn: 334966

2e96114c

Shrink interval after moving copy in removePartialRedundancy · 54601732
Krzysztof Parzyszek authored Jun 18, 2018
```
llvm-svn: 334963
```
54601732

[llvm-mca] Use an ordered map to collect hardware statistics. NFC. · a88281d8

Andrea Di Biagio authored Jun 18, 2018

Histogram entries are now ordered by key. This should improve their
readability when statistics are printed.
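
As a rough illustration of the effect described above (not the actual llvm-mca code), the sketch below uses `std::map`, which iterates in ascending key order. The `Histogram` alias, the `printHistogram` helper, and the meaning of the keys are assumptions made for this example only.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical histogram: key = some per-cycle count, value = number of cycles.
using Histogram = std::map<unsigned, unsigned>;

// Prints one "key, value" pair per line. std::map iterates in ascending key
// order, so the output is sorted regardless of insertion order.
std::string printHistogram(const Histogram &H) {
  std::ostringstream OS;
  for (const auto &Entry : H)
    OS << Entry.first << ", " << Entry.second << '\n';
  return OS.str();
}

// Usage sketch: entries inserted out of order still print sorted by key.
void histogramExample() {
  Histogram H;
  H[4] += 1;
  H[0] += 3;
  H[2] += 2;
  assert(printHistogram(H) == "0, 3\n2, 2\n4, 1\n");
}
```

With an unordered container the same loop would print entries in an unspecified order, which is harder to read in a report.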

llvm-svn: 334961

a88281d8

Fix typoed cast to avoid assertion in MCFragment::dump. · b35f9e14
Nirav Dave authored Jun 18, 2018
```
llvm-svn: 334959
```
b35f9e14

[SLPVectorizer] Tidyup isShuffle helper · 5b962b2f

Simon Pilgrim authored Jun 18, 2018

Ensure we keep track of the input vectors in all cases instead of just for SK_Select.

Ideally we'd reuse the shuffle mask pattern matching in TargetTransformInfo::getInstructionThroughput here to easily add support for all TargetTransformInfo::ShuffleKind without mass code duplication. I've added a TODO for now, but D48236 should help us here.

Differential Revision: https://reviews.llvm.org/D48023

llvm-svn: 334958

5b962b2f

[TableGen] Make TiedAsmOperandTable in the AsmMatcher 'static' since it's at file scope. · 88c142b4
Craig Topper authored Jun 18, 2018
```
llvm-svn: 334957
```
88c142b4
[TableGen] Remove unused member variable. · b41a1376
Craig Topper authored Jun 18, 2018
```
I think this became unused after r324196.

llvm-svn: 334956
```
b41a1376

[VPlanRecipeBase] Add eraseFromParent(). · 63cbcf98

Florian Hahn authored Jun 18, 2018

Reviewers: dcaballe, hsaito, mkuper, hfinkel

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D48081

llvm-svn: 334951

63cbcf98

[AArch64][SVE] Asm: Support for saturating INC/DEC (64bit scalar) instructions. · 13684d84

Sander de Smalen authored Jun 18, 2018

Summary:
The variants added by this patch are:
- SQINC  (signed increment)
- UQINC  (unsigned increment)
- SQDEC  (signed decrement)
- UQDEC  (unsigned decrement)

For example:
  uqincw  x0, all, mul #4
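
To make "saturating" concrete, here is a minimal sketch of the arithmetic, assuming the usual reading of the example above: the scalar is incremented by the number of 32-bit elements in the vector times the multiplier, clamping instead of wrapping. The function names and the explicit `VectorBits` parameter are hypothetical. On hardware the vector length is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Unsigned saturating add: clamps to the maximum value instead of wrapping.
uint64_t uqadd64(uint64_t X, uint64_t Delta) {
  const uint64_t Max = std::numeric_limits<uint64_t>::max();
  return (Delta > Max - X) ? Max : X + Delta;
}

// Signed saturating add: clamps to INT64_MIN / INT64_MAX on overflow.
int64_t sqadd64(int64_t X, int64_t Delta) {
  const int64_t Max = std::numeric_limits<int64_t>::max();
  const int64_t Min = std::numeric_limits<int64_t>::min();
  if (Delta > 0 && X > Max - Delta)
    return Max;
  if (Delta < 0 && X < Min - Delta)
    return Min;
  return X + Delta;
}

// Hypothetical model of "uqincw x0, all, mul #4" for a given vector length:
// the increment is (number of 32-bit elements) * Mul.
uint64_t uqincw(uint64_t X, unsigned VectorBits, unsigned Mul) {
  const uint64_t Elems = VectorBits / 32;
  return uqadd64(X, Elems * Mul);
}
```

For a 128-bit vector, `uqincw(0, 128, 4)` adds 4 elements times 4, giving 16. A value already near the top of the range stays at the maximum rather than wrapping around.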

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Differential Revision: https://reviews.llvm.org/D47715

llvm-svn: 334948

13684d84

[X86][BtVer2] Flag AVX2+ scheduler classes as unsupported · 9173c97c
Simon Pilgrim authored Jun 18, 2018
```
Jaguar only supports up to AVX1

Differential Revision: https://reviews.llvm.org/D48274

llvm-svn: 334947
```
9173c97c

[llvm-mca] Add tests for XOP and AVX512 instructions that implicitly clear the... · 487da729

Andrea Di Biagio authored Jun 18, 2018

[llvm-mca] Add tests for XOP and AVX512 instructions that implicitly clear the upper portion of a super-register.

When the destination register of an XOP instruction is an XMM register, bits
[255:128] of the corresponding YMM register are cleared.

When the destination register of an EVEX-encoded instruction is an XMM/YMM
register, the upper bits of the corresponding ZMM are cleared.
On processors that feature AVX512, a write to an XMM register always clears the
upper portion of the corresponding ZMM register if the instruction is VEX or
EVEX encoded.
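
The implicit update can be pictured with a small register model. This is only a sketch of the rule stated above, not how llvm-mca represents registers. The `Zmm` alias, the `Encoding` enum, and `writeXmm` are hypothetical names, and the legacy SSE case (upper bits preserved) is added here for contrast and is not part of the commit message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// A 512-bit ZMM register as eight 64-bit lanes. Lanes 0-1 alias the XMM
// register and lanes 0-3 alias the YMM register.
using Zmm = std::array<uint64_t, 8>;

enum class Encoding { LegacySSE, VEX, EVEX };

// Writes a 128-bit value to the XMM portion of Reg. VEX and EVEX encoded
// writes also zero everything above bit 127; legacy SSE leaves it untouched.
void writeXmm(Zmm &Reg, uint64_t Lo, uint64_t Hi, Encoding Enc) {
  Reg[0] = Lo;
  Reg[1] = Hi;
  if (Enc != Encoding::LegacySSE)
    for (std::size_t I = 2; I < Reg.size(); ++I)
      Reg[I] = 0;
}
```

An analysis that ignores the zeroing would treat a later read of the full ZMM register as depending on whatever previously wrote its upper lanes.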

These new tests show some interesting cases which aren't correctly analyzed by
llvm-mca. The lack of knowledge related to the implicit update on the
super-registers is addressed by D48225.

llvm-svn: 334945

487da729

[VPlan] Fix sanitizer problem with insertBefore. · 3bcff366
Florian Hahn authored Jun 18, 2018
```
llvm-svn: 334943
```
3bcff366

[TableGen][AsmMatcherEmitter] Allow tied operands of different classes in aliases. · 118099a6

Sander de Smalen authored Jun 18, 2018

Allow a tied operand of a different operand class in InstAliases,
so that the operand can be printed (and added to the MC instruction)
as the appropriate register. For example, 'GPR64as32', which would
be printed/parsed as a 32bit register and should match a tied 64bit
register operand, where the former is a sub-register of the latter.

This patch also generalizes the constraint checking to an overrideable
method in MCTargetAsmParser, so that target asmparsers can specify
whether a given operand satisfies the tied register constraint.

Reviewers: olista01, rengolin, fhahn, SjoerdMeijer, samparker, dsanders, craig.topper

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D47714

llvm-svn: 334942

118099a6

Update copyright year to 2018. · 7555c589
Paul Robinson authored Jun 18, 2018
```
llvm-svn: 334936
```
7555c589
[SLPVectorizer] Avoid calling const VL.size() repeatedly in for-loop. NFCI. · 99a58320
Simon Pilgrim authored Jun 18, 2018
```
llvm-svn: 334934
```
99a58320

[VPlanRecipeBase] Add insertBefore helper. · 7591e4e9

Florian Hahn authored Jun 18, 2018

Reviewers: dcaballe, mkuper, hfinkel, hsaito, mssimpso

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D48080

llvm-svn: 334933

7591e4e9

[llvm-exegesis] Optionally ignore instructions without a sched class. · e752fd65

Clement Courbet authored Jun 18, 2018

Summary: See PR37602.

Reviewers: RKSimon

Subscribers: llvm-commits, tschuett

Differential Revision: https://reviews.llvm.org/D48267

llvm-svn: 334932

e752fd65

[AArch64][SVE] Asm: Support for vector element compares. · d521c435

Sander de Smalen authored Jun 18, 2018

This patch adds instructions for comparing elements from two vectors, e.g.
  cmpgt p0.s, p0/z, z0.s, z1.s

and also adds support for comparing to a 64-bit wide element vector, e.g.
  cmpgt p0.s, p0/z, z0.s, z1.d

The patch also contains aliases for certain comparisons, e.g.:
  cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s
  cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s
  cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s
  cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s
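
The four aliases above all follow one pattern: replace the mnemonic with its mirrored comparison and swap the two vector operands. A minimal sketch of that rewrite follows. The `Compare` struct and `canonicalize` function are hypothetical and are not how the TableGen aliases are implemented.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical parsed form of "mnemonic Pd, Pg, Zn, Zm".
struct Compare {
  std::string Mnemonic, Pd, Pg, Zn, Zm;
};

// Rewrites an alias to its canonical instruction by mirroring the condition
// and swapping the vector operands; other mnemonics pass through unchanged.
Compare canonicalize(Compare C) {
  static const std::map<std::string, std::string> Aliases = {
      {"cmple", "cmpge"},
      {"cmplo", "cmphi"},
      {"cmpls", "cmphs"},
      {"cmplt", "cmpgt"}};
  auto It = Aliases.find(C.Mnemonic);
  if (It != Aliases.end()) {
    C.Mnemonic = It->second;
    std::swap(C.Zn, C.Zm);
  }
  return C;
}
```

For instance, `canonicalize({"cmple", "p0.s", "p0/z", "z0.s", "z1.s"})` yields `cmpge` with `z1.s` and `z0.s` as its vector operands, matching the first alias line.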

llvm-svn: 334931

d521c435

[X86] Fix NOOP sched overrides on BDW/HSW/SKL. · 0d9da88d

Clement Courbet authored Jun 18, 2018

Summary: Noop certainly does not use resources.

Reviewers: RKSimon, craig.topper, andreadb

Subscribers: gbedwell, llvm-commits, gchatelet

Differential Revision: https://reviews.llvm.org/D48028

llvm-svn: 334927

0d9da88d

[X86] Create X86InstrFMA3Group objects fully in a static table instead of on the heap. NFCI · f0ab7bd1

Craig Topper authored Jun 18, 2018

Previously we heap allocated the X86InstrFMA3Group objects which were created by passing them small register/memory opcode arrays that existed as individual static tables.

Rather than a bunch of small static arrays we now have one large static table of X86InstrFMA3Group objects. Rather than storing a pointer to the opcode arrays in the X86InstrFMA3Group object, we now store a register and memory array as part of the object. If a group doesn't have memory or register opcodes, the array entries will be 0.

This greatly simplifies the destruction of the X86InstrFMA3Info object. We no longer need to delete the X86InstrFMA3Group objects as we destruct the DenseMap. And we don't need to keep track of which ones we already deleted.

This reduces the llc binary size on my local machine by ~50k. I can only assume that's really due to the fact that we had something like 512 small static arrays that we passed to the init functions either one at a time or in pairs. So there were between 256 and 512 distinct calls to the init functions in the initOnceImpl method.
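
The layout change can be sketched as follows. This is a simplified stand-in, not the real X86InstrFMA3Group: the struct name, the array length of three, the opcode numbers, and `findGroup` are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical opcode group that owns its opcode arrays instead of pointing
// at separately defined ones. An entry of 0 means "no such form".
struct OpcodeGroup {
  uint16_t RegOpcodes[3];
  uint16_t MemOpcodes[3];
};

// One static table: no heap allocation, no init calls, nothing to delete.
static const OpcodeGroup Groups[] = {
    {{100, 101, 102}, {200, 201, 202}}, // group with register and memory forms
    {{110, 111, 112}, {0, 0, 0}},       // group without memory forms
};

// Linear lookup of the group containing Opcode; returns nullptr if absent.
const OpcodeGroup *findGroup(uint16_t Opcode) {
  if (Opcode == 0)
    return nullptr; // 0 is the "missing" marker, never a real opcode here
  for (const OpcodeGroup &G : Groups)
    for (std::size_t I = 0; I != 3; ++I)
      if (G.RegOpcodes[I] == Opcode || G.MemOpcodes[I] == Opcode)
        return &G;
  return nullptr;
}
```

Because the table is constant data, it needs no destructor logic, which is the simplification the message describes.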

llvm-svn: 334925

f0ab7bd1

[X86] Add '.s' aliases to the assembler for the various redundant move... · 16fdde5e

Craig Topper authored Jun 18, 2018

[X86] Add '.s' aliases to the assembler for the various redundant move encodings to match gas and our EVEX instructions.

We already have these aliases for EVEX encoded instructions, but not for the GPR, MMX, SSE, and VEX versions.

Also remove the vpextrw.s EVEX alias. That's not something gas implements.

llvm-svn: 334922

16fdde5e

[X86] Move the 'vmovq.s' and similar assembly strings for EVEX vector moves... · 916d0cf6

Craig Topper authored Jun 18, 2018

[X86] Move the 'vmovq.s' and similar assembly strings for EVEX vector moves with reversed operands to InstAliases.

The .s assembly strings allow the reversed forms to be targeted from assembly which matches gas behavior. But when printing the instructions we should print them without the .s to match other tooling like objdump. By using InstAliases we can use the normal string in the instruction and just hide it from the assembly parser.

Ideally we'd add the .s versions to the legacy SSE and VEX versions as well for full compatibility with gas. Not sure how we got to a state where only EVEX was supported.

llvm-svn: 334920

916d0cf6

[TableGen] Prevent double flattening of InstAlias asm strings in the asm matcher emitter. · 2be74395

Craig Topper authored Jun 18, 2018

Unlike CodeGenInstruction, CodeGenInstAlias was flattening asm strings in its constructor. For instructions it was the user's responsibility to flatten the string.

AsmMatcherEmitter didn't know this and treated them the same. This caused double flattening of InstAliases. This is mostly harmless unless the desired assembly string contains curly braces. The second flattening wouldn't know to ignore these and would remove the curly braces. And for variant 1 it would remove the contents of them as well.

To mitigate this, this patch removes the flattening from the CodeGenInstAlias constructor and modifies AsmWriterEmitter to account for the flattening not having been done.
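
The failure mode is easy to reproduce with a toy flattener. The sketch below is not the TableGen implementation. It assumes a simplified syntax in which `{a|b}` selects between variants and nested braces inside a variant are meant to survive as literal text, and `flattenVariants` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: keeps alternative number Variant of every top-level "{a|b}"
// group. Nested braces are copied through as part of the alternative. A
// missing alternative selects the empty string.
std::string flattenVariants(const std::string &S, unsigned Variant) {
  std::string Out;
  std::size_t I = 0;
  while (I < S.size()) {
    if (S[I] != '{') {
      Out += S[I++];
      continue;
    }
    // Collect the alternatives, splitting on '|' only at nesting depth 1.
    std::vector<std::string> Alts(1);
    unsigned Depth = 1;
    std::size_t J = I + 1;
    for (; J < S.size(); ++J) {
      const char C = S[J];
      if (C == '{')
        ++Depth;
      else if (C == '}' && --Depth == 0)
        break; // closing brace of the top-level group
      if (C == '|' && Depth == 1)
        Alts.emplace_back();
      else
        Alts.back() += C;
    }
    if (Variant < Alts.size())
      Out += Alts[Variant];
    I = J + 1; // continue after the closing brace
  }
  return Out;
}

// Usage sketch: one pass keeps the literal braces, a second pass eats them.
void flattenExample() {
  const std::string Asm = "insn {att {x}|intel {y}}";
  assert(flattenVariants(Asm, 0) == "insn att {x}");
  assert(flattenVariants(flattenVariants(Asm, 0), 0) == "insn att x");
  assert(flattenVariants(flattenVariants(Asm, 1), 1) == "insn intel ");
}
```

In this model, a single pass leaves the inner braces intact. The second pass treats them as a one-alternative group, so variant 0 loses the braces and variant 1 loses the contents as well, mirroring the behavior described above.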

llvm-svn: 334919

2be74395

[ORC] Remove redundant condition · 0705ee8d
Lang Hames authored Jun 17, 2018
```
llvm-svn: 334918
```
0705ee8d

Jun 17, 2018

[ORC] Only notify queries that they are resolved/ready when the query state · a5247cc5
Lang Hames authored Jun 17, 2018
```
changes.

This guards against redundant notifications.

llvm-svn: 334916
```
a5247cc5

[X86] Add all the FMA instructions directly to the load folding table instead... · 9fe45d84

Craig Topper authored Jun 17, 2018

[X86] Add all the FMA instructions directly to the load folding table instead of proxying through X86InstrFMA3Info.

This increases the size of the static tables, but is closer to what we would get if we used the autogenerated table directly. This reduces the remaining large deltas between what's in the manual table and what's in the autogenerated table.

llvm-svn: 334915

9fe45d84

[ORC] Suppress an unused variable warning for a debug-mode only use. · cd018a44
Lang Hames authored Jun 17, 2018
```
llvm-svn: 334911
```
cd018a44
[ORC] Erase empty dependence sets when adding new symbol dependencies. · df5776b1
Lang Hames authored Jun 17, 2018
```
llvm-svn: 334910
```
df5776b1

[ORC] In MaterializationResponsibility, only maintain the Materializing flag on · 11adecfb

Lang Hames authored Jun 17, 2018

symbols in debug mode.

The MaterializationResponsibility class hijacks the Materializing flag to track
symbols that have not yet been resolved in order to guard against redundant
resolution. Since this is an API contract check and only enforced in debug mode
there is no reason to maintain the flag state in release mode.
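
A minimal sketch of the pattern, assuming the standard `NDEBUG` convention: the bookkeeping used only for contract checks is compiled out of release builds. The `Responsibility` class and its members are hypothetical, and a `std::set` stands in for the per-symbol Materializing flag that the real code uses.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a class that must not resolve a symbol twice.
class Responsibility {
public:
  void addSymbol(const std::string &Name) {
#ifndef NDEBUG
    Unresolved.insert(Name); // tracked only so the contract can be checked
#else
    (void)Name;
#endif
  }

  void resolve(const std::string &Name) {
#ifndef NDEBUG
    const bool WasPending = Unresolved.erase(Name) == 1;
    assert(WasPending && "symbol resolved twice or not owned");
    (void)WasPending;
#else
    (void)Name;
#endif
    ++ResolvedCount;
  }

  unsigned resolvedCount() const { return ResolvedCount; }

private:
#ifndef NDEBUG
  std::set<std::string> Unresolved; // exists in debug builds only
#endif
  unsigned ResolvedCount = 0;
};
```

In a release build the set and its updates disappear entirely, so the check costs nothing where it is not enforced.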

llvm-svn: 334909

11adecfb

[X86] Pass the parent SDNode to X86DAGToDAGISel::selectScalarSSELoad to... · b0e986f8

Craig Topper authored Jun 17, 2018

[X86] Pass the parent SDNode to X86DAGToDAGISel::selectScalarSSELoad to simplify the hasSingleUseFromRoot handling.

Some of the calls to hasSingleUseFromRoot were passing the load itself. If the load's chain result has a user, this would count against that. By getting the true parent of the match and ensuring any intermediate nodes between the match and the load have a single use, we can avoid this case. isLegalToFold will take care of checking users of the load's data output.

This fixed at least fma-scalar-memfold.ll to succeed without the peephole pass.

llvm-svn: 334908

b0e986f8

[llvm-mca][X86] Add some avx512f/avx512vl resource test placeholders · e930f569

Simon Pilgrim authored Jun 17, 2018

There are a lot of instructions to add under these ISAs (and the other AVX512 variants), but this should demonstrate how to test for the EVEX instructions with different maskings.

llvm-svn: 334907

e930f569

[AArch64][SVE] Asm: Support for bitwise operations on predicate vectors. · 279b7e74

Sander de Smalen authored Jun 17, 2018

This patch adds support for instructions performing bitwise operations
on predicate vectors, including AND, BIC, EOR, NAND, NOR, ORN, ORR, and
their status flag setting variants ANDS, BICS, EORS, NANDS, ORNS, ORRS.

This patch also adds several aliases:

  orr  p0.b, p1/z, p1.b, p1.b  => mov  p0.b, p1.b
  orrs p0.b, p1/z, p1.b, p1.b  => movs p0.b, p1.b

  and  p0.b, p1/z, p2.b, p2.b  => mov  p0.b, p1/z, p2.b
  ands p0.b, p1/z, p2.b, p2.b  => movs p0.b, p1/z, p2.b

  eor  p0.b, p1/z, p2.b, p1.b  => not  p0.b, p1/z, p2.b
  eors p0.b, p1/z, p2.b, p1.b  => nots p0.b, p1/z, p2.b

llvm-svn: 334906

279b7e74

[AArch64][SVE] Asm: Support for SEL (vector/predicate) instructions. · 2c25b4cd

Sander de Smalen authored Jun 17, 2018

Support for SVE's predicated select instructions to select elements
from either vector, both in a data-vector and a predicate-vector
variant.

llvm-svn: 334905

2c25b4cd

[NVPTX] Ignore target-cpu and -features for inlining · c7410ed4

Jonas Hahnfeld authored Jun 17, 2018

We don't want to prevent inlining because of target-cpu and -features
attributes that were added to newer versions of LLVM/Clang: There are
no incompatible functions in PTX; ptxas will throw errors in such cases.

Differential Revision: https://reviews.llvm.org/D47691

llvm-svn: 334904

c7410ed4

[WebAssembly] Simple comment fix. NFC. · 97869467
Heejin Ahn authored Jun 17, 2018
```
llvm-svn: 334899
```
97869467
[X86] More additions to the load folding tables based on the autogenerated tables. · 29f22d7b
Craig Topper authored Jun 16, 2018
```
Including more additions for NotMemoryFoldable to remove some entries from the autogenerated table.

llvm-svn: 334898
```
29f22d7b

[X86] Hide POP16/32/64rmr and PUSH16/32/64rmr instructions from the assembly parser. · c4356328

Craig Topper authored Jun 16, 2018

These all have a short form encoding that the assembler already prefers, though that preference seems to be based only on order in the .td file. Hiding the long form saves space in the table and prevents us from breaking the implicit order-based priority.

llvm-svn: 334897

c4356328

[X86] Fix an inconsistency between AVX512 and AVX/SSE version on a couple instructions. · 74412c7d

Craig Topper authored Jun 16, 2018

VMOVPQIto64Zmr is not a 64-bit mode only instruction. But I don't know how to test this because VMOVPQIto64mr should always have priority over it in 32-bit mode since its only advantage is XMM16-XMM31 which aren't usable in 32-bit mode.

VMOVPQIto64Zrr is a 64-bit mode only instruction, but we don't need to explicitly mark it as such because it uses a GR64 register which won't parse in 32-bit mode.

llvm-svn: 334896

74412c7d