Commits · 6270ab6ce4e38b3d2f5a4a29c49e2f9515a5540e · Lorenzo Albano / LLVM bpEVL

Jul 04, 2018

[Power9]Legalize and emit code for round & convert quad-precision values · 6270ab6c

Lei Huang authored Jul 04, 2018

Legalize and emit code for round & convert float128 to double precision and
single precision.

Differential Revision: https://reviews.llvm.org/D46997

llvm-svn: 336299

6270ab6c

[mips] Warn when crc, ginv, virt flags are used with too old revision · 87b60a0e

Vladimir Stefanovic authored Jul 04, 2018

CRC and GINV ASE require revision 6, Virtualization requires revision 5.
Print a warning when revision is older than required.
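The rule stated in this message can be sketched as a small lookup. This is only an illustration: the `Ase` enum and the `minRevision` and `shouldWarn` helpers are hypothetical names, not the functions touched by the patch.

```cpp
// Hypothetical model of the check described above; not the LLVM implementation.
enum class Ase { Crc, Ginv, Virt };

// Minimum MIPS architecture revision each ASE needs, as stated in the message:
// CRC and GINV need revision 6, Virtualization needs revision 5.
constexpr unsigned minRevision(Ase ase) {
  return ase == Ase::Virt ? 5u : 6u;
}

// True when the selected revision is older than the ASE requires,
// which is the situation in which a warning would be printed.
constexpr bool shouldWarn(Ase ase, unsigned revision) {
  return revision < minRevision(ase);
}
```

Under this model, `shouldWarn(Ase::Crc, 5)` is true while `shouldWarn(Ase::Virt, 5)` is false.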

Differential Revision: https://reviews.llvm.org/D48843

llvm-svn: 336296

87b60a0e

[PowerPC] Replace the Post RA List Scheduler with the Machine Scheduler · cb4f0c5c

Stefan Pintilie authored Jul 04, 2018

We want to run the Machine Scheduler instead of the List Scheduler after RA.
Checked with a performance run on a Power 9 machine with SPEC 2006: some
benchmarks improved and others degraded, but the geomean was slightly improved
with the Machine Scheduler.

Differential Revision: https://reviews.llvm.org/D45265

llvm-svn: 336295

cb4f0c5c

[ARM] [Assembler] Support negative immediates: cover few missing cases · 17c0c4e7

Volodymyr Turanskyy authored Jul 04, 2018

Support for negative immediates was implemented in
https://reviews.llvm.org/rL298380, however, a few instruction options were missing.

This change adds negative immediates support and respective tests
for the following:

ADD
ADDS
ADDS.W
AND.W
ANDS
BIC.W
BICS
BICS.W
SUB
SUBS
SUBS.W
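The arithmetic identities that make such substitutions valid can be sketched as follows. This is an illustration under assumptions, not the assembler code: it assumes ADD and SUB are exchanged by negating the immediate, and AND and BIC by inverting it. The helper names are hypothetical, and whether a given immediate is encodable is not modelled.

```cpp
#include <cstdint>

// 32-bit register arithmetic with wrap-around, one helper per mnemonic.
constexpr std::uint32_t addImm(std::uint32_t rn, std::uint32_t imm) { return rn + imm; }
constexpr std::uint32_t subImm(std::uint32_t rn, std::uint32_t imm) { return rn - imm; }
constexpr std::uint32_t andImm(std::uint32_t rn, std::uint32_t imm) { return rn & imm; }
// BIC is "bit clear": rn AND NOT imm.
constexpr std::uint32_t bicImm(std::uint32_t rn, std::uint32_t imm) { return rn & ~imm; }

// add rd, rn, #-4 computes the same value as sub rd, rn, #4.
static_assert(addImm(100u, static_cast<std::uint32_t>(-4)) == subImm(100u, 4u),
              "ADD with a negative immediate equals SUB with its negation");

// and rd, rn, #0xFFFFFF00 computes the same value as bic rd, rn, #0xFF.
static_assert(andImm(0x12345678u, 0xFFFFFF00u) == bicImm(0x12345678u, 0xFFu),
              "AND with an inverted immediate equals BIC");
```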

Differential Revision: https://reviews.llvm.org/D48649

llvm-svn: 336286

17c0c4e7

[MachineOutliner] Fix typo in getOutliningCandidateInfo function name · eaececf5

Yvan Roux authored Jul 04, 2018

getOutlininingCandidateInfo -> getOutliningCandidateInfo

Differential Revision: https://reviews.llvm.org/D48867

llvm-svn: 336285

eaececf5

[AArch64][SVE] Asm: Support for reversed subtract (SUBR) instruction. · 1e4dc2e9

Sander de Smalen authored Jul 04, 2018

                                                                           
This patch adds both a vector and an immediate form, e.g.

- Vector form:

    subr z0.h, p0/m, z0.h, z1.h

  subtract active elements of z0 from z1, and store the result in z0.

- Immediate form:

    subr z0.h, z0.h, #255

  subtract elements of z0 from #255, and store the result in z0.
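
A per-lane model may make the operand order clearer: in both forms the destination register is the value being subtracted, so the vector form computes z1 - z0 and the immediate form computes #255 - z0. The sketch below is an assumption-based illustration for 16-bit lanes; the function names are hypothetical and the optional shift of the immediate is not modelled.

```cpp
#include <cstdint>

// One 16-bit lane of the vector form: subr zdn.h, pg/m, zdn.h, zm.h
// Active lanes compute zm - zdn; inactive lanes keep zdn (merging predication).
constexpr std::uint16_t subrLane(bool active, std::uint16_t zdn, std::uint16_t zm) {
  return active ? static_cast<std::uint16_t>(zm - zdn) : zdn;
}

// One 16-bit lane of the immediate form: subr zdn.h, zdn.h, #imm
// Every lane computes imm - zdn, wrapping modulo 2^16.
constexpr std::uint16_t subrImmLane(std::uint16_t zdn, std::uint16_t imm) {
  return static_cast<std::uint16_t>(imm - zdn);
}
```

For example, `subrLane(true, 3, 10)` yields 7 and `subrImmLane(5, 255)` yields 250.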

llvm-svn: 336274

1e4dc2e9

[AArch64][SVE] Asm: Support for instructions to set/read FFR. · ab2b0530

Sander de Smalen authored Jul 04, 2018

Includes instructions to read the First-Faulting Register (FFR):
- RDFFR (unpredicated)
    rdffr   p0.b
- RDFFR (predicated)
    rdffr   p0.b, p0/z
- RDFFRS (predicated, sets condition flags)
    rdffrs  p0.b, p0/z

Includes instructions to set/write the FFR:
- SETFFR (no arguments, sets the FFR to all true)
    setffr
- WRFFR  (unpredicated)
    wrffr   p0.b

llvm-svn: 336267

ab2b0530

[AArch64][SVE] Asm: Support for FP conversion instructions. · 80283b2a

Sander de Smalen authored Jul 04, 2018

The variants added are:

- fcvt   (FP convert precision)
- scvtf  (signed int -> FP) 
- ucvtf  (unsigned int -> FP) 
- fcvtzs (FP -> signed int (round to zero))
- fcvtzu (FP -> unsigned int (round to zero))

For example:
  fcvt   z0.h, p0/m, z0.s  (single- to half-precision FP) 
  scvtf  z0.h, p0/m, z0.s  (32-bit int to half-precision FP) 
  ucvtf  z0.h, p0/m, z0.s  (32-bit unsigned int to half-precision FP) 
  fcvtzs z0.s, p0/m, z0.h  (half-precision FP to 32-bit int)
  fcvtzu z0.s, p0/m, z0.h  (half-precision FP to 32-bit unsigned int)

llvm-svn: 336265

80283b2a

[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values (REAPPLIED) · c3e1617b

Simon Pilgrim authored Jul 04, 2018

We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.

Reapplied with a fixed (extra null tests) version of rL336113 after reversion in rL336189 - extra test case added at rL336247.

llvm-svn: 336250

c3e1617b

[AArch64][SVE] Asm: Support for SVE condition code aliases · e31e6d46

Sander de Smalen authored Jul 04, 2018

SVE overloads the AArch64 PSTATE condition flags and introduces
a set of condition code aliases for the assembler. The 
details are described in section 2.2 of the architecture
reference manual supplement for SVE.

In short:

  SVE alias =>  AArch64 name
  --------------------------
  NONE      => EQ
  ANY       => NE
  NLAST     => HS
  LAST      => LO
  FIRST     => MI
  NFRST     => PL
  PMORE     => HI
  PLAST     => LS
  TCONT     => GE
  TSTOP     => LT
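
The mapping above amounts to a name-to-name lookup. A minimal sketch, assuming exact upper-case matching (the real assembler's parsing is not modelled, and `CondAlias`, `kCondAliases`, `sameName` and `resolveCondAlias` are hypothetical names):

```cpp
// SVE alias -> AArch64 condition name, copied from the table above.
struct CondAlias {
  const char *sve;
  const char *aarch64;
};

constexpr CondAlias kCondAliases[] = {
    {"NONE", "EQ"},  {"ANY", "NE"},   {"NLAST", "HS"}, {"LAST", "LO"},
    {"FIRST", "MI"}, {"NFRST", "PL"}, {"PMORE", "HI"}, {"PLAST", "LS"},
    {"TCONT", "GE"}, {"TSTOP", "LT"},
};

// Exact, case-sensitive comparison of two NUL-terminated strings.
constexpr bool sameName(const char *a, const char *b) {
  while (*a != '\0' && *a == *b) {
    ++a;
    ++b;
  }
  return *a == *b;
}

// Returns the AArch64 name for an SVE alias, or the input unchanged
// when it is not one of the aliases in the table.
constexpr const char *resolveCondAlias(const char *name) {
  for (const CondAlias &entry : kCondAliases)
    if (sameName(entry.sve, name))
      return entry.aarch64;
  return name;
}
```

With this sketch, `resolveCondAlias("NLAST")` returns `"HS"`, and a plain AArch64 name such as `"GE"` passes through unchanged.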

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48869

llvm-svn: 336245

e31e6d46

[lanai] Handle atomic load of i8 like regular load. · 0b1e1a4c

Jacques Pienaar authored Jul 03, 2018

Loads and stores smaller than 64 bits are already atomic; this adds support for one special case of that (atomic loads of i8). This needs to be expanded.

llvm-svn: 336236

0b1e1a4c

Jul 03, 2018

[X86][AsmParser] Fix inconsistent declaration parameter name in r336218 · 78ab286a

Fangrui Song authored Jul 03, 2018

llvm-svn: 336232

78ab286a

[NVPTX] Expand v2f16 INSERT_VECTOR_ELT · 2f0dd140

Benjamin Kramer authored Jul 03, 2018

Vectorization can create them.

llvm-svn: 336227

2f0dd140

[X86] Remove repeated 'the' from multiple comments that have been copy and pasted. NFC · e317533d

Craig Topper authored Jul 03, 2018

llvm-svn: 336226

e317533d

[ARM] Fix inconsistent declaration parameter name in r336195 · 68169343

Fangrui Song authored Jul 03, 2018

llvm-svn: 336223

68169343

[AArch64] Make function parameter names in declarations match those of definitions · bc5c7f2e

Fangrui Song authored Jul 03, 2018

llvm-svn: 336222

bc5c7f2e

[X86][AsmParser] Rework the in/out (%dx) hack one more time. · adc51ae4

Craig Topper authored Jul 03, 2018

This patch adds a new token type specifically for (%dx). We will now always create this token when we parse (%dx). After all operands have been parsed, if the mnemonic is in/out we'll morph this token to a regular register token. Otherwise we keep it as the special DX token which won't match any instructions.

This removes the need for passing Mnemonic through the parsing functions. It also seems closer to gas, where using it on the wrong instruction just gets diagnosed as an invalid operand rather than a bad memory address.

llvm-svn: 336218

adc51ae4

[X86][AsmParser] Don't consider %eip as a valid register outside of 32-bit mode. · bc598f0d

Craig Topper authored Jul 03, 2018

This might make the error message added in r335668 unneeded, but I'm not sure yet.

The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit.

llvm-svn: 336217

bc598f0d

[AArch64][SVE] Asm: Support for FP Complex ADD/MLA. · 128fdfa2

Sander de Smalen authored Jul 03, 2018

The variants added in this patch are:

- Predicated Complex floating point ADD with rotate, e.g.

   fcadd   z0.h, p0/m, z0.h, z1.h, #90

- Predicated Complex floating point MLA with rotate, e.g.

   fcmla   z0.h, p0/m, z1.h, z2.h, #180

- Unpredicated Complex floating point MLA with rotate (indexed operand), e.g.

   fcmla   z0.h, z1.h, z2.h[0], #180
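
For the FCADD form, the rotate operand selects whether the second operand is multiplied by +j (#90) or -j (#270) before the addition. The sketch below models one complex element with `float` components for simplicity (the examples above use half-precision lanes); the `Complex` struct and function names are hypothetical, and FCMLA is not modelled.

```cpp
// A complex number as held in one adjacent (real, imaginary) pair of lanes.
struct Complex {
  float re;
  float im;
};

// fcadd ..., #90: add b rotated by 90 degrees, i.e. a + b * j.
constexpr Complex fcadd90(Complex a, Complex b) {
  return Complex{a.re - b.im, a.im + b.re};
}

// fcadd ..., #270: add b rotated by 270 degrees, i.e. a - b * j.
constexpr Complex fcadd270(Complex a, Complex b) {
  return Complex{a.re + b.im, a.im - b.re};
}
```

With a = 1 + 2j and b = 3 + 4j, the #90 form gives -3 + 5j and the #270 form gives 5 - 1j.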

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48824

llvm-svn: 336210

128fdfa2

[AArch64][GlobalISel] Fix fallbacks introduced in r336120 due to unselectable stores. · d912ffab

Amara Emerson authored Jul 03, 2018

r336120 resulted in falling back to SelectionDAG more often due to the G_STORE
MMOs not matching the vreg size. This fixes that by explicitly any-extending the
value.

llvm-svn: 336209

d912ffab

[AArch64][SVE] Asm: Support for FMUL (indexed) · 8cd1f533

Sander de Smalen authored Jul 03, 2018

Unpredicated FP-multiply of SVE vector with a vector-element given by
vector[index], for example:

  fmul z0.s, z1.s, z2.s[0]

which performs an unpredicated FP-multiply of all 32-bit elements in
'z1' with the first element from 'z2'.

This patch adds restricted register classes for SVE vectors:
  ZPR_3b (only z0..z7 are allowed)  - for indexed vector of 16/32-bit elements.
  ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements.

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48823

llvm-svn: 336205

8cd1f533

[AArch64][SVE] Asm: Support for predicated unary operations. · cbd22494

Sander de Smalen authored Jul 03, 2018

The patch includes support for the following instructions:

       ABS z0.h, p0/m, z0.h
       NEG z0.h, p0/m, z0.h

  (S|U)XTB z0.h, p0/m, z0.h
  (S|U)XTB z0.s, p0/m, z0.s
  (S|U)XTB z0.d, p0/m, z0.d

  (S|U)XTH z0.s, p0/m, z0.s
  (S|U)XTH z0.d, p0/m, z0.d

  (S|U)XTW z0.d, p0/m, z0.d

llvm-svn: 336204

cbd22494

[ARM][NFC] Refactor sequential access for DSP · ffc16816

Sam Parker authored Jul 03, 2018

    
With a view to supporting parallel operations that have their results
stored to memory, refactor the consecutive access helper out so it
could support store instructions.

Differential Revision: https://reviews.llvm.org/D48872

llvm-svn: 336195

ffc16816

[AArch64] Armv8.4-A: system registers · 173b7f0e

Sjoerd Meijer authored Jul 03, 2018

This adds the following system registers:
- RAS registers,
- MPAM registers,
- Activity monitor registers,
- Trace Extension registers,
- Timing insensitivity of data processing instructions,
- Enhanced Support for Nested Virtualization.

Differential Revision: https://reviews.llvm.org/D48871

llvm-svn: 336193

173b7f0e

Revert "[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values" · fd171f2f

Benjamin Kramer authored Jul 03, 2018

This reverts commit r336113. It causes crashes.

llvm-svn: 336189

fd171f2f

[AArch64][SVE] Asm: Support for saturating ADD/SUB instructions. · 7fc85432

Sander de Smalen authored Jul 03, 2018

The variants added are:
    signed Saturating ADD/SUB (immediate)  e.g. sqadd z0.h, z0.h, #42
  unsigned Saturating ADD/SUB (immediate)  e.g. uqadd z0.h, z0.h, #42
    signed Saturating ADD/SUB (vectors)    e.g. sqadd z0.h, z0.h, z1.h
  unsigned Saturating ADD/SUB (vectors)    e.g. uqadd z0.h, z0.h, z1.h
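
What "saturating" means for one 16-bit lane can be sketched as follows: the result is clamped to the lane's range instead of wrapping. This is an illustrative model only; the helper names are hypothetical and do not come from the patch.

```cpp
#include <cstdint>

// Clamp a wide intermediate result to the unsigned / signed 16-bit range.
constexpr std::uint16_t clampU16(std::int32_t v) {
  return static_cast<std::uint16_t>(v < 0 ? 0 : (v > 0xFFFF ? 0xFFFF : v));
}
constexpr std::int16_t clampS16(std::int32_t v) {
  return static_cast<std::int16_t>(v < -32768 ? -32768 : (v > 32767 ? 32767 : v));
}

// Unsigned saturating add/sub of one lane (uqadd / uqsub).
constexpr std::uint16_t uqadd16(std::uint16_t a, std::uint16_t b) {
  return clampU16(std::int32_t{a} + b);
}
constexpr std::uint16_t uqsub16(std::uint16_t a, std::uint16_t b) {
  return clampU16(std::int32_t{a} - b);
}

// Signed saturating add/sub of one lane (sqadd / sqsub).
constexpr std::int16_t sqadd16(std::int16_t a, std::int16_t b) {
  return clampS16(std::int32_t{a} + b);
}
constexpr std::int16_t sqsub16(std::int16_t a, std::int16_t b) {
  return clampS16(std::int32_t{a} - b);
}
```

For instance, `uqadd16(65530, 42)` clamps to 65535 and `sqadd16(32760, 42)` clamps to 32767, where plain addition would wrap.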

llvm-svn: 336186

7fc85432

[MIPS GlobalISel] Lower arguments using stack · 226e6117

Petar Jovanovic authored Jul 03, 2018

Lower more than 4 arguments using stack. This patch targets MIPS32.
It supports only functions with arguments of type i32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D47934

llvm-svn: 336185

226e6117

[AArch64][SVE] Asm: Support for vector element FP compare. · 8fcc3f5f

Sander de Smalen authored Jul 03, 2018

Contains the following variants:

- Compare with (elements from) other vector
  instructions: fcmeq, fcmgt, fcmge, fcmne, fcmuo.
  aliases: fcmle, fcmlt.

  e.g. fcmle   p0.h, p0/z, z0.h, z1.h => fcmge p0.h, p0/z, z1.h, z0.h

- Compare absolute values with (absolute values from) other vector.
  instructions: facge, facgt.
  aliases: facle, faclt.

  e.g. facle   p0.h, p0/z, z0.h, z1.h => facge   p0.h, p0/z, z1.h, z0.h

- Compare vector elements with #0.0
  instructions: fcmeq, fcmgt, fcmge, fcmle, fcmlt, fcmne.

  e.g. fcmle   p0.h, p0/z, z0.h, #0.0

llvm-svn: 336182

8fcc3f5f

Jul 02, 2018

[WebAssembly] Support for atomic stores · 402b4908

Heejin Ahn authored Jul 02, 2018

Summary: Add support for atomic store instructions.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D48839

llvm-svn: 336145

402b4908

[ARM] Fix PR37382: Don't optimize mul.with.overflow on thumbv6m. · fd10286e

Vadzim Dambrouski authored Jul 02, 2018

Reviewers: efriedma, rogfer01, javed.absar

Reviewed By: efriedma, rogfer01

Subscribers: kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D48846

llvm-svn: 336144

fd10286e

[WebAssembly] Fix fast-isel optimization of branch conditions. · b01d8762

Dan Gohman authored Jul 02, 2018

LLVM doesn't guarantee anything about the high bits of a register holding
an i1 value at the IR level, so don't translate LLVM IR i1 values directly
into WebAssembly conditional branch operands. WebAssembly's conditional
branches do demand all 32 bits be valid.
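
The hazard can be modelled in a few lines. WebAssembly's `br_if` takes the branch whenever its i32 operand is non-zero, so stale high bits in a register holding an i1 can turn a false condition into a taken branch. The sketch below is an assumption-based illustration with hypothetical helper names; the instruction sequence the patch actually emits may differ.

```cpp
#include <cstdint>

// WebAssembly's br_if branches when its i32 operand is non-zero.
constexpr bool brIfTaken(std::uint32_t operand) { return operand != 0; }

// Keep only bit 0, the single bit that an i1 value defines.
constexpr std::uint32_t maskI1(std::uint32_t reg) { return reg & 1u; }

// A register whose low bit is 0 (i1 false) but whose high bits are garbage:
// used directly, the branch is wrongly taken; masked first, it is not.
static_assert(brIfTaken(0xFFFFFFFEu), "unmasked garbage takes the branch");
static_assert(!brIfTaken(maskI1(0xFFFFFFFEu)), "masked value does not");
```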

Fixes PR38019.

llvm-svn: 336138

b01d8762

[X86] Add phony registers for high halves of regs with low halves · fd974949

Krzysztof Parzyszek authored Jul 02, 2018

Add registers still missing after r328016 (D43353):
- for bits 15-8  of SI, DI, BP, SP (*H), and R8-R15 (*BH),
- for bits 31-16 of R8-R15 (*WH).

Thanks to Craig Topper for pointing it out.

llvm-svn: 336134

fd974949

[X86] Don't use aligned load/store instructions for fp128 if the load/store isn't aligned. · 56440b97

Craig Topper authored Jul 02, 2018

Similarly, don't fold fp128 loads into SSE instructions if the load isn't aligned, unless we're targeting an AMD CPU that doesn't check alignment on arithmetic instructions.

Should fix PR38001

llvm-svn: 336121

56440b97

[AArch64][GlobalISel] Any-extend vararg parameters to stack slot size on Darwin. · 846f2436

Amara Emerson authored Jul 02, 2018

We currently don't any-extend vararg parameters before storing them to the stack
locations on Darwin. However, SelectionDAG does this, and so there is user code
in the wild which inadvertently relies on this extension. This can manifest
in cases where the value stored is (int)0, but the actual parameter is interpreted
by va_arg as a pointer, and so not extending to 64 bits causes the callee to
load additional undefined bits.

llvm-svn: 336120

846f2436

[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values · 2bc8e079

Simon Pilgrim authored Jul 02, 2018

We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.

llvm-svn: 336113

2bc8e079

[X86] Use addAliasForDirective to support the .word directive (reland) · c4890878

Alex Bradbury authored Jul 02, 2018

The X86 asm parser currently has custom parsing logic for .word. Rather than
use this custom logic, we can just use addAliasForDirective to enable the
reuse of AsmParser::parseDirectiveValue.

See also similar changes to Sparc (rL333078), AArch64 (rL333077), and Hexagon
(rL332607) backends.

Differential Revision: https://reviews.llvm.org/D47004

This is a fixed reland of rL336100. This should have been caught in 
pre-commit testing so apologies for the noise.

llvm-svn: 336104

c4890878

Revert r336100 · c000e4dc

Alex Bradbury authored Jul 02, 2018

This was a bad change. .word == 2byte on x86.

llvm-svn: 336103

c000e4dc

[X86] Use addAliasForDirective to support the .word directive · 42485ec9

Alex Bradbury authored Jul 02, 2018

The X86 asm parser currently has custom parsing logic for .word. Rather than 
use this custom logic, we can just use addAliasForDirective to enable the 
reuse of AsmParser::parseDirectiveValue.

See also similar changes to Sparc (rL333078), AArch64 (rL333077), and Hexagon 
(rL332607) backends.

Differential Revision: https://reviews.llvm.org/D47004

llvm-svn: 336100

42485ec9

[AArch64][SVE] Asm: Support for (SQ)INCP/DECP (scalar, vector) · 8d4c01a7

Sander de Smalen authored Jul 02, 2018

Increments/decrements the destination by the number of active elements
in the predicate.

The inc/dec variants added are:
- incp   x0, p0.h     (scalar)
- incp   z0.h, p0     (vector)

The unsigned saturating inc/dec variants added are:
- uqincp x0, p0.h     (scalar)
- uqincp w0, p0.h     (scalar, 32bit)
- uqincp z0.h, p0     (vector)

The signed saturating inc/dec variants added are:
- sqincp x0, p0.h     (scalar)
- sqincp x0, p0.h, w0 (scalar, 32bit)
- sqincp z0.h, p0     (vector)
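
The scalar forms can be modelled as "add the number of active predicate elements", with the saturating variants clamping instead of wrapping. The sketch below is illustrative only: it models the predicate as one bit per element (an assumption, not the register layout), and the function names are hypothetical.

```cpp
#include <cstdint>

// Number of active elements, with the predicate modelled as one bit per element.
constexpr unsigned activeCount(std::uint32_t pred) {
  unsigned n = 0;
  for (; pred != 0; pred &= pred - 1) // clear the lowest set bit each round
    ++n;
  return n;
}

// incp x0, p0.h : 64-bit scalar, wrapping.
constexpr std::uint64_t incp(std::uint64_t x, std::uint32_t pred) {
  return x + activeCount(pred);
}

// uqincp w0, p0.h : 32-bit scalar, unsigned saturating.
constexpr std::uint32_t uqincp32(std::uint32_t w, std::uint32_t pred) {
  return (0xFFFFFFFFu - w < activeCount(pred)) ? 0xFFFFFFFFu
                                               : w + activeCount(pred);
}
```

With three active elements (`pred == 0xB`), `incp(10, 0xB)` gives 13, while `uqincp32(0xFFFFFFFE, 0x7)` saturates at 0xFFFFFFFF.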

llvm-svn: 336091

8d4c01a7

[AArch64][SVE] Asm: Support for (saturating) vector INC/DEC instructions. · c5041017

Sander de Smalen authored Jul 02, 2018

Increment/decrement vector by multiple of predicate constraint
element count.

The variants added by this patch are:
 - INCH, INCW, INCD

and (saturating):
 - SQINCH, SQINCW, SQINCD
 - UQINCH, UQINCW, UQINCD
 - SQDECH, SQDECW, SQDECD
 - UQDECH, UQDECW, UQDECD

For example:
  incw z0.s, all, mul #4
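
As a rough model of that example: with the `all` pattern, the element count is the number of 32-bit elements that fit in one vector, so each lane grows by that count times the multiplier. The vector length is implementation defined, so it is a parameter here; the helper names are hypothetical and other predicate patterns are not modelled.

```cpp
#include <cstdint>

// Elements of a given width in one vector of vlBits bits (the ALL pattern).
constexpr unsigned elementCount(unsigned vlBits, unsigned elemBits) {
  return vlBits / elemBits;
}

// incw z.s, all, mul #mul : each 32-bit lane grows by (32-bit element count) * mul.
constexpr std::uint32_t incwLane(std::uint32_t lane, unsigned vlBits, unsigned mul) {
  return lane + elementCount(vlBits, 32) * mul;
}
```

On a 128-bit implementation, `incwLane(100, 128, 4)` gives 116 (4 elements times 4); on a 512-bit implementation the same instruction gives 164.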

llvm-svn: 336090

c5041017