Commits · 1efc17e1ec2834581555bf69e1f77e190a9498e3 · Lorenzo Albano / LLVM bpEVL

Jun 10, 2016

test commit: remove trailing whitespaces in README.txt · 1efc17e1
Roger Ferrer Ibanez authored Jun 10, 2016
```
llvm-svn: 272380
```
[AVX512] Add shuffle comment printing for masked VPERMPD/VPERMQ. · 200d237e
Craig Topper authored Jun 10, 2016
```
llvm-svn: 272371
```

[AVX512] Fix shuffle comment printing to handle the masked versions of some shuffles. · 89c17614

Craig Topper authored Jun 10, 2016

Previously we were printing the mask operands as the register names.

llvm-svn: 272367

AMDGPU: Fix trailing whitespace · 37fefd68
Matt Arsenault authored Jun 10, 2016
```
llvm-svn: 272364
```
AMDGPU: v_cndmask_b32 does not def vcc · 58ddad5b
Matt Arsenault authored Jun 10, 2016
```
Fixes verifier errors after SIShrinkInstructions.

llvm-svn: 272351
```

AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_*permute · 26a2ab74

Tom Stellard authored Jun 10, 2016

Summary:
This fixes a bug with ds_*permute instructions where if it was passed a
constant address, then the offset operand would get assigned a register
operand instead of an immediate.

Reviewers: scchan, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19994

llvm-svn: 272349

AMDGPU/SI: Use common topological sort algorithm in SIScheduleDAGMI · 1d3940e8

Tom Stellard authored Jun 09, 2016

Reviewers: arsenm, axeldavy

Subscribers: MatzeB, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19823

llvm-svn: 272346

AMDGPU: Fix flat atomics · 7757c59e

Matt Arsenault authored Jun 09, 2016

The flat atomics could already be selected, but only
when using flat instructions for global memory. Add
patterns for flat addresses.

llvm-svn: 272345

AMDGPU: Fix i64 global cmpxchg · 88701817

Matt Arsenault authored Jun 09, 2016

This was using extract_subreg sub0 to extract the low register
of the result instead of sub0_sub1, producing an invalid copy.

There doesn't seem to be a way to use the compound subreg indices
in tablegen since those are generated, so manually select it.

llvm-svn: 272344

Add aliases for mfvrsave/mtvrsave. · 1dbb23e1

Eric Christopher authored Jun 09, 2016

Update a test as we're now going to emit it for easier reading of
generated assembly as well.

llvm-svn: 272339

AMDGPU: Run verifier after insert waits pass · e2bd9a32
Matt Arsenault authored Jun 09, 2016
```
llvm-svn: 272338
```

AMDGPU: Remove incorrect assertion · cfb61e78

Matt Arsenault authored Jun 09, 2016

I'm still not sure under what circumstances the offset here is non-0,
but private memory is not limited to 27 bits.

llvm-svn: 272337

AMDGPU: Properly initialize SIShrinkInstructions · c3a01ec9
Matt Arsenault authored Jun 09, 2016
```
llvm-svn: 272336
```
[X86][AVX512] Added avx512 VPSLLDQ/VPSRLDQ instruction comments · 643734c5
Simon Pilgrim authored Jun 09, 2016
```
llvm-svn: 272319
```

Jun 09, 2016

[X86][AVX512] Dropped avx512 VPSLLDQ/VPSRLDQ intrinsics · f718682e

Simon Pilgrim authored Jun 09, 2016

Auto-upgrade to generic shuffles, like the sse/avx2 implementations, now that we can lower to VPSLLDQ/VPSRLDQ.
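
To illustrate why a byte shift can be written as a generic shuffle, here is a minimal C++ sketch that builds the byte-level shuffle mask equivalent to a left byte shift applied within each 128-bit lane. The helper name `byteShiftLeftMask` and the use of `-1` to mark a zero-filled byte are assumptions made for illustration; this is not LLVM's auto-upgrade code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: for each result byte, returns the index of the source
// byte it copies, or -1 if the byte is zero-filled. The shift is applied
// independently within each 16-byte (128-bit) lane, as VPSLLDQ does.
std::vector<int> byteShiftLeftMask(unsigned NumBytes, unsigned Shift) {
  assert(NumBytes % 16 == 0 && "expected whole 128-bit lanes");
  std::vector<int> Mask(NumBytes, -1);
  for (unsigned I = 0; I < NumBytes; ++I) {
    unsigned Lane = I / 16; // which 128-bit lane this byte belongs to
    unsigned Pos = I % 16;  // byte position inside that lane
    if (Pos >= Shift)
      Mask[I] = static_cast<int>(Lane * 16 + (Pos - Shift));
  }
  return Mask;
}
```

For a 512-bit vector, `byteShiftLeftMask(64, 4)` zero-fills the low four bytes of every lane and never pulls bytes across a lane boundary.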

llvm-svn: 272308

[X86][AVX512] Fixed issue with v16i32 shuffles lowering to VPALIGNR · 47c76e20
Simon Pilgrim authored Jun 09, 2016
```
llvm-svn: 272307
```

[X86][AVX512] Added support for lowering 512-bit vector shuffles to bit/byte shifts · 0ab9d302

Simon Pilgrim authored Jun 09, 2016

512-bit VPSLLDQ/VPSRLDQ can only be used for avx512bw targets, so lowerVectorShuffleAsShift had to be adjusted to include the subtarget.

llvm-svn: 272300

[NVPTX] Add intrinsics for shfl instructions. · ed2c282d

Justin Lebar authored Jun 09, 2016

Summary:
Currently clang emits these instructions via inline (volatile) asm in
the CUDA headers.  Switching to intrinsics will let the optimizer reason
across calls to these intrinsics.

Reviewers: tra

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D21160

llvm-svn: 272298

AMDGPU/SI: Fix 32-bit fdiv lowering · ed0f97fa

Wei Ding authored Jun 09, 2016

We were using the fast fdiv lowering for all division; an IEEE754
implementation of fdiv is added.

http://reviews.llvm.org/D20557

llvm-svn: 272292

[SystemZ] Enable long displacement constraints for inline ASM operands · 79564611

Ulrich Weigand authored Jun 09, 2016

This enables use of the 'S' constraint for inline ASM operands on
SystemZ, which allows for a memory reference with a signed 20-bit
immediate displacement. This patch includes corresponding documentation
and test case updates.

I've changed the 'T' constraint to match the new behavior for 'S', as
'T' also uses a long displacement (though index constraints are still
not implemented). I also changed 'm' to match the behavior for 'S' as
this will allow for a wider range of displacements for 'm', though
correct me if that's not the right decision.
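
As a rough illustration of what a signed 20-bit displacement allows, the following C++ sketch checks whether an offset fits that range. The function names are hypothetical, and the 12-bit unsigned comparison reflects the short-displacement form, which the message above does not discuss.

```cpp
#include <cstdint>

// True if Disp fits in a signed 20-bit field, i.e. in
// [-2^19, 2^19 - 1] = [-524288, 524287].
constexpr bool fitsSigned20(int64_t Disp) {
  return Disp >= -(int64_t(1) << 19) && Disp < (int64_t(1) << 19);
}

// For comparison: an unsigned 12-bit (short) displacement covers [0, 4095].
constexpr bool fitsUnsigned12(int64_t Disp) {
  return Disp >= 0 && Disp < (int64_t(1) << 12);
}

static_assert(fitsSigned20(524287) && !fitsSigned20(524288), "upper bound");
static_assert(fitsSigned20(-524288) && !fitsSigned20(-524289), "lower bound");
```

An offset such as -8 or 100000 fits the long form but not the short one, which is the wider range the message refers to.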

Author: colpell
Differential Revision: http://reviews.llvm.org/D21097

llvm-svn: 272266

[mips][microMIPS] Implement BOVC, BNVC, EXT, INS and JALRC instructions · c962c493
Hrvoje Varga authored Jun 09, 2016
```
Differential Revision: http://reviews.llvm.org/D11798

llvm-svn: 272259
```

[Thumb] A branch is not part of an IT block · a7dbf987

James Molloy authored Jun 09, 2016

ReplaceTailWithBranchTo assumed that if an instruction is predicated, it must be part of an IT block. This is not correct for conditional branches.

No testcase as this was triggered by the reverted patch r272017 - test coverage will occur when that patch is re-reverted and there is no known way to trigger this in the meantime.

llvm-svn: 272258

[AVX512] Remove masked_move/blendm intrinsic from back-end. · f635367e

Igor Breger authored Jun 09, 2016

This is a complementary patch to D21060.

Differential Revision: http://reviews.llvm.org/D21174

llvm-svn: 272257

[mips][microMIPS] Add CodeGen support for SEL.*, SELEQZ, SELNEZ, SELEQZ.*, SELNEZ.* and CMP.condn.fmt instructions · cd242c16

Zlatko Buljan authored Jun 09, 2016

Differential Revision: http://reviews.llvm.org/D20862

llvm-svn: 272256

[AMDGPU] Disassembler: Support for sdwa instructions · c9bdcb75

Sam Kolton authored Jun 09, 2016

Reviewers: vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D21129

llvm-svn: 272255

[AVX512] Fix shuffle decode printing for several instructions with write masks. · 6f7288dc

Craig Topper authored Jun 09, 2016

There are still more bugs here with UNPCK and PALIGN for sure. But these were the easiest ones to fix.

llvm-svn: 272252

[Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated · feb9f424

James Molloy authored Jun 09, 2016

If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated (that is, bitwise inverted). If this is the case, we can negate the immediate and use a BIC instead:

  int i(int a) {
    return a & 0xfffffeec;
  }

Used to produce:
    ldr r1, [CONSTPOOL]
    ands r0, r1
  CONSTPOOL: 0xfffffeec

And now produces:
    movs    r1, #255
    adds    r1, #20  ; Less costly immediate generation
    bics    r0, r1
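
The equivalence behind this rewrite can be checked with a short C++ sketch. BIC computes `a & ~b`, so the operand is the bitwise inverse of the original mask; the function names `andForm` and `bicForm` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Original form: AND with the constant, which the "before" listing loads
// from a constant pool.
constexpr uint32_t andForm(uint32_t A) { return A & 0xfffffeecu; }

// BIC form: BIC computes A & ~B, so B is the bitwise inverse of the mask.
// ~0xfffffeec == 0x113 == 275 == 255 + 20, matching the movs/adds pair.
constexpr uint32_t bicForm(uint32_t A) { return A & ~(255u + 20u); }

static_assert(~0xfffffeecu == 255u + 20u, "inverted immediate is 275");
static_assert(andForm(0xdeadbeefu) == bicForm(0xdeadbeefu), "same result");
```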

llvm-svn: 272251

[X86] Bring consistent naming to the SSE/AVX and AVX512 PALIGNR instructions. · 7a299309

Craig Topper authored Jun 09, 2016

Then add shuffle decode printing for the EVEX forms, which is made easier by having the naming structure more similar to other instructions.

llvm-svn: 272249

[X86] Fix bad comment in assert. NFC · 565a5b54
Craig Topper authored Jun 09, 2016
```
llvm-svn: 272248
```

AArch64: support the `.arch` directive in the IAS · 6c19ffc8

Saleem Abdulrasool authored Jun 09, 2016

Add support to the AArch64 IAS for the `.arch` directive.  This allows the
assembly input to use architectural functionality in part of a file.  This is
used in existing code like BoringSSL.

Resolves PR26016!

llvm-svn: 272241

Jun 08, 2016

Apply most suggestions of clang-tidy's performance-unnecessary-value-param · c321e534
Benjamin Kramer authored Jun 08, 2016
```
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.

llvm-svn: 272190
```
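
The kind of change this check suggests can be sketched as follows. `Tracked`, `lengthByValue` and `lengthByRef` are hypothetical names invented for this example (they are not taken from the commit); the copy counter exists only to make the avoided copy observable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical copy-counting type, used only to make copies observable.
struct Tracked {
  static int Copies;
  std::string Name;
  explicit Tracked(std::string N) : Name(std::move(N)) {}
  Tracked(const Tracked &Other) : Name(Other.Name) { ++Copies; }
};
int Tracked::Copies = 0;

// Before: the parameter is taken by value, so passing an lvalue copies it.
std::size_t lengthByValue(Tracked T) { return T.Name.size(); }

// After: a const reference reads the same data without copying.
std::size_t lengthByRef(const Tracked &T) { return T.Name.size(); }
```

Calling `lengthByValue` with an existing object increments `Tracked::Copies`; calling `lengthByRef` does not, and both return the same length.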

[AArch64][RegisterBankInfo] G_OR are fine on either GPR or FPR. · d1cd30b2

Quentin Colombet authored Jun 08, 2016

Teach AArch64RegisterBankInfo that G_OR can be mapped on either GPR or
FPR for 64-bit or 32-bit values.

Add test cases demonstrating how this information is used to coalesce a
computation on a single register bank.

llvm-svn: 272170

[ARM] MSR instructions implicitly set CPSR · b3378e2f

Oliver Stannard authored Jun 08, 2016

The MSR instructions can write to the CPSR, but we did not model this
fact, so we could emit them in the middle of IT blocks, changing the
condition flags for later instructions in the block.

The tests use two calls to llvm.write_register.i32 because it is valid
to use these instructions at the end of an IT block, which if-conversion
does in some cases. With two calls, the first clobbers the flags, so
a branch has to be used to make the second one conditional.

Differential Revision: http://reviews.llvm.org/D21139

llvm-svn: 272154

[mips] Add a proper file header in MipsFastISel.cpp · a9e5154d
Vasileios Kalintiris authored Jun 08, 2016
```
llvm-svn: 272138
```

[Hexagon] Modify HexagonExpandCondsets to handle subregisters · b16882dd

Krzysztof Parzyszek authored Jun 08, 2016

Also, switch to using functions from LiveIntervalAnalysis to update
live intervals, instead of performing the updates manually.

Re-committing r272045.

llvm-svn: 272135

[ARM] Remove redundant check. NFC · 0781d10a

Diana Picus authored Jun 08, 2016

isSwift is tested earlier and known to be false when we reach this code.

llvm-svn: 272127

Avoid copies of std::strings and APInt/APFloats where we only read from it · 46e38f36

Benjamin Kramer authored Jun 08, 2016

As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.
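
A minimal sketch of the pattern, assuming a hypothetical `Symbol` class whose accessor returns a reference to a member (neither name comes from the commit): the local is bound as a const reference instead of being copy-initialized. This is safe only while the referenced object stays alive, which is the lifetime hazard noted above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical holder type; getName() returns a reference to a member.
class Symbol {
  std::string Name;

public:
  explicit Symbol(std::string N) : Name(std::move(N)) {}
  const std::string &getName() const { return Name; }
};

// Before: 'std::string Name = S.getName();' copies a string that is only read.
// After: bind a const reference. Valid here because S outlives the reference.
bool hasPrefixNoCopy(const Symbol &S, const std::string &Prefix) {
  const std::string &Name = S.getName(); // no copy; refers to S's member
  return Name.compare(0, Prefix.size(), Prefix) == 0;
}
```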

llvm-svn: 272126

[AVX512] Fix cvtusi2sd instruction Opcode, it should be 0x7B instead of 0x2A. · 982e4003
Igor Breger authored Jun 08, 2016
```
llvm-svn: 272122
```
[AArch64][RegisterBankInfo] Use the generic implementation of copyCost. · a4ac7cda
Quentin Colombet authored Jun 08, 2016
```
Long term we may want to give high cost at FPR to/from GPR copies.

llvm-svn: 272086
```

[RegisterBankInfo] Add a size argument for the cost of copy. · cfbdee23

Quentin Colombet authored Jun 08, 2016

The cost of a copy may differ depending on how many bits we have to
copy around. E.g., an 8-bit copy may cost differently than a 32-bit copy.
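
To make the idea concrete, here is a toy C++ cost model in which the cost depends on the number of bits copied. Everything in it is hypothetical: the `Bank` enum, the `toyCopyCost` name and the numbers are invented for illustration and do not reflect the RegisterBankInfo interface or any real target's costs.

```cpp
// Hypothetical register banks, for illustration only.
enum class Bank { GPR, FPR };

// Toy model: one unit per byte copied, doubled when the copy crosses banks.
// The numbers are made up; only the dependence on SizeInBits matters here.
constexpr unsigned toyCopyCost(Bank From, Bank To, unsigned SizeInBits) {
  return ((SizeInBits + 7) / 8) * (From == To ? 1u : 2u);
}

static_assert(toyCopyCost(Bank::GPR, Bank::GPR, 8) == 1, "8-bit copy");
static_assert(toyCopyCost(Bank::GPR, Bank::GPR, 32) == 4, "32-bit copy");
```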

llvm-svn: 272084
