Commits · a780ffaac29e9d38db75ba9ba7f74617a2e59ba4 · Roger Ferrer / llvm-epi

Mar 23, 2017
[AMDGPU] Emit kernel debug properties as code object metadata · a780ffaa
Konstantin Zhuravlyov authored Mar 22, 2017
```
Differential Revision: https://reviews.llvm.org/D30969

llvm-svn: 298558
```
a780ffaa

Mar 22, 2017

[AMDGPU] Emit kernel code properties as code object metadata · ca0e7f64

Konstantin Zhuravlyov authored Mar 22, 2017

  - These are not required for low level runtime

Differential Revision: https://reviews.llvm.org/D29949

llvm-svn: 298556

ca0e7f64

Clean up some Subtarget uses and casts in the X86 backend, removing unnecessary work or calls. · fd8510cf
Eric Christopher authored Mar 22, 2017
```
llvm-svn: 298555
```
fd8510cf

[AMDGPU] Restructure code object metadata creation · 7498cd61

Konstantin Zhuravlyov authored Mar 22, 2017

  - Rename runtime metadata -> code object metadata
  - Make metadata not flow
  - Switch enums to use ScalarEnumerationTraits
  - Cleanup and move AMDGPUCodeObjectMetadata.h to AMDGPU/MCTargetDesc
  - Introduce in-memory representation for attributes
  - Code object metadata streamer
  - Create metadata for isa and printf during EmitStartOfAsmFile
  - Create metadata for kernel during EmitFunctionBodyStart
  - Finalize and emit metadata to .note during EmitEndOfAsmFile
  - Other minor improvements/bug fixes

Differential Revision: https://reviews.llvm.org/D29948

llvm-svn: 298552

7498cd61

[AMDGPU] Fix bug 31610 · eb685e5f
Konstantin Zhuravlyov authored Mar 22, 2017
```
Differential Revision: https://reviews.llvm.org/D31258

llvm-svn: 298551
```
eb685e5f

[ARM] t2_so_imm_neg had a subtle bug in the conversion, and could trigger UB... · 50a066b3

Artyom Skrobov authored Mar 22, 2017

[ARM] t2_so_imm_neg had a subtle bug in the conversion, and could trigger UB by negating (int)-2147483648. By pure luck, none of the pre-existing tests triggered this; so I'm adding one.
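
The hazard described here is that applying unary minus to an `int` holding -2147483648 overflows, which is undefined behaviour in C++. A minimal sketch of one common way to avoid it is to do the negation in unsigned arithmetic, which wraps modulo 2^32. The helper name `negateImm32` is hypothetical and is not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical helper, not the LLVM code: negate a 32-bit immediate without
// signed overflow. Unsigned subtraction wraps modulo 2^32, so the result is
// well defined for every input, including the bit pattern of INT32_MIN.
constexpr uint32_t negateImm32(uint32_t imm) {
  return static_cast<uint32_t>(0) - imm;
}
```

With this formulation, the bit pattern 0x80000000 maps to itself instead of triggering undefined behaviour.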

Summary: Thanks to Vitaly Buka for helping catch this.

Reviewers: rengolin, jmolloy, efriedma, vitalybuka

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D31242

llvm-svn: 298512

50a066b3

[AMDGPU][MC] Fix for Bug 28204 + LIT tests · 895d377d

Dmitry Preobrazhensky authored Mar 22, 2017

Fixed v_mad_i64_i32/u64_u32 encoding

Reviewers: artem.tamazov

Differential Revision: https://reviews.llvm.org/D30828

llvm-svn: 298502

895d377d

[X86] Remove unnecessary duplicate code (PR30649). NFCI. · b19a507a
Simon Pilgrim authored Mar 22, 2017
```
llvm-svn: 298495
```
b19a507a

[X86] Remove an unused function from release builds. Reported by gcc's unused function warning. · 3eb6ff9d
Craig Topper authored Mar 22, 2017
```
llvm-svn: 298485
```
3eb6ff9d

[SystemZ] Don't drop any operands in expandZExtPseudo() · 808c89f4

Jonas Paulsson authored Mar 22, 2017

Make sure that any operands, e.g. those of an implicit def of a super reg, are
transferred to the new instruction.

Review: Ulrich Weigand
llvm-svn: 298484

808c89f4

Revert "[ARM] Recommit the glueless lowering of addc/adde in Thumb1, including... · e69c137f

Vitaly Buka authored Mar 22, 2017

Revert "[ARM] Recommit the glueless lowering of addc/adde in Thumb1, including the amended (no UB anymore) fix for adding/subtracting -2147483648."

Fails check-llvm with ubsan

This reverts commit r298417.

llvm-svn: 298482

e69c137f

Mar 21, 2017

AMDGPU: Remove hasSideEffects from SI_RETURN_TO_EPILOG · 513cb7a8
Matt Arsenault authored Mar 21, 2017
```
llvm-svn: 298454
```
513cb7a8

AMDGPU: Rename SI_RETURN · 5b20fbb7

Matt Arsenault authored Mar 21, 2017

This is used for a specific type of return to a shader part's
epilog code. Rename it to avoid confusion with a true
call's return.

llvm-svn: 298452

5b20fbb7

Let llvm.objectsize be conservative with null pointers · 56c7e88c

George Burgess IV authored Mar 21, 2017

This adds a parameter to @llvm.objectsize that makes it return
conservative values if it's given null.
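
As a rough illustration of what "conservative" means for a null pointer, the toy model below assumes the usual encoding of an unknown size: 0 when a minimum is requested and all-ones (-1) when a maximum is requested. The function and parameter names are hypothetical, and the model only covers a null pointer in address space 0.

```cpp
#include <cstdint>

// Toy model, not LLVM code: the size reported for a null pointer.
// wantMin       - the caller asked for a lower bound rather than an upper one
// nullIsUnknown - the new parameter: treat null as "size unknown"
constexpr uint64_t objectSizeOfNull(bool wantMin, bool nullIsUnknown) {
  return !nullIsUnknown ? 0                            // null is a 0-byte object
         : wantMin      ? 0                            // conservative lower bound
                        : ~static_cast<uint64_t>(0);   // conservative upper bound
}
```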

This fixes PR23277.

Differential Revision: https://reviews.llvm.org/D28494

llvm-svn: 298430

56c7e88c

[X86][MS-compatibility][llvm] allow MS TYPE/SIZE/LENGTH operators as a part of... · 07a8974c

Coby Tayree authored Mar 21, 2017

[X86][MS-compatibility][llvm] allow MS TYPE/SIZE/LENGTH operators as a part of a compound expression

This patch gives X86AsmParser the ability to handle the aforementioned operators within compound "MS" arithmetic expressions.
Previously they were only supported as a stand-alone operand, e.g.:
"TYPE X"
Now the following is also allowed:
"4 + TYPE X * 128"

Clang side: https://reviews.llvm.org/D31174

Differential Revision: https://reviews.llvm.org/D31173

llvm-svn: 298425

07a8974c

[X86] Remove extra semicolon to placate GCC. NFCI. · 200e5e18
Davide Italiano authored Mar 21, 2017
```
llvm-svn: 298423
```
200e5e18

[ARM] Recommit the glueless lowering of addc/adde in Thumb1, · 40a4f406

Artyom Skrobov authored Mar 21, 2017

including the amended (no UB anymore) fix for adding/subtracting -2147483648.

This reverts r298328 "[ARM] Revert r297443 and r297820."
and partially reverts r297842 "Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648""

llvm-svn: 298417

40a4f406

Recommit r298282 with fixes for memory allocation/deallocation · d033d1fd

Krzysztof Parzyszek authored Mar 21, 2017

[Hexagon] Recognize polynomial-modulo loop idiom again

Regain the ability to recognize loops calculating polynomial modulo
operation. This ability has been lost due to some changes in the
preceding optimizations. Add code to preprocess the IR to a form
that the pattern matching code can recognize.
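
For readers unfamiliar with the operation, the sketch below computes a polynomial remainder over GF(2) with a shift-and-XOR loop. It illustrates the arithmetic only; the commit message does not state the exact loop shape the Hexagon pass matches, and the function names are hypothetical.

```cpp
#include <cstdint>

// Degree of a polynomial over GF(2) stored as a bit mask, where bit i is the
// coefficient of x^i. Returns -1 for the zero polynomial.
constexpr int degree(uint32_t poly) {
  int d = -1;
  while (poly != 0) {
    poly >>= 1;
    ++d;
  }
  return d;
}

// Remainder of dividend / divisor over GF(2). The divisor must be non-zero.
constexpr uint32_t polyMod(uint32_t dividend, uint32_t divisor) {
  const int dd = degree(divisor);
  for (int bit = 31; bit >= dd; --bit) {
    if ((dividend >> bit) & 1u) {
      // Subtraction in GF(2) is XOR: cancel the current leading term.
      dividend ^= divisor << (bit - dd);
    }
  }
  return dividend;
}
```

For example, x^3 + x + 1 (0b1011) modulo x + 1 (0b11) leaves remainder 1, while x^2 + x (0b110) is divisible by x + 1.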

llvm-svn: 298400

d033d1fd

AMDGPU: Buffer descriptor changes for GFX9 · 5c7a61d2

Marek Olsak authored Mar 21, 2017

Reviewers: arsenm

Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr

Differential Revision: https://reviews.llvm.org/D31158

llvm-svn: 298397

5c7a61d2

AMDGPU: Always use VGPR indexing on GFX9 · e22fdb9c

Marek Olsak authored Mar 21, 2017

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr

Differential Revision: https://reviews.llvm.org/D31157

llvm-svn: 298396

e22fdb9c

Rename AttributeSet to AttributeList · b518054b

Reid Kleckner authored Mar 21, 2017

Summary:
This class is a list of AttributeSetNodes corresponding to the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

llvm-svn: 298393

b518054b

AMDGPU: Fix not including v2i16/v2f16 in register class · 5af82a7a
Matt Arsenault authored Mar 21, 2017
```
llvm-svn: 298390
```
5af82a7a

AMDGPU: Fix asserting on 0 dmask for image intrinsics · f8fb605a
Matt Arsenault authored Mar 21, 2017
```
Fold these to undef during lowering so users get eliminated.

llvm-svn: 298387
```
f8fb605a

[ARM] [Assembler] Support negative immediates for A32, T32 and T16 · 2409c640

Sanne Wouda authored Mar 21, 2017

Summary:
To support negative immediates for certain arithmetic instructions, the
instruction is converted to the inverse instruction with a negated (or inverted)
immediate. For example, "ADD r0, r1, #0xFFFFFFFF" cannot be encoded as an ADD
instruction.  However, "SUB r0, r1, #1" is equivalent.

These conversions are different from instruction aliases.  An alias maps
several assembler instructions onto one encoding.  A conversion, however, maps
an *invalid* instruction--e.g. with an immediate that cannot be represented in
the encoding--to a different (but equivalent) instruction.

Several instructions with negative immediates were being converted already, but
this was not systematically tested, nor did it cover all instructions.

This patch implements all possible substitutions for ARM, Thumb1 and
Thumb2 assembler and adds tests.  It also adds a feature flag
(-mattr=+no-neg-immediates) to turn these substitutions off.  This is
helpful for users who want their code to assemble to exactly what they
wrote.
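
The conversion can be sketched as follows. This is a simplified model, not the assembler's code: the encodability rule (immediate fits in 8 bits) is an assumption made for brevity, since real ARM and Thumb encodings have more elaborate rules, and all names are hypothetical. The `allowNegImmediates` parameter plays the role of the feature flag, with `false` corresponding to `+no-neg-immediates`.

```cpp
#include <cstdint>

enum class Op { Add, Sub };

struct Result {
  bool ok;       // true if an encodable form was found
  Op op;         // opcode to emit
  uint32_t imm;  // immediate to emit
};

// Simplifying assumption: only immediates in [0, 255] are encodable.
constexpr bool isEncodable(uint32_t imm) { return imm <= 0xFFu; }

// Keep the instruction if its immediate is encodable; otherwise try the
// inverse opcode with the negated immediate.
constexpr Result legalize(Op op, uint32_t imm, bool allowNegImmediates = true) {
  if (isEncodable(imm))
    return {true, op, imm};
  // Unsigned negation wraps modulo 2^32, so 0xFFFFFFFF becomes 1.
  const uint32_t negated = static_cast<uint32_t>(0) - imm;
  if (allowNegImmediates && isEncodable(negated))
    return {true, op == Op::Add ? Op::Sub : Op::Add, negated};
  return {false, op, imm};
}
```

Under these assumptions, `legalize(Op::Add, 0xFFFFFFFF)` yields a `Sub` with immediate 1, and the same call with `allowNegImmediates` set to `false` reports failure instead of silently substituting.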

Reviewers: t.p.northover, rovka, samparker, javed.absar, peter.smith, rengolin

Reviewed By: javed.absar

Subscribers: aadg, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D30571

llvm-svn: 298380

2409c640

[x86] use PMOVMSK for vector-sized equality comparisons · 79379cae

Sanjay Patel authored Mar 21, 2017

We could do better by splitting any oversized type into whatever vector size the target supports,
but I left that for future work if it ever comes up. The motivating case is memcmp() calls on 16-byte
structs, so I think we can wire that up with a TLI hook that feeds into this.
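
At the source level, the instruction sequence this aims for corresponds roughly to the SSE2 intrinsics below: a byte-wise vector compare followed by PMOVMSKB and a check that all 16 mask bits are set. This is a hand-written sketch for x86 targets with SSE2, not the output of the patch, and `equal16` is a hypothetical name.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2 intrinsics

// Compare two 16-byte blocks for equality with a single vector compare.
inline bool equal16(const void *a, const void *b) {
  assert(a != nullptr && b != nullptr);
  const __m128i va = _mm_loadu_si128(static_cast<const __m128i *>(a));
  const __m128i vb = _mm_loadu_si128(static_cast<const __m128i *>(b));
  // Each byte lane becomes 0xFF where the inputs match and 0x00 otherwise.
  const __m128i eq = _mm_cmpeq_epi8(va, vb);
  // PMOVMSKB gathers the top bit of every byte lane into a 16-bit mask.
  return _mm_movemask_epi8(eq) == 0xFFFF;
}
```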

Differential Revision: https://reviews.llvm.org/D31156

llvm-svn: 298376

79379cae

[AMDGPU] Iterative scheduling infrastructure + minimal register scheduler · fd4c410f
Valery Pykhtin authored Mar 21, 2017
```
Differential revision: https://reviews.llvm.org/D31046

llvm-svn: 298368
```
fd4c410f

[AMDGPU] SDWA peephole optimization pass. · f60ad58d

Sam Kolton authored Mar 21, 2017

Summary:
First iteration of SDWA peephole.

This pass tries to combine several instructions into one SDWA instruction. E.g. it converts:
'''
V_LSHRREV_B32_e32 %vreg0, 16, %vreg1
V_ADD_I32_e32 %vreg2, %vreg0, %vreg3
V_LSHLREV_B32_e32 %vreg4, 16, %vreg2
'''
Into:
'''
V_ADD_I32_sdwa %vreg4, %vreg1, %vreg3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD
'''
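
The equivalence of the two forms can be checked with a small scalar model. The sketch below assumes that `dst_unused:UNUSED_PAD` zero-fills the unselected half of the destination, which is what the equivalence with the final left shift requires. The function names are hypothetical.

```cpp
#include <cstdint>

// The original three-instruction sequence: shift right, add, shift left.
constexpr uint32_t viaShifts(uint32_t vreg1, uint32_t vreg3) {
  return ((vreg1 >> 16) + vreg3) << 16;
}

// Scalar model of the single SDWA add.
constexpr uint32_t viaSdwa(uint32_t vreg1, uint32_t vreg3) {
  const uint32_t src0 = (vreg1 >> 16) & 0xFFFFu;  // src0_sel:WORD_1
  const uint32_t src1 = vreg3;                    // src1_sel:DWORD
  const uint32_t sum = src0 + src1;
  // dst_sel:WORD_1 places the low 16 bits of the result in the high half;
  // dst_unused:UNUSED_PAD is modelled as zero fill of the low half.
  return (sum & 0xFFFFu) << 16;
}
```

Both functions return 0x00070000 for inputs 0x00030000 and 4, and they agree on inputs where the addition carries out of the low 16 bits.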

Pass structure:
1. Iterate over the machine instructions in a basic block and try to apply "SDWA patterns" to each of them. An SDWA pattern matches a machine instruction to either a source or a destination SDWA operand. E.g. ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1''' is matched to the source SDWA operand '''%vreg1 src_sel:WORD_1'''.
2. Iterate over the SDWA operands found and find the instructions that could potentially be converted into SDWA. E.g. for a source SDWA operand, the potential instructions are all instructions in this basic block that use '''%vreg0'''.
3. Iterate over all potential instructions and check if they can be converted into SDWA.
4. Convert instructions to SDWA.

This review contains a basic implementation of the SDWA peephole pass. The pass requires additional testing for both correctness and performance (no performance testing has been done).
There are several ways this pass can be improved:
1. Make this pass work on the whole function, not only a basic block. As far as I can see, this can be done right now without changes to the pass.
2. Introduce more SDWA patterns
3. Introduce mnemonics to limit when SDWA patterns should apply

Reviewers: vpykhtin, alex-t, arsenm, rampitec

Subscribers: wdng, nhaehnle, mgorny

Differential Revision: https://reviews.llvm.org/D30038

llvm-svn: 298365

f60ad58d

[DebugInfo][X86] Teach Optimize LEAs pass to handle debug values · 7937be7d

Andrea Di Biagio authored Mar 21, 2017

This patch fixes an issue in the Optimize LEAs pass where redundant LEAs were
not removed because they were being used by debug values. The debug values are
now ignored when determining whether LEAs are redundant.

For now the debug values for the redundant LEAs are marked as undefined,
and are effectively lost. A follow-up patch is intended to
preserve the debug values where possible.

Patch by Andrew Ng.

Differential Revision: https://reviews.llvm.org/D30835

llvm-svn: 298360

7937be7d

[SystemZ] Don't drop MO flags in foldMemoryOperandImpl() · bd65421f

Jonas Paulsson authored Mar 21, 2017

The def operand of the new LG/LD should have the old def operand's
flags and subreg index.

New test: test/CodeGen/SystemZ/fold-memory-op-impl.ll

Review: Ulrich Weigand
llvm-svn: 298341

bd65421f

Revert "[Hexagon] Recognize polynomial-modulo loop idiom again" · c12716e7
Vitaly Buka authored Mar 21, 2017
```
Fix memory leaks on check-llvm tests detected by Asan.

This reverts commit r298282.

llvm-svn: 298329
```
c12716e7

[ARM] Revert r297443 and r297820. · 76732acc

Eli Friedman authored Mar 21, 2017

The glueless lowering of addc/adde in Thumb1 has known serious
miscompiles (see https://reviews.llvm.org/D31081), and r297820
causes an infinite loop for certain constructs.  It's not
clear when they will be fixed, so let's just take them out
of the tree for now.

(I resolved a small conflict with r297453.)

llvm-svn: 298328

76732acc

Mar 20, 2017

[ARM] Fix PR32130: Handle promotion of zero sized constants. · ba789cbd

Vadzim Dambrouski authored Mar 20, 2017

The special case of zero sized values was previously not handled correctly.
This patch handles this by not promoting if the size is zero.

Patch by Tim Neumann.

Differential Revision: https://reviews.llvm.org/D31116

llvm-svn: 298320

ba789cbd

[Fuchsia] Use %gs for ABI slots under -mcmodel=kernel · e829eecc

Evgeniy Stepanov authored Mar 20, 2017

Make x86_64-fuchsia targets under -mcmodel=kernel use %gs rather
than %fs to access ABI slots for stack-protector and safe-stack.

Patch by Roland McGrath.

Differential Revision: https://reviews.llvm.org/D30870

llvm-svn: 298302

e829eecc

[Hexagon] Recognize polynomial-modulo loop idiom again · 8490251d

Krzysztof Parzyszek authored Mar 20, 2017

Regain the ability to recognize loops calculating polynomial modulo
operation. This ability has been lost due to some changes in the
preceding optimizations. Add code to preprocess the IR to a form
that the pattern matching code can recognize.

llvm-svn: 298282

8490251d

[AMDGPU] Run always inliner early in opt · 2534bc07
Konstantin Zhuravlyov authored Mar 20, 2017
```
Differential Revision: https://reviews.llvm.org/D31141

llvm-svn: 298281
```
2534bc07

[AMDGPU][MC] Fix for Bugs 28201, 28199, 28170 + LIT tests · 1e124e18

Dmitry Preobrazhensky authored Mar 20, 2017

This fix enables sp3 abs modifier with constants

Reviewers: artem.tamazov

Differential Revision: https://reviews.llvm.org/D30825

llvm-svn: 298265

1e124e18

[Outliner] ACTUALLY remove the errs output · 02cbfb29

Jessica Paquette authored Mar 20, 2017

I don't know how to type. This fixes the last commit which would have made all
of the overflows legal, and kept the screaming.

llvm-svn: 298263

02cbfb29

[Outliner] Remove output for offset range check · 5d59a4ee

Jessica Paquette authored Mar 20, 2017

Forgot to remove some output before committing last time. (Instruction fixups
don't actually overflow anywhere in the test suite so far, so I missed it).

To prevent the outliner from screaming "Overflow!" in the event that that
does happen, this commit removes that output.

llvm-svn: 298260

5d59a4ee

[AMDGPU][MC] Fix for Bugs 28200, 28202 + LIT tests · 40af9c35

Dmitry Preobrazhensky authored Mar 20, 2017

Fixed several related issues with VOP3 fp modifiers.

Reviewers: artem.tamazov

Differential Revision: https://reviews.llvm.org/D30821

llvm-svn: 298255

40af9c35

[GlobalISel] Use the correct calling conv for calls · d79253a9

Diana Picus authored Mar 20, 2017

This commit adds a parameter that lets us pass in the calling convention
of the call to CallLowering::lowerCall. This allows us to handle
situations where the calling convention of the callee is different from
that of the caller.

Differential Revision: https://reviews.llvm.org/D31039

llvm-svn: 298254

d79253a9