Commits · f4ec67822fb6dd96bb9959d84178d220833325c2 · Lorenzo Albano / LLVM bpEVL

May 24, 2018

[PowerPC] Remove the match pattern in the definition of LXSDX/STXSDX · f4ec6782

Lei Huang authored May 24, 2018

The match pattern in the definition of LXSDX is xoaddr, so the pseudo
instruction XFLOADf64 never gets selected. XFLOADf64 expands to LXSDX/LFDX
post-RA based on register pressure. To avoid ambiguity, we need to remove the
select pattern for LXSDX, as was done for LXSD. STXSDX has the same issue.

Patch by Qing Shan Zhang (steven.zhang).

Differential Revision: https://reviews.llvm.org/D47178

llvm-svn: 333150

f4ec6782

[RISCV] Lower the tail pseudoinstruction · ddcb9566

Mandeep Singh Grang authored May 23, 2018

This patch lowers the tail pseudoinstruction. This has been modeled after ARM's
tail call opt.

llvm-svn: 333137

ddcb9566

May 23, 2018

[RISCV] Set CostPerUse for registers · eadce027

Sameer AbuAsal authored May 23, 2018

Summary:
 Set CostPerUse higher for registers that are not used in the compressed
 instruction set. This will influence the greedy register allocator to reduce
 the use of registers that can't be encoded in 16-bit instructions. This
 affects register allocation even when the compressed instruction set isn't
 targeted, but we see no major negative codegen impact.

Reviewers: asb

Reviewed By: asb

Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang

Differential Revision: https://reviews.llvm.org/D47039

llvm-svn: 333132

eadce027

[Power9]Legalize and emit code for W vector extract and convert to QP · 8b0da65b

Lei Huang authored May 23, 2018

Implement patterns to extract [Un]signed Word vector element and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46536

llvm-svn: 333115

8b0da65b

[Power9]Legalize and emit code for DW vector extract and convert to QP · 8990168a

Lei Huang authored May 23, 2018

Implement patterns to extract [Un]signed DWord vector element and convert to
quad-precision.

Differential Revision: https://reviews.llvm.org/D46333

llvm-svn: 333112

8990168a

[CodeGen][AArch64] Use RegUnits to track register aliases. (NFC) · 3f663631

Chad Rosier authored May 23, 2018

Use RegUnits to track register aliases in AArch64RedundantCopyElimination.

Differential Revision: https://reviews.llvm.org/D47269

llvm-svn: 333107

3f663631

Silence warnings introduced with r333093 · 7d37bb42

Petar Jovanovic authored May 23, 2018

r333093 introduced several warnings (-Wlogical-not-parentheses,
-Wbool-compare). Add parentheses in MipsSEInstrInfo::isCopyInstr() to
silence them.

llvm-svn: 333097

7d37bb42

[X86][MIPS][ARM] New machine instruction property 'isMoveReg' · c051000b

Petar Jovanovic authored May 23, 2018

This property is needed in order to track the movement of values between
registers. It is used in TII to implement a method that returns true if
a simple copy-like instruction is recognized, along with its source and
destination machine operands.

Patch by Nikola Prica.

Differential Revision: https://reviews.llvm.org/D45204

llvm-svn: 333093

c051000b

Remove DEBUG macro. · 03d0b91f

Nicola Zaghen authored May 23, 2018

Now that the LLVM_DEBUG() macro has landed in the various sub-projects,
the DEBUG macro can be removed.
Also change the new uses of DEBUG to LLVM_DEBUG.

Differential Revision: https://reviews.llvm.org/D46952

llvm-svn: 333091

03d0b91f

[RISCV] Add symbol diff relocation support for RISC-V · 257d5b56

Alex Bradbury authored May 23, 2018

For RISC-V it is desirable to have relaxation happen in the linker once
addresses are known, and as such the size between two instructions/byte
sequences in a section could change.

For most assembler expressions this is fine, as the absolute address results
in the expression being converted to a fixup and, finally, to relocations.
However, for expressions such as .quad .L2-.L1, the assembler folds the
difference down to a constant once fragments are laid out, under the
assumption that it can no longer change. With linker relaxation the
difference can still change at link time, so the constant is incorrect. One
place where this commonly appears is in debug information, where the size of
a function is expressed in a form similar to the above.

This patch extends the assembler to allow an AsmBackend to declare that it
does not want the assembler to fold down this expression, and instead generate
a pair of relocations that allow the linker to carry out the calculation. In
this case, the expression is not folded, but when it comes to emitting a
fixup, the generic FK_Data_* fixups are converted into a pair, one for the
addition half, one for the subtraction, and this is passed to the relocation
generating methods as usual. I have named these FK_Data_Add_* and
FK_Data_Sub_* to indicate which half these are for.

For RISC-V, which supports this via e.g. the R_RISCV_ADD64, R_RISCV_SUB64 pair
of relocations, these are also set to always emit relocations relative to
local symbols rather than section offsets. This is to deal with the fact that
if relocations were calculated on e.g. .text+8 and .text+4, the result 12
would be stored rather than 4 as both addends are added in the linker.

Differential Revision: https://reviews.llvm.org/D45181
Patch by Simon Cook.

llvm-svn: 333079

257d5b56
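
A minimal sketch of the relocation-pair arithmetic described in 257d5b56, assuming the usual definitions of the pair (R_RISCV_ADD64 adds symbol plus addend to the stored value, R_RISCV_SUB64 subtracts them). The addresses and helper names below are hypothetical and only illustrate why a difference folded at assembly time goes stale after relaxation, while the relocation pair recomputes it at link time.

```python
MASK64 = (1 << 64) - 1


def apply_add64(value, symbol_addr, addend=0):
    """Model of R_RISCV_ADD64: value + symbol + addend, wrapped to 64 bits."""
    return (value + symbol_addr + addend) & MASK64


def apply_sub64(value, symbol_addr, addend=0):
    """Model of R_RISCV_SUB64: value - symbol - addend, wrapped to 64 bits."""
    return (value - symbol_addr - addend) & MASK64


def resolve_symbol_diff(symbols, minuend, subtrahend):
    """Resolve `.quad minuend - subtrahend` starting from a zeroed field."""
    field = 0
    field = apply_add64(field, symbols[minuend])
    field = apply_sub64(field, symbols[subtrahend])
    return field


# Hypothetical addresses: relaxation removes 4 bytes between .L1 and .L2.
before_relaxation = {".L1": 0x1000, ".L2": 0x1010}
after_relaxation = {".L1": 0x1000, ".L2": 0x100C}

folded_by_assembler = resolve_symbol_diff(before_relaxation, ".L2", ".L1")
computed_by_linker = resolve_symbol_diff(after_relaxation, ".L2", ".L1")
```

Here the value folded by the assembler no longer matches the layout after relaxation, whereas the value computed from the relocation pair does.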

[Sparc] Use addAliasForDirective to support data directives · 3fa69dd0

Alex Bradbury authored May 23, 2018

The Sparc asm parser currently has custom parsing logic for .half, .word, 
.nword and .xword. Rather than use this custom logic, we can just use 
addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue.

Differential Revision: https://reviews.llvm.org/D47003

llvm-svn: 333078

3fa69dd0

[AArch64] Use addAliasForDirective to support data directives · 0a59f189

Alex Bradbury authored May 23, 2018

The AArch64 asm parser currently has custom parsing logic for .hword, .word, 
and .xword. Rather than use this custom logic, we can just use 
addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue.

Differential Revision: https://reviews.llvm.org/D47000

llvm-svn: 333077

0a59f189

[RISCV] Correctly report sizes for builtin fixups · 1c010d0f

Alex Bradbury authored May 23, 2018

This is a different approach to fixing the problem described in D46746.
RISCVAsmBackend currently depends on the getSize helper function returning the
number of bytes a fixup may change (note: some other backends have a similar
helper named getFixupNumKindBytes). As noted in that review, this doesn't
return the correct size for FK_Data_1, FK_Data_2, or FK_Data_8, meaning that
too few bytes will be written in the case of FK_Data_8, and there's the
potential of writing outside the Data array for the smaller fixups.

D46746 extends getSize to recognise some of the builtin fixup types. Rather
than having a function that needs to be kept up to date as new builtin or
target-specific fixups are added, we can calculate an appropriate bound on the
number of bytes that might be touched using Info.TargetSize and
Info.TargetOffset.

Differential Revision: https://reviews.llvm.org/D46965

llvm-svn: 333076

1c010d0f
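
The message of 1c010d0f does not quote the computation itself. One plausible bound, sketched here under the assumption that TargetSize and TargetOffset are both measured in bits, is to round their sum up to whole bytes. The function name and the 20-bit example field are illustrative, not taken from the patch.

```python
def fixup_num_bytes(target_size_bits, target_offset_bits):
    """Upper bound on bytes touched by a fixup whose field spans
    bits [offset, offset + size) of the encoded value."""
    if target_size_bits < 0 or target_offset_bits < 0:
        raise ValueError("size and offset must be non-negative")
    # Round the highest touched bit position up to a whole number of bytes.
    return (target_size_bits + target_offset_bits + 7) // 8


# Data fixups start at bit 0 and cover 8, 16, or 64 bits.
data_1 = fixup_num_bytes(8, 0)
data_2 = fixup_num_bytes(16, 0)
data_8 = fixup_num_bytes(64, 0)

# Hypothetical instruction fixup: a 20-bit field starting at bit 12.
upper_field = fixup_num_bytes(20, 12)
```

Because the bound is derived from the fixup's own description, it needs no per-kind table that has to be updated when new fixups are added.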

[Sparc] Add mnemonic aliases for flush, stb, stba, sth, and stha · 6356571e

Daniel Cederman authored May 23, 2018

Reviewers: jyknight

Reviewed By: jyknight

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D47140

llvm-svn: 333068

6356571e

[GlobalISel][ARM] Adding HPR and QPR regclasses to FPRB regbank · e79d656c

Roman Tereshin authored May 23, 2018

Also bringing ARMRegisterBankInfo::getRegBankFromRegClass
implementation up to speed with the *.td-definition.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D43982

llvm-svn: 333056

e79d656c

May 22, 2018

AMDGPU: Fix v2f16 fneg/fabs pattern · 606bc315

Matt Arsenault authored May 22, 2018

The integer operation conversion for some reason only happens
if the source is a bitcast from an integer, which happens to
always be the situation when the result is loaded. Add
an additional pattern for when the source operation is really
an FP operation.

llvm-svn: 333019

606bc315

Delete unused variable from r333015. · 785acce5

Eli Friedman authored May 22, 2018

(The assertion suppressed the unused variable warning on
Release+Asserts builds, so I didn't notice.)

llvm-svn: 333018

785acce5

AMDGPU: Move AMDGPUTargetLowering::isFPExtFoldable() into SITargetLowering · b12f4dec

Tom Stellard authored May 22, 2018

Summary: This is always false for R600.

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47180

llvm-svn: 333016

b12f4dec

[MachineOutliner] Add "thunk" outlining for AArch64. · 042dc9e0

Eli Friedman authored May 22, 2018

When we're outlining a sequence that ends in a call, we can save up to
three instructions in the outlined function by turning the call into
a tail-call. I refer to this as thunk outlining because the resulting
outlined function looks like a thunk; suggestions welcome for a better
name.

In addition to making the outlined function shorter, thunk outlining
allows outlining calls which would otherwise be illegal to outline:
we don't need to save/restore LR, so we don't need to prove anything
about the stack access patterns of the callee.

To make this work effectively, I also added
MachineOutlinerInstrType::LegalTerminator to the generic MachineOutliner
code; this allows treating an arbitrary instruction as a terminator in
the suffix tree.

Differential Revision: https://reviews.llvm.org/D47173

llvm-svn: 333015

042dc9e0

[Hexagon] Add patterns for accumulating HVX compares · 840b02bc

Krzysztof Parzyszek authored May 22, 2018

llvm-svn: 333009

840b02bc

[mips] Merge MipsLongBranch and MipsHazardSchedule passes · a5f75518

Aleksandar Beserminji authored May 22, 2018

The MipsLongBranch and MipsHazardSchedule passes are merged into one pass
because they conflict with each other. When MipsHazardSchedule inserts nops,
it potentially breaks some jumps, so they have to be expanded to long
branches. When a branch is expanded to a long branch, it potentially
creates a hazard, which should be fixed by adding nops.
The new pass is called MipsBranchExpansion. It combines these two passes
and runs them alternately until one of them reports that no changes were made.

Differential Revision: https://reviews.llvm.org/D46641

llvm-svn: 332977

a5f75518
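
The alternation described in a5f75518 can be sketched as a small driver loop. This is a toy model, not the pass itself: the state dictionary, the two step functions, and the "budget" that limits how often a nop pushes a branch out of range are all assumptions made so the example terminates and can be run.

```python
def expand_long_branches(state):
    """Toy step: expand out-of-range branches; each expansion adds a hazard."""
    if state["out_of_range"] == 0:
        return False
    state["hazards"] += state["out_of_range"]
    state["out_of_range"] = 0
    return True


def fix_hazards(state):
    """Toy step: insert nops for hazards; a nop may push a branch out of range."""
    if state["hazards"] == 0:
        return False
    state["nops"] += state["hazards"]
    state["hazards"] = 0
    if state["budget"] > 0:
        state["budget"] -= 1
        state["out_of_range"] += 1
    return True


def run_branch_expansion(state):
    """Run both steps once, then alternate until one reports no change."""
    long_changed = expand_long_branches(state)
    hazard_changed = fix_hazards(state)
    changed = long_changed or hazard_changed
    while hazard_changed:
        long_changed = expand_long_branches(state)
        if not long_changed:
            break
        hazard_changed = fix_hazards(state)
    return changed


state = {"out_of_range": 1, "hazards": 0, "nops": 0, "budget": 1}
first_run_changed = run_branch_expansion(state)
second_run_changed = run_branch_expansion(state)
```

Running the two steps inside one driver is what lets each see the other's output; as separate passes, whichever ran last could leave work for the one that had already finished.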

[mips] Correct the predicates of the cache and pref instructions · 437153bb

Simon Dardis authored May 22, 2018

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46949

llvm-svn: 332970

437153bb

[TTI] Add uniform/non-uniform constant Pow2 detection to... · 4162d777

Simon Pilgrim authored May 22, 2018

[TTI] Add uniform/non-uniform constant Pow2 detection to TargetTransformInfo::getInstructionThroughput

This enables us to detect more fast path sdiv cases under cost analysis.

This patch also enables us to handle non-uniform-constant pow2 cases for X86 SDIV costs.

Found while working on D46276

Future patches can then extend the vectorizers to more fully support non-uniform pow2 cases.

Differential Revision: https://reviews.llvm.org/D46637

llvm-svn: 332969

4162d777
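
As a rough illustration of the detection named in 4162d777, the sketch below classifies a vector operand as a uniform or non-uniform constant and checks whether every element is a power of two. The string labels and the use of None for a non-constant element are assumptions for this example and do not mirror LLVM's actual enums or data structures.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... and False for zero or negative values."""
    return n > 0 and (n & (n - 1)) == 0


def classify_operand(elements):
    """Return (kind, all_pow2) for a list of vector elements.

    An element is an int when it is a known constant and None otherwise.
    """
    if not elements or any(e is None for e in elements):
        return ("any", False)
    kind = "uniform-constant" if len(set(elements)) == 1 else "non-uniform-constant"
    return (kind, all(is_power_of_two(e) for e in elements))


splat = classify_operand([8, 8, 8, 8])
mixed_pow2 = classify_operand([2, 4, 8, 16])
mixed_other = classify_operand([3, 4, 8, 16])
unknown = classify_operand([4, None, 4, 4])
```

A cost model could then treat both the splat and the mixed power-of-two divisor as candidates for a cheaper shift-based sdiv sequence.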

AMDGPU: Make v2i16/v2f16 legal on VI · 1349a04e

Matt Arsenault authored May 22, 2018

This usually results in better code. Fixes using
inline asm with short2, and also fixes having a different
ABI for function parameters between VI and gfx9.

Partially cleans up the mess used for lowering of the d16
operations. Making v4f16 legal will help clean this up more,
but this requires additional work.

llvm-svn: 332953

1349a04e

[WebAssembly] Fix fast-isel lowering illegal argument and return types. · b8184827

Dan Gohman authored May 22, 2018

For both argument and return types, promote illegal types like i24 to i32,
and if a type can't be easily promoted, clear out the signature before
bailing out, to avoid leaving it in a partially complete state.

Fixes PR37546.

llvm-svn: 332947

b8184827

AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers · 44b30b45

Tom Stellard authored May 22, 2018

Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instruction
and register definitions, which are huge, so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

llvm-svn: 332930

44b30b45

[X86] Remove 128/256-bit cvtdq2ps, cvtudq2ps, cvtqq2pd, cvtuqq2pd intrinsics. · 358b0949

Craig Topper authored May 21, 2018

These can all be implemented with sitofp/uitofp instructions.

llvm-svn: 332916

358b0949

May 21, 2018

[DAGCombine][X86][AArch64] Masked merge unfolding: vector edition. · 7772de25

Roman Lebedev authored May 21, 2018

Summary:
This **appears** to be the last missing piece for the masked merge pattern handling in the backend.

This is PR37104 (https://bugs.llvm.org/show_bug.cgi?id=37104).

PR6773 (https://bugs.llvm.org/show_bug.cgi?id=6773) will introduce an IR canonicalization that is likely bad for the end assembly.
Previously, `andps`+`andnps` / `bsl` would be generated (see `@out`).
Now, they would no longer be generated (see `@in`), and we need to make sure that they are generated.

Differential Revision: https://reviews.llvm.org/D46528

llvm-svn: 332904

7772de25
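
For readers unfamiliar with the masked merge pattern mentioned in 7772de25, the sketch below checks that the and/or form and the xor form select the same bits. It assumes 32-bit lanes, and the reading that the xor form is the one the IR canonicalization produces is an interpretation of the message, not something it states. Function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def masked_merge_and_or(x, y, m):
    """Take bits of x where m is set and bits of y elsewhere: (x & m) | (y & ~m)."""
    return ((x & m) | (y & ~m)) & MASK32


def masked_merge_xor(x, y, m):
    """Same selection written with two xors: ((x ^ y) & m) ^ y."""
    return (((x ^ y) & m) ^ y) & MASK32


def merge_lanes(form, xs, ys, ms):
    """Apply one of the forms lane by lane, as a vector operation would."""
    return [form(x, y, m) for x, y, m in zip(xs, ys, ms)]


xs = [0xAAAA5555, 0x00000000, 0xFFFFFFFF, 0x12345678]
ys = [0x12345678, 0xFFFFFFFF, 0x00000000, 0x9ABCDEF0]
ms = [0xFFFF0000, 0x0F0F0F0F, 0x00000000, 0xFFFFFFFF]

and_or_result = merge_lanes(masked_merge_and_or, xs, ys, ms)
xor_result = merge_lanes(masked_merge_xor, xs, ys, ms)
```

The two forms are bitwise equivalent, which is why a backend can unfold the xor form back into an and/andn/or sequence when that maps to better instructions.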

[X86] Simplify some X86 address mode folding code, NFCI · 537917d1

Reid Kleckner authored May 21, 2018

This code should really do exactly the same thing for 32-bit x86 and
64-bit small code models, with the exception that RIP-relative
addressing can't use base and index registers.

llvm-svn: 332893

537917d1

[X86] Remove masking from vpternlog intrinsics. Use a select in IR instead. · aad3aefa

Craig Topper authored May 21, 2018

This removes 6 intrinsics since we no longer need separate mask and maskz intrinsics.

Differential Revision: https://reviews.llvm.org/D47124

llvm-svn: 332890

aad3aefa

CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output. · 9a45114b

Peter Collingbourne authored May 21, 2018

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47089

llvm-svn: 332881

9a45114b

MC: Separate creating a generic object writer from creating a target object writer. NFCI. · dcd7d6c3

Peter Collingbourne authored May 21, 2018

With this we gain a little flexibility in how the generic object
writer is created.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47045

llvm-svn: 332868

dcd7d6c3

Fix ubsan bounds check failure. · 2602a0d4

Peter Collingbourne authored May 21, 2018

llvm-svn: 332866

2602a0d4

[AMDGPU] Add divergence analysis as a dependency for ISel · 9badad20

Stanislav Mekhanoshin authored May 21, 2018

AMDGPUDAGToDAGISel adds DivergenceAnalysis in getAnalysisUsage
but does not list it in pass dependencies, which may lead to
a crash.

Differential Revision: https://reviews.llvm.org/D47151

llvm-svn: 332862

9badad20

MC: Change MCAsmBackend::writeNopData() to take a raw_ostream instead of an MCObjectWriter. NFCI. · 571a3301

Peter Collingbourne authored May 21, 2018

To make this work I needed to add an endianness field to MCAsmBackend
so that writeNopData() implementations know which endianness to use.

Part of PR37466.

Differential Revision: https://reviews.llvm.org/D47035

llvm-svn: 332857

571a3301

AMDGPU/GlobalISel: Address post-commit review comments for r332379 · a91ce17b

Tom Stellard authored May 21, 2018

MCRegisterInfo::getPhysRegSize() will be deprecated.

llvm-svn: 332856

a91ce17b

[X86][BtVer2] Add a 'J' prefix to the PRF/RCU defs. NFC · b5757abe

Andrea Di Biagio authored May 21, 2018

This is to keep the Jaguar model's naming convention. Processor resources all
have a 'J' prefix in the BtVer2 scheduling model.

llvm-svn: 332851

b5757abe

[X86] - Avoid SFB pass - fix bug in updating the offsets for newly created copies · 9417f7ff

Lama Saba authored May 21, 2018

Change-Id: I169ab6fe7e187727c0298c2a1e2868a683f3e688

llvm-svn: 332849

9417f7ff

[X86][SSE] Add an assert to ensure that rotation amount is converted to a scale · a8869e68

Simon Pilgrim authored May 21, 2018

Missed in rL332832 where we added SSE v4i32 rotations for PR37426.

llvm-svn: 332844

a8869e68

ARM: be conservative when asked load/store alignment of weird type. · 4e3eec39

Tim Northover authored May 21, 2018

Chances are we'll be asked again after type legalization, but before that point
it's better to claim misaligned accesses aren't allowed than to assert.

llvm-svn: 332840

4e3eec39