Commits · cac28aeb3fe14c7c4afd723a544086aa7237001d · Roger Ferrer / llvm-epi

Jun 15, 2018

[PowerPC] Add support for high and higha symbol modifiers on tls modifiers. · cac28aeb

Sean Fertile authored Jun 15, 2018

Enables using the high and high-adjusted symbol modifiers on thread local
storage modifiers in PowerPC assembly. Needed to support 64-bit
thread-pointer and dynamic-thread-pointer access sequences.

Differential Revision: https://reviews.llvm.org/D47754

llvm-svn: 334856

cac28aeb

[PPC64] Support "symbol@high" and "symbol@higha" symbol modifiers. · 80b8f82f

Sean Fertile authored Jun 15, 2018

Add support for the "@high" and "@higha" symbol modifiers in powerpc64 assembly.
The modifiers represent accessing the segment consisting of bits 16-31 of a
64-bit address/offset.
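
As a rough illustration of the arithmetic (not LLVM code), the sketch below assumes the usual PowerPC ELF convention that the "adjusted" variant adds 0x8000 before shifting, to compensate for the low 16 bits being consumed as a signed immediate. The helper names `high`, `higha` and `low_signed` are hypothetical.

```python
def high(value: int) -> int:
    """Bits 16-31 of the value, as selected by @high."""
    return (value >> 16) & 0xFFFF


def higha(value: int) -> int:
    """Bits 16-31 after adding 0x8000, as selected by @higha (assumed ABI rule)."""
    return ((value + 0x8000) >> 16) & 0xFFFF


def low_signed(value: int) -> int:
    """Low 16 bits interpreted as a signed 16-bit immediate."""
    lo = value & 0xFFFF
    return lo - 0x10000 if lo & 0x8000 else lo


addr = 0x12348765
# The adjusted high part plus the sign-extended low part rebuilds the low 32 bits.
rebuilt = (higha(addr) << 16) + low_signed(addr)
```

With this value, `high(addr)` and `higha(addr)` differ by one because bit 15 of the low half is set, which is the case the adjustment exists for.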

Differential Revision: https://reviews.llvm.org/D47729

llvm-svn: 334855

80b8f82f

Move redundant-vf2-cost.ll test to X86 directory · 72aed5e5

Diego Caballero authored Jun 15, 2018

redundant-vf2-cost.ll is X86 specific. Moved from
test/Transforms/LoopVectorize/redundant-vf2-cost.ll to
test/Transforms/LoopVectorize/X86/redundant-vf2-cost.ll

llvm-svn: 334854

72aed5e5

[llvm-mca][x86] Add Generic cpu resource tests · f5ecd8d5

Simon Pilgrim authored Jun 15, 2018

Added a Generic x86 cpu set of resource tests to allow us to check all ISAs.

We currently use SandyBridge as our generic CPU model, but it is better to duplicate these tests in case we change the model. It also means we don't end up polluting the SandyBridge folder with tests for ISAs it doesn't support.

llvm-svn: 334853

f5ecd8d5

[X86] Lowering sqrt intrinsics to native IR · bcaab53d

Tomasz Krupa authored Jun 15, 2018

Summary: Complementary patch to lowering sqrt intrinsics in Clang.

Reviewers: craig.topper, spatel, RKSimon, DavidKreitzer, uriel.k

Reviewed By: craig.topper

Subscribers: tkrupa, mike.dvoretsky, llvm-commits

Differential Revision: https://reviews.llvm.org/D41599

llvm-svn: 334849

bcaab53d

[X86] Prevent folding stack reloads into instructions in hasUndefRegUpdate. · 1657b7b8

Craig Topper authored Jun 15, 2018

An earlier commit prevented folds from the peephole pass by checking for IMPLICIT_DEF. But later in the pipeline IMPLICIT_DEF just becomes an Undef flag on the input register, so we need to check for that case too.

llvm-svn: 334848

1657b7b8

Remove <undef> from rematerialized full register · 1a70426a

Krzysztof Parzyszek authored Jun 15, 2018

When coalescing a small register into a subregister of a larger register,
if the larger register is rematerialized, the function updateRegDefUses
can add an <undef> flag to the rematerialized definition (since it's
treating it as only defining the coalesced subregister). Under that
assumption adding the flag is not incorrect, but make sure to remove it
later on, after the call to updateRegDefUses.

llvm-svn: 334845

1a70426a

[InstCombine] Avoid iteration/mutation conflict · 6f406d4f

Joseph Tremoulet authored Jun 15, 2018

Summary:
When iterating users of a multiply in processUMulZExtIdiom, the
call to setOperand in the truncation case may replace the use
being visited; make sure the iterator has been advanced before
doing that replacement.
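
The general hazard can be sketched outside LLVM. The toy model below is an assumption made for illustration: a singly linked "use list" whose nodes are unlinked when replaced. The class and function names (`Use`, `UseList`, `visit_all`) are hypothetical and do not correspond to LLVM's API. The point is only that the successor is read before the current node may be unlinked.

```python
class Use:
    """One node in a toy intrusive use list."""

    def __init__(self, name):
        self.name = name
        self.next = None


class UseList:
    def __init__(self, names):
        self.head = None
        tail = None
        for name in names:
            node = Use(name)
            if tail is None:
                self.head = node
            else:
                tail.next = node
            tail = node

    def remove(self, use):
        """Unlink a node; afterwards its own next pointer is no longer valid."""
        if self.head is use:
            self.head = use.next
        else:
            prev = self.head
            while prev.next is not use:
                prev = prev.next
            prev.next = use.next
        use.next = None

    def names(self):
        out, node = [], self.head
        while node is not None:
            out.append(node.name)
            node = node.next
        return out


def visit_all(use_list, should_replace):
    """Visit every use, advancing the cursor before any mutation."""
    visited = []
    use = use_list.head
    while use is not None:
        next_use = use.next  # advance first, so removal cannot strand us
        visited.append(use.name)
        if should_replace(use):
            use_list.remove(use)
        use = next_use
    return visited
```

If `next_use` were read after `remove`, the loop would stop at the removed node and silently skip the remaining users.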

Reviewers: majnemer, davide

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48192

llvm-svn: 334844

6f406d4f

[AArch64][SVE] Asm: Support for CPY SIMD/FP and GPR instructions. · a6edca72

Sander de Smalen authored Jun 15, 2018

Predicated splat/copy of SIMD/FP register or general purpose
register to SVE vector, along with MOV-aliases.

llvm-svn: 334842

a6edca72

Avoid copying PrettyStackTrace messages an extra time on Apple OSs · 7e535bc4

Jordan Rose authored Jun 15, 2018

We were unnecessarily going from SmallString to std::string just to
get a null-terminated C string. So just...don't do that. Crash
slightly faster!

llvm-svn: 334841

7e535bc4

[LV] Prevent LV from running the cost model twice for VF=2 · 68795245

Diego Caballero authored Jun 15, 2018

This is a minor fix for LV cost model, where the cost for VF=2 was
computed twice when the vectorization of the loop was forced without
specifying a VF.

Reviewers: xusx595, hsaito, fhahn, mkuper

Reviewed By: hsaito, xusx595

Differential Revision: https://reviews.llvm.org/D48048

llvm-svn: 334840

68795245

[AArch64][SVE] Asm: Support for INC/DEC (scalar) instructions. · 18ac8f9f

Sander de Smalen authored Jun 15, 2018

Increment/decrement scalar register by (scaled) element count given by
predicate pattern, e.g. 'incw x0, all, mul #4'.

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D47713

llvm-svn: 334838

18ac8f9f

AMDGPU: Add combine for short vector extract_vector_elts · 63bc0e3c

Matt Arsenault authored Jun 15, 2018

Try to access pieces 4 bytes at a time. This helps
various hasOneUse extract_vector_elt combines, such
as load width reductions.

Avoids test regressions in a future commit.

llvm-svn: 334836

63bc0e3c

AMDGPU: Make v4i16/v4f16 legal · 02dc7e19

Matt Arsenault authored Jun 15, 2018

Some image loads return these, and it's awkward working
around them not being legal.

llvm-svn: 334835

02dc7e19

[llvm-readobj] Add -string-dump (-p) option · fa5597b2

Paul Semel authored Jun 15, 2018

This option prints the section content as a string.

Differential Revision: https://reviews.llvm.org/D47989

llvm-svn: 334834

fa5597b2

[MCA] Add -summary-view option · 9ddf128f

Roman Lebedev authored Jun 15, 2018

Summary:
While that is indeed a quite interesting summary stat,
there are cases where it does not really add anything
other than consuming extra lines.

Declutters the output of D48190.

Reviewers: RKSimon, andreadb, courbet, craig.topper

Reviewed By: andreadb

Subscribers: javed.absar, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48209

llvm-svn: 334833

9ddf128f

[MCA][x86][NFC] Add tests for -register-file-stats, -scheduler-stats · 7c423001

Roman Lebedev authored Jun 15, 2018

Summary:
There do not seem to be any other tests for this.
Split off from D47676.

Reviewers: RKSimon, craig.topper, courbet, andreadb

Reviewed By: andreadb

Subscribers: javed.absar, gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D48190

llvm-svn: 334832

7c423001

[AArch64][SVE] Asm: Support for FADD, FMUL and FMAX immediate instructions. · 5eb51d74

Sander de Smalen authored Jun 15, 2018

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: javed.absar

Differential Revision: https://reviews.llvm.org/D47712

llvm-svn: 334831

5eb51d74

Re-apply "[DebugInfo] Check size of variable in ConvertDebugDeclareToDebugValue" · 428caf98

Bjorn Pettersson authored Jun 15, 2018

This is r334704 (which was reverted in r334732) with a fix for
types like x86_fp80. We need to use getTypeAllocSizeInBits and
not getTypeStoreSizeInBits to avoid dropping debug info for
such types.
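
The difference between the two sizes can be sketched with plain arithmetic. This is only an illustration, assuming that the store size is the bit width rounded up to whole bytes, that the alloc size is the store size rounded up to the ABI alignment, and that x86_fp80 has a 128-bit ABI alignment as on x86-64. The function names are hypothetical stand-ins, not the DataLayout API.

```python
def store_size_in_bits(type_bits: int) -> int:
    """Bit width rounded up to a whole number of bytes."""
    return (type_bits + 7) // 8 * 8


def alloc_size_in_bits(type_bits: int, abi_align_bits: int) -> int:
    """Store size rounded up to the ABI alignment (includes tail padding)."""
    store = store_size_in_bits(type_bits)
    return (store + abi_align_bits - 1) // abi_align_bits * abi_align_bits


def covers_variable(value_size_bits: int, variable_size_bits: int) -> bool:
    """True if a stored value is large enough to describe the whole variable."""
    return value_size_bits >= variable_size_bits


# Assumed x86-64 numbers: an 80-bit value whose variable occupies 128 bits.
FP80_BITS, FP80_ALIGN_BITS, VARIABLE_BITS = 80, 128, 128

by_store_size = covers_variable(store_size_in_bits(FP80_BITS), VARIABLE_BITS)
by_alloc_size = covers_variable(
    alloc_size_in_bits(FP80_BITS, FP80_ALIGN_BITS), VARIABLE_BITS
)
```

Under these assumptions the store-size comparison treats the store as covering only a fragment of the variable, while the alloc-size comparison treats it as covering the whole variable.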

Original commit msg:
> Summary:
> Do not convert a DbgDeclare to DbgValue if the store
> instruction only refers to a fragment of the variable
> described by the DbgDeclare.
>
> Problem was seen when for example having an alloca for an
> array or struct, and there were stores to individual elements.
> In the past we inserted a DbgValue intrinsic for each store,
> just as if the store wrote the whole variable.
>
> When handling store instructions we insert a DbgValue that
> indicates that the variable is "undefined", as we do not know
> which part of the variable that is updated by the store.
>
> When ConvertDebugDeclareToDebugValue is used with a load/phi
> instruction we assert that the referenced value is large enough
> to cover the whole variable. Afaict this should be true for all
> scenarios where those methods are used on trunk. If the assert
> blows in the future I guess we could simply skip to insert a
> dbg.value instruction.
>
> In the future I think we should examine which part of the variable
> is accessed, and add a DbgValue intrinsic with an appropriate
> DW_OP_LLVM_fragment expression.
>
> Reviewers: dblaikie, aprantl, rnk
>
> Reviewed By: aprantl
>
> Subscribers: JDevlieghere, llvm-commits
>
> Tags: #debug-info
>
> Differential Revision: https://reviews.llvm.org/D48024

llvm-svn: 334830

428caf98

[mips] Add licensing information of the microMIPS tablegen files. (NFC) · 98b9849d

Simon Dardis authored Jun 15, 2018

llvm-svn: 334827

98b9849d

[AArch64][SVE] Asm: Add parsing/printing support for exact FP immediates. · 3cbf1714

Sander de Smalen authored Jun 15, 2018

Some instructions require one of a limited set of FP immediates as operands,
for example '#0.5 or #1.0' for SVE's FADD instruction.

This patch adds support for parsing and printing such FP immediates as
exact values (e.g. #0.499999 is not accepted for #0.5).
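
A minimal sketch of the idea of exact matching follows. It is not the assembler's implementation: the accepted set and the function name `parse_exact_fp_imm` are assumptions for illustration, and the comparison is done on decimal values so that no rounding can turn a near miss into a match.

```python
from decimal import Decimal

# Assumed accepted set, taken from the FADD example in the text.
ALLOWED = (Decimal("0.5"), Decimal("1.0"))


def parse_exact_fp_imm(text, allowed=ALLOWED):
    """Return the immediate as a float only if it names an allowed value exactly."""
    if not text.startswith("#"):
        raise ValueError(f"expected '#' prefix in {text!r}")
    value = Decimal(text[1:])
    for candidate in allowed:
        if value == candidate:  # exact comparison, no tolerance
            return float(candidate)
    raise ValueError(f"{text} is not one of the accepted immediates")
```

Different spellings of the same value, such as `#0.50`, still match, while `#0.499999` is rejected rather than rounded.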

Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D47711

llvm-svn: 334826

3cbf1714

[NFC] chmod +x utils/update_analyze_test_checks.py · 1ef9b2a1

Roman Lebedev authored Jun 15, 2018

Looks like a simple oversight.

llvm-svn: 334825

1ef9b2a1

DAG: Fix creating concat_vectors with illegal type · df2f4ef2

Matt Arsenault authored Jun 15, 2018

Test passes as is, but fails with future patch to make v4i16/v4f16
legal.

llvm-svn: 334823

df2f4ef2

[SLP][X86] Add AVX2 run to POW2 SDIV Tests · 180497ea

Simon Pilgrim authored Jun 15, 2018

Non-uniform pow2 tests only make sense on targets with fast (low cost) non-uniform shifts

llvm-svn: 334821

180497ea

[SLP][X86] Regenerate POW2 SDIV Tests · ca6215f8

Simon Pilgrim authored Jun 15, 2018

Added non-uniform pow2 test as well

llvm-svn: 334819

ca6215f8

[InstCombine] Recommit: Fold (x << y) >> y -> x & (-1 >> y) · 84c11aed

Roman Lebedev authored Jun 15, 2018

Summary:
We already do it for splat constants, but not for non-constant values.
Also, undef cases are mostly non-functional.

The original commit was reverted because
it broke tests for the amdgpu backend, which I didn't check.
Now the backend has been updated to recognize these new
patterns, so we are good.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX
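
The fold can be sanity-checked by brute force on a small bit width. The sketch below assumes `>>` denotes a logical shift right and models 8-bit integers with explicit masking. It is an illustration only, not a substitute for the Alive proof linked above, and all names in it are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1  # all ones, i.e. -1 at this width


def shl(x, y):
    """Shift left, discarding bits that fall off the top."""
    return (x << y) & MASK


def lshr(x, y):
    """Logical shift right on an unsigned value."""
    return (x & MASK) >> y


def before(x, y):
    """(x << y) >> y"""
    return lshr(shl(x, y), y)


def after(x, y):
    """x & (-1 >> y)"""
    return x & lshr(MASK, y)


def fold_holds():
    """Check every 8-bit x against every in-range shift amount."""
    return all(
        before(x, y) == after(x, y)
        for x in range(1 << BITS)
        for y in range(BITS)
    )
```

Shift amounts equal to or larger than the bit width are excluded from the check.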

Reviewers: spatel, craig.topper, mareko, bogner, rampitec, nhaehnle, arsenm

Reviewed By: spatel, rampitec, nhaehnle

Subscribers: wdng, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D47980

llvm-svn: 334818

84c11aed

[AMDGPU] Recognize x & ~(-1 << y) pattern. · dec562c8

Roman Lebedev authored Jun 15, 2018

Summary: The same pattern as D48010, but this one is IR-canonical as of D47428.

Reviewers: nhaehnle, bogner, tstellar, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #amdgpu

Differential Revision: https://reviews.llvm.org/D48012

llvm-svn: 334817

dec562c8

[AMDGPU] Recognize x & ((1 << y) - 1) pattern. · 9c17dad8

Roman Lebedev authored Jun 15, 2018

Summary:
As a followup for D48007.

Since we already handle the `x << (bitwidth - y) >> (bitwidth - y)` pattern,
which does not have UB for both the edge cases (`y == 0`, `y == bitwidth`),
I think also handling a pattern that is UB for `y == bitwidth` should be fine.
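
For reference, the mask patterns named in these AMDGPU commits all keep the low `y` bits of `x`. The sketch below models 32-bit values with explicit masking and compares the forms only for `1 <= y <= 31`, deliberately leaving the edge cases out. The function names are hypothetical.

```python
BITS = 32
MASK = 0xFFFFFFFF  # all ones, i.e. -1 at 32 bits


def shift_pair(x, y):
    """x << (32 - y) >> (32 - y), with a logical right shift."""
    s = BITS - y
    return ((x << s) & MASK) >> s


def lshr_mask(x, y):
    """x & (-1 >> (32 - y))"""
    return x & (MASK >> (BITS - y))


def sub_mask(x, y):
    """x & ((1 << y) - 1)"""
    return x & (((1 << y) - 1) & MASK)


def not_shl_mask(x, y):
    """x & ~(-1 << y)"""
    return x & (~((MASK << y) & MASK) & MASK)


def all_forms_agree(x):
    """Compare the four forms for every shift amount strictly inside the width."""
    return all(
        shift_pair(x, y) == lshr_mask(x, y) == sub_mask(x, y) == not_shl_mask(x, y)
        for y in range(1, BITS)
    )
```

For example, with `x = 0xDEADBEEF` and `y = 12`, every form yields the low 12 bits of `x`.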

Reviewers: nhaehnle, bogner, tstellar, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #amdgpu

Differential Revision: https://reviews.llvm.org/D48010

llvm-svn: 334816

9c17dad8

[AMDGPU] Recognize x & (-1 >> (32 - y)) pattern. · aa8587d1

Roman Lebedev authored Jun 15, 2018

Summary:
D47980 will canonicalize `x << (32 - y) >> (32 - y)`,
which is the pattern the AMDGPU backend expects, to `x & (-1 >> (32 - y))`,
which the AMDGPU backend does not recognize.

Thus, it needs to be recognized, too.

Reviewers: nhaehnle, bogner, tstellar, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #amdgpu

Differential Revision: https://reviews.llvm.org/D48007

llvm-svn: 334815

aa8587d1

[MC] Move bundling and MCSubtargetInfo to MCEncodedFragment [NFC] · 1503fc0f

Peter Smith authored Jun 15, 2018

Instruction bundling is only supported on descendants of the
MCEncodedFragment type. By moving the bundling functionality and
MCSubtargetInfo to this class it makes it easier to set and extract the
MCSubtargetInfo when it is necessary.

This is a refactoring change that will make it easier to pass the
MCSubtargetInfo through to writeNops when nop padding is required.

Differential Revision: https://reviews.llvm.org/D45959

llvm-svn: 334814

1503fc0f

[llvm-exegesis][NFC] Remove dead variable. · 205276bf

Clement Courbet authored Jun 15, 2018

llvm-svn: 334813

205276bf

[llvm-exegesis][NFC] Add more comments. · f64007fe

Clement Courbet authored Jun 15, 2018

llvm-svn: 334811

f64007fe

add myself to the CREDITS.TXT · 0651eb1b

QingShan Zhang authored Jun 15, 2018

llvm-svn: 334808

0651eb1b

NFC: Regenerating x86-sse41.ll test for InstCombine · 0531ec65

Mikhail Dvoretckii authored Jun 15, 2018

Test regenerated to reduce noise in further patches.

llvm-svn: 334806

0531ec65

[llvm-exegesis] Print the whole snippet in analysis. · 4273e1e8

Clement Courbet authored Jun 15, 2018

Summary:
On hover, the whole asm snippet is displayed, including operands.

This requires the actual assembly output instead of just the MCInsts:
This is because some pseudo-instructions get lowered to actual target
instructions during codegen (e.g. ABS_Fp32 -> SSE or X87).

Reviewers: gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48164

llvm-svn: 334805

4273e1e8

Revert r334802 "[X86] Prevent folding stack reloads with instructions that... · c8a763ed

Craig Topper authored Jun 15, 2018

Revert r334802 "[X86] Prevent folding stack reloads with instructions that have an undefined register update."

There's a typo causing the build to fail.

llvm-svn: 334803

c8a763ed

[X86] Prevent folding stack reloads with instructions that have an undefined register update. · 5ec210cc

Craig Topper authored Jun 15, 2018

We want to keep the load unfolded so we can use the same register for both sources to avoid a false dependency.

llvm-svn: 334802

5ec210cc

[X86] Add more instructions to the memory folding tables using the autogenerated table as a guide. · 3c4cc012

Craig Topper authored Jun 15, 2018

I think this covers most of the unmasked vector instructions. We're still missing a lot of the masked instructions.

There are some test changes here because of the new folding support. I don't think these particular cases should be folded because it creates an undef register dependency. I think the changes introduced in r334175 are not handling stack folding. They're only blocking the peephole pass.

llvm-svn: 334800

3c4cc012

[NFC] fix trivial typos in documents · c36a1f1c

Hiroshi Inoue authored Jun 15, 2018

llvm-svn: 334799

c36a1f1c

[X86] Fix some checks to use X86 instead of X32. · 3b060dab

Craig Topper authored Jun 15, 2018

These tests were recently updated, and it looks like something went wrong.

llvm-svn: 334786

3b060dab