Commits · 590f0793e8c7f25a7d993031436928823cd2c95a · Lorenzo Albano / LLVM bpEVL

Nov 24, 2017

[mips] Set microMIPS ASE flag · 590f0793

Aleksandar Beserminji authored Nov 24, 2017

This patch fixes an issue where the microMIPS ASE flag is not set
when a function has the micromips attribute or when the .set micromips
directive is used.

Differential Revision: https://reviews.llvm.org/D40316

llvm-svn: 318948

590f0793

[AMDGPU][MC][GFX9] Added support of 'inst_offset' modifier for compatibility with SP3 · dd2f1c99

Dmitry Preobrazhensky authored Nov 24, 2017

See bug 35329: https://bugs.llvm.org//show_bug.cgi?id=35329

Reviewers: arsenm, vpykhtin, artem.tamazov

Differential Revision: https://reviews.llvm.org/D40350

llvm-svn: 318947

dd2f1c99

Nov 23, 2017

[X86] Don't invert NewCC variable while processing the jcc/setcc/cmovcc... · 40a1edc3

Craig Topper authored Nov 23, 2017

[X86] Don't invert NewCC variable while processing the jcc/setcc/cmovcc instructions in optimizeCompareInstr.

The NewCC variable is calculated outside of the loop that processes jcc/setcc/cmovcc instructions. If we invert it during the loop it can cause an incorrect value to be used by a later iteration. Instead only read it during the loop and use a new variable to store the possibly inverted value.
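The bug pattern described above can be sketched in isolation. This is a minimal illustration, not the code from optimizeCompareInstr: the `CondCode` enum, `invert`, and both `rewrite*` functions are hypothetical names, and the per-user "needs inversion" flag is an assumed stand-in for whatever condition triggers the inversion in the real pass.

```cpp
#include <vector>

// Hypothetical stand-in for an X86 condition code; only two values are
// modelled, and invert() flips between them.
enum class CondCode { E, NE };

inline CondCode invert(CondCode CC) {
  return CC == CondCode::E ? CondCode::NE : CondCode::E;
}

// Buggy shape: the loop-invariant NewCC is overwritten inside the loop, so
// an inversion requested for one user leaks into every later user.
inline std::vector<CondCode> rewriteBuggy(CondCode NewCC,
                                          const std::vector<bool> &NeedsInvert) {
  std::vector<CondCode> Result;
  for (bool Inv : NeedsInvert) {
    if (Inv)
      NewCC = invert(NewCC); // mutates state shared across iterations
    Result.push_back(NewCC);
  }
  return Result;
}

// Fixed shape: NewCC is only read; each iteration stores the possibly
// inverted value in its own local variable.
inline std::vector<CondCode> rewriteFixed(CondCode NewCC,
                                          const std::vector<bool> &NeedsInvert) {
  std::vector<CondCode> Result;
  for (bool Inv : NeedsInvert) {
    CondCode ReplacementCC = Inv ? invert(NewCC) : NewCC;
    Result.push_back(ReplacementCC);
  }
  return Result;
}
```

With the flags `{true, false}`, the buggy version hands the inverted code to the second user as well, while the fixed version gives the second user the original code.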

Fixes PR35399.

llvm-svn: 318934

40a1edc3

[X86] Teach isel that X86ISD::CMPM_RND zeros the upper bits of the mask register. · f31b0b85

Craig Topper authored Nov 23, 2017

llvm-svn: 318933

f31b0b85

[X86][SSE] Use (V)PHMINPOSUW for vXi16 SMAX/SMIN/UMAX/UMIN horizontal reductions (PR32841) · 90accbc5

Simon Pilgrim authored Nov 23, 2017

(V)PHMINPOSUW determines the UMIN element in a v8i16 input; with suitable bit flipping it can also be used for the SMAX/SMIN/UMAX cases.

This patch matches vXi16 SMAX/SMIN/UMAX/UMIN horizontal reductions and reduces the input down to a v8i16 vector before calling (V)PHMINPOSUW.

A later patch will use this for v16i8 reductions as well (PR32841).
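The bit-flipping idea can be modelled on scalars. The sketch below is an assumption-laden illustration, not the patch itself: `uminReduce` models only the value half of (V)PHMINPOSUW (the real instruction also reports the lane index), the function names are hypothetical, and the XOR masks are the usual order-mapping constants rather than values quoted from the commit.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

using V8 = std::array<uint16_t, 8>;

// Scalar model of the value result of PHMINPOSUW: unsigned minimum of 8 lanes.
inline uint16_t uminReduce(const V8 &V) {
  return *std::min_element(V.begin(), V.end());
}

// XOR every lane with Mask, take the unsigned minimum, then XOR the result
// back. The mask remaps the desired ordering onto unsigned "smallest first".
inline uint16_t reduceWithMask(V8 V, uint16_t Mask) {
  for (uint16_t &Lane : V)
    Lane = static_cast<uint16_t>(Lane ^ Mask);
  return static_cast<uint16_t>(uminReduce(V) ^ Mask);
}

// UMAX: flipping all bits reverses the unsigned order.
inline uint16_t umaxReduce(const V8 &V) { return reduceWithMask(V, 0xFFFF); }

// SMIN: flipping the sign bit maps signed order onto unsigned order.
inline int16_t sminReduce(const V8 &V) {
  return static_cast<int16_t>(reduceWithMask(V, 0x8000));
}

// SMAX: flipping every bit except the sign bit does both at once.
inline int16_t smaxReduce(const V8 &V) {
  return static_cast<int16_t>(reduceWithMask(V, 0x7FFF));
}
```

All four reductions share one unsigned-minimum primitive, which is why a single instruction can serve them once the input has been narrowed to eight 16-bit lanes.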

Differential Revision: https://reviews.llvm.org/D39729

llvm-svn: 318917

90accbc5

[ARM GlobalISel] Support G_FDIV for s32 and s64 · c01f7f13

Diana Picus authored Nov 23, 2017

TableGen already generates code for selecting a G_FDIV, so we only need
to add a test.

For the legalizer and reg bank select, we do the same thing as for the
other floating point binary operations: either mark as legal if we have
a FP unit or lower to a libcall, and map to the floating point
registers.

llvm-svn: 318915

c01f7f13

[ARM GlobalISel] Support G_FMUL for s32 and s64 · 9faa09b2

Diana Picus authored Nov 23, 2017

TableGen already generates code for selecting a G_FMUL, so we only need
to add a test for that part.

For the legalizer and reg bank select, we do the same thing as the other
floating point binary operators: either mark as legal if we have a FP
unit or lower to a libcall, and map to the floating point registers.

llvm-svn: 318910

9faa09b2

[mips] Use the delay slot filler to convert branches for microMIPSR6. · eb5bfd98

Simon Dardis authored Nov 23, 2017

The MIPS delay slot filler converts delay slot branches into compact
forms for the MIPS ISAs which support them. For branches that compare
(in)equality with zero, it converts them into branches with implicit
zero register operands. These branches have a slightly greater range
than normal two-register-operand branches.

Changing the branches at this point in the pipeline lets the long
branch pass make better judgements about whether a long branch
sequence is required.

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D40314

llvm-svn: 318908

eb5bfd98

[x86][icelake]BITALG · e8bdd383

Coby Tayree authored Nov 23, 2017

2/3: vpshufbitqmb encoding
3/3: vpshufbitqmb intrinsics

Differential Revision: https://reviews.llvm.org/D40222

llvm-svn: 318904

e8bdd383

[MSan] Move the access address check before the shadow access for that address · 391804f5

Alexander Potapenko authored Nov 23, 2017

MSan used to insert the shadow check of the store pointer operand
_after_ the shadow of the value operand has been written.
This happens to work in the userspace, as the whole shadow range is
always mapped. However in the kernel the shadow page may not exist, so
the bug may cause a crash.

This patch moves the address check in front of the shadow access.

llvm-svn: 318901

391804f5

[X86] Regenerate the vector-popcnt and vector-tzcnt tests to get BITALG CHECK... · 3fba1bfb

Craig Topper authored Nov 22, 2017

[X86] Regenerate the vector-popcnt and vector-tzcnt tests to get BITALG CHECK lines on all functions, not just the vXi16/vXi8.

llvm-svn: 318885

3fba1bfb

Nov 22, 2017

IR printing improvement for loop passes · 61975b49

Fedor Sergeev authored Nov 22, 2017

Summary:
Loop-pass printing is somewhat deficient since it does not provide the
context around the loop (e.g. preheader). This context information becomes
pretty essential when analyzing transformations that move stuff out of the loop.

Extending printLoop to cover preheader and exit blocks (if any).

Reviewers: sanjoy, silvas, weimingz

Reviewed By: sanjoy

Subscribers: apilipenko, skatkov, llvm-commits

Differential Revision: https://reviews.llvm.org/D40246

llvm-svn: 318878

61975b49

[Hexagon] Implement buildVector32 and buildVector64 as utility functions · 942fa163

Krzysztof Parzyszek authored Nov 22, 2017

Change LowerBUILD_VECTOR to use those functions. This commit will temporarily
affect constant vector generation (it will generate constant-extended
values instead of non-extended combines), but the code for the general case
should be better. The constant selection part will be fixed later.

llvm-svn: 318877

942fa163

[Hexagon] Add patterns to select A2_combine_ll and its variants · b9f33b32

Krzysztof Parzyszek authored Nov 22, 2017

llvm-svn: 318876

b9f33b32

[Hexagon] Remove trailing spaces, NFC · 6acecc96

Krzysztof Parzyszek authored Nov 22, 2017

llvm-svn: 318875

6acecc96

[X86] Support v32i16/v64i8 CTLZ using lookup table. · 726968d6

Craig Topper authored Nov 22, 2017

Had to tweak the setcc's used by the code to use a vXi1 result type with a sign extend back to vector size.

llvm-svn: 318871

726968d6

[DwarfDump] -debug-line=offset applies to .dwo too. · 6ca1dd6f

Paul Robinson authored Nov 22, 2017

llvm-svn: 318856

6ca1dd6f

[AMDGPU] Fix SITargetLowering::LowerCall for pointer info of byval argument · c5962266

Yaxun Liu authored Nov 22, 2017

SITargetLowering::LowerCall uses dummy pointer info for a byval argument, which causes
a flat load to be emitted instead of a buffer load.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D40040

llvm-svn: 318844

c5962266

[DebugInfo] Dump a .debug_line section, including line-number program, · 511b54ca

Paul Robinson authored Nov 22, 2017

without any compile units.

Differential Revision: https://reviews.llvm.org/D40114

llvm-svn: 318842

511b54ca

[AMDGPU][mc][tests] Updated generated lit tests for GFX8/9 · c492500e

Dmitry Preobrazhensky authored Nov 22, 2017

Summary:
Added tests to better cover features introduced by commit rL318675.
See http://llvm.org/viewvc/llvm-project?view=revision&revision=318675

llvm-svn: 318841

c492500e

[DWARFv5] Support DW_FORM_strp in the .debug_line.dwo header. · 63811a47

Paul Robinson authored Nov 22, 2017

As a side effect, the .debug_line section will be dumped in physical
order, rather than in the order that compile units refer to their
associated portions of the .debug_line section.  These are probably
always the same order anyway, and no tests noticed the difference.

Differential Revision: https://reviews.llvm.org/D39854

llvm-svn: 318839

63811a47

[DWARF] Fix handling of extended line-number opcodes · e0833349

Paul Robinson authored Nov 22, 2017

Differential Revision: https://reviews.llvm.org/D40200

llvm-svn: 318838

e0833349

AMDGPU: Consider memory dependencies with moved instructions in SILoadStoreOptimizer · dd059c16

Nicolai Haehnle authored Nov 22, 2017

Summary:
This bug seems to have gone unnoticed because critical cases with LDS
instructions are eliminated by the peephole optimizer.

However, equivalent situations arise with buffer loads and stores
as well, so this fixes regressions since r317751 ("AMDGPU: Merge
S_BUFFER_LOAD_DWORD_IMM into x2, x4").

Fixes at least:
KHR-GL45.shader_storage_buffer_object.basic-operations-case1-cs
KHR-GL45.cull_distance.functional
piglit tes-input-gl_ClipDistance.shader_test
... and probably more

Change-Id: I0e371536288eb8e6afeaa241a185266fd45d129d

Reviewers: arsenm, mareko, rampitec

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D40303

llvm-svn: 318829

dd059c16

[DAGCombiner] Bugfix in isAlias(). · 181e260e

Jonas Paulsson authored Nov 22, 2017

Since i1 is a legal type, this:

  NumBytes = Op1->getMemoryVT().getSizeInBits() >> 3;

is wrong and should be instead

  NumBytes = Op0->getMemoryVT().getStoreSize();

There seem to be more places where this should be fixed outside DAGCombiner.
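The arithmetic behind the fix can be shown without any LLVM types. This is a sketch under stated assumptions: `bytesByShift` mirrors the old `>> 3` expression, `storeSizeInBytes` models a store-size query as a round-up to whole bytes, and `rangesOverlap` is a hypothetical helper standing in for the overlap test that consumes the byte counts.

```cpp
#include <cstdint>

// Truncating conversion, as in the old expression: drops any partial byte.
constexpr uint64_t bytesByShift(uint64_t SizeInBits) {
  return SizeInBits >> 3;
}

// Rounding-up conversion; models what a store-size query is assumed to
// return (the number of bytes actually written to memory).
constexpr uint64_t storeSizeInBytes(uint64_t SizeInBits) {
  return (SizeInBits + 7) / 8;
}

// Hypothetical overlap test on half-open byte ranges
// [OffA, OffA + SizeA) and [OffB, OffB + SizeB).
constexpr bool rangesOverlap(uint64_t OffA, uint64_t SizeA,
                             uint64_t OffB, uint64_t SizeB) {
  return OffA < OffB + SizeB && OffB < OffA + SizeA;
}

// An i1 access truncates to zero bytes, but occupies one byte when stored.
static_assert(bytesByShift(1) == 0, "shift drops the partial byte");
static_assert(storeSizeInBytes(1) == 1, "store size rounds up");
```

With a zero-byte size, two i1 accesses at the same address look like empty ranges and are reported as non-overlapping, which is the kind of wrong no-alias answer the commit guards against. For byte-multiple types such as i32 the two conversions agree.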

Review: Hal Finkel
https://bugs.llvm.org/show_bug.cgi?id=35366

llvm-svn: 318824

181e260e

[SCEV] Strengthen variance condition in calculateLoopDisposition · 23044fa6

Max Kazantsev authored Nov 22, 2017

Consider loops `L1` and `L2` with AddRecs `AR1` and `AR2` varying in them respectively.
When identifying loop disposition of `AR2` w.r.t. `L1`, we only say that it is varying if
`L1` contains `L2`. But there is also a possible situation where `L1` and `L2` are
consecutive sibling loops within the parent loop. In this case, `AR2` is also varying
w.r.t. `L1`, but we don't correctly identify it.

This can lead, for example, to an attempt at incorrect folding. Consider:
  AR1 = {a,+,b}<L1>
  AR2 = {c,+,d}<L2>
  EXAR2 = sext(AR2)
  MUL = mul AR1, EXAR2
If we incorrectly assume that `EXAR2` is invariant w.r.t. `L1`, we can end up trying to
construct something like: `{a * {c,+,d}<L2>,+,b * {c,+,d}<L2>}<L1>`, which is incorrect
because `AR2` is not available on entrance of `L1`.

Both situations "`L1` contains `L2`" and "`L1` precedes sibling loop `L2`" can be handled
with one check: "header of `L1` dominates header of `L2`". This patch replaces the old
insufficient check with this one.

Differential Revision: https://reviews.llvm.org/D39453

llvm-svn: 318819

23044fa6

[SCCP] Pick the right lattice value for constants. · b480b5c2

Davide Italiano authored Nov 22, 2017

After the dataflow algorithm proves that an argument is constant,
it replaces its value with the integer constant and drops the lattice
value associated with the DEF.

e.g. in the example we have @f() that's called twice:
call @f(undef, ...)
call @f(2, ...)

`undef` MEET 2 = 2 so we replace the argument and all its uses with
the constant 2.

Shortly after, tryToReplaceWithConstantRange() tries to get the lattice
value for the argument we just replaced, causing an assertion.
This function is a little peculiar as it runs when we're doing replacement
and not as part of the solver but still queries the solver.

The fix is to check whether we have already replaced the value and, if so,
to get a temporary lattice value for the constant.
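The `undef` MEET 2 = 2 step can be sketched with a toy lattice. This is a simplified illustration, not the solver's real data structure: the three-state `LatticeVal` and the `meet` function are assumed names, and constant ranges, which the real pass also tracks, are left out.

```cpp
#include <cstdint>

// Simplified lattice assumed for illustration: Undefined carries no
// information yet, Constant holds one known value, and Overdefined means
// the value is not a single known constant.
struct LatticeVal {
  enum Kind { Undefined, Constant, Overdefined } K;
  int64_t Value; // meaningful only when K == Constant
};

// Combine the facts arriving from two call sites for the same argument.
inline LatticeVal meet(LatticeVal A, LatticeVal B) {
  if (A.K == LatticeVal::Undefined)
    return B; // undef adds no constraint
  if (B.K == LatticeVal::Undefined)
    return A;
  if (A.K == LatticeVal::Constant && B.K == LatticeVal::Constant &&
      A.Value == B.Value)
    return A; // both sides agree on the same constant
  return LatticeVal{LatticeVal::Overdefined, 0};
}
```

For the two calls in the example, `meet` of an Undefined value and the constant 2 yields the constant 2, so the argument can be replaced; two different constants would instead yield Overdefined.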

Thanks to Zhendong Su for the report!

Fixes PR35357.

llvm-svn: 318817

b480b5c2

Nov 21, 2017

Object: Improve COFF irsymtab comdat representation. · 6c484622

Peter Collingbourne authored Nov 21, 2017

Change the representation of COFF comdats so that a COFF linker
is able to accurately resolve comdats between IR and native object
files. Specifically, apply name mangling to comdat names consistently
with native object files, and do not export comdats with an internal
leader because they do not affect symbol resolution.

Differential Revision: https://reviews.llvm.org/D40278

llvm-svn: 318805

6c484622

[Hexagon] Make sure that RDF does not remove EH_LABELs · fc0a1812

Krzysztof Parzyszek authored Nov 21, 2017

Since EH_LABELs (and other labels) no longer have "side-effects", they
should be checked for separately.

llvm-svn: 318801

fc0a1812

[X86] Allow vpclmulqdq instructions to be commuted during isel to allow load folding. · ba150ef6

Craig Topper authored Nov 21, 2017

The commuting patterns for the AVX version actually still had priority over the new patterns.

llvm-svn: 318800

ba150ef6

Avoid unnecessary opsize byte in segment move to memory · 61ffc9c0

Nirav Dave authored Nov 21, 2017

Segment moves to memory are always 16-bit. Remove invalid 32 and 64
bit variants.

Recommitting with missing clang inline assembly test change.

Fixes PR34478.

Reviewers: rnk, craig.topper

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D39847

llvm-svn: 318797

61ffc9c0

[AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as *having* side effects. · fe97d736

Chad Rosier authored Nov 21, 2017

This partially reverts r298851.  The underlying issue is that we don't
currently model the dependency between mrs (read system register) and
msr (write system register) instructions.

Something like the below should never be reordered:

 msr TPIDR_EL0, x0  ;; set thread pointer
 mrs x8, TPIDR_EL0  ;; read thread pointer

but was being reordered after r298851.  The functional part of the patch
that wasn't reverted needed to remain in place in order to not break
r299462.

PR35317

llvm-svn: 318788

fe97d736

Rename test/Transforms/CountingFunctionInserter -> EntryExitInstrumenter · d97c0f78

Hans Wennborg authored Nov 21, 2017

The pass was renamed in r318195.

llvm-svn: 318784

d97c0f78

EntryExitInstrumenter: support __cyg_profile_func_enter_bare · 37cbf28e

Hans Wennborg authored Nov 21, 2017

It works just like __cyg_profile_func_enter but takes no arguments.

llvm-svn: 318783

37cbf28e

[ARM] Remove pre-UAL FLDM/FSTM aliases · 9cb89f66

Oliver Stannard authored Nov 21, 2017

These are pre-UAL syntax, and we don't support any other pre-UAL instructions,
with the exception of FLDMX/FSTMX, which don't have a UAL equivalent. Therefore
there's no reason to keep them or their AsmParser hacks around.

With the AsmParser hacks removed, the FLDMX and FSTMX instructions get the same
operand diagnostics as the UAL instructions.

Differential revision: https://reviews.llvm.org/D39196

llvm-svn: 318777

9cb89f66

[ARM] Don't omit non-default predication code · 1e6d4b9e

Oliver Stannard authored Nov 21, 2017

This was causing the (invalid) predicated versions of the NEON VRINTX and
VRINTZ instructions to be accepted, with the condition code being ignored.

Also, there is no NEON VRINTR instruction, so that part of the check was not
necessary.

Differential revision: https://reviews.llvm.org/D39193

llvm-svn: 318771

1e6d4b9e

[Asm] Improve "too few operands" errors · 1e73e95f

Oliver Stannard authored Nov 21, 2017

- We can still emit this error if the actual instruction has two or more
  operands missing compared to the expected one.
- We should only emit this error once per instruction.

Differential revision: https://reviews.llvm.org/D36746

llvm-svn: 318770

1e73e95f

Revert r318759 due to make check-all failure on Windows · 4acd57eb

Sander de Smalen authored Nov 21, 2017

llvm-svn: 318768

4acd57eb

[ARM] Add diagnostics for SPR/DPR lists · d6ca9879

Oliver Stannard authored Nov 21, 2017

Differential revision: https://reviews.llvm.org/D39195

llvm-svn: 318766

d6ca9879

[InstCombine] Test for PR35354: unable to vectorize loop with std::max · a054ea98

Alexey Bataev authored Nov 21, 2017

on floats, NFC.

llvm-svn: 318764

a054ea98

[AMDGPU] SDWA: remove omod src operand for VOP2b instructions · c27e3b6f

Sam Kolton authored Nov 21, 2017

Summary: VOP2b instructions (v_subbrev_u32, v_add_i32 ...) shouldn't support OMod operand in SDWA encoding

Reviewers: rampitec, dp

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D40172

llvm-svn: 318761

c27e3b6f