Commits · 9364fa3434b6967b796e1eedc480198806ead916 · Lorenzo Albano / LLVM bpEVL

Dec 04, 2017

Move splitIndirectCriticalEdges() to BasicBlockUtils.h. · 9364fa34

Hiroshi Yamauchi authored Dec 04, 2017

Summary:
Move splitIndirectCriticalEdges() from CodeGenPrepare to BasicBlockUtils.h so
that it can be called from other places.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40750

llvm-svn: 319689

9364fa34

[ConstantFold] Support vector index when factoring out GEP index into preceding dimensions · 234eabaf

Haicheng Wu authored Dec 04, 2017

Follow-up of r316824. This patch supports the vector type for both current and
previous index when factoring out the current one into the previous one.

Differential Revision: https://reviews.llvm.org/D39556

llvm-svn: 319683

234eabaf

[SCEV] Use a "Discovered" set instead of a "Visited" set; NFC · adf37517
Sanjoy Das authored Dec 04, 2017
```
Suggested by Max Kazantsev in https://reviews.llvm.org/D39361

llvm-svn: 319679
```
adf37517

[SCEV] A different fix for PR33494 · 7e363379

Sanjoy Das authored Dec 04, 2017

Summary:
I don't think rL309080 is the right fix for PR33494 -- caching ExitLimit only
hides the problem[0].  The real issue is that because of how we forget SCEV
expressions in ScalarEvolution::getBackedgeTakenInfo, in the test case for PR33494
computing the backedge for any loop invalidates the trip count for every other
loop.  This effectively makes the SCEV cache useless.

I've instead made the SCEV expression invalidation in
ScalarEvolution::getBackedgeTakenInfo less aggressive to fix this issue.

[0]: One way to think about this is that rL309080 essentially augmented the
backedge-taken-count cache with another equivalent exit-limit cache.  The bug
went away because we were explicitly not clearing the exit-limit cache in
getBackedgeTakenInfo.  But instead of doing all of that, we can just avoid
clearing the backedge-taken-count cache.

Reviewers: mkazantsev, mzolotukhin

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D39361

llvm-svn: 319678

7e363379

[BypassSlowDivision] Improve our handling of divisions by constants · aa92cae1

Sanjoy Das authored Dec 04, 2017

(This reapplies r314253.  r314253 was reverted in r314482 because of a
correctness regression on P100, but that regression turned out to have a
different cause.)

Summary:
Don't bail out on constant divisors for divisions that can be narrowed without
introducing control flow.  This gives us a 32-bit multiply instead of an
emulated 64-bit multiply in the generated PTX assembly.

Reviewers: jlebar

Subscribers: jholewinski, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38265

llvm-svn: 319677

aa92cae1

MachineVerifier: undef phi arg doesn't need to be live-out from predecessor · 7eae251b
Matthias Braun authored Dec 04, 2017
```
Differential Revision: https://reviews.llvm.org/D40756

llvm-svn: 319674
```
7eae251b

[CodeGen] Unify MBB reference format in both MIR and debug output · 25528d6d

Francis Visoiu Mistrih authored Dec 04, 2017

As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix
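
The third sed command above is a plain textual rewrite of the old `BB#<N>` references into the new `%bb.<N>` form. A minimal Python sketch of the same substitution follows; the helper name `rewrite_mbb_refs` is hypothetical and only the regular expression is taken from the commit message.

```python
import re

# Same pattern as the sed expression 's/BB#([0-9]+)/%bb.\1/g'.
_OLD_MBB_REF = re.compile(r"BB#([0-9]+)")


def rewrite_mbb_refs(text: str) -> str:
    """Rewrite every old-style 'BB#<N>' reference as '%bb.<N>'."""
    return _OLD_MBB_REF.sub(r"%bb.\1", text)


print(rewrite_mbb_refs("Successors according to CFG: BB#1 BB#12"))
# prints: Successors according to CFG: %bb.1 %bb.12
```

As the last bullet notes, a textual rewrite like this does not catch every case, so the remaining `BB#` occurrences still have to be found with grep and fixed by hand.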

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665

25528d6d

Fix function pointer tail calls in armv8-M.base · 2b438584

Pablo Barrio authored Dec 04, 2017

Summary:
The compiler fails with the following error message:

fatal error: error in backend: ran out of registers during
register allocation

Tail call optimization for Armv8-M.base fails to meet all the required
constraints when handling calls to function pointers where the
arguments take up r0-r3. This is because the pointer to the
function to be called can only be stored in r0-r3, but these are
all occupied by arguments. This patch makes sure that tail call
optimization does not try to handle this type of call.

Reviewers: chill, MatzeB, olista01, rengolin, efriedma

Reviewed By: olista01, efriedma

Subscribers: efriedma, aemerson, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D40706

llvm-svn: 319664

2b438584

Revert "[cmake] Enable zlib support on windows" · f2fdc183

Pavel Labath authored Dec 04, 2017

This reverts commit r319533 as it broke llvm-config --system-libs output
and everything that depends on it (which is mostly out of tree or
downstream folks, but includes a couple of llvm buildbots as well).

I think I have a fix for this in D40779, but I want someone to
review it first. In the meantime, I am reverting this change, as it
seems to break a lot of people.

llvm-svn: 319663

f2fdc183

[AMDGPU] SDWA: add support for PRESERVE into SDWA peephole. · 5f7f32c3

Sam Kolton authored Dec 04, 2017

Summary:

Reviewers: arsenm, vpykhtin, rampitec

Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D37817

llvm-svn: 319662

5f7f32c3

[Loop Predication] Teach LP about reverse loops · 7b360434

Anna Thomas authored Dec 04, 2017

Summary:
Currently, we only support predication for forward loops with a step
of 1.  This patch enables loop predication for reverse (countdown)
loops that satisfy the following conditions:
   1. The step of the IV is -1.
   2. The loop has a single latch of the form B(X) = X <pred> latchLimit,
      with pred being s> or u>.
   3. The IV of the guard is the decremented IV of the latch condition
      (the guard is G(X) = X-1 u< guardLimit).

This patch lived downstream for a while and is the last in the series of
patches from our downstream LP implementation.

Reviewers: apilipenko, mkazantsev, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40353

llvm-svn: 319659

7b360434

[NVPTX] Assign valid global names · 5db24d7c

Jonas Hahnfeld authored Dec 04, 2017

PTX requires that identifiers consist only of [a-zA-Z0-9_$]. The
existing pass already ensured this for globals and this patch adds
the cleanup for functions with local linkage.

However, there was a different problem in the case of collisions
of the adjusted name: The ValueSymbolTable then automatically
appended ".N" with increasing Ns to get a unique name while helping
the ABI demangling. Special case this behavior to omit the dots and
append N directly. This will always give us legal names according
to the PTX requirements.
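
The naming scheme described above can be sketched in a few lines of Python. This is an illustration only, not the pass itself: the function names are hypothetical, and the `_$_` replacement string and the numbering order are assumptions made for the example.

```python
import re

# Characters outside [a-zA-Z0-9_$] are not allowed in PTX identifiers.
_INVALID_PTX_CHAR = re.compile(r"[^a-zA-Z0-9_$]")


def sanitize(name: str) -> str:
    """Replace each invalid character (assumed replacement: '_$_')."""
    return _INVALID_PTX_CHAR.sub("_$_", name)


def assign_valid_names(names):
    """Map each original name to a unique, PTX-legal name.

    On a collision the counter N is appended directly, with no '.',
    so the result stays within [a-zA-Z0-9_$].
    """
    used = set()
    result = {}
    for name in names:
        base = sanitize(name)
        candidate = base
        n = 0
        while candidate in used:
            n += 1
            candidate = f"{base}{n}"
        used.add(candidate)
        result[name] = candidate
    return result


print(assign_valid_names(["foo.bar", "foo_$_bar"]))
# prints: {'foo.bar': 'foo_$_bar', 'foo_$_bar': 'foo_$_bar1'}
```

Appending ".1" instead of "1" in the collision case would reintroduce a dot and make the name illegal again, which is the problem the second paragraph describes.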

Differential Revision: https://reviews.llvm.org/D40573

llvm-svn: 319657

5db24d7c

Revert r319649 - [Asm, ARM] Add fallback diag for multiple invalid operands · 7ab60605

Oliver Stannard authored Dec 04, 2017

This is causing a failure in the llvm-clang-x86_64-expensive-checks-win
buildbot, and I can't reproduce it locally, so reverting until I can work out
what is wrong.

llvm-svn: 319654

7ab60605

Revert "[ValueTracking] Pass only a single lambda to... · d0d43e6f

Sam McCall authored Dec 04, 2017

Revert "[ValueTracking] Pass only a single lambda to computeKnownBitsFromShiftOperator by using KnownBits struct instead of separate APInts. NFCI"

This reverts commit r319624, which seems to cause a miscompile (breaks the
multistage PPC buildbots)

llvm-svn: 319652

d0d43e6f

AMDGPU: fix missing s_waitcnt · 6c6d5e24

Tim Corringham authored Dec 04, 2017

Summary:
The pass that inserts s_waitcnt instructions where needed propagated
info used to track dependencies for each block by iterating over the
predecessor blocks. The iteration was terminated when a predecessor
that had not yet been processed was encountered. Any info in blocks
later in the list was therefore not processed, leading to the
possibility of a required s_waitcnt not being inserted.

The fix is simply to change the "break" to "continue" for the
relevant loops, so that all visited blocks are processed. This
is likely what was intended when the code was written.
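
The difference between the two loop forms can be seen in a small Python sketch. It is not the pass's code: the function, the block names, and the data layout are hypothetical, and only the "break" versus "continue" behavior is taken from the description above.

```python
def merge_pred_info(preds, processed, pending, stop_at_unprocessed):
    """Union the pending-counter info of already-processed predecessors.

    stop_at_unprocessed=True models the old 'break'; False models the
    fixed 'continue'.
    """
    merged = set()
    for pred in preds:
        if pred not in processed:
            if stop_at_unprocessed:
                break      # old behavior: later predecessors are ignored
            continue       # fixed behavior: skip only this predecessor
        merged |= pending[pred]
    return merged


preds = ["A", "B", "C"]
processed = {"A", "C"}                       # B has not been visited yet
pending = {"A": {"vmcnt"}, "B": set(), "C": {"lgkmcnt"}}

print(sorted(merge_pred_info(preds, processed, pending, True)))
# prints: ['vmcnt']
print(sorted(merge_pred_info(preds, processed, pending, False)))
# prints: ['lgkmcnt', 'vmcnt']
```

With "break", the info from block C is lost because the unprocessed block B precedes it in the list, which is how a required s_waitcnt could be missed.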

There is no test case provided for this fix because:
1) the only example that reproduces this is large and resistant to
being reduced
2) the change is trivial

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D40544

llvm-svn: 319651

6c6d5e24

[Asm, ARM] Add fallback diag for multiple invalid operands · 7cd4db94

Oliver Stannard authored Dec 04, 2017

This adds an "invalid operands for instruction" diagnostic for
instructions where there is an instruction encoding with the correct
mnemonic and which is available for this target, but where multiple
operands do not match those which were provided. This makes it clear
that there is some combination of operands that is valid for the current
target, which the default diagnostic of "invalid instruction" does not.

Since this is a very general error, we only emit it if we don't have a
more specific error.

Differential revision: https://reviews.llvm.org/D36747

llvm-svn: 319649

7cd4db94

[TwoAddressInstructionPass] Bugfix in handling of sunk instructions. · e86327f2

Jonas Paulsson authored Dec 04, 2017

An instruction returned by TII->convertToThreeAddress() may contain a %noreg
(undef) operand, which is not expected by tryInstructionTransform(). So if
this MI is sunk to a lower point in MBB, it must be skipped when later
encountered.

A new set SunkInstrs is used for this purpose.

Note: there is no test supplied here, as this was triggered on SystemZ while
working on a review of instruction flags. A test case for this bugfix will be
included in the upcoming SystemZ commit.

Review: Quentin Colombet
https://reviews.llvm.org/D40711

llvm-svn: 319646

e86327f2

[DAGCombine] Remove isAndLoadExtLoad arguments · 1e26d986

Sam Parker authored Dec 04, 2017

Both LoadedVT and NarrowLoad are passed as references and neither
is used by any of its callers.

Differential Revision: https://reviews.llvm.org/D40713

llvm-svn: 319645

1e26d986

[AArch64] Allow using emulated tls on platforms other than ELF · eca862de

Martin Storsjö authored Dec 04, 2017

This matches how it is done on X86.

This allows using emulated tls on windows; in MinGW environments,
native tls isn't supported at the moment.

Set the right Data*bitsDirective for windows to match the existing
tests for other platforms. Make parts of the existing tests a regex,
to allow matching .section .rdata for windows, to avoid having to
duplicate the rest of the tests for windows.

Differential Revision: https://reviews.llvm.org/D40770

llvm-svn: 319644

eca862de

[ARM] Allow using emulated tls on platforms other than ELF · c85cc418

Martin Storsjö authored Dec 04, 2017

This matches how it is done on X86.

This allows using emulated tls on windows; in MinGW environments,
native tls isn't supported at the moment.

Differential Revision: https://reviews.llvm.org/D40769

llvm-svn: 319643

c85cc418

[X86] Allow VPMAXUQ/VPMAXSQ/VPMINUQ/VPMINSQ to be used with 128/256 bit... · 4520d4f8

Craig Topper authored Dec 04, 2017

[X86] Allow VPMAXUQ/VPMAXSQ/VPMINUQ/VPMINSQ to be used with 128/256 bit vectors when AVX512 is enabled.

These instructions can be used by widening to 512 bits and extracting back to 128/256. We already do something similar for several other instructions.

llvm-svn: 319641

4520d4f8

[X86] Don't turn UINT_TO_FP into SINT_TO_FP during lowering. · 1151facf

Craig Topper authored Dec 04, 2017

We already do this as a DAG combine. The version during lowering can only trigger if known bits changes something that improves known bits analysis. But this means we should be improving known bits analysis to work on the unlowered form instead.

llvm-svn: 319640

1151facf

[SelectionDAG] Teach computeKnownBits some improvements to ISD::SRL with a... · 67217d7e

Craig Topper authored Dec 04, 2017

[SelectionDAG] Teach computeKnownBits some improvements to ISD::SRL with a non-splat constant shift amount.

If we have a non-splat constant shift amount, the minimum shift amount can be used to infer the number of zero upper bits of the result. There's probably a lot more that we can do here, but this fixes a case where I wanted to infer the sign bit as zero when all the shift amounts are non-zero.
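
The reasoning is that a logical right shift by s clears the top s bits, so with per-element shift amounts every element has at least min(s) leading zero bits. A minimal Python sketch of that inference follows; the function name is hypothetical and this is not the computeKnownBits code.

```python
def min_known_leading_zeros(shift_amounts, bit_width):
    """Leading zero bits guaranteed for every lane of a logical right shift.

    shift_amounts holds one constant shift amount per vector element.
    If any amount is out of range, nothing is claimed.
    """
    if not shift_amounts:
        return 0
    if any(s < 0 or s >= bit_width for s in shift_amounts):
        return 0
    return min(shift_amounts)


# Non-splat shift amounts <1, 3, 2> on 8-bit elements.
print(min_known_leading_zeros([1, 3, 2], 8))
# prints: 1
```

A result of at least 1 means the sign bit is known to be zero in every element, which is the case mentioned in the commit message.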

llvm-svn: 319639

67217d7e

Dec 03, 2017
- [X86][AVX512] Tag PH2PS/PS2PH conversion instructions scheduler classes · 569e53b0
  Simon Pilgrim authored Dec 03, 2017
```
llvm-svn: 319637
```
  569e53b0
- [X86][AVX512] Tag packed F2I/I2F/F2F conversion instructions scheduler class · 465a88bb
  Simon Pilgrim authored Dec 03, 2017
```
llvm-svn: 319636
```
  465a88bb
- [X86][SSE] Remove unused IIC_SSE_CVT_PI2PS_RR/IIC_SSE_CVT_PI2PS_RM itineraries · bc8d0223
  Simon Pilgrim authored Dec 03, 2017
```
llvm-svn: 319634
```
  bc8d0223
- CodeGen: Fix SelectionDAGISel::LowerArguments for sret addr space · 30e4608c
  Yaxun Liu authored Dec 03, 2017
```
SelectionDAGISel::LowerArguments assumes sret addr space is 0, which is
not true for amdgcn---amdgiz target.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D40255

llvm-svn: 319630
```
  30e4608c
- [SelectionDAG] Use the inlined APInt shift methods since we've already bounds checked the shift. · f3470e1e
  Craig Topper authored Dec 03, 2017
```
The version that takes APInt is out of line. The 'unsigned' version optimizes for the common case of single word APInts.

llvm-svn: 319628
```
  f3470e1e
- Reland "[WebAssembly] Add visibility flag to Wasm symbol flags" · a2b35dac
  Sam Clegg authored Dec 03, 2017
```
Original change was rL319488.

This was reverted in rL319602 due to a gcc 7.1 warning.

Differential Revision: https://reviews.llvm.org/D40772

llvm-svn: 319626
```
  a2b35dac
- [ValueTracking] Pass only a single lambda to computeKnownBitsFromShiftOperator... · 199acd88
  Craig Topper authored Dec 02, 2017
```
[ValueTracking] Pass only a single lambda to computeKnownBitsFromShiftOperator by using KnownBits struct instead of separate APInts. NFCI

llvm-svn: 319624
```
  199acd88
Dec 02, 2017

CodeGen: Fix pointer info in SplitVecOp_EXTRACT_VECTOR_ELT/SplitVecRes_INSERT_VECTOR_ELT · 49477040

Yaxun Liu authored Dec 02, 2017

Two issues were found when doing codegen for splitting vectors with a non-zero alloca addr space:

1. DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT use dummy pointer info for creating
SDStore. Since one pointer operand contains a multiply and an add, InferPointerInfo is unable to
infer the correct pointer info, which ends up with a dummy pointer info for the target to lower
the store and results in an isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to
represent a MachinePointerInfo which is known to be in the alloca address space but has no other information.

2. TargetLowering::getVectorElementPointer uses the value type of a pointer in addr space 0 for
the multiplication of the index and then adds it to the pointer. However, the pointer may be in an addr
space which has a different size than addr space 0. The fix is to use the pointer value type for
the index multiplication.

Differential Revision: https://reviews.llvm.org/D39758

llvm-svn: 319622

49477040

[X86][SSE] Cleanup float/int conversion scheduler itinerary classes · 299a54c5

Simon Pilgrim authored Dec 02, 2017

Makes it easier to grok where each is supposed to be used, mainly useful for adding to the AVX512 instructions but hopefully can be used more in SSE/AVX as well.

llvm-svn: 319614

299a54c5

[X86] Teach the assembler to support %db8-%db15 as aliases for %dr8-%dr15. · 7d9a3b82
Craig Topper authored Dec 02, 2017
```
llvm-svn: 319612
```
7d9a3b82

[X86] Support %dr8-%dr15 in the assembler. · 3e846ecb

Craig Topper authored Dec 02, 2017

Apparently I failed to make this work when I fixed it in the disassembler way back in r224862.

llvm-svn: 319611

3e846ecb

[ARC] Add instruction subset for the ARC backend. · f665f6a2

Tatyana Krasnukha authored Dec 02, 2017

Reviewers: petecoup, kparzysz

Reviewed By: petecoup

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37983

llvm-svn: 319609

f665f6a2

[DAG][AArch64] Disable post-legalization store · 839ff79a

Nirav Dave authored Dec 02, 2017

Disable post-legalization store for AArch64 backend which is causing
errors out-of-tree.

llvm-svn: 319607

839ff79a

[WebAssembly] Revert r319488 "Add visibility flag to Wasm symbol flags" · e74a864c

Heejin Ahn authored Dec 02, 2017

This patch reportedly broke one of the LLVM bots (ubuntu-gcc7.1-werror).

See http://lab.llvm.org:8011/builders/ubuntu-gcc7.1-werror/builds/3369 for
details.

llvm-svn: 319602

e74a864c

Dec 01, 2017

Revert "[X86] Improvement in CodeGen instruction selection for LEAs." · 9e658c97
Matt Morehouse authored Dec 01, 2017
```
This reverts r319543, due to ASan bot breakage.

llvm-svn: 319591
```
9e658c97

[MachineOutliner] NFC: Throw out self-intersections on candidates early · 52df8015

Jessica Paquette authored Dec 01, 2017

Currently, the outliner considers candidates that intersect with themselves in
the candidate pruning step. That is, candidates of the form "AA" in ranges like
"AAAAAA". In that range, it looks like there are 5 instances of "AA" that could
possibly be outlined, and that's considered in the benefit calculation.

However, only at most 3 instances of "AA" could ever be outlined in "AAAAAA".
Thus, it's possible to pass through "AA" to the candidate selection step even
though it's *never* the case that "AA" could be outlined. This makes it so that
when we find candidates, we consider only non-overlapping occurrences of that
candidate.
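
The counts in this example can be checked with a short Python sketch. The two helpers are hypothetical and only illustrate the difference between counting every occurrence and counting non-overlapping ones; they are not the outliner's implementation.

```python
def count_overlapping(sequence: str, candidate: str) -> int:
    """Count every position where candidate occurs, overlaps included."""
    return sum(
        1
        for i in range(len(sequence) - len(candidate) + 1)
        if sequence.startswith(candidate, i)
    )


def count_non_overlapping(sequence: str, candidate: str) -> int:
    """Count occurrences greedily from the left, skipping past each match."""
    count = 0
    i = 0
    while i + len(candidate) <= len(sequence):
        if sequence.startswith(candidate, i):
            count += 1
            i += len(candidate)   # the next match may not reuse these symbols
        else:
            i += 1
    return count


print(count_overlapping("AAAAAA", "AA"))      # prints: 5
print(count_non_overlapping("AAAAAA", "AA"))  # prints: 3
```

Using the overlapping count of 5 overstates the benefit, since at most 3 of those occurrences can actually be replaced.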

llvm-svn: 319588

52df8015

[DAG][ARM] Revert "Reenable post-legalize store merge" · 3e76e1e8
Nirav Dave authored Dec 01, 2017
```
due to failures in AArch and ARM code gen.

llvm-svn: 319587
```
3e76e1e8