Mar 27, 2019
-
Evgeniy Stepanov authored
Summary: Follow-up for D56743.
* Add more "--" in llvm-rc invocations.
* Add llvm-rc to the tools list. This uses the full path to llvm-rc in test RUN lines (llvm-lit -v), making them copy-pasteable.
Reviewers: mstorsjo, zturner Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59858 llvm-svn: 357118
-
Alon Zakai authored
Differential Revision: https://reviews.llvm.org/D59855 modified: llvm/lib/Target/WebAssembly/WebAssemblyFixIrreducibleControlFlow.cpp llvm-svn: 357117
-
Nirav Dave authored
This patch appears to trigger very large compile time increases in halide builds. llvm-svn: 357116
-
Nikita Popov authored
Split off from D59749. This adds isWrappedSet() and isUpperSignWrapped(), which behave like isSignWrappedSet() and isUpperWrapped(), respectively, but for the other (unsigned vs. signed) domain. The methods isWrappedSet() and isSignWrappedSet() will not consider ranges of the form [X, Max] == [X, 0) and [X, SignedMax] == [X, SignedMin) to be wrapping, while isUpperWrapped() and isUpperSignWrapped() will. Also replace the checks in getUnsignedMin() and friends with method calls that implement the same logic. llvm-svn: 357112
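As a minimal sketch (not part of the commit), the distinction can be illustrated with the post-patch ConstantRange API; the bit width and values below are arbitrary:

```cpp
// Sketch only: assumes the post-patch ConstantRange predicates are available.
#include "llvm/ADT/APInt.h"
#include "llvm/IR/ConstantRange.h"
#include <cassert>

using namespace llvm;

void wrappingExample() {
  // [250, 0) over i8 is the set {250, ..., 255}, i.e. [X, Max].
  ConstantRange R(APInt(8, 250), APInt(8, 0));
  assert(!R.isWrappedSet());  // not wrapping as a set of values
  assert(R.isUpperWrapped()); // but the upper bound does wrap past 0

  // [250, 10) over i8 is {250, ..., 255, 0, ..., 9} and genuinely wraps.
  ConstantRange W(APInt(8, 250), APInt(8, 10));
  assert(W.isWrappedSet() && W.isUpperWrapped());
}
```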
-
Teresa Johnson authored
Summary: A recent fix (r355751) caused a compile time regression because setting the ModifiedDT flag in optimizeSelectInst means that each time a select instruction is optimized the function walk in runOnFunction stops and restarts again (which was needed to build a new DT before we started building it lazily in r356937). Now that the DT is built lazily, a simple fix is to just reset the DT at this point, rather than restarting the whole function walk. In the future other places that set ModifiedDT may want to switch to just resetting the DT directly. But that will require an evaluation to ensure that they don't otherwise need to restart the function walk. Reviewers: spatel Subscribers: jdoerfert, llvm-commits, xur Tags: #llvm Differential Revision: https://reviews.llvm.org/D59889 llvm-svn: 357111
-
Jessica Paquette authored
WarnMissedTransforms.cpp produces remarks that use !Failure tags. These weren't supported in optrecord.py, so if you encountered one in any of the tools, the tool would crash. Add them as a type of missed optimization. Differential Revision: https://reviews.llvm.org/D59895 llvm-svn: 357110
-
Eli Friedman authored
ARMBaseInstrInfo::getNumLDMAddresses is making bad assumptions about the memory operands of load and store-multiple operations. This doesn't really fix the problem properly, but it's enough to prevent crashing, at least. Fixes https://bugs.llvm.org/show_bug.cgi?id=41231 . Differential Revision: https://reviews.llvm.org/D59834 llvm-svn: 357109
-
Amara Emerson authored
llvm-svn: 357108
-
Nikita Popov authored
Split out from D59749. The current implementation of isWrappedSet() doesn't do what it says on the tin, and treats ranges like [X, Max] as wrapping, because they are represented as [X, 0) when using half-inclusive ranges. This also makes it inconsistent with the semantics of isSignWrappedSet(). This patch renames isWrappedSet() to isUpperWrapped(), in preparation for the introduction of a new isWrappedSet() method with corrected behavior. llvm-svn: 357107
-
Jessica Paquette authored
Right now, if you try to use optdiff.py on any opt records, it will fail because its calls to gather_results weren't updated to support filtering. Since filters are supposed to be optional, this makes them None by default in get_remarks and in gather_results. This allows other tools that don't support filtering to still use the functions as is. Differential Revision: https://reviews.llvm.org/D59894 llvm-svn: 357106
-
Matt Arsenault authored
If there were only dbg_values in the block, recede would hit the beginning of the block and try to use the dbg_value as a real instruction. llvm-svn: 357105
-
Nikita Popov authored
Start using the uadd.sat and usub.sat intrinsics for the existing canonicalizations. These intrinsics should optimize better than expanded IR, have better handling in the X86 backend and should be no worse than expanded IR in other backends, as far as we know. rL357012 already introduced use of uadd.sat for the add+umin pattern. Differential Revision: https://reviews.llvm.org/D58872 llvm-svn: 357103
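For illustration only (these examples are not from the patch): the kind of source-level saturating-arithmetic idioms whose expanded compare/select IR these canonicalizations target. Whether a given pipeline actually forms the intrinsics depends on the exact IR the frontend emits.

```cpp
// Hedged sketch of the idioms, not the InstCombine code itself.
#include <climits>

unsigned saturating_add(unsigned a, unsigned b) {
  unsigned sum = a + b;
  return sum < a ? UINT_MAX : sum; // overflow clamps to the maximum -> uadd.sat
}

unsigned saturating_sub(unsigned a, unsigned b) {
  return a > b ? a - b : 0; // underflow clamps to zero -> usub.sat
}
```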
-
Amara Emerson authored
The artifact combiners push instructions which have been marked for deletion onto a list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 llvm-svn: 357101
-
Clement Courbet authored
llvm-svn: 357099
-
Matt Arsenault authored
This reapplies r356149, using the correct overload of findUnusedReg which passes the current iterator. This worked most of the time, because the scavenger iterator was moved at the end of the frame index loop in PEI. This would fail if the spill was the first instruction. This was further hidden by the fact that the scavenger wasn't passed in for normal frame index elimination. llvm-svn: 357098
-
Matt Arsenault authored
llvm-svn: 357097
-
Craig Topper authored
Haswell CPUs have special support for SHLD/SHRD with the same register for both sources. Such an instruction will go to the rotate/shift unit on port 0 or 6. This gives it 1 cycle latency and 0.5 cycle reciprocal throughput. When the register is not the same, it becomes a 3 cycle operation on port 1. Sandybridge and Ivybridge always have 1 cycle latency and 0.5 cycle reciprocal throughput for any SHLD. When the FastSHLDRotate feature flag is set, we try to use SHLD for rotate by immediate unless BMI2 is enabled. But MachineCopyPropagation can look through a copy and change one of the sources to be different. This will break the hardware optimization. This patch adds a pseudo instruction to hide the second source input until after register allocation and MachineCopyPropagation. I'm not sure if this is the best way to do this or if there's some other way we can make this work. Fixes PR41055 Differential Revision: https://reviews.llvm.org/D59391 llvm-svn: 357096
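As an illustrative example (not taken from the patch), this is the kind of rotate-by-immediate that, with FastSHLDRotate set and BMI2 disabled, may end up as an SHLD whose two sources are the same register:

```cpp
// Sketch only: a 32-bit rotate-left by an immediate amount.
unsigned rotl5(unsigned x) {
  return (x << 5) | (x >> 27); // recognized as a rotate; may lower to SHLD reg, reg, 5
}
```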
-
Quentin Colombet authored
This patch removes an overly conservative check that would prevent simplifying copies when the value we were tracking would go through several subregister indices. Indeed, the intent of this check was to not track values whenever we have to compose subregisters, but what the check actually did was bail any time we saw a second subreg, even if that second subreg would actually be the new source of truth (as opposed to a part of that subreg). Differential Revision: https://reviews.llvm.org/D59891 llvm-svn: 357095
-
Sander de Smalen authored
This patch fixes an assembler bug that allowed SVE vector registers to contain a type suffix when not expected. The SVE unpredicated movprfx instruction is the only instruction affected. The following are examples of what was previously valid:
  movprfx z0.b, z0.b
  movprfx z0.b, z0.s
  movprfx z0, z0.s
These instructions are now erroneous. Patch by Cullen Rhodes (c-rhodes) Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D59636 llvm-svn: 357094
-
Matt Arsenault authored
Another test is needed for the case where the scavenge fails, but there's another issue with that which needs an additional fix. llvm-svn: 357093
-
Matt Arsenault authored
Also includes one example of how this transform is unsound. This isn't verifying that the copies are used in the control flow intrinsic patterns. Also add an option to disable the exec mask opt pass. Since this pass is unsound, it may be useful to turn it off until it is fixed. llvm-svn: 357091
-
Matt Arsenault authored
Based on how these are inserted, I doubt this was causing a problem in practice. llvm-svn: 357090
-
Matt Arsenault authored
Introduce new helper class to copy properties directly from the base instruction. llvm-svn: 357089
-
Mikhail R. Gadelha authored
Summary: Added methods to check for under-/overflow in additions, subtractions, signed divisions/modulus, negations, and multiplications. Reviewers: ddcc, gou4shi1 Reviewed By: ddcc, gou4shi1 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59796 llvm-svn: 357088
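The solver-level API itself is not shown here; the following is only a rough sketch of the kind of checks such methods encode, with made-up helper names:

```cpp
// Hedged sketch: plain C++ versions of two of the checks, not the commit's API.
#include <cstdint>
#include <limits>

bool addOverflows(int32_t a, int32_t b) {
  // Widen to 64 bits so the exact sum can be compared against the 32-bit range.
  int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  return wide > std::numeric_limits<int32_t>::max() ||
         wide < std::numeric_limits<int32_t>::min();
}

bool negOverflows(int32_t a) {
  // Negating the minimum value is not representable in two's complement.
  return a == std::numeric_limits<int32_t>::min();
}
```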
-
Matt Arsenault authored
Currently this is called before the frame size is set on the function. For AMDGPU, the scavenger is used for large frames where part of the offset needs to be materialized in a register, so an estimate of the frame size helps in deciding whether the scavenger will be needed. llvm-svn: 357087
-
Andrea Di Biagio authored
The warning was introduced at r357074. llvm-svn: 357085
-
Matt Arsenault authored
This shouldn't change anything since the no-ret atomics are selected later. llvm-svn: 357084
-
Matt Arsenault authored
The AMDGPU implementation of getReservedRegs depends on MachineFunctionInfo fields that are parsed from the YAML section. This was reserving the wrong register since it was setting the reserved regs before parsing the correct one. Some tests were relying on the default reserved set for the assumed default calling convention. llvm-svn: 357083
-
Matt Arsenault authored
This is not a control flow instruction, so should not be marked as isBarrier. This fixes a verifier error if followed by unreachable. llvm-svn: 357081
-
Yonghong Song authored
The .BTF.ext FuncInfoTable and LineInfoTable contain information organized per ELF section. The current definitions of FuncInfoTable/LineInfoTable are: std::unordered_map<uint32_t, std::vector<BTFFuncInfo>> FuncInfoTable and std::unordered_map<uint32_t, std::vector<BTFLineInfo>> LineInfoTable, where the key is the section name offset in the string table. The unordered_map may cause the section output order to differ between platforms. The same applies to the unordered_map definition std::unordered_map<std::string, std::unique_ptr<BTFKindDataSec>> DataSecEntries, where BTF_KIND_DATASEC entries may have different ordering on different platforms. This patch fixes the issue by using std::map. Test static-var-derived-type.ll is modified to generate two DataSec's, which ensures the ordering is the same for all supported platforms. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 357077
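A minimal sketch of the change described above, using stand-in struct definitions rather than the real BTF emitter types:

```cpp
// Sketch only: ordered maps give a deterministic, platform-independent
// iteration order, so the per-section tables are emitted in the same order
// everywhere. These are stand-in types with fields omitted.
#include <cstdint>
#include <map>
#include <vector>

struct BTFFuncInfo { /* fields omitted in this sketch */ };
struct BTFLineInfo { /* fields omitted in this sketch */ };

// Before: std::unordered_map keyed the tables, so iteration order could vary.
// After: std::map keyed by the section name offset in the string table.
std::map<uint32_t, std::vector<BTFFuncInfo>> FuncInfoTable;
std::map<uint32_t, std::vector<BTFLineInfo>> LineInfoTable;
```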
-
Clement Courbet authored
Add negative tests. Add arithmetic/inc/cmp/and macrofusion tests. llvm-svn: 357076
-
Andrea Di Biagio authored
There is no reason why stages should be visited in reverse order. This patch allows the definition of stages that push instructions forward from their cycleEnd() routine. llvm-svn: 357074
-
Matt Arsenault authored
The offset operand index is different for atomics. llvm-svn: 357073
-
Nico Weber authored
llvm-svn: 357071
-
Nirav Dave authored
Rework BaseIndexOffset and isAlias to fully work with lifetime nodes and fold in lifetime alias analysis. This is mostly NFC. Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59794 llvm-svn: 357070
-
Nirav Dave authored
llvm-svn: 357069
-
Hans Wennborg authored
Re-commit r355490 "[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default" Original commit by Ayonam Ray. This commit adds a regression test for the issue discovered in the previous commit: that the range check for the jump table can only be omitted if the fall-through destination of the jump table is unreachable, which isn't necessarily true just because the default of the switch is unreachable. This addresses the missing optimization in PR41242. > During the lowering of a switch that would result in the generation of a > jump table, a range check is performed before indexing into the jump > table, for the switch value being outside the jump table range and a > conditional branch is inserted to jump to the default block. In case the > default block is unreachable, this conditional jump can be omitted. This > patch implements omitting this conditional branch for unreachable > defaults. > > Differential Revision: https://reviews.llvm.org/D52002 > Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 357067
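As a hypothetical example (not one of the regression tests), this is the shape of switch the optimization targets: the default is unreachable, and if the jump table's fall-through destination is also unreachable, the range check before indexing the table can be omitted:

```cpp
// Sketch only: the caller guarantees x is in [0, 3], so the default is dead.
int classify(int x) {
  switch (x) {
  case 0: return 10;
  case 1: return 20;
  case 2: return 30;
  case 3: return 40;
  default: __builtin_unreachable();
  }
}
```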
-
Dmitry Preobrazhensky authored
Reason: the change was mistakenly committed before review llvm-svn: 357066
-
Kevin P. Neal authored
but the implementation is hard to extend. It doesn't currently have an easy way to support intrinsics that, for example, lack a rounding mode. This will be needed for impending new constrained intrinsics. This code is split out of D55897 <https://reviews.llvm.org/D55897>, which itself was split out of D43515 <https://reviews.llvm.org/D43515>. Reviewed by: arsenm Differential Revision: http://reviews.llvm.org/D59830 llvm-svn: 357065
-
Sander de Smalen authored
Cleanup isAArch64FrameOffsetLegal by:
- Merging the large switch statement to reuse AArch64InstrInfo::getMemOpInfo().
- Using AArch64InstrInfo::getUnscaledLdSt() to determine whether an instruction has an unscaled variant.
- Simplifying the logic that calculates the offset to fit the immediate.
Reviewers: paquette, evandro, eli.friedman, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59636 llvm-svn: 357064
-