- Feb 08, 2018
-
-
Craig Topper authored
[DAGCombiner] Fix a couple of mistakes from r324311 by really passing the original load to ExtendSetCCUses. We were passing the binary op that uses the load instead of the load itself. Noticed by inspection. Not sure how to test this because this just prevents the introduction of an extend that will later be truncated and will probably be combined out. llvm-svn: 324568
-
Craig Topper authored
[DAGCombiner] Don't create truncate nodes in (aext (zextload x)) -> (zextload x) and similar folds. NFCI. The truncate was being used to replace other users of the load, but we checked that the load only has one use, so there are no other uses to replace. llvm-svn: 324567
-
Peter Collingbourne authored
llvm-svn: 324565
-
Francis Visoiu Mistrih authored
Instead of:
  %bb.1: derived from LLVM BB %for.body
print:
  bb.1.for.body:
Also use MIR syntax for MBB attributes like "align", "landing-pad", etc. llvm-svn: 324563
-
Craig Topper authored
[DAGCombiner] Avoid creating truncate nodes in the (zext (and (load)))->(and (zextload)) fold until we know for sure we're going to need it. NFCI. The truncate is only needed if the load has additional users. It used to get passed to extendSetCCUses, so it was created early, but that's no longer the case. llvm-svn: 324562
-
Craig Topper authored
We were calling a load LN0, but it came from N0.getOperand(0), so it's really more like LN00 if we follow the naming used in other places. llvm-svn: 324561
-
Yonghong Song authored
LowerSELECT_CC is not generating the optimal Select_Ri pattern at the moment: it is not guaranteed to place the ConstantNode at the RHS, which misses matching Select_Ri. A new test case is added to the existing select_ri.ll; an existing case in cmp.ll would also be improved to use Select_Ri after this patch, and it is adjusted accordingly.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
llvm-svn: 324560
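A hypothetical C++ snippet (not from the patch) showing the kind of source pattern involved: a select guarded by a comparison against a constant. If instruction selection does not place the constant on the RHS of the comparison, the immediate form Select_Ri fails to match and a register-register compare is used instead.

```cpp
// Illustrative only; 'pick' and its parameters are made up.
unsigned pick(unsigned a, unsigned x, unsigned y) {
  return (a > 10u) ? x : y;  // the constant 10 should end up as the RHS immediate
}
```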
-
Peter Collingbourne authored
The LTO opt level should not affect the codegen opt level, and indeed it does not affect it in lld. Ideally the codegen opt level should be controlled by an IR-level attribute based on the compile-time opt level, but that hasn't been implemented yet. Differential Revision: https://reviews.llvm.org/D43040 llvm-svn: 324557
-
Matt Arsenault authored
Defs of operands outside of the instruction's explicit defs need to be checked. llvm-svn: 324554
-
Rafael Espindola authored
The issue is that clang was first creating an extern_weak hidden GV and then changing the linkage to external. Once we know it is not extern_weak, we know it must be dso_local. This patch refactors the code that sets the implicit dso_local into a private helper function that is used every time we change the linkage or visibility. I will commit a patch to clang in a minute. llvm-svn: 324551
-
Matt Arsenault authored
llvm-svn: 324550
-
Justin Lebar authored
llvm-svn: 324549
-
Stanislav Mekhanoshin authored
The code reusing existing wait counts is incorrect since it keeps adding new operands to an old instruction instead of replacing the immediate. It was also effectively switched off by the condition that the wait count is not an AMDGPU::S_WAITCNT. Also switched to BuildMI instead of creating instructions directly. Differential Revision: https://reviews.llvm.org/D42997 llvm-svn: 324547
-
Chandler Carruth authored
hit from IR but creates a minefield for MI passes. The x86 backend has fairly powerful logic to try and fold loads that feed register operands to instructions into a memory operand on the instruction. This is almost always a good thing, but there are specific relocated loads that are only allowed to appear in specific instructions. Notably, R_X86_64_GOTTPOFF is only allowed in `movq` and `addq`. This patch blocks folding of memory operands using this relocation unless the target is in fact `addq`. The particular relocation indicates why we simply don't hit this under normal circumstances. This relocation is only used for TLS, and it gets used in very specific ways in conjunction with %fs-relative addressing. The result is that loads using this relocation are essentially never eligible for folding into an instruction's memory operands. Unless, of course, you have an MI pass that inserts usage of such a load. I have exactly such an MI pass and was greeted by truly mysterious miscompiles where the linker replaced my instruction with a completely garbage byte sequence. Go team. This is the only such relocation I'm aware of in x86, but there may be others that need to be similarly restricted. Fixes PR36165. Differential Revision: https://reviews.llvm.org/D42732 llvm-svn: 324546
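A hedged illustration (not part of the patch) of where R_X86_64_GOTTPOFF typically appears: initial-exec TLS accesses, where the variable's offset is loaded from the GOT with a `movq` (or folded into an `addq`) and then dereferenced relative to %fs. The variable name below is made up.

```cpp
// For an initial-exec TLS access like this, compilers commonly emit roughly:
//   movq  tls_counter@GOTTPOFF(%rip), %rax   // load the TLS offset from the GOT
//   movl  %fs:(%rax), %eax                   // read the variable via %fs
// which is why such relocated loads are essentially never folding candidates.
extern thread_local int tls_counter;  // hypothetical variable

int readCounter() { return tls_counter; }
```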
-
Mircea Trofin authored
Summary: Loops with inequality comparisons, such as:
  // unsigned bound
  for (unsigned i = 1; i < bound; ++i) {...}
have getSmallConstantMaxTripCount report a large maximum static trip count - in this case, 0xfffffffe. However, profiling info may show that the trip count is much smaller, and thus counter-recommend vectorization. This change:
- flips loop-vectorize-with-block-frequency on by default.
- validates that profiled loop frequency data supports vectorization when static info appears to not counter-recommend it.
Absence of profile data means we rely on static data, just as we've done so far. Reviewers: twoh, mkuper, davidxl, tejohnson, Ayal Reviewed By: davidxl Subscribers: bkramer, llvm-commits Differential Revision: https://reviews.llvm.org/D42946 llvm-svn: 324543
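A minimal C++ sketch of the loop shape described above; the names are illustrative. The strict inequality against an unsigned bound gives a static maximum trip count near UINT_MAX even when the loop runs few iterations in practice.

```cpp
// getSmallConstantMaxTripCount can only bound this loop statically at ~0xfffffffe,
// but profile data may show it executes far fewer iterations.
void scale(float *a, unsigned bound) {
  for (unsigned i = 1; i < bound; ++i)
    a[i] *= 2.0f;
}
```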
-
- Feb 07, 2018
-
-
Craig Topper authored
[X86] Prune some unreachable 'return SDValue()' paths from LowerSIGN_EXTEND/LowerZERO_EXTEND/LowerANY_EXTEND. We were doing a lot of whitelisting of what we handle in these routines, but setOperationAction constrains what we can get here. So just add some asserts and prune the unreachable paths. llvm-svn: 324538
-
Craig Topper authored
[X86] Remove dead code from EmitTest that looked for an i1 type which should have already been type legalized away. NFC llvm-svn: 324536
-
Craig Topper authored
[X86] When doing callee save/restore for k-registers make sure we don't use KMOVQ on non-BWI targets. If we are saving/restoring k-registers, the default behavior of getMinimalRegisterClass will find the VK64 class with a spill size of 64 bits. This will cause the KMOVQ opcode to be used for save/restore. If we don't have BWI instructions we need to constrain the class returned to give us VK16 with a 16-bit spill size. We can do this by passing either v16i1 or v64i1 into getMinimalRegisterClass. Also add asserts to make sure BWI is enabled anytime we use KMOVD/KMOVQ. These are what caught this bug. Fixes PR36256 Differential Revision: https://reviews.llvm.org/D42989 llvm-svn: 324533
-
Craig Topper authored
llvm-svn: 324530
-
Momchil Velikov authored
Revert commit r324489, it broke LLDB tests. llvm-svn: 324511
-
Alexey Bataev authored
llvm-svn: 324510
-
Zachary Turner authored
This patch enables PDB generation for the Release build, which uses slightly different optimization options than RelWithDebInfo on Windows. This helps identify the slow parts of a Release build when profiling. Patch by Takuto Ikuta Differential Revision: https://reviews.llvm.org/D42632 llvm-svn: 324504
-
Craig Topper authored
llvm-svn: 324497
-
Rafael Espindola authored
This reverts commit r324487. It broke clang tests. llvm-svn: 324494
-
Jonas Devlieghere authored
Revert "[dsymutil][test] Check the updated dSYM instead of companion file." Revert "[dsymutil] Upstream update feature." llvm-svn: 324493
-
Nirav Dave authored
Traversing all chain paths to the first non-TokenFactor node can be exponential work. Add a simple redundancy check to avoid this. Fixes PR36264. llvm-svn: 324491
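A generic C++ sketch, not the DAGCombiner code, of the problem and the fix described here: walking every path through a DAG of chain nodes revisits shared nodes once per path, which is exponential; remembering already-visited nodes keeps the walk linear.

```cpp
#include <unordered_set>
#include <vector>

struct Node { std::vector<Node *> preds; };  // toy stand-in for chain nodes

// The visited set is the "redundancy check": without it, every path through a
// shared predecessor is re-walked, exploding combinatorially on wide DAGs.
static void walk(Node *n, std::unordered_set<Node *> &visited) {
  if (!visited.insert(n).second)
    return;  // already handled this node and everything reachable from it
  for (Node *p : n->preds)
    walk(p, visited);
}
```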
-
Momchil Velikov authored
This patch is the LLVM part of fixing the issues described in https://bugs.llvm.org/show_bug.cgi?id=36168
* The representation of enumerator values in the debug info metadata now contains a boolean flag isUnsigned, which determines how the bits of the value are interpreted.
* The DW_TAG_enumeration_type DIE now always (for DWARF version >= 3) includes a DW_AT_type attribute, which refers to the underlying integer type, as suggested in DWARFv4 (5.7 Enumeration Type Entries).
* The debug info metadata for an enumeration type contains (in flags) an indication of whether this is a C++11 "fixed enum".
* For a C++11 enumeration with a fixed underlying type, the DIE also includes the DW_AT_enum_class attribute (for DWARF version >= 4).
* Encoding of enumerator constants uses DW_FORM_sdata for signed values and DW_FORM_udata for unsigned values, as suggested by DWARFv4 (7.5.4 Attribute Encodings).
The changes should be backwards compatible:
* the isUnsigned attribute is optional and defaults to false.
* if the underlying type for the enumeration is not available, the enumerator values are considered signed.
* the FixedEnum flag defaults to clear.
* the bitcode format for DIEnumerator stores the unsigned flag in bit #1 of the first record element, so the format does not change and the zero previously stored there is consistent with the false default for IsUnsigned.
Differential Revision: https://reviews.llvm.org/D42734 llvm-svn: 324489
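A small, hypothetical C++ example of the distinction drawn above: how enumerator bits are interpreted now depends on the (possibly fixed) underlying type, and fixed enums are flagged in the debug info.

```cpp
// Illustrative only; the enum names are made up.
enum Plain { kTop = -1 };                         // no fixed underlying type:
                                                  // enumerator treated as signed
enum Fixed : unsigned int { kBig = 0x80000000 };  // C++11 fixed underlying type:
                                                  // DW_AT_type refers to
                                                  // 'unsigned int' and the value
                                                  // uses DW_FORM_udata
```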
-
Marek Olsak authored
Note: This is a candidate for LLVM 6.0, because it was planned to be in that release but was delayed due to a long review period. Merge conflict in release_60 - resolution: Add "-p6:32:32" into the second (non-amdgiz) string. Only scalar loads support 32-bit pointers. An address in a VGPR will fail to compile. That's OK because the results of loads will only be used in places where VGPRs are forbidden. Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC. The tests cover all uses cases we need for Mesa. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D41651 llvm-svn: 324487
-
Marek Olsak authored
Summary: I checked the AMD closed source compiler and the workaround is only needed when x3 is emulated as x4, which we don't do in LLVM. SMEM x3 opcodes don't exist, and instead there is a possibility to use x4 with the last component being unused. If the last component is out of buffer bounds and falls on the next 4K page, the hw hangs. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D42756 llvm-svn: 324486
-
Simon Pilgrim authored
SSE and shorter vector sizes will have to wait until we can add support for general SMIN/SMAX matching. llvm-svn: 324485
-
Jonas Devlieghere authored
This patch has llvm-dwarfdump check the whole dSYM, rather than the hard-coded path to the Mach-O companion file. This might be what's causing the Windows bot to fail. llvm-svn: 324483
-
Clement Courbet authored
llvm-svn: 324482
-
Jonas Devlieghere authored
Now that dsymutil can generate accelerator tables, we can upstream the update logic that, as the name implies, updates the accelerator tables in an existing dSYM bundle. In combination with `-minimize` this can be used to remove redundant .debug_(inlines|pubtypes|pubnames). Differential revision: https://reviews.llvm.org/D42880 llvm-svn: 324480
-
Simon Pilgrim authored
llvm-svn: 324479
-
Benjamin Kramer authored
llvm-svn: 324478
-
Simon Atanasyan authored
llvm-svn: 324477
-
Simon Atanasyan authored
Both operand codes now work the same way for register and memory operands. They print the high-order or low-order word of a double-word register or memory location. llvm-svn: 324476
-
Pavel Labath authored
The implementation of the function was deleted in r324426. This also removes the declaration. llvm-svn: 324474
-
Max Kazantsev authored
The failures happened because of an assert which was overconfident about SCEV's proving capabilities and is generally not valid. Differential Revision: https://reviews.llvm.org/D42835 llvm-svn: 324473
-
Clement Courbet authored
With fixes from rL324341. Original commit message: [MergeICmps] Enable the MergeICmps Pass by default. Summary: Now that PR33325 is fixed, this should always improve the generated code. Reviewers: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42793 llvm-svn: 324465
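A hedged C++ example of the kind of code the MergeICmps pass targets: a chain of equality comparisons over adjacent fields, which can be merged into a single memcmp-style block comparison.

```cpp
// Illustrative struct; any contiguous, padding-free layout behaves the same way.
struct Point { int x, y, z; };

bool equal(const Point &a, const Point &b) {
  return a.x == b.x && a.y == b.y && a.z == b.z;  // mergeable compare chain
}
```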
-