Commits · 22a56f2f5a1fd0c248260b597f85586973294523 · Roger Ferrer / llvm-epi

Jan 24, 2017

[AMDGPU] Add VGPR copies post regalloc fix pass · 22a56f2f

Stanislav Mekhanoshin authored Jan 24, 2017

Regalloc creates COPY instructions which do not formally use VALU.
That results in v_mov instructions being displaced after exec mask modification.
One pass which does this is SIOptimizeExecMasking, but potentially it can be
done by other passes too.

This patch adds a pass immediately after regalloc to add implicit exec
use operand to all VGPR copy instructions.

Differential Revision: https://reviews.llvm.org/D28874

llvm-svn: 292956

22a56f2f

[AArch64] Rename 'no-quad-ldst-pairs' to 'slow-paired-128' · 7784cacd

Evandro Menezes authored Jan 24, 2017

In order to follow the pattern of the existing 'slow-misaligned-128store'
option, rename the option 'no-quad-ldst-pairs' to 'slow-paired-128'.

llvm-svn: 292954

7784cacd

[Lanai] Rename LanaiInstPrinter library to LanaiAsmPrinter · bef847c3

Chris Bieneman authored Jan 24, 2017

Summary:
    This is in keeping with LLVM convention. The classes are InstPrinters, but the library is ${target}AsmPrinter.

This patch is in response to bryant pointing out to me that Lanai was the only backend deviating from convention here. Thanks!

Reviewers: jpienaar, bryant

Subscribers: mgorny, jgosnell, llvm-commits

Differential Revision: https://reviews.llvm.org/D29043

llvm-svn: 292953

bef847c3

[InstSimplify] try to eliminate icmp Pred (add nsw X, C1), C2 · 56227253

Sanjay Patel authored Jan 24, 2017

I was surprised to see that we're missing icmp folds based on 'add nsw' in InstCombine, 
but we should handle the InstSimplify cases first because that could make the InstCombine
code simpler.

Here are Alive-based proofs for the logic:

Name: add_neg_constant
Pre: C1 < 0 && (C2 > ((1<<(width(C1)-1)) + C1))
%a = add nsw i7 %x, C1
%b = icmp sgt %a, C2
  =>
%b = false

Name: add_pos_constant
Pre: C1 > 0 && (C2 < ((1<<(width(C1)-1)) + C1 - 1))
%a = add nsw i6 %x, C1
%b = icmp slt %a, C2
  =>
%b = false

Name: nuw
Pre: C1 u>= C2
%a = add nuw i11 %x, C1
%b = icmp ult %a, C2
  =>
%b = false
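
The three Alive preconditions above can also be sanity-checked by brute force at small bit widths. The following is only a sketch and is not part of the patch: `wrapSigned` and `countCounterexamples` are hypothetical helper names, and an exhaustive loop covers just the chosen width rather than proving the general case.

```cpp
#include <cassert>

// Reinterpret the low `w` bits of `v` as a signed w-bit integer,
// mimicking the wrapping iN arithmetic used in the Alive preconditions.
static int wrapSigned(int v, int w) {
  unsigned bits = static_cast<unsigned>(v) & ((1u << w) - 1u);
  int r = static_cast<int>(bits);
  return (bits >> (w - 1)) ? r - (1 << w) : r;
}

// Count inputs where a precondition holds, the add does not overflow,
// and the icmp would still be true. Zero means no counterexample at width w.
int countCounterexamples(int w) {
  const int smin = -(1 << (w - 1)), smax = (1 << (w - 1)) - 1;
  const int umax = (1 << w) - 1;
  int bad = 0;
  for (int c1 = smin; c1 <= smax; ++c1)
    for (int c2 = smin; c2 <= smax; ++c2) {
      bool preNeg = c1 < 0 && c2 > wrapSigned((1 << (w - 1)) + c1, w);
      bool prePos = c1 > 0 && c2 < wrapSigned((1 << (w - 1)) + c1 - 1, w);
      for (int x = smin; x <= smax; ++x) {
        int sum = x + c1;
        if (sum < smin || sum > smax) continue; // nsw violated: result is poison
        if (preNeg && sum > c2) ++bad;          // add_neg_constant: sgt must be false
        if (prePos && sum < c2) ++bad;          // add_pos_constant: slt must be false
      }
    }
  for (int c1 = 0; c1 <= umax; ++c1)
    for (int c2 = 0; c2 <= c1; ++c2)            // nuw precondition: C1 u>= C2
      for (int x = 0; x + c1 <= umax; ++x)      // stop before unsigned wrap (poison)
        if (x + c1 < c2) ++bad;                 // ult must be false
  return bad;
}
```

Calling `countCounterexamples(7)` walks every i7 triple (x, C1, C2); the cost grows as 8^w, so this approach is only practical for narrow types.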

Differential Revision: https://reviews.llvm.org/D29053

llvm-svn: 292952

56227253

[CodeView] Fix off-by-one error in def range gap emission · 11cf053b

Reid Kleckner authored Jan 24, 2017

Also fixes a much worse bug where we emitted the wrong gap size for the
def range uncovered by the test for this issue.

Fixes PR31726.

llvm-svn: 292949

11cf053b

[SelectionDAG] Handle inverted conditions when splitting into multiple branches. · 92a286ae

Geoff Berry authored Jan 24, 2017

Summary:
When conditional branches with complex conditions are split into
multiple branches in SelectionDAGBuilder::FindMergedConditions, also
handle inverted conditions.  These may sometimes appear without having
been optimized by InstCombine when CodeGenPrepare decides to sink and
duplicate cmp instructions, causing them to have only one use.  This
problem can be made worse by e.g. GVNHoist hiding more cmps from
InstCombine by combining equivalent cmps from different blocks.

For example codegen X & !(Y | Z) as:
    jmp_if_X TmpBB
    jmp FBB
  TmpBB:
    jmp_if_notY Tmp2BB
    jmp FBB
  Tmp2BB:
    jmp_if_notZ TBB
    jmp FBB
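
As an illustration only (this is not code from the patch), the branch chain above can be written as ordinary C++ and compared against the merged condition. The function names are hypothetical; the comments map each test back to the block labels used in the example.

```cpp
#include <cassert>

// Straight-line form of the condition: X & !(Y | Z).
bool mergedCondition(bool x, bool y, bool z) { return x && !(y || z); }

// The same decision as a chain of single-condition branches, mirroring the
// TmpBB / Tmp2BB / TBB / FBB layout from the example above.
bool splitBranches(bool x, bool y, bool z) {
  if (!x) return false; // jmp_if_X TmpBB; otherwise jmp FBB
  if (y)  return false; // TmpBB: jmp_if_notY Tmp2BB; otherwise jmp FBB
  if (z)  return false; // Tmp2BB: jmp_if_notZ TBB; otherwise jmp FBB
  return true;          // TBB
}
```

Both functions agree on all eight input combinations, which is the property the split has to preserve.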

Reviewers: bogner, MatzeB, qcolombet

Subscribers: llvm-commits, hiraditya, mcrosier, sebpop

Differential Revision: https://reviews.llvm.org/D28380

llvm-svn: 292944

92a286ae

[X86][AVX512] Remove unused argument from PMOVX tablegen patterns. NFCI. · 893d2119
Simon Pilgrim authored Jan 24, 2017
```
Seems to be a copy+paste legacy from the AVX2 patterns.

llvm-svn: 292941
```
893d2119
Fix formatting in foldSelectCttzCtlz. NFC · 5da456e6
Amaury Sechet authored Jan 24, 2017
```
llvm-svn: 292934
```
5da456e6

[PH] Replace uses of AssertingVH from members of analysis results with · 6acdca78

Chandler Carruth authored Jan 24, 2017

a lazy-asserting PoisoningVH.

AssertingVH is fundamentally incompatible with cache-invalidation of
analysis results. The invalidation happens after the AssertingVH has
already fired. Instead, use a PoisoningVH that will assert if the
dangling handle is ever used rather than merely be assigned or
destroyed.
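
The difference between asserting eagerly and asserting lazily can be shown with a toy handle. This is a sketch under stated assumptions, not LLVM's PoisoningVH: `Node`, `LazyPoisonHandle`, and `notifyDeleted` are hypothetical names, and the sketch throws an exception where the real class would assert, purely so the behaviour can be observed.

```cpp
#include <cassert>
#include <stdexcept>

struct Node { int id; };

// Toy handle: deleting the pointee only marks the handle as poisoned.
// Nothing fails until someone actually dereferences the stale handle.
class LazyPoisonHandle {
  Node *Ptr = nullptr;
  bool Poisoned = false;

public:
  explicit LazyPoisonHandle(Node *N) : Ptr(N) {}

  // Hypothetical hook, called when the pointee goes away.
  void notifyDeleted() { Poisoned = true; }
  bool isPoisoned() const { return Poisoned; }

  // Re-assigning a poisoned handle is allowed and clears the poison.
  void reset(Node *N) { Ptr = N; Poisoned = false; }

  // Using a poisoned handle is the only operation that fails.
  Node *get() const {
    if (Poisoned) throw std::logic_error("use of poisoned handle");
    return Ptr;
  }
};
```

An eager handle would have to fail inside `notifyDeleted`, before any cache invalidation has had a chance to drop or reassign the handle; the lazy variant tolerates that window.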

This patch also removes all of the (numerous) doomed attempts to work
around this fundamental incompatibility. It is a pretty significant
simplification IMO.

The most interesting change is in the Inliner where we still do some
clearing because we don't want to rely on the coarse grained
invalidation strategy of the containing pass manager. However, I prefer
the approach that contains this logic to the cleanup phase of the
Inliner, and I think we could enhance the CGSCC analysis management
layer to make this even better in the future if desired.

The rest is straight cleanup.

I've also added a test for one of the harder cases to work around: when
a *module analysis* contains many AssertingVHes pointing at functions.

Differential Revision: https://reviews.llvm.org/D29006

llvm-svn: 292928

6acdca78

[X86][SSE] Add explicit braces to avoid -Wdangling-else warning. · 526299c8

Martin Bohme authored Jan 24, 2017

Reviewers: RKSimon

Subscribers: llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D29076

llvm-svn: 292924

526299c8

Fix unused variable warning · 0c453389
Simon Pilgrim authored Jan 24, 2017
```
llvm-svn: 292921
```
0c453389
[X86][SSE] Add support for constant folding vector arithmetic shift by immediates · e1ec9072
Simon Pilgrim authored Jan 24, 2017
```
llvm-svn: 292919
```
e1ec9072
[X86][SSE] Add support for constant folding vector logical shift by immediates · 6340e548
Simon Pilgrim authored Jan 24, 2017
```
llvm-svn: 292915
```
6340e548

[InstCombine][X86] MULDQ/MULUDQ undef -> zero · 78f8630a

Simon Pilgrim authored Jan 24, 2017

Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now

Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst

llvm-svn: 292913

78f8630a

[Support] Use O_CLOEXEC only when declared · f726dfa6

Pavel Labath authored Jan 24, 2017

Summary:
Use the O_CLOEXEC flag only when it is available. Some old systems (e.g.
SLES10) do not support this flag. POSIX explicitly guarantees that this
flag can be checked for using #if, so there is no need for a CMake
check.

In case O_CLOEXEC is not supported, fall back to fcntl(FD_CLOEXEC)
instead.
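
A minimal POSIX sketch of the pattern described above, assuming a hypothetical wrapper named `openCloseOnExec` (the actual LLVM change may be structured differently):

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Open `path` so that the descriptor is closed across exec().
int openCloseOnExec(const char *path, int flags) {
  assert(path != nullptr);
#ifdef O_CLOEXEC
  // Preferred: the flag is applied atomically by open().
  return ::open(path, flags | O_CLOEXEC);
#else
  // Fallback for systems without O_CLOEXEC: set FD_CLOEXEC afterwards.
  int fd = ::open(path, flags);
  if (fd < 0)
    return fd;
  int fdFlags = ::fcntl(fd, F_GETFD);
  if (fdFlags == -1 || ::fcntl(fd, F_SETFD, fdFlags | FD_CLOEXEC) == -1) {
    ::close(fd);
    return -1;
  }
  return fd;
#endif
}
```

The fallback path is not atomic: another thread that forks and execs between `open` and `fcntl` can still inherit the descriptor, which is why the flag is preferred wherever it exists.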

Reviewers: rnk, rafael, mgorny

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28894

llvm-svn: 292912

f726dfa6

[Support] Add sys::fs::set_current_path() (aka chdir) · 2f096097

Pavel Labath authored Jan 24, 2017

Summary:
This adds a cross-platform way of setting the current working directory
analogous to the existing current_path() function used for retrieving
it. The function will be used in lldb.
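
For orientation, a POSIX-only sketch of what such a setter can look like. `setCurrentPathSketch` is a hypothetical name and takes a plain C string; the real function is cross-platform, so its Windows side and its exact parameter type are not shown here.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <unistd.h>

// Change the working directory and report failure as a std::error_code
// instead of through errno.
std::error_code setCurrentPathSketch(const char *path) {
  assert(path != nullptr);
  if (::chdir(path) == -1)
    return std::error_code(errno, std::generic_category());
  return std::error_code(); // success
}
```

Returning `std::error_code` keeps the setter symmetric with a getter that reports errors the same way.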

Reviewers: rafael, silvas, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29035

llvm-svn: 292907

2f096097

[SLP] Refactoring of HorizontalReduction class, NFC. · 9f8bb384

Alexey Bataev authored Jan 24, 2017

Removed data members ReduxWidth and MinVecRegSize + some C++11 stylish
improvements.

Differential Revision: https://reviews.llvm.org/D29010

llvm-svn: 292899

9f8bb384

Update domtree incrementally in loop peeling. · 098ee2fe

Serge Pavlov authored Jan 24, 2017

With this change dominator tree remains in sync after each step of loop
peeling.

Differential Revision: https://reviews.llvm.org/D29029

llvm-svn: 292895

098ee2fe

[X86] Remove unnecessary peekThroughBitcasts call that's already taken care of... · fc8798fa

Craig Topper authored Jan 24, 2017

[X86] Remove unnecessary peekThroughBitcasts call that's already taken care of by the ISD::isBuildVectorAllOnes check below.

llvm-svn: 292894

fc8798fa

AMDGPU : Add trap handler support. · ee21a36f
Wei Ding authored Jan 24, 2017
```
llvm-svn: 292893
```
ee21a36f

[AVX-512] Simplify multiclasses for integer logic operations. There were... · b0cbd5b5

Craig Topper authored Jan 24, 2017

[AVX-512] Simplify multiclasses for integer logic operations. There were several inputs that didn't vary.

While there, give them the same scheduling itinerary as the SSE/AVX versions.

llvm-svn: 292892

b0cbd5b5

Make VerifyDomInfo and VerifyLoopInfo global variables · 69b3ff9d

Serge Pavlov authored Jan 24, 2017

Verifications of dominator tree and loop info are expensive operations
so they are disabled by default. They can be enabled by command line
options -verify-dom-info and -verify-loop-info. These options however
enable checks only in files Dominators.cpp and LoopInfo.cpp. If some
transformation changes dominator tree and/or loop info, it would be
convenient to place similar checks to the files implementing the
transformation.

This change makes corresponding flags global, so they can be used in
any file to optionally turn verification on.

llvm-svn: 292889

69b3ff9d

[SystemZ] Gracefully fail in GeneralShuffle::add() instead of assertion. · 463e2a6f

Jonas Paulsson authored Jan 24, 2017

The GeneralShuffle::add() method used to have an assert that made sure that
source elements were at least as big as the destination elements. This was
wrong, since it is actually expected that an EXTRACT_VECTOR_ELT node with a
smaller source element type than the return type gets extended.

Therefore, instead of asserting this, it is just checked and if this is the
case 'false' is returned from the GeneralShuffle::add() method. This case
should be very rare and is not handled further by the backend.

Review: Ulrich Weigand.
llvm-svn: 292888

463e2a6f

[X86] Don't split v8i32 all ones values if only AVX1 is available. Keep it... · 993edc9d

Craig Topper authored Jan 24, 2017

[X86] Don't split v8i32 all ones values if only AVX1 is available. Keep it intact and split it at isel.

This allows us to remove the check in ANDN combining that had to look through the extraction.

llvm-svn: 292881

993edc9d

[X86] Remove Undef handling from extractSubVector. This is now handled inside getNode. · eb440a14
Craig Topper authored Jan 24, 2017
```
llvm-svn: 292877
```
eb440a14

[SelectionDAG] Teach getNode to simplify a couple easy cases of EXTRACT_SUBVECTOR · ff272ad4

Craig Topper authored Jan 24, 2017

Summary:
This teaches getNode to simplify extracting from Undef. This is similar to what is done for EXTRACT_VECTOR_ELT. It also adds support for extracting from CONCAT_VECTORS when we can reuse one of the inputs to the concat. These seem like simple non-target specific optimizations.
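
The concat case can be pictured with a toy index computation. This is a sketch with hypothetical names that models only element counts, not SelectionDAG nodes, and it assumes every concat operand has the same number of elements.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of extracting `extElts` elements starting at element `idx` from
// a concatenation of `numOps` operands with `opElts` elements each.
// Returns the index of the operand that can be reused directly, or -1 when
// the extract does not line up with exactly one operand.
int reusableConcatOperand(std::size_t numOps, std::size_t opElts,
                          std::size_t extElts, std::size_t idx) {
  assert(opElts > 0);
  if (extElts != opElts)   // result type must match the operand type
    return -1;
  if (idx % opElts != 0)   // extract must start on an operand boundary
    return -1;
  std::size_t op = idx / opElts;
  return op < numOps ? static_cast<int>(op) : -1;
}
```

For example, extracting four elements at index 4 from a concat of two four-element operands simply yields the second operand.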

For X86 we currently handle undef in extractSubvector, but not all EXTRACT_SUBVECTOR creations go through there.

Ultimately, my motivation here is to simplify extractSubvector and remove custom lowering for EXTRACT_SUBVECTOR since we don't do anything but handle undef and BUILD_VECTOR optimizations, but those should be DAG combines.

Reviewers: RKSimon, delena

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29000

llvm-svn: 292876

ff272ad4

[APInt] Remove calls to clearUnusedBits from XorSlowCase and operator^= · 9028f055

Craig Topper authored Jan 24, 2017

Summary:
There's a comment in XorSlowCase that says "0^0==1" which isn't true. 0 xored with 0 is still 0. So I don't think we need to clear any unused bits here.

Now there is no difference between XorSlowCase and AndSlowCase/OrSlowCase other than the operation being performed.
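
The reasoning can be checked on a single 64-bit word. The sketch below uses hypothetical helper names and models one word, not APInt itself: if both inputs have their unused high bits clear, XOR, AND, and OR all leave them clear, whereas bitwise NOT does not.

```cpp
#include <cassert>
#include <cstdint>

// Mask of the bits above `width` in a 64-bit word (the "unused" bits).
uint64_t unusedBitsMask(unsigned width) {
  assert(width >= 1 && width <= 64);
  return width == 64 ? 0 : ~uint64_t(0) << width;
}

// Exhaustively check a small width: combining two values whose unused bits
// are clear never sets an unused bit, for XOR as well as AND and OR.
bool bitwiseOpsKeepUnusedBitsClear(unsigned width) {
  assert(width >= 1 && width <= 12); // keep the exhaustive loop small
  const uint64_t unused = unusedBitsMask(width);
  const uint64_t limit = uint64_t(1) << width;
  for (uint64_t a = 0; a < limit; ++a)
    for (uint64_t b = 0; b < limit; ++b)
      if (((a ^ b) | (a & b) | (a | b)) & unused)
        return false;
  return true;
}

// Contrast: complementing a clean value does set the unused bits.
bool notSetsUnusedBits(unsigned width) {
  return (~uint64_t(0) & unusedBitsMask(width)) != 0;
}
```

This matches the observation in the commit message that 0 xor 0 is 0, so XOR needs no extra clearing step that AND and OR do not need.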

Reviewers: majnemer, MatzeB, chandlerc, bkramer

Reviewed By: MatzeB

Subscribers: chfast, llvm-commits

Differential Revision: https://reviews.llvm.org/D28986

llvm-svn: 292873

9028f055

LiveIntervalAnalysis: Calculate liveness even if a superreg is reserved. · b901d334

Matthias Braun authored Jan 24, 2017

A register unit may be allocatable and non-reserved but some of the
register(tuples) built with it are reserved. We still need to calculate
liveness in this case.

Note to out-of-tree targets: If you start seeing machine verifier errors
with this commit, it probably means that you do not properly mark super
registers of reserved registers as reserved. See r292836 or r292870 for
examples of how to fix that.

rdar://29996737

Differential Revision: https://reviews.llvm.org/D28881

llvm-svn: 292871

b901d334

PowerPC: Mark super regs of reserved regs reserved. · 1d77599b

Matthias Braun authored Jan 24, 2017

When a register like R1 is reserved, X1 should be reserved as well. This
was already done "manually" when 64bit code was enabled, however using
the markSuperRegs() function on the base register is more convenient and
allows using the checkAllSuperRegsMarked() function even in 32bit mode
to avoid accidental breakage in the future.
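
To make the idea concrete, here is a toy model of the two helpers. It borrows their names for readability, but the register set, the `SuperOf` table, and the signatures are hypothetical and much simpler than the real target register info.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical toy register file: R1 is the 32-bit half of X1, R2 of X2.
enum Reg { R1, X1, R2, X2, NumRegs };

// Super register of each register, or NumRegs when there is none.
static const Reg SuperOf[NumRegs] = {X1, NumRegs, X2, NumRegs};

using RegSet = std::bitset<NumRegs>;

// Reserve `r` and every register that contains it.
void markSuperRegs(RegSet &reserved, Reg r) {
  for (Reg cur = r; cur != NumRegs; cur = SuperOf[cur])
    reserved.set(cur);
}

// True when every reserved register also has all of its super registers
// reserved; a consistency check to catch accidental omissions.
bool checkAllSuperRegsMarked(const RegSet &reserved) {
  for (int r = 0; r < NumRegs; ++r)
    if (reserved.test(r))
      for (Reg s = SuperOf[r]; s != NumRegs; s = SuperOf[s])
        if (!reserved.test(s))
          return false;
  return true;
}
```

Reserving only `R1` by hand leaves `X1` allocatable and fails the check, while `markSuperRegs(reserved, R1)` reserves both.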

This is also necessary to allow https://reviews.llvm.org/D28881

Differential Revision: https://reviews.llvm.org/D29056

llvm-svn: 292870

1d77599b

[LTO] Teach lib/LTO about the new pass manager. · 0dd200e0
Davide Italiano authored Jan 24, 2017
```
Differential Revision:  https://reviews.llvm.org/D28997

llvm-svn: 292864
```
0dd200e0
[PM] Flesh out the new pass manager LTO pipeline. · 089a9123
Davide Italiano authored Jan 24, 2017
```
Differential Revision:  https://reviews.llvm.org/D28996

llvm-svn: 292863
```
089a9123

[sanitizer-coverage] emit __sanitizer_cov_trace_pc_guard w/o a preceding 'if'... · 4b2ff07c

Kostya Serebryany authored Jan 24, 2017

[sanitizer-coverage] emit __sanitizer_cov_trace_pc_guard w/o a preceding 'if' by default. Update the docs, also add deprecation notes around other parts of sanitizer coverage

llvm-svn: 292862

4b2ff07c

[APFloat] Add PPCDoubleDouble multiplication · 7f127624

Tim Shen authored Jan 24, 2017

Reviewers: echristo, hfinkel, kbarton, iteratee

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D28382

llvm-svn: 292860

7f127624

[WebAssembly] Update LibFunc::Func -> LibFunc · 4b320b72
Derek Schuff authored Jan 24, 2017
```
Fixes compile failures after r292848

llvm-svn: 292857
```
4b320b72
SimplifyLibCalls: Replace more unary libcalls with intrinsics · 954a624f
Matt Arsenault authored Jan 23, 2017
```
llvm-svn: 292855
```
954a624f

[LoopUnroll] First form LCSSA, then loop-simplify · 461aa57a

Michael Kuperstein authored Jan 23, 2017

Running non-LCSSA-preserving LoopSimplify followed by LCSSA on (roughly) the
same loop is incorrect, since LoopSimplify may break LCSSA arbitrarily higher
in the loop nest. Instead, run LCSSA first, and then run LCSSA-preserving
LoopSimplify on the result.

This fixes PR31718.

Differential Revision: https://reviews.llvm.org/D29055

llvm-svn: 292854

461aa57a

[AMDGPU] Fix obsolete comments, spotted by Malcolm Parsons. (NFC) · a63528cf
Eugene Zelenko authored Jan 23, 2017
```
llvm-svn: 292853
```
a63528cf

Makes promoteIndirectCall an external function. · 14bf0290

Dehao Chen authored Jan 23, 2017

Summary: promoteIndirectCall should be a utility function that could be invoked by other optimization passes.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29051

llvm-svn: 292850

14bf0290

[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC) · d21529fa

David L. Jones authored Jan 23, 2017

Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).

Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.

The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)
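
The macro problem is easy to reproduce in isolation. The sketch below uses a hypothetical macro, `demo_open64`, in place of a real libc alias, and a made-up enum rather than the actual TargetLibraryInfo one.

```cpp
#include <cassert>

// Hypothetical macro standing in for a libc-style alias such as the
// "#define fopen64 fopen" mentioned above.
#define demo_open64 demo_open

// With bare enumerators, `enum Func { demo_open, demo_open64 };` would be
// preprocessed into `enum Func { demo_open, demo_open };` and fail to
// compile. A scoped enum has the same problem, because macros are expanded
// before any scoping is considered.

// Prefixed enumerators are different identifier tokens, so the macro never
// matches them.
enum LibFuncSketch { LibFunc_demo_open, LibFunc_demo_open64, NumLibFuncsSketch };

bool prefixedEnumeratorsAreDistinct() {
  return LibFunc_demo_open != LibFunc_demo_open64;
}
```

Macro replacement works on whole identifier tokens, which is why adding a prefix is enough to sidestep the collision.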

There are additional changes required in clang.

Reviewers: rsmith

Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28476

llvm-svn: 292848

d21529fa

AMDGPU: Custom lower more vector operations · 3aef8093
Matt Arsenault authored Jan 23, 2017
```
This avoids stack usage.

llvm-svn: 292846
```
3aef8093