Commits · c3dccd020e1d00975af326d2010b17d31f777a19 · Roger Ferrer / llvm-epi

Jan 12, 2017

[DebugInfo] Remove redundant check in SimplifyCFG; NFC. · b0124c1e
Robert Lougher authored Jan 12, 2017
```
llvm-svn: 291813
```
b0124c1e

[DebugInfo] Handle same locations in DILocation::getMergedLocation · 426851e6

Robert Lougher authored Jan 12, 2017

Revision 289661 introduced the function DILocation::getMergedLocation for
merging of debug locations. At the time it was simply a stub which always
returned no location. This patch modifies getMergedLocation to handle the
case where the two locations are the same or can't be discriminated.
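
As a rough illustration of the rule described above, the following standalone sketch models the merge on a simplified location type. The `DebugLoc` struct, `canDiscriminate`, and `mergeLocations` are hypothetical stand-ins, not the LLVM classes, and the assumption that two locations "can't be discriminated" when file, line, column, and discriminator all match is ours.

```cpp
#include <string>

// Simplified, hypothetical stand-in for a debug location (not llvm::DILocation).
struct DebugLoc {
  std::string file;
  unsigned line;
  unsigned column;
  unsigned discriminator;
  unsigned scopeId; // illustrative only: differs between otherwise equal locations
};

// Assumption: two locations can be told apart only if one of these fields differs.
bool canDiscriminate(const DebugLoc &a, const DebugLoc &b) {
  return a.line != b.line || a.column != b.column ||
         a.discriminator != b.discriminator || a.file != b.file;
}

// Returns the first location when both are the same object or cannot be
// discriminated; otherwise returns nullptr, meaning "no location".
const DebugLoc *mergeLocations(const DebugLoc *a, const DebugLoc *b) {
  if (a == b || !canDiscriminate(*a, *b))
    return a;
  return nullptr;
}
```

In this model, two locations that differ only in `scopeId` still merge to the first one, while locations on different lines merge to no location.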

Differential Revision: https://reviews.llvm.org/D28521

llvm-svn: 291809

426851e6

[SCEV] Simplify SolveLinEquationWithOverflow a bit. · b5c3a0d1
Eli Friedman authored Jan 12, 2017
```
Cleanup in preparation for generalizing it.

llvm-svn: 291808
```
b5c3a0d1

[X86] Replace AND+IMM64 with SRL/SHL · f02ac0ee

Nikolai Bozhenov authored Jan 12, 2017

Emit SHRQ/SHLQ instead of ANDQ with a 64 bit constant mask if the result
is unused and the mask has only higher/lower bits set. For example, with
this patch LLVM emits

  shrq $41, %rdi
  je

instead of

  movabsq $0xFFFFFE0000000000, %rcx
  testq   %rcx, %rdi
  je

This reduces the number of instructions, code size and register pressure.
The transformation is applied only for cases where the mask cannot be
encoded as an immediate value within a TESTQ instruction.
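
The equivalence behind this rewrite can be checked at the source level. The sketch below is not the backend code; the function names are made up for illustration. It only shows that testing against a mask with the upper 23 bits set gives the same zero/non-zero answer as shifting right by 41.

```cpp
#include <cstdint>

// Mask from the example above: only the upper 23 bits are set.
constexpr std::uint64_t kHighMask = 0xFFFFFE0000000000ULL;

// Original form: AND with the 64-bit mask and compare against zero.
constexpr bool highBitsClearWithMask(std::uint64_t x) {
  return (x & kHighMask) == 0;
}

// Rewritten form: shift the low 41 bits out and compare against zero.
constexpr bool highBitsClearWithShift(std::uint64_t x) {
  return (x >> 41) == 0;
}

// Compile-time spot checks at the boundary between the two bit ranges.
static_assert(highBitsClearWithMask((1ULL << 41) - 1) ==
                  highBitsClearWithShift((1ULL << 41) - 1),
              "largest value with clear high bits");
static_assert(highBitsClearWithMask(1ULL << 41) ==
                  highBitsClearWithShift(1ULL << 41),
              "smallest value with a high bit set");
```

The shift form needs no 64-bit immediate, which is why no scratch register is required when only the flags are consumed.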

Differential Revision: https://reviews.llvm.org/D28198

llvm-svn: 291806

f02ac0ee

[X86] Modify BypassSlowDivision tests to match their new names (NFC) · 3db8bcdb

Nikolai Bozhenov authored Jan 12, 2017

- bypass-slow-division-32.ll:
  tests verifying correctness of divl-to-divb bypassing

- bypass-slow-division-64.ll:
  tests verifying correctness of divq-to-divl bypassing

- bypass-slow-division-tune.ll:
  tests verifying that bypassing is enabled only when appropriate

Differential Revision: https://reviews.llvm.org/D28551

llvm-svn: 291804

3db8bcdb

[llvm-config] Fix obviously wrong code in parsing DyLib components. · a5f1ff1a

Marcello Maggioni authored Jan 12, 2017

The code parsing the string used the offset returned from
StringRef::find() incorrectly, assuming it was relative to the starting
offset that is passed to the function, but the returned offset
is always relative to the beginning of the line.

This causes odd behaviour while parsing the component string.
Spotted thanks to the newly added test:

tools/llvm-config/booleans.test
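
The pitfall is easy to reproduce with `std::string_view::find`, which follows the same convention as `StringRef::find`: the result is an index into the whole string, not a distance from the starting offset. The sketch below is not the llvm-config code; `splitComponents` and the `;` separator are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: split a separator-delimited component list.
std::vector<std::string> splitComponents(std::string_view list, char sep = ';') {
  std::vector<std::string> parts;
  std::size_t start = 0;
  while (start <= list.size()) {
    // find() returns an index into `list` itself, not an offset from `start`.
    std::size_t end = list.find(sep, start);
    if (end == std::string_view::npos)
      end = list.size();
    // The piece length is therefore end - start, not end.
    parts.emplace_back(list.substr(start, end - start));
    start = end + 1;
  }
  return parts;
}
```

Treating `end` as relative to `start` (for example, using `substr(start, end)`) would yield pieces that run past the separator for every component after the first.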

llvm-svn: 291803

a5f1ff1a

[X86] Rename tests for bypassing slow division (NFC) · 6684aeb1

Nikolai Bozhenov authored Jan 12, 2017

For tests on bypassing slow division there's no need to be
Atom-specific. The patch renames all tests on division bypassing
and makes their names more consistent:

  atom-bypass-slow-division.ll -> bypass-slow-division-32.ll
  (tests verifying correctness of divl-to-divb bypassing)

  atom-bypass-slow-division-64.ll -> bypass-slow-division-64.ll
  (tests verifying correctness of divq-to-divl bypassing)

  slow-div.ll -> bypass-slow-division-tune.ll
  (tests verifying that bypassing is enabled only when appropriate)

Differential Revision: https://reviews.llvm.org/D28197

llvm-svn: 291802

6684aeb1

[X86] Tune bypassing of slow division for Intel CPUs · 6bdf92ce

Nikolai Bozhenov authored Jan 12, 2017

64-bit integer division in Intel CPUs is extremely slow, much slower
than 32-bit division. On the other hand, 8-bit and 16-bit divisions
aren't any faster. The only important exception is Atom where DIV8
is fastest. Because of that, the patch
1) Enables bypassing of 64-bit division for Atom, Silvermont and
   all big cores.
2) Modifies 64-bit bypassing to use 32-bit division instead of
   16-bit one. This doesn't make the shorter division slower but
   increases chances of taking it. Moreover, it's much more likely
   to prove at compile-time that a value fits 32 bits and doesn't
   require a run-time check (e.g. zext i32 to i64).
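
At the source level, the bypass described in point 2 corresponds roughly to the following sketch. The function name is hypothetical and this is not the pass itself; the actual transformation works on IR and can drop the run-time check when it can prove the operands fit in 32 bits.

```cpp
#include <cstdint>

// Hypothetical illustration of 64-bit to 32-bit division bypassing.
// Precondition: b != 0.
std::uint64_t bypassedUDiv64(std::uint64_t a, std::uint64_t b) {
  // If no bit above bit 31 is set in either operand, both fit in 32 bits.
  if (((a | b) >> 32) == 0) {
    // Fast path: a 32-bit division gives the same quotient.
    return static_cast<std::uint32_t>(a) / static_cast<std::uint32_t>(b);
  }
  // Slow path: full 64-bit division.
  return a / b;
}
```

Checking against 32 bits rather than 16 bits means far more operand pairs take the fast path, which is the "increases chances of taking it" argument above.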

Differential Revision: https://reviews.llvm.org/D28196

llvm-svn: 291800

6bdf92ce

[X86] Update LLC tests for slow division bypassing (NFC) · 05b40959

Nikolai Bozhenov authored Jan 12, 2017

Run update_llc_test_checks.py on

    CodeGen/X86/atom-bypass-slow-division.ll
    CodeGen/X86/atom-bypass-slow-division-64.ll
    CodeGen/X86/slow-div.ll

Differential Revision: https://reviews.llvm.org/D28469

llvm-svn: 291799

05b40959

AMDGPU: Skip fneg/select combine if it can fold into other · 45337df0
Matt Arsenault authored Jan 12, 2017
```
llvm-svn: 291792
```
45337df0
AMDGPU: Fold free fneg into sin · 31c039ef
Matt Arsenault authored Jan 12, 2017
```
llvm-svn: 291790
```
31c039ef

ARM: slightly more table driven libcall setup · 555e5980

Saleem Abdulrasool authored Jan 12, 2017

Switch some additional library call setup to be table driven.  This
makes it more immediately obvious what the library call looks like.
This is important for ARM since the calling conventions for the builtins
change based on the target/libcall name.  NFC
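
A table-driven setup of this kind might look like the sketch below. The enums, structs, and `buildLibcallInfo` are hypothetical simplifications, not the ARM backend's types; only the `__aeabi_*` routine names come from the ARM run-time ABI.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-ins for the backend's enums.
enum class Libcall { AddF64, SubF64, MulF64, DivF64 };
enum class CallConv { C, AAPCS };

// One row per library call: operation, symbol name, calling convention.
struct LibcallEntry {
  Libcall op;
  const char *name;
  CallConv cc;
};

struct LibcallInfo {
  std::map<Libcall, std::string> names;
  std::map<Libcall, CallConv> conventions;
};

LibcallInfo buildLibcallInfo() {
  // The table makes the name and convention of each call visible at a glance.
  static const LibcallEntry table[] = {
      {Libcall::AddF64, "__aeabi_dadd", CallConv::AAPCS},
      {Libcall::SubF64, "__aeabi_dsub", CallConv::AAPCS},
      {Libcall::MulF64, "__aeabi_dmul", CallConv::AAPCS},
      {Libcall::DivF64, "__aeabi_ddiv", CallConv::AAPCS},
  };
  LibcallInfo info;
  for (const LibcallEntry &entry : table) {
    info.names[entry.op] = entry.name;
    info.conventions[entry.op] = entry.cc;
  }
  return info;
}
```

Compared with a sequence of paired setter calls, a single loop over the table keeps each call's name and convention on one line.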

llvm-svn: 291789

555e5980

[DebugInfo] DILocation variable declaration should be const; NFC. · 6717a6fe
Robert Lougher authored Jan 12, 2017
```
llvm-svn: 291787
```
6717a6fe
Avoid std::errc::protocol_* to appease mingw · 84da6615
Hans Wennborg authored Jan 12, 2017
```
Like r291636 and r285261.

llvm-svn: 291786
```
84da6615
[DebugInfo] Add const to DILocation variable declaration; NFC. · f5df7a18
Robert Lougher authored Jan 12, 2017
```
llvm-svn: 291785
```
f5df7a18
AMDGPU: Fold fneg into fmul_legacy · a8c325e2
Matt Arsenault authored Jan 12, 2017
```
llvm-svn: 291784
```
a8c325e2
Bump year to 2017 in LICENSE.txt · 1bcabc49
Hans Wennborg authored Jan 12, 2017
```
llvm-svn: 291782
```
1bcabc49
AMDGPU: Fold fneg into rcp · ff7e5aad
Matt Arsenault authored Jan 12, 2017
```
llvm-svn: 291779
```
ff7e5aad
AMDGPU: Fold fneg into fp_round · 4242d48c
Matt Arsenault authored Jan 12, 2017
```
llvm-svn: 291778
```
4242d48c
AMDGPU: Fold fneg into fp_extend · 98d2bf10
Matt Arsenault authored Jan 12, 2017
```
llvm-svn: 291777
```
98d2bf10
Fix some -Wsign-compare warnings by making some integer literals explicitly unsigned · c7e51b14
David Blaikie authored Jan 12, 2017
```
llvm-svn: 291776
```
c7e51b14
TTI: Add comment clarifying the meaning of MemIntrinsicInfo::PtrVal. · 8a00aeee
Chad Rosier authored Jan 12, 2017
```
Patch by Tom Stellard.
Differential Revision: https://reviews.llvm.org/D27563

llvm-svn: 291772
```
8a00aeee

[globalisel] Move as much RegisterBank initialization to the constructor as possible · b7391dd3

Daniel Sanders authored Jan 12, 2017

Summary:
The register bank is now entirely initialized in the constructor. However,
we still have the hardcoded number of register classes which will be
dealt with in the TableGen patch (D27338) since we do not have access
to this information to resolve this at this stage. The number of register
classes is known to the TRI and to TableGen but the RegisterBank
constructor is too early for the former and too late for the latter.
This will be fixed when the data is tablegen-erated.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27809

llvm-svn: 291770

b7391dd3

[DebugInfo] Added DI macro creation API to DIBuilder. · 96075718
Amjad Aboud authored Jan 12, 2017
```
Differential Revision: https://reviews.llvm.org/D16077

llvm-svn: 291769
```
96075718

[globalisel] Initialize RegisterBanks with static data. · ae03595b

Daniel Sanders authored Jan 12, 2017

Summary:
Refactor the RegisterBank initialization to use static data. This requires
GlobalISel implementations to rewrite calls to createRegisterBank() and
addRegBankCoverage() into a call to setRegBankData().

Out of tree targets can use diff 4 of D27807
(https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump
the register classes and other data that needs to be provided to
setRegBankData(). This is the method that was used to generate the static data
in this patch.

Tablegen-eration of this static data will follow after some refactoring.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27807
Differential Revision: https://reviews.llvm.org/D27808

llvm-svn: 291768

ae03595b

[Devirtualization] MemDep returns non-local !invariant.group dependencies · 9530883e

Piotr Padlewski authored Jan 12, 2017

Summary:
Memory Dependence Analysis was limited to returning only local dependencies
for invariant.group handling. Now it returns NonLocal when it finds a
non-local dependency, and a subsequent getNonLocalPointerDependency query
retrieves the dependency that was found.

Thanks to this we are able to devirtualize loops!

    void indirect(A &a, int n) {
      for (int i = 0 ; i < n; i++)
        a.foo();
    }
    void test(int n) {
      A a;
      indirect(a, n);
    }

After inlining, a.foo() will be changed to a direct call, even if foo and A::A()
are external (but only if the vtable definition is available).

Reviewers: nlewycky, dberlin, chandlerc, rsmith

Subscribers: mehdi_amini, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D28137

llvm-svn: 291762

9530883e

Wdocumentation fix · fef77a43
Simon Pilgrim authored Jan 12, 2017
```
llvm-svn: 291761
```
fef77a43

Fix windows buildbots building llvm-xray · 8ed57057

Simon Pilgrim authored Jan 12, 2017

2 issues:
1 - replaced unix-style pid_t with cross-platform llvm::sys::ProcessInfo::ProcessId 
2 - fixed shadow variable warning in lambda expression

Reviewed by @filcab

llvm-svn: 291760

8ed57057

[XRay] Include <numeric> for std::accumulate. · 0c6392d3
Dean Michael Berris authored Jan 12, 2017
```
Fix-up following D24377.

llvm-svn: 291750
```
0c6392d3

[XRay] Implement the `llvm-xray account` subcommand · 429bac89

Dean Michael Berris authored Jan 12, 2017

Summary:
This is the third of a multi-part change to implement subcommands for
the `llvm-xray` tool.

Here we define the `account` subcommand which does simple function call
accounting, generating basic statistics on function calls we find in an
XRay log/trace. We support text output and csv output for this
subcommand.

This change also supports sorting, summing, and filtering the top N
results.
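
The kind of accounting described here can be sketched as follows. This is not the llvm-xray implementation: the record and row types are hypothetical, and the sketch assumes per-call durations have already been derived from the trace.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical input: one completed call with a precomputed duration.
struct CallRecord {
  int funcId;
  std::uint64_t duration;
};

// Hypothetical output row: basic statistics for one function.
struct FuncStats {
  int funcId;
  std::uint64_t count;
  std::uint64_t sum;
  std::uint64_t min;
  std::uint64_t max;
};

// Aggregates per function, sorts by total time (descending), keeps the top N.
std::vector<FuncStats> account(const std::vector<CallRecord> &records,
                               std::size_t topN) {
  std::map<int, FuncStats> byFunc;
  for (const CallRecord &r : records) {
    auto it = byFunc.find(r.funcId);
    if (it == byFunc.end()) {
      byFunc.emplace(r.funcId,
                     FuncStats{r.funcId, 1, r.duration, r.duration, r.duration});
      continue;
    }
    FuncStats &s = it->second;
    ++s.count;
    s.sum += r.duration;
    s.min = std::min(s.min, r.duration);
    s.max = std::max(s.max, r.duration);
  }
  std::vector<FuncStats> rows;
  for (const auto &kv : byFunc)
    rows.push_back(kv.second);
  std::sort(rows.begin(), rows.end(),
            [](const FuncStats &a, const FuncStats &b) { return a.sum > b.sum; });
  if (rows.size() > topN)
    rows.resize(topN);
  return rows;
}
```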

Part of this tool will later be turned into a library that could be used
for basic function call accounting.

Depends on D24376.

Reviewers: dblaikie, echristo

Subscribers: mehdi_amini, dberris, beanz, llvm-commits

Differential Revision: https://reviews.llvm.org/D24377

llvm-svn: 291749

429bac89

AMDGPU: Fix sub_oneuse being marked commutative · f003198b
Matt Arsenault authored Jan 12, 2017
```
llvm-svn: 291748
```
f003198b
[AVX-512] Improve lowering of zero_extend of v4i1 to v4i32 and v2i1 to v2i64... · 24c3a239
Craig Topper authored Jan 12, 2017
```
[AVX-512] Improve lowering of zero_extend of v4i1 to v4i32 and v2i1 to v2i64 with VLX, but no DQ or BW support.

llvm-svn: 291747
```
24c3a239

[AVX-512] Improve lowering of sign_extend of v4i1 to v4i32 and v2i1 to v2i64... · 69ab67b2

Craig Topper authored Jan 12, 2017

[AVX-512] Improve lowering of sign_extend of v4i1 to v4i32 and v2i1 to v2i64 when avx512vl is available, but not avx512dq.

llvm-svn: 291746

69ab67b2

[X86][AVX512] Fix PR31515 - Do not flip vselect condition if it's not a vXi1 mask · c5ba925e

Elad Cohen authored Jan 12, 2017

r289653 added a case where `vselect <cond> <vector1> <all-zeros>`
is transformed to:
`vselect xor(cond, DAG.getConstant(1, DL, CondVT)) <all-zeros> <vector1>`
This was not intended to catch cases where Cond is not a vXi1
mask, but it does. Moreover, when the Cond type is vXiN (N > 1),
xor(cond, DAG.getConstant(1, DL, CondVT)) != NOT(cond).
This patch changes the above to xor with allones, and avoids
entering the case for non-mask Conds.
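
The difference between the two xor constants shows up on a single lane wider than one bit. The sketch below uses an 8-bit lane and made-up function names; it illustrates the arithmetic only and is not the DAG combine itself.

```cpp
#include <cstdint>

// xor with the constant 1 flips only the lowest bit of the lane.
constexpr std::uint8_t flipWithOne(std::uint8_t lane) {
  return static_cast<std::uint8_t>(lane ^ 0x01);
}

// xor with all-ones flips every bit, which is a true bitwise NOT.
constexpr std::uint8_t flipWithAllOnes(std::uint8_t lane) {
  return static_cast<std::uint8_t>(lane ^ 0xFF);
}

// For an all-ones 8-bit "true" lane, only the all-ones xor yields "false".
static_assert(flipWithAllOnes(0xFF) == 0x00, "all-ones xor inverts the lane");
static_assert(flipWithOne(0xFF) == 0xFE, "xor with 1 leaves seven bits set");
```

For a 1-bit lane the two constants coincide, which is why the original transform was correct only for vXi1 masks.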

llvm-svn: 291745

c5ba925e

[AVX-512] Add more varied avx512 feature command lines to the avx512-cvt.ll... · 56f9610b

Craig Topper authored Jan 12, 2017

[AVX-512] Add more varied avx512 feature command lines to the avx512-cvt.ll test to show some poor codegen examples.

We're definitely doing bad things when avx512vl is enabled without avx512dq. It looks like avx512vl/dq without avx512bw may also have some issues.

llvm-svn: 291744

56f9610b

Make a test actually test what it set out to test. · b4d9a310

Chandler Carruth authored Jan 12, 2017

This test seems to have largely been relying on asserts being tripped.
It had a very specific and somewhat uninteresting grep of the output,
but it never really did anything to cause SCEV to be preserved across
loop simplify, certainly not explicitly. And a later addition to it
actually added CHECK lines despite the test never running FileCheck.

Now we actually print SCEV before and after loop simplify to make sure
it is *changing* and being *updated*. Which seems to be much more likely
the point of the test.

llvm-svn: 291740

b4d9a310

AMDGPU: Fold fneg into fma or fmad · 63f95379
Matt Arsenault authored Jan 12, 2017
```
Patch mostly by Fiona Glaser

llvm-svn: 291733
```
63f95379
AMDGPU: Fold fneg into fmul · 4103a81d
Matt Arsenault authored Jan 12, 2017
```
Patch mostly by Fiona Glaser

llvm-svn: 291732
```
4103a81d
AMDGPU: Fold fneg into fadd · 2529fba9
Matt Arsenault authored Jan 12, 2017
```
Patch mostly by Fiona Glaser

llvm-svn: 291731
```
2529fba9
AMDGPU: Pull fneg/fabs out of a select · 2a04ff97
Matt Arsenault authored Jan 11, 2017
```
Allows better source modifier usage.

llvm-svn: 291729
```
2a04ff97