Commits · 84fd4bee6cbfba32da94b155290f6661d788b77c · Roger Ferrer / llvm-epi

Jul 20, 2016

RegScavenging: Add scavengeRegisterBackwards() · 84fd4bee

Matthias Braun authored Jul 19, 2016

This is a variant of scavengeRegister() that works for
enterBasicBlockEnd()/backward(). The benefit of the backward mode is
that it is not affected by incomplete kill flags.

This patch also changes
PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register
scavenger in backwards mode.

Differential Revision: http://reviews.llvm.org/D21885

llvm-svn: 276044

84fd4bee

RegisterScavenger: Introduce backward() mode. · 4cb68e10

Matthias Braun authored Jul 19, 2016

This adds two pieces:
- RegisterScavenger::enterBasicBlockEnd() which behaves similarly to
  enterBasicBlock() but starts tracking at the end of the basic block.
- A RegisterScavenger::backward() method. It is subtly different
  from the existing unprocess() method which only considers uses with
  the kill flag set: If a value is dead at the end of a basic block with
  a last use inside the basic block, unprocess() will fail to mark it as
  live. However we cannot change/fix this behaviour because unprocess()
  needs to perform the exact reverse operation of forward().
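
As a rough illustration of why a backward walk does not need kill flags, here is a minimal sketch of a backward liveness step. ToyInstr, stepBackward and liveAtBlockStart are hypothetical names made up for this example; none of this is the RegisterScavenger code from the patch.

```cpp
#include <set>
#include <utility>
#include <vector>

// Hypothetical toy types; these are not LLVM's MachineInstr or RegisterScavenger.
struct ToyInstr {
  std::set<int> Defs; // registers written by the instruction
  std::set<int> Uses; // registers read (no kill flags are recorded at all)
};

// Step backward over one instruction: given the registers live after it,
// compute the registers live before it.
std::set<int> stepBackward(std::set<int> Live, const ToyInstr &MI) {
  for (int Reg : MI.Defs)
    Live.erase(Reg);  // a def ends the live range when walking upward
  for (int Reg : MI.Uses)
    Live.insert(Reg); // any use makes the register live above this point
  return Live;
}

// Walk a block from its end; LiveOut holds the registers live out of the block.
std::set<int> liveAtBlockStart(const std::vector<ToyInstr> &Block,
                               std::set<int> LiveOut) {
  for (auto It = Block.rbegin(); It != Block.rend(); ++It)
    LiveOut = stepBackward(std::move(LiveOut), *It);
  return LiveOut;
}
```

In this model a register that is dead at the end of the block but has a last use inside it becomes live as soon as that use is visited, whether or not the use carries a kill flag.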

Differential Revision: http://reviews.llvm.org/D21873

llvm-svn: 276043

4cb68e10

regenerate checks · d4ea94eb

Sanjay Patel authored Jul 19, 2016

llvm-svn: 276042

d4ea94eb

[AArch64] Properly validate the reciprocal estimation. · 238fa765

Evandro Menezes authored Jul 19, 2016

Add check for legal data types when expanding into a Newton series.

Differential Revision: https://reviews.llvm.org/D22267

llvm-svn: 276041

238fa765

[InstCombine] fold add(zext(xor X, C), C) --> sext X when C is INT_MIN in the source type · 2d477e59

Sanjay Patel authored Jul 19, 2016

The pattern may look more obviously like a sext if written as:

  define i32 @g(i16 %x) {
    %zext = zext i16 %x to i32
    %xor = xor i32 %zext, 32768
    %add = add i32 %xor, -32768
    ret i32 %add
  }

We already have that fold in visitAdd().
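
The identity behind this fold can be checked exhaustively for an i16 source type. The following is only a sketch in C++ that mimics the IR arithmetic with fixed-width integers; the function names are invented for the example and it does not touch InstCombine.

```cpp
#include <cstdint>

// Left-hand side of the fold for i16 -> i32:
// add(zext(xor X, INT16_MIN), INT16_MIN sign-extended to i32).
int32_t viaXorZextAdd(int16_t X) {
  // xor i16 X, -32768 (flip the sign bit)
  uint16_t Flipped = static_cast<uint16_t>(static_cast<uint16_t>(X) ^ 0x8000u);
  // zext i16 to i32
  uint32_t Wide = Flipped;
  // add i32, -32768, using wrapping unsigned arithmetic
  return static_cast<int32_t>(Wide + 0xFFFF8000u);
}

// Right-hand side of the fold: sext i16 X to i32.
int32_t viaSext(int16_t X) { return static_cast<int32_t>(X); }

// Compare both sides for every possible 16-bit input.
bool foldHoldsForAllI16() {
  for (int V = INT16_MIN; V <= INT16_MAX; ++V) {
    int16_t X = static_cast<int16_t>(V);
    if (viaXorZextAdd(X) != viaSext(X))
      return false;
  }
  return true;
}
```

Calling foldHoldsForAllI16() compares the two forms on all 65536 inputs.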

Differential Revision: https://reviews.llvm.org/D22477

llvm-svn: 276035

2d477e59

Jul 19, 2016

Attempt to appease MSVC buildbots. · 22a0f1a0

George Burgess IV authored Jul 19, 2016

Broken by r276026.

llvm-svn: 276032

22a0f1a0

[AMDGPU] Remove spurious line (should've been removed in r276029). · 63e59680

Davide Italiano authored Jul 19, 2016

llvm-svn: 276030

63e59680

[AMDGPU] Remove dead code. · 1576e385

Davide Italiano authored Jul 19, 2016

LGTM'd by Matt Arsenault.

llvm-svn: 276029

1576e385

[CFLAA] Make a test tell the truth. NFC. · 8b85321b

George Burgess IV authored Jul 19, 2016

Dishonesty noted by Jia Chen.

llvm-svn: 276028

8b85321b

[CFLAA] Add some interproc. analysis to CFLAnders. · 3b059841

George Burgess IV authored Jul 19, 2016

This patch adds function summary support to CFLAnders. It also comes
with a lot of tests! Woohoo!

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22450

llvm-svn: 276026

3b059841

Next step along the way to getting good error messages for bad archives. · 6524bd8c

Kevin Enderby authored Jul 19, 2016

This step builds on Lang Hames's work to change Archive::child_iterator
for better interoperation with Error/Expected. Building on that, it is now
possible to return an error message when the size field of an archive
contains non-decimal characters.
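
To make the check concrete, here is a minimal sketch of validating a space-padded decimal size field from an archive member header. parseSizeField is a hypothetical helper written for this example, and the error text is made up; it is not the llvm::object::Archive code or its Error/Expected plumbing.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not the llvm::object::Archive API.
// Parses the space-padded decimal size field of an archive member header.
// Returns true and sets Size on success; returns false and fills Err when
// the field is empty or contains non-decimal characters.
// Assumes the field is at most 10 characters, so the value fits in uint64_t.
bool parseSizeField(std::string Field, uint64_t &Size, std::string &Err) {
  // Drop the trailing padding spaces.
  while (!Field.empty() && Field.back() == ' ')
    Field.pop_back();
  if (Field.empty()) {
    Err = "size field is empty";
    return false;
  }
  uint64_t Value = 0;
  for (char C : Field) {
    if (C < '0' || C > '9') {
      Err = "characters in size field are not all decimal digits: '" + Field + "'";
      return false;
    }
    Value = Value * 10 + static_cast<uint64_t>(C - '0');
  }
  Size = Value;
  return true;
}
```

The caller can then surface Err to the user instead of continuing with a bogus size.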

llvm-svn: 276025

6524bd8c

add even more missing tests for simplifySelectBitTest() · 47c04f95

Sanjay Patel authored Jul 19, 2016

llvm-svn: 276024

47c04f95

[CFLAA] Teach CFLAnders to distinguish reads from writes. · c01b42fa

George Burgess IV authored Jul 19, 2016

This patch adds more specific edges to CFLAndersAliasAnalysis. The goal
of these edges is to give us more information about *how* two values
that MayAlias alias. With this, we can now tell cases like

  a = b; // ergo, a may alias b

apart from

  a = c;
  b = c;
  // so, a may alias b, but only because they were both assigned to c.

...And others.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22429

llvm-svn: 276023

c01b42fa

This code block breaks the docs build... · a0c1f408

Aaron Ballman authored Jul 19, 2016

This code block breaks the docs build (http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11921/steps/docs-llvm-html/logs/stdio). Setting the code highlighting to none instead of llvm to hopefully get the bot stumbling back towards green.

llvm-svn: 276018

a0c1f408

Use posix_fallocate instead of ftruncate. · 3816c53f

Rafael Espindola authored Jul 19, 2016

This makes sure that space is actually available. With this change,
running lld on a full file system causes it to exit with

  failed to open foo: No space left on device

instead of crashing with a SIGBUS.
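
The idea can be sketched with plain POSIX calls. createAndReserve is a hypothetical helper written for this example and is not lld's actual code; it only shows that posix_fallocate reports a full file system up front, whereas ftruncate can succeed and leave the failure to a later write through a mapping.

```cpp
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper, not lld's implementation. Creates Path and reserves
// Size bytes of real disk space for it. Returns an empty string on success,
// otherwise a message in the style quoted above.
std::string createAndReserve(const char *Path, off_t Size) {
  int FD = open(Path, O_RDWR | O_CREAT | O_TRUNC, 0644);
  if (FD < 0)
    return std::string("failed to open ") + Path + ": " + std::strerror(errno);
  // posix_fallocate returns the error number directly instead of setting errno.
  int Err = posix_fallocate(FD, 0, Size);
  close(FD);
  if (Err != 0)
    return std::string("failed to open ") + Path + ": " + std::strerror(Err);
  return std::string();
}
```

Size must be greater than zero here, since posix_fallocate rejects a zero length.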

llvm-svn: 276017

3816c53f

[tsan] Don't instrument __llvm_gcov_global_state_pred or __llvm_gcda* · 57faf2d2

Vedant Kumar authored Jul 19, 2016

r274801 did not go far enough to allow gcov+tsan to cooperate. With this
commit it's possible to run the following code without false positives:

  std::thread T1(fib), T2(fib);
  T1.join(); T2.join();

llvm-svn: 276015

57faf2d2

ARM: move feature for Thumb2 pkhbt/pkhtb onto architectures. · 554fbd05

Tim Northover authored Jul 19, 2016

There's not much functional change, but it really is an architectural feature
(on v6T2, v7A, v7R and v7EM) rather than something each CPU implements
individually.

The main functional change is the default behaviour you get when specifying
only "-triple".

llvm-svn: 276013

554fbd05

[GlobalISel] Mark newly-created gvregs as having a bank. · 5a59b24b

Ahmed Bougacha authored Jul 19, 2016

Also verify that we never try to set the size of a vreg associated
with a register class.

Report an error when we encounter that in MIR. Fix a testcase that
hit that error and had a size for no reason.

llvm-svn: 276012

5a59b24b

[GlobalISel] Simplify more RegClassOrRegBank is+get. NFC. · 0313a08a

Ahmed Bougacha authored Jul 19, 2016

llvm-svn: 276011

0313a08a

[FunctionAttrs] Correct the safety analysis for inference of 'returned' · 5246e0b2

David Majnemer authored Jul 19, 2016

We skipped over ReturnInsts which didn't return an argument, which would
lead us to incorrectly conclude that an argument returned by another
ReturnInst was 'returned'.
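
The safety condition can be illustrated with a toy model in which each return instruction is reduced to the index of the argument it returns. inferReturnedArgument and NotAnArgument are hypothetical names for this sketch; it is not the FunctionAttrs implementation.

```cpp
#include <vector>

// Hypothetical toy model, not the FunctionAttrs pass.
// Each entry describes one return instruction: the index of the argument it
// returns, or NotAnArgument when it returns anything else.
const int NotAnArgument = -1;

// Returns the index of the argument that may be marked 'returned', or
// NotAnArgument if no argument qualifies.
int inferReturnedArgument(const std::vector<int> &Returns) {
  int Candidate = NotAnArgument;
  for (int Ret : Returns) {
    // A return of a non-argument value must disqualify the function;
    // skipping it is the mistake described above.
    if (Ret == NotAnArgument)
      return NotAnArgument;
    // Two different arguments are returned on different paths.
    if (Candidate != NotAnArgument && Candidate != Ret)
      return NotAnArgument;
    Candidate = Ret;
  }
  return Candidate;
}
```

For a function with one return of argument 0 and one return of a constant, the model reports that no argument is 'returned'.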

This reverts commit r275756.

This fixes PR28610.

llvm-svn: 276008

5246e0b2

[SCCP] Improve assert messages. NFCI. · 63266b6b

Davide Italiano authored Jul 19, 2016

I've been hitting those already while working on SCCP and I think
it'd be useful to provide a more explanatory diagnostic.

llvm-svn: 276007

63266b6b

[libFuzzer] properly intercept memmem · 6b08be92

Kostya Serebryany authored Jul 19, 2016

llvm-svn: 276006

6b08be92

[DSE] Add additional debug output. NFC. · 8b5fa7a2

Chad Rosier authored Jul 19, 2016

llvm-svn: 276005

8b5fa7a2

Add a testcase for r275581 · 07ea3442

David Majnemer authored Jul 19, 2016

llvm-svn: 276002

07ea3442

[RegionInfo] Some cleanups · 938a6c7c

David Majnemer authored Jul 19, 2016

- Use unique_ptr instead of managing a container of new'd pointers.
- Use range based for loops.

No functional change is intended.

llvm-svn: 276001

938a6c7c

[RegionPass] Some minor cleanups · f29b7baf

David Majnemer authored Jul 19, 2016

No functional change is intended.

llvm-svn: 276000

f29b7baf

[LoopPass] Some minor cleanups · 1a4576e7

David Majnemer authored Jul 19, 2016

No functional change is intended.

llvm-svn: 275999

1a4576e7

This code block breaks the docs build... · 887ad0e9

Aaron Ballman authored Jul 19, 2016

This code block breaks the docs build (http://lab.llvm.org:8011/builders/llvm-sphinx-docs/builds/11920/steps/docs-llvm-html/logs/stdio), but I cannot see anything immediately wrong with it and cannot reproduce the diagnostic locally. Setting the code highlighting to none instead of nasm to hopefully get the bot stumbling back towards green.

llvm-svn: 275998

887ad0e9

add tests related to PR28466 · 8b76ebe5

Sanjay Patel authored Jul 19, 2016

llvm-svn: 275995

8b76ebe5

[X86][AVX512] Added AVX512 subvector broadcast tests · 5366d0e0

Simon Pilgrim authored Jul 19, 2016

llvm-svn: 275994

5366d0e0

[X86][AVX] Fixed typo in test names · f2d02cb0

Simon Pilgrim authored Jul 19, 2016

llvm-svn: 275992

f2d02cb0

[DSE] Add additional debug output. NFC. · 667b1ca0

Chad Rosier authored Jul 19, 2016

llvm-svn: 275991

667b1ca0

add missing test for simplifySelectBitTest() · d2ff6d72

Sanjay Patel authored Jul 19, 2016

llvm-svn: 275990

d2ff6d72

[InstCombine] Enable cast-folding in logic(cast(icmp), cast(icmp)) · 1c382622

Tobias Grosser authored Jul 19, 2016

Summary:
Currently, InstCombine is already able to fold expressions of the form `logic(cast(A), cast(B))` to the simpler form `cast(logic(A, B))`, where logic designates one of `and`/`or`/`xor`. This transformation is implemented in `foldCastedBitwiseLogic()` in InstCombineAndOrXor.cpp. However, this optimization will not be performed if both `A` and `B` are `icmp` instructions. The decision to preclude casts of `icmp` instructions originates in r48715 in combination with r261707, and can be best understood by the title of the former one:

> Transform (zext (or (icmp), (icmp))) to (or (zext (icmp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.

Apparently, it introduced a transformation that is a reverse of the transformation that is done in `foldCastedBitwiseLogic()`. Its purpose is to expose pairs of `zext icmp` that would subsequently be optimized by `transformZExtICmp()` in InstCombineCasts.cpp. Therefore, in order to avoid an endless loop of switching back and forth between these two transformations, the one in `foldCastedBitwiseLogic()` has been restricted to exclude `icmp` instructions which is mirrored in the responsible check:

`if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) && ...`

This check seems to sort out more cases than necessary because:
- the reverse transformation is obviously done for `or` instructions only
- and also not every `zext icmp` pair is necessarily the result of this reverse transformation

Therefore we now remove this check and replace it with a more fine-grained one in `shouldOptimizeCast()` that now rejects only those `logic(zext(icmp), zext(icmp))` that would be able to be optimized by `transformZExtICmp()`, which also avoids the mentioned endless loop. That means we are now able to also simplify expressions of the form `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` (`cast` being an arbitrary `CastInst`).

As an example, consider the following IR snippet

```
%1 = icmp sgt i64 %a, %b
%2 = zext i1 %1 to i8
%3 = icmp slt i64 %a, %c
%4 = zext i1 %3 to i8
%5 = and i8 %2, %4
```

which would now be transformed to

```
%1 = icmp sgt i64 %a, %b
%2 = icmp slt i64 %a, %c
%3 = and i1 %1, %2
%4 = zext i1 %3 to i8
```

This issue became apparent when experimenting with the programming language Julia, which makes use of LLVM. Currently, Julia lowers its `Bool` datatype to LLVM's `i8` (also see https://github.com/JuliaLang/julia/pull/17225). In fact, the above IR example is the lowered form of the Julia snippet `(a > b) & (a < c)`. As shown above, this may introduce `zext` operations, casting between `i1` and `i8`, which could for example hinder ScalarEvolution and Polly on certain code.

Reviewers: grosser, vtjnash, majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22511

Contributed-by: Matthias Reisinger
llvm-svn: 275989

1c382622

AMDGPU: Only use legal inline immediates with kill pseudo · 03006fd3

Matt Arsenault authored Jul 19, 2016

All that matters is whether the value is negative or positive,
so use a constant that doesn't require an instruction to
materialize.

These should really just emit the write to exec directly,
but for now stick with the kill pseudo-terminator.

llvm-svn: 275988

03006fd3

[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR · 0ea8d275

Simon Pilgrim authored Jul 19, 2016

D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.

It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NAN/out-of-range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding, which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).

This patch changes both scalar and packed versions back to using x86-specific builtins.

It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
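
The per-element result described above can be modelled portably. This is only a sketch of the semantics stated in this message (INF, NAN and out-of-range inputs all give 0x80000000); truncatingConvertModel is a made-up name and the code does not call the x86 instruction or any intrinsic.

```cpp
#include <cstdint>
#include <limits>

// Portable model of one lane of a truncating float -> int32 conversion with
// the semantics described above. It does not use any x86 intrinsic.
int32_t truncatingConvertModel(float X) {
  // 0x80000000 viewed as a signed 32-bit value.
  const int32_t Indefinite = std::numeric_limits<int32_t>::min();
  // NaN fails both comparisons; infinities and out-of-range values fail one.
  if (!(X >= -2147483648.0f && X < 2147483648.0f))
    return Indefinite;
  // In range, so the cast truncates toward zero and is well defined.
  return static_cast<int32_t>(X);
}
```

A generic fptosi in IR leaves these special inputs without a defined result, which is why folding it at compile time can disagree with the value the hardware would produce.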

A companion clang patch is at D22105.

Differential Revision: https://reviews.llvm.org/D22106

llvm-svn: 275981

0ea8d275

[ARM] Refactor Thumb2 Mul and Mla instr descs · 6ca4bbb0

Sam Parker authored Jul 19, 2016

Recommitting after r274347 was reverted. This patch introduces some
classes to refactor the 3 and 4 register Thumb2 multiplication
instruction descriptions, plus improved tests for some of those
instructions.

Differential Revision: https://reviews.llvm.org/D21929

llvm-svn: 275979

6ca4bbb0

[AArch64] PredictableSelectIsExpensive for Vulcan. · 1bfca191

Pankaj Gode authored Jul 19, 2016

Adding PredictableSelectIsExpensive for Vulcan

Differential Revision: https://reviews.llvm.org/D22448

llvm-svn: 275978

1bfca191

Add support for tlsldm assembler operator to ARM target · cbcecca5

Peter Smith authored Jul 19, 2016

The standard local dynamic model for TLS on ARM systems needs two
relocations:
- R_ARM_TLS_LDM32 (module idx)
- R_ARM_TLS_LDO32 (offset of object from origin of module TLS block)

In GNU style assembler we use symbol(tlsldm) and symbol(tlsldo) to
produce these relocations.

llvm-mc for ARM supports symbol(tlsldo) but does not support symbol(tlsldm).
This patch wires up the existing symbol(tlsldm) to R_ARM_TLS_LDM32.

TLS for ARM is defined in Addenda to, and Errata in, the ABI for the
ARM Architecture.
Differential Revision: https://reviews.llvm.org/D22461

llvm-svn: 275977

cbcecca5

[AARCH64] Fix linu triple typo · b87a21f1

Simon Pilgrim authored Jul 19, 2016

As promised in D22191

llvm-svn: 275976

b87a21f1