Commits · 397f9d9d05fc9451b274e3126891236cfd72f639 · Lorenzo Albano / LLVM bpEVL

Nov 16, 2016

ARM: fix CodeGen for 64-bit shifts. · 397f9d9d

Tim Northover authored Nov 16, 2016

One half of the shifts obviously needed conditional selection based on whether
the shift amount is more than 32-bits, but leaving the other half as the
natural shift isn't acceptable either: it's undefined behaviour to shift a
32-bit value by more than 31.
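
As a rough illustration of the point above (a C++-level sketch, not the ARM backend code this commit touches), a 64-bit left shift built from 32-bit halves has to select both result halves on whether the amount reaches 32, so that no 32-bit shift ever sees a count above 31. The `Halves` struct and `shiftLeft64` name are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Two 32-bit halves of a 64-bit value, as a 32-bit target would hold them.
struct Halves {
  uint32_t lo;
  uint32_t hi;
};

// Logical left shift of a 64-bit value by an amount in [0, 63], using only
// 32-bit shifts whose count always stays in [0, 31].
Halves shiftLeft64(Halves v, unsigned amount) {
  assert(amount < 64 && "shift amount out of range");
  if (amount == 0)
    return v;
  if (amount < 32) {
    // Both halves take part; (32 - amount) is in [1, 31] here.
    return {v.lo << amount, (v.hi << amount) | (v.lo >> (32 - amount))};
  }
  // amount in [32, 63]: the low half becomes zero, and the high half is the
  // old low half shifted by (amount - 32), which is in [0, 31].
  return {0u, v.lo << (amount - 32)};
}
```

Note that the low half cannot simply be `v.lo << amount` in every case: for amounts of 32 or more that expression is undefined in C++, which mirrors the problem described in the message.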

llvm-svn: 287149

397f9d9d

Make block placement deterministic · 66827427

Rong Xu authored Nov 16, 2016

We fail to produce bit-for-bit matching stage2 and stage3 compilers in a PGO
bootstrap build. The reason is that LoopBlockSet is a SmallPtrSet, whose
iteration order depends on pointer values.

This patch fixes this issue by changing to use SmallSetVector.
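
The idea can be sketched with standard containers. This is a hypothetical stand-in, not `llvm::SmallSetVector` itself: membership goes through a hash set, while iteration walks a vector in insertion order, so the visit order no longer depends on where the pointed-to objects happen to live.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical insertion-ordered set (illustration only).
template <typename T>
class InsertionOrderedSet {
public:
  // Returns true if the value was newly inserted, false if already present.
  bool insert(const T &value) {
    if (!seen_.insert(value).second)
      return false;
    order_.push_back(value);
    return true;
  }

  std::size_t size() const { return order_.size(); }

  // Elements are indexed in the order they were first inserted.
  const T &operator[](std::size_t i) const {
    assert(i < order_.size() && "index out of range");
    return order_[i];
  }

  typename std::vector<T>::const_iterator begin() const { return order_.begin(); }
  typename std::vector<T>::const_iterator end() const { return order_.end(); }

private:
  std::unordered_set<T> seen_; // fast membership test
  std::vector<T> order_;       // deterministic iteration order
};
```

Iterating such a container over `int *` keys visits them in insertion order on every run, whereas a set ordered or hashed by pointer value may visit them differently from one build to the next.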

Differential Revision: http://reviews.llvm.org/D26634

llvm-svn: 287148

66827427

[InstCombine] replace unreachable with assert and remove unreachable code; NFCI · 80baf69c
Sanjay Patel authored Nov 16, 2016
```
llvm-svn: 287147
```
80baf69c

AMDGPU: Enable ConstrainCopy DAG mutation · 3b36bb1d

Matt Arsenault authored Nov 16, 2016

This fixes a probably unintended divergence from the default
scheduler behavior.

llvm-svn: 287146

3b36bb1d

[InstCombine] fix formatting and add FIXMEs to foldOperationIntoSelectOperand(); NFC · 1b9560ff
Sanjay Patel authored Nov 16, 2016
```
llvm-svn: 287145
```
1b9560ff
adding operator* to help working with primitive values · 423405b1
Mike Aizatsky authored Nov 16, 2016
```
Subscribers: kubabrecka

Differential Revision: https://reviews.llvm.org/D26756

llvm-svn: 287144
```
423405b1

[ELF] Don't replace path separators on *NIX. · c223d1bc

Davide Italiano authored Nov 16, 2016

Apparently this is wrong because it's legal to have a filename
on UNIX which contains a backslash.
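
A minimal sketch of the behaviour described, assuming a hypothetical helper `normalizeSeparators` and an explicit `hostIsWindows` flag (neither is taken from the LLD sources): backslashes are rewritten only when the host treats them as separators.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: rewrites '\\' to '/' only on a Windows host. On UNIX a
// backslash is an ordinary filename character and must be left alone.
std::string normalizeSeparators(std::string path, bool hostIsWindows) {
  if (hostIsWindows)
    std::replace(path.begin(), path.end(), '\\', '/');
  return path;
}
```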

Differential Revision: https://reviews.llvm.org/D26734

llvm-svn: 287143

c223d1bc

[AArch64] Handle vector types in replaceZeroVectorStore. · 8301c645

Geoff Berry authored Nov 16, 2016

Summary:
Extend replaceZeroVectorStore to handle more vector type stores,
floating point zero vectors and set alignment more accurately on split
stores.

This is a follow-up change to r286875.

This change fixes PR31038.

Reviewers: MatzeB

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D26682

llvm-svn: 287142

8301c645

Relax testcase. · 001c6789

Adrian Prantl authored Nov 16, 2016

This removes checks that are irrelevant for what is being tested.

llvm-svn: 287141

001c6789

Reduce number of tasks in parallel_for_each. · 87ff6fef

Rui Ueyama authored Nov 16, 2016

TaskGroup has a fairly high overhead, so we don't want to partition
tasks into too small tasks. This patch partitions tasks into up to
1024 tasks.
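
The partitioning can be sketched as follows. This is an assumption about the general technique, not the actual LLD implementation; `partitionIntoTasks` is a hypothetical name. The index range is cut into contiguous chunks so that the chunk count never exceeds the cap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Splits the index range [0, n) into at most maxTasks contiguous chunks.
// Each pair is a half-open range [begin, end).
std::vector<std::pair<std::size_t, std::size_t>>
partitionIntoTasks(std::size_t n, std::size_t maxTasks = 1024) {
  assert(maxTasks > 0 && "need at least one task");
  std::vector<std::pair<std::size_t, std::size_t>> chunks;
  if (n == 0)
    return chunks;
  // Round up so that chunkSize * maxTasks >= n.
  std::size_t chunkSize = (n + maxTasks - 1) / maxTasks;
  for (std::size_t begin = 0; begin < n; begin += chunkSize)
    chunks.emplace_back(begin, std::min(begin + chunkSize, n));
  return chunks;
}
```

Each chunk would then be handed to the task group as one unit of work, so the per-task overhead is paid at most `maxTasks` times regardless of how many elements there are.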

I compared this patch with the original LLD's parallel_for_each.
I reverted r287042 locally for comparison.

With this patch, time to self-link lld with debug info changed from
6.23 seconds to 4.62 seconds (-25.8%), with -threads and without -build-id.
With both -threads and -build-id, it improved from 11.71 seconds
to 4.94 seconds (-57.8%). Full results are below.

BTW, GNU gold takes 11.65 seconds to link the same binary.

NOW

--no-threads --build-id=none
       6789.847776 task-clock (msec)         #    1.000 CPUs utilized            ( +-  1.86% )
               685 context-switches          #    0.101 K/sec                    ( +-  2.82% )
                 4 cpu-migrations            #    0.001 K/sec                    ( +- 31.18% )
         1,424,690 page-faults               #    0.210 M/sec                    ( +-  1.07% )
    21,339,542,522 cycles                    #    3.143 GHz                      ( +-  1.49% )
    13,092,260,230 stalled-cycles-frontend   #   61.35% frontend cycles idle     ( +-  2.23% )
   <not supported> stalled-cycles-backend
    21,462,051,828 instructions              #    1.01  insns per cycle
                                             #    0.61  stalled cycles per insn  ( +-  0.41% )
     3,955,296,378 branches                  #  582.531 M/sec                    ( +-  0.39% )
        75,699,909 branch-misses             #    1.91% of all branches          ( +-  0.08% )

       6.787630744 seconds time elapsed                                          ( +-  1.86% )

--threads --build-id=none
      14767.148697 task-clock (msec)         #    3.196 CPUs utilized            ( +-  2.56% )
            28,891 context-switches          #    0.002 M/sec                    ( +-  1.99% )
               905 cpu-migrations            #    0.061 K/sec                    ( +-  5.49% )
         1,262,122 page-faults               #    0.085 M/sec                    ( +-  1.68% )
    43,116,163,217 cycles                    #    2.920 GHz                      ( +-  3.07% )
    33,690,171,242 stalled-cycles-frontend   #   78.14% frontend cycles idle     ( +-  3.67% )
   <not supported> stalled-cycles-backend
    22,836,731,536 instructions              #    0.53  insns per cycle
                                             #    1.48  stalled cycles per insn  ( +-  1.13% )
     4,382,712,998 branches                  #  296.788 M/sec                    ( +-  1.33% )
        78,622,295 branch-misses             #    1.79% of all branches          ( +-  0.54% )

       4.621228056 seconds time elapsed                                          ( +-  1.90% )

--threads --build-id=sha1
      24594.457135 task-clock (msec)         #    4.974 CPUs utilized            ( +-  1.78% )
            29,902 context-switches          #    0.001 M/sec                    ( +-  2.62% )
             1,097 cpu-migrations            #    0.045 K/sec                    ( +-  6.29% )
         1,313,947 page-faults               #    0.053 M/sec                    ( +-  2.36% )
    70,516,415,741 cycles                    #    2.867 GHz                      ( +-  0.78% )
    47,570,262,296 stalled-cycles-frontend   #   67.46% frontend cycles idle     ( +-  0.86% )
   <not supported> stalled-cycles-backend
    73,124,599,029 instructions              #    1.04  insns per cycle
                                             #    0.65  stalled cycles per insn  ( +-  0.33% )
    10,495,266,104 branches                  #  426.733 M/sec                    ( +-  0.41% )
        91,444,149 branch-misses             #    0.87% of all branches          ( +-  0.83% )

       4.944291711 seconds time elapsed                                          ( +-  1.72% )

PREVIOUS

--threads --build-id=none
       7307.437544 task-clock (msec)         #    1.160 CPUs utilized            ( +-  2.34% )
             3,128 context-switches          #    0.428 K/sec                    ( +-  4.37% )
               352 cpu-migrations            #    0.048 K/sec                    ( +-  5.98% )
         1,354,450 page-faults               #    0.185 M/sec                    ( +-  2.20% )
    22,081,733,098 cycles                    #    3.022 GHz                      ( +-  1.46% )
    13,709,991,267 stalled-cycles-frontend   #   62.09% frontend cycles idle     ( +-  1.77% )
   <not supported> stalled-cycles-backend
    21,634,468,895 instructions              #    0.98  insns per cycle
                                             #    0.63  stalled cycles per insn  ( +-  0.86% )
     3,993,062,361 branches                  #  546.438 M/sec                    ( +-  0.83% )
        76,188,819 branch-misses             #    1.91% of all branches          ( +-  0.19% )

       6.298101157 seconds time elapsed                                          ( +-  2.03% )

--threads --build-id=sha1
      12845.420265 task-clock (msec)         #    1.097 CPUs utilized            ( +-  1.95% )
             4,020 context-switches          #    0.313 K/sec                    ( +-  2.89% )
               369 cpu-migrations            #    0.029 K/sec                    ( +-  6.26% )
         1,464,822 page-faults               #    0.114 M/sec                    ( +-  1.37% )
    40,668,449,813 cycles                    #    3.166 GHz                      ( +-  0.96% )
    18,863,982,388 stalled-cycles-frontend   #   46.38% frontend cycles idle     ( +-  1.82% )
   <not supported> stalled-cycles-backend
    71,560,499,058 instructions              #    1.76  insns per cycle
                                             #    0.26  stalled cycles per insn  ( +-  0.14% )
    10,044,152,441 branches                  #  781.925 M/sec                    ( +-  0.19% )
        87,835,773 branch-misses             #    0.87% of all branches          ( +-  0.09% )

      11.711773314 seconds time elapsed                                          ( +-  1.51% )

llvm-svn: 287140

87ff6fef

Add the missing FileCheck invocation to this testcase. · f4c5a0e6
Adrian Prantl authored Nov 16, 2016
```
llvm-svn: 287139
```
f4c5a0e6
Rangify for loops, NFC. · 3998a09d
Yaron Keren authored Nov 16, 2016
```
llvm-svn: 287138
```
3998a09d
Export fewer functions from Error.h. · 28212d71
Rui Ueyama authored Nov 16, 2016
```
Also add a comment saying that check() returns a value.

llvm-svn: 287136
```
28212d71

[LoopVectorize] Fix for non-determinism in codegen · 000ce9a6

Mandeep Singh Grang authored Nov 16, 2016

Summary: This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718

Reviewers: mssimpso

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D26727

llvm-svn: 287135

000ce9a6

Fix PR31029 by attaching an artificial debug location to msabi thunks. · d3c4e1b1
Adrian Prantl authored Nov 16, 2016
```
This was a latent bug that was recently uncovered by r286400.

llvm-svn: 287134
```
d3c4e1b1

[ELF] - Separate locals list from versions. · 17c65af8

George Rimar authored Nov 16, 2016

This change moves all versioned locals into a separate list in config,
as suggested by Rafael, which simplifies the logic a bit.

Differential revision: https://reviews.llvm.org/D26754

llvm-svn: 287132

17c65af8

AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass · 0d162b1c

Tom Stellard authored Nov 16, 2016

Summary:
1. Don't try to copy values to and from the same register class.
2. Replace copies of registers with immediate values with v_mov/s_mov
   instructions.

The main purpose of this change is to make MachineSink do a better job of
determining when it is beneficial to split a critical edge, since the pass
assumes that copies will become move instructions.

This prevents a regression in uniform-cfg.ll if we enable critical edge
splitting for AMDGPU.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: https://reviews.llvm.org/D23408

llvm-svn: 287131

0d162b1c

[ExecutionEngine] Fix examples build broken in r287126 and other Include What You Use warnings. · caf28033
Eugene Zelenko authored Nov 16, 2016
```
llvm-svn: 287130
```
caf28033

Fixed layout of test/ASTMerge. · ee6d3fa0

Sean Callanan authored Nov 16, 2016

As outlined in a previous RFC, the test/ASTMerge/Inputs folder is getting full and the tests are starting to become interdependent. This is undesirable because

- it makes it harder to write new tests
- it makes it harder to figure out at a glance what old tests are doing, and
- it adds the risk of breaking one test while changing a different one, because of the interdependencies.

To fix this, according to the conversation in the RFC, I have changed the layout from

a.c
Inputs/a1.c
Inputs/a2.c

to

a/test.c
a/Inputs/a1.c
a/Inputs/a2.c

for all existing tests. I have also eliminated interdependencies by replicating the input files for each test that uses them.

https://reviews.llvm.org/D26571

llvm-svn: 287129

ee6d3fa0

[Frontend] Allow attaching an external sema source to compiler instance and... · 7de9969b

Benjamin Kramer authored Nov 16, 2016

[Frontend] Allow attaching an external sema source to compiler instance and extra diags to TypoCorrections

This can be used to append alternative typo corrections to an existing diag.
include-fixer can use it to suggest includes to be added.

Differential Revision: https://reviews.llvm.org/D26745

llvm-svn: 287128

7de9969b

fix comment formatting; NFC · 4ce99d4d
Sanjay Patel authored Nov 16, 2016
```
llvm-svn: 287127
```
4ce99d4d

[ExecutionEngine] Fix some Clang-tidy modernize-use-default,... · cecb0183

Eugene Zelenko authored Nov 16, 2016

[ExecutionEngine] Fix some Clang-tidy modernize-use-default, modernize-use-equals-delete and Include What You Use warnings; other minor fixes.

Differential revision: https://reviews.llvm.org/D26729

llvm-svn: 287126

cecb0183

Don't error if __tls_get_addr is defined. · 95eae57d

Rafael Espindola authored Nov 16, 2016

Turns out some systems do define it. Not producing an error in this
case matches gold and bfd.

llvm-svn: 287125

95eae57d

[ELF] - Added support for extern "c++" local symbols in version script. · e0fc2421

George Rimar authored Nov 16, 2016

Previously we did not support them;
this patch implements this functionality.

Differential revision: https://reviews.llvm.org/D26604

llvm-svn: 287124

e0fc2421

[ELF] - Change error message according to review comment. NFC. · 7759e5b6
George Rimar authored Nov 16, 2016
```
Forgot about that, I am sorry.

llvm-svn: 287123
```
7759e5b6

[x86] add fake scalar FP logic instructions to ReplaceableInstrs to save some bytes · 7f3d51f8

Sanjay Patel authored Nov 16, 2016

We can replace "scalar" FP-bitwise-logic with other forms of bitwise-logic instructions. 
Scalar SSE/AVX FP-logic instructions only exist in your imagination and/or the bowels of 
compilers, but logically equivalent int, float, and double variants of bitwise-logic 
instructions are reality in x86, and the float variant may be a shorter instruction 
depending on which flavor (SSE or AVX) of vector ISA you have...so just prefer float all 
the time.

This is a preliminary step towards solving PR6137:
https://llvm.org/bugs/show_bug.cgi?id=6137

Differential Revision:
https://reviews.llvm.org/D26712

llvm-svn: 287122

7f3d51f8

[Orc] Re-enable the RPC unit test disabled in r286917. · d4758898

Lang Hames authored Nov 16, 2016

This unit test infinite-looped on s390x due to a thread_yield being optimized
out. I've updated the QueueChannel class (where thread_yield was called) to use
a condition variable instead. This should cause the unit test to behave
correctly.
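
For illustration, here is a minimal condition-variable queue in the spirit of that change. It is a sketch under assumptions, not the actual QueueChannel class: `ByteQueue` and its members are hypothetical names. The consumer sleeps on the condition variable until data arrives instead of spinning on a yield call.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical blocking byte queue (illustration only).
class ByteQueue {
public:
  void push(char c) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      bytes_.push(c);
    }
    // Wake one waiting consumer after releasing the lock.
    ready_.notify_one();
  }

  // Blocks until a byte is available, then removes and returns it.
  char pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    // The predicate form re-checks the condition after every wakeup, so
    // spurious wakeups are handled.
    ready_.wait(lock, [this] { return !bytes_.empty(); });
    assert(!bytes_.empty());
    char c = bytes_.front();
    bytes_.pop();
    return c;
  }

private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<char> bytes_;
};
```

Because the wait is tied to a mutex-protected predicate, correctness does not depend on the compiler or the scheduler preserving a yield inside a polling loop.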

llvm-svn: 287121

d4758898

[ELF] - Improve diagnostic messages. · 76f429b4

George Rimar authored Nov 16, 2016

In particular, the "cannot preempt symbol" message
now includes locations.

Differential revision: https://reviews.llvm.org/D26738

llvm-svn: 287120

76f429b4

Define -build-id=tree as a synonym for -build-id=sha1. · cb876751

Rui Ueyama authored Nov 16, 2016

Our build-id is a tree hash anyway, so I'll define this as a synonym
for sha1. GNU gold takes this parameter, so this is for compatibility
with that.

llvm-svn: 287119

cb876751

[change-namespace] handle constructor initializer: Derived : Base::Base() {}... · ff51f011

Eric Liu authored Nov 16, 2016

[change-namespace] handle constructor initializer: Derived : Base::Base() {} and added conflict detections

Summary:
namespace nx { namespace ny { class Base { public: Base(int i) {} }; } }
namespace na {
namespace nb {
class X : public nx::ny::Base {
public:
  X() : Base::Base(1) {}
};
}
}

When changing from na::nb to x::y, "Base::Base" will be changed to "nx::ny::Base", and
"Base::" in "Base::Base" will be replaced with "nx::ny::Base" too, which causes a
conflict. This conflict should have been detected when adding replacements but was hidden by `addOrMergeReplacement`. We now also detect conflicts when adding replacements where a conflict must not happen.

The namespace lookup is tricky here; we simply replace "Base::Base()" with "nx::ny::Base()" as a workaround, which compiles but is not perfect.

Reviewers: hokein

Subscribers: bkramer, cfe-commits

Differential Revision: https://reviews.llvm.org/D26637

llvm-svn: 287118

ff51f011

[sancov] Name the global containing the main source file name · 3a83e768

Reid Kleckner authored Nov 16, 2016

If the global name doesn't start with __sancov_gen, ASan will insert
unnecessary red zones around it.

llvm-svn: 287117

3a83e768

test commit, changed tab to spaces, NFC · e870398e
Daniil Fukalov authored Nov 16, 2016
```
llvm-svn: 287116
```
e870398e
target-data test update for TCE and TCELE · 6aa07ee4
Pekka Jaaskelainen authored Nov 16, 2016
```
llvm-svn: 287115
```
6aa07ee4
Remove duplicate condition (PR30648). NFCI. · 0b33f111
Simon Pilgrim authored Nov 16, 2016
```
We only need to check that the bitstream entry is a Record.

llvm-svn: 287114
```
0b33f111

Remove Windows-specific minidump plugin · 1ca677f4

Adrian McCarthy authored Nov 16, 2016

With the cross-platform minidump plugin working, the Windows-specific one is no longer needed. This eliminates the unnecessary code.

This does not eliminate the Windows-specific tests, as they hit a few cases the general tests don't. (The Windows-specific tests are currently passing.) I'll look into a separate patch to make sure we're not doing too much duplicate testing.

After that I might do a little re-org in the Windows plugin, as there was some factoring there (Common & Live) that probably isn't necessary anymore.

Differential Revision: https://reviews.llvm.org/D26697

llvm-svn: 287113

1ca677f4

Add a little endian variant of TCE. · 67354487
Pekka Jaaskelainen authored Nov 16, 2016
```
llvm-svn: 287112
```
67354487
Add a little endian variant of TCE. · 8483cf0a
Pekka Jaaskelainen authored Nov 16, 2016
```
llvm-svn: 287111
```
8483cf0a
[X86] Add integer division test for PR23590 · 79416ea7
Simon Pilgrim authored Nov 16, 2016
```
Shows missed opportunity to recognise reduced integer division result size

llvm-svn: 287110
```
79416ea7
Fix -verify tests for older ccache versions · c2b49f5a
Eric Fiselier authored Nov 16, 2016
```
llvm-svn: 287109
```
c2b49f5a

[X86][AVX512] Autoupgrade lossless i32/u32 to f64 conversion intrinsics with generic IR · b57dd171

Simon Pilgrim authored Nov 16, 2016

Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic SINT_TO_FP/UINT_TO_FP calls instead of x86 intrinsics without affecting final codegen.
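
The "lossless" claim rests on the width of the double significand. A small C++ check, offered as a sketch and unrelated to the intrinsic upgrade code itself, shows that 32-bit integers survive a round trip through `double`, while `float` cannot hold all of them. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// A double has a 53-bit significand, so every 32-bit integer, signed or
// unsigned, has an exact double representation.
static_assert(std::numeric_limits<double>::digits >= 32,
              "double must hold any 32-bit integer exactly");

// Converts to double and back through a wider integer type, so the check
// itself never performs an out-of-range conversion.
bool signedRoundTrips(int32_t x) {
  return static_cast<int64_t>(static_cast<double>(x)) == static_cast<int64_t>(x);
}

bool unsignedRoundTrips(uint32_t x) {
  return static_cast<int64_t>(static_cast<double>(x)) == static_cast<int64_t>(x);
}

// Contrast: float has only a 24-bit significand, so some 32-bit integers are
// rounded. The comparison is done in double to keep it exact.
bool survivesFloat(int32_t x) {
  return static_cast<double>(static_cast<float>(x)) == static_cast<double>(x);
}
```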

LLVM counterpart to D26686

Differential Revision: https://reviews.llvm.org/D26736

llvm-svn: 287108

b57dd171