Commits · 582a5237f95a3852cead5208f28a84b4cab0efb2 · Roger Ferrer / llvm-epi

Feb 15, 2017

[AMDGPU] Revert failed scheduling · 582a5237

Stanislav Mekhanoshin authored Feb 15, 2017

This patch reverts a region's scheduling to the original untouched state
if the new schedule has decreased occupancy.

In addition it switches to use TargetRegisterInfo occupancy callback
for pressure limits instead of gradually increasing limits which were
just passed by. We are going to stay with the best schedule so we do
not need to tolerate worsened scheduling anymore.

Differential Revision: https://reviews.llvm.org/D29971

llvm-svn: 295206

582a5237
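The revert-on-regression policy described in the commit above can be illustrated with a minimal sketch. It is not the AMDGPU scheduler code: `Region`, `scheduleOrRevert`, and both callbacks are hypothetical names, and occupancy is treated as an opaque number where higher is better.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a scheduling region: an ordered list of
// instruction ids. This is not an LLVM type.
using Region = std::vector<int>;

// Runs `schedule` on `region` and keeps the result only if occupancy did not
// drop. Returns true if the new order was kept, false if it was reverted.
bool scheduleOrRevert(Region &region,
                      const std::function<void(Region &)> &schedule,
                      const std::function<unsigned(const Region &)> &occupancy) {
  assert(schedule && occupancy && "both callbacks must be set");
  const Region original = region;             // snapshot of the untouched order
  const unsigned before = occupancy(region);  // occupancy of the original order
  schedule(region);
  if (occupancy(region) < before) {
    region = original;                        // restore the untouched state
    return false;
  }
  return true;
}
```

Because the original order is always available as a fallback, the caller never has to accept a schedule that is worse than the starting point.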

Revert "[JumpThreading] Thread through guards" · 94c8d497

Anna Thomas authored Feb 15, 2017

This reverts commit r294617.

We fail on an assert while trying to get a condition from an
unconditional branch.

llvm-svn: 295200

94c8d497

[X86] Regenerate scalar stack reload test · d811bdd6
Simon Pilgrim authored Feb 15, 2017
```
llvm-svn: 295195
```
d811bdd6
Fix unittest for buildbot with mips host (32bit big endian) from r295174 · 4b21d022
David Bozier authored Feb 15, 2017
```
llvm-svn: 295188
```
4b21d022
[InlineFunction] use getFunction(); NFC · 288f075f
Sanjay Patel authored Feb 15, 2017
```
llvm-svn: 295185
```
288f075f
Fix spelling mistake - paramater -> parameter. NFCI. · 1746e215
Simon Pilgrim authored Feb 15, 2017
```
llvm-svn: 295182
```
1746e215
[InlineFunction] use getCaller(); NFCI · 32d753ca
Sanjay Patel authored Feb 15, 2017
```
llvm-svn: 295181
```
32d753ca
[InlineFunction] use range-for loop; NFCI · ada717e2
Sanjay Patel authored Feb 15, 2017
```
llvm-svn: 295179
```
ada717e2
[X86] Regenerate i64 ext-load on 32-bit target tests · a0e56d2d
Simon Pilgrim authored Feb 15, 2017
```
llvm-svn: 295177
```
a0e56d2d

Attempt to fix buildbots after commit of r295173. · 5c8e5f37

David Bozier authored Feb 15, 2017

The unit test needs to check the endianness of the host platform (it was failing on big-endian hosts).

llvm-svn: 295174

5c8e5f37
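A test whose expected bytes depend on host byte order can branch on a runtime endianness probe. The sketch below uses only the standard library; the function names are hypothetical and this is not the code from the commit above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true when the least significant byte of an integer is stored first.
bool isLittleEndianHost() {
  const std::uint32_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// The byte a host stores first for a natively laid out 32-bit value.
unsigned char expectedFirstByte(std::uint32_t value) {
  return isLittleEndianHost() ? static_cast<unsigned char>(value & 0xFF)
                              : static_cast<unsigned char>(value >> 24);
}

// Test-style check that passes on both little- and big-endian hosts.
void checkNativeLayout(std::uint32_t value) {
  unsigned char first = 0;
  std::memcpy(&first, &value, 1);
  assert(first == expectedFirstByte(value));
}
```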

Fix incorrect formatting of DataRefImpl members in operator<< function · 4ab9a06f

David Bozier authored Feb 15, 2017

Changed format specifiers to use the format macro constant for the pointer type.
Moved the width part of the format specifier to the correct place for formatting members a and b.

Added a unit test to confirm the output.

Differential Revision: https://reviews.llvm.org/D28957

llvm-svn: 295173

4ab9a06f
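As an illustration of the two points in the commit above, the sketch below formats a pointer-sized integer with the standard `PRIxPTR` macro from `<cinttypes>`, placing the width between `%` and the macro. The function name and the width of 8 are assumptions chosen for the example, not taken from the patch.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a pointer-sized integer as "0x" followed by zero-padded hex digits.
std::string formatPtrValue(std::uintptr_t value) {
  char buffer[32];
  // PRIxPTR expands to the correct length modifier and conversion for
  // uintptr_t on this platform; the flag and width ("08") must come before it.
  const int written =
      std::snprintf(buffer, sizeof(buffer), "0x%08" PRIxPTR, value);
  assert(written > 0 && static_cast<std::size_t>(written) < sizeof(buffer));
  return std::string(buffer, static_cast<std::size_t>(written));
}
```

Writing the width after the conversion character would print it as literal text instead of applying it as padding.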

[X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise ZERO inputs · 0f0e5bd3

Simon Pilgrim authored Feb 15, 2017

Add support for specifying an UNPCK input as ZERO. This particularly improves ZEXT cases with non-zero offsets.

llvm-svn: 295169

0f0e5bd3
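Why a ZERO input makes an unpack behave like a zero extension can be seen in a scalar model. The sketch below imitates a byte-wise unpack-low (the PUNPCKLBW pattern) on plain arrays; the helper names are hypothetical, and the non-zero-offset cases mentioned in the commit are not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar model of a byte-wise unpack-low: interleaves the low 8 bytes of `a`
// and `b` as a0, b0, a1, b1, ...
Bytes16 unpackLowBytes(const Bytes16 &a, const Bytes16 &b) {
  Bytes16 out{};
  for (std::size_t i = 0; i < 8; ++i) {
    out[2 * i] = a[i];
    out[2 * i + 1] = b[i];
  }
  return out;
}

// Reads 16-bit lane `lane`, assuming the low byte of each lane is stored first.
std::uint16_t lane16(const Bytes16 &v, std::size_t lane) {
  assert(lane < 8);
  return static_cast<std::uint16_t>(v[2 * lane] | (v[2 * lane + 1] << 8));
}
```

When `b` is all zeros, every 16-bit lane of the result equals the corresponding byte of `a`, which is exactly a zero extension of the low eight bytes.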

[LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el · ec657929

Sagar Thakur authored Feb 15, 2017

Summary: Adds support for xray instrumentation on mips for both 32-bit and 64-bit.

Reviewed by sdardis, dberris
Differential: D27697

llvm-svn: 295164

ec657929

Revert r295110 and r295144. · eef9b033

Daniel Jasper authored Feb 15, 2017

This fails under ASAN:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/798/steps/check-llvm%20asan/logs/stdio

llvm-svn: 295162

eef9b033

[X86][AVX] Remove REX_W from AVX instructions. · b8a4f255

Ayman Musa authored Feb 15, 2017

REX_W has no meaning in VEX-encoded AVX instructions.

Differential Revision: https://reviews.llvm.org/D29894

llvm-svn: 295157

b8a4f255

[X86] Don't create VBROADCAST nodes with 256-bit or 512-bit input types · fbc7805e

Craig Topper authored Feb 15, 2017

Summary:
We don't seem to have great rules on what a valid VBROADCAST node looks like. And as a consequence we end up with a lot of patterns to try to catch everything. We have patterns with scalar inputs, 128-bit vector inputs, 256-bit vector inputs, and 512-bit vector inputs.

As you can see from the things improved here we are currently missing patterns for 128-bit loads being extended to 256-bit before the vbroadcast.

I'd like to propose that VBROADCAST should always take a 128-bit vector type as input. As a first step towards that this patch adds an EXTRACT_SUBVECTOR in front of VBROADCAST when the input is 256 or 512-bits. In the future I would like to add scalar_to_vector around all the scalar operations. And maybe we should consider adding a VBROADCAST+load node to avoid separating loads from the broadcasting operation when the load itself isn't foldable.

This requires an additional change in target shuffle combining to look for the extract subvector and look through it to find the original operand. I'm sure this change isn't perfect but was enough to fix a few test failures that were being caused.

Another interesting thing I noticed is that the changes in masked_gather_scatter.ll show cases where we don't remove a useless insert into element 1 before broadcasting element 0.

Reviewers: delena, RKSimon, zvi

Reviewed By: zvi

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D28747

llvm-svn: 295155

fbc7805e

[AVX-512] Add PACKSS/PACKUS instructions to load folding tables. · ec5df5f4
Craig Topper authored Feb 15, 2017
```
llvm-svn: 295154
```
ec5df5f4

[SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where... · 96ec7a23

Craig Topper authored Feb 15, 2017

[SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where inputs are larger than the mask

Summary:
The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges, determines if they can be used in an extract, and creates a properly aligned start index for the extract.

This range finding is unnecessary; we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each index, we can't use an extract.

Reviewers: zvi, RKSimon

Reviewed By: zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29926

llvm-svn: 295152

96ec7a23
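The single-loop idea from the commit above can be sketched outside of SelectionDAG. The function below is a simplified illustration with hypothetical names, assuming two inputs of `srcNumElts` elements each, a mask no longer than one input, and negative mask entries standing for undef.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Returns true when, for each input, all defined mask indices fall inside one
// mask-sized, mask-aligned window. startIdx[i] receives the window start for
// input i, or -1 if that input is unused.
bool findExtractStarts(const std::vector<int> &mask, int srcNumElts,
                       std::array<int, 2> &startIdx) {
  const int maskSize = static_cast<int>(mask.size());
  assert(maskSize > 0 && srcNumElts >= maskSize);
  startIdx = {-1, -1};
  for (int idx : mask) {
    if (idx < 0)
      continue;                        // undef element, matches anything
    int input = 0;
    if (idx >= srcNumElts) {           // index refers to the second input
      input = 1;
      idx -= srcNumElts;
    }
    const int aligned = idx - idx % maskSize;  // align down to the mask size
    if (aligned + maskSize > srcNumElts)
      return false;                    // window would run past the input
    if (startIdx[input] >= 0 && startIdx[input] != aligned)
      return false;                    // indices disagree on the window
    startIdx[input] = aligned;
  }
  return true;
}
```

No separate range-tracking pass is needed: a mismatch between two aligned start indices for the same input is detected as soon as it occurs.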

[Orc][RPC] Add an AsyncHandlerTraits specialization for non-value-type response · 56b3d6b1

Lang Hames authored Feb 15, 2017

handler args.

The specialization just inherits from the std::decay'd response handler type.
This allows member functions (via MemberFunctionWrapper) to be used as async
handlers.

llvm-svn: 295151

56b3d6b1
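The pattern of a traits specialization that inherits from the traits of the decayed type can be shown with a reduced example. `HandlerTraits` below is a hypothetical, much simplified class, not the ORC RPC `AsyncHandlerTraits`, and it only covers handlers stored as `std::function<void(ResultT)>`.

```cpp
#include <functional>
#include <type_traits>

// Primary template, deliberately left undefined.
template <typename HandlerT> struct HandlerTraits;

// Base case: a handler held by value.
template <typename ResultT>
struct HandlerTraits<std::function<void(ResultT)>> {
  using ResultType = ResultT;
};

// Non-value handler types (lvalue and rvalue references, possibly
// const-qualified) inherit everything from the traits of the decayed type.
template <typename HandlerT>
struct HandlerTraits<HandlerT &>
    : HandlerTraits<typename std::decay<HandlerT>::type> {};

template <typename HandlerT>
struct HandlerTraits<HandlerT &&>
    : HandlerTraits<typename std::decay<HandlerT>::type> {};

static_assert(
    std::is_same<HandlerTraits<const std::function<void(int)> &>::ResultType,
                 int>::value,
    "reference handler types resolve to the value-type traits");
```

Restricting the forwarding specializations to reference types keeps the inheritance from recursing on a type that is already decayed.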

AssumptionCache: Update documentation comment. · 96e36a67

Peter Collingbourne authored Feb 15, 2017

The comment was somewhat misleading in that it implied that passes were not
responsible for adding new assumptions to the assumption cache. This new
wording now explicitly mentions that they are required to do so.

Differential Revision: https://reviews.llvm.org/D29977

llvm-svn: 295148

96e36a67

SimplifyCFG: Register cloned assume intrinsics with assumption cache when creating critical edge. · 0609acc1
Peter Collingbourne authored Feb 15, 2017
```
Differential Revision: https://reviews.llvm.org/D29976

llvm-svn: 295145
```
0609acc1

WholeProgramDevirt: Separate the code that applies optzns from the code that... · e2367415

Peter Collingbourne authored Feb 15, 2017

WholeProgramDevirt: Separate the code that applies optzns from the code that decides whether to apply them. NFCI.

The idea is that the apply* functions will also be called when importing
devirt optimizations.

Differential Revision: https://reviews.llvm.org/D29745

llvm-svn: 295144

e2367415

Revert r295138: Instead of a series of string operations, use snprintf(). · 4b58f577
Rui Ueyama authored Feb 15, 2017
```
This broke buildbots.

llvm-svn: 295142
```
4b58f577
Instead of a series of string operations, use snprintf(). · aae04a9a
Rui Ueyama authored Feb 15, 2017
```
llvm-svn: 295138
```
aae04a9a
Return early. NFC. · a39d148a
Rui Ueyama authored Feb 15, 2017
```
llvm-svn: 295137
```
a39d148a
Use LLVM-style naming scheme. · 789c4220
Rui Ueyama authored Feb 15, 2017
```
llvm-svn: 295136
```
789c4220

[AMDGPU] Fix MaxWorkGroupsPerCU for large workgroups · 19f98c6a

Stanislav Mekhanoshin authored Feb 15, 2017

This patch corrects the maximum number of workgroups per CU when workgroups
are large (more than 128). This value feeds into the occupancy calculation
with respect to LDS size.

Differential Revision: https://reviews.llvm.org/D29974

llvm-svn: 295134

19f98c6a

Use LLVM-style naming scheme. · 09786c4c
Rui Ueyama authored Feb 15, 2017
```
llvm-svn: 295132
```
09786c4c
Remove useless local variable. · 143b52c5
Rui Ueyama authored Feb 15, 2017
```
llvm-svn: 295131
```
143b52c5
Split WinCOFFObjectWriter::defineSection. NFC. · 24e27b47
Rui Ueyama authored Feb 15, 2017
```
llvm-svn: 295128
```
24e27b47
Simplify WinCOFFObjectWriter by removing a template member function. · dfc8aa8e
Rui Ueyama authored Feb 14, 2017
```
llvm-svn: 295126
```
dfc8aa8e
Do not lookup a DenseMap twice using the same key. · 0fcdb48c
Rui Ueyama authored Feb 14, 2017
```
llvm-svn: 295124
```
0fcdb48c
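The general technique named in the commit title above is to let a single insert do both the lookup and the insertion. The sketch below shows it with `std::unordered_map` rather than `llvm::DenseMap`; `getOrAssignId` is a hypothetical helper and not the code that was changed.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Returns the id for `name`, assigning the next free id on first sight.
// insert() hashes the key once and reports whether the key was already there,
// so no separate find() or count() call is needed.
unsigned getOrAssignId(std::unordered_map<std::string, unsigned> &ids,
                       const std::string &name) {
  const unsigned next = static_cast<unsigned>(ids.size());
  auto result = ids.insert({name, next});  // pair of {iterator, inserted}
  assert(result.first->first == name);
  return result.first->second;             // existing id or the new one
}
```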
Use endian::write32le instead of endian::write. · 86e3ef92
Rui Ueyama authored Feb 14, 2017
```
llvm-svn: 295120
```
86e3ef92
Use zero-initialization instead of memset. · cbb4e7c1
Rui Ueyama authored Feb 14, 2017
```
llvm-svn: 295119
```
cbb4e7c1
[libFuzzer] increase the size of FixedWord from 27 to 64, see PR31950 · 32c5004c
Kostya Serebryany authored Feb 14, 2017
```
llvm-svn: 295117
```
32c5004c

Feb 14, 2017

Disable wrapping llvm-xray YAML output · 9afed037

Dimitry Andric authored Feb 14, 2017

Summary:
The YAML output produced by llvm-xray is supposed to be wrapped at the
arbitrary default of 70 columns set by `yaml::Output`. Unfortunately,
the wrapping is rather unpredictable, and can easily go past the set
number of columns, depending on the execution environment.

To make the YAML output environment-independent, disable wrapping
instead.

Reviewers: dberris

Reviewed By: dberris

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D29962

llvm-svn: 295116

9afed037

Fix a bug in caller's BFI update code after inlining. · 5a12f236

Easwaran Raman authored Feb 14, 2017

Multiple blocks in the callee can be mapped to a single cloned block
since we prune the callee as we clone it. The existing code
iterates over the value map and clones the block frequency (and
eventually scales the frequencies of the cloned blocks). Value map's
iteration is not deterministic and so the cloned block might get the
frequency of any of the original blocks. The fix is to set the max of
the original frequencies to the cloned block. The first block in the
sequence must have this max frequency and, in the call context,
subsequent blocks must have its frequency.

Differential Revision: https://reviews.llvm.org/D29696

llvm-svn: 295115

5a12f236
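The order-independence argument in the commit above can be checked with a small model. The sketch below is not the inliner code: block ids are plain integers, frequencies are raw counts, and `cloneFrequencies` is a hypothetical helper that takes the maximum over all original blocks mapped to the same clone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using BlockId = int;  // hypothetical stand-in for a basic block

// For each cloned block, take the maximum frequency of all original blocks
// mapped to it. The result does not depend on the order of `origToClone`.
std::map<BlockId, std::uint64_t> cloneFrequencies(
    const std::vector<std::pair<BlockId, BlockId>> &origToClone,
    const std::map<BlockId, std::uint64_t> &origFreq) {
  std::map<BlockId, std::uint64_t> cloneFreq;
  for (const auto &entry : origToClone) {
    auto it = origFreq.find(entry.first);
    assert(it != origFreq.end() && "every original block needs a frequency");
    std::uint64_t &slot = cloneFreq[entry.second];  // starts at 0
    slot = std::max(slot, it->second);
  }
  return cloneFreq;
}
```

Simply copying the frequency on each visit would leave the clone with whichever original happened to be visited last, which is the non-determinism the commit describes.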

Use "%zd" format specifier for printing number of testcases executed. · ae579a79

Kostya Serebryany authored Feb 14, 2017

Summary:
This helps to avoid signed integer overflow after running a fast fuzz target for several hours, e.g.:

<...>
Done -1097903291 runs in 54001 second(s)



Reviewers: kcc

Reviewed By: kcc

Differential Revision: https://reviews.llvm.org/D29941

llvm-svn: 295112

ae579a79
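The negative run count quoted in the commit above is what a 32-bit signed view of a larger counter looks like. The sketch below assumes, for illustration only, that the true count was 3197064005 (any value congruent to it modulo 2^32 would print the same), and it uses `%zu` because the value is held in an unsigned `size_t`; `%zd` is the signed counterpart named in the commit title.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a run counter with the size_t length modifier, so the full value is
// printed regardless of how wide int is.
std::string formatRuns(std::size_t runs) {
  char buffer[32];
  const int written = std::snprintf(buffer, sizeof(buffer), "%zu", runs);
  assert(written > 0 && static_cast<std::size_t>(written) < sizeof(buffer));
  return std::string(buffer, static_cast<std::size_t>(written));
}

// What the same counter looks like when squeezed into a 32-bit signed int.
// The wraparound is guaranteed from C++20 on and is what mainstream compilers
// do in earlier standards as well.
std::int32_t asWrappedInt32(std::size_t runs) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(runs));
}
```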

[LV] Rename Induction to PrimaryInduction. NFC. · 569162fe
Michael Kuperstein authored Feb 14, 2017
```
llvm-svn: 295111
```
569162fe

WholeProgramDevirt: Change internal vcall data structures to match summary. · 534c0175

Peter Collingbourne authored Feb 14, 2017

Group calls into constant and non-constant arguments up front, and use uint64_t
instead of ConstantInt to represent constant arguments. The goal is to allow
the information from the summary to fit naturally into this data structure in
a future change (specifically, it will be added to CallSiteInfo).

This has two side effects:
- We disallow VCP for constant integer arguments of width >64 bits.
- We remove the restriction that the bitwidth of a vcall's argument and return
  types must match those of the vfunc definitions.
I don't expect either of these to matter in practice. The first case is
uncommon, and the second one will lead to UB (so we can do anything we like).

Differential Revision: https://reviews.llvm.org/D29744

llvm-svn: 295110

534c0175
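The grouping described in the commit above can be pictured with a reduced data structure. The sketch below is a simplification with hypothetical types (`Arg`, `CallSite`, `CallSiteGroups`), not the WholeProgramDevirt code: call sites whose arguments are all constants of at most 64 bits are keyed by their `uint64_t` values, and everything else lands in a single non-constant group.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical description of one call argument.
struct Arg {
  bool isConstant;
  unsigned bitWidth;
  std::uint64_t value;  // meaningful only for constants of width <= 64
};

struct CallSite {
  int id;
  std::vector<Arg> args;
};

struct CallSiteGroups {
  std::vector<int> nonConstant;  // ids of calls with unusable arguments
  std::map<std::vector<std::uint64_t>, std::vector<int>> byConstArgs;
};

CallSiteGroups groupCallSites(const std::vector<CallSite> &sites) {
  CallSiteGroups groups;
  for (const CallSite &site : sites) {
    std::vector<std::uint64_t> key;
    bool usable = true;
    for (const Arg &arg : site.args) {
      assert(arg.bitWidth > 0);
      // Constants wider than 64 bits cannot be represented as uint64_t.
      if (!arg.isConstant || arg.bitWidth > 64) {
        usable = false;
        break;
      }
      key.push_back(arg.value);
    }
    if (usable)
      groups.byConstArgs[key].push_back(site.id);
    else
      groups.nonConstant.push_back(site.id);
  }
  return groups;
}
```

Keying on plain integers rather than IR constants is what would let data that does not originate from the IR share the same container.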