Commits · 4506e447c1b6b8bb1da0f4bd2ea3b353d72f4a87 · Lorenzo Albano / LLVM bpEVL

Dec 27, 2016
- [doc] Add mention of the difference in optimization level between Release and... · 4506e447
  Mehdi Amini authored Dec 26, 2016
```
[doc] Add mention of the difference in optimization level between Release and RelWithDebInfo in Cmake.rst

This is surprising to many people.

llvm-svn: 290556
```
  4506e447
- [ADT] Add an llvm::erase_if utility to make the standard erase+remove_if · cc44ab63
  Chandler Carruth authored Dec 26, 2016
```
pattern easier to write.

Differential Revision: https://reviews.llvm.org/D28120

llvm-svn: 290555
```
  cc44ab63
- [InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions · c9cf7fc7
  Simon Pilgrim authored Dec 26, 2016
```
PMULDQ/PMULUDQ vXi64 instructions only use the even-numbered v2Xi32 input elements, which SimplifyDemandedVectorElts should try to take advantage of.

Differential Revision: https://reviews.llvm.org/D28119

llvm-svn: 290554
```
  c9cf7fc7
- [ADT] Add a boring std::partition wrapper similar to our std::remove_if · d9eaa54e
  Chandler Carruth authored Dec 26, 2016
```
wrapper.

llvm-svn: 290553
```
  d9eaa54e
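
The two ADT commits above (`llvm::erase_if` and the `std::partition` wrapper) describe range-style helpers over standard algorithms. The following is a minimal standalone sketch of that idea, not the code from `STLExtras.h`: the `sketch` namespace and the `dropOddsAndGroupMultiplesOfFour` helper are hypothetical names introduced for illustration, and the signatures are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

namespace sketch {

// Range-based wrapper over std::remove_if (assumed shape of the existing
// wrapper that the commit message refers to).
template <typename Range, typename Pred>
auto remove_if(Range &&R, Pred P) -> decltype(std::begin(R)) {
  return std::remove_if(std::begin(R), std::end(R), P);
}

// Range-based wrapper over std::partition, in the same style.
template <typename Range, typename Pred>
auto partition(Range &&R, Pred P) -> decltype(std::begin(R)) {
  return std::partition(std::begin(R), std::end(R), P);
}

// The erase+remove_if idiom folded into a single call.
template <typename Container, typename Pred>
void erase_if(Container &C, Pred P) {
  C.erase(sketch::remove_if(C, P), C.end());
}

} // namespace sketch

// Hypothetical usage helper: drop odd values, then move the multiples of
// four to the front. Returns how many multiples of four there are.
inline std::size_t dropOddsAndGroupMultiplesOfFour(std::vector<int> &V) {
  auto IsOdd = [](int X) { return X % 2 != 0; };
  sketch::erase_if(V, IsOdd);
  assert(std::none_of(V.begin(), V.end(), IsOdd) && "odd values remain");
  auto Mid = sketch::partition(V, [](int X) { return X % 4 == 0; });
  return static_cast<std::size_t>(std::distance(V.begin(), Mid));
}
```

The point of the wrappers is only to remove the repeated `begin()`/`end()` boilerplate; the underlying algorithms and their complexity are unchanged.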
Dec 26, 2016

Update comment to match dr1770. · 4f9b3f4a
Richard Smith authored Dec 26, 2016
```
llvm-svn: 290552
```
4f9b3f4a
clang-format NewGVN files · 85f91b0e
Daniel Berlin authored Dec 26, 2016
```
llvm-svn: 290551
```
85f91b0e

Misc cleanups and simplifications for NewGVN. · 85cbc8c0

Daniel Berlin authored Dec 26, 2016

Mostly use a bit more idiomatic C++ where we can,
so we can combine some things later.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28111

llvm-svn: 290550

85cbc8c0

Don't use our own incorrect version of isTriviallyDeadInstruction in NewGVN. Fixes PR/31472 · d59e8010
Daniel Berlin authored Dec 26, 2016
```
llvm-svn: 290549
```
d59e8010

[NewGVN] Add a flag to enable the pass via `-mllvm`. · fe7a3ee5

Davide Italiano authored Dec 26, 2016

NewGVN can be tested passing `-mllvm -enable-newgvn` to clang.

Differential Revision: https://reviews.llvm.org/D28059

llvm-svn: 290548

fe7a3ee5

Wdocumentation fix · 6f3e1ea4
Simon Pilgrim authored Dec 26, 2016
```
llvm-svn: 290547
```
6f3e1ea4

[NewGVN] Change test to reflect difference between GVN and NewGVN. · 8ea5e4fc

Davide Italiano authored Dec 26, 2016

The current GVN algorithm folds unconditional branches in order, it claims,
to expose more PRE opportunities. The folding, if it is needed at all
(which is unclear, as it has not been shown to improve the analysis),
can be done by an earlier cleanup pass instead of by GVN itself.
Ack'ed/SGTM'd by Daniel Berlin.

Differential Revision: https://reviews.llvm.org/D28117

llvm-svn: 290546

8ea5e4fc

Wdocumentation fix · cd9d7294
Simon Pilgrim authored Dec 26, 2016
```
llvm-svn: 290545
```
cd9d7294
[X86][AVX512] Added v64i8 reverse shuffle test (PR31470) · e8a5ab35
Simon Pilgrim authored Dec 26, 2016
```
llvm-svn: 290544
```
e8a5ab35
[NewGVN] Fold lookupOperandLeader() when there's only one use. NFCI. · a312ca84
Davide Italiano authored Dec 26, 2016
```
llvm-svn: 290543
```
a312ca84
[InstCombiner] Simplify lib calls to `round{,f}` · b5e03b61
Bryant Wong authored Dec 26, 2016
```
Differential Revision: https://reviews.llvm.org/D28110

llvm-svn: 290542
```
b5e03b61
Fix build error caused by r290539. · c5cf7a8b
Marina Yatsina authored Dec 26, 2016
```
llvm-svn: 290541
```
c5cf7a8b

[inline-asm]No error for conflict between inputs\outputs and clobber list · 168b9546

Marina Yatsina authored Dec 26, 2016

Updated test according to commit 290539:

According to the extended asm syntax, a clobber list that includes a register used by one of the inputs or outputs is a conflict and should be an error.
For example:

    const long double a = 0.0;
    int main()
    {
        char b;
        double t1 = a;
        __asm__ ("fucompp": "=a" (b) : "u" (t1), "t" (t1) : "cc", "st", "st(1)");

        return 0;
    }

This should be reported as a conflict: the operand t1 is bound to st, and st is listed as a clobber as well.
The patch fixes it.

Commit on behalf of Ziv Izhar.

Differential Revision: https://reviews.llvm.org/D15075

llvm-svn: 290540

168b9546
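
The check described in this commit message can be illustrated with a small standalone sketch. This is not the code from the patch: `registerForConstraint` and `findClobberConflicts` are hypothetical names, and the constraint table is a deliberately tiny assumption covering only the three constraints that appear in the example (`a`, `t`, `u`). Real targets have many more constraints and register aliases.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified mapping from a single-register constraint to a
// register name. Leading modifiers such as '=', '+' and '&' are ignored.
inline std::string registerForConstraint(const std::string &Constraint) {
  assert(!Constraint.empty() && "constraint strings must not be empty");
  static const std::map<std::string, std::string> Table = {
      {"a", "ax"}, {"t", "st"}, {"u", "st(1)"}};
  std::string Key = Constraint;
  while (!Key.empty() && (Key[0] == '=' || Key[0] == '+' || Key[0] == '&'))
    Key.erase(0, 1);
  auto It = Table.find(Key);
  return It == Table.end() ? std::string() : It->second;
}

// Returns every clobber that names a register already bound to an input or
// output operand. Non-register clobbers such as "cc" never match.
inline std::vector<std::string>
findClobberConflicts(const std::vector<std::string> &OperandConstraints,
                     const std::vector<std::string> &Clobbers) {
  std::set<std::string> Used;
  for (const std::string &C : OperandConstraints) {
    std::string Reg = registerForConstraint(C);
    if (!Reg.empty())
      Used.insert(Reg);
  }
  std::vector<std::string> Conflicts;
  for (const std::string &Clobber : Clobbers)
    if (Used.count(Clobber))
      Conflicts.push_back(Clobber);
  return Conflicts;
}
```

For the example in the commit message, calling `findClobberConflicts({"=a", "u", "t"}, {"cc", "st", "st(1)"})` reports both `st` and `st(1)`, since both are bound to operands and also listed as clobbers.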

[inline-asm]No error for conflict between inputs\outputs and clobber list · c42fd03b

Marina Yatsina authored Dec 26, 2016

According to the extended asm syntax, a clobber list that includes a register used by one of the inputs or outputs is a conflict and should be an error.
For example:

    const long double a = 0.0;
    int main()
    {
        char b;
        double t1 = a;
        __asm__ ("fucompp": "=a" (b) : "u" (t1), "t" (t1) : "cc", "st", "st(1)");

        return 0;
    }

This should be reported as a conflict: the operand t1 is bound to st, and st is listed as a clobber as well.
The patch fixes it.

Commit on behalf of Ziv Izhar.

Differential Revision: https://reviews.llvm.org/D15075

llvm-svn: 290539

c42fd03b

Update to isl-0.18-17-g2844ebf · 60094135

Tobias Grosser authored Dec 26, 2016

This update improves isl's ability to coalesce different convex sets/maps,
especially when they contain existentially quantified variables.

llvm-svn: 290538

60094135

Test the different scenarios of GlobalDCE and comdats more · 80db76d5

Chandler Carruth authored Dec 26, 2016

systematically and document in the test what all is going on.

This replaces the PR-named test that was the only coverage for GlobalDCE
and comdats previously. I wrote this because I wasn't certain how
comdat DCE was supposed to work and wanted to step through what
GlobalDCE did to fully understand it. After talking to folks and reading
the code and really staring at things it all makes sense but it seemed
good to help write down some of this in a more explicit and fully
covering test case.

For example, it seemed like a bug that GlobalDCE didn't consider comdat
participation of ifuncs. Specifically it seemed like an accident because
testing didn't really cover that case. But in fact, ifuncs specifically
cannot participate in a comdat despite having that API. The new test
case covers this and explicitly documents that DCE gets to fire here
even though there are comdats involved.

Also, we didn't have any positive tests for the challenging cases such
as usage cycles between comdat participants that might make them seem
alive except that there is no external edge into the cycle.

llvm-svn: 290537

80db76d5
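
The comdat behaviour discussed in this commit message can be modelled with a small mark-and-sweep sketch. It is a simplified illustration under stated assumptions, not GlobalDCE itself: the `Global` struct and `computeLive` are hypothetical, globals are identified by index, and the only comdat rule modelled is that keeping one member of a comdat keeps all of its members.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a global value for this sketch.
struct Global {
  std::vector<std::size_t> Uses; // indices of the globals this one references
  int Comdat;                    // comdat group id, or -1 for none
  bool IsRoot;                   // externally visible, so always kept
};

// Marks everything reachable from the roots. When a comdat member becomes
// live, every other member of the same comdat becomes live too. Anything
// left unmarked is dead, including reference cycles between comdat members
// that have no edge coming in from outside.
inline std::vector<bool> computeLive(const std::vector<Global> &Globals) {
  std::vector<bool> Live(Globals.size(), false);
  std::vector<std::size_t> Worklist;
  auto Mark = [&](std::size_t I) {
    assert(I < Globals.size() && "reference to unknown global");
    if (!Live[I]) {
      Live[I] = true;
      Worklist.push_back(I);
    }
  };
  for (std::size_t I = 0; I < Globals.size(); ++I)
    if (Globals[I].IsRoot)
      Mark(I);
  while (!Worklist.empty()) {
    std::size_t I = Worklist.back();
    Worklist.pop_back();
    for (std::size_t U : Globals[I].Uses)
      Mark(U);
    if (Globals[I].Comdat >= 0)
      for (std::size_t J = 0; J < Globals.size(); ++J)
        if (Globals[J].Comdat == Globals[I].Comdat)
          Mark(J);
  }
  return Live;
}
```

In this model, two comdat members that only reference each other stay dead, while an otherwise unused member of a comdat is kept as soon as any other member of that comdat is reachable.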

[AVX-512] Fix some patterns to use extended register classes. · 5ef13ba1
Craig Topper authored Dec 26, 2016
```
llvm-svn: 290536
```
5ef13ba1

[AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with... · 7b788ada

Craig Topper authored Dec 26, 2016

[AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.

Summary:
I only do this for unmasked cases for now because isel is failing to fold the mask. I'll try to fix that soon.

I'll do the same thing for packed add/sub/mul/div in a future patch.

Reviewers: delena, RKSimon, zvi, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27879

llvm-svn: 290535

7b788ada

test: add explicit triples to the invocation · c47e1aab
Saleem Abdulrasool authored Dec 26, 2016
```
llvm-svn: 290534
```
c47e1aab

Driver: warn on -fPIC/-fpic/-fPIE/-fpie on Windows · d133dc22

Saleem Abdulrasool authored Dec 26, 2016

Use of these flags would result in the use of ELF-style PIE/PIC code
which is incorrect on Windows.  Windows is inherently PIC by means of
the DLL slide that occurs at load.  This also mirrors the behaviour of
GCC for MinGW.  Currently, Windows x86_64 forces the relocation
model to PIC (Level 2).  This is unchanged for now, though we should
remove any assumptions on that and change it to a static relocation
model.

llvm-svn: 290533

d133dc22

[AVX-512] Don't assume that the rounding mode argument to intrinsics is a... · f56d985f

Craig Topper authored Dec 26, 2016

[AVX-512] Don't assume that the rounding mode argument to intrinsics is a constant. While clang will guarantee this, nothing in the backend will.

A non-constant value will now result in an isel error instead of just asserting or crashing due to a bad cast during lowering.

llvm-svn: 290532

f56d985f

Fix some bad indentation that I or another introduced somehow. · 0cf829c1
Chandler Carruth authored Dec 26, 2016
```
llvm-svn: 290531
```
0cf829c1

[AVX-512][InstCombine] Teach InstCombine to convert masked vpermv intrinsics... · e3280457

Craig Topper authored Dec 25, 2016

[AVX-512][InstCombine] Teach InstCombine to convert masked vpermv intrinsics into shufflevector instructions

Summary:
This patch adds support for converting the masked vpermv intrinsics into shufflevector instructions if the indices are constants.

We also need to wrap a select instruction around the shuffle to take care of the masking part. InstCombine will take care of optimizing the select if the mask is constant so I didn't bother checking for that.

Reviewers: zvi, delena, spatel, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27825

llvm-svn: 290530

e3280457
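
The shuffle-plus-select decomposition described in this summary can be modelled on plain arrays. This is a scalar sketch of the semantics only, not LLVM IR and not the InstCombine code: `shuffleLanes`, `selectLanes`, and `maskedPermute` are hypothetical names, and reducing each index modulo the lane count is an assumption that stands in for the hardware using only the low index bits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Unmasked permute: lane I of the result is Src[Idx[I]], with the index
// reduced modulo N (assumption: N is a power of two).
template <std::size_t N>
std::array<std::int32_t, N>
shuffleLanes(const std::array<std::int32_t, N> &Src,
             const std::array<std::size_t, N> &Idx) {
  static_assert(N > 0, "need at least one lane");
  std::array<std::int32_t, N> Out{};
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = Src[Idx[I] % N];
  return Out;
}

// Per-lane select: bit I of Mask picks OnTrue[I], otherwise OnFalse[I].
template <std::size_t N>
std::array<std::int32_t, N>
selectLanes(std::uint32_t Mask, const std::array<std::int32_t, N> &OnTrue,
            const std::array<std::int32_t, N> &OnFalse) {
  static_assert(N <= 32, "mask is modelled as a 32-bit integer");
  std::array<std::int32_t, N> Out{};
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = ((Mask >> I) & 1u) ? OnTrue[I] : OnFalse[I];
  return Out;
}

// Masked permute expressed as "select(mask, shuffle(src, idx), passthru)".
template <std::size_t N>
std::array<std::int32_t, N>
maskedPermute(const std::array<std::int32_t, N> &Src,
              const std::array<std::size_t, N> &Idx,
              const std::array<std::int32_t, N> &PassThru,
              std::uint32_t Mask) {
  return selectLanes(Mask, shuffleLanes(Src, Idx), PassThru);
}
```

With an all-ones mask the select is a no-op and only the shuffle remains, which is the case the summary expects InstCombine to simplify on its own.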

Fix `update_test_checks.py` bug that incorrectly truncates IR body. · c6b46d80
Bryant Wong authored Dec 25, 2016
```
Differential Revision: https://reviews.llvm.org/D26619

llvm-svn: 290529
```
c6b46d80

[ADT] Add a generic concatenating iterator and range (take 2). · cb22b89f

Chandler Carruth authored Dec 25, 2016

This recommits r290512 that was reverted when MSVC failed to compile it. Since
then I've played with various approaches using rextester.com (where I was able
to reproduce the failure) and think that I have a solution thanks in part to
the help of Dave Blaikie! It seems MSVC just has a defective `decltype` in this
version. Manually writing out the type seems to do the trick, even though it is
.... quite complicated.

Original commit message:
This allows both defining convenience iterator/range accessors on types
which walk across N different independent ranges within the object, and
more direct and simple usages with range based for loops such as shown
in the unittest. The same facilities are used for both. They end up
quite small and simple as it happens.

I've also switched an iterator on `Module` to use this. I would like to
add another convenience iterator that includes even more sequences as
part of it and seeing this one already present motivated me to actually
abstract it away and introduce a general utility.

Differential Revision: https://reviews.llvm.org/D28093

llvm-svn: 290528

cb22b89f
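
To make the idea of a concatenating range concrete, here is a deliberately restricted sketch. It is not the ADT utility added by this commit: `ConcatRange` and `collect` are hypothetical, the sketch handles exactly two `std::vector`s of the same element type, and it uses an index rather than wrapping the underlying iterators.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Presents two vectors as one sequence without copying their elements.
// The range stores pointers, so both vectors must outlive it.
template <typename T> class ConcatRange {
  const std::vector<T> *First;
  const std::vector<T> *Second;

public:
  class iterator {
    const ConcatRange *Parent;
    std::size_t Pos; // position in the concatenated sequence

  public:
    iterator(const ConcatRange *Parent, std::size_t Pos)
        : Parent(Parent), Pos(Pos) {}
    const T &operator*() const {
      const std::size_t N = Parent->First->size();
      assert(Pos < N + Parent->Second->size() && "dereferencing end()");
      return Pos < N ? (*Parent->First)[Pos] : (*Parent->Second)[Pos - N];
    }
    iterator &operator++() {
      ++Pos;
      return *this;
    }
    bool operator==(const iterator &O) const { return Pos == O.Pos; }
    bool operator!=(const iterator &O) const { return Pos != O.Pos; }
  };

  ConcatRange(const std::vector<T> &A, const std::vector<T> &B)
      : First(&A), Second(&B) {}
  iterator begin() const { return iterator(this, 0); }
  iterator end() const {
    return iterator(this, First->size() + Second->size());
  }
};

// Usage helper: walk both vectors with a single range-based for loop.
template <typename T>
std::vector<T> collect(const std::vector<T> &A, const std::vector<T> &B) {
  std::vector<T> Out;
  for (const T &X : ConcatRange<T>(A, B))
    Out.push_back(X);
  return Out;
}
```

A generic version has to name the common reference type of N different iterator types, which is presumably where the `decltype` trouble mentioned in the commit message came from.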

[MemorySSA] Define a restricted upward AccessList splice. · 4213d941
Bryant Wong authored Dec 25, 2016
```
Differential Revision: https://reviews.llvm.org/D26661

llvm-svn: 290527
```
4213d941

Dec 25, 2016

[AliasAnalysis] Teach BasicAA about memcpy. · a07d9b14
Bryant Wong authored Dec 25, 2016
```
Differential Revision: https://reviews.llvm.org/D27034

llvm-svn: 290526
```
a07d9b14

Value number stores and memory states so we can detect when memory states are... · d7c12ee5

Daniel Berlin authored Dec 25, 2016

Value number stores and memory states so we can detect when memory states are equivalent (i.e., a store of the same value to memory).

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28084

llvm-svn: 290525

d7c12ee5

Rename GVNExpression *ops_ members to *op_* to match conventions in the rest of LLVM · 65f5f0d7
Daniel Berlin authored Dec 25, 2016
```
llvm-svn: 290524
```
65f5f0d7

[Orc][RPC] Add a ParallelCallGroup utility for dispatching and waiting on · c9d0ff13

Lang Hames authored Dec 25, 2016

multiple asynchronous RPC calls.

ParallelCallGroup allows multiple asynchronous calls to be dispatched,
and provides a wait method that blocks until all asynchronous calls have
been executed on the remote and all return value handlers run on the
local machine.

This will allow, for example, the JIT client to issue memory allocation calls
for all sections in parallel, then block until all memory has been allocated
on the remote and the allocated addresses registered with the client, at which
point the JIT client can proceed to applying relocations.

llvm-svn: 290523

c9d0ff13
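
The dispatch-then-wait pattern described here can be sketched with a counter guarded by a mutex and a condition variable. This is an illustration of the pattern only, not the Orc RPC API: `CallGroupSketch`, `appendCall`, `outstanding`, and the shape of the `Issue` callback are all assumptions made for this example.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>

// Tracks a group of asynchronous calls and lets the caller block until all
// of their result handlers have run.
class CallGroupSketch {
  std::mutex M;
  std::condition_variable CV;
  std::size_t Outstanding = 0;

public:
  // Issue starts one asynchronous call. It receives a Done callback and must
  // arrange for Done to be invoked exactly once, after the call's result
  // handler has run (possibly on another thread).
  void appendCall(const std::function<void(std::function<void()>)> &Issue) {
    {
      std::lock_guard<std::mutex> Lock(M);
      ++Outstanding;
    }
    Issue([this] {
      std::lock_guard<std::mutex> Lock(M);
      assert(Outstanding > 0 && "Done called more than once");
      --Outstanding;
      CV.notify_all();
    });
  }

  // Blocks until every call appended so far has reported completion.
  void wait() {
    std::unique_lock<std::mutex> Lock(M);
    CV.wait(Lock, [this] { return Outstanding == 0; });
  }

  // Number of calls that have been issued but not yet completed.
  std::size_t outstanding() {
    std::lock_guard<std::mutex> Lock(M);
    return Outstanding;
  }
};
```

In the JIT scenario from the commit message, each memory allocation request would be one `appendCall`, and the client would call `wait()` before applying relocations.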

Fix assertion failure when deducing an auto-typed argument against a different-width int. · 993f2032
Richard Smith authored Dec 25, 2016
```
llvm-svn: 290522
```
993f2032

[sanitizer] Define some CPU type symbols (like CPU_SUBTYPE_X86_64_H) when they're not available. · a6a17738

Kuba Mracek authored Dec 25, 2016

This allows compiler-rt to be built on older macOS SDKs, where these symbols are not defined.

Patch by Jeremy Huddleston Sequoia <jeremyhu@apple.com>.

llvm-svn: 290521

a6a17738

[Orc][RPC] Clang-format RPCUtils header. · aac390ee

Lang Hames authored Dec 25, 2016

Some of the recent RPC call type-checking changes weren't formatted prior to
commit.

llvm-svn: 290520

aac390ee

Add newline to end of file to quiet warnings. · 1eb0bca1
Greg Clayton authored Dec 25, 2016
```
llvm-svn: 290519
```
1eb0bca1

Specify the default values of the cache parameters · 1c2927b2

Roman Gareev authored Dec 25, 2016

If the parameters of the target cache (i.e., cache level sizes, cache level
associativities) are not specified or have wrong values, we set the parameters
of the macro-kernel to one and do not perform data-layout optimizations of
the matrix multiplication. In this patch we specify the default values of the
cache parameters to be able to apply the pattern matching optimizations even in
this case. Since there are no typical values of these parameters, we use the
parameters of the Intel Core i7-3820 (Sandy Bridge), which also help to attain
high performance on the IBM POWER System S822 and IBM Power 730 Express server.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D28090

llvm-svn: 290518

1c2927b2

revert commit 290516 · 86602e85
Michael Zuckerman authored Dec 25, 2016
```
llvm-svn: 290517
```
86602e85