Commits · a3bfb4e313718713df0f2323321b8f0fd3b1fcca · Roger Ferrer / llvm-epi

Jul 21, 2016

[InstSimplify] recognize trunc + icmp sgt/slt variants of select simplifications (PR28466) · a3bfb4e3

Sanjay Patel authored Jul 21, 2016

rL245171 exposed a hole in InstSimplify that manifested in a strange way in PR28466:
https://llvm.org/bugs/show_bug.cgi?id=28466

It's possible to use trunc + icmp sgt/slt in place of an and + icmp eq/ne, so we need to
recognize that pattern to eliminate selects that are choosing between some value and some
bitmasked version of that value.
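
As a rough illustration of the equivalence described above (not code from the patch), the following C++ sketch tests bit 7 of a 32-bit value both ways and shows one select that becomes redundant. The function names, the 8-bit truncation width, and the mask 0x80 are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstdint>

// "and + icmp ne" form: is bit 7 of X set?
bool bit7SetViaMask(uint32_t X) { return (X & 0x80u) != 0; }

// "trunc + icmp slt" form: truncate to 8 bits and ask whether the result
// is negative when read as a signed value. Converting an out-of-range
// value to int8_t is implementation-defined before C++20; this sketch
// assumes the usual two's complement wraparound.
bool bit7SetViaTrunc(uint32_t X) {
  return static_cast<int8_t>(static_cast<uint8_t>(X)) < 0;
}

// A select that chooses between X and a bitmasked version of X.
// If bit 7 is already set, X equals X | 0x80, so both arms agree.
uint32_t selectForm(uint32_t X) {
  return bit7SetViaTrunc(X) ? X : (X | 0x80u);
}

// What the select reduces to once the pattern is recognized.
uint32_t simplifiedForm(uint32_t X) { return X | 0x80u; }
```

For any X, selectForm(X) and simplifiedForm(X) return the same value, which is why the select can be eliminated.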

Note that there is significant room for improvement (refactoring) and enhancement (more
patterns, possibly in InstCombine rather than here).

Differential Revision: https://reviews.llvm.org/D22537

llvm-svn: 276341


[OptDiag,LDist] Convert remaining opt remarks to use the new API · 84a6425d
Adam Nemet authored Jul 21, 2016
```
llvm-svn: 276340
```

[LV] Move vector int induction update to end of latch · 102729cf

Matthew Simpson authored Jul 21, 2016

This patch moves the update instruction for vectorized integer induction phi
nodes to the end of the latch block. This ensures consistent placement of all
induction updates across all the kinds of int inductions we create (scalar,
splat vector, or vector phi).

Differential Revision: https://reviews.llvm.org/D22416

llvm-svn: 276339


Fix the clang-cl self-host with VS 2013 headers · a4de8460

Reid Kleckner authored Jul 21, 2016

std::numeric_limits<int64_t>::max() is not constexpr in VC 2013 headers,
and Clang complains that it isn't. MSVC 2013 itself is emitting a
dynamic initializer for this thing. Instead, use an enum.
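
A minimal sketch of the workaround described above, assuming a constant named MaxInt64 and a helper remainingHeadroom (both hypothetical, neither is taken from the patch). An enumerator is always a constant expression, so it needs no dynamic initializer regardless of how the library declares numeric_limits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name. INT64_MAX is a macro from <cstdint>, so this
// enumerator does not depend on numeric_limits<int64_t>::max() being
// constexpr in the standard library headers.
enum : int64_t { MaxInt64 = INT64_MAX };

// The enumerator is usable wherever a compile-time constant is required.
static_assert(MaxInt64 == 9223372036854775807LL,
              "enumerator must hold the largest int64_t value");

// Hypothetical helper: how far a non-negative value is from the limit.
int64_t remainingHeadroom(int64_t V) {
  assert(V >= 0 && "negative input would overflow the subtraction");
  return MaxInt64 - V;
}
```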

llvm-svn: 276334


Normalize file docs. NFC. · 825a8687

George Burgess IV authored Jul 21, 2016

Having the added `\brief` made doxygen interpret it as the summary for
the `llvm` namespace (visible at:
http://llvm.org/doxygen/namespaces.html).

llvm-svn: 276331


[PGO] Make needsComdatForCounter() available (NFC) · 97b68c5e

Rong Xu authored Jul 21, 2016

Move needsComdatForCounter() to lib/ProfileData/InstrProf.cpp from
lib/Transforms/Instrumentation/InstrProfiling.cpp to make it available to
other files.

Differential Revision: https://reviews.llvm.org/D22643

llvm-svn: 276330


add vector tests and a simpler version of the negative tests · 9eec550a
Sanjay Patel authored Jul 21, 2016
```
llvm-svn: 276328
```

[docs] Move GitHub to GitHubSubMod · b120088d

Renato Golin authored Jul 21, 2016

Given that other proposals are making their way through, it's better if we
specify what GitHub proposal this is, in case there are others that also
involve GitHub, but not sub-modules.

llvm-svn: 276325


Transfer ownership of the XCore backend. · 15e5c886
Richard Osborne authored Jul 21, 2016
```
llvm-svn: 276321
```
Revert "Invariant start/end intrinsics overloaded for address space" · c858faa2
Anna Thomas authored Jul 21, 2016
```
This reverts commit r276316.

llvm-svn: 276320
```
[IndVars] Reflow oddly formatted condition; NFC · ff9eea22
Sanjoy Das authored Jul 21, 2016
```
llvm-svn: 276319
```

Invariant start/end intrinsics overloaded for address space · 29b24dfe

Anna Thomas authored Jul 21, 2016

Summary:
The llvm.invariant.start and llvm.invariant.end intrinsics currently
support specifying invariant memory objects only in the default address space.

With this change, these intrinsics are overloaded for any address space for memory objects
and we can use these llvm invariant intrinsics in non-default address spaces.

Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr)

This overloaded intrinsic is needed for representing final or invariant memory in managed languages.

Reviewers: tstellarAMD, reames, apilipenko

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22519

llvm-svn: 276316


make InstCombine compare helper functions private; NFC · 43395060
Sanjay Patel authored Jul 21, 2016
```
Also, rename some of them for consistency and to follow current conventions.

llvm-svn: 276312
```
Avoid a string copy, NFC · cd32eba6
Vedant Kumar authored Jul 21, 2016
```
llvm-svn: 276310
```
[IRTranslator] Add G_SUB opcode. · 2b59eab7
Quentin Colombet authored Jul 21, 2016
```
This commit adds a generic SUB opcode to global-isel.

llvm-svn: 276308
```

[llvm-config][GlobalISel] Canonicalize LLVM_HAS_GLOBAL_ISEL on ON/OFF. · a4bcc3f0

Quentin Colombet authored Jul 21, 2016

Previously LLVM_HAS_GLOBAL_ISEL would directly get the value of
LLVM_BUILD_GLOBAL_ISEL. This could be any integer value and not just ON
or OFF. The problem is that lit.cfg was checking for ON to determine that
global-isel was supported, so setting LLVM_BUILD_GLOBAL_ISEL to an
integer value, say 1, made this test fail even though we do build
global-isel and want to test it.

llvm-svn: 276307


[CMake][GlobalISel] Turn LLVM_BUILD_GLOBAL_ISEL into an option. NFC. · c8df88c9

Quentin Colombet authored Jul 21, 2016

Previously LLVM_BUILD_GLOBAL_ISEL was a boolean variable and, although
this is strictly identical to an option, it did not convey the
information that the user may set it. Options are here for that.

llvm-svn: 276306


[IRTranslator] Add comments to explain the ordering of the switch. NFC. · 19df8a1a
Quentin Colombet authored Jul 21, 2016
```
Group arithmetic operations, bitwise operations, and branch operations.

llvm-svn: 276305
```

[InstCombine] break up visitICmpInstWithInstAndIntCst(); NFCI · 1710e7cf

Sanjay Patel authored Jul 21, 2016

Making smaller pieces out of some of these ~1000 line functions should make
it easier to incrementally upgrade them to handle vector types.

llvm-svn: 276304


Adding RELEASE_TESTERS.TXT · 999dd2b2
Renato Golin authored Jul 21, 2016
```
llvm-svn: 276302
```
[AMDGPU] Emit read-only data to .rodata for hsa · 3c0d8d22
Konstantin Zhuravlyov authored Jul 21, 2016
```
Differential Revision: https://reviews.llvm.org/D22538

llvm-svn: 276298
```
[IRTranslator] Add G_AND opcode. · 7bcc921d
Quentin Colombet authored Jul 21, 2016
```
This commit adds a generic AND opcode to global-isel.

llvm-svn: 276297
```
AMDGPU/SI: Add support for R_AMDGPU_ABS32 · 15562623
Konstantin Zhuravlyov authored Jul 21, 2016
```
Differential Revision: https://reviews.llvm.org/D21646

llvm-svn: 276294
```

[AArch64] Load/store opt: Don't count transient instructions towards search limits. · 4ff2e36d

Geoff Berry authored Jul 21, 2016

Summary:
This change also changes findMatchingInsn and
findMatchingUpdateInsnForward to take DBG_VALUE opcodes into account
when tracking register defs and uses, which could potentially inhibit
these optimizations in the presence of debug information.

Reviewers: mcrosier

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22582

llvm-svn: 276293


Weaken ThreadSafeRefCountedBase atomics. · 3f1edd7e

Benjamin Kramer authored Jul 21, 2016

Doesn't make a difference on x86, but avoids memory barriers on
weakly-ordered archs like PowerPC and ARM.
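
For readers unfamiliar with the idea, here is a sketch of a reference count with weakened orderings. The commit message does not state which orderings were chosen, so the choice below (relaxed increment, acquire-release decrement) is an assumption based on the common pattern, and the class name RefCounted is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a thread-safe reference-counted base.
class RefCounted {
  mutable std::atomic<int> RefCount{0};

public:
  // Taking a new reference needs atomicity but no ordering with other
  // memory operations, so a relaxed increment is sufficient.
  void retain() const { RefCount.fetch_add(1, std::memory_order_relaxed); }

  // Dropping a reference uses acquire-release so that all writes made
  // through other references are visible to whichever thread observes
  // the count reaching zero. Returns true when the last reference is gone.
  bool release() const {
    int NewCount = RefCount.fetch_sub(1, std::memory_order_acq_rel) - 1;
    assert(NewCount >= 0 && "reference count was already zero");
    return NewCount == 0;
  }

  int count() const { return RefCount.load(std::memory_order_relaxed); }
};
```

On x86 both operations compile to the same locked instructions as the sequentially consistent defaults, which matches the note that the change makes no difference there.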

llvm-svn: 276291


[X86][SSE] Allow folding of store/zext with PEXTRW of 0'th element · 88e0940d

Simon Pilgrim authored Jul 21, 2016

Under normal circumstances we prefer the higher performance MOVD to extract the 0'th element of a v8i16 vector instead of PEXTRW.

But as detailed on PR27265, this prevents the SSE41 implementation of PEXTRW from folding the store of the 0'th element. Additionally it prevents us from making use of the fact that the (SSE2) reg-reg version of PEXTRW implicitly zero-extends the i16 element to the i32/i64 destination register.

This patch only preferentially lowers to MOVD if we will not be zero-extending the extracted i16 and will not prevent a store from being folded (on SSE41).

Fix for PR27265.

Differential Revision: https://reviews.llvm.org/D22509

llvm-svn: 276289


Fixed line endings · 4caefdf8
Simon Pilgrim authored Jul 21, 2016
```
llvm-svn: 276287
```

[X86][SSE] Pull out duplicate EXTRW lowering code. NFCI. · b11bdd95

Simon Pilgrim authored Jul 21, 2016

As requested on D22509, I've pulled out the v8i16 extraction lowering as the SSE41 and pre-SSE41 implementations are effectively the same.

llvm-svn: 276285


[profdata] Remove constructor that MSVC 2013 pretends to not understand. · 929e7dbb
Benjamin Kramer authored Jul 21, 2016
```
No functionality change intended.

llvm-svn: 276284
```

[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 · c8e20b11

Simon Pilgrim authored Jul 21, 2016

As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector.

This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match.

We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts).

Differential Revision: https://reviews.llvm.org/D22460

llvm-svn: 276281


[DemandedBits] Reduce number of duplicated DenseMap lookups. · a9e477b2
Benjamin Kramer authored Jul 21, 2016
```
No functionality change intended.

llvm-svn: 276278
```

[DenseMap] Add a C++17-style try_emplace method. · 857754a1

Benjamin Kramer authored Jul 21, 2016

This provides an elegant pattern to solve the "construct if not in map
already" problem we have many times in LLVM. Without try_emplace we
either have to rely on a sentinel value (nullptr) or do two lookups.
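
The pattern can be sketched with the standard library's C++17 try_emplace, used here as a stand-in because the DenseMap interface added by this commit is not shown in the message. Symbol, NumConstructed, and getOrCreate are hypothetical names for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Counts how many Symbol objects were actually constructed.
static int NumConstructed = 0;

// Hypothetical value type whose construction we want to avoid repeating.
struct Symbol {
  std::string Name;
  explicit Symbol(std::string N) : Name(std::move(N)) { ++NumConstructed; }
};

// "Construct if not in map already" with a single lookup: try_emplace
// only builds the Symbol when Id is absent, and otherwise returns an
// iterator to the existing entry without touching its arguments.
Symbol &getOrCreate(std::unordered_map<int, Symbol> &Table, int Id,
                    const std::string &Name) {
  auto Result = Table.try_emplace(Id, Name);
  return Result.first->second;
}
```

Calling getOrCreate twice with the same Id returns the same object and constructs Symbol only once, with no sentinel value and no second lookup.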

llvm-svn: 276277


Rename StringMap::emplace_second to try_emplace. · eab3d367

Benjamin Kramer authored Jul 21, 2016

Coincidentally this function maps to the C++17 try_emplace. Rename it
for consistency with C++17 std::map. NFC.

llvm-svn: 276276


[AMDGPU] Some code cleaning in SIRegisterInfo.td · 6c4efcad

Sam Kolton authored Jul 21, 2016

Reviewers: tstellarAMD, vpykhtin

Subscribers: arsenm, kzhuravl

Differential Revision: https://reviews.llvm.org/D22620

llvm-svn: 276274


ExecutionDepsFix - Fix bug in clearance calculation · c1fa1633

Marina Yatsina authored Jul 21, 2016

The clearance calculation did not take into account registers defined as outputs or clobbers in inline assembly machine instructions because these register defs are implicit.

Differential Revision: http://reviews.llvm.org/D22580

llvm-svn: 276266


[GCOV] Remove a layer of indirection. · 2a185a25

Benjamin Kramer authored Jul 21, 2016

StringMap is designed to hold large values. No functionality change
intended.

llvm-svn: 276265


[docs] Update release docs · 470172a4
Renato Golin authored Jul 21, 2016
```
llvm-svn: 276264
```
AMDGPU: Fix phis from blocks split due to register indexing · f0ba86a4
Matt Arsenault authored Jul 21, 2016
```
llvm-svn: 276257
```

[GVNHoist] Preserve optimization hints which agree · 825e4ab9

David Majnemer authored Jul 21, 2016

If we have optimization hints which agree with each other along different
paths, preserve them.

llvm-svn: 276248


[GVNHoist] Don't wrongly preserve TBAA · 4808f264

David Majnemer authored Jul 21, 2016

We hoisted loads/stores without taking their TBAA metadata into account, which can cause
miscompiles.

llvm-svn: 276240
