Commits · aa05cf99800bf21cadd3be103b537c7b690c7c83 · Roger Ferrer / llvm-epi

Apr 01, 2016

Fix buildbot lldb-amd64-ninja-netbsd7 failure · 92c2eae4
Rong Xu authored Apr 01, 2016
```
llvm-svn: 265180
```
92c2eae4
[X86][SSE] Regenerated the vec_insert tests. · f739d8a2
Simon Pilgrim authored Apr 01, 2016
```
llvm-svn: 265179
```
f739d8a2

Remove useless check for ThreadModel==Single in ARMISelLowering. NFC. · e6a46463

James Y Knight authored Apr 01, 2016

ThreadModel::Single is already handled by ARMPassConfig adding
LowerAtomicPass to the pass list, which lowers all atomics to non-atomic
ops and deletes fences.

By the time we get to ISel, there are no atomic fences left, so they
don't need special handling.

llvm-svn: 265178

e6a46463

LowerBitSets: Move declarations to separate namespace. · dd711b93
Peter Collingbourne authored Apr 01, 2016
```
Should fix modules build.

llvm-svn: 265176
```
dd711b93
[libfuzzer] adding license headers to cpp files · f13cbee1
Mike Aizatsky authored Apr 01, 2016
```
Differential Revision: http://reviews.llvm.org/D18705

llvm-svn: 265174
```
f13cbee1
[X86][SSE] Regenerated vec_partial tests. · 61211186
Simon Pilgrim authored Apr 01, 2016
```
llvm-svn: 265173
```
61211186

[x86] add an SSE2 + fast-unaligned accesses run for memset nonzero tests · 9b5b5c82

Sanjay Patel authored Apr 01, 2016

Was there really no other way to splat a byte in SSE2?
    punpcklbw {{.*#+}} xmm0 = xmm0[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
    pshuflw {{.*#+}} xmm0 = xmm0[0,0,0,0,4,5,6,7]
    pshufd {{.*#+}} xmm0 = xmm0[0,0,1,1]
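
As a rough illustration of why that three-instruction sequence yields a byte splat, the shuffles can be modelled on a plain list of 16 byte values. This is only a sketch: `shuffle_lanes` and `splat_low_byte_sse2` are hypothetical helper names, not LLVM or intrinsic APIs, and the index lists are copied from the shuffle comments above.

```python
def shuffle_lanes(reg, lane_bytes, indices):
    """Build a new 16-byte register by picking lanes of `lane_bytes` bytes each."""
    lanes = [reg[i:i + lane_bytes] for i in range(0, len(reg), lane_bytes)]
    out = []
    for index in indices:
        out.extend(lanes[index])
    return out


def splat_low_byte_sse2(reg):
    """Model the three-shuffle SSE2 sequence on a list of 16 byte values."""
    # punpcklbw xmm0, xmm0: interleave the low 8 bytes with themselves.
    reg = shuffle_lanes(reg, 1, [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7])
    # pshuflw: broadcast 16-bit word 0 across the low four words.
    reg = shuffle_lanes(reg, 2, [0, 0, 0, 0, 4, 5, 6, 7])
    # pshufd: copy 32-bit dwords 0 and 1 over the whole register.
    reg = shuffle_lanes(reg, 4, [0, 0, 1, 1])
    return reg


# Every lane ends up holding the original byte 0.
print(splat_low_byte_sse2(list(range(0x10, 0x20))))
```

Each step doubles the width of the lane that holds only the low byte (byte to word, word to qword, qword to the full register), which is why three shuffles are needed without a byte-granular shuffle instruction.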

llvm-svn: 265172

9b5b5c82

[X86][SSE] Regenerated vec_logical tests. · 85819464
Simon Pilgrim authored Apr 01, 2016
```
llvm-svn: 265171
```
85819464

AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2} · 354a43c7

Tom Stellard authored Apr 01, 2016

Summary:
Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+.

32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.

Patch by: Vedran Miletić

Reviewers: arsenm, tstellarAMD, nhaehnle

Subscribers: jvesely, scchan, kanarayan, arsenm

Differential Revision: http://reviews.llvm.org/D17280

llvm-svn: 265170

354a43c7

[X86][SSE] Regenerated vector sdiv to shifts tests · 1b140824
Simon Pilgrim authored Apr 01, 2016
```
Added SSE + AVX1 tests as well as AVX2

llvm-svn: 265169
```
1b140824
[sancov] save entry block from pruning (it is always full dominator) · 01c0f8d8
Mike Aizatsky authored Apr 01, 2016
```
llvm-svn: 265168
```
01c0f8d8

[x86] add an SSE1 run for these tests · d3e3d48c

Sanjay Patel authored Apr 01, 2016

Note however that this is identical to the existing SSE2 run.
What we really want is yet another run for an SSE2 machine that
also has fast unaligned 16-byte accesses.

llvm-svn: 265167

d3e3d48c

[X86][SSE] Regenerated vec_setcc tests. · b8283631
Simon Pilgrim authored Apr 01, 2016
```
llvm-svn: 265164
```
b8283631
[X86][SSE] Regenerated the vec_set tests. · 41e31ff6
Simon Pilgrim authored Apr 01, 2016
```
Replaced lots of dodgy greps with actual codegen

llvm-svn: 265163
```
41e31ff6

[x86] avoid intermediate splat for non-zero memsets (PR27100) · 9f413364

Sanjay Patel authored Apr 01, 2016

Follow-up to http://reviews.llvm.org/D18566 and http://reviews.llvm.org/D18676 -
where we noticed that an intermediate splat was being generated for memsets of
non-zero chars.

That was because we told getMemsetStores() to use a 32-bit vector element type,
and it happily obliged by producing that constant using an integer multiply.
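
The integer-multiply splat mentioned above can be sketched as follows. This is a minimal model of the arithmetic only, not the code in getMemsetStores(); the function name `splat_byte_via_multiply` is made up for illustration.

```python
def splat_byte_via_multiply(value: int) -> int:
    """Replicate an 8-bit value into all four bytes of a 32-bit integer."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    # 0x01010101 has a 1 in every byte, so the product copies `value`
    # into each of the four byte lanes without any carries between them.
    return (value * 0x01010101) & 0xFFFFFFFF


print(hex(splat_byte_via_multiply(0x2A)))  # 0x2a2a2a2a
```

With a 32-bit element type, this 32-bit constant then still has to be broadcast across the vector, which is the intermediate splat the patch avoids.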

The 16-byte test that was added in D18566 is now equivalent for AVX1 and AVX2
(no splats, just a vector load), but we have PR27141 to track that splat difference.

Note that the SSE1 path is not changed in this patch. That can be a follow-up.
This patch should resolve PR27100.

llvm-svn: 265161

9f413364

[AArch64] Fix a typo. NFC. · 8787a810
Chad Rosier authored Apr 01, 2016
```
llvm-svn: 265160
```
8787a810
[InstCombine] Don't sink an instr after a catchswitch · fe3f9d17
David Majnemer authored Apr 01, 2016
```
A catchswitch is a terminator, instructions cannot be inserted after it.

llvm-svn: 265158
```
fe3f9d17

[SLPVectorizer] Don't insert an extractelement before a catchswitch · 6f1f85f0

David Majnemer authored Apr 01, 2016

A catchswitch cannot be preceded by another instruction in the same
basic block (other than a PHI node).

Instead, insert the extract element right after the materialization of
the vectorized value.  This isn't optimal but is a reasonable compromise
given the constraints of WinEH.

This fixes PR27163.

llvm-svn: 265157

6f1f85f0

[PGO] Refactor PGOFuncName meta data code to be used in clang · 8e8fe859

Rong Xu authored Apr 01, 2016

Refactor the code that gets and creates PGOFuncName meta data so that it can be
used in clang's value profile annotation.

Differential Revision: http://reviews.llvm.org/D18623

llvm-svn: 265149

8e8fe859

[x86] avoid intermediate splat for non-zero memsets (PR27100) · a05e0ff2

Sanjay Patel authored Apr 01, 2016

Follow-up to D18566 - where we noticed that an intermediate splat was being
generated for memsets of non-zero chars.

That was because we told getMemsetStores() to use a 32-bit vector element type,
and it happily obliged by producing that constant using an integer multiply.

The tests that were added in the last patch are now equivalent for AVX1 and AVX2
(no splats, just a vector load), but we have PR27141 to track that splat difference.
In the new tests, the splat via shuffling looks ok to me, but there might be some
room for improvement depending on uarch there.

Note that the SSE1/2 paths are not changed in this patch. That can be a follow-up.
This patch should resolve PR27100.

Differential Revision: http://reviews.llvm.org/D18676

llvm-svn: 265148

a05e0ff2

[ADT] Make StringMap's tombstone aligned. · 99c67b31

Benjamin Kramer authored Apr 01, 2016

This avoids undefined behavior when casting pointers to it. Also make
sure that we don't cast to a derived StringMapEntry before checking for
tombstone, as that may have different alignment requirements.

llvm-svn: 265145

99c67b31

[PGOProfile] Rename a test to make it more reusable, NFC · 77841717
Vedant Kumar authored Apr 01, 2016
```
llvm-svn: 265144
```
77841717
[AMDGPU] fix MADAK/MADMK instructions operand namings to match encoding fields. · 5b3559c1
Valery Pykhtin authored Apr 01, 2016
```
$vsrc1 -> $src1, $k -> $imm

Differential Revision: http://reviews.llvm.org/D18659

llvm-svn: 265141
```
5b3559c1

[x86] Remove redundant call to setTargetDAGCombine for BUILD_VECTOR node type. · 8c488419

Andrea Di Biagio authored Apr 01, 2016

Since revision 235394, we no longer perform target specific combines on
build_vector nodes. No functional change intended.

llvm-svn: 265138

8c488419

[X86][AVX512] Regenerated intrinsics tests · 7ec092d0
Simon Pilgrim authored Apr 01, 2016
```
llvm-svn: 265135
```
7ec092d0

[MIPS][LLVM-MC] Fix JR encoding for MIPSR6 ISA · 48973d21

Sagar Thakur authored Apr 01, 2016

Summary: The assembler was picking the wrong JR variant because the pre-R6 one was still enabled at R6.

Author: nitesh.jain
Reviewers: vkalintiris, dsanders
Subscribers: dsanders, llvm-commits, mohit.bhakkad, sagar, bhushan, jaydeep
Differential: D18387
llvm-svn: 265134

48973d21

[ThinLTO] Fix uninitialized flags. · 398e95c1
Benjamin Kramer authored Apr 01, 2016
```
Found by msan. Patch by Adrian Kuegel!

llvm-svn: 265133
```
398e95c1

[X86] Introduce Lakemont CPU. · 958eb464

Andrey Turetskiy authored Apr 01, 2016

Add a new Intel MCU CPU Lakemont, which doesn't support X87.

Differential Revision: http://reviews.llvm.org/D18650

llvm-svn: 265128

958eb464

Fix for pr24346: arm asm label calculation error in sub · b876c72b

James Molloy authored Apr 01, 2016

Some ARM instructions encode a 32-bit immediate as an 8-bit integer (0-255)
and a 4-bit rotation (0-30, even values only) in their least significant 12 bits.
The original fixup, FK_Data_4, patches the instruction with the value bit for bit,
regardless of the encoding. For example, assuming the labels L1 and L2 are at
0x0 and 0x104 respectively, the following instruction:

  add r0, r0, #(L2 - L1) ; expects 0x104, i.e., 260

would be assembled to the following, which adds 1 to r0, instead of 260:

  e2800104 add r0, r0, #4, 2 ; equivalently 1

The new fixup kind fixup_arm_mod_imm takes care of the encoding:

  e2800f41 add r0, r0, #260
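
The two encodings above can be checked with a small model of the 12-bit immediate field. This is a sketch of the encoding rule only, not the fixup_arm_mod_imm implementation; `ror32`, `decode_mod_imm` and `encode_mod_imm` are hypothetical helper names.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def decode_mod_imm(field12):
    """Value denoted by a 12-bit field: imm8 rotated right by 2 * rot."""
    rot = (field12 >> 8) & 0xF
    imm8 = field12 & 0xFF
    return ror32(imm8, 2 * rot)


def encode_mod_imm(value):
    """Return the 12-bit field for `value`, or None if it is not encodable."""
    for rot in range(16):
        # Rotating left by 2 * rot undoes the decoder's right rotation.
        imm8 = ror32(value, (32 - 2 * rot) % 32)
        if imm8 <= 0xFF:
            return (rot << 8) | imm8
    return None


print(decode_mod_imm(0x104))       # 1: the raw FK_Data_4 patch means "#4, 2"
print(hex(encode_mod_imm(260)))    # 0xf41: matches e2800f41
```

A value such as 0x101 has set bits nine positions apart, so no 8-bit window covers it and `encode_mod_imm` returns None; an assembler would have to report that case as an error.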

Patch by Ting-Yuan Huang!

llvm-svn: 265122

b876c72b

[AArch64] Better errors for out-of-range fixups · a5520b02

Oliver Stannard authored Apr 01, 2016

When a fixup that can be resolved by the assembler is out of range, we should
report an error in the source, rather than crashing.

Differential Revision: http://reviews.llvm.org/D18402

llvm-svn: 265120

a5520b02

ThinLTO: move ObjCARCContractPass in the CodeGen pipeline · 215d59e7
Mehdi Amini authored Apr 01, 2016
```
This is to be coherent with Full LTO.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265118
```
215d59e7

[OCaml] Use LLVMCreateMessage with constant strings when calling llvm_raise · b7f633b8

Jeroen Ketema authored Apr 01, 2016

The llvm_string_of_message function, called by llvm_raise, calls
LLVMDisposeMessage, which expects the message to be dynamically
allocated; it fails freeing the message otherwise. So always
dynamically allocate with LLVMCreateMessage.

Differential Revision: http://reviews.llvm.org/D18675

llvm-svn: 265116

b7f633b8

[OCaml] Reinstate data_layout · c110fbc2

Jeroen Ketema authored Apr 01, 2016

Expose LLVMCreateTargetMachineData as data_layout.

This mirrors what r263530 did for Go. From that commit: "LLVMGetTargetDataLayout was
removed from the C API, and then TargetMachine.TargetData was removed.
Later, LLVMCreateTargetMachineData was added to the C API"

Differential Revision: http://reviews.llvm.org/D18677

llvm-svn: 265115

c110fbc2

Add a libLTO API to stop/restart ThinLTO between optimizations and CodeGen · 43b657b5

Mehdi Amini authored Apr 01, 2016

This allows the linker to instruct ThinLTO to perform only the
optimization part or only the codegen part of the process.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265113

43b657b5

[PPC64] Bug fix: when enabling sibling-call-opt and shrink-wrapping, the tail... · f8b592f2

Chuang-Yu Cheng authored Apr 01, 2016

[PPC64] Bug fix: when enabling sibling-call-opt and shrink-wrapping, the tail call branch instruction might disappear

Bug Pattern:
    # BB#0:                                 # %entry
	    cmpldi	 3, 0
	    beq-	 0, .LBB0_2
    # BB#1:                                 # %exit
	    lwz 4, 0(3)
	    #TC_RETURNd8 LVComputationKind 0
    .LBB0_2:                                # %cond.false
	    mflr 0
	    std 0, 16(1)
	    stdu 1, -96(1)
    .Ltmp0:
	    .cfi_def_cfa_offset 96
    .Ltmp1:
	    .cfi_offset lr, 16
	    bl __assert_fail
	    nop

The branch instruction for the tail call return is not generated because the
shrink-wrapping pass chooses a new restore point, %cond.false, so the %exit
block is never sent to emitEpilogue.

Thanks to Kit for the feedback!
Reviewers: nemanjai hfinkel tjablin kbarton

http://reviews.llvm.org/D17606

llvm-svn: 265112

f8b592f2

Add a module Hash in the bitcode and the combined index, implementing a kind of "build-id" · d7ad221c

Mehdi Amini authored Apr 01, 2016

This is intended to be used for ThinLTO incremental build.

Differential Revision: http://reviews.llvm.org/D18213

This is a recommit of r265095 after fixing the Windows issues.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265111

d7ad221c

Fix MSVC warning "comparison of integers of different signs" (NFC) · eed26932
Mehdi Amini authored Apr 01, 2016
```
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265110
```
eed26932
Fix S390 big endian detection · 180441f0
Mehdi Amini authored Apr 01, 2016
```
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265109
```
180441f0
Const correctness in raw_sha1_ostream (NFC) · 7ef783d1
Mehdi Amini authored Apr 01, 2016
```
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265108
```
7ef783d1

Add support for computing SHA1 in LLVM · 4cd57025

Mehdi Amini authored Apr 01, 2016

Provide a class to generate a SHA1 from a sequence of bytes, and
a convenience raw_ostream adaptor.
This will be used to provide a "build-id" by hashing the Module
block when writing bitcode. ThinLTO will use this information for
incremental build.
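
The idea of hashing a stream of bytes incrementally to obtain a "build-id" can be sketched with Python's standard hashlib. This only illustrates the concept; it is not the LLVM SHA1 class or its raw_ostream adaptor, and `build_id` is a hypothetical name.

```python
import hashlib


def build_id(chunks):
    """Hash an iterable of byte chunks as if they were one contiguous stream."""
    hasher = hashlib.sha1()
    for chunk in chunks:
        # Feeding data piecewise gives the same digest as hashing it all at once.
        hasher.update(chunk)
    return hasher.hexdigest()


print(build_id([b"ab", b"c"]))  # a9993e364706816aba3e25717850c26c9cd0d89d
```

Because the digest depends only on the byte sequence, an unchanged module produces the same identifier across builds, which is the property an incremental build can key on.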

Reapply r265094 which was reverted in r265102 because it broke
MSVC bots (constexpr is not supported).

http://reviews.llvm.org/D16325

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 265107

4cd57025