- Apr 01, 2016
-
-
Rong Xu authored
llvm-svn: 265180
-
Simon Pilgrim authored
llvm-svn: 265179
-
James Y Knight authored
ThreadModel::Single is already handled already by ARMPassConfig adding LowerAtomicPass to the pass list, which lowers all atomics to non-atomic ops and deletes fences. So by the time we get to ISel, there's no atomic fences left, so they don't need special handling. llvm-svn: 265178
-
Peter Collingbourne authored
Should fix modules build. llvm-svn: 265176
-
Mike Aizatsky authored
Differential Revision: http://reviews.llvm.org/D18705 llvm-svn: 265174
-
Simon Pilgrim authored
llvm-svn: 265173
-
Sanjay Patel authored
Was there really no other way to splat a byte in SSE2? punpcklbw {{.*#+}} xmm0 = xmm0[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7] pshuflw {{.*#+}} xmm0 = xmm0[0,0,0,0,4,5,6,7] pshufd {{.*#+}} xmm0 = xmm0[0,0,1,1] llvm-svn: 265172
-
Simon Pilgrim authored
llvm-svn: 265171
-
Tom Stellard authored
Summary: Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý. Patch by: Vedran Miletić Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: jvesely, scchan, kanarayan, arsenm Differential Revision: http://reviews.llvm.org/D17280 llvm-svn: 265170
-
Simon Pilgrim authored
Added SSE + AVX1 tests as well as AVX2 llvm-svn: 265169
-
Mike Aizatsky authored
llvm-svn: 265168
-
Sanjay Patel authored
Note however that this is identical to the existing SSE2 run. What we really want is yet another run for an SSE2 machine that also has fast unaligned 16-byte accesses. llvm-svn: 265167
-
Simon Pilgrim authored
llvm-svn: 265164
-
Simon Pilgrim authored
Replaced lots of dodgy greps with actual codegen llvm-svn: 265163
-
Sanjay Patel authored
Follow-up to http://reviews.llvm.org/D18566 and http://reviews.llvm.org/D18676 - where we noticed that an intermediate splat was being generated for memsets of non-zero chars. That was because we told getMemsetStores() to use a 32-bit vector element type, and it happily obliged by producing that constant using an integer multiply. The 16-byte test that was added in D18566 is now equivalent for AVX1 and AVX2 (no splats, just a vector load), but we have PR27141 to track that splat difference. Note that the SSE1 path is not changed in this patch. That can be a follow-up. This patch should resolve PR27100. llvm-svn: 265161
-
Chad Rosier authored
llvm-svn: 265160
-
David Majnemer authored
A catchswitch is a terminator, instructions cannot be inserted after it. llvm-svn: 265158
-
David Majnemer authored
A catchswitch cannot be preceded by another instruction in the same basic block (other than a PHI node). Instead, insert the extract element right after the materialization of the vectorized value. This isn't optimal but is a reasonable compromise given the constraints of WinEH. This fixes PR27163. llvm-svn: 265157
-
Rong Xu authored
Refactor the code that gets and creates PGOFuncName meta data so that it can be used in clang's value profile annotation. Differential Revision: http://reviews.llvm.org/D18623 llvm-svn: 265149
-
Sanjay Patel authored
Follow-up to D18566 - where we noticed that an intermediate splat was being generated for memsets of non-zero chars. That was because we told getMemsetStores() to use a 32-bit vector element type, and it happily obliged by producing that constant using an integer multiply. The tests that were added in the last patch are now equivalent for AVX1 and AVX2 (no splats, just a vector load), but we have PR27141 to track that splat difference. In the new tests, the splat via shuffling looks ok to me, but there might be some room for improvement depending on uarch there. Note that the SSE1/2 paths are not changed in this patch. That can be a follow-up. This patch should resolve PR27100. Differential Revision: http://reviews.llvm.org/D18676 llvm-svn: 265148
-
Benjamin Kramer authored
This avoids undefined behavior when casting pointers to it. Also make sure that we don't cast to a derived StringMapEntry before checking for tombstone, as that may have different alignment requirements. llvm-svn: 265145
-
Vedant Kumar authored
llvm-svn: 265144
-
Valery Pykhtin authored
$vsrc1 -> $src1, $k -> $imm Differential Revision: http://reviews.llvm.org/D18659 llvm-svn: 265141
-
Andrea Di Biagio authored
Since revision 235394, we no longer perform target specific combines on build_vector nodes. No functional change intended. llvm-svn: 265138
-
Simon Pilgrim authored
llvm-svn: 265135
-
Sagar Thakur authored
Summary: The assembler was picking the wrong JR variant because the pre-R6 one was still enabled at R6. Author: nitesh.jain Reviewers: vkalintiris, dsanders Subscribers: dsanders, llvm-commits, mohit.bhakkad, sagar, bhushan, jaydeep Differential: D18387 llvm-svn: 265134
-
Benjamin Kramer authored
Found by msan. Patch by Adrian Kuegel! llvm-svn: 265133
-
Andrey Turetskiy authored
Add a new Intel MCU CPU Lakemont, which doesn't support X87. Differential Revision: http://reviews.llvm.org/D18650 llvm-svn: 265128
-
James Molloy authored
Some ARM instructions encode 32-bit immediates as a 8-bit integer (0-255) and a 4-bit rotation (0-30, even) in its least significant 12 bits. The original fixup, FK_Data_4, patches the instruction by the value bit-to-bit, regardless of the encoding. For example, assuming the label L1 and L2 are 0x0 and 0x104 respectively, the following instruction: add r0, r0, #(L2 - L1) ; expects 0x104, i.e., 260 would be assembled to the following, which adds 1 to r0, instead of 260: e2800104 add r0, r0, #4, 2 ; equivalently 1 The new fixup kind fixup_arm_mod_imm takes care of the encoding: e2800f41 add r0, r0, #260 Patch by Ting-Yuan Huang! llvm-svn: 265122
-
Oliver Stannard authored
When a fixup that can be resolved by the assembler is out of range, we should report an error in the source, rather than crashing. Differential Revision: http://reviews.llvm.org/D18402 llvm-svn: 265120
-
Mehdi Amini authored
This is to be coherent with Full LTO. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265118
-
Jeroen Ketema authored
The llvm_string_of_message function, called by llvm_raise, calls LLVMDisposeMessage, which expects the message to be dynamically allocated; it fails freeing the message otherwise. So always dynamically allocate with LLVMCreateMessage. Differential Revision: http://reviews.llvm.org/D18675 llvm-svn: 265116
-
Jeroen Ketema authored
Expose LLVMCreateTargetMachineData as data_layout. As r263530 did for go. From that commit: "LLVMGetTargetDataLayout was removed from the C API, and then TargetMachine.TargetData was removed. Later, LLVMCreateTargetMachineData was added to the C API" Differential Revision: http://reviews.llvm.org/D18677 llvm-svn: 265115
-
Mehdi Amini authored
This allows the linker to instruct ThinLTO to perform only the optimization part or only the codegen part of the process. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265113
-
Chuang-Yu Cheng authored
[PPC64] Bug fix: when enabling sibling-call-opt and shrink-wrapping, the tail call branch instruction might disappear Bug Pattern: # BB#0: # %entry cmpldi 3, 0 beq- 0, .LBB0_2 # BB#1: # %exit lwz 4, 0(3) #TC_RETURNd8 LVComputationKind 0 .LBB0_2: # %cond.false mflr 0 std 0, 16(1) stdu 1, -96(1) .Ltmp0: .cfi_def_cfa_offset 96 .Ltmp1: .cfi_offset lr, 16 bl __assert_fail nop The branch instruction for tail call return is not generated, because the shrink-wrapping pass choosing a new Restore Point: %cond.false, so %exit block is not sent to emitEpilogue, that's why the branch is not generated. Thanks Kit's opinions! Reviewers: nemanjai hfinkel tjablin kbarton http://reviews.llvm.org/D17606 llvm-svn: 265112
-
Mehdi Amini authored
This is intended to be used for ThinLTO incremental build. Differential Revision: http://reviews.llvm.org/D18213 This is a recommit of r265095 after fixing the Windows issues. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265111
-
Mehdi Amini authored
From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265110
-
Mehdi Amini authored
From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265109
-
Mehdi Amini authored
From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265108
-
Mehdi Amini authored
Provide a class to generate a SHA1 from a sequence of bytes, and a convenience raw_ostream adaptor. This will be used to provide a "build-id" by hashing the Module block when writing bitcode. ThinLTO will use this information for incremental build. Reapply r265094 which was reverted in r265102 because it broke MSVC bots (constexpr is not supported). http://reviews.llvm.org/D16325 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 265107
-