- Sep 09, 2015
-
-
Matthias Braun authored
This introduces a check that the MBBIndexList is sorted as proposed in http://reviews.llvm.org/D12443 but split up into a separate commit. llvm-svn: 247166
-
Matt Arsenault authored
Instead of extracting both 32-bit components from the 128-bit register. This produces fewer copies and is easier for the copy peephole optimizer to understand and see the actual uses as extracts from a reg_sequence. This avoids needing to handle subregister composing in the PeepholeOptimizer's ValueTracker for this case. llvm-svn: 247162
-
Matt Arsenault authored
llvm-svn: 247161
-
Dan Gohman authored
llvm-svn: 247158
-
Tom Stellard authored
Summary: This helps mostly when we use add instructions for address calculations that contain immediates. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12256 llvm-svn: 247157
-
Silviu Baranga authored
Summary: We are not scalarizing the wide selects in codegen for i16 and i32 and therefore we can remove the amortization factor. We still have issues with i64 vectors in codegen though. Reviewers: mcrosier Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12724 llvm-svn: 247156
-
Sanjay Patel authored
llvm-svn: 247154
-
Dan Gohman authored
llvm-svn: 247152
-
Sanjay Patel authored
llvm-svn: 247150
-
Igor Breger authored
vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11802 llvm-svn: 247149
-
Sanjay Patel authored
llvm-svn: 247148
-
Zoran Jovanovic authored
Differential Revision: http://reviews.llvm.org/D11178 llvm-svn: 247146
-
Alex Lorenz authored
llvm-svn: 247145
-
James Molloy authored
We called a variable ExitCount, stored the backedge count in it, then redefined it to be the exit count again. llvm-svn: 247140
-
James Molloy authored
Predicating stores requires creating extra blocks. It's much cleaner if we do this in one pass instead of mutating the CFG while writing vector instructions. Besides which we can make use of helper functions to update domtree for us, reducing the work we need to do. llvm-svn: 247139
-
Daniel Sanders authored
Summary: One of the vector splitting paths for extract_vector_elt tries to lower: define i1 @via_stack_bug(i8 signext %idx) { %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx ret i1 %1 } to: define i1 @via_stack_bug(i8 signext %idx) { %base = alloca <2 x i1> store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx %3 = load i1, i1* %2 ret i1 %3 } However, the elements of <2 x i1> are not byte-addressible. The result of this is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies to '%base + %idx * 0', and then simply '%base' causing all values of %idx to extract element zero. This commit fixes this by promoting the vector elements of <8-bits to i8 before splitting the vector. This fixes a number of test failures in pocl. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12591 llvm-svn: 247128
-
Chandler Carruth authored
nothing broke. llvm-svn: 247127
-
Zoran Jovanovic authored
Differential Revision: http://reviews.llvm.org/D11628 llvm-svn: 247125
-
Matt Arsenault authored
Broken by r247074. Should include an assembler test, but the assembler is currently broken for VOP3b apparently. llvm-svn: 247123
-
Sanjoy Das authored
IRCE was just using INITIALIZE_PASS(), which is incorrect. llvm-svn: 247122
-
Lang Hames authored
llvm-svn: 247119
-
Dan Gohman authored
llvm-svn: 247118
-
Matt Arsenault authored
Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113
-
Dan Gohman authored
llvm-svn: 247110
-
Matt Arsenault authored
llvm-svn: 247109
-
Reid Kleckner authored
Typically these are catchpads, which hold data used to decide whether to catch the exception or continue unwinding. We also shouldn't create MBBs for catchendpads, cleanupendpads, or terminatepads, since no real code can live in them. This fixes a problem where MI passes (like the register allocator) would try to put code into catchpad blocks, which are not executed by the runtime. In the new world, blocks ending in invokes now have many possible successors. llvm-svn: 247102
-
Peter Collingbourne authored
llvm-svn: 247095
-
Reid Kleckner authored
Summary: 32-bit funclets have short prologues that allocate enough stack for the largest call in the whole function. The runtime saves CSRs for the funclet. It doesn't restore CSRs after we finally transfer control back to the parent funciton via a CATCHRET, but that's a separate issue. 32-bit funclets also have to adjust the incoming EBP value, which is what llvm.x86.seh.recoverframe does in the old model. 64-bit funclets need to spill CSRs as normal. For simplicity, this just spills the same set of CSRs as the parent function, rather than trying to compute different CSR sets for the parent function and each funclet. 64-bit funclets also allocate enough stack space for the largest outgoing call frame, like 32-bit. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12546 llvm-svn: 247092
-
Peter Collingbourne authored
members." as it causes test failures on a number of bots. llvm-svn: 247088
-
Eric Christopher authored
read CTR and count them as reading the CTR. llvm-svn: 247083
-
- Sep 08, 2015
-
-
Peter Collingbourne authored
This change extends the bitset lowering pass to support bitsets that may contain either functions or global variables. A function bitset is lowered to a jump table that is laid out before one of the functions in the bitset. Also add support for non-string bitset identifier names. This allows for distinct metadata nodes to stand in for names with internal linkage, as done in D11857. Differential Revision: http://reviews.llvm.org/D11856 llvm-svn: 247080
-
Ivan Krasin authored
Summary: Add a test for a data followed by 4-byte hash value. I use a slightly modified Jenkins hash function, as described in https://en.wikipedia.org/wiki/Jenkins_hash_function The modification is to ensure that hash(zeros) != 0. Reviewers: kcc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12648 llvm-svn: 247076
-
Matt Arsenault authored
Adds vcc to output string input for e32. Allows option of using e64 encoding with assembler. Also fixes these instructions not implicitly reading exec. llvm-svn: 247074
-
Artem Belevich authored
The pass is needed to remove __nvvm_reflect calls when we link in libdevice bitcode that comes with CUDA. Differential Revision: http://reviews.llvm.org/D11663 llvm-svn: 247072
-
Derek Schuff authored
Summary: This patch modifies X86TargetLowering::LowerVASTART so that struct va_list is initialized with 32 bit pointers in x32. It also includes tests that call @llvm.va_start() for x32. Patch by João Porto Subscribers: llvm-commits, hjl.tools Differential Revision: http://reviews.llvm.org/D12346 llvm-svn: 247069
-
Kostya Serebryany authored
llvm-svn: 247067
-
Kostya Serebryany authored
[libFuzzer] be more robust when dealing with files on disk (e.g. don't crash if a file was there but disappeared) llvm-svn: 247066
-
Dan Gohman authored
This allows backends which don't use a traditional register allocator, but do need PHI lowering and other passes, to use the default TargetPassConfig::addFastRegAlloc and TargetPassConfig::addOptimizedRegAlloc implementations. Differential Revision: http://reviews.llvm.org/D12691 llvm-svn: 247065
-
Sanjay Patel authored
llvm-svn: 247061
-
Matt Arsenault authored
These were marked as WriteSALU, which is low latency. I'm guessing at the value to use, but it should probably be considered the highest latency instruction. I'm not sure this has any actual effect since hasSideEffects probably is preventing any moving of these. llvm-svn: 247060
-