- Dec 10, 2017
-
-
Simon Pilgrim authored
llvm-svn: 320328
-
Craig Topper authored
This matches the AVX512 version and is more consistent overall. It also improves our scheduler models. In some cases this adds _Int to instructions that didn't have any Int_ before. It's a side effect of the adjustments made to some of the multiclasses. llvm-svn: 320325
-
Simon Pilgrim authored
llvm-svn: 320322
-
Simon Pilgrim authored
Split off some 'n' instruction versions to make it clearer when WAIT is being inserted llvm-svn: 320321
-
Craig Topper authored
[X86] Rename some instructions from 'rb' to 'rrb' to make 'b' a proper suffix. Fix the scheduling information for some of them. Some of the scheduling information was only present for the 'rb' version and not the 'rr' version. Now we match 'rr(b?)' llvm-svn: 320320
-
Craig Topper authored
llvm-svn: 320319
-
Simon Pilgrim authored
Locally tag COPY as WriteMove, which has caused some reg-reg + reg-mem instruction tests to reorder. llvm-svn: 320308
-
Simon Pilgrim authored
llvm-svn: 320307
-
Simon Pilgrim authored
llvm-svn: 320305
-
Craig Topper authored
Based on the fact that the 'Y' version of the instruction is next to this, I assume Z256 is the intended value. llvm-svn: 320295
-
Craig Topper authored
The VEX versions were present but not the legacy SSE versions. llvm-svn: 320294
-
Craig Topper authored
[X86] Correct the _Int part of more scheduler model instrexes. Put _b in the correct order relative to _Int llvm-svn: 320282
-
Craig Topper authored
llvm-svn: 320280
-
Craig Topper authored
[X86] Fix bad regular expressions in the scheduler models. Question marks should be outside of multicharacter parenthesized expressions. If the question mark is inside the parentheses, it only applies to the single character preceding it. I had to make a few additional cleanups to fix some duplicate warnings that were exposed by fixing this. llvm-svn: 320279
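The difference between the two placements of the question mark can be sketched as follows. This is only an illustration: it uses Python's `re` module rather than the regex engine used by the scheduler models, and the instruction name `ADDSDrr` and the helper `matches` are chosen here as hypothetical examples, not taken from the commit.

```python
import re

# Question mark INSIDE the parentheses: only the final 't' is optional,
# so the group matches "_In" or "_Int" but never the empty string.
wrong = re.compile(r"ADDSDrr(_Int?)")

# Question mark OUTSIDE the parentheses: the whole "_Int" suffix is optional.
right = re.compile(r"ADDSDrr(_Int)?")


def matches(pattern, name):
    """Return True if the whole instruction name matches the pattern."""
    return pattern.fullmatch(name) is not None


for name in ("ADDSDrr", "ADDSDrr_Int", "ADDSDrr_In"):
    print(name, "wrong:", matches(wrong, name), "right:", matches(right, name))
```

With the question mark inside the group, the plain `ADDSDrr` name is not matched at all, which is the kind of silent miss the commit describes.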
-
Craig Topper authored
llvm-svn: 320268
-
- Dec 09, 2017
-
-
Craig Topper authored
[X86] Improve lowering of vXi1 insert_subvectors to better utilize (insert_subvector zero, vec, 0) for zeroing upper bits. This can be better recognized during isel when the producer already zeroed the upper bits. llvm-svn: 320267
-
Craig Topper authored
llvm-svn: 320260
-
Craig Topper authored
[X86] When inserting into the upper bits of a vXi1 vector, make sure we shift enough bits if we widened the vector. We may need to widen the vector to make the shifts legal, but if we do that we need to make sure we shift left/right after accounting for the new size. If not we can't guarantee we are shifting in zeros. The test cases affected actually show cases where we should move the shifts all together, but that's another problem. llvm-svn: 320248
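Why the shift amount has to follow the widened size can be sketched with a small integer model of a mask register. This is an assumption-laden illustration, not the code from the commit: `kshiftl`, `kshiftr` and `zero_upper_bits` are hypothetical helpers that model a fixed-width register with Python integers.

```python
def kshiftl(value, amount, width):
    """Model a left shift in a mask register that is `width` bits wide."""
    return (value << amount) & ((1 << width) - 1)


def kshiftr(value, amount, width):
    """Model a logical right shift in a `width`-bit mask register."""
    return (value & ((1 << width) - 1)) >> amount


def zero_upper_bits(value, narrow, wide):
    """Keep the low `narrow` bits of a value held in a `wide`-bit register.

    The shift amount is computed from the widened size, so every bit above
    the narrow vector is pushed out on the left shift and replaced with
    zeros on the right shift.
    """
    amount = wide - narrow
    return kshiftr(kshiftl(value, amount, wide), amount, wide)


# A v4i1 value 0b0110 sitting in a 16-bit register with garbage above it.
print(bin(zero_upper_bits(0xABC6, 4, 16)))
```

If the amount were derived from a smaller width than the register actually used, some of the garbage bits would survive the round trip, which is the failure mode the message warns about.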
-
Craig Topper authored
We were previously using kunpck with zero inputs unnecessarily. We also had cases where we would insert into a zero vector and then insert into a larger zero vector, incurring two sets of shifts. llvm-svn: 320244
-
Paul Robinson authored
MachineSink attempts to place instructions near the basic blocks where they are needed. Once an instruction has been sunk, its location relative to other instructions no longer is consistent with the original source code. In order to ensure correct stepping in the debugger, the debug location for sunk instructions is either merged with the insertion point or erased if the target successor block is empty. Originally submitted as r318679, revised to fix sanitizer failure and improve testing. Patch by Matthew Voss! Differential Revision: https://reviews.llvm.org/D39933 llvm-svn: 320216
-
- Dec 08, 2017
-
-
Craig Topper authored
[X86] Teach lowering to only let through (insert_subvector (vXi1 zeros), subvec, 0) for vector sizes that have native KSHIFT support. For narrow sizes we'll widen the zero vector and widen the insert. Then do an extract_subvector to get back down to correct size. This allows us to remove some patterns from the isel table that had to COPY_TO_REGCLASS to an oversized register, do the shift and then COPY_TO_REGCLASS back to the narrow register. Now this is represented explicitly in the DAG. This seems to have perturbed the register allocation in one of the tests, but the number of instructions didn't change. llvm-svn: 320190
-
Simon Pilgrim authored
llvm-svn: 320189
-
Simon Pilgrim authored
Put these under VecIMul itinerary classes for now - seems to be a good average value llvm-svn: 320161
-
Gadi Haber authored
Updated the scheduling information for the Haswell subtarget with the following changes: Regrouped the instructions after adding appropriate load + store latencies. Added scheduling for missing instructions such as the GATHER instrs. The changes were made after revisiting the latencies impact of all memory uOps. Reviewers: RKSimon, zvi, craig.topper, apilipenko Differential Revision: https://reviews.llvm.org/D40021 Change-Id: Iaf6c1f5169add1552845a8a566af4e5a359217a7 llvm-svn: 320137
-
Craig Topper authored
[X86] Handle all versions of vXi1 insert_vector_elt with a constant index without falling back to shuffles. We previously only supported inserting to the LSB or MSB where it was easy to zero to perform an OR to insert. This change effectively extracts the old value and the new value, xors them together and then xors that single bit with the correct location in the original vector. This will cancel out the old value in the first xor, leaving the new value in the position. The way I've implemented this uses 3 shifts and two xors and uses an additional register. We can avoid the additional register at the cost of another shift. llvm-svn: 320120
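The xor trick described above can be modelled on plain integers. The sketch below is one possible sequence of three shifts and two xors that satisfies the description; it is not claimed to be the exact sequence the lowering emits, and the function name `insert_bit` and its parameters are hypothetical.

```python
def insert_bit(vec, new_bit, idx, width):
    """Insert `new_bit` at position `idx` of a `width`-bit mask vector.

    Uses three shifts and two xors, modelling a fixed-width mask register
    with a Python integer.
    """
    mask = (1 << width) - 1
    t = (vec & mask) >> idx        # shift 1: old bit moves to position 0
    t ^= new_bit & 1               # xor 1: bit 0 is now old ^ new
    t = (t << (width - 1)) & mask  # shift 2: bit 0 to the MSB, all else dropped
    t >>= width - 1 - idx          # shift 3: the single bit moves to `idx`
    return (vec & mask) ^ t        # xor 2: old ^ (old ^ new) == new at `idx`


print(bin(insert_bit(0b1010, 1, 0, 4)))
print(bin(insert_bit(0b1010, 0, 1, 4)))
```

The temporary `t` plays the role of the additional register mentioned in the message; all other bits of the original vector are xored with zero and so stay unchanged.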
-
- Dec 07, 2017
-
-
Craig Topper authored
[X86] Fix InsertBitToMaskVector to only issue KSHIFTS of native size so that upper bits are properly zeroed. There's no v2i1 or v4i1 kshift, and v8i1 is only supported with AVX512DQ. Isel has fake patterns to extend these types to native shifts, but makes no guarantees about the value of any bits shifted in when shifting right. This patch promotes the vector to a type that supports a native shift first and only allows inserting into the msb of a native sized shift. I've constructed this in a way that doesn't do the promotion if we're going to fall back to using a xmm/ymm/zmm shuffle. I think I have a plan to remove the shuffle fallback entirely, in which case this can be simplified, but I wanted to fix the correctness issue first. llvm-svn: 320081
-
Simon Pilgrim authored
Put these under UNARY/BINOP ALU itinerary classes for now - seems to be a good average value llvm-svn: 320064
-
Simon Pilgrim authored
llvm-svn: 320062
-
Craig Topper authored
llvm-svn: 320059
-
Simon Pilgrim authored
Treat these the same as LAHF/SAHF (although it's not an x86_64 instruction) llvm-svn: 320055
-
Simon Pilgrim authored
llvm-svn: 320054
-
Simon Pilgrim authored
llvm-svn: 320052
-
Simon Pilgrim authored
Tagged as IMUL instructions for a reasonable approximation (ALU tends to be a lot faster) - POPCNT is currently tagged as FAdd which I think should be replaced with IMUL as well llvm-svn: 320051
-
Sanjay Patel authored
I noticed this pattern in D38316 / D38388. We failed to combine a shuffle that is either repeating a scalar insertion at the same position in a vector or translated to a different element index. Like the earlier patch, this could be an instcombine too, but since we opted to make this a DAG transform earlier, I've made this one a DAG patch too. We do not need any legality checking because the new insert is identical to the existing insert except that it may have a different constant insertion operand. The constant insertion test in test/CodeGen/X86/vector-shuffle-combining.ll was the motivation for D38756. Differential Revision: https://reviews.llvm.org/D40209 llvm-svn: 320050
-
Simon Pilgrim authored
llvm-svn: 320048
-
Simon Pilgrim authored
llvm-svn: 320045
-
Simon Pilgrim authored
llvm-svn: 320042
-
Simon Pilgrim authored
llvm-svn: 320040
-
Simon Pilgrim authored
llvm-svn: 320039
-
Andrew V. Tischenko authored
Differential Revision: https://reviews.llvm.org/D40345 llvm-svn: 320034
-