- Feb 10, 2020
-
-
Johannes Doerfert authored
This is a minimal but important advancement over the existing code. A cast with an operand that is only used in the cast retains the no-alias property of the operand.
-
Johannes Doerfert authored
-
Djordje Todorovic authored
Fix the added License. Differential Revision: https://reviews.llvm.org/D74207
-
Amara Emerson authored
I'm not sure there's a test case for this, but it's better to be safe.
-
Johannes Doerfert authored
Traversing PHI nodes is natural with the genericValueTraversal but also a bit tricky. The problem is similar to the ones we have seen in AAAlign and AADereferenceable, namely that we continue to increase the range in each iteration. We use a pessimistic approach here to stop the iterations. Nevertheless, optimistic information can now be propagated through a PHI node.
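The growth problem described above can be illustrated with a small sketch. This is not the Attributor's implementation: the names `join` and `phi_range`, the `max_updates` cutoff, and the use of `None` for "unknown range" are all assumptions made for illustration. It models a loop-carried PHI `i = phi(init, i + step)` over inclusive integer ranges and gives up pessimistically once the range keeps growing.

```python
def join(a, b):
    """Smallest inclusive range (lo, hi) that covers both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def phi_range(init, step, max_updates=4):
    """Range of i = phi(init, i + step), or None if we give up.

    max_updates is a hypothetical cutoff: once the range has been
    updated that many times without stabilizing, we stop iterating
    and report "unknown" instead of widening forever.
    """
    current = init
    for _ in range(max_updates):
        # Value arriving over the back edge: the current range shifted by step.
        back_edge = (current[0] + step, current[1] + step)
        updated = join(init, back_edge)
        if updated == current:
            return current  # fixpoint reached, the range is stable
        current = updated
    return None  # pessimistic result: the range kept growing
```

With `step = 0` the range stabilizes on the first update. With a nonzero step it grows on every update, so the sketch falls back to the unknown range.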
-
Johannes Doerfert authored
The change is performed as stated by the FIXME and the tests are adjusted. All changes look fine to me and values can be inferred as undef without it being an error.
-
Johannes Doerfert authored
The genericValueTraversal will already handle SelectInst properly and we just needed to allow them in the initialize method.
-
Johannes Doerfert authored
Casts can be handled natively by the ConstantRange class. For now we limit this to extensions, as we assume an integer type in several places. A TODO and a test case with a FIXME were added to remove that restriction in the future.
-
Johannes Doerfert authored
We now call the base class method as we should.
-
Craig Topper authored
-
Johannes Doerfert authored
Inspired by https://llvm.discourse.group/t/impossible-condition-optimization/461
-
Johannes Doerfert authored
-
Craig Topper authored
[X86] Make (insert_vector_elt (v8i16 zerovec), i16 %x, 0) generate the same code as (v8i16 (build_vector %x, 0, 0, 0, 0, 0, 0, 0)). Instead of using a pinsrw into element 0, use movzx and movd. Same for v16i8.
-
Michael Liao authored
-
Michael Liao authored
- Lifetime intrinsics expect the pointer directly from the alloca. Extra handling is needed for targets with allocas in a non-default (or non-zero) address space.
-
Craig Topper authored
-
Craig Topper authored
Using sign extend forces the adjacent element to be either all zeros or all ones. But all ones is a NaN, so that doesn't seem like a great idea. I'm trying to support this with strict FP, where a NaN would definitely be bad.
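A quick way to see why all ones is a problem, assuming 32-bit float lanes (the lane width is an assumption; the commit message does not state it). The helper names `sext32_to_64` and `upper_lane_as_f32` are hypothetical. The sketch sign-extends a 32-bit pattern to 64 bits and reinterprets the upper half as an IEEE-754 single.

```python
import math
import struct


def sext32_to_64(value):
    """Sign-extend a 32-bit pattern to a 64-bit pattern."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def upper_lane_as_f32(value64):
    """Reinterpret the upper 32 bits of a 64-bit pattern as a float."""
    upper_bits = (value64 >> 32) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", upper_bits))[0]


# 1.0f (0x3F800000) has a clear sign bit, so the neighbour becomes 0.0.
positive_neighbour = upper_lane_as_f32(sext32_to_64(0x3F800000))
# -1.0f (0xBF800000) has the sign bit set, so the neighbour becomes
# 0xFFFFFFFF: exponent all ones with a nonzero mantissa, i.e. a NaN.
negative_neighbour = upper_lane_as_f32(sext32_to_64(0xBF800000))
```

Zero extension would leave the neighbouring lane at 0.0 in both cases.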
-
Shiva Chen authored
When the FP exists, the FP base CFI directive offset should take the size of variable arguments into account. Differential Revision: https://reviews.llvm.org/D73862
-
Fangrui Song authored
Similar to D67797 (DataExtractor).
-
Matt Arsenault authored
Narrow these for 64-bit VALU for AMDGPU.
-
Matt Arsenault authored
-
Matt Arsenault authored
The result type is separate from the source type.
-
Matt Arsenault authored
Vector indexing with a constant index should be folded out in the legalizer, but this was accidentally falling through. This would produce the indexing operation with $noreg. Handle this case as a dynamic index just in case a bug like this happens again in the future.
-
Matt Arsenault authored
We were failing to find constants that were cast. I feel like the artifact combiner should have folded the constant in the trunc before the custom lowering, but that doesn't happen.
-
- Feb 09, 2020
-
-
Matt Arsenault authored
At one point a custom node was used for kill handling, but now the intrinsic is directly selected. Remove leftover pattern machinery.
-
Matt Arsenault authored
Reverts part of 6524a7a2. Since that commit, the expansion was ignoring the actual save exec register produced by the instruction, and looking at other instructions. I do not understand why it was looking at other instructions, but relying on this scan was wrong. Fixes verifier errors after SI_IF is tail duplicated, which should be correct to do. The results were fed into a phi, which was lowered to the S_MOV_B64_term instructions.
-
Simon Pilgrim authored
Fix issue mentioned on rGe82e17d4d4ca - non-AVX512BW targets failed to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).
-
Craig Topper authored
We were using MOV32r0 and an extract_subreg as an input. By using custom isel we can move the extract_subreg to after the SBB instead of on the input.
-
Craig Topper authored
The flag isn't used, but I believe this matches the MOV32r0 that would be created by the table emitter. This should allow this node to be CSEed with any others created by the table.
-
Simon Pilgrim authored
As noted on PR44379, we didn't attempt to lower vector shuffles using bit rotations on XOP/AVX512F targets. This patch lowers to uniform ISD::ROTL nodes - ROTR isn't supported by XOP and they are interchangeable for constant values anyway. There might be cases where targets without ISD::ROTL support would benefit from this (expanding to SRL+SHL+OR), which I'll investigate in a future patch. Also, non-AVX512BW targets fail to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).
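The equivalence this relies on can be checked with a small model. This is only a sketch on plain Python integers, not the X86 lowering code; the helper names (`rotl`, `rotr`, `shuffle_bytes`, `pack_le`) and the little-endian byte layout are assumptions for illustration. It shows that a byte shuffle which permutes bytes uniformly inside each element is a rotate, that a constant ROTR can be rewritten as a ROTL, and that a rotate expands to SHL, SRL and OR.

```python
def rotl(x, amt, bits):
    """Rotate left, expanded as SHL + SRL + OR on a bits-wide value."""
    mask = (1 << bits) - 1
    amt %= bits
    return ((x << amt) | (x >> (bits - amt))) & mask


def rotr(x, amt, bits):
    """Rotate right, expanded as SRL + SHL + OR on a bits-wide value."""
    mask = (1 << bits) - 1
    amt %= bits
    return ((x >> amt) | (x << (bits - amt))) & mask


def shuffle_bytes(data, mask):
    """Single-input byte shuffle: result[i] = data[mask[i]]."""
    return [data[i] for i in mask]


def pack_le(data, elt_bytes):
    """Group bytes into little-endian integer elements."""
    return [
        int.from_bytes(bytes(data[i:i + elt_bytes]), "little")
        for i in range(0, len(data), elt_bytes)
    ]


data = [0x11, 0x22, 0x33, 0x44]

# Swapping the two bytes of every 16-bit element is a rotate by 8.
swapped = pack_le(shuffle_bytes(data, [1, 0, 3, 2]), 2)
rotated16 = [rotl(e, 8, 16) for e in pack_le(data, 2)]

# Moving the top byte of a 32-bit element to the bottom is a rotate left by 8.
cycled = pack_le(shuffle_bytes(data, [3, 0, 1, 2]), 4)
rotated32 = [rotl(e, 8, 32) for e in pack_le(data, 4)]
```

For a constant amount `c`, `rotr(x, c, bits)` equals `rotl(x, bits - c, bits)`, which is why emitting only left rotates loses nothing here.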
-
Craig Topper authored
Not sure if this really matters. The VT isn't really used after this point. At best it might affect CSE.
-
Craig Topper authored
A vselect+strictfp node is not equivalent to a masked operation. The exceptions of the strictfp node are not masked by a vselect after it, so we can't match it to a masked operation. We already had a hack in IsLegalToFold to prevent these patterns from matching. This patch removes that hack and removes the patterns.
-
Jan Vesely authored
Fixes OCL CTS rsqrt and half_rsqrt (1 thread, scalar) tests on AMD Turks. Reviewer: awatry Differential Revision: https://reviews.llvm.org/D74016
-
Jan Vesely authored
Reviewer: awatry Differential Revision: https://reviews.llvm.org/D74013
-
Simon Pilgrim authored
Helps with bit rotation test coverage for PR44379
-
Simon Pilgrim authored
-
Simon Pilgrim authored
A matchShuffleAsBitRotate variant will be added soon and we need to make the difference more obvious.
-
Jan Kratochvil authored
-
Kamil Rytarowski authored
-
LLVM GN Syncbot authored
-