[SystemZ] Improve handling of ZERO_EXTEND_VECTOR_INREG.
Instead of doing multiple unpacks when zero extending vectors (e.g. v2i16 -> v2i64), benchmarks have shown that it is better to do a VPERM (vector permute) since that is only one sequential instruction on the critical path. This patch achieves this by 1. Expand ZERO_EXTEND_VECTOR_INREG into a vector shuffle with a zero vector instead of (multiple) unpacks. 2. Improve SystemZ::GeneralShuffle to perform a single unpack as the last operation if Bytes matches it. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D78486
Loading
Please sign in to comment