[AArch64][SVE] Improve code generation for VLS i1 masks
This patch partially resolves a VLS code generation issue where a mask is produced by an integer comparison on narrower elements than the instruction using the mask requires. Rather than sign extending a p register by converting it to a z register, extending that, and converting back, we now just unpack the p register.

A separate issue still causes poor code generation when the mask generation would fit in a NEON register, because we then use a NEON comparison and have to convert its result to a p register. This will be resolved in a separate patch.

Reviewed By: peterwaller-arm

Differential Revision: https://reviews.llvm.org/D111221
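
To illustrate why an unpack can stand in for the z register round trip, here is a small sketch in plain C++. It is a toy model, not code from this patch: it assumes a predicate holds one bit per vector byte, with an element counted as active when the lowest bit of its group is set, and it models only the low-half unpack. The names `Pred`, `makePred`, `punpkloModel`, and `viaVectorRoundTrip` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy predicate: one bit per byte of vector length (assumption of this sketch).
using Pred = std::vector<bool>;

// Build a predicate for elements of elemBytes bytes from per-lane booleans.
// Only the lowest bit of each element's group is ever set.
Pred makePred(const std::vector<bool>& lanes, std::size_t elemBytes,
              std::size_t vlBytes) {
    assert(lanes.size() * elemBytes <= vlBytes);
    Pred p(vlBytes, false);
    for (std::size_t i = 0; i < lanes.size(); ++i)
        p[i * elemBytes] = lanes[i];
    return p;
}

// Model of a low-half predicate unpack: each bit in the low half of the
// source is widened to two bits, with the upper of the two left clear.
Pred punpkloModel(const Pred& p) {
    Pred out(p.size(), false);
    for (std::size_t i = 0; i < p.size() / 2; ++i)
        out[2 * i] = p[i];
    return out;
}

// Model of the longer route: predicate -> 16-bit lanes of 0 / -1 ->
// sign extend the low half to 32-bit lanes -> compare against zero ->
// predicate for 32-bit elements.
Pred viaVectorRoundTrip(const Pred& p) {
    const std::size_t vl = p.size();
    std::vector<std::int16_t> halves(vl / 2);
    for (std::size_t i = 0; i < halves.size(); ++i)
        halves[i] = p[2 * i] ? -1 : 0;
    std::vector<std::int32_t> words(vl / 4);
    for (std::size_t i = 0; i < words.size(); ++i)
        words[i] = static_cast<std::int32_t>(halves[i]);  // sign extension
    Pred out(vl, false);
    for (std::size_t i = 0; i < words.size(); ++i)
        out[4 * i] = (words[i] != 0);
    return out;
}
```

In this model, a mask built for 16-bit elements with `makePred(lanes, 2, 32)` gives the same 32-bit-element predicate through `punpkloModel` as through `viaVectorRoundTrip`, which is the equivalence the unpack-based lowering relies on.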