[X86][AVX] Combine 128/256-bit lane shuffles with zeroable upper subvectors to EXTRACT_SUBVECTOR (PR40720)

As explained on PR40720, EXTRACTF128 is always as good as or better than VPERM2F128/SHUF128, and we can rely on the implicit zeroing of the upper subvectors.
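
To illustrate why the two forms are interchangeable, here is a minimal scalar sketch in plain C++ (no intrinsics, and not the LLVM code from this commit). It models a 256-bit register as eight floats and assumes the documented behaviour that a VEX-encoded write to a 128-bit register zeroes the upper half of the full register. The names `Vec256`, `perm2f128`, `extractf128_zext` and `upperLaneShuffleMatchesExtract` are hypothetical helpers introduced only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar model of a 256-bit register: two 128-bit lanes of four floats each.
using Vec256 = std::array<float, 8>;

// Model of VPERM2F128. Each result lane has a 4-bit control field in imm:
//   bit 0: take the low (0) or high (1) lane of the chosen source
//   bit 1: choose source a (0) or source b (1)
//   bit 3: zero this result lane instead of copying
// The low result lane uses imm[3:0], the high result lane uses imm[7:4].
Vec256 perm2f128(const Vec256 &a, const Vec256 &b, std::uint8_t imm) {
  Vec256 r{}; // value-initialised, so every element starts as 0.0f
  for (int lane = 0; lane < 2; ++lane) {
    const std::uint8_t ctrl = (imm >> (4 * lane)) & 0xF;
    if (ctrl & 0x8)
      continue; // this result lane stays zero
    const Vec256 &src = (ctrl & 0x2) ? b : a;
    const int srcLane = ctrl & 0x1;
    for (int i = 0; i < 4; ++i)
      r[4 * lane + i] = src[4 * srcLane + i];
  }
  return r;
}

// Model of VEXTRACTF128 into a register, viewed at full 256-bit width:
// the selected lane lands in the low half, the upper half is implicitly zero.
Vec256 extractf128_zext(const Vec256 &v, int lane) {
  Vec256 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = v[4 * lane + i];
  return r;
}

// Checks the equivalence for the "upper lane to low, zero the upper" shuffles.
bool upperLaneShuffleMatchesExtract() {
  const Vec256 a{0, 1, 2, 3, 4, 5, 6, 7};
  const Vec256 b{8, 9, 10, 11, 12, 13, 14, 15};
  // imm 0x81: low result = a.hi, high result zeroed.
  // imm 0x83: low result = b.hi, high result zeroed.
  return perm2f128(a, b, 0x81) == extractf128_zext(a, 1) &&
         perm2f128(a, b, 0x83) == extractf128_zext(b, 1);
}
```

In this model, any two-source lane shuffle whose upper result lane is zeroable collapses to a single-source extract, which is the shape of pattern the commit title describes folding to EXTRACT_SUBVECTOR.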