Commit ea685e10 authored Sep 09, 2021 by Simon Pilgrim

[X86][AVX] Update _mm256_loadu2_m128* intrinsics to use _mm256_set_m128* (PR51796)

As reported in PR51796, _mm256_loadu2_m128i in particular was inserting bitcasts and shuffles with different types, which made some combines trickier and prevented the backend from identifying the shuffle sequences as a single insert_subvector-style concat_vectors pattern.

This patch instead concatenates the two 128-bit unaligned loads with _mm256_set_m128*, which was written to avoid the unnecessary bitcasts and emits only a single shuffle.
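
For illustration only, the pattern described above might look roughly like the sketch below for the integer variant. The helper name loadu2_sketch and its out-parameter are hypothetical and are not the actual header text; the sketch assumes a GCC- or Clang-compatible compiler and a CPU with AVX. It only shows two unaligned 128-bit loads being concatenated with _mm256_set_m128i, with the first argument landing in the upper lane.

```c
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the header's actual definition): performs two
   unaligned 128-bit loads and concatenates them into one 256-bit vector,
   with `hi` in the upper 128 bits and `lo` in the lower 128 bits.
   The result is written to `out` so the example does not depend on the
   calling convention for returning __m256i values. */
__attribute__((target("avx")))
static void loadu2_sketch(const void *hi, const void *lo, uint8_t out[32]) {
    __m128i vhi = _mm_loadu_si128((const __m128i *)hi);
    __m128i vlo = _mm_loadu_si128((const __m128i *)lo);

    /* Single concatenation step: no intermediate casts between vector types. */
    __m256i v = _mm256_set_m128i(vhi, vlo);

    _mm256_storeu_si256((__m256i *)out, v);
}
```

Storing the result and comparing each 16-byte half against the corresponding input is a simple way to confirm the lane order.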

Differential Revision: https://reviews.llvm.org/D109497

parent dd662f0f
