Commit 267d6d66 authored Mar 29, 2023 by Lawrence Benson Committed by David Green Mar 29, 2023

[AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask.

When using Clang's __builtin_shufflevector with a 16xi8 or 8xi8 source and a
runtime mask on an AArch64 target, LLVM currently generates 16 or 8
extract+and+insert operations. This patch replaces these operations with (a
vector AND +) NEON's tbl1 instruction.
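
As a rough illustration of why the two forms are interchangeable, the scalar
C sketch below models the 16xi8 case. It is not code from the patch: the
function names are hypothetical, and it assumes that the AND wraps each mask
byte into the range 0..15 and that a single-register table lookup returns 0
for any index outside the 16-byte table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 16 };

/* Per-lane form: one extract + and + insert per output byte.
   Assumption: the AND wraps each index into 0..15. */
static void shuffle_per_lane(const uint8_t src[LANES],
                             const uint8_t mask[LANES],
                             uint8_t out[LANES]) {
    for (int i = 0; i < LANES; ++i)
        out[i] = src[mask[i] & (LANES - 1)];
}

/* Scalar model of a single-register table lookup (tbl1):
   an index outside the 16-byte table produces 0. */
static void tbl1_model(const uint8_t table[LANES],
                       const uint8_t idx[LANES],
                       uint8_t out[LANES]) {
    for (int i = 0; i < LANES; ++i)
        out[i] = idx[i] < LANES ? table[idx[i]] : 0;
}

/* Vector form: one AND over the whole mask, then one table lookup. */
static void shuffle_and_tbl1(const uint8_t src[LANES],
                             const uint8_t mask[LANES],
                             uint8_t out[LANES]) {
    uint8_t wrapped[LANES];
    for (int i = 0; i < LANES; ++i)
        wrapped[i] = mask[i] & (LANES - 1);
    tbl1_model(src, wrapped, out);
}

/* Returns 1 when both forms agree for every possible index byte. */
static int forms_agree_for_all_indices(void) {
    uint8_t src[LANES], mask[LANES], a[LANES], b[LANES];
    for (int i = 0; i < LANES; ++i)
        src[i] = (uint8_t)(0xA0 + i);
    for (int base = 0; base < 256; ++base) {
        for (int i = 0; i < LANES; ++i)
            mask[i] = (uint8_t)(base + i);
        shuffle_per_lane(src, mask, a);
        shuffle_and_tbl1(src, mask, b);
        if (memcmp(a, b, LANES) != 0)
            return 0;
    }
    return 1;
}

/* Convenience check that aborts if the two forms ever disagree. */
static void self_check(void) {
    assert(forms_agree_for_all_indices());
}
```

In this model the AND matters only when the mask may hold bytes above 15:
without it the table lookup would produce 0 for those lanes instead of a
wrapped element, which is presumably why the commit message lists the vector
AND in parentheses.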

Issue: https://github.com/llvm/llvm-project/issues/60515

Differential Revision: https://reviews.llvm.org/D146212

parent 0b57d47b
