Skip to content
Commit 2b46417a authored by Simon Pilgrim's avatar Simon Pilgrim
Browse files

[X86][SSE] Attempt to lower vec_reduce_add patterns with PSADBW for zero-extended vXi8 sources

For i16/32/64 vectors, if the upper bits are known to be zero, then we can try to truncate to vXi8 (if its worth it) and perform this as a PSADBW to add+zext each v4i8 subvector to a i64 sum, which we can then reduce together.

This addresses some of the PR42674 test cases where the source data was vXi8 but had been extended to match a wider unsigned integer accumulator.

Differential Revision: https://reviews.llvm.org/D120193
parent acb96ffd
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment