Skip to content
Commit 32b7c2fa authored by Simon Pilgrim's avatar Simon Pilgrim
Browse files

[X86][SSE] Support variable-index float/double vector insertion on SSE41+ targets (PR47924)

Extends D95779 to permit insertion into float/doubles vectors while avoiding a lot of aliased memory traffic.

The scalar value is already on the simd unit, so we only need to transfer and splat the index value, then perform the select.

SSE4 codegen is a little bulky due to the tied register requirements of (non-VEX) BLENDPS/PD but the extra moves are cheap so shouldn't be an actual problem.

Differential Revision: https://reviews.llvm.org/D95866
parent 7a45f27b
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment