Commit f2f043ac authored Jun 05, 2018 by Simon Pilgrim

[X86][SSE] Use multiplication scale factors for v8i16 SHL on pre-AVX2 targets.

Similar to v4i32 SHL, convert v8i16 shift amounts to scale factors instead to improve performance and reduce instruction count. We were already doing this for constant shifts, this adds variable shift support.

Reduces the serial nature of the codegen, which relies on chains of plendvb/pand+pandn+por shifts.

This is a step towards adding support for vXi16 vector rotates.

Differential Revision: https://reviews.llvm.org/D47546

llvm-svn: 334023

parent 05b58910

Show whitespace changes

Inline Side-by-side

Please to comment