Commit 579812c0, authored by Ivan Chikish, committed by Simon Pilgrim

[X86] LowerRotate: prefer unpack-based algorithm

Split out from, and improving on, https://reviews.llvm.org/D146357

When running tests for LowerShift, I discovered some poor codegen in rotate and funnel shift tests. This patch attempts to address some of them.

Using unpack to split the vector and then applying double-bitwidth shifts may improve performance, according to tests with https://uica.uops.info.

    No cross-lane shuffles
    No dirtying double-width registers
    Massive improvement for AVX2 rotates in some cases (var_funnnel_v8i16, var_funnnel_v16i16), because unpack is currently only used for vXi8 vectors.
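
The idea behind the unpack-based approach can be illustrated with a scalar model of a single 16-bit lane. This is only a sketch of the arithmetic, not the code LLVM emits: the helper names (unpack16, fshl16_unpack, rotl16_unpack) are hypothetical, and the real lowering operates on whole vectors with unpack and variable-shift instructions. Interleaving two 16-bit values into one 32-bit element lets a single double-width shift do the work, after which the upper half holds the result.

```cpp
#include <cstdint>

// Scalar model of one 16-bit lane (illustrative only, not LLVM's output).
// "Unpacking" hi with lo builds a 32-bit element holding hi:lo.
constexpr uint32_t unpack16(uint16_t hi, uint16_t lo) {
    return (static_cast<uint32_t>(hi) << 16) | lo;
}

// Funnel shift left: shift the 32-bit concatenation hi:lo left by the
// amount (taken modulo 16) and keep the upper 16 bits.
constexpr uint16_t fshl16_unpack(uint16_t hi, uint16_t lo, unsigned amt) {
    return static_cast<uint16_t>((unpack16(hi, lo) << (amt & 15)) >> 16);
}

// A rotate is a funnel shift whose two inputs are the same value.
constexpr uint16_t rotl16_unpack(uint16_t x, unsigned amt) {
    return fshl16_unpack(x, x, amt);
}
```

In this model every lane is handled independently inside its own double-width element, which is consistent with the "no cross-lane shuffles" point above.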

Differential Revision: https://reviews.llvm.org/D149071
parent b5d1ea9d