[CostModel][X86] Improve accuracy of vXi64 vector non-uniform shift costs on AVX2+ targets
rG1ad4f887bd7692a9e63fb42586f0ece366f2fe01 incorrectly assumed that vXi64 non-uniform shifts were slow like vXi32 were - but llvm-mca (+Agner) both confirm that Haswell/Broadwell are full rate.
Loading
Please sign in to comment