[X86] Prefer vmovmsk instead of vtest for alderlake.
On the alderlake E-core, the latency of VMOVMSKPS is 5 for both YMM and XMM, while the latency of VTESTPS is 7 for YMM and 5 for XMM. Since alderlake uses the P-core schedule model, we can't determine which one is better from the latency information in the schedule model. Instead, we add a tuning feature for alderlake and select VMOVMSKPS when that tuning feature is set.

In the case of "vmovmskps + test + jcc", the test and jcc can be fused, while vtest and jcc can't.

Differential Revision: https://reviews.llvm.org/D152227
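For reference, here is a minimal scalar sketch of why the two sequences are interchangeable for an "is any sign bit set" check. It is not the code added by this patch. The helper names (model_movmskps, model_vtestps_zf, no_sign_bit_set) are hypothetical and only model the documented instruction semantics on 4 float lanes.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 4-lane stand-in for an XMM register of floats.
using Vec4 = std::array<float, 4>;

// Sign bit of one float lane (1 if set, 0 otherwise).
static unsigned sign_bit(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits >> 31;
}

// Scalar model of (V)MOVMSKPS: the sign bit of lane i becomes bit i of the result.
unsigned model_movmskps(const Vec4 &v) {
    unsigned mask = 0;
    for (unsigned i = 0; i < 4; ++i)
        mask |= sign_bit(v[i]) << i;
    return mask;
}

// Scalar model of ZF after VTESTPS a, b:
// ZF is set iff the sign bits of (a AND b) are all zero.
bool model_vtestps_zf(const Vec4 &a, const Vec4 &b) {
    for (unsigned i = 0; i < 4; ++i)
        if (sign_bit(a[i]) & sign_bit(b[i]))
            return false;
    return true;
}

// "No lane has its sign bit set", computed both ways.
bool no_sign_bit_set(const Vec4 &v) {
    bool via_movmsk = model_movmskps(v) == 0;  // vmovmskps + test + jcc
    bool via_vtest = model_vtestps_zf(v, v);   // vtestps + jcc
    assert(via_movmsk == via_vtest);
    return via_movmsk;
}
```

Both paths compute the same predicate, so the choice between them is purely a tuning decision. The sketch says nothing about latency or fusion.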