[CostModel][X86] Improve accuracy of vXi64 MUL costs on AVX2/AVX512 targets
According to llvm-mca analysis, Haswell/Broadwell have the worst v4i64 recip-throughput cost of the AVX2 targets at 6 (vs the currently used cost of 8). Similarly, SkylakeServer (our only AVX512 target model) implements PMULLQ with an average cost of 1.5 (rounded up to 2), and the PMULUDQ sequence (used without AVX512DQ) at a cost of 6.
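
For context, the following is a minimal scalar sketch of the standard decomposition of a 64-bit multiply into 32x32->64 multiplies, which is what a PMULUDQ-based sequence computes per lane. The helper names `mul32x32` and `mul64_via_pmuludq` are hypothetical, and the instruction names in the comments are an assumption about a typical lowering, not the exact sequence LLVM emits.

```cpp
#include <cstdint>

// Scalar model of PMULUDQ on one 64-bit lane: multiply the low 32 bits of
// each operand and keep the full 64-bit product.
static inline uint64_t mul32x32(uint64_t a, uint64_t b) {
  return (a & 0xFFFFFFFFu) * (b & 0xFFFFFFFFu);
}

// Hypothetical helper: low 64 bits of a * b built from three 32x32
// multiplies. The hi(a) * hi(b) term is omitted because it only affects
// bits 64 and above.
uint64_t mul64_via_pmuludq(uint64_t a, uint64_t b) {
  uint64_t lolo = mul32x32(a, b);        // PMULUDQ (assumed)
  uint64_t hilo = mul32x32(a >> 32, b);  // PSRLQ + PMULUDQ (assumed)
  uint64_t lohi = mul32x32(a, b >> 32);  // PSRLQ + PMULUDQ (assumed)
  uint64_t cross = (hilo + lohi) << 32;  // PADDQ + PSLLQ (assumed)
  return lolo + cross;                   // PADDQ (assumed)
}
```

The three multiplies plus the surrounding shifts and adds suggest why the non-AVX512DQ path is modelled as several times more expensive than a single PMULLQ.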