Skip to content
Commit fff3e1db authored by Sanjay Patel's avatar Sanjay Patel
Browse files

[x86] enable fast sqrtss/sqrtps tuning for AMD Zen cores

As discussed in D118534, all of the recent AMD CPUs have
relatively fast (<14 cycle latency) "sqrtss" and "sqrtps"
instructions:
https://uops.info/table.html?search=sqrtps&cb_lat=on&cb_tp=on&cb_SNB=on&cb_SKL=on&cb_ZENp=on&cb_ZEN2=on&cb_ZEN3=on&cb_measurements=on&cb_avx=on&cb_sse=on

So we should set this tuning flag to alter codegen of plain
"sqrt(X)" expansion (as opposed to reciprocal-sqrt - there
is other test coverage for that pattern). The expansion is
both slower and less accurate than the hardware instruction.

Differential Revision: https://reviews.llvm.org/D119001
parent dbed14d2
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment