[AMDGPU] Prefer v_fmac over v_fma only when no source modifiers are used
v_fmac with source modifiers forces VOP3 encoding. In that case it is
strictly better to use the VOP3-only v_fma instead: its $dst and $src2
are not tied, which gives the register allocator more freedom and avoids
a copy in some cases.

This is the same strategy we already use for v_mad vs v_mac and
v_fma_legacy vs v_fmac_legacy.

Differential Revision: https://reviews.llvm.org/D110070
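
The selection rule described in this message can be illustrated with a small standalone sketch. It is a toy model, not the actual change: the Opcode enum, the FmaRequest struct, and the functions selectFma and dstTiedToSrc2 are hypothetical names invented for illustration and do not correspond to LLVM's selection patterns or APIs.

```cpp
// Toy model of the preference rule; none of these names exist in LLVM.
enum class Opcode { V_FMAC, V_FMA };

// Hypothetical description of one fused multiply-add to be selected.
struct FmaRequest {
    bool hasSourceModifiers;  // true if any source operand carries a modifier
};

// v_fmac accumulates into its destination, so $dst is tied to $src2.
// v_fma has an independent destination register.
constexpr bool dstTiedToSrc2(Opcode op) {
    return op == Opcode::V_FMAC;
}

// Prefer v_fmac only when no source modifiers are used. With modifiers,
// v_fmac would need VOP3 encoding anyway, so the untied v_fma is chosen.
constexpr Opcode selectFma(const FmaRequest &req) {
    return req.hasSourceModifiers ? Opcode::V_FMA : Opcode::V_FMAC;
}
```

In this model, a request with modifiers always yields an opcode whose destination is not tied, which is the property that leaves the register allocator free to pick any destination register.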