AMDGPU/SDAG: Refine the fold to v_mad_[iu]64_[iu]32

Only fold for uniform values on pre-GFX9 chips. GFX9+ allows us to keep
the calculation entirely on the SALU.

For subtargets where integer multiplication isn't full-rate, avoid
folding if the multiply has too many uses.

Finally, also expand 64x32 and 64x64 multiplies here when they feed
into an addition. This results in better code generation than the
generic expansion for such multiplies because we end up using the
accumulator of the MAD instructions.

Differential Revision: https://reviews.llvm.org/D123835
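
To illustrate why expanding a 64x64 multiply that feeds an addition can use the MAD accumulator, here is a minimal arithmetic sketch. It is not the SelectionDAG code from this change. The helper names `mad_u64_u32` and `mul64_add` are hypothetical, and the model ignores the instruction's carry-out. It only shows the identity x*y + z = mad(x_lo, y_lo, z) + ((x_lo*y_hi + x_hi*y_lo) << 32) modulo 2^64; the order in which the actual combine emits these operations may differ.

```cpp
#include <cstdint>

// Hypothetical model of v_mad_u64_u32 (carry-out ignored):
// a 32x32 -> 64-bit multiply plus a 64-bit addend, wrapping mod 2^64.
constexpr uint64_t mad_u64_u32(uint32_t a, uint32_t b, uint64_t c) {
  return static_cast<uint64_t>(a) * b + c;
}

// Computes x * y + z (mod 2^64) for 64-bit x and y using one MAD
// plus 32-bit multiplies for the cross terms.
constexpr uint64_t mul64_add(uint64_t x, uint64_t y, uint64_t z) {
  const uint32_t xlo = static_cast<uint32_t>(x);
  const uint32_t xhi = static_cast<uint32_t>(x >> 32);
  const uint32_t ylo = static_cast<uint32_t>(y);
  const uint32_t yhi = static_cast<uint32_t>(y >> 32);

  // The low partial product absorbs the addend z in the MAD accumulator,
  // so no separate 64-bit add is needed for it.
  const uint64_t acc = mad_u64_u32(xlo, ylo, z);

  // The cross terms only influence the upper 32 bits of the result,
  // so 32-bit multiplies (wrapping mod 2^32) are enough.
  const uint32_t cross = xlo * yhi + xhi * ylo;

  return acc + (static_cast<uint64_t>(cross) << 32);
}

// Compile-time sanity check against plain 64-bit arithmetic.
static_assert(mul64_add(3, 5, 7) == 22, "small values");
```

For a 64x32 multiply, `yhi` is zero in this model, so one of the cross terms drops out.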