[AMDGPU] Prevent aliasing of SrcC and Dst in MAI
From the MAI spec: it's OK for Src_C and vDst to be the exact same VGPRs, or for Src_C and vDst to be completely separate. The case where Src_C and vDst partially overlap should be avoided, as a new value could be written to the accumulator input before it gets read.

Note that this inevitably increases register pressure to the point where some programs will become uncompilable.

This patch separates MAC and FMA versions of MFMA instructions using either tied dst and src2 or earlyclobber dst.

Fixes: SWDEV-318900

Differential Revision: https://reviews.llvm.org/D117844
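
For illustration only, the register rule quoted from the MAI spec can be sketched as a small standalone check. This is not code from the patch and does not use any LLVM API: `RegRange`, `Overlap`, `classify`, and `isAllowed` are hypothetical names, and a VGPR tuple is assumed to be a contiguous range of register indices. In these terms, a tied dst/src2 yields the identical case and an earlyclobber dst yields the disjoint case.

```cpp
#include <cassert>

// Hypothetical model of a contiguous VGPR tuple: registers
// [first, first + count). Not an LLVM type.
struct RegRange {
  unsigned first;
  unsigned count;
};

enum class Overlap { Identical, Disjoint, Partial };

// Classify how the destination tuple (vDst) relates to the
// accumulator input tuple (Src_C).
inline Overlap classify(RegRange dst, RegRange srcC) {
  assert(dst.count > 0 && srcC.count > 0 && "empty register tuple");
  if (dst.first == srcC.first && dst.count == srcC.count)
    return Overlap::Identical;
  const unsigned dstEnd = dst.first + dst.count;
  const unsigned srcEnd = srcC.first + srcC.count;
  if (dstEnd <= srcC.first || srcEnd <= dst.first)
    return Overlap::Disjoint;
  // Some, but not all, registers are shared: the case the spec warns about.
  return Overlap::Partial;
}

// Identical and fully disjoint tuples are fine; a partial overlap is not.
inline bool isAllowed(RegRange dst, RegRange srcC) {
  return classify(dst, srcC) != Overlap::Partial;
}
```

Under these assumptions, `isAllowed(RegRange{2, 4}, RegRange{0, 4})` is false because v[2:5] and v[0:3] share v2 and v3 without being the same tuple.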