[mlir][nvgpu] Fix 'warpgroup.mma.store' index calculation (#78413)
This PR fixes the 'nvgpu.warpgroup.mma.store' index calculation. When the destionation memref and current accumulator matrix were small, the previous code was reaching out of range.
Loading
Please sign in to comment