[mlir][gpu][nvvm] fixed bug with literal for inline asm for mma instruction

The 'mma.sp.sync.aligned' family of instructions expects the sparsity
selector as a direct literal (0x0 or 0x1). Previously, the MLIR inline asm
passed this selector as a value in a register, which broke the downstream
assemblers.

This is a small step towards supporting 2:4 sparsity on NVIDIA GPUs in the
sparse compiler of MLIR.

Reviewed By: ThomasRaoux, guraypp

Differential Revision: https://reviews.llvm.org/D146110