[MLIR][GPU][NVVM] Add warp synchronous matrix-multiply accumulate ops (875eb523)

Commit 875eb523 authored May 06, 2021 by Navdeep Kumar Committed by Uday Bondhugula May 06, 2021

[MLIR][GPU][NVVM] Add warp synchronous matrix-multiply accumulate ops

Add warp-synchronous matrix-multiply-accumulate ops to the GPU and NVVM
dialects. Add the following three ops to the GPU dialect:
  1.) subgroup_mma_load_matrix
  2.) subgroup_mma_store_matrix
  3.) subgroup_mma_compute
Add the following three ops to the NVVM dialect:
  1.) wmma.m16n16k16.load.[a,b,c].[f16,f32].row.stride
  2.) wmma.m16n16k16.store.d.[f16,f32].row.stride
  3.) wmma.m16n16k16.mma.row.row.[f16,f32].[f16,f32]
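The load, compute, and store steps named above can be sketched as a plain
Python reference model of the arithmetic on one 16x16x16 tile (D = A * B + C).
This is not MLIR syntax and not the implementation in this commit: the helper
names load_tile, mma, and store_tile are hypothetical, the buffers are flat
row-major lists with a leading-dimension stride, and element precision
(f16 versus f32) is ignored.

```python
# Reference model of one warp-level 16x16x16 matrix-multiply-accumulate tile.
# All names here are illustrative assumptions, not ops or APIs from the commit.
M = N = K = 16


def load_tile(buf, offset, stride, rows, cols):
    """Read a rows x cols tile from a flat row-major buffer.

    `stride` is the leading dimension: the distance between row starts.
    """
    return [[buf[offset + r * stride + c] for c in range(cols)]
            for r in range(rows)]


def mma(a, b, c):
    """Return D = A * B + C for an MxK tile A, KxN tile B, and MxN tile C."""
    return [[c[i][j] + sum(a[i][k] * b[k][j] for k in range(K))
             for j in range(N)]
            for i in range(M)]


def store_tile(buf, offset, stride, tile):
    """Write a tile back into a flat row-major buffer with the given stride."""
    for r, row in enumerate(tile):
        for col, value in enumerate(row):
            buf[offset + r * stride + col] = value


# Usage: A is an identity tile inside a wider buffer (stride 32),
# B holds 0..255, C is all ones, so D should equal B + 1 elementwise.
lda = 32
a_buf = [0.0] * (M * lda)
for i in range(M):
    a_buf[i * lda + i] = 1.0
b_buf = [float(i) for i in range(K * N)]
c_buf = [1.0] * (M * N)
d_buf = [0.0] * (M * N)

a = load_tile(a_buf, 0, lda, M, K)
b = load_tile(b_buf, 0, N, K, N)
c = load_tile(c_buf, 0, N, M, N)
store_tile(d_buf, 0, N, mma(a, b, c))
```

In the real ops the tile is distributed across the threads of a warp and all
threads must execute the operation together; the model above only captures the
per-tile result, not that distribution.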

Reviewed By: bondhugula, ftynse, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D95330

parent 16c78297
