[AMDGPU] Add WMMA clang builtins

Add WMMA clang builtins and tests. Extra changes in code are needed to
handle function overloads.

WavefrontSize 32:
__builtin_amdgcn_wmma_f32_16x16x16_f16_w32
__builtin_amdgcn_wmma_f32_16x16x16_bf16_w32
__builtin_amdgcn_wmma_f16_16x16x16_f16_w32
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32
__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32
__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32

WavefrontSize 64:
__builtin_amdgcn_wmma_f32_16x16x16_f16_w64
__builtin_amdgcn_wmma_f32_16x16x16_bf16_w64
__builtin_amdgcn_wmma_f16_16x16x16_f16_w64
__builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64
__builtin_amdgcn_wmma_i32_16x16x16_iu8_w64
__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D128952