Commit 98798073 authored by Christopher Bate

[mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types

Adds missing logic in the lowering from NvGPU to NVVM to support f32
(as an accumulator operand) and tf32 (as a multiplicand operand) types.
Fixes the logic in one of the helper functions that converts the result
of an mma.sync operation with multiple 8x256bit output tiles, which is
the case for f32 outputs.

Differential Revision: https://reviews.llvm.org/D124533
parent c207e360