[mlir][nvvm] Implement `mbarrier.init`

NVIDIA GPUs provide split arrive/wait barriers (mbarrier) that can synchronize a subset of the threads in a CTA. They are particularly important on Hopper GPUs, where they also allow tracking asynchronous engines such as TMA.

For more details, see:
https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-mbarrier

This initial implementation lays the foundation for future enhancements and additions.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D151334