[mlir] add a simple gpu barrier elimination mechanism
GPU code generation, and specifically the shared memory copy insertion, may introduce spurious barriers that guard only read-after-read dependencies, or read-after-write dependencies on non-aliasing data. These barriers degrade performance through unnecessary synchronization. Add a pattern and a transform op that remove such barriers by analyzing the memory effects that a barrier actually guards and that are not already guarded by other barriers.

The code is adapted from the Polygeist incubator project.

Co-authored-by: William Moses <gh@wsmoses.com>
Co-authored-by: Ivan Radanov Ivanov <ivanov.i.aa@m.titech.ac.jp>

Reviewed By: nicolasvasilache, wsmoses

Differential Revision: https://reviews.llvm.org/D154720
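
Illustration (not part of the patch): the idea of checking which memory effects a barrier guards can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the names BARRIER, conflicts and eliminate_barriers are hypothetical, the program is a straight-line list of (kind, buffer) accesses, and buffers with different names are assumed not to alias. The actual pattern works on MLIR operations and their memory effects and is not reproduced here.

```python
from typing import List, Tuple

Op = Tuple[str, str]  # (kind, buffer); kind is "read", "write" or "barrier"
BARRIER: Op = ("barrier", "")  # hypothetical marker for a barrier


def conflicts(before: List[Op], after: List[Op]) -> bool:
    """True if some access before and some access after the barrier touch
    the same buffer and at least one of the two is a write."""
    for kind_a, buf_a in before:
        for kind_b, buf_b in after:
            if buf_a == buf_b and "write" in (kind_a, kind_b):
                return True
    return False


def eliminate_barriers(ops: List[Op]) -> List[Op]:
    """Drop every barrier that guards no conflicting pair of accesses."""
    ops = list(ops)
    i = 0
    while i < len(ops):
        if ops[i] != BARRIER:
            i += 1
            continue
        # Accesses between the previous barrier (or program start) and this one.
        before = []
        j = i - 1
        while j >= 0 and ops[j] != BARRIER:
            before.append(ops[j])
            j -= 1
        # Accesses between this barrier and the next one (or program end).
        after = []
        k = i + 1
        while k < len(ops) and ops[k] != BARRIER:
            after.append(ops[k])
            k += 1
        if conflicts(before, after):
            i += 1  # the barrier guards a real dependency, keep it
        else:
            del ops[i]  # spurious barrier; re-examine the op now at index i
    return ops
```

In this sketch, a barrier between a write to A and a read of A is kept, while a barrier between two reads of A, or between a write to A and a read of an unrelated buffer B, is dropped. Because the neighbouring accesses are recomputed after each removal, a later barrier still sees the accesses that a removed barrier used to separate. Loops and other control flow are ignored in this model.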