[MLIR] [affine-loop-fusion] Fix a bug about non-result ops in affine-loop-fusion
This patch fixes the following bug in --affine-loop-fusion.

Input program:

```mlir
func @should_not_fuse_since_top_level_non_affine_non_result_users(
    %in0 : memref<32xf32>, %in1 : memref<32xf32>) {
  %c0 = constant 0 : index
  %cst_0 = constant 0.000000e+00 : f32

  affine.for %d = 0 to 32 {
    %lhs = affine.load %in0[%d] : memref<32xf32>
    %rhs = affine.load %in1[%d] : memref<32xf32>
    %add = addf %lhs, %rhs : f32
    affine.store %add, %in0[%d] : memref<32xf32>
  }
  store %cst_0, %in0[%c0] : memref<32xf32>
  affine.for %d = 0 to 32 {
    %lhs = affine.load %in0[%d] : memref<32xf32>
    %rhs = affine.load %in1[%d] : memref<32xf32>
    %add = addf %lhs, %rhs : f32
    affine.store %add, %in0[%d] : memref<32xf32>
  }
  return
}
```

Running --affine-loop-fusion on it produced incorrect output:

```mlir
func @should_not_fuse_since_top_level_non_affine_non_result_users(%arg0: memref<32xf32>, %arg1: memref<32xf32>) {
  %c0 = constant 0 : index
  %cst = constant 0.000000e+00 : f32
  store %cst, %arg0[%c0] : memref<32xf32>
  affine.for %arg2 = 0 to 32 {
    %0 = affine.load %arg0[%arg2] : memref<32xf32>
    %1 = affine.load %arg1[%arg2] : memref<32xf32>
    %2 = addf %0, %1 : f32
    affine.store %2, %arg0[%arg2] : memref<32xf32>
    %3 = affine.load %arg0[%arg2] : memref<32xf32>
    %4 = affine.load %arg1[%arg2] : memref<32xf32>
    %5 = addf %3, %4 : f32
    affine.store %5, %arg0[%arg2] : memref<32xf32>
  }
  return
}
```

The two loops are fused and the store to %in0[%c0] now executes before the
fused loop instead of between the two original loops, which changes the
value left in element 0 of %in0.

This happened because, when analyzing the source and destination nodes,
affine loop fusion ignored non-result ops sandwiched between them. In other
words, the MemRefDependencyGraph built by affine loop fusion did not contain
these non-result ops.

This patch solves the issue by adding these non-result ops to the
MemRefDependencyGraph.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D95668
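For illustration only, the effect of the missing nodes can be reproduced in a toy model. The Python sketch below is not the MLIR implementation: the `Op` class, `graph_nodes`, `can_fuse`, and the `track_non_result_ops` flag are hypothetical names invented here, and the legality check is deliberately simplified to "no recorded op between the two loops touches a conflicting memref".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """A top-level op, reduced to the facts this toy model needs (hypothetical)."""
    name: str
    is_loop: bool = False
    num_results: int = 0
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def graph_nodes(ops, track_non_result_ops):
    """Return (position, op) pairs for the ops the toy graph records.

    With track_non_result_ops=False, ops that are not loops and define no
    results are dropped, which mimics the behavior described as the bug.
    """
    return [
        (pos, op)
        for pos, op in enumerate(ops)
        if op.is_loop or op.num_results > 0 or track_non_result_ops
    ]


def can_fuse(ops, src, dst, track_non_result_ops):
    """True if no recorded op between src and dst conflicts with either loop."""
    loops = (ops[src], ops[dst])
    loop_reads = frozenset().union(*(loop.reads for loop in loops))
    loop_writes = frozenset().union(*(loop.writes for loop in loops))
    for pos, op in graph_nodes(ops, track_non_result_ops):
        if not src < pos < dst:
            continue
        # A write conflicts with any loop access; a read conflicts with loop writes.
        if op.writes & (loop_reads | loop_writes) or op.reads & loop_writes:
            return False
    return True


# The input program from this commit message, in program order.
loop_access = dict(reads=frozenset({"in0", "in1"}), writes=frozenset({"in0"}))
ops = [
    Op("constant %c0", num_results=1),
    Op("constant %cst_0", num_results=1),
    Op("affine.for #1", is_loop=True, **loop_access),
    Op("store %cst_0, %in0[%c0]", writes=frozenset({"in0"})),  # no results
    Op("affine.for #2", is_loop=True, **loop_access),
]

buggy = can_fuse(ops, 2, 4, track_non_result_ops=False)
fixed = can_fuse(ops, 2, 4, track_non_result_ops=True)
print(buggy, fixed)  # prints "True False"
```

When the store is left out of the graph, nothing appears to sit between the two loops and the model allows fusion; once it is recorded as a node, its write to `in0` blocks fusion.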