[mlir][ArmSME] Calculate correct tile mask when lowering arm_sme.zero

This patch updates the lowering of arm_sme.zero to intrinsics so that it calculates the correct mask for the tile to zero.

The zero instruction takes an 8-bit mask that specifies which 64-bit tiles to zero; bits 0 to 7 correspond to ZA0.D to ZA7.D. Zeroing a tile with an element size of 8 to 32 bits only requires zeroing the right set of 64-bit tiles. This is easy to calculate: each element size has a "base mask", which is shifted left by the tile ID to get the mask for that tile:

  base_mask << tile_id

After tile allocation, this will be folded to a constant mask.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D157902
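
For illustration only, the base_mask << tile_id calculation can be sketched as standalone C++. This is not the code from the patch: the function names baseMaskForElementWidth and zeroMaskForTile are hypothetical, and the base-mask values are an assumption taken from the SME tile layout (ZA0.B overlaps all eight 64-bit tiles, ZA0.H every second one, ZA0.S every fourth) rather than from this commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): returns the mask of 64-bit tiles
// overlapped by tile 0 of the given element width. Values assumed from the
// SME tile layout.
inline uint8_t baseMaskForElementWidth(unsigned elementWidthInBits) {
  switch (elementWidthInBits) {
  case 8:
    return 0b11111111; // ZA0.B overlaps ZA0.D to ZA7.D
  case 16:
    return 0b01010101; // ZA0.H overlaps ZA0.D, ZA2.D, ZA4.D, ZA6.D
  case 32:
    return 0b00010001; // ZA0.S overlaps ZA0.D, ZA4.D
  case 64:
    return 0b00000001; // ZA0.D
  default:
    assert(false && "unsupported element width");
    return 0;
  }
}

// Hypothetical helper (not from the patch): mask = base_mask << tile_id.
inline uint8_t zeroMaskForTile(unsigned elementWidthInBits, unsigned tileId) {
  // An element width of N bits has N / 8 tiles, so valid IDs are 0..N/8-1.
  assert(tileId < elementWidthInBits / 8 && "tile ID out of range");
  return static_cast<uint8_t>(baseMaskForElementWidth(elementWidthInBits)
                              << tileId);
}
```

Under these assumptions, zeroMaskForTile(32, 3) yields 0b10001000, i.e. ZA3.S is zeroed by zeroing ZA3.D and ZA7.D. When the tile ID is a compile-time constant, the shift reduces to a constant, which mirrors the folding mentioned above.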