[LV] Emit @llvm.get.active.mask for tail-folded loops
This emits new IR intrinsic @llvm.get.active.mask for tail-folded vectorised loops if the intrinsic is supported by the backend, which is checked by querying TargetTransform hook emitGetActiveLaneMask. This intrinsic creates a mask representing active and inactive vector lanes, which is used by the masked load/store instructions that are created for tail-folded loops. The semantics of @llvm.get.active.mask are described here in LangRef: https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics This intrinsic is also used to provide a hint to the backend. That is, the second argument of the intrinsic represents the back-edge taken count of the loop. For MVE, for example, we use that to set up tail-predication, which is a new form of predication in MVE for vector loops that implicitely predicates the last vector loop iteration by implicitely setting active/inactive lanes, i.e. the tail loop is predicated. In order to set up a tail-predicated vector loop, we need to know the number of data elements processed by the vector loop, which corresponds the the tripcount of the scalar loop, which we can now reconstruct using @llvm.get.active.mask. Differential Revision: https://reviews.llvm.org/D79100
Loading
Please register or sign in to comment