[ARM,MVE] Add intrinsics for the VQDMLAD family.
Summary: This is another set of instructions too complicated to be sensibly expressed in IR by anything short of a target-specific intrinsic. Given input vectors a,b, the instruction generates intermediate values 2*(a[0]*b[0]+a[1]+b[1]), 2*(a[2]*b[2]+a[3]+b[3]), etc; takes the high half of each double-width values, and overwrites half the lanes in the output vector c, which you therefore have to provide the input value of. Optionally you can swap the elements of b so that the are things like a[0]*b[1]+a[1]*b[0]; optionally you can round to nearest when taking the high half; and optionally you can take the difference rather than sum of the two products. Finally, saturation is applied when converting back to a single-width vector lane. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: miyuki Subscribers: kristof.beyls, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76359
Loading
Please register or sign in to comment