[AArch64][SME] Support NEON scalar FP instructions in streaming mode

The following scalar FP instructions are legal in streaming mode:

    0101 1110 xx1x xxxx 11x1 11xx xxxx xxxx # FMULX/FRECPS/FRSQRTS (scalar)
    0101 1110 x10x xxxx 00x1 11xx xxxx xxxx # FMULX/FRECPS/FRSQRTS (scalar, FP16)
    01x1 1110 1x10 0001 11x1 10xx xxxx xxxx # FRECPE/FRSQRTE/FRECPX (scalar)
    01x1 1110 1111 1001 11x1 10xx xxxx xxxx # FRECPE/FRSQRTE/FRECPX (scalar, FP16)

Predicate them on `HasNEONorStreamingSVE`.

Full list of affected instructions:

    FMULX16, FMULX32, FMULX64, FRECPS16, FRECPS32, FRECPS64,
    FRSQRTS16, FRSQRTS32, FRSQRTS64, FRECPEv1f16, FRECPEv1i32,
    FRECPEv1i64, FRECPXv1f16, FRECPXv1i32, FRECPXv1i64,
    FRSQRTEv1f16, FRSQRTEv1i32, FRSQRTEv1i64

Depends on D107902.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06/SIMD-FP-Instructions

Execution of NEON instructions that are illegal in streaming mode will
cause a trap or exception. Using FMULX [1] as an example, this check is
at the top of the pseudocode:

    if elements == 1 then
        CheckFPEnabled64();
    else
        CheckFPAdvSIMDEnabled64();

For the legal scalar variants it calls `CheckFPEnabled64`, whereas for
the illegal vector variants it calls `CheckFPAdvSIMDEnabled64` which
traps. This is useful for observing which instructions are/aren't legal
in streaming mode.

[1] https://developer.arm.com/documentation/ddi0602/2021-06/SIMD-FP-Instructions/FMULX--Floating-point-Multiply-extended-

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D108039
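
Illustrative note (not part of the patch): each of the four bit patterns listed in this message can be read as a mask/value pair, where every `0` or `1` is a fixed bit and every `x` is a don't-care bit. The following is a minimal C++ sketch of that reading, assuming the patterns are written most significant bit first. The names `EncodingPattern`, `parsePattern`, and `isListedScalarFPEncoding` are hypothetical and do not exist in LLVM; the patch itself expresses the restriction through the `HasNEONorStreamingSVE` predicate, not through code like this.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: a 32-bit encoding pattern as a mask/value pair.
// A bit set in `mask` is fixed by the pattern; `value` holds the fixed bits.
struct EncodingPattern {
  std::uint32_t mask;
  std::uint32_t value;
};

// Parse a pattern such as "0101 1110 xx1x ..." (MSB first).
// '0' and '1' are fixed bits, 'x' is a don't-care bit, spaces are ignored.
inline EncodingPattern parsePattern(const char *text) {
  EncodingPattern p = {0, 0};
  int bits = 0;
  for (const char *c = text; *c != '\0'; ++c) {
    if (*c == ' ')
      continue;
    p.mask <<= 1;
    p.value <<= 1;
    if (*c == '0' || *c == '1') {
      p.mask |= 1u;
      if (*c == '1')
        p.value |= 1u;
    }
    ++bits;
  }
  assert(bits == 32 && "an A64 encoding pattern must describe 32 bits");
  return p;
}

// The four patterns, copied verbatim from the commit message.
static const char *const kListedPatterns[] = {
    "0101 1110 xx1x xxxx 11x1 11xx xxxx xxxx", // FMULX/FRECPS/FRSQRTS (scalar)
    "0101 1110 x10x xxxx 00x1 11xx xxxx xxxx", // same, FP16
    "01x1 1110 1x10 0001 11x1 10xx xxxx xxxx", // FRECPE/FRSQRTE/FRECPX (scalar)
    "01x1 1110 1111 1001 11x1 10xx xxxx xxxx", // same, FP16
};

// True if the instruction word matches any of the listed patterns.
inline bool isListedScalarFPEncoding(std::uint32_t insn) {
  for (const char *text : kListedPatterns) {
    const EncodingPattern p = parsePattern(text);
    if ((insn & p.mask) == p.value)
      return true;
  }
  return false;
}
```

For instance, `isListedScalarFPEncoding(0x5E22DC20)` returns true and `isListedScalarFPEncoding(0x6E22DC20)` returns false. These two words were assembled by hand and are assumed, not verified against an assembler, to be the scalar `fmulx s0, s1, s2` and the vector `fmulx v0.4s, v1.4s, v2.4s` respectively.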