[CodeGen] Improve handling of -Ofast generated code by ComplexDeinterleaving pass
Code generated with -Ofast can differ significantly from code generated with -O3 -ffp-contract=fast (add -ffinite-math-only to enable vectorization). Code compiled with -O3 can be deinterleaved by pattern matching because the instruction order is preserved. With -Ofast, however, the computation sequence can change in multiple ways, and the real and imaginary parts may not even be calculated in parallel.

For more details, refer to the llvm/test/CodeGen/AArch64/complex-deinterleaving-*-fast.ll and llvm/test/CodeGen/AArch64/complex-deinterleaving-*-contract.ll tests.

This patch implements a more general approach that handles most -Ofast cases.

Differential Revision: https://reviews.llvm.org/D148558
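For illustration only, here is a minimal C sketch of the kind of source loop this description concerns. It is not taken from the patch or its tests; the names `cfloat` and `complex_dot` are hypothetical. The loop accumulates products of interleaved complex values, where the real and imaginary parts are written as two parallel expressions. With -O3 -ffp-contract=fast that ordering is expected to survive into the generated code, whereas -Ofast permits reassociation, so the compiler may reorder the additions and multiplications.

```c
#include <stddef.h>

/* Hypothetical helper type for this sketch: one complex value. */
typedef struct {
    float re;
    float im;
} cfloat;

/*
 * Sum of a[i] * b[i] over n complex values stored interleaved as
 * {re0, im0, re1, im1, ...}. The real and imaginary parts are written
 * as two parallel expressions; fast-math flags allow the compiler to
 * reassociate them.
 */
cfloat complex_dot(const float *a, const float *b, size_t n) {
    cfloat acc = {0.0f, 0.0f};
    for (size_t i = 0; i < n; ++i) {
        float ar = a[2 * i], ai = a[2 * i + 1];
        float br = b[2 * i], bi = b[2 * i + 1];
        acc.re += ar * br - ai * bi; /* real part */
        acc.im += ar * bi + ai * br; /* imaginary part */
    }
    return acc;
}
```

Compiling such a function with each set of flags and comparing the output is one way to see the difference in instruction order that the message refers to.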