[AArch64][SVE] Fix bad PTEST(X, X) optimization
AArch64InstrInfo::optimizePTestInstr attempts to remove a PTEST of a
predicate-generating operation that identically sets flags (implicitly).

When the mask is the same as the input predicate, the PTEST is currently
removed. This is incorrect, since the mask for the implicit PTEST performed
by the flag-setting instruction differs from the mask specified to the
explicit PTEST and could set different flags.

For example, consider

  PG=<1, 1, x, x>
  Z0=<1, 2, x, x>
  Z1=<2, 1, x, x>

  X=CMPLE(PG, Z0, Z1)
   =<0, 1, x, x> NZCV=0xxx
  PTEST(X, X), NZCV=1xxx

where the first active flag (bit 'N' in NZCV) is set by the explicit PTEST,
but not by the implicit PTEST as part of the compare. Given that the PTEST
mask and source are the same, however, first is equivalent to any, so the
PTEST could be removed if the condition is changed. The same applies to last
active.

It is safe to remove the PTEST for any active, but this information isn't
available in the current optimization. This patch fixes the bad
optimization; a later patch will implement the optimization proposed above
and fix the any active case.

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D137717
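The flag mismatch in the example above can be reproduced with a small scalar model of PTEST. This is only a sketch: `Flags` and `modelPTest` are hypothetical names that do not exist in LLVM, predicates are modelled as `std::vector<bool>`, the don't-care lanes are dropped so only the first two lanes are simulated, and the V flag (always 0 for PTEST) is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the SVE condition flags set by PTEST.
struct Flags {
  bool N; // first active element is true
  bool Z; // no active element is true
  bool C; // last active element is NOT true
};

// Hypothetical scalar model of PTEST(mask, src): only lanes where
// `mask` is true count as active.
Flags modelPTest(const std::vector<bool> &mask, const std::vector<bool> &src) {
  assert(mask.size() == src.size());
  Flags f{false, true, true}; // result when there are no active lanes
  bool seenActive = false;
  bool lastActiveValue = false;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (!mask[i])
      continue;              // inactive lane, ignored
    if (!seenActive) {
      f.N = src[i];          // first active lane decides N
      seenActive = true;
    }
    if (src[i])
      f.Z = false;           // some active lane is true
    lastActiveValue = src[i]; // ends up holding the last active lane
  }
  f.C = !lastActiveValue;
  return f;
}
```

With PG=<1, 1> and X=<0, 1>, `modelPTest(PG, X)` (the implicit test done by the compare) yields N=0, while `modelPTest(X, X)` (the explicit PTEST) yields N=1. Z is 0 in both cases, which matches the observation that the any active condition is unaffected by the change of mask.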