[X86][SSE] Fold BITOP(MOVMSK(X),MOVMSK(Y)) -> MOVMSK(BITOP(X,Y))
Reduce XMM->GPR traffic by performing the bitop on the vectors and then issuing a single MOVMSK. This requires both source vectors to have the same size and element width, but fp and int types of equal width can be mixed with suitable bitcasting.
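For illustration only, here is a portable scalar model of why the fold is sound. It is not the DAG combine itself: `V4`, `movmsk`, `vand`, `vor`, `vxor` and `foldHolds` are hypothetical names invented for this sketch, and a 4 x 32-bit vector is assumed. MOVMSK only reads the sign bit of each lane, and AND/OR/XOR act on each bit position independently, so applying the bitop before or after the mask extraction gives the same result.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar stand-in for a 4 x i32 (or bitcast 4 x f32) vector.
using V4 = std::array<std::uint32_t, 4>;

// Model of MOVMSK: collect the sign bit of each lane into bits 0..3.
inline unsigned movmsk(const V4 &v) {
  unsigned mask = 0;
  for (unsigned i = 0; i < 4; ++i)
    mask |= (v[i] >> 31) << i;
  return mask;
}

// Lane-wise bitops, standing in for the vector AND/OR/XOR nodes.
inline V4 vand(const V4 &a, const V4 &b) {
  V4 r{};
  for (unsigned i = 0; i < 4; ++i) r[i] = a[i] & b[i];
  return r;
}
inline V4 vor(const V4 &a, const V4 &b) {
  V4 r{};
  for (unsigned i = 0; i < 4; ++i) r[i] = a[i] | b[i];
  return r;
}
inline V4 vxor(const V4 &a, const V4 &b) {
  V4 r{};
  for (unsigned i = 0; i < 4; ++i) r[i] = a[i] ^ b[i];
  return r;
}

// True if BITOP(MOVMSK(x), MOVMSK(y)) == MOVMSK(BITOP(x, y)) for all three ops.
inline bool foldHolds(const V4 &x, const V4 &y) {
  return (movmsk(x) & movmsk(y)) == movmsk(vand(x, y)) &&
         (movmsk(x) | movmsk(y)) == movmsk(vor(x, y)) &&
         (movmsk(x) ^ movmsk(y)) == movmsk(vxor(x, y));
}
```

The left-hand side of each comparison needs two vector-to-scalar transfers, while the right-hand side needs one, which is the traffic reduction this patch is after.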