[X86] Improve mul x, 2^N +/- 2 pattern by making the +/- 2x compute independently of x << N
The previous pattern emitted the ops in sequence, which just increases the latency (to 3c, same as imul!), i.e.:
    `(add/sub (add/sub (shl x, N), x), x)`

It is better to compute 2x independently of x << N for better ILP, i.e.:
    `(add/sub (shl x, N), (add x, x))`

Reviewed By: pengfei, RKSimon

Differential Revision: https://reviews.llvm.org/D141113
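
As an illustration only (not part of the patch), the two expansions can be modelled in Python to check that they compute the same product. The function names `mul_serial` and `mul_parallel` and the 64-bit mask are assumptions made for this sketch; the actual change lives in the X86 backend's multiply lowering.

```python
MASK64 = (1 << 64) - 1  # assumption: model 64-bit wraparound arithmetic


def mul_serial(x, n, sub=False):
    """Old form: (add/sub (add/sub (shl x, N), x), x).

    Each step depends on the previous result, so the three ops form one chain.
    """
    t = (x << n) & MASK64
    for _ in range(2):
        t = (t - x if sub else t + x) & MASK64
    return t


def mul_parallel(x, n, sub=False):
    """New form: (add/sub (shl x, N), (add x, x)).

    The shift and the doubling both depend only on x, so they can issue
    in the same cycle; only the final add/sub waits on both.
    """
    shifted = (x << n) & MASK64
    two_x = (x + x) & MASK64
    return (shifted - two_x if sub else shifted + two_x) & MASK64
```

For example, `mul_parallel(5, 3)` models `5 * (2^3 + 2)` and `mul_parallel(5, 3, sub=True)` models `5 * (2^3 - 2)`. The dependency chain drops from three ops to two, which is where the latency saving comes from.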