[RISCV][ISel] Remove redundant vmerge for the vwadd. (#78403)
This patch resolves the missed-optimization case below.

### Code

```
define <8 x i64> @vwadd_mask_v8i32(<8 x i32> %x, <8 x i64> %y) {
    %mask = icmp slt <8 x i32> %x, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
    %a = select <8 x i1> %mask, <8 x i32> %x, <8 x i32> zeroinitializer
    %sa = sext <8 x i32> %a to <8 x i64>
    %ret = add <8 x i64> %sa, %y
    ret <8 x i64> %ret
}
```

### Before this patch

[Compiler Explorer](https://godbolt.org/z/cd1bKTrx6)

```
vwadd_mask_v8i32:
        li      a0, 42
        vsetivli        zero, 8, e32, m2, ta, ma
        vmslt.vx        v0, v8, a0
        vmv.v.i v10, 0
        vmerge.vvm      v16, v10, v8, v0
        vwadd.wv        v8, v12, v16
        ret
```

### After this patch

```
vwadd_mask_v8i32:
        li      a0, 42
        vsetivli        zero, 8, e32, m2, ta, ma
        vmslt.vx        v0, v8, a0
        vsetvli zero, zero, e32, m2, tu, mu
        vwadd.wv        v12, v12, v8, v0.t
        vmv4r.v v8, v12
        ret
```

This pattern can be found in a reduction with a widening destination.

Specifically, we first do a fold like `(vwadd.wv y, (vmerge cond, x, 0)) -> (vwadd.wv y, x, y, cond)`, then do pattern matching on it.
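The fold is sound because inactive lanes add zero in the unfolded form, which leaves `y` unchanged, and that is exactly what a masked `vwadd.wv` with `y` as the passthru operand produces under a mask-undisturbed policy. The following is a minimal per-lane model of that equivalence in Python, not the ISel code from this patch. The helper names `sext32_to_64`, `wrap64`, `unfolded`, and `folded` are hypothetical and exist only for illustration.

```python
def sext32_to_64(v):
    """Interpret v as a signed 32-bit lane and widen it."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


def wrap64(v):
    """Wrap an integer to a signed 64-bit lane, as a 64-bit add would."""
    v &= 0xFFFFFFFFFFFFFFFF
    return v - (1 << 64) if v & (1 << 63) else v


def unfolded(x, y, mask):
    """Model of vmerge followed by an unmasked vwadd.wv."""
    # vmerge: take x where the mask is set, 0 elsewhere.
    merged = [xi if m else 0 for xi, m in zip(x, mask)]
    # vwadd.wv: widen each narrow lane and add it to the wide operand.
    return [wrap64(yi + sext32_to_64(mi)) for yi, mi in zip(y, merged)]


def folded(x, y, mask):
    """Model of a masked vwadd.wv whose passthru is y (mask undisturbed)."""
    # Active lanes compute y + sext(x); inactive lanes keep the passthru y.
    return [
        wrap64(yi + sext32_to_64(xi)) if m else yi
        for xi, yi, m in zip(x, y, mask)
    ]


if __name__ == "__main__":
    # Same shape as the IR test case: 8 lanes, mask = (x < 42).
    x = [1, 100, -5, 41, 42, 43, -(2**31), 2**31 - 1]
    y = [10, 20, 30, 40, 50, 60, 70, 80]
    mask = [xi < 42 for xi in x]
    assert unfolded(x, y, mask) == folded(x, y, mask)
```

The model assumes the wide operand and the passthru are the same value, as in the fold above. If they differed, inactive lanes would no longer match the unfolded result.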