Skip to content
Unverified Commit 3855757f authored by Chia's avatar Chia Committed by GitHub
Browse files

[RISCV][ISel] Remove redundant vmerge for the vwadd. (#78403)

This patch is aiming at resolving the below missed-optimization case. 

### Code
```
define <8 x i64> @vwadd_mask_v8i32(<8 x i32> %x, <8 x i64> %y) {
    %mask = icmp slt <8 x i32> %x, <i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42, i32 42>
    %a = select <8 x i1> %mask, <8 x i32> %x, <8 x i32> zeroinitializer
    %sa = sext <8 x i32> %a to <8 x i64>
    %ret = add <8 x i64> %sa, %y
    ret <8 x i64> %ret
}
```

### Before this patch
[Compiler Explorer](https://godbolt.org/z/cd1bKTrx6)
```
vwadd_mask_v8i32:
        li      a0, 42
        vsetivli        zero, 8, e32, m2, ta, ma
        vmslt.vx        v0, v8, a0
        vmv.v.i v10, 0
        vmerge.vvm      v16, v10, v8, v0
        vwadd.wv        v8, v12, v16
        ret
```

### After this patch
```
vwadd_mask_v8i32:
        li a0, 42
        vsetivli zero, 8, e32, m2, ta, ma
        vmslt.vx v0, v8, a0
        vsetvli zero, zero, e32, m2, tu, mu
        vwadd.wv v12, v12, v8, v0.t
        vmv4r.v v8, v12
        ret
```
This pattern could be found in a reduction with a widening destination

Specifically, we first do a fold like `(vwadd.wv y, (vmerge cond, x, 0))
-> (vwadd.wv y, x, y, cond)`, then do pattern matching on it.
parent 608d6028
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment