[TTI] Add costing for vp.strided.load and vp.strided.store (#80360)
The primary motivation of this patch is to add testing infrastructure on top of the recently landed 8ad14b6d, so that we can separate the costing of strided memory operations from the SLP implementation details.

I want to be clear that I am *not* proposing that we use the vp.strided.* forms as our canonical IR representation. I'm merely using them as a testing vehicle to exercise the costing machinery. The canonical IR form remains a masked.gather or masked.scatter. I do want to explore adding a non-vp strided load/store intrinsic, but that's a separate line of work.

There is one costing change included in this patch. As I wrote my test, I discovered that the default implementation was scalarized (if invoked via generic routines such as getInstructionCost). When adding the call into the strided-specific costing, I then discovered that we hadn't modeled the fallback to scalarization properly in the initial patch. After fixing that, there is a minor difference in the scalarization cost reported for the unaligned case, but I believe that to be uninteresting.

For the record, I did confirm that vp.strided.store is lowered to a strided store on RISCV. :)
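
To picture the scalarization fallback described above, here is a minimal toy sketch. It is not the code in this patch and does not use LLVM's TTI API: the struct, the function names, and the unit costs are all hypothetical, chosen only to show the shape of the decision (use a native strided cost when the target supports the access, otherwise charge one scalar memory op plus one lane insert/extract per element).

```cpp
#include <cassert>

// Toy model only: these names are hypothetical and are not LLVM's TTI API.
struct ToyCosts {
  unsigned ScalarMemOp = 1;   // assumed cost of one scalar load or store
  unsigned LaneMove = 1;      // assumed cost of one insertelement/extractelement
  unsigned StridedPerElt = 1; // assumed per-element cost of a native strided access
};

// Fallback path: every element becomes a scalar memory op plus a lane move.
unsigned scalarizedCost(unsigned NumElts, const ToyCosts &C) {
  return NumElts * (C.ScalarMemOp + C.LaneMove);
}

// Cost of a strided load/store on a fixed-length vector of NumElts elements.
unsigned stridedMemOpCost(unsigned NumElts, bool LegalOnTarget,
                          const ToyCosts &C) {
  assert(NumElts > 0 && "expected a non-empty fixed-length vector");
  if (!LegalOnTarget)
    return scalarizedCost(NumElts, C); // the fallback must be modeled explicitly
  return NumElts * C.StridedPerElt;
}
```

With the default unit costs, a 4-element access costs 4 when the strided form is legal and 8 when it has to be scalarized. The real cost model uses target-specific numbers.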