[AArch64] Cost-model vector splat LD1Rs to avoid unprofitable SLP vectorisation
This slightly increases the cost of InsertElement instructions that are part of a vector splat sequence, i.e. a load, an InsertElement and a shuffle (load + dup). The resulting LD1R is a high-latency instruction, and the slight cost increase prevents SLP vectorisation in a couple of cases where it isn't profitable.

Fixes: https://github.com/llvm/llvm-project/issues/61047

Differential Revision: https://reviews.llvm.org/D145578
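
The idea described in this message can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not the code from the patch: the `Inst` type, the `Op` enum, the helper names, and the cost values `BaseCost` and `SplatLoadPenalty` are all hypothetical and do not correspond to LLVM's actual API or to the numbers used in the real AArch64 cost model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcodes for a toy IR; these are NOT LLVM's real classes.
enum class Op { Load, InsertElement, Shuffle, Add };

struct Inst {
  Op op;
  int operand;    // index of the defining instruction, or -1 if none
  bool zeroMask;  // Shuffle only: true if the mask broadcasts lane 0
};

// Assumed costs, chosen purely for illustration.
constexpr int BaseCost = 1;
constexpr int SplatLoadPenalty = 1;

// True if insts[i] is an InsertElement whose scalar comes from a Load and
// whose result is broadcast by a lane-0 shuffle (the load + dup pattern).
bool isSplatLoadInsert(const std::vector<Inst>& insts, std::size_t i) {
  assert(i < insts.size());
  if (insts[i].op != Op::InsertElement) return false;
  int src = insts[i].operand;
  if (src < 0 || insts[static_cast<std::size_t>(src)].op != Op::Load)
    return false;
  for (const Inst& user : insts)
    if (user.op == Op::Shuffle && user.zeroMask &&
        user.operand == static_cast<int>(i))
      return true;
  return false;
}

// Cost of one instruction: the base cost, plus a small penalty when the
// instruction is the InsertElement of a splat-load sequence.
int instCost(const std::vector<Inst>& insts, std::size_t i) {
  return BaseCost + (isSplatLoadInsert(insts, i) ? SplatLoadPenalty : 0);
}

// Total cost of a sequence, as a vectoriser might sum it before comparing
// against the scalar alternative.
int sequenceCost(const std::vector<Inst>& insts) {
  int total = 0;
  for (std::size_t i = 0; i < insts.size(); ++i) total += instCost(insts, i);
  return total;
}
```

In this sketch, a load/InsertElement/shuffle sequence costs one unit more than the same shape fed by a non-load scalar, which is enough to tip a borderline profitability comparison towards the scalar code.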