[SLP] Cluster ordering for loads
Given a load without a better order, this patch partially sorts the elements to form clusters of adjacent elements in memory. These clusters can potentially be loaded with fewer load instructions, meaning less overall shuffling (for example, loading each v4i8 cluster of a v16i8 as a single f32 load, as opposed to multiple independent byte loads and inserts).

Differential Revision: https://reviews.llvm.org/D122145
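
For illustration only, the clustering idea can be sketched as standalone C++. This is not the code from the patch: `clusterOrder` and the flat list of byte offsets are hypothetical, and the sketch assumes all lanes share one base pointer and that a "cluster" is simply a run of offsets that differ by one. It sorts lanes by offset, splits them into runs of adjacent offsets, and keeps the runs in order of first appearance, so the sort is only partial.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not an LLVM API): returns a lane order in which lanes
// whose byte offsets are adjacent in memory are grouped into clusters.
std::vector<std::size_t> clusterOrder(const std::vector<std::int64_t> &Offsets) {
  // Lane indices sorted by offset; stable so equal offsets keep lane order.
  std::vector<std::size_t> Sorted(Offsets.size());
  std::iota(Sorted.begin(), Sorted.end(), std::size_t{0});
  std::stable_sort(Sorted.begin(), Sorted.end(),
                   [&Offsets](std::size_t A, std::size_t B) {
                     return Offsets[A] < Offsets[B];
                   });

  // Split the sorted lanes into runs of consecutive offsets (the clusters).
  std::vector<std::vector<std::size_t>> Clusters;
  for (std::size_t I = 0; I < Sorted.size(); ++I) {
    bool Adjacent =
        I > 0 && Offsets[Sorted[I]] == Offsets[Sorted[I - 1]] + 1;
    if (!Adjacent)
      Clusters.emplace_back();
    Clusters.back().push_back(Sorted[I]);
  }

  // Order clusters by the earliest original lane they contain, so clusters
  // are not reordered relative to each other more than necessary.
  std::stable_sort(Clusters.begin(), Clusters.end(),
                   [](const std::vector<std::size_t> &A,
                      const std::vector<std::size_t> &B) {
                     return *std::min_element(A.begin(), A.end()) <
                            *std::min_element(B.begin(), B.end());
                   });

  // Flatten the clusters into the final lane order.
  std::vector<std::size_t> Order;
  for (const std::vector<std::size_t> &C : Clusters)
    Order.insert(Order.end(), C.begin(), C.end());
  return Order;
}
```

With offsets `{17, 0, 16, 1, 19, 2, 18, 3}` the sketch yields the lane order `{2, 0, 6, 4, 1, 3, 5, 7}`: one cluster covering offsets 16..19 followed by one covering offsets 0..3, each of which could then be considered for a single wider load.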