Reland [X86][CostModel] X86TTIImpl::getMemoryOpCost(): rewrite vector handling again
Instead of handling power-of-two sized vector chunks, try handling the large vector in a stream mode, decreasing the operational vector size once it no longer works for the elements left to process. Notably, this improves costs for overaligned loads - loading padding is fine. This more directly tracks when we need to insert/extract the YMM/XMM subvector, some costs fluctuate because of that. This was initially landed in c02476f3, but reverted in 5fddc331, because the code made some very optimistic assumptions about invariants that didn't hold in practice. Reviewed By: RKSimon, ABataev Differential Revision: https://reviews.llvm.org/D100684
Loading
Please sign in to comment