[mlir][Linalg] Improve conv vectorization for the stride==1 case.

In the stride == 1 case, conv1d reads contiguous data along the input dimension. This can be used to advantage to perform memory transfers and compute in bulk while avoiding unrolling. Experimentally, this can yield speedups of up to 50%.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D112139
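
As an illustration of why the stride == 1 case is contiguous, here is a minimal scalar sketch in plain Python. It is not the vectorization code from this patch; the function names `conv1d_strided` and `conv1d_unit_stride` are hypothetical and exist only for this example. The first function indexes the input at `w * stride + k`, the second assumes stride == 1 and handles each filter tap with one contiguous slice of the input.

```python
def conv1d_strided(inp, filt, stride):
    """Reference form: one scalar multiply-add per (output, tap) pair."""
    out_len = (len(inp) - len(filt)) // stride + 1
    out = [0.0] * out_len
    for w in range(out_len):
        for k in range(len(filt)):
            # With stride > 1, consecutive outputs read non-adjacent inputs.
            out[w] += inp[w * stride + k] * filt[k]
    return out


def conv1d_unit_stride(inp, filt):
    """Assumes stride == 1: each tap reads one contiguous input slice."""
    out_len = len(inp) - len(filt) + 1
    out = [0.0] * out_len
    for k, tap in enumerate(filt):
        window = inp[k:k + out_len]  # contiguous bulk read for tap k
        out = [acc + x * tap for acc, x in zip(out, window)]
    return out


inp = [1.0, 2.0, 3.0, 4.0, 5.0]
filt = [1.0, 0.5, -1.0]
result = conv1d_unit_stride(inp, filt)
```

In the unit-stride form the per-output loop disappears: the work for each tap is a single slice read followed by an elementwise multiply-add over the whole output, which is the shape that lends itself to bulk transfers.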