[mlir][linalg] Vectorize tensor.extract using contiguous loads
This patch implements vectorization of tensor.extract for n-D tensor (n >= 2) using contiguous load operations, i.e. `vector.transfer_read`. This is a follow-up of https://reviews.llvm.org/D137660 in which gather loads were used, i.e. `vector.gather`. It is always safe to use gather load operations when the underlying memory pattern is contiguous, but not vice-verse. At the moment, the following conditions have to be met for contiguous loads to be generated: 1. The _output tensor_ must be a 1-D vector with the trailing dim > 1, e.g. `tensor<1x1x4xi32`, 2. The trailing dim in the _input tensor_ must be > 1, e.g. `tensor<1x1x4i32>` would be fine, but not `tensor<1x4x1xi32>`. If these conditions are not satisfied, gather loads are generated instead. Condition 1 guarantees that the iteration space of the corresponding `linalg.generic` Op is relatively simple. That makes analysing the indices for `tensor.extract` rather straightforward. Condition 2 is mostly there to avoid weird vectorisation patterns resulting in vectors like: `vector<1x1x1xi32>`. In practice, tensors like `tensor<1x4x1xi32>` should be collapsed to `tensor<1x4xi32>` before vectorisation, but that's beyond the scope of this patch. If needed, both conditions can be relaxed. I've not been able to find a good motivating example for these, hence skipping. For reference, `tosa.resize` (lowered to Linalg) was the driving example used here. As a bonus, the test from "vectorization-unsupported.mlir" is moved to "vectorization.mlir" with proper CHECK lines added. Differential Revision: https://reviews.llvm.org/D141998 Co-authored-by:Diego Caballero <diegocaballero@google.com>
Loading
Please sign in to comment