Commit 89b144ec authored Dec 29, 2022 by Andrzej Warzynski

[mlir][linalg] Vectorize tensor.extract using contiguous loads

This patch implements vectorization of tensor.extract for n-D tensor (n
>= 2) using contiguous load operations, i.e. `vector.transfer_read`. This
is a follow-up of https://reviews.llvm.org/D137660 in which gather loads
were used, i.e. `vector.gather`.

It is always safe to use gather load operations when the underlying
memory pattern is contiguous, but not vice-verse. At the moment, the
following conditions have to be met for contiguous loads to be
generated:
  1. The _output tensor_ must be a 1-D vector with the trailing dim > 1,
     e.g. `tensor<1x1x4xi32`,
  2. The trailing dim in the _input tensor_ must be > 1, e.g.
     `tensor<1x1x4i32>` would be fine, but not `tensor<1x4x1xi32>`.
If these conditions are not satisfied, gather loads are generated
instead.

Condition 1 guarantees that the iteration space of the corresponding
`linalg.generic` Op is relatively simple. That makes analysing the
indices for `tensor.extract` rather straightforward.

Condition 2 is mostly there to avoid weird vectorisation patterns
resulting in vectors like: `vector<1x1x1xi32>`. In practice, tensors
like `tensor<1x4x1xi32>` should be collapsed to `tensor<1x4xi32>` before
vectorisation, but that's beyond the scope of this patch.

If needed, both conditions can be relaxed. I've not been able to find a
good motivating example for these, hence skipping. For reference,
`tosa.resize` (lowered to Linalg) was the driving example used here.

As a bonus, the test from "vectorization-unsupported.mlir" is moved to
"vectorization.mlir" with proper CHECK lines added.

Differential Revision: https://reviews.llvm.org/D141998



Co-authored-by: Diego Caballero <diegocaballero@google.com>

parent f7b7c698

Show whitespace changes

Inline Side-by-side

Please to comment