Commit 8e64e9c3, authored by Cullen Rhodes on Sep 28, 2023 (committed by GitHub on Sep 28, 2023)

[mlir][ArmSME] Add support for vector.transfer_read with transpose (#67527)

This patch adds support for lowering a vector.transfer_read with a
transpose permutation map to a vertical tile load, for example:

  vector.transfer_read ...  permutation_map: (d0, d1) -> (d1, d0)

is converted to:

  arm_sme.tile_load ... <vertical>

On SME the transpose can be done in-flight as part of the load, rather
than as a separate operation. The generic TransferReadPermutationLowering
would instead produce the following:

  %0 = vector.transfer_read ...
  vector.transpose %0, [1, 0] ...
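
The equivalence between the two forms can be illustrated with a small
Python model. This is only a sketch of the intended semantics, not the
actual lowering: the function names and the list-of-lists tile
representation are hypothetical and do not correspond to any MLIR API.

```python
# Toy model of an n x n tile held as a list of rows.
# All names here are illustrative assumptions, not MLIR or ArmSME APIs.

def horizontal_tile_load(memory, row, col, n):
    """Row i of the source window becomes row i of the tile."""
    return [[memory[row + i][col + j] for j in range(n)] for i in range(n)]


def vertical_tile_load(memory, row, col, n):
    """Row i of the source window is written into column i of the tile."""
    tile = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            tile[j][i] = memory[row + i][col + j]
    return tile


def transpose(tile):
    """Separate transpose step, as in the generic two-op lowering."""
    return [list(column) for column in zip(*tile)]


# A 4x4 source where element (r, c) holds 10*r + c.
memory = [[10 * r + c for c in range(4)] for r in range(4)]

two_step = transpose(horizontal_tile_load(memory, 0, 0, 4))
one_step = vertical_tile_load(memory, 0, 0, 4)
assert one_step == two_step
```

In this model the vertical load produces the transposed tile directly,
which is the step the separate vector.transpose would otherwise perform.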

The lowering doesn't support masking yet, and the transfer_read must be
in-bounds. It also intentionally doesn't handle simple loads (those with
no permutation), unlike the existing transfer_write lowering, because the
generic TransferReadToVectorLoadLowering can already lower these to
simple vector.load ops, which in turn can be lowered to ArmSME.

A subsequent patch will update the existing transfer_write lowering.
This is a separate patch because there is currently no lowering for
vector.transfer_read.

parent 8e353fb6
