Skip to content
Unverified Commit 95ef8e38 authored by Cullen Rhodes's avatar Cullen Rhodes Committed by GitHub
Browse files

[mlir][ArmSME] Support 2-way widening outer products (#78975)

This patch introduces support for 2-way widening outer products. This
enables the fusion of 2 'arm_sme.outerproduct' operations that are
chained via the accumulator into a 2-way widening outer product
operation.

Changes:

- Add 'llvm.aarch64.sme.[us]mop[as].za32' intrinsics for 2-way variants.
  These map to instruction variants added in SME2 and use different
  intrinsics. Intrinsics are already implemented for widening variants
  from SME1.
- Adds the following operations:
  - fmopa_2way, fmops_2way
  - smopa_2way, smops_2way
  - umopa_2way, umops_2way
- Implements conversions for the above ops to intrinsics in
ArmSMEToLLVM.
- Adds a pass 'arm-sme-outer-product-fusion'  that fuses
  'arm_sme.outerproduct' operations.

For a detailed description of these operations see the
'arm_sme.fmopa_2way' description.

The reason for introducing many operations rather than one is the
signed/unsigned variants can't be distinguished with types (e.g., ui16,
si16) since 'arith.extui' and 'arith.extsi' only support signless
integers. A single operation would require this information and an
attribute (for example) for the sign doesn't feel right if
floating-point types are also supported where this wouldn't apply.
Furthermore, the SME FP8 extensions (FEAT_SME_F8F16, FEAT_SME_F8F32)
introduce FMOPA 2-way (FP8 to FP16) and 4-way (FP8 to FP32) variants but
no subtract variant. Whilst these are not supported in this patch, it
felt simpler to have separate ops for add/subtract given this.
parent 488f88b8
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment