Unverified Commit 0d87e257 authored by Spenser Bauman, committed by GitHub

[mlir][tosa] Improve lowering to tosa.fully_connected (#73049)

The current lowering of tosa.fully_connected produces a linalg.matmul
followed by a linalg.generic to add the bias. The IR looks like the
following:

    %init = tensor.empty()
    %zero = linalg.fill ins(0 : f32) outs(%init)
    %prod = linalg.matmul ins(%A, %B) outs(%zero)

    // Add the bias
    %initB = tensor.empty()
    %result = linalg.generic ins(%prod, %bias) outs(%initB) {
       // add bias and product
    }

This has two downsides:

1. The tensor.empty operations typically result in additional
allocations after bufferization.
2. Adding the bias to the matrix product requires a second, redundant
traversal of the data.

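This extra work can be avoided by leveraging the outs operand of
linalg.matmul, which the op accumulates into: if that operand is
initialized with the broadcast bias instead of zeros, the matmul
produces the biased result directly. The new IR sequence is: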

    %init = tensor.empty()
    %broadcast = linalg.broadcast ins(%bias) outs(%init)
    %prod = linalg.matmul ins(%A, %B) outs(%broadcast)

In my experiments, this eliminates one loop and one allocation (post
bufferization) from the generated code.
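
The equivalence of the two sequences can be checked with a small
sketch in plain Python. This is only an illustration of the
accumulate-into-output semantics, not the MLIR lowering itself: the
helper names (matmul_into, old_lowering, new_lowering) are
hypothetical, the matrices are nested lists, and B is assumed to
already be in [inner, cols] layout.

```python
def matmul_into(a, b, out):
    """Mimic the accumulate semantics of linalg.matmul: result = out + a @ b."""
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [row[:] for row in out]  # copy so the caller's init is untouched
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                result[i][j] += a[i][k] * b[k][j]
    return result


def old_lowering(a, b, bias):
    """Zero-filled init, matmul, then a second pass that adds the bias."""
    rows, cols = len(a), len(b[0])
    zero = [[0.0] * cols for _ in range(rows)]
    prod = matmul_into(a, b, zero)
    # Second traversal over the whole result, writing into a new buffer.
    return [[prod[i][j] + bias[j] for j in range(cols)] for i in range(rows)]


def new_lowering(a, b, bias):
    """Broadcast the bias into the init, then let the matmul accumulate onto it."""
    broadcast = [list(bias) for _ in range(len(a))]
    return matmul_into(a, b, broadcast)


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
BIAS = [10.0, 20.0]

print(old_lowering(A, B, BIAS))  # [[29.0, 42.0], [53.0, 70.0]]
print(new_lowering(A, B, BIAS))  # [[29.0, 42.0], [53.0, 70.0]]
```

The two variants add the bias at different points in the summation, so
for general floating-point inputs the results may differ in the last
bits; the small values above are exactly representable and match
exactly.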
parent faebb1b2