Commit 4d977174 authored by Slava Zakharin

[flang] Improved performance of runtime Matmul/MatmulTranspose.

This patch mostly affects the performance of code produced by
HLFIR lowering. If a MATMUL argument is an array slice, then
HLFIR lowering passes the slice to the runtime, whereas
FIR lowering would create a contiguous temporary for the slice.
With this patch, performance might be better than with the generic
implementation in cases where the leading dimension is contiguous.
This patch improves CPU2000/178.galgel, making the HLFIR version
faster than the FIR version (due to avoiding the temporary copies
for MATMUL arguments).
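
To illustrate the "contiguous leading dimension" case, here is a minimal
sketch, not the code from this patch. It assumes column-major storage and a
hypothetical helper named matmulStrided. Each column of an operand is
contiguous, but consecutive columns may be separated by a stride (lda, ldb,
ldc) larger than the column length, as happens for a slice such as
A(1:m, 1:k) of a larger array. The product is computed directly on the slice,
with no contiguous temporary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; it is not part of the flang
// runtime. Computes C(m x n) = A(m x k) * B(k x n) for column-major operands.
// Element (i, j) of A is a[i + j * lda]; likewise for B (ldb) and C (ldc).
// The leading dimension is contiguous: elements within a column are adjacent.
void matmulStrided(const double *a, std::size_t lda,
                   const double *b, std::size_t ldb,
                   double *c, std::size_t ldc,
                   std::size_t m, std::size_t n, std::size_t k) {
  // A column stride smaller than the column length would overlap columns.
  assert(lda >= m && ldb >= k && ldc >= m);
  for (std::size_t j = 0; j < n; ++j) {
    // Zero the j-th column of the result.
    for (std::size_t i = 0; i < m; ++i) {
      c[i + j * ldc] = 0.0;
    }
    for (std::size_t l = 0; l < k; ++l) {
      const double bv = b[l + j * ldb];
      // Innermost loop walks a contiguous column of A and of C.
      for (std::size_t i = 0; i < m; ++i) {
        c[i + j * ldc] += a[i + l * lda] * bv;
      }
    }
  }
}
```

For example, passing a pointer to a 3x3 parent array with lda = 3 and
m = k = 2 multiplies its top-left 2x2 slice in place, without first copying
the slice into a contiguous buffer.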

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D159134
parent 8f48392b