Commit 86ad0af8 authored by Eugene Zhulenev

[mlir:Async] Implement recursive async work splitting for scf.parallel operation (async-parallel-for pass)

Depends On D104780

Recursive work splitting, rather than sequential submission of async tasks, gives a ~20%-30% speedup in microbenchmarks.
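To illustrate the idea of recursive work splitting, here is a minimal Python sketch. It is not the code the pass generates: the names `dispatch_recursive`, `run_all`, `run_block`, and `submit` are hypothetical, and a FIFO queue stands in for the async runtime. Each task repeatedly hands the upper half of its block range to a new task and keeps the lower half, so task submission is spread across tasks instead of being issued one by one from a single caller.

```python
from collections import deque


def dispatch_recursive(block_start, block_end, run_block, submit):
    """Halve [block_start, block_end) until one block is left, then run it."""
    while block_end - block_start > 1:
        mid = block_start + (block_end - block_start) // 2
        # Hand the upper half to another task; keep the lower half here.
        submit(lambda s=mid, e=block_end: dispatch_recursive(s, e, run_block, submit))
        block_end = mid
    run_block(block_start)


def run_all(num_blocks):
    """Drive the sketch with a FIFO queue standing in for an async runtime."""
    executed = []
    if num_blocks <= 0:
        return executed
    queue = deque()
    dispatch_recursive(0, num_blocks, executed.append, queue.append)
    while queue:
        queue.popleft()()  # run the next pending task
    return executed


if __name__ == "__main__":
    # Every block index is executed exactly once, in a split-dependent order.
    print(run_all(8))
```

With sequential submission the launching code issues one submission per block. In this sketch no single task submits more than about log2(num_blocks) others.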

Algorithm outline:
1. Collapse the scf.parallel dimensions into a single dimension
2. Compute the block size for the parallel operations from the 1-D problem size
3. Launch parallel tasks
4. Each parallel task reconstructs its own bounds in the original multi-dimensional iteration space
5. Each parallel task computes the original parallel operation body using an scf.for loop nest
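The outline above can be sketched in plain Python. This is an illustration under stated assumptions, not the pass's implementation: all function names are hypothetical, steps are assumed positive, the block-size policy (`target_block_size` clamped to the problem size) is a placeholder, and the blocks run sequentially where the pass would launch async tasks. For brevity the sketch delinearizes every flat index instead of reconstructing per-block bounds and emitting a loop nest.

```python
import math


def ceil_div(a, b):
    """Ceiling division for b > 0."""
    return -(-a // b)


def trip_counts(lower, upper, step):
    """Number of iterations along each dimension (steps assumed positive)."""
    return [max(0, ceil_div(u - l, s)) for l, u, s in zip(lower, upper, step)]


def delinearize(index, counts):
    """Convert a flat index into per-dimension coordinates (row-major)."""
    coords = []
    for count in reversed(counts):
        coords.append(index % count)
        index //= count
    return coords[::-1]


def parallel_blocks(lower, upper, step, target_block_size, body):
    counts = trip_counts(lower, upper, step)
    total = math.prod(counts)                 # step 1: collapsed 1-D size
    if total == 0:
        return 0
    # Step 2: block size from the 1-D problem size (assumed policy).
    block_size = max(1, min(target_block_size, total))
    num_blocks = ceil_div(total, block_size)
    for block in range(num_blocks):           # step 3: one "task" per block
        first = block * block_size
        last = min(first + block_size, total)
        for flat in range(first, last):       # steps 4-5: map back and run body
            coords = delinearize(flat, counts)
            ivs = [l + c * s for l, c, s in zip(lower, coords, step)]
            body(*ivs)
    return num_blocks


if __name__ == "__main__":
    visited = []
    blocks = parallel_blocks([0, 0], [3, 4], [1, 2], 4,
                             lambda i, j: visited.append((i, j)))
    print(blocks, visited)
```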

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D104850
parent d43b2360