Commit 763109e3, authored by Guray Ozen, committed by GitHub

[mlir][gpu] Use `known_block_size` to set `maxntid` for NVVM target (#77301)

Setting the thread block size with `maxntid` on the kernel can improve
performance considerably, because it lets the downstream PTX compiler do
better register allocation.

MLIR's `gpu.launch` and `gpu.launch_func` already have an attribute
(`known_block_size`) that records the thread block size when it is known.
This PR simply uses this attribute to set `maxntid`.
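
As a rough illustration of the mapping described above, the sketch below
shows a kernel carrying a known block size and, in comments, the attribute
one would expect after lowering to NVVM. The kernel name `@example_kernel`
and the block size `128, 1, 1` are hypothetical, and the exact attribute
spellings (`gpu.known_block_size`, `nvvm.maxntid`) are assumptions that may
differ between MLIR versions.

```mlir
// Hypothetical input: a GPU kernel annotated with a known block size.
// The attribute spelling here is an assumption; check your MLIR version.
gpu.module @example_module {
  gpu.func @example_kernel() kernel
      attributes {gpu.known_block_size = array<i32: 128, 1, 1>} {
    gpu.return
  }
}

// Expected shape of the result after lowering to NVVM (assumed, not
// verbatim compiler output): the llvm.func for the kernel carries
//   nvvm.maxntid = array<i32: 128, 1, 1>
// which the PTX backend can emit as a `.maxntid 128, 1, 1` directive.
```

With the block size capped this way, the PTX compiler knows the maximum
number of threads per block, which is what allows it to budget registers
per thread more precisely.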
parent 2edce427