Commit 763109e3, authored Jan 08, 2024 by Guray Ozen, committed by GitHub Jan 08, 2024

[mlir][gpu] Use `known_block_size` to set `maxntid` for NVVM target (#77301)

Setting the thread block size with `maxntid` on the kernel can bring
significant performance benefits: once it knows the maximum number of
threads per block, the downstream PTX compiler can do better register
allocation.

MLIR's `gpu.launch` and `gpu.launch_func` already have an attribute
(`known_block_size`) that records the thread block size when it is known.
This PR simply uses this attribute to set `maxntid`.
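
A minimal sketch of the intended effect, assuming the attribute spellings
in use around this commit (`gpu.known_block_size` as a discardable attribute
on `gpu.func`, and `nvvm.maxntid` on the lowered function). The module name,
kernel name, and the block size of 128 are illustrative only, and the exact
printed form may differ between MLIR versions:

```mlir
// Input: a kernel whose block size is known to be 128x1x1.
// @example_module and @kernel_with_block_size are hypothetical names.
gpu.module @example_module {
  gpu.func @kernel_with_block_size() kernel
      attributes {gpu.known_block_size = array<i32: 128, 1, 1>} {
    gpu.return
  }
}

// After lowering with -convert-gpu-to-nvvm, the resulting llvm.func is
// expected to also carry the block size as an NVVM attribute, roughly:
//   nvvm.maxntid = array<i32: 128, 1, 1>
```

When the attribute is absent, no `maxntid` is expected on the lowered
kernel, since the block size is then not known at compile time.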

parent 2edce427
