[mlir][Target][NVPTX] Add fatbin support to NVPTX compilation. (#65398)
Currently, the NVPTX tool compilation path only calls `ptxas`, so the GPU running the binary must exactly match the target arch; otherwise, the runtime throws an error due to the arch mismatch. This patch adds a call to `fatbinary`, which creates a fat binary containing both the cubin object and the PTX code, allowing the driver to JIT-compile the PTX at runtime if there is an arch mismatch.
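
As a rough illustration of the two-step flow described above, the following Python sketch builds the two command lines (it does not run them). The helper name `build_fatbin_commands`, the file names, and the exact flag spelling are assumptions made for illustration, not taken from the patch; the `--image3=kind=...,sm=...,file=...` form is one way `fatbinary` accepts images, and the patch may pass different options.

```python
from typing import List, Tuple


def build_fatbin_commands(
    chip: str, ptx_file: str, cubin_file: str, fatbin_file: str
) -> Tuple[List[str], List[str]]:
    """Return (ptxas_cmd, fatbinary_cmd) as argument lists.

    Hypothetical helper: it only assembles the commands, so it can be
    inspected without a CUDA toolkit installed.
    """
    if not chip.startswith("sm_"):
        raise ValueError(f"expected a chip like 'sm_80', got {chip!r}")
    sm = chip[len("sm_"):]  # "sm_80" -> "80"

    # Step 1: assemble the PTX into a cubin for one specific arch.
    ptxas_cmd = ["ptxas", "-arch", chip, ptx_file, "-o", cubin_file]

    # Step 2: bundle the cubin (exact-arch match) together with the PTX
    # (JIT fallback for other archs) into a single fat binary.
    fatbinary_cmd = [
        "fatbinary",
        "-64",
        "--create",
        fatbin_file,
        f"--image3=kind=elf,sm={sm},file={cubin_file}",
        f"--image3=kind=ptx,sm={sm},file={ptx_file}",
    ]
    return ptxas_cmd, fatbinary_cmd


if __name__ == "__main__":
    for cmd in build_fatbin_commands(
        "sm_80", "kernel.ptx", "kernel.cubin", "kernel.fatbin"
    ):
        print(" ".join(cmd))
```

With both images embedded, the driver can load the cubin directly on a matching GPU and fall back to JIT-compiling the PTX image otherwise.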