Unverified Commit c16adb0d authored Sep 07, 2023 by Fabian Mora Committed by GitHub Sep 07, 2023

[mlir][Target][NVPTX] Add fatbin support to NVPTX compilation. (#65398)

Currently, the NVPTX tool compilation path only calls `ptxas`, so the GPU
running the binary must exactly match the target arch; otherwise, the
runtime throws an error due to the arch mismatch.

This patch adds a call to `fatbinary`, creating a fat binary with the
cubin object and the PTX code, allowing the driver to JIT the PTX at
runtime if there's an arch mismatch.
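
For illustration only, the two-step flow described above can be sketched as
a pair of command lines built in Python. This is not the code added by the
patch: the helper name `build_commands`, the file names, and the default arch
are hypothetical, and the `fatbinary` flags follow the `--image=profile=...`
form that clang's CUDA toolchain passes, which should be checked against the
installed CUDA toolkit before use.

```python
import os
import shlex


def build_commands(ptx_path, arch="sm_70"):
    """Return argv lists for ptxas and fatbinary (hypothetical helper).

    Nothing is executed here; the function only assembles the commands.
    """
    if not arch.startswith("sm_"):
        raise ValueError("expected an arch like 'sm_70', got %r" % arch)

    stem, _ = os.path.splitext(ptx_path)
    cubin_path = stem + ".cubin"
    fatbin_path = stem + ".fatbin"
    # The PTX image is tagged with the matching virtual arch (compute_XX).
    virtual_arch = "compute_" + arch[len("sm_"):]

    # Step 1: assemble the PTX into a cubin for one specific GPU arch.
    ptxas_cmd = ["ptxas", "-arch=" + arch, ptx_path, "-o", cubin_path]

    # Step 2: bundle the cubin and the PTX into a single fat binary, so the
    # driver can fall back to JIT-compiling the PTX on an arch mismatch.
    fatbinary_cmd = [
        "fatbinary", "-64", "--create", fatbin_path,
        "--image=profile=%s,file=%s" % (arch, cubin_path),
        "--image=profile=%s,file=%s" % (virtual_arch, ptx_path),
    ]
    return [ptxas_cmd, fatbinary_cmd]


if __name__ == "__main__":
    for cmd in build_commands("kernel.ptx", "sm_70"):
        print(shlex.join(cmd))
```

Keeping the PTX next to the cubin is what makes the fallback possible: the
cubin serves the exact arch, and the PTX serves every other case through the
driver's JIT.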

parent 43c20367
