[CUDA] Allow targeting NVPTX directly without a host toolchain
Currently, the NVPTX compilation toolchain can only be invoked either through CUDA or OpenMP via `--offload-device-only`. This is because we cannot build a CUDA toolchain without an accompanying host toolchain for the offloading. When using `--target=nvptx64-nvidia-cuda` this results in generating calls to the GNU assembler and linker, leading to errors. This patch abstracts the portions of the CUDA toolchain that are independent of the host toolchain or offloading kind into a new base class called `NVPTXToolChain`. We still need to read the host's triple to build the CUDA installation, so if not present we just assume it will match the host's system for now, or the user can provide the path explicitly. This should allow the compiler driver to create NVPTX device images directly from C/C++ code. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D140158
Loading
Please sign in to comment