CUDA/HIP: Use kernel name to map to symbol
Currently CGCUDANV uses an llvm::Function as a key to map kernels to a symbol in host code. HIP adds one level of indirection and uses the llvm::Function to map to a global variable that will be initialized to the kernel stub pointer.

Unfortunately, there is no guarantee that the llvm::Function created by GetOrCreateLLVMFunction will be the same object each time it is requested for a given kernel. In fact, the first time we encounter GetOrCreateLLVMFunction for a kernel, the type might not be completed yet, and the type of the llvm::Function will be a generic {}, since the complete type is not required to get a symbol to a function. In this case we end up creating two global variables: one for the llvm::Function with the incomplete type and one for the function with the complete type. The first global variable will be declared but not defined, resulting in a linking error.

This change uses the mangled name of the llvm::Function as the key in the KernelHandles map. This way the same kernel will be associated with the same kernel handle even if the types of its llvm::Function objects are different.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D140663
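
The difference between the two keying schemes can be illustrated with a minimal, self-contained sketch. This is not the CGCUDANV code: FakeFunction, FakeHandle, HandlesByPointer and HandlesByName are hypothetical stand-ins invented for this example, and the mangled name used with them is only illustrative.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for llvm::Function: only the mangled name and a
// textual description of the type matter for this illustration.
struct FakeFunction {
  std::string MangledName;
  std::string Type;
};

// Hypothetical stand-in for the global variable that holds the kernel
// stub pointer (the "kernel handle").
struct FakeHandle {
  int Id;
};

// Old scheme (assumed shape): the map is keyed by object identity, so two
// distinct function objects for the same kernel get two distinct handles.
struct HandlesByPointer {
  std::map<const FakeFunction *, FakeHandle> Handles;
  int NextId = 0;

  FakeHandle &get(const FakeFunction &F) {
    auto It = Handles.find(&F);
    if (It == Handles.end())
      It = Handles.emplace(&F, FakeHandle{NextId++}).first;
    return It->second;
  }
};

// New scheme (assumed shape): the map is keyed by mangled name, so every
// function object with the same name resolves to the same handle.
struct HandlesByName {
  std::map<std::string, FakeHandle> Handles;
  int NextId = 0;

  FakeHandle &get(const FakeFunction &F) {
    auto It = Handles.find(F.MangledName);
    if (It == Handles.end())
      It = Handles.emplace(F.MangledName, FakeHandle{NextId++}).first;
    return It->second;
  }
};
```

With two FakeFunction objects that share a mangled name but differ in Type (for example "{}" and "void ()"), HandlesByPointer ends up with two entries, while HandlesByName returns the same handle for both and keeps a single entry.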