[CUDA][HIP] Fix host/device based overload resolution
Currently clang fails to compile the following CUDA program in device compilation:

    __host__ int foo(int x) {
      return 1;
    }

    template<class T>
    __device__ __host__ int foo(T x) {
      return 2;
    }

    __device__ __host__ int bar() {
      return foo(1);
    }

    __global__ void test(int *a) {
      *a = bar();
    }

This happens because foo is resolved to the __host__ foo instead of the __device__ __host__ foo. This appears to be a bug: the __device__ __host__ foo is a viable callee for the call in bar, yet clang does not choose it.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D77954
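For illustration only, here is a host-only sketch of the standard C++ tie-breaker that favors the non-template overload when the host/device attributes play no role. The macro fallback is an assumption added so the snippet builds with a plain C++ compiler; it is not part of the patch. Whether this tie-breaker is the exact step that picked the __host__ foo inside clang is likewise an assumption.

```cpp
// Host-only analogue: outside CUDA/HIP compilation the attributes become
// no-ops (an assumption made so this sketch builds as ordinary C++).
#if !defined(__CUDACC__) && !defined(__HIP__)
#define __host__
#define __device__
#endif

// Non-template overload; host-only in the commit's example.
__host__ int foo(int x) { return 1; }

// Template overload; callable from both host and device.
template <class T>
__device__ __host__ int foo(T x) { return 2; }

// Both overloads are exact matches for foo(1). With equally good
// conversions, standard C++ prefers the non-template function.
__device__ __host__ int bar() { return foo(1); }
```

On the host, bar() returns 1 under these rules. The commit concerns device compilation, where the __host__ overload is not a suitable callee for a __device__ __host__ caller.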