Skip to content
Commit de0df639 authored by Yaxun (Sam) Liu's avatar Yaxun (Sam) Liu
Browse files

[CUDA][HIP] Fix overloading resolution in global variable initializer

Currently, clang does not resolve certain overloaded functions correctly in the initializer
of global variables, e.g.

template<typename T1, typename U>
T1 mypow(T1, U);

__attribute__((device)) double mypow(double, int);

double t_extent = mypow(1.0, 2);

In the above example, mypow is supposed to resolve to the host version
but clang resolves it to the device version instead, and emits an error
(https://godbolt.org/z/17xxzaa67).

However, if the variable is assigned in a host function, there is no error.
The discrepancy in overloading resolution inside and outside of
a function is due to clang not accounting for the host/device target
when resolving functions called in the initializer of a global variable.

This patch introduces a global host/device target context for CUDA/HIP
for functions called outside of functions. For global variable initialization,
it is determined by the host/device attribute of the variable. For other
situations, a default value of host_device is sufficient.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D158247

Fixes: SWDEV-416731
parent 1c5fd153
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment