Skip to content
Commit fb2c42df authored by Johannes Doerfert's avatar Johannes Doerfert
Browse files

[OpenMP] Improve AMDGPU Plugin

With this patch we:
- pick more sensible defaults for the number of teams, inspired by the
  old plugin, and configured via LIBOMPTARGET_AMDGPU_TEAMS_PER_CU.
- check the input signal of a kernel launch late, after the queue lock
  was taken, to avoid a barrier packet more often.
- copy the kernel arguments in one swoop into the appropriate memory.
- manually specialize the callbacks to avoid potential indirect calls.
parent ab17a08d
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment