Skip to content
Unverified Commit e876250b authored by Jan Patrick Lehr's avatar Jan Patrick Lehr Committed by GitHub
Browse files

[OpenMP][libomptarget] Enable parallel copies via multiple SDMA engines (#71801)

This enables the AMDGPU plugin to use a new ROCm 5.7 interface to
dispatch asynchronous data transfers across SDMA engines.

The default functionality stays unchanged, meaning that all data
transfers are enqueued into a H2D queue or an D2H queue, depending on
transfer direction, via the HSA interface used previously.

The new interface can be enabled via the environment variable
`LIBOMPTARGET_AMDGPU_USE_MULTIPLE_SDMA_ENGINES=true` when libomptarget
is built against a recent ROCm version (5.7 and later).
As of now, requests are distributed in a round-robin fashion across
available SDMA engines.
parent a7bbcc46
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment