[AMDGPU] Add pseudo wavemode to optimize strict_wqm
Strict WQM does not require a WQM transistion if it occurs within an existing WQM section. This occurs heavily in GFX11 pixel shaders with LDS_PARAM_LOAD. Which leads to unnecessary EXEC mask manipulation. To avoid these transitions, detect WQM -> Strict WQM -> WQM and substitute new ENTER_PSEUDO_WM/EXIT_PSEUDO_WM markers instead. These are treat similarly by WWM register pre-allocation pass, but do not manipulate EXEC or use registers to save EXEC state. Reviewed By: piotr Differential Revision: https://reviews.llvm.org/D136813
Loading
Please sign in to comment