[AMDGPU] Introduce more scratch registers in the ABI.
The AMDGPU target has a convention that defines all VGPRs (except the initial 32 argument registers) as callee-saved. This convention is not always efficient: a callee that requires more registers can end up emitting a large number of spills, even though its caller requires only a few. This patch revises the ABI by introducing more scratch registers that a callee can freely use.

The 256 VGPRs now become: 32 argument registers, 112 scratch registers, and 112 callee-saved registers (CSRs). The scratch registers and the CSRs are interleaved at regular intervals (a split boundary of 8) to obtain better occupancy.

Reviewers: arsenm, t-tye, rampitec, b-sumner, mjbedy, tpr

Reviewed By: arsenm, t-tye

Differential Revision: https://reviews.llvm.org/D76356
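
The split described above can be checked with a small model. This is an illustrative sketch only, not code from the patch: the function name `classify_vgpr`, the constants, and the `first_block_is_scratch` parameter are hypothetical. It assumes the 224 non-argument VGPRs are divided into blocks of 8 that strictly alternate between the two classes; which class the first block after the argument registers belongs to is not stated in this message, so it is left as a parameter.

```python
from collections import Counter

# Hypothetical model of the VGPR split described in the commit message.
NUM_VGPRS = 256      # total VGPRs covered by the convention
NUM_ARG_VGPRS = 32   # initial argument registers
BLOCK = 8            # "split boundary of 8"


def classify_vgpr(index, first_block_is_scratch=True):
    """Return 'argument', 'scratch', or 'callee-saved' for a VGPR index.

    Assumption: blocks of 8 registers alternate strictly after the
    argument registers; the class of the first block is a parameter
    because the commit message does not specify it.
    """
    if not 0 <= index < NUM_VGPRS:
        raise ValueError(f"VGPR index out of range: {index}")
    if index < NUM_ARG_VGPRS:
        return "argument"
    block = (index - NUM_ARG_VGPRS) // BLOCK
    is_even_block = block % 2 == 0
    if is_even_block == first_block_is_scratch:
        return "scratch"
    return "callee-saved"


# Tally the three classes over all 256 registers.
counts = Counter(classify_vgpr(i) for i in range(NUM_VGPRS))
print(dict(counts))
```

Under the alternating-block assumption the tally matches the 32/112/112 figures regardless of which class comes first, since the 224 remaining registers form 28 blocks of 8, 14 per class.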