[AMDGPU] Reintroduce CC exception for non-inlined functions in Promote Alloca limits
This is basically a partial revert of https://reviews.llvm.org/D145586 ( fd1d6087 ) D145586 was originally introduced to help with SWDEV-363662, and it did, but it also caused a 25% drop in performance in some MIOpen benchmarks where, it seems, functions are inlined more conservatively. This patch restores the pre-D145586 behavior for PromoteAlloca: functions with a non-entry CC have a 32 VGPRs threshold, but only if the function is not marked with "alwaysinline". A good number of AMDGPU code makes uses of the AMDGPUAlwaysInline pass anyway, so in our backend "alwaysinline" seems very common. This change does not affect SWDEV-363662 (the motivating issue for introducing D145586). Fixes SWDEV-399519 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D150551
Loading
Please sign in to comment