AMDGPU: Run SIInsertWaits as pre-emit pass

Running this pass after the scheduler lets the waits be placed later, so other ALU instructions can run in the time that would otherwise be spent waiting. When combined with enabling the post-RA scheduler, this gives roughly a 20% improvement on sgemm.

llvm-svn: 241473