Commits · 582a5237f95a3852cead5208f28a84b4cab0efb2 · Roger Ferrer / llvm-epi

Feb 15, 2017

[AMDGPU] Revert failed scheduling · 582a5237

Stanislav Mekhanoshin authored Feb 15, 2017

This patch reverts a region's scheduling to its original, untouched state
if the new schedule has decreased occupancy.

In addition, it switches to using the TargetRegisterInfo occupancy callback
for pressure limits instead of gradually increasing limits that have
just been exceeded. We are going to stay with the best schedule, so we
no longer need to tolerate worsened scheduling.
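
The revert-on-regression idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: `Schedule`, `scheduleOrRevert`, and both callbacks are hypothetical names, and a region is modelled as a plain list of instruction ids.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical model: a region is an ordered list of instruction ids.
using Schedule = std::vector<int>;

// Reschedules Region in place, then restores the original order if the
// new order yields lower occupancy. Returns the occupancy of the order
// that is left in Region.
unsigned scheduleOrRevert(
    Schedule &Region,
    const std::function<void(Schedule &)> &Reschedule,
    const std::function<unsigned(const Schedule &)> &Occupancy) {
  assert(!Region.empty() && "expected a non-empty region");
  const Schedule Original = Region;        // remember the untouched state
  const unsigned Before = Occupancy(Region);
  Reschedule(Region);
  const unsigned After = Occupancy(Region);
  if (After < Before) {                    // scheduling made things worse
    Region = Original;                     // revert to the original order
    return Before;
  }
  return After;
}
```

Because the original order is always available as a fallback, the caller never ends up with a schedule that is worse than the input in terms of occupancy.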

Differential Revision: https://reviews.llvm.org/D29971

llvm-svn: 295206


Feb 08, 2017

[AMDGPU] Move register related queries to subtarget class · e03b1d7b

Konstantin Zhuravlyov authored Feb 08, 2017

Differential Revision: https://reviews.llvm.org/D29318

llvm-svn: 294440

Feb 07, 2017

[AMDGPU] Fix GCNSchedStrategy.cpp debug output · 99be1aff

Stanislav Mekhanoshin authored Feb 06, 2017

There is a typo in the debug output: top and bottom candidates are switched.

Differential Revision: https://reviews.llvm.org/D29608

llvm-svn: 294257


Feb 01, 2017

[AMDGPU] Account workgroup size in LDS occupancy limits · 2b913b1f

Stanislav Mekhanoshin authored Feb 01, 2017

Functions that map LDS use to occupancy return results for a workgroup
of 64 workitems. The numbers have to be adjusted for bigger workgroups.
For example, a workgroup of size 256 already occupies 4 waves just by
itself. Given that all numbers of LDS use in the compiler are per
workgroup, occupancy shall be multiplied by 4 in this case. Each group
of 64 workitems is still limited by the same number, but 4 subgroups of
64 workitems each can afford 4 times more LDS to get the same occupancy.

In addition, the change initializes the LDS size in the subtarget to a
real value for SI+ targets. This is required since LDS size is a variable
in these calculations.
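
The workgroup-size adjustment described above can be illustrated with a small arithmetic sketch. It is not the subtarget code from the patch: `occupancyWithLDS`, the cap of 10 waves, and the 64 KiB LDS size used in the usage note are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Assumed values for illustration only; real targets may differ.
constexpr unsigned WavefrontSize = 64;
constexpr unsigned AssumedMaxWaves = 10;

// Hypothetical helper: occupancy (in waves) allowed by LDS use.
// LDSBytesUsed is per workgroup, as in the commit message.
unsigned occupancyWithLDS(unsigned LDSSize, unsigned LDSBytesUsed,
                          unsigned WorkGroupSize) {
  assert(WorkGroupSize > 0 && "workgroup must not be empty");
  if (LDSBytesUsed == 0)
    return AssumedMaxWaves; // LDS does not limit occupancy
  // Number of workgroups whose LDS allocation fits at the same time.
  const unsigned GroupsThatFit = LDSSize / LDSBytesUsed;
  // A workgroup of 256 workitems already occupies 4 waves by itself.
  const unsigned WavesPerGroup =
      (WorkGroupSize + WavefrontSize - 1) / WavefrontSize;
  return std::min(AssumedMaxWaves, GroupsThatFit * WavesPerGroup);
}
```

With an assumed 65536-byte LDS and 32768 bytes used per workgroup, a 64-workitem workgroup gives 2 waves, while a 256-workitem workgroup gives 2 * 4 = 8 waves for the same LDS use.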

Differential Revision: https://reviews.llvm.org/D29423

llvm-svn: 293837


Jan 26, 2017

[AMDGPU] Fix typo in GCNSchedStrategy · 75d1de90

Valery Pykhtin authored Jan 26, 2017

Differential revision: https://reviews.llvm.org/D28980

llvm-svn: 293171

Dec 09, 2016

AMDGPU/SI: Allow using SGPRs 96-101 on VI · 91f22fbf

Marek Olsak authored Dec 09, 2016

Summary:
There is no point in setting SGPRS=104, because VI allocates SGPRs
in multiples of 16, so 104 -> 112. That enables us to use all 102 SGPRs
for general purposes.
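
The rounding mentioned above (104 -> 112) is a round-up to the allocation granule. A minimal sketch, assuming a hypothetical helper name `roundUpToGranule` that is not taken from the patch:

```cpp
#include <cassert>

// Hypothetical helper: rounds a register count up to the next multiple
// of the allocation granule (16 SGPRs on VI, per the commit message).
unsigned roundUpToGranule(unsigned NumRegs, unsigned Granule) {
  assert(Granule != 0 && "granule must be non-zero");
  return (NumRegs + Granule - 1) / Granule * Granule;
}
```

Under this rounding, requests for 102 and for 104 SGPRs both land on 112, which is why asking for 104 gains nothing over 102.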

Reviewers: tstellarAMD

Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D27149

llvm-svn: 289260


Nov 01, 2016

AMDGPU: Whitespace fixes · f3dd8630

Matt Arsenault authored Nov 01, 2016

llvm-svn: 285659

Sep 06, 2016

[AMDGPU] Wave and register controls · 1d65026c

Konstantin Zhuravlyov authored Sep 06, 2016

- Implemented amdgpu-flat-work-group-size attribute
- Implemented amdgpu-num-active-waves-per-eu attribute
- Implemented amdgpu-num-sgpr attribute
- Implemented amdgpu-num-vgpr attribute
- Dynamic LDS constraints are in a separate patch

Patch by Tom Stellard and Konstantin Zhuravlyov

Differential Revision: https://reviews.llvm.org/D21562

llvm-svn: 280747


Aug 29, 2016

AMDGPU/SI: Implement a custom MachineSchedStrategy · 0d23ebe8

Tom Stellard authored Aug 29, 2016

Summary:
GCNSchedStrategy re-uses most of GenericScheduler; it just uses
a different method to compute the excess and critical register
pressure limits.

It's not enabled by default; to enable it, you need to pass -misched=gcn
to llc.

Shader DB stats:

32464 shaders in 17874 tests
Totals:
SGPRS: 1542846 -> 1643125 (6.50 %)
VGPRS: 1005595 -> 904653 (-10.04 %)
Spilled SGPRs: 29929 -> 27745 (-7.30 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 36688188 -> 37034900 (0.95 %) bytes
LDS: 1913 -> 1913 (0.00 %) blocks
Max Waves: 254101 -> 265125 (4.34 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 1338220 -> 1438499 (7.49 %)
VGPRS: 886221 -> 785279 (-11.39 %)
Spilled SGPRs: 29869 -> 27685 (-7.31 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 34315716 -> 34662428 (1.01 %) bytes
LDS: 1551 -> 1551 (0.00 %) blocks
Max Waves: 188127 -> 199151 (5.86 %)
Wait states: 0 -> 0 (0.00 %)

Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: https://reviews.llvm.org/D23688

llvm-svn: 279995
