  1. Feb 08, 2017
  2. Feb 07, 2017
  3. Feb 01, 2017
    • [AMDGPU] Account workgroup size in LDS occupancy limits · 2b913b1f
      Stanislav Mekhanoshin authored
      Functions matching LDS use to occupancy return results for a workgroup
      of 64 workitems. The numbers have to be adjusted for bigger workgroups.
      For example, a workgroup of size 256 already occupies 4 waves just by
      itself. Given that all numbers of LDS use in the compiler are per
      workgroup, occupancy shall be multiplied by 4 in this case. Each group
      of 64 workitems is still limited by the same number, but 4 subgroups of
      64 workitems each can afford 4 times more LDS to get the same occupancy.
      
      In addition, this change initializes the LDS size in the subtarget to a
      real value for SI+ targets. This is required since the LDS size is a
      variable in these calculations.
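
      The scaling described above can be sketched as follows. This is only an
      illustrative model, not the code from the patch: the function names, the
      cap of 10 waves, and the simple "LDS size divided by LDS use" mapping are
      assumptions made for the example. The LDS size is passed in as a
      parameter because, as noted above, it is a variable in the calculation.

```cpp
#include <algorithm>
#include <cassert>

// Illustrative constants, not taken from the LLVM sources.
constexpr unsigned kWaveSize = 64; // workitems per wave, as in the commit message
constexpr unsigned kMaxWaves = 10; // ASSUMPTION: hypothetical occupancy cap

// Number of waves a single workgroup occupies (256 workitems -> 4 waves).
unsigned wavesPerWorkGroup(unsigned workGroupSize) {
  assert(workGroupSize > 0 && "workgroup must not be empty");
  return (workGroupSize + kWaveSize - 1) / kWaveSize;
}

// Hypothetical occupancy estimate. LDS use is counted per workgroup, so the
// number of workgroups that fit is multiplied by the waves per workgroup.
unsigned occupancyWithLDS(unsigned ldsBytesPerGroup, unsigned ldsSizeBytes,
                          unsigned workGroupSize) {
  if (ldsBytesPerGroup == 0)
    return kMaxWaves; // LDS does not limit occupancy at all
  unsigned groupsThatFit = ldsSizeBytes / ldsBytesPerGroup;
  return std::min(groupsThatFit * wavesPerWorkGroup(workGroupSize), kMaxWaves);
}
```

      With a sample LDS size of 65536 bytes and 32768 bytes used per workgroup,
      two workgroups fit. In this model that gives 2 waves for a 64-workitem
      workgroup and 8 waves for a 256-workitem workgroup, which is the factor
      of 4 from the message.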
      
      Differential Revision: https://reviews.llvm.org/D29423
      
      llvm-svn: 293837
  4. Jan 26, 2017
  5. Dec 09, 2016
    • AMDGPU/SI: Allow using SGPRs 96-101 on VI · 91f22fbf
      Marek Olsak authored
      Summary:
      There is no point in setting SGPRS=104, because VI allocates SGPRs
      in multiples of 16, so 104 -> 112. That enables us to use all 102 SGPRs
      for general purposes.
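
      The rounding argument above can be checked with a short sketch. The
      helper name is hypothetical; only the granule of 16 and the values 102,
      104, and 112 come from the message.

```cpp
// Allocation granule stated in the commit message: VI allocates SGPRs in
// multiples of 16.
constexpr unsigned kSGPRGranule = 16;

// Hypothetical helper: rounds a requested SGPR count up to the next multiple
// of the allocation granule.
unsigned allocatedSGPRs(unsigned requested) {
  return (requested + kSGPRGranule - 1) / kSGPRGranule * kSGPRGranule;
}
```

      Both allocatedSGPRs(104) and allocatedSGPRs(102) give 112, so requesting
      104 reserves no less than requesting 102.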
      
      Reviewers: tstellarAMD
      
      Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
      
      Differential Revision: https://reviews.llvm.org/D27149
      
      llvm-svn: 289260
  6. Nov 01, 2016
  7. Sep 06, 2016
    • [AMDGPU] Wave and register controls · 1d65026c
      Konstantin Zhuravlyov authored
      - Implemented amdgpu-flat-work-group-size attribute
      - Implemented amdgpu-num-active-waves-per-eu attribute
      - Implemented amdgpu-num-sgpr attribute
      - Implemented amdgpu-num-vgpr attribute
      - Dynamic LDS constraints are in a separate patch
      
      Patch by Tom Stellard and Konstantin Zhuravlyov
      
      Differential Revision: https://reviews.llvm.org/D21562
      
      llvm-svn: 280747
  8. Aug 29, 2016
    • AMDGPU/SI: Implement a custom MachineSchedStrategy · 0d23ebe8
      Tom Stellard authored
      Summary:
      GCNSchedStrategy re-uses most of GenericScheduler; it just uses
      a different method to compute the excess and critical register
      pressure limits.
      
      It's not enabled by default. To enable it, pass -misched=gcn
      to llc.
      
      Shader DB stats:
      
      32464 shaders in 17874 tests
      Totals:
      SGPRS: 1542846 -> 1643125 (6.50 %)
      VGPRS: 1005595 -> 904653 (-10.04 %)
      Spilled SGPRs: 29929 -> 27745 (-7.30 %)
      Spilled VGPRs: 334 -> 352 (5.39 %)
      Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
      Code Size: 36688188 -> 37034900 (0.95 %) bytes
      LDS: 1913 -> 1913 (0.00 %) blocks
      Max Waves: 254101 -> 265125 (4.34 %)
      Wait states: 0 -> 0 (0.00 %)
      
      Totals from affected shaders:
      SGPRS: 1338220 -> 1438499 (7.49 %)
      VGPRS: 886221 -> 785279 (-11.39 %)
      Spilled SGPRs: 29869 -> 27685 (-7.31 %)
      Spilled VGPRs: 334 -> 352 (5.39 %)
      Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
      Code Size: 34315716 -> 34662428 (1.01 %) bytes
      LDS: 1551 -> 1551 (0.00 %) blocks
      Max Waves: 188127 -> 199151 (5.86 %)
      Wait states: 0 -> 0 (0.00 %)
      
      Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick
      
      Subscribers: arsenm, kzhuravl, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D23688
      
      llvm-svn: 279995