  1. Apr 28, 2016
  2. Apr 26, 2016
  3. Apr 19, 2016
      AMDGPU/SI: SGPR accounting in getSIProgramInfo must ignore exec_lo/hi · 7483937b
      Nicolai Haehnle authored
      Summary:
      A shader stored the live mask (initial exec mask) in an SGPR which was then
      spilled during register allocation. The allocator quite reasonably
      turned the spill into
      
        v_writelane_b32 %vgpr, exec_lo, N
        v_writelane_b32 %vgpr, exec_hi, N+1
      
      at the beginning of the shader, confusing the SGPR accounting.
      
      No test case, because si-sgpr-spill.ll together with an upcoming patch for
      WQM handling exhibits the problem.
      
      Reviewers: arsenm, tstellarAMD
      
      Subscribers: arsenm, llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D19199
      
      llvm-svn: 266824
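
A minimal sketch of the accounting idea described in this commit, not the actual getSIProgramInfo code: when scanning operands for the highest SGPR in use, exec_lo and exec_hi are skipped so they cannot inflate the count. The function name, the string-based register names, and the operand list are hypothetical and chosen only for illustration.

```python
import re

# Assumption for this sketch: registers are plain strings such as "s4",
# "s[8:11]", "v3", "exec_lo". The real code works on LLVM register
# operands, not strings.
IGNORED_SPECIAL_REGS = {"exec", "exec_lo", "exec_hi"}


def max_sgpr_index(operands):
    """Return the highest numbered SGPR referenced, or -1 if none."""
    highest = -1
    for reg in operands:
        if reg in IGNORED_SPECIAL_REGS:
            # exec_lo/exec_hi are not allocatable SGPRs; ignore them.
            continue
        single = re.fullmatch(r"s(\d+)", reg)
        if single:
            highest = max(highest, int(single.group(1)))
            continue
        pair = re.fullmatch(r"s\[(\d+):(\d+)\]", reg)
        if pair:
            highest = max(highest, int(pair.group(2)))
        # Anything else (VGPRs, other special registers) is not counted here.
    return highest


# Operands as they might appear around the spill shown above.
print(max_sgpr_index(["s4", "exec_lo", "s[8:11]", "exec_hi", "v3"]))  # 11
```

With the two exec halves skipped, only s4 and s[8:11] contribute, so the highest index is 11.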
  4. Apr 15, 2016
  5. Apr 13, 2016
  6. Apr 06, 2016
  7. Apr 05, 2016
  8. Mar 30, 2016
  9. Mar 01, 2016
  10. Feb 12, 2016
      AMDGPU: Set element_size in private resource descriptor · 24ee0785
      Matt Arsenault authored
      Introduce a subtarget feature for this, and leave the default with
      the current behavior which assumes up to 16-byte loads/stores can
      be used. The field also seems to have the ability to be set to 2 bytes,
      but I'm not sure what that would be used for.
      
      llvm-svn: 260651
  11. Jan 28, 2016
  12. Jan 14, 2016
  13. Jan 13, 2016
      AMDGPU/SI: Add new target attribute InitialPSInputAddr · fccabaf5
      Marek Olsak authored
      Summary:
      This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM.
      The register assigns VGPR locations to PS inputs, while the ENA register
      determines whether or not they are loaded.
      
      Mesa needs to set some inputs as not-movable, so that a pixel shader prolog
      binary appended at the beginning can assume where some inputs are.
      
      v2: Make PSInputAddr private, because there are never enough silly getters
          and setters for people to read.
      
      Reviewers: tstellarAMD, arsenm
      
      Subscribers: arsenm
      
      Differential Revision: http://reviews.llvm.org/D16030
      
      llvm-svn: 257591
  14. Jan 12, 2016
  15. Jan 08, 2016
  16. Jan 07, 2016
      AMDGPU/SI: xnack_mask is always reserved on VI · 3c05d6d3
      Nicolai Haehnle authored
      Summary:
      Somehow, I first interpreted the docs as saying space for xnack_mask is only
      reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and
      went back to actually test what is happening, and it turns out that xnack_mask
      is always reserved at least on Tonga and Carrizo, in the sense that flat_scr
      is always fixed below the SGPRs that are used to implement xnack_mask, whether
      or not they are actually used.
      
      I confirmed this by writing a shader using inline assembly to tease out the
      aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where
      we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so
      xnack_mask is s[76:77] and vcc is s[78:79]).
      
      This patch changes both the calculation of the total number of SGPRs and the
      various register reservations to account for this.
      
      It ought to be possible to use the gap left by xnack_mask when the feature
      isn't used, but this patch doesn't try to do that. (Note that the same applies
      to vcc.)
      
      Note that previously, even before my earlier change in r256794, the SGPRs that
      alias xnack_mask could end up being used as well when flat_scr was unused
      and the total number of SGPRs happened to fall on the right alignment
      (e.g. if the highest regular SGPR in use is s29 and VCC is used, the number
      of SGPRs is 32, and s28 and s29 alias xnack_mask). So if there
      were some conflict due to such aliasing, we should have noticed that already.
      
      Reviewers: arsenm, tstellarAMD
      
      Subscribers: arsenm, llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D15898
      
      llvm-svn: 257073
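
The aliasing arithmetic in this commit message can be checked with a small sketch. It assumes, as the message describes for Tonga and Carrizo, that vcc, xnack_mask and flat_scratch occupy the top three SGPR pairs in that order counting down from the total. The function name is hypothetical.

```python
def reserved_sgpr_layout(total_sgprs):
    """Map each special register to the (low, high) SGPR pair it aliases.

    Assumption for this sketch: the pairs are stacked at the top of the
    SGPR file, with vcc highest, then xnack_mask, then flat_scratch.
    """
    return {
        "vcc": (total_sgprs - 2, total_sgprs - 1),
        "xnack_mask": (total_sgprs - 4, total_sgprs - 3),
        "flat_scratch": (total_sgprs - 6, total_sgprs - 5),
    }


# Tonga case from the message: the SGPR count is fixed to 80.
print(reserved_sgpr_layout(80))
# {'vcc': (78, 79), 'xnack_mask': (76, 77), 'flat_scratch': (74, 75)}

# Older case from the message: s29 is the highest regular SGPR and VCC is
# used, so 30 regular SGPRs + 2 for vcc = 32 in total.
print(reserved_sgpr_layout(30 + 2)["xnack_mask"])  # (28, 29)
```

Both results match the numbers quoted above: s[74:75] for flat_scratch with 80 SGPRs, and s28/s29 aliasing xnack_mask with 32 SGPRs.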
  17. Jan 05, 2016
      AMDGPU: add +xnack feature · 5b504976
      Nicolai Haehnle authored
      Summary:
      Enabling this feature will account for the two SGPRs used by the hardware
      to store the XNACK_MASK physically.
      
      The hardware only requires this reservation when the XNACK feature is
      explicitly enabled. At some point, HSA will probably want to do that, but
      it does increase SGPR register pressure, so leave it disabled by default
      for now (but do add a small test).
      
      Reviewers: arsenm, tstellarAMD
      
      Subscribers: arsenm, llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D15869
      
      llvm-svn: 256794
  18. Dec 17, 2015
  19. Dec 16, 2015
  20. Dec 15, 2015
  21. Dec 10, 2015
  22. Dec 03, 2015
  23. Dec 02, 2015
  24. Nov 30, 2015
      AMDGPU: Error if too many user SGPRs used · 41003af2
      Matt Arsenault authored
      llvm-svn: 254332
      AMDGPU: Rework how private buffer passed for HSA · 26f8f3db
      Matt Arsenault authored
      If we know we have stack objects, we reserve the registers
      that the private buffer resource and wave offset are passed in
      and use them directly.
      
      If not, reserve the last 5 SGPRs just in case we need to spill.
      After register allocation, try to pick the next available registers
      instead of the last SGPRs, and then insert copies from the inputs
      to the reserved registers in the prologue.
      
      This also only selectively enables all of the input registers
      which are really required instead of always enabling them.
      
      llvm-svn: 254331
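
A rough sketch of the "pick the next available registers" step described in this commit, under stated assumptions rather than as the actual implementation: the 5 reserved SGPRs are taken to be a 4-register resource descriptor plus 1 wave offset register, and the descriptor is assumed to need a 4-aligned base. The function name and the set-based register model are hypothetical.

```python
def pick_private_buffer_regs(used, total_sgprs):
    """Choose SGPRs for the private buffer resource and the wave offset.

    used        -- set of SGPR indices already taken after register allocation
    total_sgprs -- number of SGPRs available
    Returns (resource_regs, wave_offset_reg), or None if nothing fits.
    Assumption: the resource needs 4 consecutive SGPRs on a 4-aligned base.
    """
    resource = None
    for base in range(0, total_sgprs - 3, 4):
        candidate = range(base, base + 4)
        if all(reg not in used for reg in candidate):
            resource = list(candidate)
            break
    if resource is None:
        return None

    taken = set(used) | set(resource)
    for reg in range(total_sgprs):
        if reg not in taken:
            return resource, reg
    return None


# s0..s5 are in use, so the lowest free aligned block is s[8:11]
# and the lowest remaining free register is s6.
print(pick_private_buffer_regs({0, 1, 2, 3, 4, 5}, 104))
# ([8, 9, 10, 11], 6)
```

Choosing low registers here, instead of keeping the last 5 SGPRs, is what keeps the reported SGPR count small when the spill registers turn out to be needed.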
  25. Nov 26, 2015
  26. Nov 11, 2015
  27. Nov 06, 2015
  28. Nov 05, 2015
  29. Oct 01, 2015
  30. Sep 22, 2015
  31. Aug 15, 2015
  32. Aug 12, 2015
  33. Jun 26, 2015