  1. Feb 15, 2017
    • [AMDGPU] Revert failed scheduling · 582a5237
      Stanislav Mekhanoshin authored
      This patch reverts a region's scheduling to its original, untouched state
      if the new schedule has decreased occupancy.
      
      In addition, it switches to the TargetRegisterInfo occupancy callback
      for pressure limits instead of gradually raising limits that have just
      been exceeded. We now keep the best schedule, so we no longer need to
      tolerate worsened scheduling.
      
      Differential Revision: https://reviews.llvm.org/D29971
      
      llvm-svn: 295206
      582a5237
  2. Jan 24, 2017
  3. Dec 09, 2016
  4. May 05, 2016
  5. Apr 14, 2016
    • AMDGPU: Run SIFoldOperands after PeepholeOptimizer · 3d1c1deb
      Matt Arsenault authored
      PeepholeOptimizer cleans up redundant copies, which makes
      the operand folding more effective.
      
      shader-db stats:
      
      Totals:
      SGPRS: 34200 -> 34336 (0.40 %)
      VGPRS: 22118 -> 21655 (-2.09 %)
      Code Size: 632144 -> 633460 (0.21 %) bytes
      LDS: 11 -> 11 (0.00 %) blocks
      Scratch: 10240 -> 11264 (10.00 %) bytes per wave
      Max Waves: 8822 -> 8918 (1.09 %)
      Wait states: 0 -> 0 (0.00 %)
      
      Totals from affected shaders:
      SGPRS: 7704 -> 7840 (1.77 %)
      VGPRS: 5169 -> 4706 (-8.96 %)
      Code Size: 234444 -> 235760 (0.56 %) bytes
      LDS: 2 -> 2 (0.00 %) blocks
      Scratch: 0 -> 1024 (0.00 %) bytes per wave
      Max Waves: 1188 -> 1284 (8.08 %)
      Wait states: 0 -> 0 (0.00 %)
      
      Increases:
      SGPRS: 35 (0.01 %)
      VGPRS: 1 (0.00 %)
      Code Size: 59 (0.02 %)
      LDS: 0 (0.00 %)
      Scratch: 1 (0.00 %)
      Max Waves: 48 (0.02 %)
      Wait states: 0 (0.00 %)
      
      Decreases:
      SGPRS: 26 (0.01 %)
      VGPRS: 54 (0.02 %)
      Code Size: 68 (0.03 %)
      LDS: 0 (0.00 %)
      Scratch: 0 (0.00 %)
      Max Waves: 4 (0.00 %)
      Wait states: 0 (0.00 %)
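
The percentages in the two Totals blocks can be recomputed from the before/after counts. A minimal Python sketch, assuming plain relative change: `pct_change` is a hypothetical helper, not part of shader-db, and returning 0.0 for a zero baseline is an assumption inferred from the `Scratch: 0 -> 1024 (0.00 %)` line.

```python
def pct_change(before: int, after: int) -> float:
    """Relative change in percent; 0.0 when the baseline is zero (assumed)."""
    if before == 0:
        return 0.0
    return (after - before) / before * 100.0


# Counts taken from the Totals block of this commit message.
print(f"SGPRS: 34200 -> 34336 ({pct_change(34200, 34336):.2f} %)")
print(f"VGPRS: 22118 -> 21655 ({pct_change(22118, 21655):.2f} %)")
```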
      
      llvm-svn: 266378
      3d1c1deb
  6. Jun 13, 2015
  7. Feb 27, 2015
    • [opaque pointer type] Add textual IR support for explicit type parameter to load instruction · a79ac14f
      David Blaikie authored
      Essentially the same as the GEP change in r230786.
      
      A similar migration script can be used to update test cases, though a few more
      test case improvements/changes were required this time around: (r229269-r229278)
      
      import fileinput
      import sys
      import re
      
      pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")
      
      for line in sys.stdin:
        sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))
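
To see what the substitution does to a single line, the pattern can be exercised outside the stdin loop. This is only a sketch: the pattern is copied from the script above, while the `migrate_load` wrapper and the sample IR lines are hypothetical and exist purely for illustration.

```python
import re

# Pattern copied from the migration script in this commit message.
pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")


def migrate_load(line: str) -> str:
    """Hypothetical wrapper: apply the script's substitution to one line."""
    return re.sub(pat, r"\1, \2\3*\4", line)


# Made-up sample lines; the loaded type is repeated in front of the pointer type.
print(migrate_load("  %v = load i32* %p\n"), end="")
print(migrate_load("  %v = load float addrspace(1)* %p\n"), end="")
```

The first line becomes `%v = load i32, i32* %p`; in the second, the address space stays attached to the pointer operand rather than to the new explicit type.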
      
      Reviewers: rafael, dexonsmith, grosser
      
      Differential Revision: http://reviews.llvm.org/D7649
      
      llvm-svn: 230794
      a79ac14f
    • [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction · 79e6c749
      David Blaikie authored
      
      One of several parallel first steps to remove the target type of pointers,
      replacing them with a single opaque pointer type.
      
      This adds an explicit type parameter to the gep instruction so that when the
      first parameter becomes an opaque pointer type, the type to gep through is
      still available to the instructions.
      
      * This doesn't modify gep operators, only instructions (operators will be
        handled separately)
      
      * Textual IR changes only. Bitcode (including upgrade) and changing the
        in-memory representation will be in separate changes.
      
      * geps of vectors are transformed as:
          getelementptr <4 x float*> %x, ...
        ->getelementptr float, <4 x float*> %x, ...
        Then, once the opaque pointer type is introduced, this will ultimately look
        like:
          getelementptr float, <4 x ptr> %x
        with the unambiguous interpretation that it is a vector of pointers to float.
      
      * address spaces remain on the pointer, not the type:
          getelementptr float addrspace(1)* %x
        ->getelementptr float, float addrspace(1)* %x
        Then, eventually:
          getelementptr float, ptr addrspace(1) %x
      
      Importantly, the massive amount of test case churn has been automated by
      some crappy python code. I had to manually update a few test cases that
      wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
      python script just massages stdin and writes the result to stdout; I
      then wrapped that in a shell script to handle replacing files, and
      used the usual find+xargs to migrate all the files.
      
      update.py:
      import fileinput
      import sys
      import re
      
      ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
      normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
      
      def conv(match, line):
        if not match:
          return line
        line = match.groups()[0]
        if len(match.groups()[5]) == 0:
          line += match.groups()[2]
        line += match.groups()[3]
        line += ", "
        line += match.groups()[1]
        line += "\n"
        return line
      
      for line in sys.stdin:
        if line.find("getelementptr ") == line.find("getelementptr inbounds"):
          if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
            line = conv(re.match(ibrep, line), line)
        elif line.find("getelementptr ") != line.find("getelementptr ("):
          line = conv(re.match(normrep, line), line)
        sys.stdout.write(line)
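
The effect of `conv` on individual lines can be checked without piping whole files through the script. In this sketch, `normrep` and `conv` are copied from update.py above; only the sample IR lines and the final calls are new, and the samples are made up for illustration.

```python
import re

# Copied from update.py: the pattern for getelementptr without "inbounds".
normrep = re.compile(r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")


def conv(match, line):
  # Copied from update.py: rebuild the line with the explicit type inserted.
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line


# Made-up samples: a scalar pointer and a vector of pointers.
scalar = "  %p = getelementptr i32* %x, i32 1\n"
vector = "  %p = getelementptr <4 x float*> %x, <4 x i32> %i\n"
print(conv(re.match(normrep, scalar), scalar), end="")
print(conv(re.match(normrep, vector), vector), end="")
```

For the vector sample the explicit type is the element type `float`, not `<4 x float>`, which matches the vector-of-pointers rule described earlier in this message.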
      
      apply.sh:
      for name in "$@"
      do
        python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
        rm -f "$name.tmp"
      done
      
      The actual commands:
      From llvm/src:
      find test/ -name '*.ll' | xargs ./apply.sh
      From llvm/src/tools/clang:
      find test/ -name '*.mm' -o -name '*.m' -o -name '*.cpp' -o -name '*.c' | xargs -I '{}' ../../apply.sh "{}"
      From llvm/src/tools/polly:
      find test/ -name '*.ll' | xargs ./apply.sh
      
      After that, I ran check-all (with llvm, clang, clang-tools-extra, lld,
      compiler-rt, and polly all checked out).
      
      The extra 'rm' in the apply.sh script is due to a few files in clang's test
      suite using interesting unicode stuff that my python script was throwing
      exceptions on. None of those files needed to be migrated, so it seemed
      sufficient to ignore those cases.
      
      Reviewers: rafael, dexonsmith, grosser
      
      Differential Revision: http://reviews.llvm.org/D7636
      
      llvm-svn: 230786
      79e6c749
  8. Jan 27, 2015
  9. Jan 06, 2015
    • R600/SI: Add a stub GCNTargetMachine · 49f8bfdc
      Tom Stellard authored
      This is equivalent to the AMDGPUTargetMachine now, but it is the
      starting point for separating R600 and GCN functionality into separate
      targets.
      
      It is recommended that users start using the gcn triple for GCN-based
      GPUs, because using the r600 triple for these GPUs will be deprecated in
      the future.
      
      llvm-svn: 225277
      49f8bfdc
  10. Nov 05, 2014
    • R600/SI: Change all instruction assembly names to lowercase. · 326d6ece
      Tom Stellard authored
      This matches the format produced by the AMD proprietary driver.
      
      //==================================================================//
      // Shell script for converting .ll test cases (pass the .ll files
      // you want to convert to this script as arguments).
      //==================================================================//
      
      # This was necessary on my system so that A-Z in sed would match only
      # upper case.  I'm not sure why.
      export LC_ALL='C'
      
      TEST_FILES="$*"
      
      MATCHES=`grep -v Patterns SIInstructions.td | grep -o '"[A-Z0-9_]\+["e]' | grep -o '[A-Z0-9_]\+' | sort -r`
      
      for f in $TEST_FILES; do
        # Check that there are SI tests:
        grep -q -e 'verde' -e 'bonaire' -e 'SI' -e 'tahiti' $f
        if [ $? -eq 0 ]; then
          for match in $MATCHES; do
            sed -i -e "s/\([ :]$match\)/\L\1/" $f
          done
      
          # Try to get check lines with partial instruction names
          sed -i 's/\(;[ ]*SI[A-Z\\-]*: \)\([A-Z_0-9]\+\)/\1\L\2/' $f
        fi
      done
      
      sed -i -e 's/bb0_1/BB0_1/g' ../../../test/CodeGen/R600/infinite-loop.ll
      sed -i -e 's/SI-NOT: bfe/SI-NOT: {{[^@]}}bfe/g' ../../../test/CodeGen/R600/llvm.AMDGPU.bfe.*32.ll ../../../test/CodeGen/R600/sext-in-reg.ll
      sed -i -e 's/exp_IEEE/EXP_IEEE/g' ../../../test/CodeGen/R600/llvm.exp2.ll
      sed -i -e 's/numVgprs/NumVgprs/g' ../../../test/CodeGen/R600/register-count-comments.ll
      sed -i 's/\(; CHECK[-NOT]*: \)\([A-Z_0-9]\+\)/\1\L\2/' ../../../test/CodeGen/R600/select64.ll ../../../test/CodeGen/R600/sgpr-copy.ll
      
      //==================================================================//
      // Shell script for converting .td files (run this last)
      //==================================================================//
      
      export LC_ALL='C'
      sed -i -e '/Patterns/!s/\("[A-Z0-9_]\+[ "e]\)/\L\1/g' SIInstructions.td
      sed -i -e 's/"EXP/"exp/g' SIInstrInfo.td
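
The `\L` case-conversion escape used in these substitutions is a GNU sed extension. For readers without GNU sed, the SIInstructions.td substitution can be approximated in Python. This is a sketch only: `lowercase_asm_names` is a hypothetical name, and the sample strings are made up rather than taken from the real `.td` file.

```python
import re

# Same character classes as the sed expression: a quoted run of upper-case
# letters, digits and underscores, ended by a space, '"' or 'e'.
_NAME = re.compile(r'"[A-Z0-9_]+[ "e]')


def lowercase_asm_names(line: str) -> str:
    """Lower-case quoted instruction names, skipping lines that mention Patterns."""
    if "Patterns" in line:
        return line
    return _NAME.sub(lambda m: m.group(0).lower(), line)


# Made-up sample lines for illustration.
print(lowercase_asm_names('<0x3, "S_MOV_B32", []>'))
print(lowercase_asm_names('"V_ADD_F32_e32"'))
```

Only the quoted assembly string is changed; an unquoted definition name on the same line keeps its upper-case spelling.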
      
      llvm-svn: 221350
      326d6ece
  11. Oct 01, 2014
  12. Sep 24, 2014
  13. Sep 04, 2014
  14. Jun 05, 2014
  15. Apr 11, 2014
  16. Mar 27, 2014
  17. Nov 12, 2013
    • R600/SI: Change formatting of printed registers. · 72b31eee
      Matt Arsenault authored
      Print the range of registers used with a single letter prefix.
      This better matches what the shader compiler produces and
      is overall less obnoxious than concatenating all of the
      subregister names together.
      
      Instead of SGPR0, it will print s0. Instead of SGPR0_SGPR1,
      it will print s[0:1] and so on.
      
      There doesn't appear to be a straightforward way
      to get the actual register info in the InstPrinter,
      so this parses the generated name to print with the
      new syntax.
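
The name-parsing approach can be sketched as follows, in Python rather than the InstPrinter's C++. `format_reg` is a hypothetical helper, and the `v` prefix for VGPR names is an assumption; the message above only spells out the SGPR case.

```python
import re

# One component of a generated register name, e.g. "SGPR0" or "VGPR7".
_REG = re.compile(r"([SV])GPR(\d+)")


def format_reg(name: str) -> str:
    """Map a generated name such as "SGPR0_SGPR1" to the short range syntax."""
    parts = name.split("_")
    matches = [_REG.fullmatch(p) for p in parts]
    if not all(matches):
        return name  # not a plain SGPR/VGPR name: leave it untouched
    prefixes = {m.group(1) for m in matches}
    if len(prefixes) != 1:
        return name  # mixed register files: leave it untouched
    prefix = prefixes.pop().lower()
    first = int(matches[0].group(2))
    last = int(matches[-1].group(2))
    if len(parts) == 1:
        return f"{prefix}{first}"
    return f"{prefix}[{first}:{last}]"


print(format_reg("SGPR0"))
print(format_reg("SGPR0_SGPR1"))
```

The sketch trusts the first and last components to bound the range, which relies on the generated names listing subregisters in order.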
      
      The required test changes are pretty nasty, and register
      matching regexes are now worse. Since there isn't a way to
      add to a variable in FileCheck, some of the tests now don't
      check the exact number of registers used, but I don't think that
      will be a real problem.
      
      llvm-svn: 194443
      72b31eee
  18. Oct 10, 2013
    • R600/SI: Use -verify-machineinstrs for most tests · 70f13dba
      Tom Stellard authored
      We can't enable the verifier for tests with SI_IF and SI_ELSE, because
      these instructions are always followed by a COPY which copies their
      result to the next basic block.  This violates the machine verifier's
      rule that non-terminators cannot follow terminators.
      
      Reviewed-by: Vincent Lejeune <vljn at ovi.com>
      llvm-svn: 192366
      70f13dba
  19. Jun 25, 2013
  20. May 10, 2013