Skip to content
  1. Oct 19, 2016
  2. Oct 18, 2016
    • Dehao Chen's avatar
      Using branch probability to guide critical edge splitting. · ea62ae98
      Dehao Chen authored
      Summary:
      The original heuristic to break critical edge during machine sink is relatively conservertive: when there is only one instruction sinkable to the critical edge, it is likely that the machine sink pass will not break the critical edge. This leads to many speculative instructions executed at runtime. However, with profile info, we could model the splitting benefits: if the critical edge has 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculated runtime instructions. This patch uses profile to guide critical edge splitting in machine sink pass.
      
      The performance impact on speccpu2006 on Intel sandybridge machines:
      
      spec/2006/fp/C++/444.namd                  25.3  +0.26%
      spec/2006/fp/C++/447.dealII               45.96  -0.10%
      spec/2006/fp/C++/450.soplex               41.97  +1.49%
      spec/2006/fp/C++/453.povray               36.83  -0.96%
      spec/2006/fp/C/433.milc                   23.81  +0.32%
      spec/2006/fp/C/470.lbm                    41.17  +0.34%
      spec/2006/fp/C/482.sphinx3                48.13  +0.69%
      spec/2006/int/C++/471.omnetpp             22.45  +3.25%
      spec/2006/int/C++/473.astar               21.35  -2.06%
      spec/2006/int/C++/483.xalancbmk           36.02  -2.39%
      spec/2006/int/C/400.perlbench              33.7  -0.17%
      spec/2006/int/C/401.bzip2                  22.9  +0.52%
      spec/2006/int/C/403.gcc                   32.42  -0.54%
      spec/2006/int/C/429.mcf                   39.59  +0.19%
      spec/2006/int/C/445.gobmk                 26.98  -0.00%
      spec/2006/int/C/456.hmmer                 24.52  -0.18%
      spec/2006/int/C/458.sjeng                 28.26  +0.02%
      spec/2006/int/C/462.libquantum            55.44  +3.74%
      spec/2006/int/C/464.h264ref               46.67  -0.39%
      
      geometric mean                                   +0.20%
      
      Manually checked 473 and 471 to verify the diff is in the noise range.
      
      Reviewers: rengolin, davidxl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24818
      
      llvm-svn: 284541
      ea62ae98
  3. Apr 05, 2016
    • Chuang-Yu Cheng's avatar
      Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge · d3fb38ca
      Chuang-Yu Cheng authored
      Presently, CodeGenPrepare deletes all nearly empty (only phi and branch)
      basic blocks. This pass can delete loop preheaders which frequently creates
      critical edges. A preheader can be a convenient place to spill registers to
      the stack. If the entrance to a loop body is a critical edge, then spills
      may occur in the loop body rather than immediately before it. This patch
      protects loop preheaders from deletion in CodeGenPrepare even if they are
      nearly empty.
      
      Since the patch alters the CFG, it affects a large number of test cases.
      In most cases, the changes are merely cosmetic (basic blocks have different
      names or instruction orders change slightly). I am somewhat concerned about
      the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not
      deleted, then the MIPS backend does not take advantage of a branch delay
      slot. Consequently, I would like some close review by a MIPS expert.
      
      The patch also partially subsumes D16893 from George Burgess IV. George
      correctly notes that CodeGenPrepare does not actually preserve the dominator
      tree. I think the dominator tree was usually not valid when CodeGenPrepare
      ran, but I am using LoopInfo to mark preheaders, so the dominator tree is
      now always valid before CodeGenPrepare.
      
      Author: Tom Jablin (tjablin)
      Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng
      
      http://reviews.llvm.org/D16984
      
      llvm-svn: 265397
      d3fb38ca
  4. Feb 27, 2015
    • David Blaikie's avatar
      [opaque pointer type] Add textual IR support for explicit type parameter to load instruction · a79ac14f
      David Blaikie authored
      Essentially the same as the GEP change in r230786.
      
      A similar migration script can be used to update test cases, though a few more
      test case improvements/changes were required this time around: (r229269-r229278)
      
      import fileinput
      import sys
      import re
      
      pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")
      
      for line in sys.stdin:
        sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))
      
      Reviewers: rafael, dexonsmith, grosser
      
      Differential Revision: http://reviews.llvm.org/D7649
      
      llvm-svn: 230794
      a79ac14f
    • David Blaikie's avatar
      [opaque pointer type] Add textual IR support for explicit type parameter to... · 79e6c749
      David Blaikie authored
      [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
      
      One of several parallel first steps to remove the target type of pointers,
      replacing them with a single opaque pointer type.
      
      This adds an explicit type parameter to the gep instruction so that when the
      first parameter becomes an opaque pointer type, the type to gep through is
      still available to the instructions.
      
      * This doesn't modify gep operators, only instructions (operators will be
        handled separately)
      
      * Textual IR changes only. Bitcode (including upgrade) and changing the
        in-memory representation will be in separate changes.
      
      * geps of vectors are transformed as:
          getelementptr <4 x float*> %x, ...
        ->getelementptr float, <4 x float*> %x, ...
        Then, once the opaque pointer type is introduced, this will ultimately look
        like:
          getelementptr float, <4 x ptr> %x
        with the unambiguous interpretation that it is a vector of pointers to float.
      
      * address spaces remain on the pointer, not the type:
          getelementptr float addrspace(1)* %x
        ->getelementptr float, float addrspace(1)* %x
        Then, eventually:
          getelementptr float, ptr addrspace(1) %x
      
      Importantly, the massive amount of test case churn has been automated by
      same crappy python code. I had to manually update a few test cases that
      wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
      python script just massages stdin and writes the result to stdout, I
      then wrapped that in a shell script to handle replacing files, then
      using the usual find+xargs to migrate all the files.
      
      update.py:
      import fileinput
      import sys
      import re
      
      ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
      normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
      
      def conv(match, line):
        if not match:
          return line
        line = match.groups()[0]
        if len(match.groups()[5]) == 0:
          line += match.groups()[2]
        line += match.groups()[3]
        line += ", "
        line += match.groups()[1]
        line += "\n"
        return line
      
      for line in sys.stdin:
        if line.find("getelementptr ") == line.find("getelementptr inbounds"):
          if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
            line = conv(re.match(ibrep, line), line)
        elif line.find("getelementptr ") != line.find("getelementptr ("):
          line = conv(re.match(normrep, line), line)
        sys.stdout.write(line)
      
      apply.sh:
      for name in "$@"
      do
        python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
        rm -f "$name.tmp"
      done
      
      The actual commands:
      From llvm/src:
      find test/ -name *.ll | xargs ./apply.sh
      From llvm/src/tools/clang:
      find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
      From llvm/src/tools/polly:
      find test/ -name *.ll | xargs ./apply.sh
      
      After that, check-all (with llvm, clang, clang-tools-extra, lld,
      compiler-rt, and polly all checked out).
      
      The extra 'rm' in the apply.sh script is due to a few files in clang's test
      suite using interesting unicode stuff that my python script was throwing
      exceptions on. None of those files needed to be migrated, so it seemed
      sufficient to ignore those cases.
      
      Reviewers: rafael, dexonsmith, grosser
      
      Differential Revision: http://reviews.llvm.org/D7636
      
      llvm-svn: 230786
      79e6c749
  5. Jul 14, 2013
    • Stephen Lin's avatar
      Mass update to CodeGen tests to use CHECK-LABEL for labels corresponding to... · d24ab20e
      Stephen Lin authored
      Mass update to CodeGen tests to use CHECK-LABEL for labels corresponding to function definitions for more informative error messages. No functionality change and all updated tests passed locally.
      
      This update was done with the following bash script:
      
        find test/CodeGen -name "*.ll" | \
        while read NAME; do
          echo "$NAME"
          if ! grep -q "^; *RUN: *llc.*debug" $NAME; then
            TEMP=`mktemp -t temp`
            cp $NAME $TEMP
            sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \
            while read FUNC; do
              sed -i '' "s/;\(.*\)\([A-Za-z0-9_-]*\):\( *\)$FUNC: *\$/;\1\2-LABEL:\3$FUNC:/g" $TEMP
            done
            sed -i '' "s/;\(.*\)-LABEL-LABEL:/;\1-LABEL:/" $TEMP
            sed -i '' "s/;\(.*\)-NEXT-LABEL:/;\1-NEXT:/" $TEMP
            sed -i '' "s/;\(.*\)-NOT-LABEL:/;\1-NOT:/" $TEMP
            sed -i '' "s/;\(.*\)-DAG-LABEL:/;\1-DAG:/" $TEMP
            mv $TEMP $NAME
          fi
        done
      
      llvm-svn: 186280
      d24ab20e
  6. Jan 07, 2012
    • Evan Cheng's avatar
      Added a late machine instruction copy propagation pass. This catches · 00b1a3cd
      Evan Cheng authored
      opportunities that only present themselves after late optimizations
      such as tail duplication .e.g.
      ## BB#1:
              movl    %eax, %ecx
              movl    %ecx, %eax
              ret
      
      The register allocator also leaves some of them around (due to false
      dep between copies from phi-elimination, etc.)
      
      This required some changes in codegen passes. Post-ra scheduler and the
      pseudo-instruction expansion passes have been moved after branch folding
      and tail merging. They were before branch folding before because it did
      not always update block livein's. That's fixed now. The pass change makes
      independently since we want to properly schedule instructions after
      branch folding / tail duplication.
      
      rdar://10428165
      rdar://10640363
      
      llvm-svn: 147716
      00b1a3cd
  7. Mar 11, 2011
  8. Mar 02, 2011
  9. Sep 27, 2010
  10. Sep 24, 2010
  11. Aug 17, 2010
Loading