  1. Oct 19, 2016
  2. Oct 18, 2016
      Using branch probability to guide critical edge splitting. · ea62ae98
      Dehao Chen authored
      Summary:
      The original heuristic for breaking a critical edge during machine sinking is relatively conservative: when only one instruction is sinkable to the critical edge, the machine sink pass is likely not to break it. This leads to many speculative instructions being executed at runtime. However, with profile info, we can model the benefit of splitting: if the critical edge has a 50% taken rate, it would always be beneficial to split the critical edge to avoid the speculatively executed instructions. This patch uses profile info to guide critical edge splitting in the machine sink pass.
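
The reasoning in the summary can be illustrated with a small back-of-the-envelope model. This is only a sketch under stated assumptions, not the code from the patch: the names `SinkEstimate` and `estimateDynamicInstructions` are hypothetical, and the model ignores any cost of the extra block created by splitting. It compares how often the sinkable instructions run when they stay in the predecessor block with how often they run when they are sunk onto the split edge.

```cpp
// Hypothetical toy model; this is NOT the MachineSink implementation.
struct SinkEstimate {
    double withoutSplit;  // instructions stay in the predecessor (speculative)
    double withSplit;     // instructions live in a new block on the split edge
};

// predFreq:        how many times the predecessor block executes
// edgeProbability: probability in [0, 1] that the critical edge is taken
// numSinkable:     number of instructions that could be sunk onto the edge
SinkEstimate estimateDynamicInstructions(double predFreq,
                                         double edgeProbability,
                                         unsigned numSinkable) {
    SinkEstimate e;
    // Not split: every execution of the predecessor pays for the instructions.
    e.withoutSplit = predFreq * numSinkable;
    // Split: only executions that actually take the edge pay for them.
    e.withSplit = predFreq * edgeProbability * numSinkable;
    return e;
}
```

Under this model, a single sinkable instruction on an edge with a 50% taken rate executes 500 times instead of 1000 per 1000 executions of the predecessor, which is the kind of saving the summary describes.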
      
      The performance impact on speccpu2006 on Intel sandybridge machines:
      
      spec/2006/fp/C++/444.namd                  25.3  +0.26%
      spec/2006/fp/C++/447.dealII               45.96  -0.10%
      spec/2006/fp/C++/450.soplex               41.97  +1.49%
      spec/2006/fp/C++/453.povray               36.83  -0.96%
      spec/2006/fp/C/433.milc                   23.81  +0.32%
      spec/2006/fp/C/470.lbm                    41.17  +0.34%
      spec/2006/fp/C/482.sphinx3                48.13  +0.69%
      spec/2006/int/C++/471.omnetpp             22.45  +3.25%
      spec/2006/int/C++/473.astar               21.35  -2.06%
      spec/2006/int/C++/483.xalancbmk           36.02  -2.39%
      spec/2006/int/C/400.perlbench              33.7  -0.17%
      spec/2006/int/C/401.bzip2                  22.9  +0.52%
      spec/2006/int/C/403.gcc                   32.42  -0.54%
      spec/2006/int/C/429.mcf                   39.59  +0.19%
      spec/2006/int/C/445.gobmk                 26.98  -0.00%
      spec/2006/int/C/456.hmmer                 24.52  -0.18%
      spec/2006/int/C/458.sjeng                 28.26  +0.02%
      spec/2006/int/C/462.libquantum            55.44  +3.74%
      spec/2006/int/C/464.h264ref               46.67  -0.39%
      
      geometric mean                                   +0.20%
      
      Manually checked 473 and 471 to verify the diff is in the noise range.
      
      Reviewers: rengolin, davidxl
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24818
      
      llvm-svn: 284541
  3. Sep 26, 2015
  4. Sep 22, 2015
      [ARM] Emit clrex in the expanded cmpxchg fail block. · 81616a72
      Ahmed Bougacha authored
      ARM counterpart to r248291:
      
      In the comparison failure block of a cmpxchg expansion, the initial
      ldrex/ldxr will not be followed by a matching strex/stxr.
      On ARM/AArch64, this unnecessarily ties up the exclusive monitor,
      which might have a negative performance impact on some uarchs.
      
      Instead, release the monitor in the failure block.
      The clrex instruction was designed for this: use it.
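
For orientation, here is a source-level operation that such an expansion can come from, with the rough shape of the expanded loop shown in comments. This is a hedged sketch: the helper name `tryUpdate` is hypothetical, and the assembly in the comments is an approximation with illustrative register names and labels, not output copied from the compiler.

```cpp
#include <atomic>

// Hypothetical helper: returns true if `target` held `expected` and was
// replaced with `desired`; otherwise leaves `target` unchanged.
//
// On an ARM target that uses exclusive loads and stores, the expansion has
// roughly this shape (approximate, registers and labels are illustrative):
//
//   loop:  ldrex  r1, [r0]        ; exclusive load, claims the monitor
//          cmp    r1, r_expected
//          bne    fail            ; comparison failed, no strex will follow
//          strex  r2, r_desired, [r0]
//          cmp    r2, #0
//          bne    loop            ; store lost exclusivity, retry
//          b      done
//   fail:  clrex                  ; release the monitor on the failure path
//   done:
bool tryUpdate(std::atomic<int>& target, int expected, int desired) {
    return target.compare_exchange_strong(expected, desired,
                                          std::memory_order_relaxed,
                                          std::memory_order_relaxed);
}
```

The `fail` label corresponds to the comparison failure block mentioned above, which is where the commit places the clrex.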
      
      Also see ARMARM v8-A B2.10.2:
      "Exclusive access instructions and Shareable memory locations".
      
      Differential Revision: http://reviews.llvm.org/D13033
      
      llvm-svn: 248294
  5. Aug 21, 2014
  6. Jul 21, 2014
      Replace the result usages while legalizing cmpxchg. · 63bee2a2
      Logan Chien authored
      We should update the uses of all of the results;
      otherwise, we might hit an assertion failure or SEGV during
      the type legalization of ATOMIC_CMP_SWAP_WITH_SUCCESS
      with two or more illegal types.
      
      For example, in the following sequence, both i8 and i1
      might be illegal on some targets, e.g. armv5, mipsel, mips64el:
      
          %0 = cmpxchg i8* %ptr, i8 %desire, i8 %new monotonic monotonic
          %1 = extractvalue { i8, i1 } %0, 1
      
      Since both i8 and i1 should be legalized, the corresponding
      ATOMIC_CMP_SWAP_WITH_SUCCESS dag will be checked/replaced/updated
      twice.
      
      If we don't update the uses of *ALL* of the results in the
      first round, the DAG node for extractvalue might be processed earlier.
      GetPromotedInteger() will then hit an assertion failure,
      because its operand (i.e. the success bit of cmpxchg) has not
      been promoted beforehand.
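
The bookkeeping problem can be shown with a toy model. This is a sketch only and does not reproduce the SelectionDAG legalizer: `PromotionTable`, `recordResult`, `recordAllResults`, and `lookup` are hypothetical names, and the strings stand in for promoted values. A user of result 1 (the success bit) can only find its promoted operand if every result of the multi-result node was recorded when the node was first legalized.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a "promoted values" table, keyed by (node id, result index).
class PromotionTable {
public:
    // Record the replacement for a single result of a node.
    void recordResult(int node, unsigned index, std::string promoted) {
        table_[{node, index}] = std::move(promoted);
    }

    // Record replacements for every result of a node in one step.
    void recordAllResults(int node, const std::vector<std::string>& promoted) {
        for (std::size_t i = 0; i < promoted.size(); ++i)
            table_[{node, static_cast<unsigned>(i)}] = promoted[i];
    }

    // A real legalizer would assert here; the toy returns an empty optional
    // so that the missing entry can be observed.
    std::optional<std::string> lookup(int node, unsigned index) const {
        auto it = table_.find({node, index});
        if (it == table_.end())
            return std::nullopt;
        return it->second;
    }

private:
    std::map<std::pair<int, unsigned>, std::string> table_;
};
```

If only result 0 is recorded, a lookup of result 1 comes back empty, which corresponds to the situation in which the extractvalue node is processed before its operand has been promoted.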
      
      llvm-svn: 213569