  1. Feb 06, 2014
    • Puyan Lotfi's avatar
      Yet another patch to reduce compile time for small programs: · efbcf494
      Puyan Lotfi authored
      The aim in this patch is to reduce work that VirtRegRewriter needs to do when
      telling MachineRegisterInfo which physregs are in use. Up until now
      VirtRegRewriter::rewrite has been doing rewriting and populating def info and
      then proceeding to set whether a physreg is used based on this info for every
      physreg that the target provides. This can be expensive when a target has an
      unusually high number of supported physregs, and is a noticeable chunk of
      compile time for small programs on such targets.
      
      So to reduce compile time, this patch simply adds the use of a SparseSet to the
      rewrite function that is used to flag each physreg that is encountered in a
      MachineFunction. Afterward, rather than iterating over the set of all physregs
      for a given target to set the physregs used in MachineRegisterInfo, the new way
      is to iterate over the set of physregs that were actually encountered and set
      in the SparseSet. This improves compile time because the existing rewrite
      function was iterating over all MachineOperands already, and because the
      iterations afterward to call setPhysRegUsed are reduced by use of the SparseSet data.
      
      llvm-svn: 200919
      efbcf494
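A minimal C++ sketch of the idea described in r200919: record each physreg as it is encountered, then call the "used" setter only for the recorded ones instead of for every physreg the target has. `TinySparseSet`, `UsedFlags`, and `markUsed` are hypothetical names for illustration. This is not llvm::SparseSet or the actual VirtRegRewriter code, and the sketch zero-fills its index array for simplicity.

```cpp
#include <cassert>
#include <vector>

// Minimal sparse set over the universe [0, Universe). Illustrative only;
// it follows the classic dense/sparse-array scheme, not LLVM's class.
class TinySparseSet {
  std::vector<unsigned> Dense;   // members, in insertion order
  std::vector<unsigned> Sparse;  // Sparse[Key] = index of Key in Dense
public:
  explicit TinySparseSet(unsigned Universe) : Sparse(Universe, 0) {}

  bool contains(unsigned Key) const {
    unsigned Idx = Sparse[Key];
    return Idx < Dense.size() && Dense[Idx] == Key;
  }
  void insert(unsigned Key) {
    if (contains(Key)) return;
    Sparse[Key] = static_cast<unsigned>(Dense.size());
    Dense.push_back(Key);
  }
  const std::vector<unsigned> &members() const { return Dense; }
};

// Hypothetical stand-in for the per-function "physreg used" flags.
struct UsedFlags {
  std::vector<bool> Used;
  unsigned Calls = 0;  // counts setter calls so the saving is visible
  explicit UsedFlags(unsigned NumRegs) : Used(NumRegs, false) {}
  void setPhysRegUsed(unsigned Reg) { Used[Reg] = true; ++Calls; }
};

// Flag each physreg seen in the operand stream, then visit only those
// registers rather than all NumRegs of them.
unsigned markUsed(const std::vector<unsigned> &OperandRegs, unsigned NumRegs,
                  UsedFlags &Flags) {
  TinySparseSet Seen(NumRegs);
  for (unsigned Reg : OperandRegs) Seen.insert(Reg);
  for (unsigned Reg : Seen.members()) Flags.setPhysRegUsed(Reg);
  return Flags.Calls;
}
```

With 1000 registers and operands that mention only three distinct ones, the second loop runs three times, not 1000.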
    • Puyan Lotfi's avatar
      The purpose of the following patch is to reduce compile time for compilation of small · 5eb10048
      Puyan Lotfi authored
      programs on targets with large register files. The root of the compile time
      overhead was in the use of llvm::SmallVector to hold PhysRegEntries, which
      resulted in slow-down from calling llvm::SmallVector::assign(N, 0). In contrast,
      std::vector uses the faster __platform_bzero to zero out primitive buffers when
      assign is called, while SmallVector uses an iterator.
      
      The fix for this was simply to replace the SmallVector with a dynamically
      allocated buffer and to initialize or reinitialize the buffer based on the
      total registers that the target architecture requires. The changes support
      cases where a pass manager may be reused for different targets, and note that
      the PhysRegEntries is allocated using calloc mainly for good form, and also to
      quiet tools like Valgrind (see comments for more info on this).
      
      There is an rdar to track the fact that SmallVector doesn't have platform
      specific speedup optimizations inside of it for things like this, and I'll
      create a bugzilla entry at some point soon as well.
      
      TL;DR: This fix replaces the expensive llvm::SmallVector<unsigned
      char>::assign(N, 0) with a call to calloc for N bytes which is much faster
      because SmallVector's assign uses iterators.
      
      llvm-svn: 200917
      5eb10048
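A minimal C++ sketch of the buffer handling described in r200917: a calloc-allocated byte array that is reallocated only when the register count changes. `PhysRegBytes` and `reinit` are hypothetical names, not the names used in the patch. The sketch also assumes that stale contents are acceptable when the size is unchanged, which the commit message does not spell out.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical holder for one byte of state per physical register.
class PhysRegBytes {
  unsigned char *Entries = nullptr;
  unsigned Count = 0;
public:
  PhysRegBytes() = default;
  PhysRegBytes(const PhysRegBytes &) = delete;
  PhysRegBytes &operator=(const PhysRegBytes &) = delete;
  ~PhysRegBytes() { std::free(Entries); }

  // Allocate on first use, and reallocate only when the register count
  // changes (for example, when the same object is reused for another
  // target). calloc hands back zero-filled memory without a per-element loop.
  void reinit(unsigned NumRegs) {
    if (Entries && NumRegs == Count) return;  // same size: keep the buffer
    std::free(Entries);
    Entries = static_cast<unsigned char *>(
        std::calloc(NumRegs ? NumRegs : 1, sizeof(unsigned char)));
    if (!Entries) std::abort();
    Count = NumRegs;
  }

  unsigned char &operator[](unsigned Reg) { return Entries[Reg]; }
  unsigned size() const { return Count; }
};
```

Calling `reinit` twice with the same count is a cheap early return. A different count frees the old buffer and hands back a fresh zero-filled one.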
    • Puyan Lotfi's avatar
      This small change reduces compile time for small programs on targets that have · 12ae04bd
      Puyan Lotfi authored
      large register files. The omission of Queries.clear() is perfectly safe because
      LiveIntervalUnion::Query doesn't contain any data that needs freeing and
      because LiveRegMatrix::runOnFunction happens to reset the OwningArrayPtr
      holding Queries every time it is run, so there's no need to zero out the
      queries either. Not having to do this for very large numbers of physregs
      is a noticeable constant cost reduction in compilation of small programs.
      
      llvm-svn: 200913
      12ae04bd
    • Juergen Ributzka's avatar
      [DAG] Don't pull the binary operation through the shift if the operands have opaque constants. · fa0eba6c
      Juergen Ributzka authored
      During DAGCombine visitShiftByConstant assumes that certain binary operations
      with only constant operands can always be folded successfully. This is no longer
      true when the constant is opaque. This commit fixes visitShiftByConstant by not
      performing the optimization for opaque constants. Otherwise we would end up in
      an infinite DAGCombine loop.
      
      llvm-svn: 200900
      fa0eba6c
    • Matt Arsenault's avatar
      Pass address space to allowsUnalignedMemoryAccesses · 1b55dd9a
      Matt Arsenault authored
      llvm-svn: 200888
      1b55dd9a
    • Matt Arsenault's avatar
      Add address space argument to allowsUnalignedMemoryAccesses. · 25793a3f
      Matt Arsenault authored
      On R600, some address spaces have more strict alignment
      requirements than others.
      
      llvm-svn: 200887
      25793a3f
  2. Feb 05, 2014
    • Quentin Colombet's avatar
      [RegAlloc] Add a last chance recoloring mechanism when everything else failed to · 87769713
      Quentin Colombet authored
      find a register.
      
      The idea is to choose a color for the variable that cannot be allocated and
      recolor its interferences around. Unlike the current register allocation scheme,
      it is allowed to change the color of an already assigned (but maybe not
      splittable or spillable) live interval while propagating this change to its
      neighbors.
      In other words, there are two things that may help find an available color:
      - Already assigned variables (RS_Done) can be recolored to a different color.
      - The recoloring can find solutions that need to touch more than just
        the neighbors of the currently allocated variable.
      
      E.g.,
      vA can use {R1, R2    }
      vB can use {    R2, R3}
      vC can use {R1        }
      Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and
      they all interfere.
      
      vA is assigned R1
      vB is assigned R2
      vC tries to evict vA but vA is already done.
      => Regular register allocation heuristic fails.
      
      Last chance recoloring kicks in:
      vC acts as if vA were evicted => vC uses R1.
      vC is marked as fixed.
      vA needs to find a color.
      None are available.
      vA cannot evict vC: vC is a fixed virtual register now.
      vA acts as if vB were evicted => vA uses R2.
      vB needs to find a color.
      R3 is available.
      Recoloring => vC = R1, vA = R2, vB = R3.
      
      <rdar://problem/15947839>
      
      llvm-svn: 200883
      87769713
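The walkthrough in r200883 can be sketched as a small recursive search in C++. This is a toy model under stated assumptions, not the RegAllocGreedy implementation: `Var`, `Allocator`, `isFree`, and `recolor` are hypothetical names, registers are plain integers, and the depth limit is an illustrative safeguard.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct Var {
  std::vector<int> Allowed;             // registers this variable may use
  std::vector<std::string> Interferes;  // variables it interferes with
};

struct Allocator {
  std::map<std::string, Var> Vars;
  std::map<std::string, int> Color;  // current assignment; absent = none

  // True if no interfering variable currently holds Reg.
  bool isFree(const std::string &V, int Reg) const {
    for (const auto &N : Vars.at(V).Interferes) {
      auto It = Color.find(N);
      if (It != Color.end() && It->second == Reg) return false;
    }
    return true;
  }

  // Try to give V a register, recoloring interfering variables if needed.
  // Variables in Fixed may not be evicted.
  bool recolor(const std::string &V, std::set<std::string> &Fixed,
               unsigned Depth) {
    if (Depth == 0) return false;
    // First look for a register that no neighbor holds.
    for (int Reg : Vars.at(V).Allowed)
      if (isFree(V, Reg)) { Color[V] = Reg; return true; }
    // Otherwise act as if the holders of Reg were evicted, then recolor them.
    for (int Reg : Vars.at(V).Allowed) {
      std::vector<std::string> Holders;
      bool Blocked = false;
      for (const auto &N : Vars.at(V).Interferes) {
        auto It = Color.find(N);
        if (It == Color.end() || It->second != Reg) continue;
        if (Fixed.count(N)) { Blocked = true; break; }
        Holders.push_back(N);
      }
      if (Blocked) continue;
      auto SavedColor = Color;  // snapshots for rollback on failure
      auto SavedFixed = Fixed;
      for (const auto &H : Holders) Color.erase(H);
      Color[V] = Reg;
      Fixed.insert(V);
      bool Ok = true;
      for (const auto &H : Holders)
        if (!recolor(H, Fixed, Depth - 1)) { Ok = false; break; }
      if (Ok) return true;
      Color = SavedColor;
      Fixed = SavedFixed;
    }
    return false;
  }
};
```

With vA = {R1, R2}, vB = {R2, R3}, vC = {R1}, all interfering, and the starting assignment vA = R1, vB = R2, calling `recolor("vC", Fixed, 4)` with an empty `Fixed` set follows the same steps as the walkthrough and ends with vC = R1, vA = R2, vB = R3.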
    • Rafael Espindola's avatar
      Remove support for not using .loc directives. · b4eec1da
      Rafael Espindola authored
      Clang itself was not using this. The only way to access it was via llc.
      
      llvm-svn: 200862
      b4eec1da
  3. Feb 04, 2014
  4. Feb 03, 2014
  5. Feb 01, 2014
  6. Jan 31, 2014
    • Paul Robinson's avatar
      If we're not producing DWARF accel tables, don't waste memory · 3878a781
      Paul Robinson authored
      keeping track of those entries.
      
      llvm-svn: 200572
      3878a781
    • Eric Christopher's avatar
      Add support for DW_FORM_flag and DW_FORM_flag_present to the DIE hashing · 4b1cf580
      Eric Christopher authored
      algorithm. Sink the 'A' + Attribute hash into each form so we don't
      have to check for valid forms before deciding whether to hash. This
      lets the default case return without doing anything.
      
      llvm-svn: 200571
      4b1cf580
    • David Blaikie's avatar
      DebugInfo: Flag type unit references as declarations · 322d79b4
      David Blaikie authored
      This ensures DWARF consumers don't confuse these references for
      definitions. I'd argue it might be nice to improve debuggers so we don't
      need this, but it's just one field in an abbreviation anyway - so it
      doesn't seem worth the fight.
      
      llvm-svn: 200569
      322d79b4
    • Manman Ren's avatar
      This patch teaches the DAGCombiner how to fold insert_subvector nodes · 413a6cb4
      Manman Ren authored
      when the input is a concat_vectors and the insert replaces one of the
      concat halves:
      
      Lower half: fold (insert_subvector (concat_vectors X, Y), Z) ->
      (concat_vectors Z, Y)
      Upper half: fold (insert_subvector (concat_vectors X, Y), Z) ->
      (concat_vectors X, Z)
      
      This can be seen with the following IR:
      
      define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) {
        %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
        %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0)
        ret <8 x float> %2
      }
      
      The vinsertf128 intrinsic is converted into an insert_subvector node
      in SelectionDAGBuilder.cpp.
      
      Using AVX, without the patch this generates two vinsertf128 instructions:
      
      vinsertf128 $1, %xmm1, %ymm0, %ymm0
      vinsertf128 $0, %xmm2, %ymm0, %ymm0
      
      With the patch this is optimized into:
      
      vinsertf128 $1, %xmm1, %ymm2, %ymm0
      
      Patch by Robert Lougher.
      
      llvm-svn: 200506
      413a6cb4
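The equivalence behind the fold in r200506 can be checked on a toy model in C++. The sketch below treats vectors as plain arrays of floats and makes the insertion index explicit (the fold lines in the commit message omit it). `concatVectors` and `insertSubvector` are hypothetical helpers that only mimic the semantics of the DAG nodes; this is not DAGCombiner code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;

// Model of concat_vectors: X followed by Y.
Vec concatVectors(const Vec &X, const Vec &Y) {
  Vec R(X);
  R.insert(R.end(), Y.begin(), Y.end());
  return R;
}

// Model of insert_subvector: overwrite Base[Idx, Idx + Sub.size()) with Sub.
Vec insertSubvector(Vec Base, const Vec &Sub, std::size_t Idx) {
  assert(Idx + Sub.size() <= Base.size());
  for (std::size_t I = 0; I < Sub.size(); ++I) Base[Idx + I] = Sub[I];
  return Base;
}
```

When X, Y, and Z have the same length, inserting Z at index 0 gives the same result as `concatVectors(Z, Y)`, and inserting it at index `X.size()` gives `concatVectors(X, Z)`. In both cases the first concat operand or the second is simply dropped, which is why one vinsertf128 suffices in the optimized output.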
    • Manman Ren's avatar
      PGO branch weight: update edge weights in SelectionDAGBuilder. · 4ece7452
      Manman Ren authored
      When converting from "or + br" to two branches, or converting from
      "and + br" to two branches, we correctly update the edge weights of
      the two branches.
      
      The previous attempt at r200431 was reverted at r200434 because of
      two test case failures. I modified my patch a little, but forgot
      to re-run "make check-all".
      
      Test case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of
      the patch's impact on branch probability which causes changes in
      spill placement.
      
      llvm-svn: 200502
      4ece7452
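The commit message for r200502 does not give the weight formula. As an illustration only, the C++ sketch below shows one split that keeps the overall probabilities unchanged when a branch with true/false weights A and B is turned into two branches. `SplitWeights`, `splitOr`, `splitAnd`, and the two probability helpers are hypothetical names, and the patch may choose different weights.

```cpp
#include <cassert>
#include <cstdint>

// Weights of the first branch and of the second (fall-through) branch.
struct SplitWeights {
  std::uint64_t FirstTrue, FirstFalse;
  std::uint64_t SecondTrue, SecondFalse;
};

// "or + br": the first branch goes to the true target or falls through to
// the second test. Each test takes half of the original true probability.
SplitWeights splitOr(std::uint64_t A, std::uint64_t B) {
  return {A, A + 2 * B, A, 2 * B};
}

// "and + br": the first branch falls through to the second test or goes to
// the false target. Each test takes half of the original false probability.
SplitWeights splitAnd(std::uint64_t A, std::uint64_t B) {
  return {2 * A + B, B, 2 * A, B};
}

// Overall probability of reaching the true target after an "or" split.
double overallTrueOr(const SplitWeights &S) {
  double T1 = double(S.FirstTrue) / double(S.FirstTrue + S.FirstFalse);
  double T2 = double(S.SecondTrue) / double(S.SecondTrue + S.SecondFalse);
  return T1 + (1.0 - T1) * T2;
}

// Overall probability of reaching the false target after an "and" split.
double overallFalseAnd(const SplitWeights &S) {
  double F1 = double(S.FirstFalse) / double(S.FirstTrue + S.FirstFalse);
  double F2 = double(S.SecondFalse) / double(S.SecondTrue + S.SecondFalse);
  return F1 + (1.0 - F1) * F2;
}
```

For A = 3 and B = 1, `splitOr` yields weights (3, 5) and (3, 2), so the true target is reached with probability 3/8 + 5/8 * 3/5 = 3/4, the same as A / (A + B) before the split.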
  7. Jan 30, 2014
  8. Jan 29, 2014
  9. Jan 28, 2014