  1. Sep 20, 2013
  2. Sep 19, 2013
  3. Sep 17, 2013
    • Costmodel: Add support for horizontal vector reductions · cae8735a
      Arnold Schwaighofer authored
      Upcoming SLP vectorization improvements will want to be able to estimate costs
      of horizontal reductions. Add infrastructure to support this.
      
      We model reductions as a series of (shufflevector,add) tuples ultimately
      followed by an extractelement. For example, for an add-reduction of <4 x float>
      we could generate the following sequence:
      
       (v0, v1, v2, v3)
         \   \  /  /
           \  \  /
             +  +
      
       (v0+v2, v1+v3, undef, undef)
          \      /
       ((v0+v2) + (v1+v3), undef, undef, undef)
      
       %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef,
                                 <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
       %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
       %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef,
                                <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
       %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
       %r = extractelement <4 x float> %bin.rdx8, i32 0
      
      This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)"
      that will allow clients to ask for the cost of such a reduction (as backends
      might generate more efficient code than the cost of the individual instructions
      summed up). This interface is exercised by the CostModel analysis pass which
      looks for reduction patterns like the one above - starting at extractelements -
      and if it sees a matching sequence will call the cost model interface.
      
      We will also support a second form, the pairwise reduction, which is well
      supported on common architectures (haddps, vpadd, faddp).
      
       (v0, v1, v2, v3)
        \   /    \  /
       (v0+v1, v2+v3, undef, undef)
          \     /
       ((v0+v1)+(v2+v3), undef, undef, undef)
      
        %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
              <4 x i32> <i32 0, i32 2, i32 undef, i32 undef>
        %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
              <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
        %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
        %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
              <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
        %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
              <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
        %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
        %r = extractelement <4 x float> %bin.rdx.1, i32 0
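      
      The two expansions above differ only in how many shuffles each level needs.
      As a rough sketch (not the values LLVM's getReductionCost returns), the
      instruction counts of both sequences can be tallied as follows. The names
      ReductionOps and countReductionOps are hypothetical and exist only for this
      illustration; the sketch assumes a power-of-two vector width.

```cpp
#include <cassert>

// Hypothetical instruction tally for one expanded horizontal reduction.
// These are plain instruction counts, not target-specific costs.
struct ReductionOps {
  unsigned Shuffles;
  unsigned BinOps;
  unsigned Extracts;
  unsigned total() const { return Shuffles + BinOps + Extracts; }
};

// Count the instructions in the expanded sequence. Each level halves the
// number of live lanes until a single lane remains.
inline ReductionOps countReductionOps(unsigned NumElts, bool IsPairwise) {
  assert(NumElts >= 2 && (NumElts & (NumElts - 1)) == 0 &&
         "sketch only handles power-of-two widths");
  ReductionOps Ops{0, 0, 0};
  for (unsigned Live = NumElts; Live > 1; Live /= 2) {
    // Splitting form: one shuffle moves the upper live half down.
    // Pairwise form: two shuffles select the even and the odd lanes.
    Ops.Shuffles += IsPairwise ? 2 : 1;
    Ops.BinOps += 1;
  }
  Ops.Extracts = 1; // the final extractelement of lane 0
  return Ops;
}
```

      For <4 x float> this gives 2 shuffles, 2 fadds and 1 extract for the first
      form, and 4 shuffles, 2 fadds and 1 extract for the pairwise form, matching
      the IR listed above. A backend may well do better than this sum, which is
      the reason for a dedicated interface.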
      
      llvm-svn: 190876
    • Added documentation to getMemsetStores. · 8ec39992
      Serge Pavlov authored
      llvm-svn: 190866
    • [SelectionDAG] Teach the vector scalarizer about TRUNCATE. · d30a9585
      Quentin Colombet authored
      When a truncate node defines a legal vector type but uses an illegal
      vector type, the legalization process split the vector down to a
      <1 x vector> type, but then failed to scalarize the node because
      it did not know how to handle TRUNCATE.
      
      <rdar://problem/14989896>
      
      llvm-svn: 190830
    • Debug info: Fix PR16736 and rdar://problem/14990587. · db3e26d1
      Adrian Prantl authored
      A DBG_VALUE is register-indirect iff the first operand is a register
      _and_ the second operand is an immediate.
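      
      The rule can be written as a small predicate. This is only a sketch: the
      Operand type below is a hypothetical stand-in, not LLVM's MachineOperand,
      and the helper name isRegisterIndirect is made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine operand.
struct Operand {
  enum Kind { Register, Immediate, Other } K;
};

// Register-indirect only when operand 0 is a register AND operand 1 is an
// immediate; either condition alone is not enough.
inline bool isRegisterIndirect(const std::vector<Operand> &Ops) {
  assert(Ops.size() >= 2 && "sketch expects at least two operands");
  return Ops[0].K == Operand::Register && Ops[1].K == Operand::Immediate;
}
```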
      
      llvm-svn: 190821
    • Use reference instead of copy. · ec2ffa92
      Jakub Staszak authored
      llvm-svn: 190813
  4. Sep 16, 2013
  5. Sep 15, 2013
  6. Sep 13, 2013
    • [Peephole] Rewrite copies to avoid cross register banks copies. · cf71c632
      Quentin Colombet authored
      By definition, copies across register banks are not coalescable. Still, it may be
      possible to get rid of such a copy when the value is available in another
      register of the same register file.
      Consider the following example, where uppercase and lowercase letters denote
      different register files:
      b = copy A <-- cross-bank copy
      ...
      C = copy b <-- cross-bank copy
      
      This could have been optimized this way:
      b = copy A  <-- cross-bank copy
      ...
      C = copy A <-- same-bank copy
      
      Note: b and C's definitions may be in different basic blocks.
      
      This patch adds a peephole optimization that looks through a chain of copies
      leading to a cross-bank copy and reuses a source that is on the same register
      file if available.
      
      This solution could also be used to get rid of some copies (e.g., A could have
      been used instead of C). However, we do not do so because:
      - It may over-constrain the coloring of the source register for coalescing.
      - The register allocator may not be able to find a nice split point for the
        longer live-range, leading to more spills.
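      
      The look-through step described above can be sketched as a walk over a
      chain of copies. This is a simplified model under stated assumptions, not
      the pass itself: Reg, CopyMap and findSameBankSource are hypothetical
      names, and a register bank is modeled as a plain integer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical model: a register is a name plus the bank it lives in.
struct Reg {
  std::string Name;
  int Bank;
};

// Maps a register defined by a copy to the register it was copied from.
using CopyMap = std::map<std::string, Reg>;

// Walk the chain of copies feeding Src and return the first source that
// lives in DstBank. Falls back to Src itself when no such source exists.
inline Reg findSameBankSource(int DstBank, const Reg &Src,
                              const CopyMap &CopySrc) {
  assert(!Src.Name.empty() && "source register must be named");
  Reg Cur = Src;
  // Bound the walk so a malformed, cyclic map cannot loop forever.
  for (std::size_t Steps = 0; Steps <= CopySrc.size(); ++Steps) {
    if (Cur.Bank == DstBank)
      return Cur;
    CopyMap::const_iterator It = CopySrc.find(Cur.Name);
    if (It == CopySrc.end())
      break;
    Cur = It->second;
  }
  return Src;
}
```

      With "b = copy A" recorded in the map and C living in A's bank, the copy
      "C = copy b" resolves to A, which is the rewrite shown in the example.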
      
      <rdar://problem/14742333>
      
      llvm-svn: 190713
    • Add initial support for handling gnu style pubnames accepted by some · dd1a0120
      Eric Christopher authored
      versions of gold. This support is designed to allow gold to produce
      gdb_index sections similar to the accelerator tables and consumable
      by gdb.
      
      llvm-svn: 190649
    • Reformat and hoist section grabbing to top level. · 8b3737fb
      Eric Christopher authored
      llvm-svn: 190648
  7. Sep 12, 2013
    • Add an instruction deprecation feature to TableGen. · 0e76fa7d
      Joey Gouly authored
      The 'Deprecated' class allows you to specify a SubtargetFeature that the
      instruction is deprecated on.
      
      The 'ComplexDeprecationPredicate' class allows you to define a custom
      predicate that is called to check for deprecation.
      For example:
        ComplexDeprecationPredicate<"MCR">
      
      would mean you would have to define the following function:
        bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                                   std::string &Info)
      
      It returns 'false' if the instruction is not deprecated, and 'true' if it
      is, in which case it also stores the warning message in 'Info'.
      
      The MCTargetAsmParser constructor was changed to take an extra argument of
      the MCInstrInfo class, so out-of-tree targets will need to be changed.
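      
      As an illustration of the contract for such a predicate, here is a
      self-contained sketch. MCInst and MCSubtargetInfo below are hypothetical
      stand-ins so that the example compiles on its own (the real classes live in
      LLVM's MC library), and the DeprecatesMCR flag is invented for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the real LLVM MC classes.
struct MCInst {
  unsigned Opcode;
};
struct MCSubtargetInfo {
  bool DeprecatesMCR; // invented feature flag for this sketch
};

// Same shape as the function the commit message asks targets to define:
// returns false when not deprecated; returns true and fills Info otherwise.
inline bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                                  std::string &Info) {
  (void)MI; // a real predicate would typically inspect the operands
  if (!STI.DeprecatesMCR)
    return false;
  Info = "deprecated on this subtarget";
  return true;
}
```

      A caller would emit Info as a warning only when the function returns true;
      Info is left untouched otherwise.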
      
      llvm-svn: 190598
    • Fix crash in AggressiveAntiDepBreaker with empty CriticalPathSet · 6f1ff8e1
      Hal Finkel authored
      If no register classes are added to CriticalPathRCs, then the CriticalPathSet
      bitmask will be empty. In that case, ExcludeRegs must remain NULL or else this
      line will cause a segfault:
      
        } else if ((ExcludeRegs != NULL) && ExcludeRegs->test(AntiDepReg)) {
      
      I have no in-tree test case.
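      
      The failure mode can be sketched with a plain bit vector. This is not the
      code from AggressiveAntiDepBreaker: RegMask, pickExcludeRegs and isExcluded
      are hypothetical names, and "empty" is assumed here to mean a mask with no
      bits allocated at all.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a register bit mask.
using RegMask = std::vector<bool>;

// Leave the pointer null when the mask is empty, so that later tests are
// skipped instead of indexing into a mask that has no bits.
inline const RegMask *pickExcludeRegs(const RegMask &CriticalPathSet) {
  return CriticalPathSet.empty() ? nullptr : &CriticalPathSet;
}

// Mirrors the guarded test quoted above: a null mask excludes nothing.
inline bool isExcluded(const RegMask *ExcludeRegs, unsigned AntiDepReg) {
  if (ExcludeRegs == nullptr)
    return false;
  assert(AntiDepReg < ExcludeRegs->size() && "register index out of range");
  return (*ExcludeRegs)[AntiDepReg];
}
```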
      
      llvm-svn: 190584
    • Remove pointless assertion after r190376 · bc08ddba
      Matt Arsenault authored
      llvm-svn: 190565
  8. Sep 11, 2013
  9. Sep 10, 2013
    • Hoist section call out of loop. · 13b99d2a
      Eric Christopher authored
      llvm-svn: 190440
    • Debug Info: create scope children DIEs when the scope DIE is not null. · 2312ed35
      Manman Ren authored
      We try to create the scope children DIEs after we create the scope DIE. But
      to avoid emitting an empty lexical block DIE, we first check whether a scope
      DIE is going to be null, then create the scope children if it is not null.
      From the number of children, we decide whether to actually create the scope DIE.
      
      This patch also removes an early exit which checks for a special condition.
      It also removes the deletion of unused children DIEs that were generated
      because we used to generate children DIEs before the scope DIE.
      
      Deletion of unused children DIEs may cause problems because we sometimes keep
      created DIEs in a member variable of a CU.
      
      llvm-svn: 190421
    • Debug Info: define a DIRef template. · 34b3dcc3
      Manman Ren authored
      Specialize the constructors for DIRef<DIScope> and DIRef<DIType> to make sure
      the Value is indeed a scope ref and a type ref.
      
      Use DIScopeRef for DIScope::getContext and DIType::getContext and use DITypeRef
      for getContainingType and getClassType.
      
      DIScope::generateRef now returns a DIScopeRef instead of a "Value *" for
      readability and type safety.
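      
      The general idea of a typed reference wrapper can be sketched as below.
      This is not the DIRef implementation: Node, Ref, KindOf and the tag types
      are hypothetical, and the kind check is modeled with a simple enum.

```cpp
#include <cassert>
#include <string>

// Hypothetical tags standing in for the debug-info node kinds.
struct ScopeTag {};
struct TypeTag {};

// Hypothetical untyped handle, playing the role of a raw "Value *".
struct Node {
  enum Kind { Scope, Type } K;
  std::string Name;
};

template <typename Tag> struct KindOf;
template <> struct KindOf<ScopeTag> {
  static constexpr Node::Kind value = Node::Scope;
};
template <> struct KindOf<TypeTag> {
  static constexpr Node::Kind value = Node::Type;
};

// Typed reference: construction checks that the node has the expected kind.
template <typename Tag> class Ref {
  const Node *Ptr;

public:
  explicit Ref(const Node *N) : Ptr(N) {
    assert((!N || N->K == KindOf<Tag>::value) && "node kind mismatch");
  }
  const Node *get() const { return Ptr; }
};

using ScopeRef = Ref<ScopeTag>;
using TypeRef = Ref<TypeTag>;

// Accepts only scope references; passing a TypeRef does not compile.
inline const std::string &scopeName(ScopeRef S) { return S.get()->Name; }
```

      Compared with passing the raw handle around, the wrapper moves the "is this
      really a scope?" question to the point of construction and makes the
      function signatures self-documenting.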
      
      llvm-svn: 190418
    • Don't use getSetCCResultType for creating a vselect · d232222f
      Matt Arsenault authored
      The vselect mask isn't a setcc.
      
      This breaks when the result of getSetCCResultType is larger than the
      vector operands.
      
      E.g., for %tmp = select i1 %cmp, <2 x i8> %a, <2 x i8> %b,
      when getSetCCResultType returns <2 x i32>, the assertion that
      (MaskTy.getSizeInBits() == Op1.getValueType().getSizeInBits())
      is hit.
      
      No test since I don't think I can hit this with any of the current
      targets. The R600/SI implementation would break, since it returns a
      vector of i1 for this, but it doesn't reach ExpandSELECT for other
      reasons.
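      
      The size mismatch is simple arithmetic, sketched here with a hypothetical
      VecTy struct (not LLVM's EVT) and a made-up helper name:

```cpp
#include <cassert>

// Hypothetical stand-in for a fixed-width vector type such as <2 x i8>.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
  unsigned sizeInBits() const { return NumElts * EltBits; }
};

// The size relation the quoted assertion expects between mask and operand.
inline bool maskMatchesOperand(VecTy MaskTy, VecTy OpTy) {
  assert(MaskTy.NumElts == OpTy.NumElts && "lane counts must agree");
  return MaskTy.sizeInBits() == OpTy.sizeInBits();
}
```

      A <2 x i32> mask is 64 bits wide while a <2 x i8> operand is 16 bits wide,
      so the check fails for the example above.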
      
      llvm-svn: 190376
    • Enable -misched-cyclicpath by default. · 6c88b350
      Andrew Trick authored
      llvm-svn: 190367