  1. Sep 25, 2013
  2. Sep 24, 2013
  3. Sep 23, 2013
  4. Sep 22, 2013
  5. Sep 20, 2013
  6. Sep 19, 2013
  7. Sep 18, 2013
  8. Sep 17, 2013
    • Costmodel: Add support for horizontal vector reductions · cae8735a
      Arnold Schwaighofer authored
      Upcoming SLP vectorization improvements will want to be able to estimate costs
      of horizontal reductions. Add infrastructure to support this.
      
      We model reductions as a series of (shufflevector,add) tuples ultimately
      followed by an extractelement. For example, for an add-reduction of <4 x float>
      we could generate the following sequence:
      
       (v0, v1, v2, v3)
         \   \  /  /
           \  \  /
             +  +
      
       (v0+v2, v1+v3, undef, undef)
          \      /
       ((v0+v2) + (v1+v3), undef, undef, undef)
      
       %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef,
                                 <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
       %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
       %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef,
                                <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
       %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
       %r = extractelement <4 x float> %bin.rdx8, i32 0
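The IR sequence above can be mirrored on scalars to check which lanes end up being added. The following is a minimal sketch, not LLVM code: `Vec4`, `addLanes`, and `reduceSplit` are hypothetical names introduced for illustration, and undef lanes are modeled as 0.0f purely so the program is well defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a <4 x float> value (not an LLVM type).
using Vec4 = std::array<float, 4>;

// Lane-wise add, standing in for `fadd <4 x float>`.
Vec4 addLanes(const Vec4 &a, const Vec4 &b) {
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = a[i] + b[i];
  return r;
}

// Mirrors the shuffle/add/extract sequence of the first reduction form.
float reduceSplit(const Vec4 &v) {
  // shufflevector mask <2, 3, undef, undef>; undef modeled as 0.0f.
  Vec4 shuf1{v[2], v[3], 0.0f, 0.0f};
  Vec4 sum1 = addLanes(v, shuf1); // lanes 0,1 hold v0+v2 and v1+v3

  // shufflevector mask <1, undef, undef, undef>.
  Vec4 shuf2{sum1[1], 0.0f, 0.0f, 0.0f};
  Vec4 sum2 = addLanes(sum1, shuf2); // lane 0 holds (v0+v2)+(v1+v3)

  // extractelement ..., i32 0
  return sum2[0];
}
```

Only lane 0 of the final vector is meaningful; the other lanes correspond to the undef positions in the diagram.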
      
      This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)"
      that allows clients to ask for the cost of such a reduction (as backends
      might generate more efficient code than the cost of the individual instructions
      summed up). This interface is exercised by the CostModel analysis pass, which
      looks for reduction patterns like the one above, starting at extractelements,
      and calls the cost model interface when it sees a matching sequence.
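To make "the cost of the individual instructions summed up" concrete, here is a rough sketch of that naive baseline. It is not the code added by this commit: `UnitCosts` and `naiveReductionCost` are hypothetical names, the unit costs are placeholders, and the only things taken from the message are the (shufflevector, add) levels followed by one extractelement and the Pairwise flag, which selects the form that uses two shuffles per level.

```cpp
#include <cassert>

// Hypothetical per-instruction costs; a real target would supply its own.
struct UnitCosts {
  unsigned Shuffle = 1;
  unsigned Arith = 1;
  unsigned Extract = 1;
};

// Naive cost of reducing NumElts lanes (assumed to be a power of two):
// log2(NumElts) levels of shuffle(s) plus one add, then one extract.
unsigned naiveReductionCost(unsigned NumElts, bool IsPairwise,
                            const UnitCosts &C = UnitCosts()) {
  assert(NumElts >= 2 && (NumElts & (NumElts - 1)) == 0 &&
         "sketch only handles power-of-two widths");

  unsigned Levels = 0;
  for (unsigned N = NumElts; N > 1; N /= 2)
    ++Levels;

  // The pairwise form shuffles both operands of each add.
  unsigned ShufflesPerLevel = IsPairwise ? 2 : 1;
  return Levels * (ShufflesPerLevel * C.Shuffle + C.Arith) + C.Extract;
}
```

With unit costs, a 4-lane reduction comes out as 5 for the non-pairwise form and 7 for the pairwise form, which matches counting the instructions in the two IR listings of this message. A backend that can emit a dedicated horizontal-add instruction would be expected to report something lower than this sum.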
      
      We will also support a second form, pairwise reduction, that is well supported
      on common architectures (haddps, vpadd, faddp).
      
       (v0, v1, v2, v3)
        \   /    \  /
       (v0+v1, v2+v3, undef, undef)
          \     /
       ((v0+v1)+(v2+v3), undef, undef, undef)
      
        %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
              <4 x i32> <i32 0, i32 2, i32 undef, i32 undef>
        %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
              <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
        %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
        %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
              <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
        %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
              <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
        %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
        %r = extractelement <4 x float> %bin.rdx.1, i32 0
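The pairwise sequence can be checked the same way on scalars. This is an illustrative sketch only: `Vec4`, `addLanes`, and `reducePairwise` are hypothetical names, not LLVM API, and undef lanes are again modeled as 0.0f.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a <4 x float> value (not an LLVM type).
using Vec4 = std::array<float, 4>;

// Lane-wise add, standing in for `fadd <4 x float>`.
Vec4 addLanes(const Vec4 &a, const Vec4 &b) {
  Vec4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = a[i] + b[i];
  return r;
}

// Mirrors the pairwise form: both operands of each add are shuffles.
float reducePairwise(const Vec4 &v) {
  // Level 0: masks <0, 2, undef, undef> and <1, 3, undef, undef>.
  Vec4 even{v[0], v[2], 0.0f, 0.0f};
  Vec4 odd{v[1], v[3], 0.0f, 0.0f};
  Vec4 sum0 = addLanes(even, odd); // lanes 0,1 hold v0+v1 and v2+v3

  // Level 1: masks <0, undef, undef, undef> and <1, undef, undef, undef>.
  Vec4 lo{sum0[0], 0.0f, 0.0f, 0.0f};
  Vec4 hi{sum0[1], 0.0f, 0.0f, 0.0f};
  Vec4 sum1 = addLanes(lo, hi); // lane 0 holds (v0+v1)+(v2+v3)

  // extractelement ..., i32 0
  return sum1[0];
}
```

The additions are associated as (v0+v1)+(v2+v3) here, rather than (v0+v2)+(v1+v3) as in the first form, so for floating-point inputs the two forms are not guaranteed to round identically.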
      
      llvm-svn: 190876
    • Add llvm.x86.* intrinsics for Intel SHA Extensions · de39520f
      Ben Langmuir authored
      Add llvm.x86.* intrinsics for all of the Intel SHA Extensions instructions, as
      well as tests. Also remove mayLoad and hasSideEffects, which can be inferred
      from the instruction patterns.
      
      llvm-svn: 190864
    • 79d1bff2
      Craig Topper authored
    • simplify expression · 35c88587
      Adrian Prantl authored
      llvm-svn: 190826
    • Debug info: Fix PR16736 and rdar://problem/14990587. · db3e26d1
      Adrian Prantl authored
      A DBG_VALUE is register-indirect iff the first operand is a register
      _and_ the second operand is an immediate.
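The rule stated above is a simple conjunction, sketched here as a predicate. `OperandKind` and `isRegisterIndirectDbgValue` are hypothetical names used only for illustration; they do not correspond to LLVM's MachineOperand API or to the code changed by this commit.

```cpp
#include <cassert>

// Hypothetical classification of a machine operand (not LLVM's types).
enum class OperandKind { Register, Immediate, Other };

// A DBG_VALUE is register-indirect iff its first operand is a register
// and its second operand is an immediate.
constexpr bool isRegisterIndirectDbgValue(OperandKind First,
                                          OperandKind Second) {
  return First == OperandKind::Register && Second == OperandKind::Immediate;
}
```

Checking only one of the two operands would not satisfy the "iff" in the message; both conditions have to hold.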
      
      llvm-svn: 190821
    • MemCpyOptimizer: Use max legal int size instead of pointer size · 899f7d2b
      Matt Arsenault authored
      If there are no legal integers, assume 1 byte.
      
      This makes more sense than using the pointer size as
      a guess for the maximum GPR width.
      
      It is conceivable to want to use some 64-bit pointers
      on a target where 64-bit integers aren't legal.
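The fallback described above can be sketched as follows. This is an assumption-laden illustration rather than the MemCpyOptimizer code: `maxLegalIntBytes` is a hypothetical helper, and the list of legal integer widths in bits is assumed to be passed in by the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns the widest legal integer size in bytes,
// or 1 byte if the target declares no legal integer widths.
unsigned maxLegalIntBytes(const std::vector<unsigned> &LegalIntBits) {
  if (LegalIntBits.empty())
    return 1; // no legal integers: assume 1 byte

  unsigned MaxBits =
      *std::max_element(LegalIntBits.begin(), LegalIntBits.end());
  return MaxBits / 8;
}
```

For a target with legal widths 8, 16, and 32 but 64-bit pointers, this yields 4 bytes, whereas a guess based on the pointer size would yield 8.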
      
      llvm-svn: 190817
  9. Sep 16, 2013