  1. Aug 17, 2010
    • More fixes for win64: · 231ab847
      Anton Korobeynikov authored
        - Do not clobber al during variadic calls; this is an AMD64 ABI-only feature
        - Emit wincall64 where necessary
      Patch by Cameron Esfahani!
      
      llvm-svn: 111289
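      For context, a minimal standalone sketch (not from the patch) of the
      convention at stake: in the System V AMD64 ABI the caller puts the
      number of XMM registers used by a variadic call in al; Win64 has no
      such rule, so that register must be left alone there.

      #include <cstdio>

      int main() {
        // System V AMD64 codegen for this call sets al first, roughly:
        //     movl $1, %eax   ; one XMM register (the double) is in use
        //     callq printf
        // Win64 codegen must not emit that al write at all.
        std::printf("%f\n", 1.0);  // variadic call with one FP argument
        return 0;
      }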
  2. Aug 11, 2010
    • Use ISD::ADD instead of ISD::SUB with a negated constant. This · 5531aa4d
      Dan Gohman authored
      avoids trouble if the return type of TD->getPointerSize() is
      changed to something which doesn't promote to a signed type,
      and is simpler anyway.
      
      Also, use getCopyFromReg instead of getRegister to read a
      physical register's value.
      
      llvm-svn: 110835
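      The signedness hazard alluded to above fits in a few standalone lines
      (pointerSize here is an illustrative stand-in, not the TargetData API):

      #include <cstdint>
      #include <iostream>

      unsigned pointerSize() { return 8; }  // suppose an unsigned return type

      int main() {
        // Negation happens in unsigned arithmetic: 0xFFFFFFF8, not -8.
        int64_t wrong = -pointerSize();           // 4294967288; the sign is lost
        // Negating through an explicitly signed type is immune to changes
        // in the return type, which is what using ISD::ADD with a
        // pre-negated signed constant buys.
        int64_t right = -(int64_t)pointerSize();  // -8
        std::cout << wrong << " vs " << right << "\n";
      }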
    • Add AVX matching patterns to Packed Bit Test intrinsics. · 91d61df3
      Bruno Cardoso Lopes authored
      Apply the same approach as the SSE4.1 ptest intrinsics, but
      create a new x86 node, "testp", since AVX introduces
      vtestps/vtestpd instructions, which set ZF and CF depending
      on the sign-bit AND and ANDN of packed floating-point sources.

      This is slightly different from what "ptest" does.
      Tests are coming with the other 256-bit intrinsics tests.
      
      llvm-svn: 110744
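      A scalar model of the flag semantics behind the new "testp" node (an
      illustration, not LLVM code), for four packed single-precision lanes:

      #include <cstdint>

      struct TestpFlags { bool zf, cf; };

      // vtestps-style semantics: ZF is set when every sign bit of (a AND b)
      // is clear; CF is set when every sign bit of (NOT a AND b) is clear.
      TestpFlags testp(const uint32_t a[4], const uint32_t b[4]) {
        uint32_t and_signs = 0, andn_signs = 0;
        for (int i = 0; i < 4; ++i) {
          and_signs  |= (a[i] & b[i])  & 0x80000000u;  // sign bit of AND
          andn_signs |= (~a[i] & b[i]) & 0x80000000u;  // sign bit of ANDN
        }
        return { and_signs == 0, andn_signs == 0 };
      }

      Unlike ptest, which tests whole 128-bit values, only the sign bits of
      the floating-point lanes participate, which is the "slightly different"
      part noted above.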
  3. Jul 29, 2010
    • Revert r109652, and remove the offending assert in loadRegFromStackSlot instead. · ba0e124a
      Jakob Stoklund Olesen authored
      We do sometimes load from a too small stack slot when dealing with x86 arguments
      (varargs and smaller-than-32-bit args). It looks like we know what we are doing
      in those cases, so I am going to remove the assert instead of artificially
      enlarging stack slot sizes.
      
      The assert in storeRegToStackSlot stays in. We don't want to write beyond the
      bounds of a stack slot.
      
      llvm-svn: 109764
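      The invariant being kept, as a hedged standalone sketch (names are
      illustrative, not the actual LLVM assert):

      #include <cassert>

      // Loading fewer bytes than the slot holds is tolerated (see above);
      // a spill store, however, must never write past the end of its slot.
      void storeRegToSlot(unsigned SlotSizeBytes, unsigned RegSizeBytes) {
        assert(SlotSizeBytes >= RegSizeBytes &&
               "store would write beyond the stack slot");
        // ... emit the actual store ...
      }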
  4. Jul 28, 2010
    • Create a fixed stack object for varargs that is as large as any register. · f2234fbe
      Jakob Stoklund Olesen authored
      The size of this object isn't used for anything - technically it is of variable
      size.
      
      This avoids a false positive from the assert in
      X86InstrInfo::loadRegFromStackSlot, and fixes PR7735.
      
      llvm-svn: 109652
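      A hedged sketch of what such a fixed-object creation looks like against
      the MachineFrameInfo API of this era (the size and offset values are
      illustrative assumptions, not taken from the patch):

      // 16 bytes covers the largest register class (an XMM register), so a
      // load of any register size from this slot stays in bounds.
      int FI = MFI->CreateFixedObject(/*Size=*/16, /*SPOffset=*/Offset,
                                      /*Immutable=*/false);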
    • Implement a vectorized algorithm for <16 x i8> << <16 x i8> · 53afc8f0
      Nate Begeman authored
      This is about 4x faster and smaller than the existing scalarization.
      
      llvm-svn: 109566
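      One classic way to vectorize a per-lane i8 shift, shown as a scalar
      model of a single lane (a sketch of the general technique, not
      necessarily this patch's exact sequence): perform constant shifts of
      4, 2, and 1, each applied only in lanes whose shift amount has that
      bit set; in SSE the per-lane selection is done with masked shifts
      and pblendvb.

      #include <cstdint>

      uint8_t shl_lane(uint8_t x, uint8_t amt) {
        // Each step handles one bit of the 3-bit shift amount; amounts of
        // 8 or more are undefined for an i8 shift in LLVM IR anyway.
        if (amt & 4) x = (uint8_t)(x << 4);
        if (amt & 2) x = (uint8_t)(x << 2);
        if (amt & 1) x = (uint8_t)(x << 1);
        return x;
      }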
    • ~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types are coming in future patches. · 269a6da0
      Nate Begeman authored
      
      For:
      
      define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
      entry:
        %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
        %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
        ret <2 x i64> %tmp2
      }
      
      We get:
      
      _shl:                                   ## @shl
      	pslld	$23, %xmm1
      	paddd	LCPI0_0, %xmm1
      	cvttps2dq	%xmm1, %xmm1
      	pmulld	%xmm1, %xmm0
      	ret
      
      Instead of:
      
      _shl:                                   ## @shl
      	pshufd	$3, %xmm0, %xmm2
      	movd	%xmm2, %eax
      	pshufd	$3, %xmm1, %xmm2
      	movd	%xmm2, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm2
      	pshufd	$1, %xmm0, %xmm3
      	movd	%xmm3, %eax
      	pshufd	$1, %xmm1, %xmm3
      	movd	%xmm3, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm3
      	punpckldq	%xmm2, %xmm3
      	movd	%xmm0, %eax
      	movd	%xmm1, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm2
      	movhlps	%xmm0, %xmm0
      	movd	%xmm0, %eax
      	movhlps	%xmm1, %xmm1
      	movd	%xmm1, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm0
      	punpckldq	%xmm0, %xmm2
      	movdqa	%xmm2, %xmm0
      	punpckldq	%xmm3, %xmm0
      	ret
      
      llvm-svn: 109549
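      Why the four-instruction form works, as a standalone one-lane model
      (an explanatory sketch, assuming LCPI0_0 holds 0x3F800000, the bit
      pattern of 1.0f, in every lane): pslld $23 moves the shift amount s
      into the float exponent field, paddd adds the exponent bias so the
      lane's bit pattern becomes the float 2^s, cvttps2dq converts that to
      the integer 2^s, and pmulld finishes with x << s == x * 2^s.

      #include <cstdint>
      #include <cstring>

      uint32_t pow2_via_float(uint32_t s) {  // one lane, s in [0, 31)
        uint32_t bits = (s << 23) + 0x3F800000u;  // exponent = s + bias
        float f;
        std::memcpy(&f, &bits, sizeof f);
        return (uint32_t)f;                       // == 1u << s
      }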
  5. Jul 24, 2010
    • Add an ILP scheduler. This is a register pressure aware scheduler that's · 37b740c4
      Evan Cheng authored
      appropriate for targets without detailed instruction itineraries.
      The scheduler schedules for increased instruction level parallelism in
      low register pressure situation; it schedules to reduce register pressure
      when the register pressure becomes high.
      
      On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
      by 16%.
      
      llvm-svn: 109300
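      The two-mode heuristic reduces to a comparison like the following
      (schematic C++, not the actual scheduler's priority function):

      struct Candidate { int ilpGain; int pressureDelta; };

      // Pick for parallelism while pressure is low; once pressure is high,
      // pick whatever frees registers soonest.
      bool betterThan(const Candidate &a, const Candidate &b,
                      bool highPressure) {
        if (highPressure)
          return a.pressureDelta < b.pressureDelta;  // reduce pressure
        return a.ilpGain > b.ilpGain;                // expose parallelism
      }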
  6. Jul 23, 2010
    • The only supported calling convention for X86-64 uses · f2d75670
      Dale Johannesen authored
      SSE, so we can't return floating point values if this
      is disabled.  Detect this error for clang.
      
      With SSE1 only, f64 is a problem; it can be done, but
      neither llvm-gcc nor clang has ever generated correct
      code for it.  Since nobody noticed this I think it's
      OK to treat it as an error for now.
      
      This also handles SSE-sized vectors of floating point.
      8207686, 8204109.
      
      llvm-svn: 109201
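      A case this turns into a front-end-visible error (a hedged example;
      the exact diagnostic wording may differ):

      // On x86-64 the calling convention returns this value in %xmm0, so
      // compiling with SSE disabled (e.g. clang -mno-sse on x86_64) cannot
      // be done correctly and is now rejected rather than miscompiled.
      double half(double x) {
        return x * 0.5;
      }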