  1. Jul 28, 2010
    • ~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types coming in future patches. · 269a6da0
      Nate Begeman authored
      
      For:
      
      define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
      entry:
        %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
        %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
        ret <2 x i64> %tmp2
      }
      
      We get:
      
      _shl:                                   ## @shl
      	pslld	$23, %xmm1
      	paddd	LCPI0_0, %xmm1
      	cvttps2dq	%xmm1, %xmm1
      	pmulld	%xmm1, %xmm0
      	ret
      
      Instead of:
      
      _shl:                                   ## @shl
      	pshufd	$3, %xmm0, %xmm2
      	movd	%xmm2, %eax
      	pshufd	$3, %xmm1, %xmm2
      	movd	%xmm2, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm2
      	pshufd	$1, %xmm0, %xmm3
      	movd	%xmm3, %eax
      	pshufd	$1, %xmm1, %xmm3
      	movd	%xmm3, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm3
      	punpckldq	%xmm2, %xmm3
      	movd	%xmm0, %eax
      	movd	%xmm1, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm2
      	movhlps	%xmm0, %xmm0
      	movd	%xmm0, %eax
      	movhlps	%xmm1, %xmm1
      	movd	%xmm1, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm0
      	punpckldq	%xmm0, %xmm2
      	movdqa	%xmm2, %xmm0
      	punpckldq	%xmm3, %xmm0
      	ret
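
The shorter sequence works by turning each shift count into a power of two and multiplying by it. A minimal scalar sketch of one lane in C, assuming IEEE-754 single-precision floats, shift counts in the range 0..31, and that LCPI0_0 (not shown in the message) holds the bit pattern of 1.0f, 0x3f800000, in every lane. The names shl_lane_via_float and shl_v4i32 are hypothetical and only illustrate the idea; this is not the code from the patch itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One lane of the sequence: returns value << count for count in 0..31. */
static uint32_t shl_lane_via_float(uint32_t value, uint32_t count)
{
    assert(count < 32); /* larger counts are outside the scope of this sketch */

    /* pslld $23: move the count into the float exponent field.
       paddd:     add the exponent bias. 0x3f800000 (the bits of 1.0f) is an
                  assumed value for LCPI0_0, which the message does not show. */
    uint32_t bits = (count << 23) + UINT32_C(0x3f800000);

    /* Reinterpret the bits as a float; it now holds exactly 2^count. */
    float pow2;
    memcpy(&pow2, &bits, sizeof pow2);

    /* cvttps2dq: convert to an integer. Going through uint32_t keeps the
       2^31 case well defined in C. */
    uint32_t multiplier = (uint32_t)pow2;

    /* pmulld: keep the low 32 bits of the product, i.e. value << count. */
    return value * multiplier;
}

/* Four lanes, mirroring the <4 x i32> operation lane by lane. */
static void shl_v4i32(uint32_t out[4], const uint32_t r[4], const uint32_t a[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = shl_lane_via_float(r[i], a[i]);
}
```

Multiplying by 2^count modulo 2^32 gives the same low 32 bits as a left shift, which is why a single pmulld can stand in for four scalar shll instructions.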
      
      llvm-svn: 109549
    • lookahead for ecma · c1124300
      Howard Hinnant authored
      llvm-svn: 109548
    • recommit simplification (originally r109504, backed out in r109508) now that problem in CallSiteBase is fixed · ef1ca24b
      Gabor Greif authored
      
      llvm-svn: 109547
    • Merge PCHWriterDecl.cpp's isRequiredDecl and CodeGenModule::MayDeferGeneration into a new function, DeclIsRequiredFunctionOrFileScopedVar. · 4fac2806
      Argyrios Kyrtzidis authored
      
      This function is part of the public CodeGen interface because it is essentially a CodeGen predicate that the PCH mechanism also needs,
      to determine whether a decl must be deserialized during PCH loading for codegen purposes.
      This fixes current (and avoids future) codegen-from-PCH bugs.
      
      llvm-svn: 109546
  2. Jul 27, 2010