  1. Aug 26, 2010
    • IRgen/NEON: Fix codegen of vzip and vzipq. · e3d87d21
      Daniel Dunbar authored
       - Will be adding an executable test case to test-suite repo.
      
      llvm-svn: 112126
    • Finally pass "two floats in a 64-bit unit" as a <2 x float> instead of as a double in the x86-64 ABI. · 9f8b4518
      Chris Lattner authored
      This allows us to generate much better code for certain things, e.g.:
      
      _Complex float f32(_Complex float A, _Complex float B) {
        return A+B;
      }
      
      Used to compile into (look at the integer silliness!):
      
      _f32:                                   ## @f32
      ## BB#0:                                ## %entry
      	movd	%xmm1, %rax
      	movd	%eax, %xmm1
      	movd	%xmm0, %rcx
      	movd	%ecx, %xmm0
      	addss	%xmm1, %xmm0
      	movd	%xmm0, %edx
      	shrq	$32, %rax
      	movd	%eax, %xmm0
      	shrq	$32, %rcx
      	movd	%ecx, %xmm1
      	addss	%xmm0, %xmm1
      	movd	%xmm1, %eax
      	shlq	$32, %rax
      	addq	%rdx, %rax
      	movd	%rax, %xmm0
      	ret
      
      Now we get:
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$16, %xmm2, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm0
      	addss	%xmm1, %xmm0
      	pshufd	$16, %xmm0, %xmm1
      	movdqa	%xmm2, %xmm0
      	unpcklps	%xmm1, %xmm0
      	ret
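
The "two floats in a 64-bit unit" that this commit refers to can be made concrete with a small C sketch. It relies only on the C99 guarantee that a complex type is laid out like a two-element array of its real type. The names `float_pair`, `make_complex`, `split_complex`, and `complex_bits` are hypothetical helpers introduced for illustration; they are not part of Clang or of the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* C99 gives a complex type the same layout as a two-element array of its
   real type: element 0 is the real part, element 1 the imaginary part. */
static_assert(sizeof(float _Complex) == 2 * sizeof(float),
              "_Complex float should be exactly two floats");
static_assert(sizeof(float _Complex) == sizeof(uint64_t),
              "this sketch assumes 32-bit floats");

/* Hypothetical helper type: the two lanes of a _Complex float. */
typedef struct { float re, im; } float_pair;

/* Build a _Complex float by copying two floats into its storage. */
float _Complex make_complex(float re, float im) {
    float lanes[2] = { re, im };
    float _Complex z;
    memcpy(&z, lanes, sizeof z);
    return z;
}

/* Read the two lanes back out, the way a <2 x float> view would see them. */
float_pair split_complex(float _Complex z) {
    float lanes[2];
    memcpy(lanes, &z, sizeof lanes);
    float_pair p = { lanes[0], lanes[1] };
    return p;
}

/* View the same 8 bytes as one 64-bit integer, the way the old code
   generation effectively did with its movd/shrq/shlq sequence. */
uint64_t complex_bits(float _Complex z) {
    uint64_t bits;
    memcpy(&bits, &z, sizeof bits);
    return bits;
}
```

Assuming a little-endian target with IEEE 754 floats (as on x86-64), `complex_bits(make_complex(1.0f, 2.0f))` is `0x400000003F800000`: the real part sits in the low 32 bits and the imaginary part in the high 32 bits, which is why the older assembly needed `shrq $32` to reach the imaginary part.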
      
      This change also lets us compile code like:
      
      extern float _Complex ccoshf( float _Complex ) ;
      float _Complex ccosf ( float _Complex z ) {
       float _Complex iz;
       (__real__ iz) = -(__imag__ z);
       (__imag__ iz) = (__real__ z);
       return ccoshf(iz);
      }
      
      into:
      
      _ccosf:                                 ## @ccosf
      ## BB#0:                                ## %entry
      	pshufd	$1, %xmm0, %xmm1
      	xorps	LCPI4_0(%rip), %xmm1
      	unpcklps	%xmm0, %xmm1
      	movaps	%xmm1, %xmm0
      	jmp	_ccoshf                 ## TAILCALL
      
      instead of:
      
      _ccosf:                                 ## @ccosf
      ## BB#0:                                ## %entry
      	movd	%xmm0, %rax
      	movq	%rax, %rcx
      	shlq	$32, %rcx
      	shrq	$32, %rax
      	xorl	$-2147483648, %eax      ## imm = 0xFFFFFFFF80000000
      	addq	%rcx, %rax
      	movd	%rax, %xmm0
      	jmp	_ccoshf                 ## TAILCALL
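
The `ccosf` example builds `iz` by swapping the two lanes of `z` and negating the one that becomes the real part, which appears to be what the `pshufd` / `xorps` / `unpcklps` sequence in the newer assembly does on the vector register. A minimal C sketch of that lane operation follows; `times_i` is a hypothetical name, and `__real__` / `__imag__` are the GNU extensions already used in the commit message (accepted by GCC and Clang).

```c
#include <assert.h>

/* The sketch treats a _Complex float as two adjacent float lanes. */
static_assert(sizeof(float _Complex) == 2 * sizeof(float),
              "_Complex float should be exactly two floats");

/* Hypothetical helper: multiply z by the imaginary unit.
   (a + bi) * i = -b + ai, so the lanes swap and the new real lane
   has its sign flipped. */
float _Complex times_i(float _Complex z) {
    float _Complex iz;
    __real__ iz = -(__imag__ z);  /* sign flip: the xorps step */
    __imag__ iz = __real__ z;     /* lane swap: the shuffle/unpack steps */
    return iz;
}
```

With this helper, the `ccosf` body from the commit message reduces to `return ccoshf(times_i(z));`, using the identity cos(z) = cosh(iz).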
      
      
      There is still "stuff to be done" here for the struct case,
      but this resolves rdar://6379669 - [x86-64 ABI] Pass and return 
      _Complex float / double efficiently
      
      llvm-svn: 112111