Skip to content
Commit ac27f0c8 authored by Bill Wendling's avatar Bill Wendling
Browse files

Replace the uint64_t -> double convertion algorithm with one that's more efficient.

This small bit of ASM code is sufficient to do what the old algorithm did:

     movq       %rax,  %xmm0
     punpckldq  (c0),  %xmm0  // c0: (uint4){ 0x43300000U, 0x45300000U, 0U, 0U }
     subpd      (c1),  %xmm0  // c1: (double2){ 0x1.0p52, 0x1.0p52 * 0x1.0p32 }
   #ifdef __SSE3__
     haddpd   %xmm0, %xmm0          
   #else
     pshufd   $0x4e, %xmm0, %xmm1 
     addpd    %xmm1, %xmm0
   #endif

It's arguably faster. One caveat, the 'haddpd' instruction isn't very fast on
all processors.
<rdar://problem/7719814>

llvm-svn: 147593
parent c1b312a5
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment