  • Bruno Cardoso Lopes authored · 2e99f1b3
    Instead of always leaving the work to the generic legalizer when
    there is no native support for 256-bit shuffles, be smarter in some
    cases: for example, when specific 128-bit parts can be extracted and
    regular 128-bit shuffles used on them. Example:
    
    For this shuffle:
      shufflevector <4 x i64> %a, <4 x i64> %b, <4 x i32>
                    <i32 1, i32 0, i32 7, i32 6>
    
    This was expanded to:
      vextractf128  $1, %ymm1, %xmm2
      vpextrq $0, %xmm2, %rax
      vmovd %rax, %xmm1
      vpextrq $1, %xmm2, %rax
      vmovd %rax, %xmm2
      vpunpcklqdq %xmm1, %xmm2, %xmm1
      vpextrq $0, %xmm0, %rax
      vmovd %rax, %xmm2
      vpextrq $1, %xmm0, %rax
      vmovd %rax, %xmm0
      vpunpcklqdq %xmm2, %xmm0, %xmm0
      vinsertf128 $1, %xmm1, %ymm0, %ymm0
      ret
    
    Now we get:
      vshufpd $1, %xmm0, %xmm0, %xmm0
      vextractf128  $1, %ymm1, %xmm1
      vshufpd $1, %xmm1, %xmm1, %xmm1
      vinsertf128 $1, %xmm1, %ymm0, %ymm0
    
    llvm-svn: 137733
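
    The idea in the message above can be modelled outside the compiler.
    The following is a minimal Python sketch, not the code from this
    commit: `split_shuffle_mask` and `apply_pieces` are hypothetical
    names, and the sketch assumes that every 128-bit half of the result
    reads from a single 128-bit half of a single operand and that the
    mask contains no undef entries. The actual lowering may accept more
    cases than this.

```python
from typing import List, Optional, Tuple

# One piece per 128-bit half of the result:
# (source operand, 128-bit lane of that operand, mask local to that lane)
Piece = Tuple[int, int, List[int]]


def split_shuffle_mask(mask: List[int], num_elts: int) -> Optional[List[Piece]]:
    """Split a two-operand shuffle mask into one piece per result half.

    Returns None when a result half mixes lanes or operands, in which
    case this simplified model gives up (the generic path would be used).
    """
    half = num_elts // 2
    pieces: List[Piece] = []
    for start in (0, half):
        chunk = mask[start:start + half]
        # Indices 0..num_elts-1 select from operand 0, the rest from operand 1.
        origins = {(idx // num_elts, (idx % num_elts) // half) for idx in chunk}
        if len(origins) != 1:
            return None
        source, lane = origins.pop()
        pieces.append((source, lane, [idx % half for idx in chunk]))
    return pieces


def apply_pieces(a: List[int], b: List[int], pieces: List[Piece]) -> List[int]:
    """Rebuild the shuffled vector from per-half pieces."""
    half = len(a) // 2
    out: List[int] = []
    for source, lane, local in pieces:
        vec = (a, b)[source]
        lane_elts = vec[lane * half:(lane + 1) * half]  # extract a 128-bit half
        out.extend(lane_elts[i] for i in local)         # shuffle within that half
    return out


# The mask from the commit message: <i32 1, i32 0, i32 7, i32 6>
pieces = split_shuffle_mask([1, 0, 7, 6], 4)
result = apply_pieces([10, 11, 12, 13], [20, 21, 22, 23], pieces)
```

    For the mask in the example, the low half of the result is the low
    half of %a with its two elements swapped, and the high half is the
    high half of %b with its two elements swapped. That corresponds to
    the two vshufpd instructions and the single vextractf128 in the new
    output.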