  1. Jan 30, 2012
  2. Jan 25, 2012
  3. Jan 24, 2012
  4. Jan 23, 2012
  5. Jan 22, 2012
  6. Jan 19, 2012
  7. Jan 01, 2012
  8. Dec 17, 2011
  9. Dec 11, 2011
  10. Dec 06, 2011
  11. Nov 30, 2011
  12. Nov 28, 2011
  13. Nov 26, 2011
  14. Nov 24, 2011
  15. Nov 21, 2011
  16. Nov 19, 2011
  17. Nov 02, 2011
  18. Sep 22, 2011
  19. Sep 13, 2011
  20. Sep 12, 2011
  21. Sep 09, 2011
  22. Sep 08, 2011
  23. Aug 17, 2011
    • Introduce matching patterns for vbroadcast AVX instruction. · be5e9873
      Bruno Cardoso Lopes authored
      The idea is to match splats of the form (splat (scalar_to_vector (load ...)))
      whenever the load can be folded. All the logic and instruction emission
      works, but because of PR8156 there is currently no way to match the loads:
      they can never be folded for splats. The tests are therefore XFAILed, but
      I've tested and exercised all the logic using a relaxed check for
      foldable loads, as if the bug were already fixed. This should work out of
      the box once PR8156 is fixed, since MayFoldLoad will then work as expected.
      
      llvm-svn: 137810
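      The splat pattern described in this commit can be illustrated with a small
      scalar model. This is only a sketch under stated assumptions: `V8F32`,
      `scalar_to_vector`, `splat0`, `splat_of_load`, and `broadcast_ss` are
      hypothetical names invented here, not LLVM APIs, and the model ignores how
      the backend actually decides whether a load may be folded. It only shows
      that a splat of a loaded scalar and a broadcast from memory produce the
      same vector, which is why the folded form is a valid replacement.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a 256-bit vector of eight floats.
using V8F32 = std::array<float, 8>;

// Models scalar_to_vector: the scalar goes into element 0.
// The remaining elements are undefined in the DAG; this sketch zeroes them.
V8F32 scalar_to_vector(float x) {
    V8F32 v{};
    v[0] = x;
    return v;
}

// Models a splat of element 0 across all eight elements.
V8F32 splat0(const V8F32& v) {
    V8F32 r{};
    r.fill(v[0]);
    return r;
}

// Unfolded form: an explicit load feeds (splat (scalar_to_vector ...)).
V8F32 splat_of_load(const float* p) {
    assert(p != nullptr);
    float loaded = *p;  // the separate load
    return splat0(scalar_to_vector(loaded));
}

// Folded form: models a broadcast that reads its memory operand directly.
V8F32 broadcast_ss(const float* p) {
    assert(p != nullptr);
    V8F32 r{};
    r.fill(*p);
    return r;
}
```

      Both functions return the same vector for any valid pointer, so the
      difference is only in how many instructions the unfolded form would need.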
  24. Aug 12, 2011
  25. Jul 29, 2011
  26. Jul 27, 2011
  27. Jul 26, 2011
  28. Jul 21, 2011
    • Add support for 256-bit versions of VPERMIL instruction. · b878caa5
      Bruno Cardoso Lopes authored
      This is a new instruction introduced in AVX, which can operate on 128- and
      256-bit vectors. It treats a 256-bit vector as two independent 128-bit lanes.
      It can permute any 32- or 64-bit elements inside a lane, and restricts the
      second lane to the same permutation as the first one. With the improved
      splat support introduced earlier today, adding codegen for this instruction
      enables more efficient 256-bit code:
      
      Instead of:
        vextractf128  $0, %ymm0, %xmm0
        punpcklbw %xmm0, %xmm0
        punpckhbw %xmm0, %xmm0
        vinsertf128 $0, %xmm0, %ymm0, %ymm1
        vinsertf128 $1, %xmm0, %ymm1, %ymm0
        vextractf128  $1, %ymm0, %xmm1
        shufps  $1, %xmm1, %xmm1
        movss %xmm1, 28(%rsp)
        movss %xmm1, 24(%rsp)
        movss %xmm1, 20(%rsp)
        movss %xmm1, 16(%rsp)
        vextractf128  $0, %ymm0, %xmm0
        shufps  $1, %xmm0, %xmm0
        movss %xmm0, 12(%rsp)
        movss %xmm0, 8(%rsp)
        movss %xmm0, 4(%rsp)
        movss %xmm0, (%rsp)
        vmovaps (%rsp), %ymm0
      We get:
        vextractf128  $0, %ymm0, %xmm0
        punpcklbw %xmm0, %xmm0
        punpckhbw %xmm0, %xmm0
        vinsertf128 $0, %xmm0, %ymm0, %ymm1
        vinsertf128 $1, %xmm0, %ymm1, %ymm0
        vpermilps $85, %ymm0, %ymm0
      
      llvm-svn: 135662
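      The lane-wise behavior described in this commit, and the meaning of the
      $85 immediate in the final vpermilps, can be sketched with a scalar model
      of the 32-bit, immediate form. This is an illustration under assumptions,
      not the backend's code: `V8F32`, `vpermilps_imm`, and `splat_within_lanes`
      are hypothetical names, and the model covers only the case where one 8-bit
      immediate supplies four 2-bit selectors that are applied to both lanes.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 256-bit vector of eight floats.
using V8F32 = std::array<float, 8>;

// Scalar model of the immediate form on 32-bit elements: bits [2*i+1 : 2*i]
// of imm select the source element for position i, and the same four
// selectors are applied independently to each 128-bit lane.
V8F32 vpermilps_imm(const V8F32& src, std::uint8_t imm) {
    V8F32 dst{};
    for (int lane = 0; lane < 2; ++lane) {
        for (int i = 0; i < 4; ++i) {
            int sel = (imm >> (2 * i)) & 3;      // 2-bit selector for position i
            dst[lane * 4 + i] = src[lane * 4 + sel];  // never crosses the lane
        }
    }
    return dst;
}

// Splat element idx of each lane across that lane. Repeating the 2-bit
// index four times gives the immediate: idx * 0b01010101.
V8F32 splat_within_lanes(const V8F32& src, int idx) {
    assert(idx >= 0 && idx < 4);
    auto imm = static_cast<std::uint8_t>(idx * 0x55);
    return vpermilps_imm(src, imm);
}
```

      With idx = 1 the immediate is 0x55 = 85, so every position in a lane takes
      that lane's element 1. That is the same per-lane splat the longer sequence
      builds with shufps $1 and repeated movss stores.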