  1. Nov 02, 2013
    • Fix PR17764 · b638d05e
      Michael Liao authored
      - When selecting BLEND from vselect, the operands need swapping due to
        the difference between vselect and the SSE/AVX BLEND instruction
      
      llvm-svn: 193900
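      A minimal sketch of the semantic mismatch (plain illustrative C++, not
      the actual lowering code): vselect keeps lanes from its first value
      operand where the mask is true, while the SSE/AVX BLENDV forms keep
      lanes from their second source where the mask bit is set, so the value
      operands swap during lowering.

        #include <cstddef>

        // vselect semantics: a true mask lane selects the FIRST value operand.
        void vselect(const bool *mask, const int *a, const int *b,
                     int *out, size_t n) {
          for (size_t i = 0; i < n; ++i)
            out[i] = mask[i] ? a[i] : b[i];
        }

        // BLENDV semantics: a set mask bit selects the SECOND source operand.
        void blendv(const int *src1, const int *src2, const bool *mask,
                    int *out, size_t n) {
          for (size_t i = 0; i < n; ++i)
            out[i] = mask[i] ? src2[i] : src1[i];
        }

        // Hence vselect(mask, a, b) == blendv(b, a, mask): operands swapped.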
  2. Oct 31, 2013
    • Fix unused variable warnings. · 3e6f7aff
      Dan Gohman authored
      llvm-svn: 193823
    • Add new calling convention for WebKit JavaScript. · a3a11ded
      Andrew Trick authored
      llvm-svn: 193812
    • Add support for stack map generation in the X86 backend. · 153ebe6d
      Andrew Trick authored
      Originally implemented by Lang Hames.
      
      llvm-svn: 193811
    • Use StringRef::startswith_lower. No functionality change. · 29d29108
      Rui Ueyama authored
      llvm-svn: 193796
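      For illustration (a hypothetical call site; the commit itself only
      switches existing callers over): StringRef::startswith_lower performs a
      case-insensitive prefix test, replacing hand-rolled lowercasing.

        #include "llvm/ADT/StringRef.h"

        // Matches ".debug", ".DEBUG", ".Debug", etc. without copying or
        // lowercasing the whole string first.
        static bool isDebugSection(llvm::StringRef Name) {
          return Name.startswith_lower(".debug");
        }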
    • [AArch64] Add support for NEON scalar shift immediate instructions. · 20e1f20d
      Chad Rosier authored
      llvm-svn: 193790
    • SparcV9 doesn't have a rem instruction either. · 2262cfaf
      Roman Divacky authored
      llvm-svn: 193789
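      Without a hardware remainder instruction the operation has to be
      expanded; a minimal sketch (illustrative C++, not the SelectionDAG
      expansion code) of the identity such an expansion relies on:

        // rem can be rewritten in terms of div: a % b == a - (a / b) * b,
        // so a target lacking rem lowers it to divide, multiply, subtract.
        long long expandSRem(long long a, long long b) {
          return a - (a / b) * b;
        }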
    • whitespace · d4d1d9c0
      Andrew Trick authored
      llvm-svn: 193765
    • Remove another unused flag. · 4b102d0e
      Rafael Espindola authored
      llvm-svn: 193756
    • Remove unused flag. · 74e1d0a0
      Rafael Espindola authored
      llvm-svn: 193752
    • Add AVX512 unmasked integer broadcast intrinsics and support. · 394d557f
      Cameron McInally authored
      llvm-svn: 193748
    • AVX-512: Implemented CMOV for 512-bit vectors · 49665690
      Elena Demikhovsky authored
      llvm-svn: 193747
    • [SystemZ] Automatically detect zEC12 and z196 hosts · f834ea19
      Richard Sandiford authored
      As on other hosts, the CPU identification instruction is privileged,
      so we need to look through /proc/cpuinfo. I copied the PowerPC way of
      handling "generic".
      
      Several tests were implicitly assuming z10 and so failed on z196.
      
      llvm-svn: 193742
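      A minimal sketch of that approach (illustrative C++ with a hypothetical
      helper name and machine-number mapping; not the actual host-detection
      code): scan /proc/cpuinfo for the "machine" field and map known machine
      types to CPU names, falling back to "generic".

        #include <fstream>
        #include <string>

        std::string detectSystemZCPU() {
          std::ifstream cpuinfo("/proc/cpuinfo");
          std::string line;
          while (std::getline(cpuinfo, line)) {
            // s390 lines look like:
            //   "processor 0: version = FF, identification = ..., machine = 2827"
            if (line.find("machine = 2827") != std::string::npos)
              return "zEC12";
            if (line.find("machine = 2817") != std::string::npos)
              return "z196";
          }
          return "generic"; // unknown machine or unreadable file
        }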
    • [AArch64] Make the use of FP instructions optional, but enabled by default. · f80f95fc
      Amara Emerson authored
      This adds a new subtarget feature called FPARMv8 (implied by NEON), and
      predicates the support of the FP instructions and registers on this feature.
      
      llvm-svn: 193739
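      A minimal sketch of the feature implication (illustrative C++ with
      hypothetical names; the real feature is defined in the target's
      TableGen files): FP support becomes a queryable subtarget feature that
      NEON implies, rather than being unconditionally available.

        // FPARMv8 gates the FP instructions/registers; NEON implies FPARMv8.
        struct AArch64FeatureBits {
          bool FPARMv8 = true; // enabled by default
          bool NEON = false;

          void enableNEON() { NEON = true; FPARMv8 = true; } // NEON implies FP
          bool hasFPARMv8() const { return FPARMv8; } // predicate for FP insts
        };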
    • Legalize: Improve legalization of long vector extends. · 72366786
      Jim Grosbach authored
      When an extend more than doubles the size of the elements (e.g., a zext
      from v16i8 to v16i32), the normal legalization method of splitting the
      vectors runs into problems: by the time the destination vector is
      legal, the source vector is illegal. The end result is that the
      operation often becomes scalarized, with typically horrible
      performance. For example, on x86_64, the simple input of:
      define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
        %tmp = zext <16 x i8> %a to <16 x i32>
        store <16 x i32> %tmp, <16 x i32>* %p
        ret void
      }
      
      Generates:
        .section  __TEXT,__text,regular,pure_instructions
        .section  __TEXT,__const
        .align  5
      LCPI0_0:
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .section  __TEXT,__text,regular,pure_instructions
        .globl  _bar
        .align  4, 0x90
      _bar:
        vpunpckhbw  %xmm0, %xmm0, %xmm1
        vpunpckhwd  %xmm0, %xmm1, %xmm2
        vpmovzxwd %xmm1, %xmm1
        vinsertf128 $1, %xmm2, %ymm1, %ymm1
        vmovaps LCPI0_0(%rip), %ymm2
        vandps  %ymm2, %ymm1, %ymm1
        vpmovzxbw %xmm0, %xmm3
        vpunpckhwd  %xmm0, %xmm3, %xmm3
        vpmovzxbd %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vandps  %ymm2, %ymm0, %ymm0
        vmovaps %ymm0, (%rdi)
        vmovaps %ymm1, 32(%rdi)
        vzeroupper
        ret
      
      So instead, when the input vector is already legal, we can check for
      legal types that let us split more cleverly without turning the source
      into an illegal type. If the extend more than doubles the size of the
      input, we check whether
        - the number of vector elements is even,
        - the source type is legal,
        - the type of a split source is illegal,
        - the type of an extended (by doubling element size) source is legal, and
        - the type of that extended source when split is legal.
      If the conditions are met, instead of just splitting both the
      destination and the source types, we create an extend that only goes up
      one "step" (doubling the element width), and then continue legalizing
      the rest of the operation normally (a sketch of the check follows
      below). The result is that this operates as a new, more efficient
      termination condition for the loop of "split the operation until the
      destination type is legal."
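      A minimal sketch of that check (a hypothetical helper taking the
      legality answers as booleans; the real legalizer queries the target's
      type-legality tables instead):

        // Returns true when extending one "step" (doubling the element
        // width) keeps every intermediate type legal, so we should do that
        // instead of splitting source and destination in lockstep.
        static bool shouldExtendOneStep(unsigned NumElts, bool SrcLegal,
                                        bool SplitSrcLegal, bool WideSrcLegal,
                                        bool SplitWideSrcLegal) {
          return NumElts % 2 == 0 && // element count is even
                 SrcLegal &&         // source type is legal
                 !SplitSrcLegal &&   // a split source would be illegal
                 WideSrcLegal &&     // one-step-extended source is legal
                 SplitWideSrcLegal;  // and so are its halves
        }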
      
      With this change, the above example now compiles to:
      _bar:
        vpxor %xmm1, %xmm1, %xmm1
        vpunpcklbw  %xmm1, %xmm0, %xmm2
        vpunpckhwd  %xmm1, %xmm2, %xmm3
        vpunpcklwd  %xmm1, %xmm2, %xmm2
        vinsertf128 $1, %xmm3, %ymm2, %ymm2
        vpunpckhbw  %xmm1, %xmm0, %xmm0
        vpunpckhwd  %xmm1, %xmm0, %xmm3
        vpunpcklwd  %xmm1, %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vmovaps %ymm0, 32(%rdi)
        vmovaps %ymm2, (%rdi)
        vzeroupper
        ret
      
      This generalizes a custom lowering that was added a while back to the
      ARM backend. That lowering is no longer necessary, and is removed. The
      testcases for it, however, provide excellent ARM tests for this change
      and so remain.
      
      rdar://14735100
      
      llvm-svn: 193727
    • Fix a few typos · 909d0c06
      Matt Arsenault authored
      llvm-svn: 193723