  1. Nov 04, 2013
  2. Nov 03, 2013
  3. Nov 02, 2013
    • Fix PR17764 · b638d05e
      Michael Liao authored
      - When selecting BLEND from vselect, the operands need swapping due to the
        difference between vselect and SSE/AVX's BLEND instruction.
      
      llvm-svn: 193900
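The operand swap above can be sketched in plain Python. This is a hypothetical model, not LLVM's actual lowering code: assume vselect takes from its first value operand where the mask bit is set, while the BLEND form takes from its second, so the lowering must swap the value operands to preserve semantics.

```python
# Hypothetical per-lane model of the two select flavors described above.

def vselect(mask, a, b):
    # Take from a where the mask bit is set, else from b.
    return [x if m else y for m, x, y in zip(mask, a, b)]

def blend(mask, a, b):
    # BLEND-style (assumed for illustration): take from the *second*
    # operand where the mask bit is set.
    return [y if m else x for m, x, y in zip(mask, a, b)]

mask = [1, 0, 1, 0]
a = [10, 11, 12, 13]
b = [20, 21, 22, 23]

# Swapping the value operands makes the two forms agree.
assert vselect(mask, a, b) == blend(mask, b, a)
```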
  4. Nov 01, 2013
  5. Oct 31, 2013
    • Fix unused variable warnings. · 3e6f7aff
      Dan Gohman authored
      llvm-svn: 193823
    • Add new calling convention for WebKit Java Script. · a3a11ded
      Andrew Trick authored
      llvm-svn: 193812
    • Add support for stack map generation in the X86 backend. · 153ebe6d
      Andrew Trick authored
      Originally implemented by Lang Hames.
      
      llvm-svn: 193811
    • Use StringRef::startswith_lower. No functionality change. · 29d29108
      Rui Ueyama authored
      llvm-svn: 193796
    • [AArch64] Add support for NEON scalar shift immediate instructions. · 20e1f20d
      Chad Rosier authored
      llvm-svn: 193790
    • SparcV9 doesn't have a rem instruction either. · 2262cfaf
      Roman Divacky authored
      llvm-svn: 193789
    • whitespace · d4d1d9c0
      Andrew Trick authored
      llvm-svn: 193765
    • Remove another unused flag. · 4b102d0e
      Rafael Espindola authored
      llvm-svn: 193756
    • Remove unused flag. · 74e1d0a0
      Rafael Espindola authored
      llvm-svn: 193752
    • Add AVX512 unmasked integer broadcast intrinsics and support. · 394d557f
      Cameron McInally authored
      llvm-svn: 193748
    • AVX-512: Implemented CMOV for 512-bit vectors · 49665690
      Elena Demikhovsky authored
      llvm-svn: 193747
    • [SystemZ] Automatically detect zEC12 and z196 hosts · f834ea19
      Richard Sandiford authored
      As on other hosts, the CPU identification instruction is privileged,
      so we need to look through /proc/cpuinfo.  I copied the PowerPC way of
      handling "generic".
      
      Several tests were implicitly assuming z10 and so failed on z196.
      
      llvm-svn: 193742
    • [AArch64] Make the use of FP instructions optional, but enabled by default. · f80f95fc
      Amara Emerson authored
      This adds a new subtarget feature called FPARMv8 (implied by NEON), and
      predicates the support of the FP instructions and registers on this feature.
      
      llvm-svn: 193739
    • Legalize: Improve legalization of long vector extends. · 72366786
      Jim Grosbach authored
      When an extend more than doubles the size of the elements (e.g., a zext
      from v16i8 to v16i32), the normal legalization method of splitting the
      vectors runs into problems: by the time the destination vector is
      legal, the source vector is illegal. The end result is that the
      operation often becomes scalarized, with typically horrible performance.
      For example, on x86_64, the simple input of:
      define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind {
        %tmp = zext <16 x i8> %a to <16 x i32>
        store <16 x i32> %tmp, <16 x i32>*%p
        ret void
      }
      
      Generates:
        .section  __TEXT,__text,regular,pure_instructions
        .section  __TEXT,__const
        .align  5
      LCPI0_0:
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .long 255                     ## 0xff
        .section  __TEXT,__text,regular,pure_instructions
        .globl  _bar
        .align  4, 0x90
      _bar:
        vpunpckhbw  %xmm0, %xmm0, %xmm1
        vpunpckhwd  %xmm0, %xmm1, %xmm2
        vpmovzxwd %xmm1, %xmm1
        vinsertf128 $1, %xmm2, %ymm1, %ymm1
        vmovaps LCPI0_0(%rip), %ymm2
        vandps  %ymm2, %ymm1, %ymm1
        vpmovzxbw %xmm0, %xmm3
        vpunpckhwd  %xmm0, %xmm3, %xmm3
        vpmovzxbd %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vandps  %ymm2, %ymm0, %ymm0
        vmovaps %ymm0, (%rdi)
        vmovaps %ymm1, 32(%rdi)
        vzeroupper
        ret
      
      So instead we can check if there are legal types that enable us to split
      more cleverly when the input vector is already legal such that we don't
      turn it into an illegal type. If the extend is such that it's more than
      doubling the size of the input we check if
        - the number of vector elements is even,
        - the source type is legal,
        - the type of a split source is illegal,
        - the type of an extended (by doubling element size) source is legal, and
        - the type of that extended source when split is legal.
      If the conditions are met, instead of just splitting both the
      destination and the source types, we create an extend that only goes up
      one "step" (doubling the element width), and then continue legalizing the
      rest of the operation normally. The result is that this operates as a
      new, more efficient, termination condition for the loop of "split the
      operation until the destination type is legal."
      
      With this change, the above example now compiles to:
      _bar:
        vpxor %xmm1, %xmm1, %xmm1
        vpunpcklbw  %xmm1, %xmm0, %xmm2
        vpunpckhwd  %xmm1, %xmm2, %xmm3
        vpunpcklwd  %xmm1, %xmm2, %xmm2
        vinsertf128 $1, %xmm3, %ymm2, %ymm2
        vpunpckhbw  %xmm1, %xmm0, %xmm0
        vpunpckhwd  %xmm1, %xmm0, %xmm3
        vpunpcklwd  %xmm1, %xmm0, %xmm0
        vinsertf128 $1, %xmm3, %ymm0, %ymm0
        vmovaps %ymm0, 32(%rdi)
        vmovaps %ymm2, (%rdi)
        vzeroupper
        ret
      
      This generalizes a custom lowering that was added a while back to the
      ARM backend. That lowering is no longer necessary, and is removed. The
      testcases for it, however, provide excellent ARM tests for this change
      and so remain.
      
      rdar://14735100
      
      llvm-svn: 193727
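The five-condition check above can be modeled with a small sketch. This is not LLVM's TargetLowering API; it assumes a hypothetical legality model in which only 128- and 256-bit vector types are legal (roughly SSE/AVX register widths), and represents a vector type as a (element-bits, element-count) pair.

```python
# Hypothetical legality model: a vector type is legal iff its total
# width is 128 or 256 bits. Types are (elem_bits, num_elems) pairs.
def is_legal(ty):
    elem_bits, n = ty
    return elem_bits * n in (128, 256)

def should_extend_one_step(src):
    """Check the five conditions from the commit message: even element
    count; legal source; illegal split source; legal one-step-extended
    source; legal split of that extended source."""
    elem_bits, n = src
    if n % 2 != 0:
        return False
    split_src = (elem_bits, n // 2)        # split the source in half
    ext_src = (elem_bits * 2, n)           # extend elements by one step
    split_ext = (elem_bits * 2, n // 2)    # split the extended source
    return (is_legal(src)
            and not is_legal(split_src)
            and is_legal(ext_src)
            and is_legal(split_ext))

# v16i8 -> prefer one step (to v16i16) rather than splitting:
assert should_extend_one_step((8, 16))
# v16i16 splits cleanly (v8i16 is legal), so no extra step is needed:
assert not should_extend_one_step((16, 16))
```

Under this model, legalizing the zext from v16i8 to v16i32 first promotes to v16i16 in one step, after which ordinary splitting takes over, matching the improved codegen shown above.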
    • Fix a few typos · 909d0c06
      Matt Arsenault authored
      llvm-svn: 193723
  6. Oct 30, 2013
    • This commit adds some (but not all) of the x86-64 relocations that are not
      currently supported in the ELF object writer, along with a simple test case. · 04d88fba
      Tom Roeder authored
      
      llvm-svn: 193709
    • R600: Custom lower f32 = uint_to_fp i64 · c947d8ca
      Tom Stellard authored
      llvm-svn: 193701
    • Add #include of raw_ostream.h to MipsSEISelLowering.cpp · 3e9b1c10
      Hans Wennborg authored
      Fixing this Windows build error:
      
      ..\lib\Target\Mips\MipsSEISelLowering.cpp(997) : error C2027: use of undefined type 'llvm::raw_ostream'
      
      llvm-svn: 193696
    • d5f554f0
      Daniel Sanders authored
    • [mips][msa] Added support for matching bmnz, bmnzi, bmz, and bmzi from normal IR (i.e. not intrinsics) · ab94b537
      Daniel Sanders authored
      
      Also corrected the definition of the intrinsics for these instructions (the
      result register is also the first operand), and added intrinsics for bsel and
      bseli to clang (they already existed in the backend).
      
      These four operations are mostly equivalent to bsel, and bseli (the difference
      is which operand is tied to the result). As a result some of the tests changed
      as described below.
      
      bitwise.ll:
      - bsel.v test adapted so that the mask is unknown at compile-time. This stops
        it emitting bmnzi.b instead of the intended bsel.v.
      - The bseli.b test now tests the right thing. Namely the case when one of the
        values is an uimm8, rather than when the condition is a uimm8 (which is
        covered by bmnzi.b)
      
      compare.ll:
      - bsel.v tests now (correctly) emit bmnz.v instead of bsel.v because this
        is the same operation (see MSA.txt).
      
      i8.ll
      - CHECK-DAG-ized test.
      - bmzi.b test now (correctly) emits equivalent bmnzi.b with swapped operands
        because this is the same operation (see MSA.txt).
      - bseli.b still emits bseli.b though because the immediate makes it
        distinguishable from bmnzi.b.
      
      vec.ll:
      - CHECK-DAG-ized test.
      - bmz.v tests now (correctly) emit bmnz.v with swapped operands (see
        MSA.txt).
      - bsel.v tests now (correctly) emit bmnz.v with swapped operands (see
        MSA.txt).
      
      llvm-svn: 193693
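Why swapping operands turns one of these instructions into another can be sketched with Python ints. The semantics below are a simplified model condensed from the description above (the operations differ only in which operand is tied to the result and which mask polarity selects the new bits); MSA.txt is the authoritative reference.

```python
# Simplified single-lane model of the MSA bit-move operations
# (assumed semantics, for illustration only).

def bmnz(dst, src, mask):
    # Move src bits into dst where the mask bit is NOT zero.
    return (src & mask) | (dst & ~mask)

def bmz(dst, src, mask):
    # Move src bits into dst where the mask bit IS zero.
    return (src & ~mask) | (dst & mask)

a, b, m = 0x5A, 0xC3, 0xF0

# select(m, a, b) can be formed either way by choosing which value
# operand is tied to the destination -- hence the swapped-operand
# equivalence the tests now check for:
assert bmnz(dst=b, src=a, mask=m) == bmz(dst=a, src=b, mask=m)
```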
    • be020d03
      Chad Rosier authored
    • [mips][msa] Added support for matching bins[lr]i.[bhwd] from normal IR (i.e. not intrinsics) · d74b130c
      Daniel Sanders authored
      This required correcting the definition of the bins[lr]i intrinsics because
      the result is also the first operand.
      
      It also required removing the (arbitrary) check for 32-bit immediates in
      MipsSEDAGToDAGISel::selectVSplat().
      
      Currently using binsli.d with 2 bits set in the mask doesn't select binsli.d
      because the constant is legalized into a ConstantPool. Similar things can
      happen with binsri.d with more than 10 bits set in the mask. The resulting
      code when this happens is correct but not optimal.
      
      llvm-svn: 193687
    • [mips][msa] Combine binsri-like DAG of AND and OR into equivalent VSELECT · 53fe6c4d
      Daniel Sanders authored
      (or (and $a, $mask), (and $b, $inverse_mask)) => (vselect $mask, $a, $b).
      where $mask is a constant splat. This allows bitwise operations to make use
      of bsel.
      
      It's also a stepping stone towards matching bins[lr], and bins[lr]i from
      normal IR.
      
      Two sets of similar tests have been added in this commit. The bsel_* functions
      test the case where binsri cannot be used. The binsr_*_i functions will
      start to use the binsri instruction in the next commit.
      
      llvm-svn: 193682
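The combine above rests on the bitwise-select identity: (a & mask) | (b & ~mask) picks each bit from a where the mask bit is 1 and from b where it is 0, which is exactly a per-bit vselect. A quick exhaustive-ish check over 8-bit values:

```python
# Verify the identity (a & m) | (b & ~m) == per-bit select(m, a, b)
# over 8-bit lanes. Plain Python ints stand in for vector lanes.

MASK8 = 0xFF  # keep results within 8 bits

def bitwise_select(mask, a, b):
    return (a & mask) | (b & ~mask & MASK8)

def reference_select(mask, a, b):
    # Bit-by-bit reference: take a's bit where the mask bit is set.
    out = 0
    for bit in range(8):
        src = a if (mask >> bit) & 1 else b
        out |= ((src >> bit) & 1) << bit
    return out

assert all(
    bitwise_select(m, a, b) == reference_select(m, a, b)
    for m in range(256)
    for a in (0x00, 0x5A, 0xA5, 0xFF)
    for b in (0x00, 0xC3, 0x3C, 0xFF)
)
```

Because the mask is a constant splat, the same mask bits apply to every lane, which is what lets the AND/OR pair be rewritten as a single vselect (and ultimately a bsel-family instruction).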