Jul 09, 2018
• [Power9] Add __float128 builtins for Round To Odd · 83a5fe14
      Stefan Pintilie authored
      GCC has builtins for these round to odd instructions:
      
      __float128 __builtin_sqrtf128_round_to_odd (__float128)
      __float128 __builtin_{add,sub,mul,div}f128_round_to_odd (__float128, __float128)
      __float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128)
      
      Differential Revision: https://reviews.llvm.org/D47550
      
      llvm-svn: 336578
• [X86] In combineFMA, make sure we bitcast the result of isFNEG back to the expected type before creating the new FMA node. · 47170b31
Craig Topper authored
      
Previously, we were creating malformed SDNodes, but nothing caught it because the type constraints prevented isel from noticing.
      
      llvm-svn: 336566
• [X86] Remove some patterns that include a bitcast of a floating point load to an integer type. · e9cff7d4
      Craig Topper authored
      DAG combine should have converted the type of the load.
      
      llvm-svn: 336557
• [X86] Remove some patterns that seem to be unreachable. · 16ee4b49
      Craig Topper authored
These patterns mapped (v2f64 (X86vzmovl (v2f64 (scalar_to_vector FR64:$src)))) to a MOVSD and a zeroing XOR. But the pattern for (v2f64 (X86vzmovl (v2f64))) that selects MOVQ has an artificially high complexity that hides this MOVSD pattern.
      
Weirder still, the SSE version of the pattern was explicitly blocked on SSE41, yet we had copied it to AVX and AVX512.
      
      llvm-svn: 336556
• [X86] Remove some seemingly unnecessary AddedComplexity lines. · 22330c70
      Craig Topper authored
Looking at the generated tables, this didn't seem to make an obvious difference in pattern priority.
      
      llvm-svn: 336555
• [AArch64][SVE] Asm: Support for CNT(B|H|W|D) and CNTP instructions. · d3efb59f
      Sander de Smalen authored
      This patch adds support for the following instructions:
      
        CNTB CNTH - Determine the number of active elements implied by
        CNTW CNTD   the named predicate constant, multiplied by an
                    immediate, e.g.
      
                      cnth x0, vl8, #16
      
        CNTP      - Count active predicate elements, e.g.
                      cntp  x0, p0, p1.b
      
                    counts the number of active elements in p1, predicated
                    by p0, and stores the result in x0.
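
For the CNTx forms, e.g. cnth x0, vl8, #16 writes 8 * 16 to x0 when the vector holds at least 8 halfword lanes. A scalar C sketch of the CNTP semantics (hypothetical helper, not code from this patch):

  #include <stdint.h>

  /* cntp xd, pg, pn.b: count the elements of pn that are active
     under the governing predicate pg (one flag per byte lane). */
  uint64_t cntp_b(const _Bool *pg, const _Bool *pn, int lanes) {
    uint64_t n = 0;
    for (int i = 0; i < lanes; ++i)
      if (pg[i] && pn[i])
        ++n;
    return n;
  }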
      
      llvm-svn: 336552
• [Power9] Add __float128 support for compare operations · 3d76326d
      Stefan Pintilie authored
Added handling for select with f128 operands.
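
A minimal sketch (assuming a POWER9 target with -mfloat128): a C ternary over __float128 exercises the f128 compare feeding the new select handling:

  __float128 max_f128(__float128 a, __float128 b) {
    return (a > b) ? a : b;  /* f128 compare + f128 select */
  }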
      
      Differential Revision: https://reviews.llvm.org/D48294
      
      llvm-svn: 336548
• [AArch64][SVE] Asm: Support for remaining shift instructions. · 813b21e3
      Sander de Smalen authored
      This patch completes support for shifts, which include:
      - LSL   - Logical Shift Left
      - LSLR  - Logical Shift Left, Reversed form
      - LSR   - Logical Shift Right
      - LSRR  - Logical Shift Right, Reversed form
      - ASR   - Arithmetic Shift Right
      - ASRR  - Arithmetic Shift Right, Reversed form
      - ASRD  - Arithmetic Shift Right for Divide
      
      In the following variants:
      
      - Predicated shift by immediate - ASR, LSL, LSR, ASRD
        e.g.
          asr z0.h, p0/m, z0.h, #1
      
        (active lanes of z0 shifted by #1)
      
      - Unpredicated shift by immediate - ASR, LSL*, LSR*
        e.g.
          asr z0.h, z1.h, #1
      
        (all lanes of z1 shifted by #1, stored in z0)
      
      - Predicated shift by vector - ASR, LSL*, LSR*
        e.g.
          asr z0.h, p0/m, z0.h, z1.h
      
        (active lanes of z0 shifted by z1, stored in z0)
      
      - Predicated shift by vector, reversed form - ASRR, LSLR, LSRR
        e.g.
          lslr z0.h, p0/m, z0.h, z1.h
      
        (active lanes of z1 shifted by z0, stored in z0)
      
      - Predicated shift left/right by wide vector - ASR, LSL, LSR
        e.g.
          lsl z0.h, p0/m, z0.h, z1.d
      
        (active lanes of z0 shifted by wide elements of vector z1)
      
      - Unpredicated shift left/right by wide vector - ASR, LSL, LSR
        e.g.
          lsl z0.h, z1.h, z2.d
      
        (all lanes of z1 shifted by wide elements of z2, stored in z0)
      
      *Variants added in previous patches.
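
A scalar C sketch of the first form above, asr z0.h, p0/m, z0.h, #1 (hypothetical helper, not compiler code; assumes the usual arithmetic behaviour of >> on signed values):

  #include <stdint.h>

  /* Merging predication: active lanes are shifted in place,
     inactive lanes keep their previous contents. */
  void asr_imm_h(int16_t *z0, const _Bool *p0, int lanes, int sh) {
    for (int i = 0; i < lanes; ++i)
      if (p0[i])
        z0[i] = (int16_t)(z0[i] >> sh);
  }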
      
      llvm-svn: 336547
• [mips] Addition of the [d]rem and [d]remu instructions · 0a23998f
      Stefan Maksimovic authored
      Related to http://reviews.llvm.org/D15772
      Depends on http://reviews.llvm.org/D16889
      Adds [D]REM[U] instructions.
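
For illustration (not part of the patch), these C functions express the operations the new instructions implement; a signed remainder can map to [d]rem and an unsigned one to [d]remu:

  long long srem64(long long a, long long b) {
    return a % b;   /* signed remainder -> drem */
  }

  unsigned long long urem64(unsigned long long a, unsigned long long b) {
    return a % b;   /* unsigned remainder -> dremu */
  }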
      
      Patch By: Srdjan Obucina
      Contributions from: Simon Dardis
      
      Differential Revision: https://reviews.llvm.org/D17036
      
      llvm-svn: 336545
• [AArch64][SVE] Asm: Support for TBL instruction. · 54077dcf
      Sander de Smalen authored
      Support for SVE's TBL instruction for programmable table
      lookup/permute using vector of element indices, e.g.
      
        tbl  z0.d, { z1.d }, z2.d
      
      stores elements from z1, indexed by elements from z2, into z0.
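
A scalar C sketch of those semantics (hypothetical helper, not code from this patch): each result lane picks the z1 element named by the index in z2, and out-of-range indices produce zero:

  #include <stdint.h>

  void tbl_d(uint64_t *z0, const uint64_t *z1,
             const uint64_t *z2, uint64_t lanes) {
    for (uint64_t i = 0; i < lanes; ++i)
      z0[i] = (z2[i] < lanes) ? z1[z2[i]] : 0;  /* OOB -> 0 */
  }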
      
      llvm-svn: 336544
• [AArch64][SVE] Asm: Support for ADR instruction. · c69944c6
      Sander de Smalen authored
      Supporting various addressing modes:
      - adr z0.s, [z0.s, z0.s]
      - adr z0.s, [z0.s, z0.s, lsl #<shift>]
      - adr z0.d, [z0.d, z0.d]
      - adr z0.d, [z0.d, z0.d, lsl #<shift>]
      - adr z0.d, [z0.d, z0.d, uxtw #<shift>]
      - adr z0.d, [z0.d, z0.d, sxtw #<shift>]
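
Each lane computes a base address plus an optionally shifted (or sign-/zero-extended) offset, yielding a vector of addresses. A scalar C sketch of the lsl variant (hypothetical helper, not code from this patch):

  #include <stdint.h>

  /* adr zd.d, [zn.d, zm.d, lsl #shift] */
  void adr_d(uint64_t *zd, const uint64_t *zn,
             const uint64_t *zm, int lanes, int shift) {
    for (int i = 0; i < lanes; ++i)
      zd[i] = zn[i] + (zm[i] << shift);
  }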
      
      Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
      
      Reviewed By: SjoerdMeijer
      
      Differential Revision: https://reviews.llvm.org/D48870
      
      llvm-svn: 336533
• [AArch64][SVE] Asm: Support for UZP and TRN instructions. · bd513b42
      Sander de Smalen authored
      This patch adds support for:
        UZP1  Concatenate even elements from two vectors
        UZP2  Concatenate  odd elements from two vectors
        TRN1  Interleave  even elements from two vectors
        TRN2  Interleave   odd elements from two vectors
      
      With variants for both data and predicate vectors, e.g.
        uzp1    z0.b, z1.b, z2.b
        trn2    p0.s, p1.s, p2.s
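
A scalar C sketch of uzp1 and trn2 on byte elements (hypothetical helpers, not code from this patch; n is the number of lanes per vector):

  #include <stdint.h>

  void uzp1_b(uint8_t *zd, const uint8_t *zn, const uint8_t *zm, int n) {
    for (int i = 0; i < n / 2; ++i) {
      zd[i]         = zn[2 * i];      /* even elements of first input  */
      zd[n / 2 + i] = zm[2 * i];      /* even elements of second input */
    }
  }

  void trn2_b(uint8_t *zd, const uint8_t *zn, const uint8_t *zm, int n) {
    for (int i = 0; i < n / 2; ++i) {
      zd[2 * i]     = zn[2 * i + 1];  /* odd element of first input  */
      zd[2 * i + 1] = zm[2 * i + 1];  /* odd element of second input */
    }
  }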
      
      llvm-svn: 336531
• [X86] Improve the message for some asserts. Remove an if that is guaranteed true by said asserts. · b8145ec6
      Craig Topper authored
This replaces some asserts in lowerV2F64VectorShuffle with the similar asserts from lowerV2I64VectorShuffle, which are more readable. The original asserts mentioned a blend, but there's no guarantee that it is a blend.
      
Also remove an if that the asserts prove is always true. Mask[0] is always less than 2 and Mask[1] is always at least 2. Therefore (Mask[0] >= 2) + (Mask[1] >= 2) == 1 must always be true.
      
      llvm-svn: 336517
• [X86] Remove an AddedComplexity line that seems unnecessary. · c98c675f
      Craig Topper authored
It only existed on the SSE and AVX versions; the AVX512 version didn't have it.

I checked the generated table, and this didn't seem necessary to create a match preference.
      
      llvm-svn: 336516