  1. Jan 26, 2017
  2. Dec 15, 2016
    • Nemanja Ivanovic's avatar
      [Power9] Allow AnyExt immediates for XXSPLTIB · 552c8e96
      Nemanja Ivanovic authored
      In some situations, the BUILD_VECTOR node that builds a v16i8 vector by
      a splat of an i8 constant will end up with signed 8-bit values and in other
      situations, it'll end up with unsigned ones. Handle both situations.
      
      Fixes PR31340.
      
      llvm-svn: 289804
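
The signed/unsigned point above can be illustrated with a small model. This is only a sketch: `fitsAnyExtImm8`, `toImm8` and `splatByte` are hypothetical names invented here, not functions from the LLVM source. It shows that a sign-extended constant (for example -1) and a zero-extended one (255) reduce to the same 8-bit pattern, so a byte splat can accept either form.

```cpp
#include <array>
#include <cstdint>

// Hypothetical check: accept a constant that is either a sign-extended
// or a zero-extended 8-bit value.
constexpr bool fitsAnyExtImm8(std::int64_t v) { return v >= -128 && v <= 255; }

// Hypothetical helper: keep only the low 8 bits, which is the pattern a
// byte splat would replicate.
constexpr std::uint8_t toImm8(std::int64_t v) {
    return static_cast<std::uint8_t>(v & 0xFF);
}

// Model of a 16 x i8 splat: every lane holds the same byte.
std::array<std::uint8_t, 16> splatByte(std::int64_t v) {
    std::array<std::uint8_t, 16> out{};
    out.fill(toImm8(v));
    return out;
}

static_assert(toImm8(-1) == toImm8(255),
              "signed and unsigned forms share one bit pattern");
```

Under this model, `splatByte(-1)` and `splatByte(255)` produce identical vectors, which is why handling only one of the two encodings would miss valid splats.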
  3. Dec 09, 2016
  4. Dec 06, 2016
    • Nemanja Ivanovic's avatar
      [PowerPC] Improvements for BUILD_VECTOR Vol. 4 · 15748f49
      Nemanja Ivanovic authored
      This is the final patch in the series of patches that improves
      BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations
      to remove redundant instructions. It also adds a large test case which
      encompasses a large set of code patterns that build vectors - this test
      case was the motivator for this series of patches.
      
      Differential Revision: https://reviews.llvm.org/D26066
      
      llvm-svn: 288800
  5. Nov 30, 2016
  6. Nov 29, 2016
  7. Nov 23, 2016
    • Nemanja Ivanovic's avatar
      [PowerPC] Remove InstAlias definitions that cause incorrect assembly · 10fc3cfc
      Nemanja Ivanovic authored
      In rL283190, I added some InstAlias definitions to generate extended mnemonics
      for some uses of the XXPERMDI instruction. However, when the assembler matches
      these extended mnemonics, it matches the new instruction in situations where it
      should match the old one.
      This patch removes these definitions and accomplishes that by defining these
      mnemonics with additional instructions that are isCodeGenOnly.
      
      Fixes PR31127.
      
      llvm-svn: 287765
  8. Nov 22, 2016
  9. Nov 15, 2016
    • Zaara Syeda's avatar
      vector load store with length (left justified) llvm portion · a19c9e60
      Zaara Syeda authored
      llvm-svn: 286993
    • Tony Jiang's avatar
      [PowerPC] Implement BE VSX load/store builtins - llvm portion. · 5f850cd1
      Tony Jiang authored
      This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
      they behave exactly the same as vec_xl and vec_xst, therefore they are
      simply implemented by defining a matching macro. On LE, they are implemented
      by defining new builtins and intrinsics. For int/float/long long/double, it
      is just a load (lxvw4x/lxvd2x) or store (stxvw4x/stxvd2x). For char/short,
      we also need some extra shuffling before or after calling the builtins to get
      the desired BE order. For int128, simply call vec_xl or vec_xst.
      
      llvm-svn: 286967
  10. Nov 14, 2016
  11. Nov 11, 2016
  12. Oct 26, 2016
  13. Oct 24, 2016
  14. Oct 04, 2016
    • Nemanja Ivanovic's avatar
      [Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register set · 6354d235
      Nemanja Ivanovic authored
      This patch corresponds to review:
      
      The newly added VSX D-Form (register + offset) memory ops target the upper half
      of the VSX register set. The existing ones target the lower half. In order to
      unify these and have the ability to target all the VSX registers using D-Form
      operations, this patch defines Pseudo-ops for the loads/stores which are
      expanded post-RA. The expansion then chooses the correct opcode based on the
      register that was allocated for the operation.
      
      llvm-svn: 283212
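
The post-RA selection described above might look roughly like the following. This is a sketch under stated assumptions: the opcode names are placeholders made up for illustration, and registers are modeled as plain indices 0..63, with 0..31 as the lower half and 32..63 as the upper half of the VSX register set.

```cpp
// Placeholder opcode names, for illustration only.
enum class Opcode {
    PseudoDFormLoad,     // emitted before register allocation
    LowerHalfDFormLoad,  // existing D-Form op, targets the lower half
    UpperHalfDFormLoad   // newly added D-Form op, targets the upper half
};

// Assumption: allocatedVSR is a valid VSX register index in [0, 63].
// Once the register is known, the pseudo is replaced by whichever real
// opcode can encode that register.
constexpr Opcode expandDFormLoad(unsigned allocatedVSR) {
    return allocatedVSR < 32 ? Opcode::LowerHalfDFormLoad
                             : Opcode::UpperHalfDFormLoad;
}
```

Deferring the choice until after register allocation means the allocator is free to pick any of the 64 registers, rather than being constrained to one half by the opcode.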
    • Nemanja Ivanovic's avatar
      [Power9] Part-word VSX integer scalar loads/stores and sign extend instructions · 11049f8f
      Nemanja Ivanovic authored
      This patch corresponds to review:
      https://reviews.llvm.org/D23155
      
      This patch removes the VSHRC register class (based on D20310) and adds
      exploitation of the Power9 sub-word integer loads into VSX registers as well
      as vector sign extensions.
      The new instructions are useful for a few purposes:
      
          Int to Fp conversions of 1 or 2-byte values loaded from memory
          Building vectors of 1 or 2-byte integers with values loaded from memory
          Storing individual 1 or 2-byte elements from integer vectors
      
      This patch implements all of those uses.
      
      llvm-svn: 283190
  15. Sep 27, 2016
    • Nemanja Ivanovic's avatar
      [Power9] Builtins for ELF v.2 ABI conformance - back end portion · 6f22b413
      Nemanja Ivanovic authored
      This patch corresponds to review:
      https://reviews.llvm.org/D24396
      
      This patch adds support for the "vector count trailing zeroes",
      "vector compare not equal" and "vector compare not equal or zero"
      instructions as well as "scalar count trailing zeroes" instructions. It also changes the
      vector negation to use XXLNOR (when VSX is enabled) so as not to increase
      register pressure (previously this was done with a splat immediate of all
      ones followed by an XXLXOR). This was done because the altivec.h
      builtins (patch to follow) use vector negation and the use of an additional
      register for the splat immediate is not optimal.
      
      llvm-svn: 282478
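
The equivalence behind the XXLNOR change can be checked with a simple model. This is a sketch only: a 128-bit vector is represented as two 64-bit words, and the function names are invented here. NOR of a value with itself gives the same result as XOR with an all-ones vector, but without needing a second register to hold the all-ones constant.

```cpp
#include <array>
#include <cstdint>

// Model of a 128-bit vector register as two 64-bit words.
using Vec128 = std::array<std::uint64_t, 2>;

// Single-operation form: NOR of the value with itself.
Vec128 notViaNor(const Vec128 &a) {
    return {~(a[0] | a[0]), ~(a[1] | a[1])};
}

// Two-step form: materialize all-ones (an extra register), then XOR.
Vec128 notViaSplatXor(const Vec128 &a) {
    const Vec128 ones = {~0ULL, ~0ULL};
    return {a[0] ^ ones[0], a[1] ^ ones[1]};
}
```

Both functions return the bitwise complement of the input; the second one simply makes the extra all-ones operand explicit.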
  16. Sep 23, 2016
  17. Sep 22, 2016
  18. Aug 18, 2016
  19. Jul 18, 2016
  20. Jul 12, 2016
  21. Jul 05, 2016
    • Nemanja Ivanovic's avatar
      [PowerPC] - Legalize vector types by widening instead of integer promotion · 44513e54
      Nemanja Ivanovic authored
      This patch corresponds to review:
      http://reviews.llvm.org/D20443
      
      It changes the legalization strategy for illegal vector types from integer
      promotion to widening. This only applies for vectors with elements of width
      that is a multiple of a byte since we have hardware support for vectors with
      1, 2, 4, 8 and 16 byte elements.
      Integer promotion for vectors is quite expensive on PPC due to the sequence
      of breaking apart the vector, extending the elements and reconstituting the
      vector. Two of these operations are expensive.
      This patch causes between minor and major improvements in performance on most
      benchmarks. There are very few benchmarks whose performance regresses. These
      regressions can be handled in a subsequent patch with a DAG combine (similar
      to how this patch handles int -> fp conversions of illegal vector types).
      
      llvm-svn: 274535
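
As a rough illustration of widening, consider an illegal two-element byte vector. The sketch below is a model, not the legalizer's code: the function names are hypothetical, lanes 0 and 1 are assumed to hold the original elements, and the padding lanes are zeroed even though their contents would not matter. The elements keep their 1-byte width; only the element count grows to fill a legal 16-byte vector.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V2i8 = std::array<std::uint8_t, 2>;
using V16i8 = std::array<std::uint8_t, 16>;

// Widen: same element width, more lanes. Padding lanes are don't-care.
V16i8 widen(const V2i8 &v) {
    V16i8 out{};
    out[0] = v[0];
    out[1] = v[1];
    return out;
}

// Lane-wise add on the legal type; each lane wraps modulo 256 like i8.
V16i8 add(const V16i8 &a, const V16i8 &b) {
    V16i8 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = static_cast<std::uint8_t>(a[i] + b[i]);
    return r;
}

// Add two illegal vectors by widening, operating, and reading back the
// lanes that correspond to the original elements.
V2i8 addV2i8(const V2i8 &a, const V2i8 &b) {
    const V16i8 r = add(widen(a), widen(b));
    return {r[0], r[1]};
}
```

No per-element extension or rebuild step appears in this path, which is the cost the commit message attributes to integer promotion.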
  22. May 04, 2016
  23. Mar 31, 2016
  24. Mar 28, 2016
    • Chuang-Yu Cheng's avatar
      [Power9] Implement new vsx instructions: insert, extract, test data class,... · 80722719
      Chuang-Yu Cheng authored
      [Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat
      
      This change implements the following vsx instructions:
      
      - Scalar Insert/Extract
          xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp
      
      - Vector Insert/Extract
          xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp
          xxextractuw xxinsertw
      
      - Scalar/Vector Test Data Class
          xststdcdp xststdcsp xststdcqp
          xvtstdcdp xvtstdcsp
      
      - Maximum/Minimum
          xsmaxcdp xsmaxjdp
          xsmincdp xsminjdp
      
      - Vector Byte-Reverse/Permute/Splat
          xxbrd xxbrh xxbrq xxbrw
          xxperm xxpermr
          xxspltib
      
      30 instructions
      
      Thanks Nemanja for invaluable discussion! Thanks Kit for the great help!
      Reviewers: hal, nemanja, kbarton, tjablin, amehsan
      
      http://reviews.llvm.org/D16842
      
      llvm-svn: 264567
    • Chuang-Yu Cheng's avatar
      [Power9] Implement new vsx instructions: quad-precision move, fp-arithmetic · 56638489
      Chuang-Yu Cheng authored
      This change implements the following vsx instructions:
      
      - quad-precision move
          xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp
      
      - quad-precision fp-arithmetic
          xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o)
          xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o)
      
      22 instructions
      
      Thanks Nemanja and Kit for careful review and invaluable discussion!
      Reviewers: hal, nemanja, kbarton, tjablin, amehsan
      
      http://reviews.llvm.org/D16110
      
      llvm-svn: 264565
  25. Mar 08, 2016
  26. Feb 26, 2016
    • Kit Barton's avatar
      [Power9] Implement new vsx instructions: compare and conversion · 93612ec5
      Kit Barton authored
      This change implements the following vsx instructions:
      
      Quad/Double-Precision Compare:
      xscmpoqp xscmpuqp
      xscmpexpdp xscmpexpqp
      xscmpeqdp xscmpgedp xscmpgtdp xscmpnedp
      xvcmpnedp(.) xvcmpnesp(.)
      Quad-Precision Floating-Point Conversion
      xscvqpdp(o) xscvdpqp
      xscvqpsdz xscvqpswz xscvqpudz xscvqpuwz xscvsdqp xscvudqp
      xscvdphp xscvhpdp xvcvhpsp xvcvsphp
      xsrqpi xsrqpix xsrqpxp
      28 instructions
      
      Phabricator: http://reviews.llvm.org/D16709
      llvm-svn: 262068
  27. Dec 15, 2015
  28. Dec 11, 2015
    • Matt Arsenault's avatar
      Start replacing vector_extract/vector_insert with extractelt/insertelt · fbd9bbfd
      Matt Arsenault authored
      These are redundant pairs of nodes defined for
      INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT.
      insertelement/extractelement are slightly closer to the corresponding
      C++ node names and have stricter type checking, so prefer them.
      
      Update targets to only use these nodes where it is trivial to do so.
      AArch64, ARM, and Mips all have various type errors on simple replacement,
      so they will need work to fix.
      
      Example from AArch64:
      
      def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8),
                (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>;
      
      Which is trying to do sext_inreg i8, i8.
      
      llvm-svn: 255359
  29. Dec 10, 2015
  30. Oct 09, 2015
    • Nemanja Ivanovic's avatar
      Vector element extraction without stack operations on Power 8 · d3896573
      Nemanja Ivanovic authored
      This patch corresponds to review:
      http://reviews.llvm.org/D12032
      
      This patch builds onto the patch that provided scalar to vector conversions
      without stack operations (D11471).
      Included in this patch:
      
          - Vector element extraction for all vector types with constant element number
          - Vector element extraction for v16i8 and v8i16 with variable element number
          - Removal of some unnecessary COPY_TO_REGCLASS operations that ended up
            unnecessarily moving things around between registers
      
      Not included in this patch (will be in upcoming patch):
      
          - Vector element extraction for v4i32, v4f32, v2i64 and v2f64 with
            variable element number
          - Vector element insertion for variable/constant element number
      
      Testing is provided for all extractions. The extractions that are not
      implemented yet are just placeholders.
      
      llvm-svn: 249822
  31. Sep 29, 2015
  32. Aug 31, 2015
    • Hal Finkel's avatar
      [PowerPC] Fixup SELECT_CC (and SETCC) patterns with i1 comparison operands · a2cdbce6
      Hal Finkel authored
      There were really two problems here. The first was that we had the truth tables
      for signed i1 comparisons backward. I imagine these are not very common, but if
      you have:
        setcc i1 x, y, LT
      this has the '0 1' and the '1 0' results flipped compared to:
        setcc i1 x, y, ULT
      because, in the signed case, '1 0' is really '-1 0', and the answer is not the
      same as in the unsigned case.
      
      The second problem was that we did not have patterns (at all) for select_cc
      nodes with unsigned comparisons of i1 operands. This was the specific
      cause of PR24552. These had to be added (and a missing Altivec promotion added
      as well) to make sure these function for all types. I've added a bunch more
      test cases for these patterns, and there are a few FIXMEs in the test case
      regarding code-quality.
      
      Fixes PR24552.
      
      llvm-svn: 246400
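
The first problem described above comes down to how a single bit is interpreted. The following sketch (helper names invented here for illustration) models a signed i1 as taking the values 0 and -1 and an unsigned i1 as taking 0 and 1, then evaluates LT and ULT on both mixed input pairs.

```cpp
// Signed view of an i1: the set bit is the sign bit, so 1 means -1.
constexpr int asSignedI1(bool b) { return b ? -1 : 0; }

// Unsigned view of an i1: the set bit means 1.
constexpr unsigned asUnsignedI1(bool b) { return b ? 1u : 0u; }

constexpr bool setccLT(bool x, bool y) {
    return asSignedI1(x) < asSignedI1(y);
}

constexpr bool setccULT(bool x, bool y) {
    return asUnsignedI1(x) < asUnsignedI1(y);
}

// Signed: '1 0' is really '-1 0', so LT holds for (1, 0) and not (0, 1).
static_assert(setccLT(true, false) && !setccLT(false, true),
              "signed i1 LT");
// Unsigned: ULT holds for (0, 1) and not (1, 0).
static_assert(setccULT(false, true) && !setccULT(true, false),
              "unsigned i1 ULT");
```

The two mixed cases produce opposite results for LT and ULT, and both comparisons are false when the operands are equal.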
  33. Aug 13, 2015
    • Nemanja Ivanovic's avatar
      Scalar to vector conversions using direct moves · 1c39ca65
      Nemanja Ivanovic authored
      This patch corresponds to review:
      http://reviews.llvm.org/D11471
      
      It improves the code generated for converting a scalar to a vector value. With
      direct moves from GPRs to VSRs, we no longer require expensive stack operations
      for this. Subsequent patches will handle the reverse case and more general
      operations between vectors and their scalar elements.
      
      llvm-svn: 244921