  1. Jun 08, 2017
    • [Power9] Exploit vector integer extend instructions · 79acbbe5
      Zaara Syeda authored
      This patch adds build vector patterns to exploit the vector integer
      extend instructions:
      vextsb2w - Vector Extend Sign Byte To Word
      vextsb2d - Vector Extend Sign Byte To Doubleword
      vextsh2w - Vector Extend Sign Halfword To Word
      vextsh2d - Vector Extend Sign Halfword To Doubleword
      vextsw2d - Vector Extend Sign Word To Doubleword
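
      As a rough illustration of what these instructions compute (a scalar
      model only, not code from this patch), the sketch below sign-extends the
      least-significant byte or halfword of each element. The function names
      mirror the mnemonics, but the std::array representation is an assumption
      made for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of vextsb2w: each 32-bit word element is replaced by the
// sign extension of its least-significant byte.
std::array<int32_t, 4> vextsb2w(const std::array<uint32_t, 4> &src) {
  std::array<int32_t, 4> out{};
  for (std::size_t i = 0; i < src.size(); ++i) {
    const auto lowByte = static_cast<uint8_t>(src[i] & 0xFFu);
    out[i] = static_cast<int32_t>(static_cast<int8_t>(lowByte));
  }
  return out;
}

// Scalar model of vextsh2d: each 64-bit doubleword element is replaced by
// the sign extension of its least-significant halfword.
std::array<int64_t, 2> vextsh2d(const std::array<uint64_t, 2> &src) {
  std::array<int64_t, 2> out{};
  for (std::size_t i = 0; i < src.size(); ++i) {
    const auto lowHalf = static_cast<uint16_t>(src[i] & 0xFFFFu);
    out[i] = static_cast<int64_t>(static_cast<int16_t>(lowHalf));
  }
  return out;
}
```

      The other three instructions follow the same shape with a different
      source width or element width.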
      
      Differential Revision: https://reviews.llvm.org/D33510
      
      llvm-svn: 304992
      79acbbe5
  2. May 31, 2017
  3. May 29, 2017
    • [PPC] Fix assertion failure during binary encoding with -mcpu=pwr9 · e3c14ebb
      Hiroshi Inoue authored
      Summary
      clang -c -mcpu=pwr9 test/CodeGen/PowerPC/build-vector-tests.ll causes an assertion failure during the binary encoding.
      The failure occurs when a D-form load instruction takes two register operands instead of a register + an immediate.
      
      This patch fixes the problem and also adds an assertion to catch this failure earlier before the binary encoding (i.e. during lit test).
      The fix is from Nemanja Ivanovic @nemanjai.
      
      Differential Revision: https://reviews.llvm.org/D33482
      
      llvm-svn: 304133
      e3c14ebb
  4. May 25, 2017
  5. May 24, 2017
  6. May 12, 2017
    • [PPC] Change the register constraint of the first source operand of... · 22e7da95
      Guozhi Wei authored
      [PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0
      
      According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0.
      
      This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified.
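
      To make the r0 hazard concrete, here is a small model of the mtvsrdd
      semantics described above (a sketch, not compiler code). The Vsr struct
      and the 32-entry register array are assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical view of a 128-bit VSX register as two doublewords.
struct Vsr {
  uint64_t dword0;
  uint64_t dword1;
};

// Model of "mtvsrdd XT, RA, RB": when the RA field is 0 the first
// doubleword is the constant 0, not the contents of r0.
Vsr mtvsrdd(const std::array<uint64_t, 32> &gpr, unsigned ra, unsigned rb) {
  assert(ra < 32 && rb < 32);
  const uint64_t first = (ra == 0) ? 0 : gpr[ra];
  return {first, gpr[rb]};
}
```

      If the register allocator were free to assign r0 to the first source
      operand, the value held in r0 would be silently replaced by zero, which
      is why the operand needs a register class that excludes r0.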
      
      Differential Revision: https://reviews.llvm.org/D32880
      
      llvm-svn: 302834
      22e7da95
  7. May 02, 2017
  8. Mar 30, 2017
  9. Mar 15, 2017
  10. Jan 26, 2017
  11. Dec 15, 2016
    • [Power9] Allow AnyExt immediates for XXSPLTIB · 552c8e96
      Nemanja Ivanovic authored
      In some situations, the BUILD_VECTOR node that builds a v16i8 vector by
      a splat of an i8 constant will end up with signed 8-bit values and in other
      situations, it'll end up with unsigned ones. Handle both situations.
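
      One way to picture "handle both situations" is a check that accepts any
      constant representable as either a signed or an unsigned byte and reduces
      it to the same 8-bit encoding. This is only a sketch of the idea;
      splatByteImm is a hypothetical helper, not the predicate used in the
      PowerPC back end.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: returns the 8-bit immediate for a byte splat, or
// std::nullopt when the constant fits neither the signed range [-128, 127]
// nor the unsigned range [0, 255].
std::optional<uint8_t> splatByteImm(int64_t value) {
  if (value < -128 || value > 255)
    return std::nullopt;
  // -1 and 255 both reduce to the bit pattern 0xFF.
  return static_cast<uint8_t>(value);
}
```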
      
      Fixes PR31340.
      
      llvm-svn: 289804
      552c8e96
  12. Dec 09, 2016
  13. Dec 06, 2016
    • [PowerPC] Improvements for BUILD_VECTOR Vol. 4 · 15748f49
      Nemanja Ivanovic authored
      This is the final patch in the series of patches that improves
      BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations
      to remove redundant instructions. It also adds a large test case which
      encompasses a large set of code patterns that build vectors - this test
      case was the motivator for this series of patches.
      
      Differential Revision: https://reviews.llvm.org/D26066
      
      llvm-svn: 288800
      15748f49
  14. Nov 30, 2016
  15. Nov 29, 2016
  16. Nov 23, 2016
    • [PowerPC] Remove InstAlias definitions that cause incorrect assembly · 10fc3cfc
      Nemanja Ivanovic authored
      In rL283190, I added some InstAlias definitions to generate extended mnemonics
      for some uses of the XXPERMDI instruction. However, when the assembler matches
      these extended mnemonics, it matches the new instruction in situations where it
      should match the old one.
      This patch removes these definitions and instead defines these
      mnemonics with additional instructions that are isCodeGenOnly.
      
      Fixes PR31127.
      
      llvm-svn: 287765
      10fc3cfc
  17. Nov 22, 2016
  18. Nov 15, 2016
    • vector load store with length (left justified) llvm portion · a19c9e60
      Zaara Syeda authored
      llvm-svn: 286993
      a19c9e60
    • [PowerPC] Implement BE VSX load/store builtins - llvm portion. · 5f850cd1
      Tony Jiang authored
      This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
      they behave exactly the same as vec_xl and vec_xst, therefore they are
      simply implemented by defining a matching macro. On LE, they are implemented
      by defining new builtins and intrinsics. For int/float/long long/double, it
      is just a load (lxvw4x/lxvd2x) or store (stxvw4x/stxvd2x). For char/char/short,
      we also need some extra shuffling before or after calling the builtins to get
      the desired BE order. For int128, simply call vec_xl or vec_xst.
      
      llvm-svn: 286967
      5f850cd1
  19. Nov 14, 2016
  20. Nov 11, 2016
  21. Oct 26, 2016
  22. Oct 24, 2016
  23. Oct 04, 2016
    • [Power9] Exploit D-Form VSX Scalar memory ops that target full VSX register set · 6354d235
      Nemanja Ivanovic authored
      This patch corresponds to review:
      
      The newly added VSX D-Form (register + offset) memory ops target the upper half
      of the VSX register set. The existing ones target the lower half. In order to
      unify these and have the ability to target all the VSX registers using D-Form
      operations, this patch defines Pseudo-ops for the loads/stores which are
      expanded post-RA. The expansion then chooses the correct opcode based on the
      register that was allocated for the operation.
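
      The selection step can be sketched as follows. This is an illustrative
      model rather than the actual expansion code: MemOp and
      selectDFormOpcode are hypothetical names, and the mnemonics (lfd/stfd
      for VSX registers 0-31, lxsd/stxsd for 32-63) are an assumption about
      which instructions back the double-precision case.

```cpp
#include <cassert>
#include <string>

// Hypothetical pseudo-op kinds for a 64-bit scalar D-Form access.
enum class MemOp { LoadF64, StoreF64 };

// Picks a real opcode once the allocated VSX register number is known.
// Registers 0-31 form the lower half; registers 32-63 form the upper half.
std::string selectDFormOpcode(MemOp op, unsigned vsrNum) {
  assert(vsrNum < 64);
  const bool upperHalf = vsrNum >= 32;
  switch (op) {
  case MemOp::LoadF64:
    return upperHalf ? "lxsd" : "lfd";
  case MemOp::StoreF64:
    return upperHalf ? "stxsd" : "stfd";
  }
  return ""; // Not reached for valid MemOp values.
}
```

      Deferring the choice until after register allocation lets the allocator
      treat all 64 registers as candidates for the same pseudo-op.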
      
      llvm-svn: 283212
      6354d235
    • [Power9] Part-word VSX integer scalar loads/stores and sign extend instructions · 11049f8f
      Nemanja Ivanovic authored
      This patch corresponds to review:
      https://reviews.llvm.org/D23155
      
      This patch removes the VSHRC register class (based on D20310) and adds
      exploitation of the Power9 sub-word integer loads into VSX registers as well
      as vector sign extensions.
      The new instructions are useful for a few purposes:
      
          Int to Fp conversions of 1 or 2-byte values loaded from memory
          Building vectors of 1 or 2-byte integers with values loaded from memory
          Storing individual 1 or 2-byte elements from integer vectors
      
      This patch implements all of those uses.
      
      llvm-svn: 283190
      11049f8f
  24. Sep 27, 2016
    • [Power9] Builtins for ELF v.2 ABI conformance - back end portion · 6f22b413
      Nemanja Ivanovic authored
      This patch corresponds to review:
      https://reviews.llvm.org/D24396
      
      This patch adds support for the "vector count trailing zeroes",
      "vector compare not equal" and "vector compare not equal or zero" instructions
      as well as "scalar count trailing zeroes" instructions. It also changes the
      vector negation to use XXLNOR (when VSX is enabled) so as not to increase
      register pressure (previously this was done with a splat immediate of all
      ones followed by an XXLXOR). This was done because the altivec.h
      builtins (patch to follow) use vector negation and the use of an additional
      register for the splat immediate is not optimal.
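
      Reading "vector negation" here as bitwise complement, the equivalence
      behind this change can be shown on a single 64-bit lane. The sketch below
      is a scalar model with hypothetical function names; it only demonstrates
      that NOR of a value with itself needs no all-ones operand.

```cpp
#include <cstdint>

// Complement via NOR: nor(a, a) = ~(a | a) = ~a. Only one input is needed.
uint64_t notViaNor(uint64_t a) { return ~(a | a); }

// Complement via XOR: needs a second operand holding all ones, which in the
// vector case occupies an extra register for the splat immediate.
uint64_t notViaXor(uint64_t a) {
  const uint64_t allOnes = ~uint64_t{0};
  return a ^ allOnes;
}
```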
      
      llvm-svn: 282478
      6f22b413
  25. Sep 23, 2016
  26. Sep 22, 2016
  27. Aug 18, 2016
  28. Jul 18, 2016
  29. Jul 12, 2016
  30. Jul 05, 2016
    • [PowerPC] - Legalize vector types by widening instead of integer promotion · 44513e54
      Nemanja Ivanovic authored
      This patch corresponds to review:
      http://reviews.llvm.org/D20443
      
      It changes the legalization strategy for illegal vector types from integer
      promotion to widening. This only applies for vectors with elements of width
      that is a multiple of a byte since we have hardware support for vectors with
      1, 2, 4, 8 and 16 byte elements.
      Integer promotion for vectors is quite expensive on PPC due to the sequence
      of breaking apart the vector, extending the elements and reconstituting the
      vector. Two of these operations are expensive.
      This patch causes between minor and major improvements in performance on most
      benchmarks. There are very few benchmarks whose performance regresses. These
      regressions can be handled in a subsequent patch with a DAG combine (similar
      to how this patch handles int -> fp conversions of illegal vector types).
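
      The difference between the two strategies can be sketched with a
      simplified type model. This is not the legalizer's implementation:
      VecType, widen and promote are hypothetical names, and a single 128-bit
      vector register width is assumed.

```cpp
#include <cassert>

// Hypothetical description of a vector type: element count and element width.
struct VecType {
  unsigned numElts;
  unsigned eltBits;
};

constexpr unsigned kVectorBits = 128; // Assumed register width.

// Widening keeps the element width and adds elements until the vector
// fills a register, e.g. v2i8 becomes v16i8.
VecType widen(VecType t) {
  assert(t.eltBits % 8 == 0 && kVectorBits % t.eltBits == 0);
  assert(t.numElts * t.eltBits <= kVectorBits);
  return {kVectorBits / t.eltBits, t.eltBits};
}

// Integer promotion keeps the element count and grows each element until
// the vector fills a register, e.g. v2i8 becomes v2i64.
VecType promote(VecType t) {
  assert(t.numElts != 0 && kVectorBits % t.numElts == 0);
  return {t.numElts, kVectorBits / t.numElts};
}
```

      With widening, the original elements keep their size and position, so no
      per-element extension is needed to reach a legal type.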
      
      llvm-svn: 274535
      44513e54
  31. May 04, 2016
  32. Mar 31, 2016
  33. Mar 28, 2016
    • [Power9] Implement new vsx instructions: insert, extract, test data class,... · 80722719
      Chuang-Yu Cheng authored
      [Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat
      
      This change implements the following vsx instructions:
      
      - Scalar Insert/Extract
          xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp
      
      - Vector Insert/Extract
          xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp
          xxextractuw xxinsertw
      
      - Scalar/Vector Test Data Class
          xststdcdp xststdcsp xststdcqp
          xvtstdcdp xvtstdcsp
      
      - Maximum/Minimum
          xsmaxcdp xsmaxjdp
          xsmincdp xsminjdp
      
      - Vector Byte-Reverse/Permute/Splat
          xxbrd xxbrh xxbrq xxbrw
          xxperm xxpermr
          xxspltib
      
      30 instructions
      
      Thanks Nemanja for invaluable discussion! Thanks to Kit for the great help!
      Reviewers: hal, nemanja, kbarton, tjablin, amehsan
      
      http://reviews.llvm.org/D16842
      
      llvm-svn: 264567
      80722719
    • [Power9] Implement new vsx instructions: quad-precision move, fp-arithmetic · 56638489
      Chuang-Yu Cheng authored
      This change implements the following vsx instructions:
      
      - quad-precision move
          xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp
      
      - quad-precision fp-arithmetic
          xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o)
          xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o)
      
      22 instructions
      
      Thanks Nemanja and Kit for careful review and invaluable discussion!
      Reviewers: hal, nemanja, kbarton, tjablin, amehsan
      
      http://reviews.llvm.org/D16110
      
      llvm-svn: 264565
      56638489