- Jun 08, 2017
-
-
Zaara Syeda authored
This patch adds build vector patterns to exploit the vector integer extend instructions: vextsb2w - Vector Extend Sign Byte To Word vextsb2d - Vector Extend Sign Byte To Doubleword vextsh2w - Vector Extend Sign Halfword To Word vextsh2d - Vector Extend Sign Halfword To Doubleword vextsw2d - Vector Extend Sign Word To Doubleword Differential Revision: https://reviews.llvm.org/D33510 llvm-svn: 304992
-
- May 31, 2017
-
-
Tony Jiang authored
There are some VectorShuffle Nodes in SDAG which can be selected to XXPERMDI Instruction, this patch recognizes them and does the selection to improve the PPC performance. Differential Revision: https://reviews.llvm.org/D33404 llvm-svn: 304298
-
- May 29, 2017
-
-
Hiroshi Inoue authored
Summary clang -c -mcpu=pwr9 test/CodeGen/PowerPC/build-vector-tests.ll causes an assertion failure during the binary encoding. The failure occurs when a D-form load instruction takes two register operands instead of a register + an immediate. This patch fixes the problem and also adds an assertion to catch this failure earlier before the binary encoding (i.e. during lit test). The fix is from Nemanja Ivanovic @nemanjai. Differential Revision: https://reviews.llvm.org/D33482 llvm-svn: 304133
-
- May 25, 2017
-
-
Tony Jiang authored
There are some VectorShuffle Nodes in SDAG which can be selected to XXSLDWI instruction, this patch recognizes them and does the selection to improve the PPC performance. llvm-svn: 303822
-
- May 24, 2017
-
-
- May 12, 2017
-
-
Guozhi Wei authored
[PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0 According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0. This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified. Differential Revision: https://reviews.llvm.org/D32880 llvm-svn: 302834
-
- May 02, 2017
-
-
Nemanja Ivanovic authored
Fixes PR30730. This is a re-commit of a pulled commit. The commit was pulled because some software projects contained uses of Altivec vectors that violated alignment requirements. Known issues have now been fixed. Committing on behalf of Lei Huang. Differential Revision: https://reviews.llvm.org/D26861 llvm-svn: 301892
-
- Mar 30, 2017
-
-
Simon Pilgrim authored
Based on corrections mentioned in patch for clang for PR27635 llvm-svn: 299072
-
- Mar 15, 2017
-
-
Nemanja Ivanovic authored
mfvrd and mffprd are both alias to mfvrsd. This patch enables correct parsing of the aliases, but we still emit a mfvrsd. Committing on behalf of brunoalr (Bruno Rosa). Differential Revision: https://reviews.llvm.org/D29177 llvm-svn: 297849
-
- Jan 26, 2017
-
-
Sean Fertile authored
1) Explicitly sets mayLoad/mayStore property in the tablegen files on load/store instructions. 2) Updated the flags on a number of intrinsics indicating that they write memory. 3) Added SDNPMemOperand flags for some target dependent SDNodes so that they propagate their memory operand Review: https://reviews.llvm.org/D28818 llvm-svn: 293200
-
- Dec 15, 2016
-
-
Nemanja Ivanovic authored
In some situations, the BUILD_VECTOR node that builds a v18i8 vector by a splat of an i8 constant will end up with signed 8-bit values and other situations, it'll end up with unsigned ones. Handle both situations. Fixes PR31340. llvm-svn: 289804
-
- Dec 09, 2016
-
-
Sean Fertile authored
Revision: https://reviews.llvm.org/D26547 llvm-svn: 289227
-
- Dec 06, 2016
-
-
Nemanja Ivanovic authored
This is the final patch in the series of patches that improves BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations to remove redundant instructions. It also adds a large test case which encompasses a large set of code patterns that build vectors - this test case was the motivator for this series of patches. Differential Revision: https://reviews.llvm.org/D26066 llvm-svn: 288800
-
- Nov 30, 2016
-
-
https://reviews.llvm.org/rL287679Nemanja Ivanovic authored
This commit caused some miscompiles that did not show up on any of the bots. Reverting until we can investigate the cause of those failures. llvm-svn: 288214
-
- Nov 29, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: https://reviews.llvm.org/D25912 This is the first patch in a series of 4 that improve the lowering and combining for BUILD_VECTOR nodes on PowerPC. llvm-svn: 288152
-
- Nov 23, 2016
-
-
Nemanja Ivanovic authored
In rL283190, I added some InstAlias definitions to generate extended mnemonics for some uses of the XXPERMDI instruction. However, when the assembler matches these extended mnemonics, it matches the new instruction in situations where it should match the old one. This patch removes these definitions and accomplishes that by defining these mnemonics with additional instructions that are isCodeGenOnly. Fixes PR31127. llvm-svn: 287765
-
- Nov 22, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: https://reviews.llvm.org/D26861 It also fixes PR30730. Committing on behalf of Lei Huang. llvm-svn: 287679
-
- Nov 15, 2016
-
-
Zaara Syeda authored
llvm-svn: 286993
-
Tony Jiang authored
This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE, they behaves exactly the same with vec_xl and vec_xst, therefore they are simply implemented by defining a matching macro. On LE, they are implemented by defining new builtins and intrinsics. For int/float/long long/double, it is just a load (lxvw4x/lxvd2x) or store(stxvw4x/stxvd2x). For char/char/short, we also need some extra shuffling before or after call the builtins to get the desired BE order. For int128, simply call vec_xl or vec_xst. llvm-svn: 286967
-
- Nov 14, 2016
-
-
Sean Fertile authored
add an intrinsic to expose the 'VSX Scalar Convert Half-Precision to Single-Precision' instruction. Differential review: https://reviews.llvm.org/D26536 llvm-svn: 286862
-
Sean Fertile authored
Differential Revision: https://reviews.llvm.org/D26272 llvm-svn: 286829
-
- Nov 11, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: https://reviews.llvm.org/D26480 Adds all the intrinsics used for various permute builtins that will be added to altivec.h. llvm-svn: 286638
-
Nemanja Ivanovic authored
This patch corresponds to review: https://reviews.llvm.org/D26307 Adds all the intrinsics used for various conversion builtins that will be added to altivec.h. These are type conversions between various types of vectors. llvm-svn: 286596
-
- Oct 26, 2016
-
-
Nemanja Ivanovic authored
This revision corresponds to review: https://reviews.llvm.org/D25957. Committing on behalf of Zaara Syeda. llvm-svn: 285225
-
- Oct 24, 2016
-
-
Ehsan Amiri authored
https://reviews.llvm.org/D23614 Currently we load +0.0 from constant area. That can change to be generated using XOR instruction. llvm-svn: 284995
-
- Oct 04, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: The newly added VSX D-Form (register + offset) memory ops target the upper half of the VSX register set. The existing ones target the lower half. In order to unify these and have the ability to target all the VSX registers using D-Form operations, this patch defines Pseudo-ops for the loads/stores which are expanded post-RA. The expansion then choses the correct opcode based on the register that was allocated for the operation. llvm-svn: 283212
-
Nemanja Ivanovic authored
This patch corresponds to review: https://reviews.llvm.org/D23155 This patch removes the VSHRC register class (based on D20310) and adds exploitation of the Power9 sub-word integer loads into VSX registers as well as vector sign extensions. The new instructions are useful for a few purposes: Int to Fp conversions of 1 or 2-byte values loaded from memory Building vectors of 1 or 2-byte integers with values loaded from memory Storing individual 1 or 2-byte elements from integer vectors This patch implements all of those uses. llvm-svn: 283190
-
- Sep 27, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: https://reviews.llvm.org/D24396 This patch adds support for the "vector count trailing zeroes", "vector compare not equal" and "vector compare not equal or zero instructions" as well as "scalar count trailing zeroes" instructions. It also changes the vector negation to use XXLNOR (when VSX is enabled) so as not to increase register pressure (previously this was done with a splat immediate of all ones followed by an XXLXOR). This was done because the altivec.h builtins (patch to follow) use vector negation and the use of an additional register for the splat immediate is not optimal. llvm-svn: 282478
-
- Sep 23, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: https://reviews.llvm.org/D21135 This patch exploits the following instructions: mtvsrws lxvwsx mtvsrdd mfvsrld In order to improve some build_vector and extractelement patterns. llvm-svn: 282246
-
- Sep 22, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to: https://reviews.llvm.org/D21409 The LXVD2X, LXVW4X, STXVD2X and STXVW4X instructions permute the two doublewords in the vector register when in little-endian mode. Custom code ensures that the necessary swaps are inserted for these. This patch simply removes the possibilty that a load/store node will match one of these instructions in the SDAG as that would not insert the necessary swaps. llvm-svn: 282144
-
Nemanja Ivanovic authored
This patch corresponds to review: https://reviews.llvm.org/D19825 The new lxvx/stxvx instructions do not require the swaps to line the elements up correctly. In order to select them over the lxvd2x/lxvw4x instructions which require swaps, the patterns for the old instruction have a predicate that ensures they won't be selected on Power9 and newer CPUs. llvm-svn: 282143
-
- Aug 18, 2016
-
-
Michael Kuperstein authored
The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129
-
- Jul 18, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: https://reviews.llvm.org/D21354 We use direct moves for extracting integer elements from vectors. We also use direct moves when converting integers to FP. When these operations are chained, we get a direct move out of a VSR followed by a direct move back into a VSR. These are redundant - all we need to do is line up the element and convert. llvm-svn: 275796
-
- Jul 12, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: http://reviews.llvm.org/D20239 It adds exploitation of XXINSERTW and XXEXTRACTUW instructions that are useful in some cases for inserting and extracting vector elements of v4[if]32 vectors. llvm-svn: 275215
-
Nemanja Ivanovic authored
This patch corresponds to review: http://reviews.llvm.org/D21358 Vector shifts that have the same semantics as a vector swap are cannonicalized as such to provide additional opportunities for swap removal optimization to remove unnecessary swaps. llvm-svn: 275168
-
- Jul 05, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: http://reviews.llvm.org/D20443 It changes the legalization strategy for illegal vector types from integer promotion to widening. This only applies for vectors with elements of width that is a multiple of a byte since we have hardware support for vectors with 1, 2, 3, 8 and 16 byte elements. Integer promotion for vectors is quite expensive on PPC due to the sequence of breaking apart the vector, extending the elements and reconstituting the vector. Two of these operations are expensive. This patch causes between minor and major improvements in performance on most benchmarks. There are very few benchmarks whose performance regresses. These regressions can be handled in a subsequent patch with a DAG combine (similar to how this patch handles int -> fp conversions of illegal vector types). llvm-svn: 274535
-
- May 04, 2016
-
-
Nemanja Ivanovic authored
This patch corresponds to review: http://reviews.llvm.org/D18592 It allows the PPC back end to generate the xxspltw instruction where we previously only emitted vspltw. llvm-svn: 268516
-
- Mar 31, 2016
-
-
Ehsan Amiri authored
http://reviews.llvm.org/D18097 Initial support does not include any patterns to generate this instructions llvm-svn: 265031
-
- Mar 28, 2016
-
-
Chuang-Yu Cheng authored
[Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat This change implements the following vsx instructions: - Scalar Insert/Extract xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp - Vector Insert/Extract xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp xxextractuw xxinsertw - Scalar/Vector Test Data Class xststdcdp xststdcsp xststdcqp xvtstdcdp xvtstdcsp - Maximum/Minimum xsmaxcdp xsmaxjdp xsmincdp xsminjdp - Vector Byte-Reverse/Permute/Splat xxbrd xxbrh xxbrq xxbrw xxperm xxpermr xxspltib 30 instructions Thanks Nemanja for invaluable discussion! Thanks Kit's great help! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16842 llvm-svn: 264567
-
Chuang-Yu Cheng authored
This change implements the following vsx instructions: - quad-precision move xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp - quad-precision fp-arithmetic xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o) xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o) 22 instructions Thanks Nemanja and Kit for careful review and invaluable discussion! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16110 llvm-svn: 264565
-