- Jun 12, 2017
-
-
Tony Jiang authored
Power9 has instructions that will reverse the bytes within an element for all sizes (half-word, word, double-word and quad-word). These can be used for the vec_revb builtins in altivec.h. However, we implement these to match vector shuffle nodes as that will cover both the builtins and vector shuffles that occur in the SDAG through other means. Differential Revision: https://reviews.llvm.org/D33690 llvm-svn: 305214
-
Tony Jiang authored
Note that if we need the result of both the divide and the modulo then we compute the modulo based on the result of the divide and not using the new hardware instruction. Commit on behalf of STEFAN PINTILIE. Differential Revision: https://reviews.llvm.org/D33940 llvm-svn: 305210
-
Sanjay Patel authored
This step is just intended to reduce code duplication rather than change any functionality. A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper. Differential Revision: https://reviews.llvm.org/D33649 llvm-svn: 305192
-
Daniel Neilson authored
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation. Reviewers: chandlerc, rnk, reames Reviewed By: reames Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D33903 llvm-svn: 305189
-
- Jun 08, 2017
-
-
Guozhi Wei authored
In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64. This patch fixed PR32442. Differential Revision: https://reviews.llvm.org/D31407 llvm-svn: 305001
-
Zaara Syeda authored
This patch adds build vector patterns to exploit the vector integer extend instructions: vextsb2w - Vector Extend Sign Byte To Word vextsb2d - Vector Extend Sign Byte To Doubleword vextsh2w - Vector Extend Sign Halfword To Word vextsh2d - Vector Extend Sign Halfword To Doubleword vextsw2d - Vector Extend Sign Word To Doubleword Differential Revision: https://reviews.llvm.org/D33510 llvm-svn: 304992
-
- Jun 07, 2017
-
-
Nemanja Ivanovic authored
Adds handling for i64 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33720 llvm-svn: 304907
-
Nemanja Ivanovic authored
Adds handling for i32 SETNE comparison (both sign and zero extended). Differential Revision: https://reviews.llvm.org/D33718 llvm-svn: 304901
-
Zachary Turner authored
This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic. Differential Revision: https://reviews.llvm.org/D33843 llvm-svn: 304864
-
- Jun 06, 2017
-
-
Chandler Carruth authored
I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
-
- May 31, 2017
-
-
Matthias Braun authored
This adds a callback to the LLVMTargetMachine that lets target indicate that they do not pass the machine verifier checks in all cases yet. This is intended to be a temporary measure while the targets are fixed allowing us to enable the machine verifier by default with EXPENSIVE_CHECKS enabled! Differential Revision: https://reviews.llvm.org/D33696 llvm-svn: 304320
-
Sean Fertile authored
Fixes PPCTTIImpl::getCacheLineSize() returning the wrong cache line size for newer ppc processors. Commiting on behalf of Stefan Pintilie. Differential Revision: https://reviews.llvm.org/D33656 llvm-svn: 304317
-
Zaara Syeda authored
This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313
-
Tony Jiang authored
There are some VectorShuffle Nodes in SDAG which can be selected to XXPERMDI Instruction, this patch recognizes them and does the selection to improve the PPC performance. Differential Revision: https://reviews.llvm.org/D33404 llvm-svn: 304298
-
Nemanja Ivanovic authored
This patch builds upon https://reviews.llvm.org/rL302810 to add handling for the 64-bit SETEQ patterns. Differential Revision: https://reviews.llvm.org/D33369 llvm-svn: 304286
-
Nemanja Ivanovic authored
This patch builds upon https://reviews.llvm.org/rL302810 to add handling for bitwise logical operations in general purpose registers. The idea is to keep the values in GPRs as long as possible - only extracting them to a condition register bit when no further operations are to be done. Differential Revision: https://reviews.llvm.org/D31851 llvm-svn: 304282
-
- May 30, 2017
-
-
Matthias Braun authored
TargetPassConfig is not useful for targets that do not use the CodeGen library, so we may just as well store a pointer to an LLVMTargetMachine instead of just to a TargetMachine. While at it, also change the constructor to take a reference instead of a pointer as the TM must not be nullptr. llvm-svn: 304247
-
Craig Topper authored
Summary: Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie". This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default. Reviewers: spatel, RKSimon, efriedma Reviewed By: RKSimon Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits Differential Revision: https://reviews.llvm.org/D33530 llvm-svn: 304215
-
- May 29, 2017
-
-
Hiroshi Inoue authored
Summary clang -c -mcpu=pwr9 test/CodeGen/PowerPC/build-vector-tests.ll causes an assertion failure during the binary encoding. The failure occurs when a D-form load instruction takes two register operands instead of a register + an immediate. This patch fixes the problem and also adds an assertion to catch this failure earlier before the binary encoding (i.e. during lit test). The fix is from Nemanja Ivanovic @nemanjai. Differential Revision: https://reviews.llvm.org/D33482 llvm-svn: 304133
-
- May 26, 2017
-
-
Matthias Braun authored
- Take reference instead of pointer to a TRI that cannot be nullptr. - Improve documentation comments. llvm-svn: 304038
-
Tim Shen authored
llvm-svn: 303940
-
Tim Shen authored
I forgot to forward the chain, causing some missing instruction dependencies. The test crashes the compiler without this patch. Inspired by the test case, D33519 also tries to remove the extra sync. Differential Revision: https://reviews.llvm.org/D33573 llvm-svn: 303931
-
- May 25, 2017
-
-
Kyle Butt authored
PPC::GETtlsADDR is lowered to a branch and a nop, by the assembly printer. Its size was incorrectly marked as 4, correct it to 8. The incorrect size can cause incorrect branch relaxation in PPCBranchSelector under the right conditions. llvm-svn: 303904
-
Tony Jiang authored
There are some VectorShuffle Nodes in SDAG which can be selected to XXSLDWI instruction, this patch recognizes them and does the selection to improve the PPC performance. llvm-svn: 303822
-
- May 24, 2017
-
-
- May 21, 2017
-
-
Hiroshi Inoue authored
PPC backend eliminates compare instructions by using record-form instructions in PPCInstrInfo::optimizeCompareInstr, which is called from peephole optimization pass. This patch improves this optimization to eliminate more compare instructions in two types of common case. - comparison against a constant 1 or -1 The record-form instructions set CR bit based on signed comparison against 0. So, the current implementation does not exploit the record-form instruction for comparison against a non-zero constant. This patch enables record-form optimization for constant of 1 or -1 if possible; it changes the condition "greater than -1" into "greater than or equal to 0" and "less than 1" into "less than or equal to 0". With this patch, compare can be eliminated in the following code sequence, as an example. uint64_t a, b; if ((a | b) & 0x8000000000000000ull) { ... } else { ... } - andi for 32-bit comparison on PPC64 Since record-form instructions execute 64-bit signed comparison and so we have limitation in eliminating 32-bit comparison, i.e. with cmplwi, using the record-form. The original implementation already has such checks but andi. is not recognized as an instruction which executes implicit zero extension and hence safe to convert into record-form if used for equality check. %1 = and i32 %a, 10 %2 = icmp ne i32 %1, 0 br i1 %2, label %foo, label %bar In this simple example, LLVM generates andi. + cmplwi + beq on PPC64. This patch make it possible to eliminate the cmplwi for this case. I added andi. for optimization targets if it is safe to do so. Differential Revision: https://reviews.llvm.org/D30081 llvm-svn: 303500
-
- May 18, 2017
-
-
Francis Visoiu Mistrih authored
This provides a new way to access the TargetMachine through TargetPassConfig, as a dependency. The patterns replaced here are: * Passes handling a null TargetMachine call `getAnalysisIfAvailable<TargetPassConfig>`. * Passes not handling a null TargetMachine `addRequired<TargetPassConfig>` and call `getAnalysis<TargetPassConfig>`. * MachineFunctionPasses now use MF.getTarget(). * Remove all the TargetMachine constructors. * Remove INITIALIZE_TM_PASS. This fixes a crash when running `llc -start-before prologepilog`. PEI needs StackProtector, which gets constructed without a TargetMachine by the pass manager. The StackProtector pass doesn't handle the case where there is no TargetMachine, so it segfaults. Related to PR30324. Differential Revision: https://reviews.llvm.org/D33222 llvm-svn: 303360
-
- May 17, 2017
-
-
Kyle Butt authored
When legalizing vector operations on vNi128, they will be split to v1i128 because that is a legal type on ppc64, but then the compiler will crash in selection dag because it fails to select for these operations. This patch fixes shift operations. Logical shift right and left shift can be performed in the vector unit, but algebraic shift right requires being split. Differential Revision: https://reviews.llvm.org/D32774 llvm-svn: 303307
-
Krzysztof Parzyszek authored
The variables MinGPR/MinG8R were not updated properly when resetting the offsets, which in the included testcase lead to saving the CR register in the same location as R30. This fixes another issue reported in PR26519. Differential Revision: https://reviews.llvm.org/D33017 llvm-svn: 303257
-
- May 16, 2017
-
-
Tim Shen authored
Summary: This fixes pr32392. The lowering pipeline is: llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in expandPostRAPseudo. The reason why expandPostRAPseudo is chosen is because previous passes are likely eliminating instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some branch pass(s)). Differential Revision: https://reviews.llvm.org/D32763 llvm-svn: 303205
-
- May 12, 2017
-
-
Tim Shen authored
Summary: Eli pointed out that it's unsafe to combine the shifts to ISD::SHL etc., because those are not defined for b > sizeof(a) * 8, even after some of the combiners run. However, PPCISD::SHL defines that behavior (as the instructions themselves). Move the combination to the backend. The tests in shift_mask.ll still pass. Reviewers: echristo, hfinkel, efriedma, iteratee Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D33076 llvm-svn: 302937
-
Guozhi Wei authored
[PPC] Change the register constraint of the first source operand of instruction mtvsrdd to g8rc_nox0 According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0. This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified. Differential Revision: https://reviews.llvm.org/D32880 llvm-svn: 302834
-
- May 11, 2017
-
-
Nemanja Ivanovic authored
This patch is the first in a series of patches to provide code gen for doing compares in GPRs when the compare result is required in a GPR. It adds the infrastructure to select GPR sequences for i1->i32 and i1->i64 extensions. This first patch handles equality comparison on i32 operands with the result sign or zero extended. Differential Revision: https://reviews.llvm.org/D31847 llvm-svn: 302810
-
- May 09, 2017
-
-
Tim Shen authored
Now both emitLeadingFence and emitTrailingFence take the instruction itself, instead of taking IsLoad/IsStore pairs. Instruction::mayReadFromMemory and Instrucion::mayWriteToMemory are used for determining those two booleans. The instruction argument is also useful for later D32763, in emitTrailingFence. For emitLeadingFence, it seems to have cleaner interface with the proposed change. Differential Revision: https://reviews.llvm.org/D32762 llvm-svn: 302539
-
Serge Pavlov authored
Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527
-
- May 05, 2017
-
-
Craig Topper authored
[KnownBits] Add wrapper methods for setting and clear all bits in the underlying APInts in KnownBits. This adds routines for reseting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown. Differential Revision: https://reviews.llvm.org/D32637 llvm-svn: 302262
-
- May 04, 2017
-
-
Krzysztof Parzyszek authored
This happened on the PPC32/SVR4 path and was discovered when building FreeBSD on PPC32. It was a typo-class error in the frame lowering code. This fixes PR26519. llvm-svn: 302183
-
- May 03, 2017
-
-
Tim Shen authored
Summary: This is the corresponding llvm change to D28037 to ensure no performance regression. Reviewers: bogner, kbarton, hfinkel, iteratee, echristo Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28329 llvm-svn: 301990
-
- May 02, 2017
-
-
Nemanja Ivanovic authored
Fixes PR30730. This is a re-commit of a pulled commit. The commit was pulled because some software projects contained uses of Altivec vectors that violated alignment requirements. Known issues have now been fixed. Committing on behalf of Lei Huang. Differential Revision: https://reviews.llvm.org/D26861 llvm-svn: 301892
-
- May 01, 2017
-
-
Amara Emerson authored
This removes BinaryWithFlagsSDNode, and flags are now all passed by value. Differential Revision: https://reviews.llvm.org/D32527 llvm-svn: 301803
-