- Aug 14, 2017
-
-
Lei Huang authored
Add codegen for VSX word extract conversion from signed/unsigned to single/double precision. For UINT_TO_FP: Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239. Here we will add the missing extract integer and conversion to double. This utilizes the new P9 instruction xxextractuw to extract an integer element when the result will be converted to double, thereby saving 2 direct moves (VSR <-> GPR). For SINT_TO_FP: We will implement the following sequences, which will also reduce the number of instructions by saving 2 direct moves. v4i32->f32: xxspltw, xvcvsxwsp, xscvspdpn; v4i32->f64: xxspltw, xvcvsxwdp. Differential Revision: https://reviews.llvm.org/D35859 llvm-svn: 310866
-
Chandler Carruth authored
introduce a miscompile bug. There appears to be a bug where the generated code to extract the sign bit doesn't work correctly for 32-bit inputs. I've replied to the original commit pointing out the problem. I think I see by inspection (and reading the manual for PPC) how to fix this, but I can't be 100% confident and I also don't know what the best way to test this is. Currently it seems nearly impossible to get the backend to hit this code path, but the patch author is likely in a better position to craft such test cases than I am, and based on where the bug is it should be easily done. Original commit message for r310346: """ [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 """ llvm-svn: 310809
-
- Aug 10, 2017
-
-
Krzysztof Parzyszek authored
The liveness-tracking code assumes that the registers that were saved in the function's prolog are live outside of the function. Specifically, that registers that were saved are also live-on-exit from the function. This isn't always the case as illustrated by the LR register on ARM. Differential Revision: https://reviews.llvm.org/D36160 llvm-svn: 310619
-
- Aug 09, 2017
-
-
Nemanja Ivanovic authored
llvm-svn: 310424
-
- Aug 08, 2017
-
-
Nemanja Ivanovic authored
We've implemented a 1-byte splat using XXSPLTISB on P9. However, LLVM will produce a 1-byte splat even for wider element BUILD_VECTOR nodes. This patch prevents crashing in that situation. Differential Revision: https://reviews.llvm.org/D35650 llvm-svn: 310358
-
Nemanja Ivanovic authored
llvm-svn: 310356
-
Nemanja Ivanovic authored
Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it adds the handling for the special case where RHS == 0. Differential Revision: https://reviews.llvm.org/D34048 llvm-svn: 310346
-
- Aug 03, 2017
-
-
Rafael Espindola authored
llvm-svn: 309921
-
Rafael Espindola authored
IMHO it is an antipattern to have an enum value that is Default. At any given piece of code it is not clear if we have to handle Default or if it has already been mapped to a concrete value. In this case in particular, only the target can do the mapping and it is nice to make sure it is always done. This deletes the two default enum values of CodeModel and uses an explicit Optional<CodeModel> when it is possible that it is unspecified. llvm-svn: 309911
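The pattern described above can be sketched roughly as follows. This is only an illustration, not the code from the commit: it uses `std::optional` (C++17) in place of LLVM's `Optional`, and the enum members and the `resolveCodeModel` helper are hypothetical names chosen for the example.

```cpp
#include <optional>

// Hypothetical enum with no "Default" member: every value is concrete.
enum class CodeModel { Small, Kernel, Medium, Large };

// Hypothetical helper: the target supplies its own default and maps an
// unspecified request to a concrete value exactly once.
constexpr CodeModel resolveCodeModel(std::optional<CodeModel> requested,
                                     CodeModel targetDefault) {
  return requested.value_or(targetDefault);
}
```

After the call, downstream code only ever sees a concrete `CodeModel`, so there is no ambiguity about whether a "default" still needs to be handled.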
-
- Aug 02, 2017
-
-
Stefan Pintilie authored
Power 9 has instructions to do absolute difference (VABSDUB, VABSDUH, VABSDUW) for byte, halfword and word. We should take advantage of these. Differential Revision: https://reviews.llvm.org/D34684 llvm-svn: 309876
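As a rough scalar model of what a byte-wise unsigned absolute difference computes (the operation the message attributes to VABSDUB), consider the sketch below. The type alias `V16U8` and the function `absDiffU8` are hypothetical names for illustration; this is not the backend's lowering code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Sixteen unsigned bytes, standing in for one 128-bit vector register.
using V16U8 = std::array<std::uint8_t, 16>;

// Element-wise |a[i] - b[i]| on unsigned bytes. Subtracting the smaller
// value from the larger one avoids unsigned wrap-around.
V16U8 absDiffU8(const V16U8 &a, const V16U8 &b) {
  V16U8 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = a[i] > b[i] ? static_cast<std::uint8_t>(a[i] - b[i])
                       : static_cast<std::uint8_t>(b[i] - a[i]);
  return r;
}
```

The halfword and word variants would follow the same shape with wider element types.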
-
- Jul 31, 2017
-
-
Hiroshi Inoue authored
Changed method names based on the discussion in https://reviews.llvm.org/D34986: getInt64 -> selectI64Imm, getInt64Count -> selectI64ImmInstrCount. llvm-svn: 309541
-
- Jul 27, 2017
-
-
Hiroshi Inoue authored
In optimizeCompareInstr, a compare instruction is eliminated by using a record form instruction if possible. If the branch instruction that uses the result of the compare has a static branch hint, the optimization does not happen. This patch makes this optimization happen regardless of the branch hint by splitting branch hint and branch condition before checking the predicate to identify the possible optimizations. Differential Revision: https://reviews.llvm.org/D35801 llvm-svn: 309255
-
- Jul 26, 2017
-
-
Peter Collingbourne authored
This was a use-after-free waiting to happen. llvm-svn: 309159
-
Stefan Pintilie authored
Added a comment to explain how to add a PPCISD node. llvm-svn: 309114
-
Eric Christopher authored
llvm-svn: 309041
-
- Jul 25, 2017
-
-
Nemanja Ivanovic authored
This patch just adds printing of CR bit registers in a more human-readable form akin to that used by the GNU binutils. Differential Revision: https://reviews.llvm.org/D31494 llvm-svn: 309001
-
Nemanja Ivanovic authored
This is just a recommit since the issue that the commit exposed is now resolved. llvm-svn: 308995
-
- Jul 21, 2017
-
-
Guozhi Wei authored
MIR SRADI uses instruction template XSForm_1rc which declares Defs = [CARRY]. But MIR SRADI_32 uses instruction template XSForm_1, and it doesn't declare such implicit definition. With patch D33720 it causes wrong code generation for perl. This patch adds the implicit definition. Differential Revision: https://reviews.llvm.org/D35699 llvm-svn: 308780
-
Jonas Paulsson authored
This patch makes LSR generate better code for SystemZ in the cases of memory intrinsics, Load->Store pairs or comparison of immediate with memory. In order to achieve this, the following common code changes were made:
* New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if LSR should do instruction-based addressing evaluations by calling isLegalAddressingMode() with the Instruction pointers.
* In LoopStrengthReduce: handle address operands of memset, memmove and memcpy as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address, not just loads or stores.
SystemZ changes:
* isLSRCostLess() implemented with Insns first, and without ImmCost.
* New function supportedAddressingMode() that is a helper for TTI methods looking at Instructions passed via pointers.
Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D35262 https://reviews.llvm.org/D35049 llvm-svn: 308729
-
- Jul 18, 2017
-
-
Hiroshi Inoue authored
llvm-svn: 308305
-
- Jul 14, 2017
-
-
Eric Christopher authored
llvm-svn: 307999
-
- Jul 13, 2017
-
-
Nemanja Ivanovic authored
As outlined in the PR, we didn't ensure that displacements for DQ-Form instructions are multiples of 16. Since the instruction encoding encodes a quad-word displacement, a sub-16 byte displacement is meaningless and ends up being encoded incorrectly. Fixes https://bugs.llvm.org/show_bug.cgi?id=33671. Differential Revision: https://reviews.llvm.org/D35007 llvm-svn: 307934
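The constraint described above can be illustrated with a small legality check. This is a sketch under stated assumptions, not the code from the patch: the function name `isLegalDQFormDisp` is hypothetical, and the signed 16-bit range is an assumption about the displacement field that is not stated in the message.

```cpp
#include <cstdint>

// Hypothetical check: a DQ-Form displacement must be a multiple of 16.
// Assumption: the byte displacement must also fit in a signed 16-bit value.
constexpr bool isLegalDQFormDisp(std::int64_t disp) {
  return disp >= INT16_MIN && disp <= INT16_MAX && (disp % 16) == 0;
}
```

For example, a displacement of 32 passes the check while 8 does not, which matches the message's point that a sub-16-byte displacement cannot be encoded meaningfully.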
-
- Jul 12, 2017
-
-
Rafael Espindola authored
The issue is not if the value is pcrel. It is whether we have a relocation or not. If we have a relocation, the static linker will select the upper bits. If we don't have a relocation, we have to do it. llvm-svn: 307730
-
- Jul 11, 2017
-
-
Tony Jiang authored
1. The red zone storage region available to compilers is 288 bytes rather than 244 bytes. 2. The formula for aligning a negative number should be y = x & ~(n-1) rather than y = (x + (n-1)) & ~(n-1). Differential Revision: https://reviews.llvm.org/D34337 llvm-svn: 307672
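The difference between the two formulas in item 2 shows up as soon as x is negative. The sketch below assumes n is a power of two and a two's-complement integer representation; the function names `alignDown` and `alignUp` are hypothetical labels for the two formulas, not names from the patch.

```cpp
#include <cstdint>

// y = x & ~(n-1): clears the low bits, which rounds toward negative
// infinity. For a negative offset this moves further from zero.
constexpr std::int64_t alignDown(std::int64_t x, std::int64_t n) {
  return x & ~(n - 1);
}

// y = (x + (n-1)) & ~(n-1): rounds toward positive infinity. For a
// negative offset this moves closer to zero, shrinking the region.
constexpr std::int64_t alignUp(std::int64_t x, std::int64_t n) {
  return (x + (n - 1)) & ~(n - 1);
}
```

With x = -20 and n = 16, the first formula yields -32 and the second yields -16, so only the first keeps the full 20 bytes inside the aligned offset.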
-
Hiroshi Inoue authored
llvm-svn: 307662
-
Hiroshi Inoue authored
In the POWER9 instruction scheduler, SchedWriteRes for the simple integer instructions are misconfigured to use that of (costly) DFU instructions. This results in surprisingly long instruction latency estimation and causes misbehavior in some optimizers such as if-conversion. Differential Revision: https://reviews.llvm.org/D34869 llvm-svn: 307624
-
Hiroshi Inoue authored
This patch reduces compilation time by avoiding redundant analysis while selecting instructions to create an immediate. If the instruction count required to create the input number without rotate is 2, we do not need further analysis to find a shorter instruction sequence with rotate; rotate + load constant cannot be done by 1 instruction (i.e. getInt64CountDirect never returns 0). This patch should not change functionality. Differential Revision: https://reviews.llvm.org/D34986 llvm-svn: 307623
-
- Jul 10, 2017
-
-
Tony Jiang authored
Differential Revision: https://reviews.llvm.org/D34908 Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093 llvm-svn: 307563
-
Lei Huang authored
[PowerPC] Reduce register pressure by not materializing a constant just for use as an index register for X-Form loads/stores.
For this example:
  float test (int *arr) { return arr[2]; }
We currently generate the following code:
  li r4, 8
  lxsiwax f0, r3, r4
  xscvsxdsp f1, f0
With this patch, we will now generate:
  addi r3, r3, 8
  lxsiwax f0, 0, r3
  xscvsxdsp f1, f0
Originally reported in: https://bugs.llvm.org/show_bug.cgi?id=27204 Differential Revision: https://reviews.llvm.org/D35027 llvm-svn: 307553
-
Hiroshi Inoue authored
llvm-svn: 307533
-
Hiroshi Inoue authored
llvm-svn: 307523
-
- Jul 07, 2017
-
-
Lei Huang authored
llvm-svn: 307442
-
Tony Jiang authored
Differential Revision: https://reviews.llvm.org/D33572 Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093 llvm-svn: 307413
-
Simon Pilgrim authored
llvm-svn: 307382
-
- Jul 05, 2017
-
-
Sean Fertile authored
On Power 8 we sometimes insert swaps to deal with the difference between Little-Endian and Big-Endian. The swap removal pass is supposed to clean up these swaps. On Power 9 we don't need this pass since we do not need to insert the swaps in the first place. Committing on behalf of Stefan Pintilie. Differential Revision: https://reviews.llvm.org/D34627 llvm-svn: 307185
-
Sean Fertile authored
Committing on behalf of Stefan Pintilie. Differential Revision: https://reviews.llvm.org/D34829 llvm-svn: 307180
-
Tony Jiang authored
This patch adds the exploitation of new Power 9 instructions which extract variable elements from vectors: VEXTUBLX, VEXTUBRX, VEXTUHLX, VEXTUHRX, VEXTUWLX, VEXTUWRX. Differential Revision: https://reviews.llvm.org/D34032 Commit on behalf of Zaara Syeda (syzaara@ca.ibm.com) llvm-svn: 307174
-
Tony Jiang authored
This patch adds on to the exploitation added by https://reviews.llvm.org/D33510. This now catches build vector nodes where the inputs are coming from sign extended vector extract elements where the indices used by the vector extract are not correct. We can still use the new hardware instructions by adding a shuffle to move the elements to the correct indices. I introduced a new PPCISD node here because adding a vector_shuffle and changing the elements of the vector_extracts was getting undone by another DAG combine. Commit on behalf of Zaara Syeda (syzaara@ca.ibm.com) Differential Revision: https://reviews.llvm.org/D34009 llvm-svn: 307169
-
Nemanja Ivanovic authored
Remove casts to a constant when a node can be an undef. Differential Revision: https://reviews.llvm.org/D34808 llvm-svn: 307120
-
- Jul 01, 2017
-
-
Rafael Espindola authored
It was not processing any value. All that it ever did was force relocations, so name it shouldForceRelocation. llvm-svn: 306906
-