- Oct 07, 2019
-
-
Simon Atanasyan authored
J/JAL/JALX/JALS are absolute branches, but stay within the current 256 MB-aligned region, so we must include the high bits of the instruction address when calculating the branch target. Patch by James Clarke. Differential Revision: https://reviews.llvm.org/D68548 llvm-svn: 373906
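As a rough illustration of the target calculation described above, the sketch below follows the MIPS32 definition of J/JAL: the 26-bit instruction index is shifted left by two and combined with the upper four bits of the address of the delay-slot instruction. The name `jumpTarget` and its parameters are hypothetical and are not taken from the patch; variants that scale the index differently (for example in microMIPS) are not covered.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): target of a MIPS32 J/JAL.
// instrAddr is the address of the jump itself; index26 is the 26-bit
// instr_index field from the encoding.
constexpr std::uint32_t jumpTarget(std::uint32_t instrAddr,
                                   std::uint32_t index26) {
  // The region is taken from the delay-slot address (jump address + 4).
  const std::uint32_t delaySlotAddr = instrAddr + 4;
  // Keep the top 4 bits: they select the 256 MB-aligned region.
  const std::uint32_t regionBase = delaySlotAddr & 0xF0000000u;
  // The low 28 bits come from the instruction index, scaled by 4.
  const std::uint32_t offset = (index26 & 0x03FFFFFFu) << 2;
  return regionBase | offset;
}
```

Ignoring the high bits would place every target in the lowest 256 MB region, which is only correct for code that happens to live there.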
-
Mirko Brkusanin authored
Fix comment. llvm-svn: 373901
-
Craig Topper authored
Move the erasing and iterator updating inside to match the other slow LEA function. I've adapted code from optTwoAddrLEA and basically rebuilt the implementation here. We do lose the kill flags now, just like optTwoAddrLEA. This runs late enough in the pipeline that this shouldn't really be a problem. llvm-svn: 373877
-
- Oct 06, 2019
-
-
Simon Pilgrim authored
If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element. This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted. Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper. Fixes PR43217 Differential Revision: https://reviews.llvm.org/D68544 llvm-svn: 373871
-
Simon Pilgrim authored
llvm-svn: 373870
-
Amy Kwan authored
This patch aims to group together the `CRNotPat` multi class instantiations within the `PPCInstrInfo.td` file. Integer instantiations of the multi class are grouped together into a section, and the floating point patterns are separated into their own section. Differential Revision: https://reviews.llvm.org/D67975 llvm-svn: 373869
-
Simon Pilgrim authored
Move the resolveTargetShuffleInputsAndMask call to after the shuffle mask combine before the undef/zero constant fold instead. llvm-svn: 373868
-
Simon Pilgrim authored
Replaces setTargetShuffleZeroElements with getTargetShuffleAndZeroables which reports the Zeroable elements but doesn't merge them into the decoded target shuffle mask (the merging has been moved up into getTargetShuffleInputs until we can get rid of it entirely). This is part of the work to fix PR43024 and allow us to use SimplifyDemandedElts to simplify shuffle chains - we need to get to a point where the target shuffle mask isn't adjusted by its source inputs but instead we cache them in a parallel Zeroable mask. llvm-svn: 373867
-
Craig Topper authored
[X86] Add custom type legalization for v16i64->v16i8 truncate and v8i64->v8i8 truncate when v8i64 isn't legal Summary: The default legalization for v16i64->v16i8 tries to create a multiple stage truncate concatenating after each stage and truncating again. But avx512 implements truncates with multiple uops. So it should be better to truncate all the way to the desired element size and then concatenate the pieces using unpckl instructions. This minimizes the number of 2 uop truncates. The unpcks are all single uop instructions. I tried to handle this by just custom splitting the v16i64->v16i8 shuffle. And hoped that the DAG combiner would leave the two halves in the state needed to make D68374 do the job for each half. This worked for the first half, but the second half got messed up. So I've implemented custom handling for v8i64->v8i8 when v8i64 needs to be split to produce the VTRUNCs directly. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68428 llvm-svn: 373864
-
Simon Pilgrim authored
[X86][SSE] resolveTargetShuffleInputs - call getTargetShuffleInputs instead of using setTargetShuffleZeroElements directly. NFCI. llvm-svn: 373855
-
Xiangling Liao authored
Summary: Replace 'isDarwin' with 'IsDarwin' based on LLVM naming convention. Differential Revision: https://reviews.llvm.org/D68336 llvm-svn: 373852
-
Simon Pilgrim authored
llvm-svn: 373849
-
Simon Pilgrim authored
We can make use of the Zeroable mask to indicate which elements we can safely set to zero instead of creating a target shuffle mask on the fly. This allows us to remove createTargetShuffleMask. This is part of the work to fix PR43024 and allow us to use SimplifyDemandedElts to simplify shuffle chains - we need to get to a point where the target shuffle mask isn't adjusted by its source inputs in setTargetShuffleZeroElements but instead we cache them in a parallel Zeroable mask. llvm-svn: 373846
-
David Zarzycki authored
llvm-svn: 373845
-
Matt Arsenault authored
llvm-svn: 373842
-
Matt Arsenault authored
llvm-svn: 373841
-
Matt Arsenault authored
llvm-svn: 373840
-
Matt Arsenault authored
llvm-svn: 373839
-
Matt Arsenault authored
Turn into shift and truncate. Doesn't yet handle pointers. llvm-svn: 373838
-
Matt Arsenault authored
This wasn't updated for the immarg handling change. llvm-svn: 373837
-
- Oct 05, 2019
-
-
Simon Pilgrim authored
As discussed on PR42025, with more complex boolean math we can end up with many truncations/extensions of the comparison results through each bitop. This patch handles the cases introduced in combineBitcastvxi1 by pushing the sign extension through the AND/OR/XOR ops so it's just the original SETCC ops that get extended. Differential Revision: https://reviews.llvm.org/D68226 llvm-svn: 373834
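The transform above relies on sign extension distributing over bitwise AND/OR/XOR. A minimal scalar sketch of that identity follows; the helper names are hypothetical, and 8-bit and 32-bit integers stand in for the narrow and wide vector element types.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extend an 8-bit value to 32 bits.
constexpr std::int32_t sext(std::int8_t v) { return v; }

// Checks that extending the result of a bitop equals applying the bitop
// to the already-extended operands, for AND, OR and XOR.
constexpr bool sextDistributesOverBitops(std::int8_t a, std::int8_t b) {
  return sext(static_cast<std::int8_t>(a & b)) == (sext(a) & sext(b)) &&
         sext(static_cast<std::int8_t>(a | b)) == (sext(a) | sext(b)) &&
         sext(static_cast<std::int8_t>(a ^ b)) == (sext(a) ^ sext(b));
}
```

Because the identity holds, the extension can be applied once to each comparison result instead of after every intermediate bitop.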
-
Simon Pilgrim authored
Rename some variables to match lowerShuffleAsRepeatedMaskAndLanePermute - prep work toward adding some equivalent sublane functionality. llvm-svn: 373832
-
Ana Pazos authored
simm9_lsb0 and simm12_lsb0 operand types were missing predicates. llvm-svn: 373812
-
- Oct 04, 2019
-
-
Huihui Zhang authored
../llvm-project/llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:355:48: warning: suggest braces around initialization of subobject [-Wmissing-braces]
      return addMappingFromTable<1>(MI, MRI, { 0 }, Table);
                                               ^
                                               {}
llvm-svn: 373784
-
Craig Topper authored
[X86] Remove isel patterns for mask vpcmpgt/vpcmpeq. Switch vpcmp to these based on the immediate in MCInstLower. The immediate form of VPCMP can represent these completely. The vpcmpgt/eq are just shorter encodings. This patch removes the isel patterns and just swaps the opcodes and removes the immediate in MCInstLower. This matches where we do some other encoding tricks. Removes over 10K bytes from the isel table. Differential Revision: https://reviews.llvm.org/D68446 llvm-svn: 373766
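The opcode swap described above can be pictured roughly as follows. This is a sketch under stated assumptions, not the MCInstLower code: the enum and function names are hypothetical, and it relies on the signed VPCMP predicate encoding in which immediate 0 means "equal" and immediate 6 means "not less-or-equal", i.e. "greater than".

```cpp
#include <cstdint>

// Hypothetical opcode tags, not LLVM's real opcode enumerators.
enum class MaskCmpOpcode { VPCMP, VPCMPEQ, VPCMPGT };

// Picks the shorter encoding when the predicate immediate allows it.
// For the two short forms the immediate operand would then be dropped.
constexpr MaskCmpOpcode selectShortForm(std::uint8_t imm) {
  switch (imm) {
  case 0:  // EQ
    return MaskCmpOpcode::VPCMPEQ;
  case 6:  // NLE, i.e. signed greater-than
    return MaskCmpOpcode::VPCMPGT;
  default: // every other predicate keeps the immediate form
    return MaskCmpOpcode::VPCMP;
  }
}
```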
-
Craig Topper authored
We already do this for ISD::TRUNCATE, but we can do the same for X86ISD::VTRUNC Differential Revision: https://reviews.llvm.org/D68432 llvm-svn: 373765
-
Dmitry Preobrazhensky authored
See bug 43484: https://bugs.llvm.org/show_bug.cgi?id=43484 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68349 llvm-svn: 373745
-
Dmitry Preobrazhensky authored
See bug 43485: https://bugs.llvm.org/show_bug.cgi?id=43485 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68348 llvm-svn: 373740
-
Tim Northover authored
Darwin platforms need the frame register to always point at a valid record even if it's not updated in a leaf function. Backtraces are more important than one extra GPR. llvm-svn: 373738
-
Dmitry Preobrazhensky authored
See bug 43483: https://bugs.llvm.org/show_bug.cgi?id=43483 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68347 llvm-svn: 373736
-
Matt Arsenault authored
This was always passing the destination flat address space, when it should be picking between the two valid source options. llvm-svn: 373716
-
Matt Arsenault authored
llvm-svn: 373715
-
Matt Arsenault authored
llvm-svn: 373714
-
David Zarzycki authored
llvm-svn: 373706
-
Piotr Sobczak authored
Summary: This patch fixes a potential aliasing problem in InstClassEnum, where local values were mixed with machine opcodes. Introducing InstSubclass will keep them separate and help extending InstClassEnum with other instruction types (e.g. MIMG) in the future. This patch also makes getSubRegIdxs() more concise. Reviewers: nhaehnle, arsenm, tstellar Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68384 llvm-svn: 373699
-
Shiva Chen authored
We would like to split the SP adjustment to reduce the number of instructions in the prologue and epilogue, as in the following case. This way, the offset of each callee-saved register fits in a single store.
    add sp,sp,-2032
    sw ra,2028(sp)
    sw s0,2024(sp)
    sw s1,2020(sp)
    sw s3,2012(sp)
    sw s4,2008(sp)
    add sp,sp,-64
Differential Revision: https://reviews.llvm.org/D68011 llvm-svn: 373688
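The arithmetic behind the split can be sketched as below. The names are hypothetical and the policy is an assumption chosen to reproduce the numbers in the example (2032 followed by 64, assuming a 16-byte stack alignment); the actual patch may apply additional conditions. The sketch also ignores the case where the second adjustment itself does not fit in a 12-bit immediate.

```cpp
#include <cstdint>

// Hypothetical result type: the two amounts by which SP is decremented.
struct SPAdjustSplit {
  std::int64_t first;
  std::int64_t second;
};

// If the whole frame size fits in a signed 12-bit immediate, adjust once.
// Otherwise adjust first by the largest aligned amount below 2048, so the
// callee-saved stores that follow can use 12-bit offsets from the new SP.
constexpr SPAdjustSplit splitSPAdjust(std::int64_t stackSize,
                                      std::int64_t stackAlign = 16) {
  return stackSize <= 2047
             ? SPAdjustSplit{stackSize, 0}
             : SPAdjustSplit{2048 - stackAlign,
                             stackSize - (2048 - stackAlign)};
}
```

For a 2096-byte frame this gives a first adjustment of 2032 and a second of 64, matching the sequence in the commit message.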
-
- Oct 03, 2019
-
-
Nick Desaulniers authored
Summary: Fixes pr/42576. Link: https://github.com/ClangBuiltLinux/linux/issues/697 Reviewers: t.p.northover Reviewed By: t.p.northover Subscribers: kristof.beyls, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D68356 llvm-svn: 373655
-
Jinsong Ji authored
Summary: This is a follow-up patch to https://reviews.llvm.org/D67595. Adjust the naming and the Commutable operands for additional patterns to make them easier to read. The testcase update also shows that we can save some unnecessary fmr as well. Reviewers: #powerpc, steven.zhang, hfinkel, nemanjai Reviewed By: #powerpc, nemanjai Subscribers: wuzish, hiraditya, kbarton, MaskRay, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68112 llvm-svn: 373652
-
Jordan Rupprecht authored
llvm-svn: 373646
-
Craig Topper authored
[X86] Add v32i8 shuffle lowering strategy to recognize two v4i64 vectors truncated to v4i8 and concatenated into the lower 8 bytes with undef/zero upper bytes. This patch recognizes the shuffle pattern we get from a v8i64->v8i8 truncate when v8i64 isn't a legal type. With VLX we can use two VTRUNCs, unpckldq, and an insert_subvector. Differential Revision: https://reviews.llvm.org/D68374 llvm-svn: 373645
-