- Apr 06, 2021
-
-
Florian Hahn authored
For VPWidenPHIRecipes that model all incoming values as VPValue operands, print those operands instead of printing the original PHI. D99294 updates recipes of reduction PHIs to use the VPValue for the incoming value from the loop backedge, making use of this new printing.
-
Dmitry Preobrazhensky authored
Corrected SMEM decoding when IMM=0 and OFFSET>127. Fixed bug 49819 (https://bugs.llvm.org/show_bug.cgi?id=49819). Differential Revision: https://reviews.llvm.org/D99804
-
Simon Pilgrim authored
After rG47321c311bdbe0145b9bf45d822185c37b19fa50 we promote vXi8 reductions to vXi16 to create a much faster PMULLW mul reduction, followed by a (free) truncation. This avoids the high cost of repeated vXi8 multiplications (which extend, multiply and truncate to/from vXi16 types). Fixes the missing vXi8 mul reduction vectorization in the 'mul16' test case from PR42674 (Comment #20).
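As an illustration of why the single final truncation is safe, the following minimal sketch models the lanes as unsigned byte values and compares an 8-bit wrapping product with a 16-bit wrapping product that is truncated once at the end. The function names are hypothetical and this is plain modular arithmetic, not the vectorizer or the X86 cost model.

```python
from functools import reduce

def mul_reduce_i8(lanes):
    """Multiply all lanes, wrapping to 8 bits after every step."""
    return reduce(lambda acc, x: (acc * x) & 0xFF, lanes, 1)

def mul_reduce_via_i16(lanes):
    """Multiply all lanes with 16-bit wraparound, then truncate once to 8 bits."""
    wide = reduce(lambda acc, x: (acc * (x & 0xFF)) & 0xFFFF, lanes, 1)
    return wide & 0xFF

# Hypothetical lane values; the low 8 bits agree for any input.
lanes = [3, 7, 201, 255, 17, 9, 129, 5]
assert mul_reduce_i8(lanes) == mul_reduce_via_i16(lanes)
```

The results agree because reducing modulo 2^16 and then modulo 2^8 is the same as reducing modulo 2^8 directly.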
-
Jay Foad authored
-
Thomas Preud'homme authored
The LLVM test CodeGen/AArch64/aarch64-tbz.ll tries to check for the absence of a sequence of instructions using several CHECK-NOT directives, one of which uses a variable defined in another. However, CHECK-NOT directives are checked independently, so the test uses a variable defined in a pattern that should not occur in the input. This commit removes the definition and uses of the variable so that each line is checked independently, making the check stronger than the current one. It also removes unnecessary regex matches for labels. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D99602
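The independence of the forbidden patterns can be sketched roughly as follows. This is a simplified model and not FileCheck itself: the helper name, the regexes, and the assembly snippets are all hypothetical and are not taken from aarch64-tbz.ll.

```python
import re

def matching_forbidden_patterns(region, patterns):
    """Return every forbidden pattern that matches somewhere in the region.

    Each pattern is searched on its own, so no pattern can rely on a value
    captured by another one.
    """
    return [p for p in patterns if re.search(p, region, re.MULTILINE)]

# Hypothetical self-contained patterns, one per forbidden instruction.
forbidden = [r"^\s*and\s+w[0-9]+,", r"^\s*cbz\s+w[0-9]+,"]

good_output = "tbz w0, #3, .LBB0_2\nret\n"
bad_output = "and w8, w0, #0x8\ncbz w8, .LBB0_2\n"

assert matching_forbidden_patterns(good_output, forbidden) == []
assert len(matching_forbidden_patterns(bad_output, forbidden)) == 2
```

Writing each pattern so that it stands alone means any one of the unwanted instructions is enough to fail the test.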
-
Simon Pilgrim authored
This is a mixture of instcombine/simplifycfg/instcombine to recognise and then remove the abs pattern
-
madhur13490 authored
This patch enhances hasAddressTaken() to ignore bitcasts used as the callee in a callbase instruction. Such bitcast usage doesn't take the address in a meaningful way. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D98884
-
Simon Pilgrim authored
As promised in D98866
-
Sjoerd Meijer authored
It is generally beneficial to prefer "movi d0, #0" over "fmov s0, wzr" as this is most efficient across all cores; it is recognised as a zeroing idiom. For newer cores, fmov instructions can also be eliminated early and there is no difference with movi, but some implementations lack this, so it is not true for other/older cores. Thus this standardises on using movi, as this should always give the same or better performance than the fmov with wzr. Differential Revision: https://reviews.llvm.org/D99586
-
Sam Parker authored
-
Sjoerd Meijer authored
This was using the .2d variant which zeros 128 bits, but using the .2s variant that zeros 64 bits is faster on some cores. This is a prep step for D99586 to always use movi for zeroing floats. Differential Revision: https://reviews.llvm.org/D99710
-
Jay Foad authored
Differential Revision: https://reviews.llvm.org/D99647
-
Yevgeny Rouban authored
Fixes commit 39e3e3aa: Redesign of PreserveCFG Checker
-
Yevgeny Rouban authored
The reason for the NewPM redesign is described in the commit cba3e783: [NewPM] Disable PreservedCFGChecker ... The checker introduces an internal custom CFG analysis that tracks the current up-to-date CFG snapshot. The analysis is invalidated along with any other CFG-related analysis (the key is CFGAnalyses). If the CFG analysis is not invalidated at a function pass exit, then the checker asserts that the CFG snapshot taken from this analysis is equal to a snapshot of the current CFG. Along the way: - the function CFG::printDiff() is simplified by removing function name calculation; the name is printed by the caller; - the CFG invalidation condition is fixed (see CFG::invalidate()); - StandardInstrumentations::registerCallbacks() gets an additional optional parameter of type FunctionAnalysisManager*, which is needed by the checker to get the custom CFG analysis; - several PM-related tests are updated to explicitly set -verify-cfg-preserved=1 as they need it. This patch is safe to land as the CFGChecker is left switched off (the option -verify-cfg-preserved is false by default). It will be switched on by a separate patch to minimize possible reverts. Reviewed By: skatkov, kuhar Differential Revision: https://reviews.llvm.org/D91327
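The snapshot comparison described above can be sketched in a few lines. This is a heavily simplified model under stated assumptions: the CFG is a plain dictionary from block name to successor list, a pass is a function returning True when it claims the CFG is preserved, and all names are hypothetical rather than the LLVM API.

```python
def snapshot(cfg):
    """Take an immutable copy of a CFG given as {block: [successors]}."""
    return {block: tuple(succs) for block, succs in cfg.items()}

def run_pass_with_cfg_check(cfg, pass_fn):
    """Run pass_fn(cfg); if it claims the CFG is preserved, verify that claim."""
    before = snapshot(cfg)
    claims_preserved = pass_fn(cfg)
    if claims_preserved:
        # Only checked when the pass did not invalidate the CFG "analysis".
        assert snapshot(cfg) == before, "pass claimed to preserve the CFG but changed it"
    return claims_preserved

def honest_pass(cfg):
    """Hypothetical pass that leaves the CFG alone and says so."""
    return True

def lying_pass(cfg):
    """Hypothetical pass that removes an edge but still claims preservation."""
    cfg["entry"] = ["exit"]
    return True

cfg = {"entry": ["body", "exit"], "body": ["exit"], "exit": []}
run_pass_with_cfg_check(cfg, honest_pass)
```

Running `lying_pass` through the same helper trips the assertion, which is the kind of mismatch the checker is meant to report.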
-
Serguei Katkov authored
[Statepoint] Factor out utility function to get non-foldable area of STATEPOINT-like instructions. NFC Reviewers: reames, dantrushin Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D99875
-
Yevgeny Rouban authored
Change several pass-sequence-sensitive tests to be indifferent to the PreserveCFGChecker by explicitly setting the option -verify-cfg-preserved=0. It is a preparation step that allows a redesign of PreserveCFGChecker. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D99878
-
Craig Topper authored
I missed a few intrinsics in 3dd4aa7d when I did this for masked loads and masked segment loads/stores. Found while trying to share more code between these custom isel functions.
-
Philip Reames authored
-
Arthur Eubanks authored
When we are able to SROA an alloca, we know all uses of it, meaning we don't have to preserve the invariant group intrinsics and metadata. It's possible that we could lose information regarding redundant loads/stores, but that's unlikely to have any real impact since right now the only user is Clang and vtables. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99760
-
Philip Reames authored
Use that fact to improve isKnownNonEqual.
-
Philip Reames authored
-
Stanislav Mekhanoshin authored
Fixes: SWDEV-280070 Differential Revision: https://reviews.llvm.org/D99902
-
Craig Topper authored
For some reason we only had 1 test case. This synchronizes the test with vslide1down so we have the same number of tests for both.
-
- Apr 05, 2021
-
-
Sanjay Patel authored
This is the sibling fix to c590a988 - as there, we can't substitute a vector value, because the equality compare replacement that we are trying requires that the comparison is true for the entire value. A vector select can be partly true and partly false.
-
Sanjay Patel authored
We need a sibling fix to c590a988 ( https://llvm.org/PR49832 ) to avoid miscompiling.
-
Craig Topper authored
The scalar type is already marked as XLenVT. The floating point version would need a different rule.
-
Craig Topper authored
It's a bit silly, but it allows us to write stricter type constraints for isel. There are still some extra type checks in the generated table due to some type inference limitations around HWMode.
-
Philip Reames authored
Several of these weren't testing what was intended.
-
Philip Reames authored
For use in an upcoming patch. Left out the phi case (which could otherwise fit in this framework) as it would cause infinite recursion in said patch. We can probably also leverage this in instcombine to ensure we keep the two sets of related analyses and transforms in sync.
-
Philip Reames authored
-
Craig Topper authored
FP would need VFSLIDE1UP_VF which uses an FP register.
-
Ricky Taylor authored
These look like $00A0cf for hex and %001010101 for binary. They are used in Motorola assembly syntax. Differential Revision: https://reviews.llvm.org/D98519
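To make the two literal forms concrete, here is a minimal sketch of how such tokens map to integer values. The function name and the decimal fallback are assumptions for illustration only; this is not the LLVM assembly lexer.

```python
import re

_HEX = re.compile(r"\$([0-9A-Fa-f]+)")
_BIN = re.compile(r"%([01]+)")

def parse_motorola_int(token):
    """Parse '$...' as hexadecimal and '%...' as binary; otherwise try decimal."""
    m = _HEX.fullmatch(token)
    if m:
        return int(m.group(1), 16)
    m = _BIN.fullmatch(token)
    if m:
        return int(m.group(1), 2)
    return int(token, 10)  # assumed fallback, not described in the commit

assert parse_motorola_int("$00A0cf") == 0x00A0CF
assert parse_motorola_int("%001010101") == 0b001010101
```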
-
Jennifer Yu authored
Added basic parsing/sema/serialization support for the 'nocontext' clause. Differential Revision: https://reviews.llvm.org/D99848
-
Nico Weber authored
-
Tom Stellard authored
This reverts commit 43ceb74e. This caused some build failures: https://bugs.llvm.org/show_bug.cgi?id=49818
-
Tom Stellard authored
This reverts commit d66f9c4f. This was a follow-up fix for 43ceb74e, which will be reverted.
-
Cyndy Ishida authored
TextAPI/ELF has moved out into InterfaceStubs, so there's no longer a need to separate out TextAPI between formats. Reviewed By: ributzka, int3, #lld-macho Differential Revision: https://reviews.llvm.org/D99811
-
LLVM GN Syncbot authored
-
Ta-Wei Tu authored
If only the second candidate loop is guarded while the first one is not, fusing the two loops might not be valid, but this check is currently missing. Fixes https://bugs.llvm.org/show_bug.cgi?id=48060 Reviewed By: sidbav Differential Revision: https://reviews.llvm.org/D99716
-
Fraser Cormack authored
This patch supports bitcasts from scalar types to fixed-length vectors and vice versa. It custom-lowers and custom-legalizes them to EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT operations, using single-element vectors to hold the scalar where appropriate. Previously, some of these would fail to select, while others would be expanded through stack loads and stores. Effort was made to ensure the codegen avoids the stack for both legal and illegal scalar types. Some of the codegen could be improved, but at first glance it looks like a general optimization of EXTRACT_VECTOR_ELT when extracting an i64 element on RV32. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99667
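For readers unfamiliar with scalar/vector bitcasts, the sketch below shows what such a cast means at the value level, assuming a little-endian target and a 32-bit scalar viewed as four 8-bit lanes. The helper names are hypothetical, and this models only the bit reinterpretation, not the EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT lowering in the patch.

```python
import struct

def bitcast_i32_to_v4i8(x):
    """Reinterpret a 32-bit scalar as four 8-bit lanes (little-endian, lane 0 lowest)."""
    return list(struct.pack("<I", x & 0xFFFFFFFF))

def bitcast_v4i8_to_i32(lanes):
    """Reinterpret four 8-bit lanes as one 32-bit scalar (little-endian)."""
    return struct.unpack("<I", bytes(lanes))[0]

lanes = bitcast_i32_to_v4i8(0x11223344)
assert lanes == [0x44, 0x33, 0x22, 0x11]
assert bitcast_v4i8_to_i32(lanes) == 0x11223344
```

The round trip is lossless because a bitcast only reinterprets the same bits and never changes them.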
-