- Jun 01, 2020
-
-
Sanjay Patel authored
SimplifyDemandedVectorElts() bails out on ScalableVectorType anyway, but we can exit faster with the external check. Move this to a helper function because there are likely other vector folds that we can try here.
-
Hiroshi Yamauchi authored
Summary: The working set size heuristics (ProfileSummaryInfo::hasHugeWorkingSetSize) under the partial sample PGO may not be accurate because the profile is partial and the number of hot profile counters in the ProfileSummary may not reflect the actual working set size of the program being compiled. To improve this, the (approximated) ratio of the number of profile counters of the program being compiled to the number of profile counters in the partial sample profile is computed (which is called the partial profile ratio) and the working set size of the profile is scaled by this ratio to reflect the working set size of the program being compiled and used for the working set size heuristics. The partial profile ratio is approximated based on the number of the basic blocks in the program and the NumCounts field in the ProfileSummary and computed through the thin LTO indexing. This means that there is the limitation that the scaled working set size is available to the thin LTO post link passes only. Reviewers: davidxl Subscribers: mgorny, eraman, hiraditya, steven_wu, dexonsmith, arphaman, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79831
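The scaling described in this message can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: the function names, the threshold parameter and the use of floating point are assumptions made for illustration. Only the idea (ratio of program counters to profile counters, then scaling the hot counter count by that ratio) comes from the message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: approximate the partial profile ratio as
// (basic blocks in the program) / (NumCounts recorded in the profile).
double partialProfileRatio(uint64_t NumBlocksInProgram,
                           uint64_t NumCountsInProfile) {
  if (NumCountsInProfile == 0)
    return 0.0; // No profile counters: nothing to scale by.
  return static_cast<double>(NumBlocksInProgram) /
         static_cast<double>(NumCountsInProfile);
}

// Hypothetical helper: scale the number of hot counters in the profile so
// that it reflects the program being compiled rather than the whole profile.
uint64_t scaledWorkingSetSize(uint64_t NumHotCounts, double Ratio) {
  return static_cast<uint64_t>(static_cast<double>(NumHotCounts) * Ratio);
}

// Hypothetical heuristic: the working set is "huge" if the scaled hot
// counter count exceeds a threshold chosen by the caller.
bool hasHugeWorkingSet(uint64_t ScaledHotCounts, uint64_t Threshold) {
  return ScaledHotCounts > Threshold;
}
```

With a program of 1000 blocks and a profile of 4000 counters, the ratio is 0.25, so 20000 hot counters in the profile scale down to 5000 for this program. A threshold that the unscaled count would exceed may then no longer be exceeded.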
-
Stanislav Mekhanoshin authored
Differential Revision: https://reviews.llvm.org/D79218
-
Sanjay Patel authored
As discussed in https://bugs.llvm.org/show_bug.cgi?id=45951 and D80584, the name 'tmp' is almost always a bad choice, but we have a legacy of regression tests with that name because it was baked into utils/update_test_checks.py. This change makes -instnamer more consistent (already using "arg" and "bb", the common LLVM shorthand). And it avoids the conflict in telling users of the FileCheck script to run "-instnamer" to create a better regression test and having that cause a warn/fail in update_test_checks.py.
-
Ehud Katz authored
-
Ehud Katz authored
This is a reimplementation of the `orderNodes` function, as the old implementation didn't take into account all cases. The new implementation uses SCCs instead of Loops to take account of irreducible loops. Fixes PR41509. Differential Revision: https://reviews.llvm.org/D79037
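To see why SCCs can cover cases that natural loops miss, here is a self-contained sketch of Tarjan's SCC algorithm on a plain adjacency list. It is not the `orderNodes` implementation; the graph encoding and the names `computeSCCs` and `irreducibleExample` are assumptions made for illustration. The example graph has a cycle between nodes 1 and 2 that can be entered at either node, which is the usual shape of an irreducible loop.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Returns one component id per node; nodes in the same SCC share an id.
std::vector<int> computeSCCs(const std::vector<std::vector<int>> &Succs) {
  const int N = static_cast<int>(Succs.size());
  std::vector<int> Index(N, -1), Low(N, 0), Comp(N, -1), Stack;
  std::vector<bool> OnStack(N, false);
  int NextIndex = 0, NextComp = 0;

  std::function<void(int)> Visit = [&](int V) {
    Index[V] = Low[V] = NextIndex++;
    Stack.push_back(V);
    OnStack[V] = true;
    for (int W : Succs[V]) {
      if (Index[W] == -1) {
        Visit(W); // Tree edge: recurse, then propagate the low-link.
        Low[V] = std::min(Low[V], Low[W]);
      } else if (OnStack[W]) {
        Low[V] = std::min(Low[V], Index[W]); // Edge back into the stack.
      }
    }
    if (Low[V] == Index[V]) { // V is the root of an SCC: pop its members.
      int W;
      do {
        W = Stack.back();
        Stack.pop_back();
        OnStack[W] = false;
        Comp[W] = NextComp;
      } while (W != V);
      ++NextComp;
    }
  };

  for (int V = 0; V < N; ++V)
    if (Index[V] == -1)
      Visit(V);
  return Comp;
}

// Entry 0 branches to both 1 and 2; 1 and 2 branch to each other; 2 exits to 3.
std::vector<std::vector<int>> irreducibleExample() {
  return {{1, 2}, {2}, {1, 3}, {}};
}
```

In this example nodes 1 and 2 end up in the same component even though neither dominates the other, so a loop-header-based analysis would not report them as a natural loop.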
-
- May 30, 2020
-
-
Whitney Tsang authored
Differential Revision: https://reviews.llvm.org/D80477.
-
zoecarver authored
Adds a simple fast-path check for the pattern: `v = load ptr; store v to ptr`. I took the tests from the Bugzilla post; I can add more if needed (but I think these should be sufficient). Refs: https://bugs.llvm.org/show_bug.cgi?id=45795 Differential Revision: https://reviews.llvm.org/D79391
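The pattern can be illustrated on a toy instruction list. This sketch is not the check added by the patch: the `Inst` encoding, the restriction to directly adjacent instructions, and the volatile handling are simplifying assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Opcode { Load, Store, Other };

// Toy instruction (hypothetical encoding, not LLVM IR).
struct Inst {
  Opcode Op;
  int Ptr;         // Id of the pointer operand.
  int Value;       // Load: id of the value defined; Store: id of the value stored.
  bool IsVolatile; // Volatile accesses must never be treated as no-ops.
};

// Returns true if Block[StoreIdx] stores back the value that the immediately
// preceding instruction loaded from the same pointer, i.e. the store does not
// change memory and is a candidate for removal.
bool isStoreOfJustLoadedValue(const std::vector<Inst> &Block,
                              std::size_t StoreIdx) {
  if (StoreIdx == 0 || StoreIdx >= Block.size())
    return false;
  const Inst &St = Block[StoreIdx];
  const Inst &Ld = Block[StoreIdx - 1];
  if (St.Op != Opcode::Store || Ld.Op != Opcode::Load)
    return false;
  if (St.IsVolatile || Ld.IsVolatile)
    return false;
  return St.Ptr == Ld.Ptr && St.Value == Ld.Value;
}
```

A real pass would also have to prove that nothing between the load and the store can write to the location; requiring adjacency sidesteps that question in this sketch.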
-
Christopher Tetreault authored
Reviewers: efriedma, aymanmus, c-rhodes, david-arm Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80332
-
Valery N Dmitriev authored
relevant aggregate build instructions only (UserCost). Users are detected with the findBuildAggregate routine, and the trick is that the subsequent SLP vectorization may end up vectorizing the entire list in smaller chunks. The cost adjustment is then applied to individual chunks, and these adjustments obviously have to be smaller than the entire aggregate build cost. Differential Revision: https://reviews.llvm.org/D80773
-
Christopher Tetreault authored
Reviewers: efriedma, david-arm, fpetrogalli, spatel Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80334
-
Christopher Tetreault authored
Reviewers: efriedma, fpetrogalli, kmclaughlin Reviewed By: fpetrogalli Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80335
-
Christopher Tetreault authored
Reviewers: efriedma, c-rhodes, sdesmalen, xbolva00 Reviewed By: c-rhodes Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80337
-
- May 29, 2020
-
-
Stanislav Mekhanoshin authored
This reverts commit f66a43c1.
-
Stanislav Mekhanoshin authored
Differential Revision: https://reviews.llvm.org/D79218
-
Christopher Tetreault authored
Reviewers: efriedma, c-rhodes, david-arm, fhahn Reviewed By: david-arm Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80339
-
Ehud Katz authored
Prevent `invertCondition` from creating the inversion instruction, in case the given value is an argument which has already been inverted. Note that this approach has already been taken in case the given value is an instruction (and not an argument). Differential Revision: https://reviews.llvm.org/D80399
-
Paul Robinson authored
Fixes PR46002.
-
Florian Hahn authored
Currently SCCP does not widen PHIs, stores or along call edges (arguments/return values), but does widen on operations that directly extend ranges (like binary operators). This means PHIs, stores and call edges are not pessimized by widening currently, while binary operators are. The main reason for widening operators initially was that opting out for certain operations was more straightforward in the initial implementation (and it did not matter too much, as range support initially was only implemented for a very limited set of operations). During the discussion in D78391, it was suggested to consider flipping widening to PHIs, stores and along call edges. After adding support for tracking the number of range extensions in ValueLattice, limiting the number of range extensions per value is straightforward. This patch introduces a MaxWidenSteps option to the MergeOptions, limiting the number of range extensions per value. For PHIs, it seems natural to allow an extension for each (active) incoming value plus 1. For the other cases, an arbitrary limit of 10 has been chosen initially. It would potentially make sense to set it depending on the users of a function/global, but that still needs investigating. This potentially leads to more state changes and longer compile times.
The results look quite promising (MultiSource, SPEC):

Same hash: 179 (filtered out)
Remaining: 58
Metric: sccp.IPNumInstRemoved

Program                                          base  widen-phi    diff
test-suite...ks/Prolangs-C/agrep/agrep.test     58.00      82.00   41.4%
test-suite...marks/SciMark2-C/scimark2.test     32.00      43.00   34.4%
test-suite...rks/FreeBench/mason/mason.test      6.00       8.00   33.3%
test-suite...langs-C/football/football.test    104.00     128.00   23.1%
test-suite...cations/hexxagon/hexxagon.test     36.00      42.00   16.7%
test-suite...CFP2000/177.mesa/177.mesa.test    214.00     249.00   16.4%
test-suite...ngs-C/assembler/assembler.test     14.00      16.00   14.3%
test-suite...arks/VersaBench/dbms/dbms.test     10.00      11.00   10.0%
test-suite...oxyApps-C++/miniFE/miniFE.test     43.00      47.00    9.3%
test-suite...ications/JM/ldecod/ldecod.test    179.00     195.00    8.9%
test-suite...CFP2006/433.milc/433.milc.test    249.00     265.00    6.4%
test-suite.../CINT2000/175.vpr/175.vpr.test     98.00     104.00    6.1%
test-suite...peg2/mpeg2dec/mpeg2decode.test     70.00      74.00    5.7%
test-suite...CFP2000/188.ammp/188.ammp.test     71.00      75.00    5.6%
test-suite...ce/Benchmarks/PAQ8p/paq8p.test    111.00     117.00    5.4%
test-suite...ce/Applications/Burg/burg.test     41.00      43.00    4.9%
test-suite...000/197.parser/197.parser.test     66.00      69.00    4.5%
test-suite...tions/lambda-0.1.3/lambda.test     23.00      24.00    4.3%
test-suite...urce/Applications/lua/lua.test    301.00     313.00    4.0%
test-suite...TimberWolfMC/timberwolfmc.test     76.00      79.00    3.9%
test-suite...lications/ClamAV/clamscan.test    991.00    1030.00    3.9%
test-suite...plications/d/make_dparser.test     53.00      55.00    3.8%
test-suite...fice-ispell/office-ispell.test     83.00      86.00    3.6%
test-suite...lications/obsequi/Obsequi.test     28.00      29.00    3.6%
test-suite.../Prolangs-C/bison/mybison.test     56.00      58.00    3.6%
test-suite.../CINT2000/254.gap/254.gap.test    170.00     176.00    3.5%
test-suite.../Applications/lemon/lemon.test     30.00      31.00    3.3%
test-suite.../CINT2000/176.gcc/176.gcc.test   1202.00    1240.00    3.2%
test-suite...pplications/treecc/treecc.test     79.00      81.00    2.5%
test-suite...chmarks/MallocBench/gs/gs.test    357.00     366.00    2.5%
test-suite...eeBench/analyzer/analyzer.test    103.00     105.00    1.9%
test-suite...T2006/445.gobmk/445.gobmk.test   1697.00    1724.00    1.6%
test-suite...006/453.povray/453.povray.test   1812.00    1839.00    1.5%
test-suite.../Benchmarks/Bullet/bullet.test    337.00     342.00    1.5%
test-suite.../CINT2000/252.eon/252.eon.test    426.00     432.00    1.4%
test-suite...T2000/300.twolf/300.twolf.test    214.00     217.00    1.4%
test-suite...pplications/oggenc/oggenc.test    244.00     247.00    1.2%
test-suite.../CINT2006/403.gcc/403.gcc.test   4008.00    4055.00    1.2%
test-suite...T2006/456.hmmer/456.hmmer.test    175.00     177.00    1.1%
test-suite...nal/skidmarks10/skidmarks.test    430.00     434.00    0.9%
test-suite.../Applications/sgefa/sgefa.test    115.00     116.00    0.9%
test-suite...006/447.dealII/447.dealII.test   1082.00    1091.00    0.8%
test-suite...6/482.sphinx3/482.sphinx3.test    141.00     142.00    0.7%
test-suite...ocBench/espresso/espresso.test    152.00     153.00    0.7%
test-suite...3.xalancbmk/483.xalancbmk.test   4003.00    4025.00    0.5%
test-suite...lications/sqlite3/sqlite3.test    548.00     551.00    0.5%
test-suite...marks/7zip/7zip-benchmark.test   5522.00    5551.00    0.5%
test-suite...nsumer-lame/consumer-lame.test    208.00     209.00    0.5%
test-suite...:: External/Povray/povray.test   1556.00    1563.00    0.4%
test-suite...000/186.crafty/186.crafty.test    298.00     299.00    0.3%
test-suite.../Applications/SPASS/SPASS.test   2019.00    2025.00    0.3%
test-suite...ications/JM/lencod/lencod.test   8427.00    8449.00    0.3%
test-suite...6/464.h264ref/464.h264ref.test   6797.00    6813.00    0.2%
test-suite...6/471.omnetpp/471.omnetpp.test    431.00     430.00   -0.2%
test-suite...006/450.soplex/450.soplex.test    446.00     447.00    0.2%
test-suite...0.perlbench/400.perlbench.test   1729.00    1727.00   -0.1%
test-suite...000/255.vortex/255.vortex.test   3815.00    3819.00    0.1%

Reviewers: efriedma, nikic, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79036
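The idea of capping range extensions per value can be sketched without any LLVM types. The following is an illustration only, not the ValueLattice code: the `RangeValue` struct, the `mergeIn` signature and the `phiWidenLimit` helper are hypothetical names, and a real lattice has more states than a range and "overdefined".

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy lattice value: an inclusive integer range plus a count of how often
// the range has been extended so far.
struct RangeValue {
  int64_t Lo;
  int64_t Hi;
  unsigned NumExtensions;
  bool Overdefined;
  RangeValue(int64_t L, int64_t H)
      : Lo(L), Hi(H), NumExtensions(0), Overdefined(false) {}
};

// Merge the range [Lo, Hi] into V. Each merge that actually grows the range
// counts as one extension; once more than MaxWidenSteps extensions have been
// requested, the value gives up and becomes overdefined.
// Returns true if V changed.
bool mergeIn(RangeValue &V, int64_t Lo, int64_t Hi, unsigned MaxWidenSteps) {
  if (V.Overdefined)
    return false;
  const int64_t NewLo = std::min(V.Lo, Lo);
  const int64_t NewHi = std::max(V.Hi, Hi);
  if (NewLo == V.Lo && NewHi == V.Hi)
    return false; // Already covered: no extension needed.
  if (++V.NumExtensions > MaxWidenSteps) {
    V.Overdefined = true;
    return true;
  }
  V.Lo = NewLo;
  V.Hi = NewHi;
  return true;
}

// Limit suggested in the message for PHIs: one extension per (active)
// incoming value, plus 1.
unsigned phiWidenLimit(unsigned NumActiveIncoming) {
  return NumActiveIncoming + 1;
}
```

The cap bounds how many times a single value can change state, which is what keeps the extra widening from causing unbounded iteration in the solver.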
-
David Sherwood authored
Whilst trying to compile this test to assembly: CodeGen/aarch64-sve-intrinsics/acle_sve_reinterpret.c I discovered some warnings were firing in InstCombiner::visitBitCast due to calls to getNumElements() for scalable vector types. These calls only really made sense for fixed width vectors so I have fixed up the code appropriately. Differential Revision: https://reviews.llvm.org/D80559
-
Whitney Tsang authored
removed in https://reviews.llvm.org/D80477
-
Whitney Tsang authored
latch. Summary: Remove the limitation in LoopUnrollPass that exiting block must be either header or latch. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto, fhahn, efriedma Reviewed By: etiotto, fhahn, efriedma Subscribers: efriedma, lkail, xbolva00, hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80477
-
- May 28, 2020
-
-
Philip Reames authored
These are the two operand sets which are expected to survive more than another week or so. Instead of bothering to update the deopt and gc-transition operands, we'll just wait until those are removed and delete the code. For those following along, this is likely to be the last (major) change in this sequence for about a week. I want to wait until all of this has been merged downstream to ensure I haven't introduced any bugs (and migrate some downstream code to the new interfaces). Once that's done, we should be able to delete Statepoint/ImmutableStatepoint without too much work.
-
aartbik authored
Summary: Only column-major was supported so far. This adds row-major support as well. Note that we probably also want very efficient SIMD implementations for the various target platforms. Bug: https://bugs.llvm.org/show_bug.cgi?id=46085 Reviewers: nicolasvasilache, reidtatge, bkramer, fhahn, ftynse, andydavis1, craig.topper, dcaballe, mehdi_amini, anemet Reviewed By: fhahn Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80673
-
Whitney Tsang authored
This reverts commit 28105822. Revert until http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/7334 is resolved.
-
Whitney Tsang authored
latch. Summary: Remove the limitation in LoopUnrollPass that exiting block must be either header or latch. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto, fhahn, efriedma Reviewed By: etiotto, fhahn, efriedma Subscribers: efriedma, lkail, xbolva00, hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80477
-
Philip Reames authored
Continues from D80598. The key point of the change is to default to using operand bundles instead of the inline length prefix argument lists for statepoint nodes. An important subtlety to note is that the presence of a bundle has semantic meaning, even if it is empty. As such, we need to make a somewhat deeper change to the interface than is first obvious. Existing code treats statepoint deopt arguments and the deopt bundle operands differently during inlining. The former is ignored (resulting in caller state being dropped), the latter is merged. We can't preserve the old behaviour for calls with deopt fed to RS4GC and then inlining, but we can avoid the no-deopt case changing. At least in internal testing, that seems to be the important one. (I'd argue the "stop merging after RS4GC" behaviour for the former was always "unexpected", but that the behaviour for non-deopt calls actually makes sense.) Differential Revision: https://reviews.llvm.org/D80674
-
Hiroshi Yamauchi authored
Summary: Follow up D79751 and put the instrumentation / value collection side (in addition to the optimization side) behind the flag as well. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80646
-
Sidharth Baveja authored
Summary: The following code from /llvm/lib/Transforms/Utils/LoopUnrollAndJam.cpp can be used by other transformations:

    while (!MergeBlocks.empty()) {
      BasicBlock *BB = *MergeBlocks.begin();
      BranchInst *Term = dyn_cast<BranchInst>(BB->getTerminator());
      if (Term && Term->isUnconditional() && L->contains(Term->getSuccessor(0))) {
        BasicBlock *Dest = Term->getSuccessor(0);
        BasicBlock *Fold = Dest->getUniquePredecessor();
        if (MergeBlockIntoPredecessor(Dest, &DTU, LI)) {
          // Don't remove BB and add Fold as they are the same BB
          assert(Fold == BB);
          (void)Fold;
          MergeBlocks.erase(Dest);
        } else
          MergeBlocks.erase(BB);
      } else
        MergeBlocks.erase(BB);
    }

Hence it should be separated into its own utility function. Authored By: sidbav Reviewer: Whitney, Meinersbur, asbirlea, dmgreen, etiotto Reviewed By: asbirlea Subscribers: hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80583
-
Matt Arsenault authored
This one is slightly odd since it counts as an address expression, which previously could never fail. Allow the existing TTI hook to return the value to use, and re-use it to decide how to handle ptrmask. Handles the no-op addrspacecasts for AMDGPU. We could probably do something better based on analysis of the mask value based on the address space, but leave that for now.
-
Kazu Hirata authored
Summary: This patch replaces push_back with emplace_back where appropriate. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80688
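As a generic illustration of the kind of change described (not code taken from the patch), the sketch below contrasts the two calls on a vector of pairs. The function name `buildWithBoth` is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

std::vector<std::pair<std::string, int>> buildWithBoth() {
  std::vector<std::pair<std::string, int>> V;
  // push_back: a temporary pair is built first, then moved into the vector.
  V.push_back(std::make_pair(std::string("a"), 1));
  // emplace_back: the arguments are forwarded to the pair's constructor and
  // the element is constructed directly in the vector's storage.
  V.emplace_back("b", 2);
  return V;
}
```

Both calls leave the vector in the same state; `emplace_back` mainly pays off when the element would otherwise have to be spelled out as a temporary at the call site.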
-
Philip Reames authored
Now that all of the statepoint related routines have classes with isa support, let's clean up. I'm leaving the (dead) utilities in tree for a few days so that I can do the same cleanup downstream without breakage.
-
Layton Kifer authored
Currently we can only eliminate call return pairs that either return the result of the call or a dynamic constant. This patch removes that limitation. Differential Revision: https://reviews.llvm.org/D79660
-
- May 27, 2020
-
-
Mircea Trofin authored
ProfileSummaryInfo is updated seldom, as a result of very specific triggers. This patch clearly demarcates state updates from read-only uses. This, arguably, improves readability and maintainability.
-
Rithik Sharma authored
code motion Summary: Currently isSafeToMoveBefore uses DFS numbering for determining the relative position of instruction and insert point which is not always correct. This PR proposes the use of Dominator Tree depth for the same. If a node is at a higher level than the insert point then it is safe to say that we want to move in the forward direction. Authored By: RithikSharma Reviewer: Whitney, nikic, bmahjour, etiotto, fhahn Reviewed By: Whitney Subscribers: fhahn, hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D80084
-
David Green authored
This makes sure to correctly register the loop info of the children of unroll and jammed loops. It re-uses some code from the unroller for registering subloops. Differential Revision: https://reviews.llvm.org/D80619
-
Florian Hahn authored
If it turns out that we can do runtime checks, but there are no runtime-checks to generate, set RtCheck.Need to false. This can happen if we can prove statically that the pointers passed in to canCheckPtrAtRT do not alias. This should not change any results, but allows us to skip some work and assert that runtime checks are generated, if LAA indicates that runtime checks are required. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D79969 Note: This is a recommit of 259abfc7, with some suggested renaming.
-
Florian Hahn authored
This reverts commit 259abfc7. Reverting this, as I missed a case where we return without setting RtCheck.Need.
-
Florian Hahn authored
If it turns out that we can do runtime checks, but there are no runtime-checks to generate, set RtCheck.Need to false. This can happen if we can prove statically that the pointers passed in to canCheckPtrAtRT do not alias. This should not change any results, but allows us to skip some work and assert that runtime checks are generated, if LAA indicates that runtime checks are required. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D79969
-
Daniil Suchkov authored
Currently if instructions defined in a block are used in unreachable blocks and SimpleLoopUnswitch attempts deleting the block, it triggers assertion "Uses remain when a value is destroyed!". This patch fixes it by replacing all uses of instructions from BB with undefs before BB deletion. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D80551
-