- Jun 17, 2020
-
-
Nick Desaulniers authored
Summary: llvm::SplitEdge was failing an assertion that the BasicBlock only had one successor (for BasicBlocks terminated by CallBrInst, we typically have multiple successors). It was surprising that the earlier call to SplitCriticalEdge did not handle the critical edge (there was an early return). Removing that triggered another assertion relating to creating a BlockAddress for a BasicBlock that did not (yet) have a parent, which is a simple order of operations issue in llvm::SplitCriticalEdge (a freshly constructed BasicBlock must be inserted into a Function's basic block list to have a parent). Thanks to @nathanchance for the report. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018 Reviewers: craig.topper, jyknight, void, fhahn, efriedma Reviewed By: efriedma Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D81607
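For readers unfamiliar with the term, a critical edge runs from a block with more than one successor to a block with more than one predecessor. The sketch below is a minimal illustration of that definition and of splitting such an edge. It is not LLVM's implementation: the CFG is a plain dictionary, and `critical_edges` and `split_edge` are hypothetical helper names. The only link to the fix described above is the ordering inside `split_edge`, where the new block is registered in the function before any edge is pointed at it.

```python
from collections import defaultdict


def critical_edges(succs):
    """Return (src, dst) edges where src has >1 successor and dst has >1 predecessor."""
    preds = defaultdict(set)
    for src, dsts in succs.items():
        for dst in dsts:
            preds[dst].add(src)
    return [
        (src, dst)
        for src, dsts in succs.items()
        for dst in dsts
        if len(set(dsts)) > 1 and len(preds[dst]) > 1
    ]


def split_edge(succs, src, dst, new_name):
    """Place a new block on the edge src -> dst (mutates succs in place)."""
    # Register the new block first, so it "has a parent" before it is referenced.
    succs[new_name] = [dst]
    # Only then redirect the edge from src to the new block.
    succs[src] = [new_name if s == dst else s for s in succs[src]]
    return new_name


# Hypothetical CFG: "entry" branches to "a" and "b"; "a" falls through to "b".
cfg = {"entry": ["a", "b"], "a": ["b"], "b": []}
print(critical_edges(cfg))
split_edge(cfg, "entry", "b", "entry.b")
print(critical_edges(cfg))
```

After the split, no edge in this small graph satisfies both conditions any more.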
-
Davide Italiano authored
When the zext was promoted, it used to retain its original location, which pessimized the debugging experience by causing an unexpected jump when stepping at -Og. Fixes https://bugs.llvm.org/show_bug.cgi?id=46120 (which also contains a full C repro). Differential Revision: https://reviews.llvm.org/D81437
-
Kirill Naumov authored
This reverts commit 37e06e8f.
-
Kirill Naumov authored
This reverts commit 52b0db22.
-
Kirill Naumov authored
This reverts commit 34fba68d.
-
Kirill Naumov authored
If a GEP instruction has only constants as its arguments, it should be recognized as a constant. For now, a flag ("disable-gep-const-evaluation", off by default) is also added to turn off this simplification in case it causes regressions. Once I have gathered the data needed on the effectiveness of this simplification, the flag will be deleted. Reviewers: apilipenko, davidxl, mtrofin Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D81026
-
Kirill Naumov authored
This patch enables printing of constants to show which instructions were constant-folded. This is needed for tests and for better visual analysis of the inliner's work. Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D81024
-
Kirill Naumov authored
This class makes it possible to see the inliner's decisions, for better verification and testing of optimizations. To use it, pass the flag -passes="print<inline-cost>". Reviewers: apilipenko, mtrofin, davidxl, fedor.sergeev Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D81743
-
Florian Hahn authored
-
Sjoerd Meijer authored
Fixed ARM regression test. Please see the original commit message rG47650451738c for details.
-
Florian Hahn authored
-
Sam Parker authored
I originally reverted the patch because it was causing performance issues, but now I think it's just enabling simplify-cfg to do something that I don't want instead :) Sorry for the noise. This reverts commit 3e39760f.
-
Hans Wennborg authored
The invoke instruction can have profile metadata with branch_weights, which does not make sense for a call instruction and will be rejected by the verifier. Differential revision: https://reviews.llvm.org/D81996
-
Sjoerd Meijer authored
This reverts commit 47650451 while I investigate the build bot failures.
-
Sjoerd Meijer authored
This emits the new IR intrinsic @llvm.get.active.lane.mask for tail-folded vectorised loops if the intrinsic is supported by the backend, which is checked by querying the TargetTransformInfo hook emitGetActiveLaneMask. This intrinsic creates a mask representing active and inactive vector lanes, which is used by the masked load/store instructions that are created for tail-folded loops. The semantics of @llvm.get.active.lane.mask are described in LangRef: https://llvm.org/docs/LangRef.html#llvm-get-active-lane-mask-intrinsics This intrinsic is also used to provide a hint to the backend. That is, the second argument of the intrinsic represents the back-edge taken count of the loop. For MVE, for example, we use that to set up tail-predication, which is a new form of predication in MVE for vector loops that implicitly predicates the last vector loop iteration by implicitly setting active/inactive lanes, i.e. the tail loop is predicated. In order to set up a tail-predicated vector loop, we need to know the number of data elements processed by the vector loop, which corresponds to the trip count of the scalar loop, which we can now reconstruct using @llvm.get.active.lane.mask. Differential Revision: https://reviews.llvm.org/D79100
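The lane mask and the trip-count reconstruction described above can be modelled in a few lines. This is only a sketch under stated assumptions: it takes the second operand to be the back-edge taken count, as this commit message says, and treats a lane as active while its scalar iteration index does not exceed that count. The linked LangRef entry remains the authority on the intrinsic's exact semantics, which may differ. The function names are hypothetical and integer overflow is ignored.

```python
def active_lane_mask(base, btc, vf):
    """Lane i is active while scalar iteration base + i does not exceed the
    back-edge taken count (assumed semantics, see the lead-in)."""
    return [base + i <= btc for i in range(vf)]


def trip_count(btc):
    """The scalar loop runs once more than its back edge is taken."""
    return btc + 1


def masks_for_loop(btc, vf):
    """One mask per vector iteration of a tail-folded loop."""
    return [active_lane_mask(base, btc, vf) for base in range(0, trip_count(btc), vf)]


# A scalar loop of 10 iterations (back-edge taken count 9), vector width 4.
for mask in masks_for_loop(9, 4):
    print(mask)
```

In this model the number of active lanes summed over all vector iterations equals the scalar trip count, which is the quantity a tail-predicating backend needs.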
-
Florian Hahn authored
Currently, load instructions are added to the cache for invariant pointer group dependencies, but only pointer values are removed. That leads to dangling AssertingVHs in the test case below, where we delete a load from an invariant pointer group. We should also remove the entries from the cache. Fixes PR46054. Reviewers: efriedma, hfinkel, asbirlea Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81726
-
Max Kazantsev authored
-
- Jun 16, 2020
-
-
Davide Italiano authored
-
Florian Hahn authored
Some tests were missing alignment info. Subsequent changes properly preserve the set alignment. Set it properly beforehand, to avoid unnecessary test changes.
-
Hiroshi Yamauchi authored
Summary: delete(void*, unsigned int, align_val_t); delete(void*, unsigned long, align_val_t); delete[](void*, unsigned int, align_val_t); delete[](void*, unsigned long, align_val_t). Differential Revision: https://reviews.llvm.org/D81853
-
Sanjay Patel authored
Generalize scalarization (recently enhanced with D80885) to allow compares as well as binops. Similar to binops, we are avoiding scalarization of a loaded value because that could avoid a register transfer in codegen. This requires 1 extra predicate that I am aware of: we do not want to scalarize the condition value of a vector select. That might also invert a transform that we do in instcombine that prefers a vector condition operand for a vector select. I think this is the final step in solving PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 Differential Revision: https://reviews.llvm.org/D81661
-
Florian Hahn authored
Some tests were missing alignment info. Subsequent changes properly preserve the set alignment. Set it properly beforehand, to avoid unnecessary test changes. It also updates cases where an alignment of 16 was specified, instead of the vector element type alignment.
-
Tyker authored
Summary: This significantly reduces the number of assumes generated without affecting the preserved information too much. It also significantly improves the compile-time cost of enable-knowledge-retention. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79650
-
Jay Foad authored
This reverts commit 69bdfb07. Reverting to investigate https://bugs.llvm.org/show_bug.cgi?id=46343
-
sstefan1 authored
Summary: A couple of tests to showcase what will be done and what to expect with ICV tracking. Reviewers: jdoerfert, JonChesterfield Subscribers: yaxunl, guansong, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81114
-
- Jun 15, 2020
-
-
Davide Italiano authored
The promotion machinery in CGP moves instructions while retaining their debug locations. When the transformation is local, this is mostly correct, but when instructions are moved across basic blocks, this is not always true and causes jumpiness in line tables. This is the first of a series of commits. sext(s) and zext(s) need to be treated similarly. Differential Revision: https://reviews.llvm.org/D81879
-
Florian Hahn authored
As suggested in D81472, the load/store intrinsics' pointer arguments can be marked as nocapture and all matrix intrinsics as nosync. This also re-flows the intrinsic definitions, to make them a little more concise.
-
Florian Hahn authored
Port the partial constant store merging logic to the MemorySSA-backed DSE. The heavy lifting is done by the existing helper function. It is used in a context where we have already ensured that the later instruction can eliminate the earlier one if it is a complete overwrite.
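As an illustration of what merging partially overlapping constant stores means, the sketch below folds a later, narrower constant store into an earlier, wider one, so that a single store of the combined constant remains. It is a byte-level model only, not the LLVM helper function. The name `merge_partial_constant_store` and its parameters are hypothetical, and it assumes the later store lies entirely inside the earlier one.

```python
def merge_partial_constant_store(earlier, earlier_size, later, later_size,
                                 offset, little_endian=True):
    """Return the constant the earlier store should write once the later store
    (later_size bytes at byte offset `offset`) has been folded into it."""
    if offset < 0 or offset + later_size > earlier_size:
        raise ValueError("later store must lie fully inside the earlier store")
    order = "little" if little_endian else "big"
    data = bytearray(earlier.to_bytes(earlier_size, order))
    # Overwrite exactly the bytes the later store would have written.
    data[offset:offset + later_size] = later.to_bytes(later_size, order)
    return int.from_bytes(data, order)


# A 4-byte store of 0x11223344 followed by a 1-byte store of 0xAA at offset 1.
print(hex(merge_partial_constant_store(0x11223344, 4, 0xAA, 1, 1)))
```

The result depends on byte order, which is why the sketch takes it as a parameter.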
-
Florian Hahn authored
Also enables a now-passing test case, that exposed a crash caused by the wrong order.
-
Florian Hahn authored
Additional tests added ahead of partial overlapping store merging.
-
Max Kazantsev authored
This test demonstrates weird behavior of SimplifyCFG: it seems that a bigger block size leads to a worse optimization choice.
-
Sam Parker authored
Have BasicTTI call the base implementation so that both agree on the default behaviour, with the default being a cost of '1'. This has required an X86 specific implementation as it seems to be very reliant on those instructions being free. Changes are also made to AMDGPU so that their implementations distinguish between cost kinds, so that the unrolling isn't affected. PowerPC also has its own implementation to prevent changes to the reg-usage vectorizer test. The cost model test changes now reflect that ret instructions are not generally free. Differential Revision: https://reviews.llvm.org/D79164
-
Sam Parker authored
This reverts commit 23291b98. This caused performance regressions.
-
Max Kazantsev authored
-
Wenlei He authored
Summary: When an SCC gets split due to inlining, we have two mechanisms for reprocessing the updated SCC: the first is UR.UpdatedC, which repeatedly reruns the new, current SCC; the second is a worklist for all newly split SCCs. We can avoid rerunning the same SCC when it is set to be processed by both mechanisms *back to back*. In pathological cases, such redundant reruns could cause exponential size growth due to inlining along cycles, even when there's no SCC mutation and hence convergence is not a problem. Note that it's ok to have an SCC updated and rerun immediately, and also in the worklist, if we have actually moved an SCC to be topologically "below" the current one due to merging. In that case, we will need to revisit the current SCC after those moved SCCs. For that reason, the redundancy avoidance here only targets back-to-back reruns of the same SCC - the case described by the now removed FIXME comment. Reviewers: chandlerc, wmi Subscribers: llvm-commits, hoy Tags: #llvm Differential Revision: https://reviews.llvm.org/D80589
-
- Jun 14, 2020
-
-
Florian Hahn authored
Alternative approach to D80570. canCheckPtrAtRT already contains checks to figure out for which alias sets runtime checks are needed. But it currently sets CanDoRT to false for alias sets for which we cannot do RT checks but also do not need any. If we know that we do not need RT checks based on the number of reads/writes in the alias set, we can skip processing the AS. This patch also adds an assertion to ensure that DepCands does not contain more than one write from the alias set. Reviewers: Ayal, anemet, hfinkel, dmgreen Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D80622
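One plausible reading of "based on the number of reads/writes" can be sketched as follows. The rule here is an assumption for illustration and is not taken from the patch: a runtime check is only needed between two pointers when at least one of them is written, so an alias set with no writes, or with a single write and no reads, has no pair left to check. The function name is hypothetical.

```python
def needs_runtime_checks(num_reads, num_writes):
    """Decide whether an alias set needs runtime pointer checks (assumed rule)."""
    if num_writes == 0:
        return False  # read/read pairs cannot conflict
    if num_writes == 1 and num_reads == 0:
        return False  # a lone write has nothing to be checked against
    return True


# (reads, writes) pairs for a few hypothetical alias sets.
for reads, writes in [(3, 0), (0, 1), (1, 1), (0, 2)]:
    print(reads, writes, needs_runtime_checks(reads, writes))
```

Alias sets for which this returns False are the ones that could be skipped without affecting whether runtime checking is possible overall.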
-
Whitney Tsang authored
is not necessary one of them. Summary: Currently LoopUnrollPass already allows loops with multiple exiting blocks, but only when the loop latch is one of the exiting blocks. When the loop latch is not an exiting block, only a single exiting block is supported. When possible, the single loop latch or the single exiting block terminator is optimized to an unconditional branch in the unrolled loop. This patch allows loops with multiple exiting blocks even if the loop latch is not one of them. However, the optimization of the exiting block terminator to an unconditional branch is not done when there is more than one exiting block. Reviewer: dmgreen, Meinersbur, etiotto, fhahn, efriedma, bmahjour Reviewed By: efriedma Subscribers: hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D81053
-
Sanjay Patel authored
(a[0] + a[1] + a[2] + a[3]) - (b[0] + b[1] + b[2] + b[3]) --> (a[0] - b[0]) + (a[1] - b[1]) + (a[2] - b[2]) + (a[3] - b[3]) This should be the last step in solving PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953 We started emitting reduction intrinsics with D80867 / rGe50059f6b6b3, so it's a relatively easy pattern match now to re-order those ops. Also, I have not seen any complaints about the switch to intrinsics yet, so I'll propose to remove the "experimental" tag from the intrinsics soon. Differential Revision: https://reviews.llvm.org/D81491
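The identity behind this transform can be checked with a small model. The sketch assumes 32-bit wrapping integer arithmetic, which is why results are masked. For floating point the two forms can round differently, so the equivalence shown here does not carry over without further conditions. The function names are hypothetical.

```python
MASK = 0xFFFFFFFF  # model 32-bit wrapping integer arithmetic


def sub_of_reductions(a, b):
    """(a[0] + ... + a[n-1]) - (b[0] + ... + b[n-1]), wrapped to 32 bits."""
    return (sum(a) - sum(b)) & MASK


def reduction_of_subs(a, b):
    """(a[0] - b[0]) + ... + (a[n-1] - b[n-1]), wrapped to 32 bits."""
    return sum(x - y for x, y in zip(a, b)) & MASK


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
print(sub_of_reductions(a, b) == reduction_of_subs(a, b))
```

The second form needs one vector subtract and one reduction instead of two reductions and a scalar subtract.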
-
Sanjay Patel authored
This is a hacky, but low-risk fix to avoid the infinite loop in PR46271: https://bugs.llvm.org/show_bug.cgi?id=46271 As discussed there, the problem is that FoldOpIntoSelect() can get into a conflict with a transform that wants to pull a 'not' op through min/max via SimplifyDemandedVectorElts(). We need to relax our matching of min/max to include undefined elements in vector constants to avoid that. Alternatively, we could improve or cripple the demanded elements analysis, but that could create even more problems. The likely better, safer alternative will be to create min/max intrinsics, so we can remove all of the hacks related to min/max matching in instcombine. Differential Revision: https://reviews.llvm.org/D81698
-