- Dec 09, 2020
-
Jon Chesterfield authored
-
Justin Bogner authored
This method previously always recursively checked both the left-hand side and right-hand side of binary operations for splatted (broadcast) vector values to determine if the parent DAG node is a splat. Like several other SelectionDAG methods, limit the recursion depth to MaxRecursionDepth (6). This prevents stack overflow. See also https://issuetracker.google.com/173785481 Patch by Nicolas Capens. Thanks! Differential Revision: https://reviews.llvm.org/D92421
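For illustration, the general shape of such a depth limit as a standalone sketch (the Node type and isSplatHelper below are hypothetical placeholders, not the actual SelectionDAG code):

    // Hypothetical placeholders, not the real SelectionDAG types.
    constexpr unsigned MaxRecursionDepth = 6;

    struct Node {
      bool IsSplat = false;
      bool IsBinaryOp = false;
      Node *LHS = nullptr;
      Node *RHS = nullptr;
    };

    // Returns false once the depth budget is exhausted instead of recursing
    // further, which bounds stack usage on deeply nested binary operations.
    bool isSplatHelper(const Node *N, unsigned Depth = 0) {
      if (!N || Depth >= MaxRecursionDepth)
        return false;
      if (N->IsBinaryOp)
        return isSplatHelper(N->LHS, Depth + 1) &&
               isSplatHelper(N->RHS, Depth + 1);
      return N->IsSplat;
    }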
-
Alexey Bader authored
To support llorg builds this patch provides the following changes:
1) Added the cmake variable ITTAPI_GIT_REPOSITORY to control the location of the ITTAPI repository. The default value of ITTAPI_GIT_REPOSITORY is the GitHub location: https://github.com/intel/ittapi.git. A separate cmake variable, ITTAPI_GIT_TAG, was also added for the repo tag.
2) Added the cmake variable ITTAPI_SOURCE_DIR to control where the repo will be cloned. The default value of ITTAPI_SOURCE_DIR is the build area: PROJECT_BINARY_DIR.
Reviewed By: etyurin, bader
Patch by ekovanov.
Differential Revision: https://reviews.llvm.org/D91935
-
Arthur Eubanks authored
This was accidentally reverted by a later change. LSR currently only runs in the codegen pass manager. There are a couple of issues with LSR and the NPM:
1) Lots of tests assume that LCSSA isn't run before LSR. This breaks a bunch of tests' expected output. This is fixable with some time put in.
2) LSR doesn't preserve LCSSA. See llvm/test/Analysis/MemorySSA/update-remove-deadblocks.ll. LSR's use of SCEVExpander is the only use of SCEVExpander where the PreserveLCSSA option is off. Turning it on causes some code sinking out of loops to fail due to SCEVExpander's inability to handle the newly created trivial PHI nodes in the broken critical edge (I was looking at llvm/test/Transforms/LoopStrengthReduce/X86/2011-11-29-postincphi.ll). I also tried simply calling formLCSSA() at the end of LSR, but the extra PHI nodes cause regressions in codegen tests.
We'll delay figuring these issues out until later. This causes the number of check-llvm failures with -enable-new-pm true by default to go from 60 to 29.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D92796
-
Fangrui Song authored
Otherwise `check-llvm-*` may not rebuild llvm-profgen, causing llvm-profgen tests to fail if llvm-profgen happens to be stale.
-
Jonas Devlieghere authored
The reproducers currently use a static variable to track the API boundary. This is obviously incorrect when the SB API is used concurrently. While I do not plan to support that use-case (right now), I do want to avoid us crashing. As a first step, correctly track API boundaries across multiple threads. Before this patch SB API calls made by the embedded script interpreter would be considered "behind the API boundary" and correctly ignored. After this patch, we need to tell the reproducers to ignore the scripting thread as a "private thread". Differential revision: https://reviews.llvm.org/D92811
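As a sketch of per-thread boundary tracking with a thread-local depth counter (the names below are hypothetical and are not the actual LLDB reproducer classes):

    // Hypothetical sketch: each thread tracks its own depth inside the SB API,
    // so SB calls nested inside another SB call on the same thread are
    // recognized as being behind the boundary. thread_local gives every
    // thread an independent counter.
    static thread_local unsigned g_api_depth = 0;

    struct APIBoundaryGuard {
      APIBoundaryGuard() { ++g_api_depth; }
      ~APIBoundaryGuard() { --g_api_depth; }
      // True only for the outermost SB API call on the current thread.
      bool IsTopLevel() const { return g_api_depth == 1; }
    };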
-
Arthur Eubanks authored
Reviewed By: hans Differential Revision: https://reviews.llvm.org/D92866
-
Mircea Trofin authored
Explicitly opt out llvm/test/Transforms/Attributor. Verified by flipping the default value of allow-unused-prefixes and observing that none of the failures were under llvm/test/Transforms. Differential Revision: https://reviews.llvm.org/D92404
-
Sam McCall authored
This is a step towards making compile_commands.json reloadable. The idea is:
- in addition to rare CDB loads we're soon going to have somewhat-rare CDB reloads and fairly-common stat() of files to validate the CDB
- so stop doing all our work under a big global lock, instead using it to acquire per-directory structures with their own locks
- each directory can be refreshed from disk every N seconds, like filecache
- avoid locking these at all in the most common case: directory has no CDB
Differential Revision: https://reviews.llvm.org/D92381
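For illustration, a rough sketch of that locking scheme (hypothetical types, not the actual clangd code):

    #include <map>
    #include <memory>
    #include <mutex>
    #include <string>

    // Hypothetical types: one cache entry per directory, each with its own
    // lock, so common lookups do not contend on a single global mutex.
    struct DirectoryCache {
      std::mutex Mu;         // guards only this directory's state
      bool HasCDB = false;   // most directories have no compile_commands.json
      // ... cached CDB contents, last stat() time, etc.
    };

    class CDBIndex {
      std::mutex GlobalMu;   // held only briefly, to find or create an entry
      std::map<std::string, std::unique_ptr<DirectoryCache>> Dirs;

    public:
      DirectoryCache &get(const std::string &Dir) {
        std::lock_guard<std::mutex> Lock(GlobalMu);
        std::unique_ptr<DirectoryCache> &Entry = Dirs[Dir];
        if (!Entry)
          Entry = std::make_unique<DirectoryCache>();
        return *Entry;       // caller then locks the entry's Mu for real work
      }
    };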
-
Louis Dionne authored
The goal was to add coverage for back-deployment over the filesystem library, but it was added in macOS 10.15, not 10.14. Differential Revision: https://reviews.llvm.org/D92937
-
LLVM GN Syncbot authored
-
LLVM GN Syncbot authored
-
LLVM GN Syncbot authored
-
Adam Czachorowski authored
No changes to the tests themselves, other than some auto -> const auto diagnostic fixes and formatting. Differential Revision: https://reviews.llvm.org/D92939
-
Kazushi (Jam) Marukawa authored
Add vsum and vfsum intrinsic instructions and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92938
-
Paul C. Anagnostopoulos authored
Differential Revision: https://reviews.llvm.org/D92674
-
Sanjay Patel authored
This is an enhancement to load vectorization that is motivated by a pattern in https://llvm.org/PR16739. Unfortunately, it's still not enough to make a difference there. We will have to handle multi-use cases in some better way to avoid creating multiple overlapping loads. Differential Revision: https://reviews.llvm.org/D92858
-
Roman Lebedev authored
We could create uadd.sat under incorrect circumstances if a select with -1 as the false value was canonicalized by swapping the T/F values. Unlike the other transforms in the same function, it is not invariant to equality. Some alive proofs: https://alive2.llvm.org/ce/z/emmKKL Based on original patch by David Green! Fixes https://bugs.llvm.org/show_bug.cgi?id=48390 Differential Revision: https://reviews.llvm.org/D92717
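For reference, the scalar semantics of the unsigned saturating add that this transform forms, as a plain C++ illustration (not the InstCombine code):

    #include <cstdint>
    #include <limits>

    // Scalar semantics of llvm.uadd.sat.i32: clamp to the maximum value on
    // overflow instead of wrapping.
    uint32_t uadd_sat(uint32_t A, uint32_t B) {
      uint32_t Sum = A + B;  // unsigned addition wraps on overflow
      return Sum < A ? std::numeric_limits<uint32_t>::max() : Sum;
    }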
-
Roman Lebedev authored
The non-strict variants are already handled because they are canonicalized to strict variants by swapping hands in both the select and icmp, and the fold simply considers that strictness is irrelevant here. But that isn't actually true for the last pattern, as PR48390 reports.
-
Kazushi (Jam) Marukawa authored
Add vfmk intrinsic instructions, a few pseudo instructions to expand vfmk intrinsic using VM512 correctly, and regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92758
-
Yvan Roux authored
Should fix non-x86 bot failures.
-
Simon Pilgrim authored
[X86] Fold CONCAT(VPERMV3(X,Y,M0),VPERMV3(Z,W,M1)) -> VPERMV3(CONCAT(X,Z),CONCAT(Y,W),CONCAT(M0,M1)). Further prep work toward supporting different subvector sizes in combineX86ShufflesRecursively.
-
Matt Morehouse authored
The wrapper clears shadow for any events written. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D92891
-
Muhammad Omair Javaid authored
TestLldbGdbServer.py test cases have been timing out on the LLDB/AArch64 Linux buildbot since recent changes. I am temporarily increasing DEFAULT_TIMEOUT to 20 seconds to see the impact.
-
Anton Afanasyev authored
For store chain vectorization we choose the size of the vector elements to ensure we fit within the minimum and maximum vector register size for the given number of elements. This patch corrects the vector element size by choosing the width of the value truncated just before storing instead of the width of the value being stored. Fixes PR46983. Differential Revision: https://reviews.llvm.org/D92824
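A hypothetical source-level example of the pattern (based on the description above, not on the PR46983 reproducer): each store writes a value truncated to 8 bits, so the vector element width should be taken from the truncated type rather than from the wider source values.

    #include <cstdint>

    // Four consecutive stores of values truncated from 32 bits to 8 bits.
    // When vectorizing this store chain, the element width that matters is
    // i8 (the truncated width), not i32 (the width of the values that were
    // truncated).
    void truncate_and_store(uint8_t *Dst, const uint32_t *Src) {
      Dst[0] = static_cast<uint8_t>(Src[0]);
      Dst[1] = static_cast<uint8_t>(Src[1]);
      Dst[2] = static_cast<uint8_t>(Src[2]);
      Dst[3] = static_cast<uint8_t>(Src[3]);
    }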
-
Djordje Todorovic authored
If a function parameter is marked as "undef", prevent creation of CallSiteInfo for that parameter. Without this patch, the parameter's call_site_value would be incorrect. The incorrect call_value case was reported in PR39716 and addressed in D85111. Patch by Nikola Tesic. Differential revision: https://reviews.llvm.org/D92471
-
Kerry McLaughlin authored
This patch adds the following DAGCombines, which apply if isVectorLoadExtDesirable() returns true:
- fold (and (masked_gather x)) -> (zext_masked_gather x)
- fold (sext_inreg (masked_gather x)) -> (sext_masked_gather x)
LowerMGATHER has also been updated to fetch the LoadExtType associated with the gather and also use this value to determine the correct masked gather opcode to use.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D92230
-
Sander de Smalen authored
* Steps are scaled by `vscale`, a runtime value.
* Changes to circumvent the cost-model for now (temporary) so that the cost-model can be implemented separately.
This can vectorize the following loop [1]:
    void loop(int N, double *a, double *b) {
      #pragma clang loop vectorize_width(4, scalable)
      for (int i = 0; i < N; i++) {
        a[i] = b[i] + 1.0;
      }
    }
[1] This source-level example is based on the pragma proposed separately in D89031. This patch only implements the LLVM part.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D91077
-
Sander de Smalen authored
This patch removes a number of asserts that VF is not scalable, even though the code where these asserts live does nothing that prevents VF from being scalable. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D91060
-
Kerry McLaughlin authored
Adds the ExtensionType flag, which reflects the LoadExtType of a MaskedGatherSDNode. Also updated SelectionDAGDumper::print_details so that details of the gather load (is signed, is scaled & extension type) are printed. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D91084
-
Christian Sigg authored
[mlir] Use mlir::OpState::operator->() to get to methods of mlir::Operation. This is a preparation step to remove the corresponding methods from OpState. Reviewed By: silvas, rriddle Differential Revision: https://reviews.llvm.org/D92878
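For illustration, a small sketch of the two call styles (OpTy stands for any op class deriving from mlir::OpState; this is not code from the patch):

    #include "mlir/IR/Location.h"
    #include "mlir/IR/Operation.h"

    // Illustrative only; OpTy is any op class deriving from mlir::OpState,
    // e.g. a TableGen-generated dialect op.
    template <typename OpTy>
    mlir::Location getLoc(OpTy op) {
      // Old style: reach the underlying mlir::Operation explicitly.
      mlir::Location viaGetOperation = op.getOperation()->getLoc();
      (void)viaGetOperation;
      // Style this change moves to: OpState::operator->() forwards to the
      // underlying mlir::Operation.
      return op->getLoc();
    }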
-
Joe Ellis authored
This commit adds two new intrinsics:
- llvm.experimental.vector.insert: used to insert a vector into another vector starting at a given index.
- llvm.experimental.vector.extract: used to extract a subvector from a larger vector starting from a given index.
The codegen work for these intrinsics has already been completed; this commit is simply exposing the existing ISD nodes to LLVM IR.
Reviewed By: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D91362
-
Cullen Rhodes authored
Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92482
-
Simon Moll authored
Translate VP intrinsics to VP_* SDNodes. The tests check whether a matching vp_* SDNode is emitted. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D91441
-
Alex Zinenko authored
The original code was inserting the barrier at the location given by the caller. Make sure it is always inserted at the end of the loop exit block instead. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D92849
-
Tim Northover authored
I accidentally pushed the wrong patch originally.
-
Muhammad Omair Javaid authored
Fix failure introduced by 843f2dbf.
-
Haojian Wu authored
-
Roman Lebedev authored
[NFC][Instructions] Refactor CmpInst::getFlippedStrictnessPredicate() in terms of is{,Non}StrictPredicate()/get{Non,}StrictPredicate(). In particular, this creates a getStrictPredicate() method, to be symmetrical with the already-existing getNonStrictPredicate().
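As a sketch of the refactored shape, assuming static helpers with the names given in the message (not the verbatim LLVM implementation):

    #include "llvm/IR/InstrTypes.h"

    // Illustrative sketch only: flipping strictness turns e.g. ICMP_SGT into
    // ICMP_SGE and back. It assumes the helpers named in the message take a
    // Predicate; the real LLVM code may differ in detail.
    static llvm::CmpInst::Predicate flipStrictness(llvm::CmpInst::Predicate Pred) {
      using llvm::CmpInst;
      return CmpInst::isStrictPredicate(Pred)
                 ? CmpInst::getNonStrictPredicate(Pred)
                 : CmpInst::getStrictPredicate(Pred);
    }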
-
Fraser Cormack authored
The register operand was not being marked as a def when it should be. There are no tests for this in the main branch, as there are not yet any pseudos without a non-negative VLIndex. Also changed the type of a virtual register operand from unsigned to Register and adjusted formatting. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D92823
-