- Jan 26, 2021
-
-
Duncan P. N. Exon Smith authored
Refactor the duplicated canonicalize-path logic in `FileCollector` and `ModulesDependencyCollector` into a new utility called `PathCanonicalizer` that's shared. This popped up when tracking down a bug common to both in https://reviews.llvm.org/D95202. As drive-bys, update a few names and comments to better reflect the effect of the code, delay removal of `..`s to avoid an unnecessary extra string copy, and leave behind a couple of FIXMEs for future consideration. Differential Revision: https://reviews.llvm.org/D95279
-
- Jan 25, 2021
-
-
Nikita Popov authored
When LSR converts a branch on the pre-inc IV into a branch on the post-inc IV, the nowrap flags on the addition may no longer be valid. Previously, a poison result of the addition might have been ignored, in which case the program was well defined. After branching on the post-inc IV, we might be branching on poison, which is undefined behavior. Fix this by discarding nowrap flags which are not present on the SCEV expression. Nowrap flags on the SCEV expression are proven by SCEV to always hold, independently of how the expression will be used. This is essentially the same fix we applied to IndVars LFTR, which also performs this kind of pre-inc to post-inc conversion. I believe a similar problem can also exist for getelementptr inbounds, but I was not able to come up with a problematic test case. The inbounds case would have to be addressed in a differently anyway (as SCEV does not track this property). Fixes https://bugs.llvm.org/show_bug.cgi?id=46943. Differential Revision: https://reviews.llvm.org/D95286
-
Fraser Cormack authored
Original patch by @rogfer01. This patch adds support for insertelt and extractelt operations on scalable vectors. Special care must be taken on RV32 when dealing with i64 vectors as there are no straightforward ways to insert a 64-bit element without a register of that size. To that end, both are custom-lowered to different sequences. Authored-by:
Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by:
Fraser Cormack <fraser@codeplay.com> Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94615
-
Cassie Jones authored
Implement widening for G_SADDO and G_SSUBO. Add legalize-add/sub tests for narrow overflowing add/sub on AArch64. Differential Revision: https://reviews.llvm.org/D95034
-
Richard Smith authored
This reverts commit 53176c16, which introduceed a layering violation. LLVM's IR library can't include headers from Analysis.
-
Jonas Devlieghere authored
Don't emit an output dash for an empty sequence. Take emitting a vector of strings for example: std::vector<std::string> Strings = {"foo", "bar"}; LLVM_YAML_IS_SEQUENCE_VECTOR(std::string) yout << Strings; This emits the following YAML document. --- - foo - bar ... When the vector is empty, this generates the following result: --- - [] ... Although this is valid YAML, it does not match what we meant to emit. The result is a one-element sequence consisting of an empty list. Indeed, if we were to try to read this again we get an error: YAML:2:4: error: not a mapping - [] The problem is the output dash before the empty list. The correct output would be: --- [] ... This patch fixes that by not emitting the output dash for an empty sequence. Differential revision: https://reviews.llvm.org/D95280
-
Julian Lettner authored
A bot owner contacted me. I will re-land after confirming that this doesn't break anyone (since it's low priority). This reverts commit 9946b169.
-
Konstantin Zhuravlyov authored
This reverts commit dd8ae426. This commit causes infinite loop when compiling rocThrust and hipCUB. Differential Revision: https://reviews.llvm.org/D95389
-
LLVM GN Syncbot authored
-
Akira Hatanaka authored
or claimRV calls in the IR Background: This patch makes changes to the front-end and middle-end that are needed to fix a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end annotates calls with attribute "clang.arc.rv"="retain" or "clang.arc.rv"="claim", which indicates the call is implicitly followed by a marker instruction and a retainRV/claimRV call that consumes the call result. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the annotated calls in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the annotated calls. It doesn't remove the attribute on the call since the backend needs it to emit the marker instruction. The retainRV/claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes the autoreleaseRV call in the callee that returns the result if nothing in the callee prevents it from being paired up with the calls annotated with "clang.arc.rv"="retain/claim" in the caller. If the call is annotated with "claim", a release call is inserted since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the attributes to a function call in the callee. This is important since ARC optimizer can remove the autoreleaseRV call returning the callee result, which makes it impossible to pair it up with the retainRV or claimRV call in the caller. If that fails, it simply emits a retain call in the IR if the call is annotated with "retain" and does nothing if it's annotated with "claim". - This patch teaches dead argument elimination pass not to change the return type of a function if any of the calls to the function are annotated with attribute "clang.arc.rv". This is necessary since the pass can incorrectly determine nothing in the IR uses the function return, which can happen since the front-end no longer explicitly emits retainRV/claimRV calls in the IR, and change its return type to 'void'. Future work: - Use the attribute on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls annotated with the attributes. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808
-
Julian Lettner authored
We can now use Python3. Let's use `os.cpu_count()` to cleanup this helper. Differential Revision: https://reviews.llvm.org/D94734
-
Florian Hahn authored
Now that VPRecipeBase inherits from VPDef, we can always use the new VPValue for replacement, if the recipe defines one. Given the recipes that are supported at the moment, all new recipes must have either 0 or 1 defined values.
-
Nick Desaulniers authored
Fixes an infinite loop encountered in GVN. GVN will delay PRE if it encounters critical edges, attempt to split them later via calls to SplitCriticalEdge(), then restart. The caller of GVN::splitCriticalEdges() assumed a return value of true meant that critical edges were split, that the IR had changed, and that PRE should be re-attempted, upon which we loop infinitely. This was exposed after D88438, by compiling the Linux kernel for s390, but the test case is reproducible on x86. Fixes: https://github.com/ClangBuiltLinux/linux/issues/1261 Reviewed By: void Differential Revision: https://reviews.llvm.org/D94996
-
Craig Topper authored
This makes our i8/i16 codegen more similar to the i32 codegen. I've also added computeKnownBits support for DIVUW/REMUW so that we can remove zero extending ANDs from the output. Without this we end up turning DIVUW/REMUW back into DIVU/REMU via some isel patterns. Reviewed By: frasercrmck, luismarques Differential Revision: https://reviews.llvm.org/D95322
-
Reid Kleckner authored
The unwind info format requires that all adjustments are 8 byte aligned, and the bottom three bits are masked out. Most Win64 calling conventions have 32 bytes of shadow stack space for spilling parameters, and I believe that constructing these fixed stack objects had the side effect of ensuring an alignment of 8. However, the Intel regcall convention does not have this shadow space, so when using that convention, it was possible to make a 4 byte stack frame, which was impossible to describe with unwind info. Fixes pr48867
-
Wei Mi authored
turning off SampleFDO silently. Currently sample loader pass turns off SampleFDO optimization silently when it sees error in reading the profile. This behavior will defeat the tests which could have caught those bad/incompatible profile problems. This patch change the behavior to report error. Differential Revision: https://reviews.llvm.org/D95269
-
Nemanja Ivanovic authored
This intrinsic is supposed to have the permute control vector complemented on little endian systems (as the ABI specifies and GCC implements). With the current code gen, the result vector is byte-reversed. Differential revision: https://reviews.llvm.org/D95004
-
David Green authored
Until fairly recently the calling convention for IR half was not handled correctly in the ARM backend, meaning we needed to pass pointers that were loaded/stored. Now that that is fixed we can switch to using the type directly instead.
-
Craig Topper authored
As far as I know 32 bits arguments and returns on RV64 are always sign extended to i64. So I think we should be taking this into account around libcalls. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95285
-
Dmitry Preobrazhensky authored
Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D95212
-
Nico Weber authored
This reverts commit 6884fbc2. Breaks tests on Windows: http://45.33.8.238/win/31981/step_11.txt
-
Jeroen Dobbelaere authored
This was enabled in https://reviews.llvm.org/D95335 but it breaks the stage2 fuchsia build (See http://lab.llvm.org:8011/#/builders/98/builds/4105/steps/9/logs/stdio)
-
Simon Pilgrim authored
Remove bitcasts to/from v4x64 types through vperm2f128/vperm2i128 ops to help improve shuffle combining and demanded vector elts folding.
-
Jeroen Dobbelaere authored
Checking the llvm.experimental.noalias.scope.decl dominance can be worstcase O(N^2). Limit the dominance check to N=32. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D95335
-
xgupta authored
-
Florian Hahn authored
This patch adds plumbing to handle scalarized values directly in VPTransformState. Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D92282
-
xgupta authored
-
Simon Pilgrim authored
We're relying on the source inputs for shuffle combining having already been widened to the root size (otherwise the offset logic falls over) - we're going to be supporting different sized shuffle inputs soon, so we need to explicitly make the minimum widened width the original root size.
-
Abhina Sreeskantharajan authored
This reverts commit 06f8a496.
-
Abhina Sreeskantharajan authored
Revert "[SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests - continued" This reverts commit 520b5ecf.
-
Sanjay Patel authored
We can sink extends after min/max if they match and would not change the sign-interpreted compare. The only combo that doesn't work is zext+smin/smax because the zexts could change a negative number into positive: https://alive2.llvm.org/ce/z/D6sz6J Sext+umax/umin works: define i32 @src(i8 %x, i8 %y) { %0: %sx = sext i8 %x to i32 %sy = sext i8 %y to i32 %m = umax i32 %sx, %sy ret i32 %m } => define i32 @tgt(i8 %x, i8 %y) { %0: %m = umax i8 %x, %y %r = sext i8 %m to i32 ret i32 %r } Transformation seems to be correct!
-
Sanjay Patel authored
-
Sander de Smalen authored
This change also changes getReductionCost to return InstructionCost, and it simplifies two expressions by removing a redundant 'isValid' check.
-
Simon Pilgrim authored
We allow extract_subvector lowering of all legal types, so pre-bitcast the source type to try and reduce bitcast pollution.
-
Simon Pilgrim authored
We allow insert_subvector lowering of all legal types, so don't always cast to the vXi64/vXf64 shuffle types - this is only necessary for X86ISD::SHUF128/X86ISD::VPERM2X128 patterns later.
-
Simon Pilgrim authored
Use const reference to avoid std::string copy - accordingly to the style guide we shouldn't be using auto anyway. Fixes MSVC analyzer warning.
-
Sander de Smalen authored
For a function that returns InstructionCost, it is very tempting to write: return InstructionCost::Invalid; But that actually returns InstructionCost(1 /* int value of Invalid */)) which has a totally different meaning. By marking this constructor as `delete`, this can no longer happen.
-
Fraser Cormack authored
This patch adds support for scalable-vector splats in DAGCombiner's `isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions, which enable the SelectionDAG div/rem-by-constant optimizations for scalable vector types. It also fixes up one case where the UDIV optimization was generating a SETCC without first consulting the target for its preferred SETCC result type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94501
-
Philip Pfaffe authored
The llvm-dwp tool hard-codes the target triple to x86. Instead, deduce the target triple from the object files being read. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D93749
-