- Jan 31, 2022
-
-
Kerry McLaughlin authored
Fixes a crash ('Invalid size request on a scalable vector') in visitAlloca() when we call this function for a scalable alloca instruction, caused by the implicit conversion of TySize to uint64_t. This patch changes TySize to a TypeSize as returned by getTypeAllocSize() and ensures the allocation size is multiplied by vscale for scalable vectors. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D118372
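As a rough illustration of the size computation described above, the following sketch models a type size as a known minimum plus a scalable flag. `SketchTypeSize` and `allocaSizeInBytes` are hypothetical names chosen for this example; they are not the LLVM `TypeSize` API and not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a type size: a known minimum in bytes plus a flag
// saying whether the real size is that minimum multiplied by vscale.
struct SketchTypeSize {
  uint64_t KnownMin;
  bool Scalable;
};

// Size in bytes of an allocation of NumElements objects of the given type.
// VScale is the runtime multiple; it only affects scalable types.
uint64_t allocaSizeInBytes(SketchTypeSize TySize, uint64_t NumElements,
                           uint64_t VScale) {
  assert(VScale >= 1 && "vscale is assumed to be a positive runtime value");
  uint64_t PerObject =
      TySize.Scalable ? TySize.KnownMin * VScale : TySize.KnownMin;
  return PerObject * NumElements;
}
```

Treating a scalable size as a plain integer would silently drop the `VScale` factor, which is the kind of implicit conversion the commit removes.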
-
Ties Stuij authored
This patch upstreams support for the Arm-v8 Cortex-X1C processor for AArch64 and ARM. For more information, see: - https://community.arm.com/arm-community-blogs/b/announcements/posts/arm-cortex-x1c - https://developer.arm.com/documentation/101968/0002/Functional-description/Technical-overview/Components The following people contributed to this patch: - Simon Tatham - Ties Stuij Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D117202
-
Dávid Bolvanský authored
Very similar to https://reviews.llvm.org/D101230 Fixes https://github.com/llvm/llvm-project/issues/53501
-
Simon Pilgrim authored
[X86] combineAnd() - per-element simplification - call SimplifyDemandedBits using mask demanded bits if SimplifyDemandedVectorElts fails We already call SimplifyDemandedVectorElts using whether each vector mask element is zero/nonzero, this just extends this to also try SimplifyDemandedBits using the demanded bits mask generated from the nonzero elements. This also requires an additional TargetLowering::SimplifyDemandedBits DemandedBits/DemandedElts wrapper.
-
Jeremy Morse authored
If we only assign a variable value a single time, we can take a short-cut when computing its location: the variable value is only valid up to the dominance frontier of where the assignment happens. Past that point, there are other predecessors from which the variable has no value, meaning the variable has no location past that point. This patch recognises this scenario and avoids expensive SSA computation to improve compile-time performance. Differential Revision: https://reviews.llvm.org/D117877
-
Momchil Velikov authored
Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D118451
-
Simon Pilgrim authored
-
Simon Pilgrim authored
We can only assume bit[1] == zero if it's the only demanded bit or the source is not undef/poison
-
Jay Foad authored
-
Jay Foad authored
This avoids various cases where StructurizeCFG would otherwise insert an xor i1 instruction. Since StructurizeCFG generally runs late in the pipeline, instcombine does not clean up the xor-of-cmp pattern. Differential Revision: https://reviews.llvm.org/D118478
-
Paulo Matos authored
This patch fixes the visibility and linkage information of symbols referring to IR globals. Emission of external declarations is now done in the first execution of emitConstantPool rather than in emitLinkage (and a few other places). This is the point where we have already gathered information about used symbols (by running the MC Lower PrePass) and have not yet started emitting any functions, so any declarations that need to be emitted appear at the top of the file before any functions. This changes the order of a few directives in the final asm file, which required an update to a few tests. Reviewed By: sbc100 Differential Revision: https://reviews.llvm.org/D118122
-
Florian Hahn authored
The current cost-model overestimates the cost of vector compares & selects for ordered floating point compares. This patch fixes that by extending the existing logic for integer predicates. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118256
-
serge-sans-paille authored
Based on the output of include-what-you-use. Most notably, llvm/Remarks/Remark.h is no longer automatically included by llvm/Remarks/RemarkParser.h, so client code may need to include it explicitly. clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Remarks/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l before: 770253 after: 759347 Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118506
-
Nikita Popov authored
The pruning cloner already tries to remove unreachable blocks. The original cloning process will simplify instructions and constant terminators, and only clone blocks that are reachable at that point. However, phi nodes can only be simplified after everything has been cloned. For that reason, additional blocks may become unreachable after phi simplification. The code does try to handle this as well, but only removes blocks that don't have predecessors. It misses unreachable cycles. This can cause issues if SEH exception handling code is part of an unreachable cycle, as the inliner is not prepared to deal with that. This patch instead performs an explicit scan for reachable blocks, and drops everything else. Fixes https://github.com/llvm/llvm-project/issues/53206. Differential Revision: https://reviews.llvm.org/D118449
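The difference between the two pruning strategies can be sketched on a toy control-flow graph. The `Cfg` alias and both function names below are hypothetical and exist only for this example; the actual patch operates on LLVM basic blocks rather than index lists.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG: Succs[i] lists the successor block indices of block i.
using Cfg = std::vector<std::vector<std::size_t>>;

// Marks every block reachable from Entry using a depth-first worklist walk.
std::vector<bool> reachableFrom(const Cfg &Succs, std::size_t Entry) {
  assert(Entry < Succs.size() && "entry block must exist");
  std::vector<bool> Reachable(Succs.size(), false);
  std::vector<std::size_t> Worklist{Entry};
  Reachable[Entry] = true;
  while (!Worklist.empty()) {
    std::size_t Block = Worklist.back();
    Worklist.pop_back();
    for (std::size_t Succ : Succs[Block]) {
      if (!Reachable[Succ]) {
        Reachable[Succ] = true;
        Worklist.push_back(Succ);
      }
    }
  }
  return Reachable;
}

// The weaker check: a block is flagged only if nothing branches to it.
std::vector<bool> hasNoPredecessors(const Cfg &Succs, std::size_t Entry) {
  assert(Entry < Succs.size() && "entry block must exist");
  std::vector<bool> NoPreds(Succs.size(), true);
  NoPreds[Entry] = false; // the entry block is always kept
  for (const auto &Out : Succs)
    for (std::size_t Succ : Out)
      NoPreds[Succ] = false;
  return NoPreds;
}
```

With blocks 2 and 3 branching only to each other, `hasNoPredecessors` flags neither of them, while `reachableFrom` reports both as unreachable from the entry, so an explicit scan catches the cycle.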
-
Nikita Popov authored
masked.atomicrmw.*.i32 intrinsics access an i32 (and then possibly mask it), so hardcode MVT::i32 as the access type here, rather than determining it from the pointer element type. Differential Revision: https://reviews.llvm.org/D118336
-
Max Kazantsev authored
Following Sanjay's proposal from discussion in D118317, this patch generalizes and-reduce handling to fold the following pattern
```
icmp ne (bitcast(icmp ne (lhs, rhs)), 0)
```
into
```
icmp ne (bitcast(lhs), bitcast(rhs))
```
https://alive2.llvm.org/ce/z/WDcuJ_ Differential Revision: https://reviews.llvm.org/D118431 Reviewed By: lebedev.ri
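The equivalence behind this fold can be checked on a small model. The sketch below assumes a four-lane vector of bytes standing in for the vector operands; `Vec4`, `originalForm`, and `foldedForm` are illustrative names, not part of InstCombine.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Vec4 = std::array<uint8_t, 4>; // stands in for a <4 x i8> value

// Models a bitcast of the whole vector to a single 32-bit integer.
uint32_t bitcastToI32(const Vec4 &V) {
  uint32_t Out;
  std::memcpy(&Out, V.data(), sizeof(Out));
  return Out;
}

// Original pattern: compare lane by lane, pack the i1 results into a mask,
// then test whether the mask is non-zero.
bool originalForm(const Vec4 &L, const Vec4 &R) {
  unsigned Mask = 0;
  for (std::size_t I = 0; I < L.size(); ++I)
    if (L[I] != R[I])
      Mask |= 1u << I;
  return Mask != 0;
}

// Folded pattern: compare the two vectors as whole integers.
bool foldedForm(const Vec4 &L, const Vec4 &R) {
  return bitcastToI32(L) != bitcastToI32(R);
}
```

Both forms answer the same question, whether any lane differs, but the folded one needs a single scalar comparison.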
-
Craig Topper authored
This is a slight change because I'm using the ANY_EXTEND result instead of the original operand, but getNode should constant fold. While there, add a comment about why the code specifically checks for a ConstantSDNode.
-
Kazu Hirata authored
Identified with readability-const-return-type.
-
Fangrui Song authored
[mlgo][regalloc] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after a8a7bf92
- Jan 30, 2022
-
-
Mircea Trofin authored
If AllocationOrder has fewer than 32 elements, we were treating the extra positions as if they were valid. This was detected by a subsequent assert. The fix also tightens the asserts.
-
Craig Topper authored
These are special versions of the more general shfli/unshfli instructions. We can use the general ISD opcodes with the correct immediates.
-
Markus Böck authored
Both IDFCalculatorBase and its accompanying DominatorTreeBase only support pointer nodes. The template argument is the block type itself, and any use of GraphTraits is therefore done via a pointer to the node type. However, the ChildrenGetterTy type of IDFCalculatorBase has a use on just the node type instead of a pointer to the node type. Various parts of the monorepo have worked around this issue by providing specializations of GraphTraits for the node type directly, or have not been affected because they use specializations instead of the generic case. These workarounds are unnecessary, and the generic code should be fixed instead. An in-tree example is the use of IDFCalculatorBase in InstrRefBasedImpl.cpp. It basically instantiates an IDFCalculatorBase<MachineBasicBlock, false> but, due to the bug above, then goes on to specialize GraphTraits<MachineBasicBlock> although GraphTraits<MachineBasicBlock*> exists (and should be used instead). Similar dead code exists in clang, which defines redundant GraphTraits to work around this bug. This patch fixes the original issue and removes the dead code that was used to work around it. Differential Revision: https://reviews.llvm.org/D118386
-
Craig Topper authored
We can use the RISCVISD::GREV encoding that swaps the bits in each byte. This allows it to use the existing computeKnownBits support for RISCVISD::GREV.
-
Kazu Hirata authored
Identified with modernize-use-nullptr.
-
Kazu Hirata authored
Identified with readability-string-compare.
-
Kazu Hirata authored
Identified with modernize-use-default-member-init.
-
Simon Pilgrim authored
Limit this to SSE41 - AVX1 targets to avoid UNPCKL(PSHUFB,PSHUFB), pre-SSE41 we don't have PACKUSDW/BLENDW and with AVX2 we can perform this as PERMQ(PSHUFB()).
-
Simon Pilgrim authored
Don't extract the ANY/ZERO_EXTEND_VECTOR_INREG subvector source until we're definitely combining to a new node.
-
Simon Pilgrim authored
Allows pow2 mask tests to avoid an unnecessary constant load. Noticed while investigating how to extend MatchVectorAllZeroTest to support more allof/anyof patterns.
-
Ricky Zhou authored
Before this change, InstCombine was willing to fold atomic and non-atomic loads through a PHI node as long as the first PHI argument is not an atomic load. The combined load would be non-atomic, which is incorrect. Fix this by only combining the loads in a PHI node when all of the arguments are non-atomic loads. Thanks to Eli Friedman for pointing out the bug at https://github.com/llvm/llvm-project/issues/50777#issuecomment-981045342! Fixes #50777 Differential Revision: https://reviews.llvm.org/D115113
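The change in the legality check can be sketched as follows. `IncomingLoad`, `canFoldOld`, and `canFoldFixed` are hypothetical names for this example; the real InstCombine code checks further conditions that are omitted here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical summary of one load feeding a PHI node.
struct IncomingLoad {
  bool IsAtomic;
};

// Behaviour described for the old code: only the first incoming load is
// inspected, so a later atomic load slips through.
bool canFoldOld(const std::vector<IncomingLoad> &Incoming) {
  assert(!Incoming.empty() && "a PHI has at least one incoming value");
  return !Incoming.front().IsAtomic;
}

// Behaviour described for the fix: every incoming load must be non-atomic.
bool canFoldFixed(const std::vector<IncomingLoad> &Incoming) {
  assert(!Incoming.empty() && "a PHI has at least one incoming value");
  return std::all_of(Incoming.begin(), Incoming.end(),
                     [](const IncomingLoad &L) { return !L.IsAtomic; });
}
```

For a PHI whose first incoming load is non-atomic and whose second is atomic, `canFoldOld` accepts the fold while `canFoldFixed` rejects it.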
-
Ricky Zhou authored
Preliminary clean-up for D115113 Differential Revision: https://reviews.llvm.org/D116086
-
Ricky Zhou authored
Uppercase some variable names, per LLVM coding standards. This change intentionally does not rename every miscased variable, as a follow-up change (D116086) intends to eliminate many of those by switching loops to range for loops. Differential Revision: https://reviews.llvm.org/D118553
-
Florian Hahn authored
This removes the remaining dependence on LoopVectorizationCostModel from buildScalarSteps and is required so it can be moved out of ILV. It also allows us to remove a few unneeded instructions. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D116554
-
Nuno Lopes authored
gep(x, undef) carries the provenance of x, so we can't replace it with any pointer like undef. This leaves room for improvement for the poison case, but that's currently not possible as the demanded bits API doesn't distinguish between undef & poison bits. Fixes #44790
-
Nuno Lopes authored
-
Craig Topper authored
We already have an ISD opcode for the more general GREV/GREVI instruction. We can just use it with the encoding that corresponds to the behavior of brev8. This is similar to what we do for orc.b where we use the GORC ISD opcode.
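To make the relationship concrete, here is a minimal 32-bit model of a generalized reverse, following the staged-swap formulation used in the RISC-V bit-manipulation drafts. The function names are illustrative, and the control value 7 for brev8 is stated here as an assumption of this sketch rather than taken from the commit.

```cpp
#include <cassert>
#include <cstdint>

// Generalized reverse on 32 bits: each set bit of Shamt enables one swap stage.
uint32_t grev32(uint32_t X, unsigned Shamt) {
  assert(Shamt < 32 && "only five swap stages exist for 32-bit values");
  if (Shamt & 1)  // swap adjacent bits
    X = ((X & 0x55555555u) << 1) | ((X & 0xAAAAAAAAu) >> 1);
  if (Shamt & 2)  // swap adjacent bit pairs
    X = ((X & 0x33333333u) << 2) | ((X & 0xCCCCCCCCu) >> 2);
  if (Shamt & 4)  // swap adjacent nibbles
    X = ((X & 0x0F0F0F0Fu) << 4) | ((X & 0xF0F0F0F0u) >> 4);
  if (Shamt & 8)  // swap adjacent bytes
    X = ((X & 0x00FF00FFu) << 8) | ((X & 0xFF00FF00u) >> 8);
  if (Shamt & 16) // swap half-words
    X = ((X & 0x0000FFFFu) << 16) | ((X & 0xFFFF0000u) >> 16);
  return X;
}

// Reversing the bits inside each byte enables only the three lowest stages.
uint32_t brev8Sketch(uint32_t X) { return grev32(X, 7); }
```

For example, `brev8Sketch(0x01020304)` yields `0x8040C020`: every byte has its bits reversed while the byte order is unchanged.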
-
Craig Topper authored
Especially placing W instructions/patterns near their non-W versions.
-
Craig Topper authored
-
- Jan 29, 2022
-
-
Nuno Lopes authored
phi([undef, A], [x, B]) -> x is only correct if x is guaranteed to be a non-poison value. Otherwise we would be changing an undef to poison in the branch A. Differential Revision: https://reviews.llvm.org/D117907
-