- Jan 31, 2022
-
-
LLVM GN Syncbot authored
-
Kerry McLaughlin authored
Fixes a crash ('Invalid size request on a scalable vector') in visitAlloca() when we call this function for a scalable alloca instruction, caused by the implicit conversion of TySize to uint64_t. This patch changes TySize to a TypeSize as returned by getTypeAllocSize() and ensures the allocation size is multiplied by vscale for scalable vectors. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D118372
-
Ties Stuij authored
This patch upstreams support for the Arm-v8 Cortex-X1C processor for AArch64 and ARM. For more information, see: - https://community.arm.com/arm-community-blogs/b/announcements/posts/arm-cortex-x1c - https://developer.arm.com/documentation/101968/0002/Functional-description/Technical-overview/Components The following people contributed to this patch: - Simon Tatham - Ties Stuij Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D117202
-
Dávid Bolvanský authored
Very similar to https://reviews.llvm.org/D101230 Fixes https://github.com/llvm/llvm-project/issues/53501
-
Simon Pilgrim authored
[X86] combineAnd() - per-element simplification - call SimplifyDemandedBits using mask demanded bits if SimplifyDemandedVectorElts fails We already call SimplifyDemandedVectorElts using whether each vector mask element is zero/nonzero, this just extends this to also try SimplifyDemandedBits using the demanded bits mask generated from the nonzero elements. This also requires an additional TargetLowering::SimplifyDemandedBits DemandedBits/DemandedElts wrapper.
-
Jeremy Morse authored
If we only assign a variable value a single time, we can take a short-cut when computing its location: the variable value is only valid up to the dominance frontier of where the assignemnt happens. Past that point, there are other predecessors from where the variable has no value, meaning the variable has no location past that point. This patch recognises this scenario, and avoids expensive SSA computation, to improve compile-time performance. Differential Revision: https://reviews.llvm.org/D117877
-
Nico Weber authored
This reverts commit 7b2dfe1c. Matches ab3b8985.
-
Momchil Velikov authored
Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D118451
-
Simon Pilgrim authored
-
Simon Pilgrim authored
We can only assume bit[1] == zero if its the only demanded bit or the source is not undef/poison
-
Jay Foad authored
-
Jay Foad authored
If AMDGPUAnnotateUniformValues finds a load from a uniform pointer with no potentially clobbering stores between the kernel entry point and the load instruction, it adds noclobber metadata to the *address*. This is unsafe because it can get applied to other loads in the same which do have aliasing stores. Differential Revision: https://reviews.llvm.org/D118458
-
Simon Pilgrim authored
As raised by @efriedma on D117995 - the source must not be undef/poison to demand any bits in mul(x,x) other than bit[1] https://alive2.llvm.org/ce/z/Cxkjen
-
Jay Foad authored
This avoids various cases where StructurizeCFG would otherwise insert an xor i1 instruction, and it since it generally runs late in the pipeline, instcombine does not clean up the xor-of-cmp pattern. Differential Revision: https://reviews.llvm.org/D118478
-
Paulo Matos authored
This patches fixes the visibility and linkage information of symbols referring to IR globals. Emission of external declarations is now done in the first execution of emitConstantPool rather than in emitLinkage (and a few other places). This is the point where we have already gathered information about used symbols (by running the MC Lower PrePass) and not yet started emitting any functions so that any declarations that need to be emitted are done so at the top of the file before any functions. This changes the order of a few directives in the final asm file which required an update to a few tests. Reviewed By: sbc100 Differential Revision: https://reviews.llvm.org/D118122
-
Florian Hahn authored
The current cost-model overestimates the cost of vector compares & selects for ordered floating point compares. This patch fixes that by extending the existing logic for integer predicates. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D118256
-
serge-sans-paille authored
Based on the output of include-what you-use. Most notably, llvm/Remarks/Remark.h is no longer automatically included by llvm/Remarks/RemarkParser.h, so client code may need to include explicitly. clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Remarks/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l before: 770253 after: 759347 Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118506
-
serge-sans-paille authored
Based on the output of include-what-you-use. It's an utility directory, so no much impact on other code areas. clang++ -E -Iinclude -I../llvm/include ../llvm/utils/TableGen/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l before: 4327274 after: 4316190 Related discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118466
-
Nikita Popov authored
The pruning cloner already tries to remove unreachable blocks. The original cloning process will simplify instructions and constant terminators, and only clone blocks that are reachable at that point. However, phi nodes can only be simplified after everything has been cloned. For that reason, additional blocks may become unreachable after phi simplification. The code does try to handle this as well, but only removes blocks that don't have predecessors. It misses unreachable cycles. This can cause issues if SEH exception handling code is part of an unreachable cycle, as the inliner is not prepared to deal with that. This patch instead performs an explicit scan for reachable blocks, and drops everything else. Fixes https://github.com/llvm/llvm-project/issues/53206. Differential Revision: https://reviews.llvm.org/D118449
-
Nikita Popov authored
masked.atomicrmw.*.i32 intrinsics access an i32 (and then possibly mask it), so hardcode MVT::i32 as the access type here, rather than determining it from the pointer element type. Differential Revision: https://reviews.llvm.org/D118336
-
Amir Ayupov authored
-
Max Kazantsev authored
Following Sanjay's proposal from discussion in D118317, this patch generalizes and-reduce handling to fold the following pattern ``` icmp ne (bitcast(icmp ne (lhs, rhs)), 0) ``` into ``` icmp ne (bitcast(lhs), bitcast(rhs)) ``` https://alive2.llvm.org/ce/z/WDcuJ_ Differential Revision: https://reviews.llvm.org/D118431 Reviewed By: lebedev.ri
-
Craig Topper authored
This is a slight change because I'm using the ANY_EXTEND result instead of the original operand, but getNode should constant fold. While there, add a comment about why the code specifically checks for a ConstantSDNode.
-
Craig Topper authored
Make sure we cover the encodings use for zext.h and other encodings not used for zext.h.
-
Craig Topper authored
Based on the existing naming "only" tests are used for rv32 instructions that don't exist in rv64. rv32 tests without "only" are for instructions that are in both rv32 and rv64. The rv64 tests are for instructions that are only in rv64. Both of these test files have instruction encodings that are only valid in rv64 so they can be the same file.
-
Craig Topper authored
-
Craig Topper authored
"pack t0, t1, zero" disassembles to "pack t0, t1, zero" with Zbkb not "zext.h t0, t1" Part of the test was using a CHECK prefix that doesn't appear on the RUN line.
-
Kazu Hirata authored
Identified with readability-const-return-type.
-
Kazu Hirata authored
-
Fangrui Song authored
[mlgo][regalloc] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after a8a7bf92
-
- Jan 30, 2022
-
-
Mircea Trofin authored
If AllocationOrder has less than 32 elements, we were treating the extra positions as if they were valid. This was detected by a subsequent assert. The fix also tightens the asserts.
-
Craig Topper authored
These are special versions of the more general shfli/unshfli instructions. We can use the general ISD opcodes with the correct immediates.
-
Markus Böck authored
Both IDFCalculatorBase and its accompanying DominatorTreeBase only supports pointer nodes. The template argument is the block type itself and any uses of GraphTraits is therefore done via a pointer to the node type. However, the ChildrenGetterTy type of IDFCalculatorBase has a use on just the node type instead of a pointer to the node type. Various parts of the monorepo has worked around this issue by providing specializations of GraphTraits for the node type directly, or not been affected by using specializations instead of the generic case. These are unnecessary however and instead the generic code should be fixed instead. An example from within Tree is eg. A use of IDFCalculatorBase in InstrRefBasedImpl.cpp. It basically instantiates a IDFCalculatorBase<MachineBasicBlock, false> but due to the bug above then goes on to specialize GraphTraits<MachineBasicBlock> although GraphTraits<MachineBasicBlock*> exists (and should be used instead). Similar dead code exists in clang which defines redundant GraphTraits to work around this bug. This patch fixes both the original issue and removes the dead code that was used to work around the issue. Differential Revision: https://reviews.llvm.org/D118386
-
Craig Topper authored
We can use the RISCVISD::GREV encoding that swaps the bits in each byte. This allows it to use the existing computeKnownBits support for RISCVISD::GREV.
-
Kazu Hirata authored
Identified with modernize-use-nullptr.
-
Kazu Hirata authored
Identified with readability-string-compare.
-
Kazu Hirata authored
Identified with modernize-use-default-member-init.
-
Simon Pilgrim authored
Limit this to SSE41 - AVX1 targets to avoid UNPCKL(PSHUFB,PSHUFB), pre-SSE41 we don't have PACKUSDW/BLENDW and with AVX2 we can perform this as PERMQ(PSHUFB()).
-
Simon Pilgrim authored
Don't extract the ANY/ZERO_EXTEND_VECTOR_INREG subvector source until we're definitely combining to a new node.
-