- Jan 31, 2022
-
-
Kerry McLaughlin authored
Fixes a crash ('Invalid size request on a scalable vector') in visitAlloca() when we call this function for a scalable alloca instruction, caused by the implicit conversion of TySize to uint64_t. This patch changes TySize to a TypeSize as returned by getTypeAllocSize() and ensures the allocation size is multiplied by vscale for scalable vectors. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D118372
-
Dávid Bolvanský authored
Very similar to https://reviews.llvm.org/D101230 Fixes https://github.com/llvm/llvm-project/issues/53501
-
Simon Pilgrim authored
[X86] combineAnd() - per-element simplification - call SimplifyDemandedBits using mask demanded bits if SimplifyDemandedVectorElts fails We already call SimplifyDemandedVectorElts using whether each vector mask element is zero/nonzero, this just extends this to also try SimplifyDemandedBits using the demanded bits mask generated from the nonzero elements. This also requires an additional TargetLowering::SimplifyDemandedBits DemandedBits/DemandedElts wrapper.
-
Jeremy Morse authored
If we only assign a variable value a single time, we can take a short-cut when computing its location: the variable value is only valid up to the dominance frontier of where the assignemnt happens. Past that point, there are other predecessors from where the variable has no value, meaning the variable has no location past that point. This patch recognises this scenario, and avoids expensive SSA computation, to improve compile-time performance. Differential Revision: https://reviews.llvm.org/D117877
-
Simon Pilgrim authored
-
Simon Pilgrim authored
We can only assume bit[1] == zero if its the only demanded bit or the source is not undef/poison
-
Fangrui Song authored
[mlgo][regalloc] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after a8a7bf92
-
- Jan 30, 2022
-
-
Mircea Trofin authored
If AllocationOrder has less than 32 elements, we were treating the extra positions as if they were valid. This was detected by a subsequent assert. The fix also tightens the asserts.
-
Markus Böck authored
Both IDFCalculatorBase and its accompanying DominatorTreeBase only supports pointer nodes. The template argument is the block type itself and any uses of GraphTraits is therefore done via a pointer to the node type. However, the ChildrenGetterTy type of IDFCalculatorBase has a use on just the node type instead of a pointer to the node type. Various parts of the monorepo has worked around this issue by providing specializations of GraphTraits for the node type directly, or not been affected by using specializations instead of the generic case. These are unnecessary however and instead the generic code should be fixed instead. An example from within Tree is eg. A use of IDFCalculatorBase in InstrRefBasedImpl.cpp. It basically instantiates a IDFCalculatorBase<MachineBasicBlock, false> but due to the bug above then goes on to specialize GraphTraits<MachineBasicBlock> although GraphTraits<MachineBasicBlock*> exists (and should be used instead). Similar dead code exists in clang which defines redundant GraphTraits to work around this bug. This patch fixes both the original issue and removes the dead code that was used to work around the issue. Differential Revision: https://reviews.llvm.org/D118386
-
Kazu Hirata authored
Identified with modernize-use-default-member-init.
-
- Jan 29, 2022
-
-
Mircea Trofin authored
This allows scearios where some central config sets it one way and a user wants to override it.
-
- Jan 28, 2022
-
-
Fangrui Song authored
Close #52781: for LTO, the inline asm diagnostic uses `<inline asm>` as the file name (lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp) and it is unclear which module has the issue. With this patch, we will see the module name (say `asm.o`) before `<inline asm>` with ThinLTO. ``` % clang -flto=thin -c asm.c && myld.lld asm.o -e f ld.lld: error: asm.o <inline asm>:1:2: invalid instruction mnemonic 'invalid' invalid ^~~~~~~ ``` For regular LTO, unfortunately the original module name is lost and we only get ld-temp.o. Reviewed By: #lld-macho, ychen, Jez Ng Differential Revision: https://reviews.llvm.org/D118434
-
Cullen Rhodes authored
If we have a vector FP division with a splatted divisor, use getVectorMinNumElements when scaling the num of uses by splat factor. For AArch64 the combine kicks in for the <vscale x 4 x float> case since it's above the fdiv threshold (3) when scaling num uses by splat factor, but the codegen is worse (splat + vector fdiv + vector fmul) than the <vscale x 2 x double> case (splat + vector fdiv). If the combine could be converted into a scalar FP division by scalarizeBinOpOfSplats it may be cheaper, but it looks like this is predicated on the isExtractVecEltCheap TLI function which is implemented for x86 but not AArch64. Perhaps for now combineRepeatedFPDivisors should only scale num uses by splat if the division can be converted into scalar op. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D118343
-
Jeremy Morse authored
Shiny new DBG_PHI instruction usually have physical registers as operands -- however, the machine verifier checks to see whether they're live, and occasionally this fails. There's a filter for DBG_VALUE instructions to not get verified in this way: expand it to exempt all debug instructions from liveness checking, which means DBG_PHIs get treated like DBG_VALUEs. This also future proofs against us adding new debug instructions. Differential Revision: https://reviews.llvm.org/D117891
-
Martin Storsjö authored
On the level of the generated object files, both symbols (both original and alias) are generally indistinguishable - both are regular defined symbols. But previously, only the original function had the COFF ComplexType set to IMAGE_SYM_DTYPE_FUNCTION, while the symbol created via an alias had the type set to IMAGE_SYM_DTYPE_NULL. This matches what GCC does, which emits directives for setting the COFF symbol type for this kind of alias symbol too. This makes a difference when GNU ld.bfd exports symbols without dllexport directives or a def file - it seems to decide between function or data exports based on the COFF symbol type. This means that functions created via aliases, like some C++ constructors, are exported as data symbols (missing the thunk for calling without dllimport). The hasnt been an issue when doing the same with LLD, as LLD decides between function or data export based on the flags of the section that the symbol points at. This should fix the root cause of https://github.com/msys2/MINGW-packages/issues/10547. Differential Revision: https://reviews.llvm.org/D118328
-
Ellis Hoag authored
Use the llvm flag `-pgo-function-entry-coverage` to create single byte "counters" to track functions coverage. This mode has significantly less size overhead in both code and data because * We mark a function as "covered" with a store instead of an increment which generally requires fewer assembly instructions * We use a single byte per function rather than 8 bytes per block The trade off of course is that this mode only tells you if a function has been covered. This is useful, for example, to detect dead code. When combined with debug info correlation [0] we are able to create an instrumented Clang binary that is only 150M (the vanilla Clang binary is 143M). That is an overhead of 7M (4.9%) compared to the default instrumentation (without value profiling) which has an overhead of 31M (21.7%). [0] https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4 Reviewed By: kyulee Differential Revision: https://reviews.llvm.org/D116180
-
- Jan 27, 2022
-
-
Simon Pilgrim authored
We already perform some basic folds (add/sub with zero etc.) on scalar types, this patch adds some basic support for constant splats as well in a few cases (we can add more with future test coverage). In the cases I've enabled, we can handle buildvector implicit truncation as we're not creating new constant nodes from the vector types - we're just returning existing nodes. This allows us to get a number of extra cases in the aarch64 tests. I haven't enabled support for undefs in buildvector splats, as we're often checking for zero/allones patterns that return the original constant and we shouldn't be returning undef elements in some of these cases - we can enable this later if we're OK with creating new constants. Differential Revision: https://reviews.llvm.org/D118264
-
Fraser Cormack authored
This patch adds support for expanding VP_MERGE through a sequence of vector operations producing a full-length mask setting up the elements past EVL/pivot to be false, combining this with the original mask, and culminating in a full-length vector select. This expansion should work for any data type, though the only use for RVV is for boolean vectors, which themselves rely on an expansion for the VSELECT. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118058
-
- Jan 26, 2022
-
-
Adrian Prantl authored
A shift-left > 63 triggers a UBSAN failure. This patch kicks the can down the road (to the consumer) by emitting a more compact representation of the shift computation in DWARF expressions. Relanding (I accidentally pushed an earlier version of the patch previously). Differential Revision: https://reviews.llvm.org/D118183
-
Adrian Prantl authored
This reverts commit 216002c4 while investigating bot breakage.
-
Matt Arsenault authored
The physical register in the asm has the wrong type for the declared IR. It seems to work in the DAG by extracting the 4 elements that are defined in the IR from the register, but that isn't handled here. This doesn't seem to be a well tested path since other mismatched cases are crashing the DAG asm handling.
-
Adrian Prantl authored
A shift-left > 63 triggers a UBSAN failure. This patch kicks the can down the road (to the consumer) by emitting a more compact representation of the shift computation in DWARF expressions. Differential Revision: https://reviews.llvm.org/D118183
-
Chih-Ping Chen authored
DIStringType is used to encode the debug info of a character object in Fortran. A Fortran deferred-length character object is typically implemented as a pair of the following two pieces of info: An address of the raw storage of the characters, and the length of the object. The stringLocationExp field contains the DIExpression to get to the raw storage. This patch also enables the emission of DW_AT_data_location attribute in a DW_TAG_string_type debug info entry based on stringLocationExp in DIStringType. A test is also added to ensure that the bitcode reader is backward compatible with the old DIStringType format. Differential Revision: https://reviews.llvm.org/D117586
-
Benjamin Kramer authored
This reverts commit ef820632. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat
-
Sanjay Patel authored
The loop below the changed line assumes that the element width of the target constant is the same as the element width of the loaded value, but that is not always true. We could try harder to do some kind of min/max calc even if the sizes don't match, but that can be another patch if needed. This fixes #53401 (miscompile) and does not change the motivating cases added when this analysis was introduced: ad298f86
-
serge-sans-paille authored
As a conquence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).
-
alex-t authored
Currently not (xor_one_use) pattern is always selected to S_XNOR irrelative od the node divergence. This relies on further custom selection pass which converts to VALU if necessary and replaces with V_NOT_B32 ( V_XOR_B32) on those targets which have no V_XNOR. Current change enables the patterns which explicitly select the not (xor_one_use) to appropriate form. We assume that xor (not) is already turned into the not (xor) by the combiner. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D116270
-
Sebastian Neubauer authored
Fold (unmerge undef) -> undef, undef, ... Differential Revision: https://reviews.llvm.org/D118138
-
David Green authored
This is the unsigned variant of D111976, where we convert a clamped fptoui to a fptoui.sat. Because we are unsigned, the condition this time is only UMIN of UINT_MAX. Similarly to D111976 it handles ISD::UMIN, ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes. This especially helps on ARM/AArch64 where the vcvt instructions naturally saturate the result. Differential Revision: https://reviews.llvm.org/D114964
-
wangpc authored
When evicting interference, it causes an asseertion error since LiveIntervals::intervalIsInOneMBB assumes that input is not empty. This patch fixed bug mentioned in D118020. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D118124
-
- Jan 25, 2022
-
-
Adrian Prantl authored
"Fix UB in DwarfExpression::emitLegacyZExt()" This reverts commit e37de5d3.
-
Adrian Prantl authored
A shift-left > 63 triggers a UBSAN failure. This patch kicks the can down the road (to the consumer) by emitting a more compact representation of the shift computation in DWARF expressions. Differential Revision: https://reviews.llvm.org/D118183
-
Sean Fertile authored
During fast-isel calling 'markFunctionEnd' in the base class will call tidyLandingPads. This can cause an issue where we have determined that we need ehinfo and emitted a traceback table with the bits set to indicate that we will be emitting the ehinfo, but the tidying deletes all landing pads. In this case we end up emitting a reference to __ehinfo.N symbol, but not emitting a definition to said symbol and the resulting file fails to assemble. Differential Revision: https://reviews.llvm.org/D117040
-
Nikita Popov authored
Same change as has been made for the SDAG lowering.
-
Simon Pilgrim authored
Now that we constant fold and canonicalize constants to the RHS, we don't need to check both LHS and RHS for specific constants
-
Bjorn Pettersson authored
An sra is basically sign-extending a narrower value. Fold away the shift by doing a sextload of a narrower value, when it is legal to reduce the load width accordingly. Differential Revision: https://reviews.llvm.org/D116930
-
Fraser Cormack authored
This patch adds widening support for ISD::VP_MERGE, which widens identically to VP_SELECT and similarly to other select-like nodes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118030
-
Fraser Cormack authored
This patch adds splitting support for ISD::VP_MERGE, which splits identically to VP_SELECT and similarly to other select-like nodes. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118032
-
Victor Perez authored
Split these nodes in a similar way as their masked versions. Reviewed By: frasercrmck, craig.topper Differential Revision: https://reviews.llvm.org/D117760
-
Nikita Popov authored
Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType(). This is part of D117885, in preparation for deprecating the API.
-