- Apr 07, 2022
-
-
Benjamin Kramer authored
-
Weining Lu authored
-
Fraser Cormack authored
This patch has no effect on the generated code, whilst mitigating the increase in ISel table size caused by the recent addition of masked patterns. I aim to do the same for floating-point patterns once D123051 lands, giving us a reason to use masked floating-point patterns. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D123217
-
Benjamin Kramer authored
-
Benjamin Kramer authored
Utils can't depend on Scalar transforms.
-
Florian Hahn authored
This brings the VPlan block naming in line with the naming of the generated basic blocks.
-
Fraser Cormack authored
This patch adds the necessary infrastructure to lower vp.fcmp via ISD::VP_SETCC to RVV instructions. Most notably this patch adds cond-code legalization for VP_SETCC, reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in additional SDValue parameters for the Mask and EVL. This method then uses VP operations to legalize the condcode. There is still a general lack of canonicalization on VP_SETCC as opposed to SETCC which results in worse code than is theoretically possible. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123051
-
Nikita Popov authored
This removes options for performing LTO with the legacy pass manager in LLD. Options that explicitly enable the new pass manager are retained as no-ops. Differential Revision: https://reviews.llvm.org/D123219
-
Nikita Popov authored
This option controls whether -opaque-pointers or -no-opaque-pointers is the default. Once opaque pointers are enabled by default, this will provide a simple way to temporarily opt-out of the change. Differential Revision: https://reviews.llvm.org/D123122
-
Valentin Clement authored
This patch enhances the CSE pass to deal with simple cases of duplicated operations with MemoryEffects. It allows the CSE pass to safely remove duplicate operations with MemoryEffects::Read when no other side-effecting operations occur between them. Other MemoryEffects::Read operations are allowed in between. The use case is pretty simple so far, so we can build on top of it to add more features. This patch is also meant to avoid a dedicated CSE pass in FIR and was put together after discussion on https://reviews.llvm.org/D112711. It does not currently cover the full range of use cases described in https://reviews.llvm.org/D112711, but the idea is to gradually enhance the MLIR CSE pass to handle common use cases that can be used by other dialects. This patch takes advantage of the new CSE capabilities in FIR. Reviewed By: mehdi_amini, rriddle, schweitz Differential Revision: https://reviews.llvm.org/D122801
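The rule described in this commit message can be illustrated with a toy model. The sketch below is not the MLIR implementation and uses no MLIR API: `Effect`, `Op`, and `findRedundantOps` are hypothetical names, operands are plain integer value ids, and any write is conservatively assumed to invalidate every remembered read. Replacing the uses of a removed operation is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and not part of MLIR.
enum class Effect { None, Read, Write };

struct Op {
  std::string name;          // operation name, e.g. "load"
  std::vector<int> operands; // value ids used by the operation
  Effect effect;             // simplified memory effect
};

static bool sameOp(const Op &a, const Op &b) {
  return a.name == b.name && a.operands == b.operands && a.effect == b.effect;
}

// Returns the indices of operations that duplicate an earlier operation with
// no intervening write. Reads between the two copies do not block the match.
std::vector<std::size_t> findRedundantOps(const std::vector<Op> &block) {
  std::vector<std::size_t> redundant;
  std::vector<std::size_t> available; // earlier ops that may still be reused
  for (std::size_t i = 0; i < block.size(); ++i) {
    const Op &op = block[i];
    if (op.effect == Effect::Write) {
      // Assumption: a write may clobber anything, so forget all reads.
      // Effect-free ops stay available.
      available.erase(
          std::remove_if(available.begin(), available.end(),
                         [&](std::size_t j) {
                           return block[j].effect == Effect::Read;
                         }),
          available.end());
      continue;
    }
    bool duplicate = false;
    for (std::size_t j : available) {
      if (sameOp(block[j], op)) {
        duplicate = true;
        break;
      }
    }
    if (duplicate)
      redundant.push_back(i);
    else
      available.push_back(i);
  }
  return redundant;
}
```

In this model, a second identical read is reported as redundant, while an identical read that follows a write is kept.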
-
Wei Xiao authored
smin(x, 0): (select (x < 0), x, 0) -> (x >> (size_in_bits(x)-1)) & x
smax(x, 0): (select (x > 0), x, 0) -> (~(x >> (size_in_bits(x)-1))) & x
For smax, the comparison tests for a positive value, so we have to invert the sign bit mask; only do that transform if the target has a bitwise 'and not' instruction (so the invert is free). The transform is performed only when the CMP has a single user, to avoid increasing the total instruction count. https://alive2.llvm.org/ce/z/euUnNm https://alive2.llvm.org/ce/z/37339J Differential Revision: https://reviews.llvm.org/D123109
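The two identities can be checked at the source level with a minimal sketch for 32-bit integers. The function names here are hypothetical, and this is not the DAG combine itself. It assumes `>>` on a negative signed value is an arithmetic shift, which C++20 guarantees and mainstream compilers implement for earlier standards.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// All ones when x is negative, all zeros otherwise.
// Assumes arithmetic right shift for signed values (guaranteed since C++20).
inline int32_t signMask(int32_t x) {
  return x >> std::numeric_limits<int32_t>::digits; // shift by 31
}

// smin(x, 0): keep x only when it is negative.
inline int32_t sminZero(int32_t x) { return signMask(x) & x; }

// smax(x, 0): keep x only when it is non-negative (inverted mask).
inline int32_t smaxZero(int32_t x) { return ~signMask(x) & x; }

// Compares the bitwise forms against the select forms for one value.
inline void checkSample(int32_t x) {
  assert(sminZero(x) == (x < 0 ? x : 0));
  assert(smaxZero(x) == (x > 0 ? x : 0));
}
```

Calling `checkSample` on a handful of values, including zero and the extremes of `int32_t`, exercises both sides of each select.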
-
Nikita Popov authored
LoopSink with the legacy pass manager still uses AST, because we can't compute MemorySSA conditionally. I think now that the legacy pass manager will be removed soon(TM) we don't need to care about compile-time impact here anymore. Additionally, since MemorySSA is no longer eagerly optimized, the impact is actually not that high anymore (~0.2% geomean regression on CTMark). This just makes legacy PM and new PM behavior line up -- as a followup I'll drop these options entirely and make MemorySSA use mandatory. Differential Revision: https://reviews.llvm.org/D123216
-
Balázs Kéri authored
Another change to the code design. The code is simplified again: there is now a single place to check a handler function, and fewer functions for emitting bug reports. More details are added to the bug report messages. Reviewed By: whisperity Differential Revision: https://reviews.llvm.org/D118370
-
Tobias Hieta authored
Follow-up from 98bc304e. While that commit fixed the case where two PDBs collide on the same Guid, it didn't fix the case where more than two PDBs use the same Guid. This commit fixes that and also tests much more carefully that all the types are correct no matter the order. Reviewed By: aganea, saudi Differential Revision: https://reviews.llvm.org/D123185
-
Stanislav Mekhanoshin authored
Added -disable-gisel-legality-check to a couple of GlobalISel tests that contain illegal instructions, to avoid differences between debug and release builds.
-
Jason Molenda authored
debugserver does not call thread_set_state when changing xmm/ymm/zmm register values, so the register contents are never updated. Fix that. Mark the shell tests that were xfailed on darwin systems so that they xfail only when using the system debugserver; they will pass when using the in-tree debugserver. When this makes it into the installed system debugservers, we'll remove the xfails. Differential Revision: https://reviews.llvm.org/D123269 rdar://91258333 rdar://31294382
-
Liqin Weng authored
Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122644
-
Petr Hosek authored
We would like to use BOLT with the Fuchsia toolchain. Differential Revision: https://reviews.llvm.org/D123280
-
Fangrui Song authored
It is used by a few projects like keepassxc and mumble. Also see https://bugzilla.redhat.com/show_bug.cgi?id=2070813 that Fedora gcc has an (unneeded) gcc12-no-add-needed.patch which adds --no-add-needed, although --[no-]add-needed has been deprecated in GNU ld since 2009. Adding this has low costs and makes several folks happy. This basically restores 8f13bef5. Fixes https://github.com/llvm/llvm-project/issues/54756
-
Fangrui Song authored
-
Jez Ng authored
-
Fangrui Song authored
-
Fangrui Song authored
(The upgrade of the ppc64le bot and D121257 have fixed compiler-rt failures. Tested by nemanjai.) Default the option introduced in D113372 to ON to match all(?) major Linux distros. This matches GCC and improves consistency with Android and linux-musl which always default to PIE. Note: CLANG_DEFAULT_PIE_ON_LINUX may be removed in the future. Differential Revision: https://reviews.llvm.org/D120305
-
Fangrui Song authored
Reviewed By: zixuan-wu Differential Revision: https://reviews.llvm.org/D122872
-
Jun Zhang authored
Signed-off-by: Jun Zhang <jun@junz.org>
-
Matt Arsenault authored
Use new NotAtomic expansion to turn these into the equivalent non-atomic operations. Independent lanes cannot access the private memory of other lanes, so there's no possibility for synchronization. These don't really appear directly in user code, but InferAddressSpaces can make these appear after optimizations. Fixes issues 54693 and 54274.
-
Matt Arsenault authored
Currently LowerAtomics exists as a separate pass which blindly replaces all atomics. Add a new lowering strategy option to eliminate the atomics which the target can control on a per-instruction level.
-
Matt Arsenault authored
Use the same enum as the other atomic instructions for consistency, in preparation for addition of another strategy. Introduce a new "Expand" option, since the store expansion does not use cmpxchg. Alternatively, the existing CmpXChg strategy could be renamed to Expand.
-
Lian Wang authored
Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D122786
-
Krystian Kuzniarek authored
Differential Revision: https://reviews.llvm.org/D122064
-
River Riddle authored
These are functionally identical, and merging the two reduces the number of redundant conversions within the parser.
-
Stanislav Mekhanoshin authored
Differential Revision: https://reviews.llvm.org/D123268
-
Michael Kruse authored
In a clean build directory, `check-openmp` or `check-libomptarget` will fail because of missing device RTL .bc files. Ensure that the new custom targets `omptarget.devicertl.nvptx` and `omptarget.devicertl.amdgpu` (corresponding to the plugin RTL targets `omptarget.rtl.cuda` and `omptarget.rtl.amdgpu`, respectively) are dependencies of the regression tests. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D123177
-
LLVM GN Syncbot authored
-
Matt Arsenault authored
This will allow code sharing from AtomicExpandPass. Not entirely sure why these exist as separate passes though.
-
River Riddle authored
This commit refactors the expected form of native constraint and rewrite functions, and greatly reduces the user complexity required when defining a native function. Namely, this commit adds automatic processing of the necessary PDLValue glue code, and allows users to define constraint/rewrite functions using the C++ types that they actually want to use. As an example, consider a simple rewrite as defined today:
```
static void rewriteFn(PatternRewriter &rewriter, PDLResultList &results,
                      ArrayRef<PDLValue> args) {
  ValueRange operandValues = args[0].cast<ValueRange>();
  TypeRange typeValues = args[1].cast<TypeRange>();
  ...
  // Create an operation at some point and pass it back to PDL.
  Operation *op = rewriter.create<SomeOp>(...);
  results.push_back(op);
}
```
After this commit, that same rewrite could be defined as:
```
static Operation *rewriteFn(PatternRewriter &rewriter,
                            ValueRange operandValues, TypeRange typeValues) {
  ...
  // Create an operation at some point and pass it back to PDL.
  return rewriter.create<SomeOp>(...);
}
```
Differential Revision: https://reviews.llvm.org/D122086
-
Petr Hosek authored
This includes the missing variables as pointed out in https://reviews.llvm.org/rGb0e2ffe151c3
-
Aart Bik authored
Rationale: Allocating the temporary buffers for access pattern expansion on the stack (using alloca) is a bit too aggressive, since it easily runs out of stack space for large enveloping tensor dimensions. This revision changes the allocation of these buffers to use explicit alloc/dealloc pairs. Reviewed By: bixia, wrengr Differential Revision: https://reviews.llvm.org/D123253
-
Simon Dardis authored
LLVM so far has only supported the MIPS-II and above architectures. MIPS-II is pretty close to MIPS-I, the major difference being that in MIPS-I "load" instructions always take one extra instruction slot to propagate to registers. This patch adds support for MIPS-I by adding hazard handling for load delay slots, alongside MIPSR6 forbidden slots and FPU slots, inserting a NOP instruction between a load and any instruction immediately following that reads the load's destination register. I also included a simple regression test. Since no existing tests target MIPS-I, those all still pass. Issue ref: https://github.com/simias/psx-sdk-rs/issues/1 I also tested by building a simple demo app with Clang and running it in an emulator. Patch by: @impiaaa Differential Revision: https://reviews.llvm.org/D122427
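The load delay hazard rule described above can be sketched on a toy instruction list. This is an illustration only, not the code from the patch: `Inst`, `readsReg`, and `insertLoadDelayNops` are hypothetical names, and registers are plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction model; hypothetical, not LLVM's MachineInstr.
struct Inst {
  std::string op;        // mnemonic, e.g. "lw"
  int dest;              // destination register, -1 if none
  std::vector<int> srcs; // source registers
  bool isLoad;           // true for load instructions
};

static bool readsReg(const Inst &inst, int reg) {
  for (int src : inst.srcs)
    if (src == reg)
      return true;
  return false;
}

// Inserts a NOP after each load whose destination register is read by the
// instruction immediately following it.
std::vector<Inst> insertLoadDelayNops(const std::vector<Inst> &in) {
  std::vector<Inst> out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    const Inst &cur = in[i];
    assert(!cur.isLoad || cur.dest >= 0); // a load must define a register
    out.push_back(cur);
    if (cur.isLoad && i + 1 < in.size() && readsReg(in[i + 1], cur.dest))
      out.push_back(Inst{"nop", -1, {}, false});
  }
  return out;
}
```

A load followed directly by a consumer of its result gains one NOP, while a load followed by an unrelated instruction is left unchanged.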
-
Stanislav Mekhanoshin authored
-