- Feb 24, 2022
-
-
Stanislav Mekhanoshin authored
Loads and stores can be out of order in the SILoadStoreOptimizer. When combining the MachineMemOperands of two instructions, the operands are passed to combineKnownAdjacentMMOs in IR order. At the moment it picks the first operand and just replaces its offset and size, which essentially loses alignment information and may result in an incorrect base pointer being used. Use the base pointer that comes first in memory address order instead and only adjust the size. Differential Revision: https://reviews.llvm.org/D120370
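For illustration, a minimal C++ sketch of the idea (hypothetical types, not the actual SILoadStoreOptimizer code): when merging two known-adjacent memory operands, keep the one whose address comes first in memory as the base and only widen the size, so the base pointer and its alignment stay valid.

```cpp
#include <cstdint>

// Hypothetical stand-in for a MachineMemOperand: an address, a size and the
// alignment known for that particular operand.
struct MemOpInfo {
  uint64_t Addr;
  uint64_t Size;
  uint64_t Align;
};

// Combine two adjacent operands into one operand covering both accesses.
MemOpInfo combineAdjacent(const MemOpInfo &A, const MemOpInfo &B) {
  // Order by address, not by the order the instructions appear in the IR.
  const MemOpInfo &Lo = A.Addr <= B.Addr ? A : B;
  const MemOpInfo &Hi = A.Addr <= B.Addr ? B : A;
  MemOpInfo Combined = Lo;                       // keep the lower base pointer
  Combined.Size = (Hi.Addr + Hi.Size) - Lo.Addr; // only the size is adjusted
  return Combined;
}
```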
-
Amir Ayupov authored
Introduce an option to expand all CMOV groups into hammocks, matching GCC's `-fno-if-conversion2` flag. The motivation is to leave CMOV conversion opportunities to a binary optimizer that can make the decision based on branch misprediction rate (available e.g. in Intel's LBR). Reviewed By: MaskRay, skan Differential Revision: https://reviews.llvm.org/D119777
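Roughly what the expansion means at the source level (an assumed example, not BOLT or LLVM code): a conditional move keeps both inputs on the critical path, while the hammock form trades that for a branch whose cost depends on prediction.

```cpp
// Typically lowered to a CMOV on x86: no branch, but both 'a' and 'b' are
// always evaluated on the critical path.
int select_cmov(bool c, int a, int b) { return c ? a : b; }

// The expanded "hammock" form: a short forward branch over a single
// assignment, cheap whenever the branch predicts well.
int select_hammock(bool c, int a, int b) {
  int r = b;
  if (c)
    r = a;
  return r;
}
```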
-
Benjamin Kramer authored
-
Craig Topper authored
-
Momchil Velikov authored
Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D112327
-
Fangrui Song authored
-
Thomas Raoux authored
This transformation is useful for breaking dependencies between consecutive loop iterations by increasing the size of a temporary buffer. It is usually combined with heavy software pipelining. Differential Revision: https://reviews.llvm.org/D119406
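A hand-written C++ analogue of the effect (a hypothetical example, not the MLIR pass itself): rotating between several copies of the temporary buffer removes the write-after-read dependency that otherwise forces iteration i+1 to wait for iteration i, which is what enables aggressive software pipelining.

```cpp
#include <vector>

constexpr int kTile = 128;

// One pipeline stage: produce into a temporary, then consume it.
void stage(std::vector<float> &dst, const std::vector<float> &src, int i,
           float *tmp) {
  for (int j = 0; j < kTile; ++j)
    tmp[j] = src[i * kTile + j] * 2.0f;   // producer
  for (int j = 0; j < kTile; ++j)
    dst[i * kTile + j] = tmp[j] + 1.0f;   // consumer
}

void run(std::vector<float> &dst, const std::vector<float> &src, int iters) {
  // Multi-buffered temporary: iteration i uses buffer i % 2, so consecutive
  // iterations no longer reuse (and thus serialize on) the same storage.
  std::vector<float> tmp(2 * kTile);
  for (int i = 0; i < iters; ++i)
    stage(dst, src, i, tmp.data() + (i % 2) * kTile);
}
```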
-
Craig Topper authored
Trying to reduce the diffs from D118333 for cases where it makes more sense to use an FP ABI. Reviewed By: asb, kito-cheng Differential Revision: https://reviews.llvm.org/D120447
-
Momchil Velikov authored
The PostRA scheduler can reorder non-CFI instructions in a way that makes the unwind info not instruction precise. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D112326
-
Craig Topper authored
Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118974
-
Valentin Clement authored
Add lowering for simple assignment on allocatable scalars. This patch is part of the upstreaming effort from the fir-dev branch. Depends on D120483. Reviewed By: PeteSteinfeld Differential Revision: https://reviews.llvm.org/D120488
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
-
Stanislav Gatev authored
When assigning a value to a storage location of a struct member we need to also update the value in the corresponding `StructValue`. This is part of the implementation of the dataflow analysis framework. See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev. Reviewed-by: ymandel, xazax.hun Differential Revision: https://reviews.llvm.org/D120414
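Conceptually (a simplified sketch with made-up types, not the real Clang dataflow API), the environment maps storage locations to values, and an aggregate value keeps per-field children that have to be updated together with the member's own location:

```cpp
#include <map>
#include <string>

struct Value {};

// Simplified stand-in for StructValue: one child value per field.
struct StructValue {
  std::map<std::string, Value *> Children;
};

// Simplified stand-in for the analysis environment.
struct Environment {
  std::map<int, Value *> LocToVal; // storage location id -> value

  void assignToMember(int MemberLoc, StructValue &Parent,
                      const std::string &Field, Value &NewVal) {
    LocToVal[MemberLoc] = &NewVal;    // the member's own storage location
    Parent.Children[Field] = &NewVal; // keep the enclosing StructValue in sync
  }
};
```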
-
Sven van Haastregt authored
After D120254 some clang-tidy tests started failing on release builds. clang-tidy has been using the `-fdeclare-opencl-builtins` functionality since this became the default in clang, so there is no need to include `opencl-c.h`. Differential Revision: https://reviews.llvm.org/D120470
-
Sanjay Patel authored
This is the SDAG translation of D120253: https://alive2.llvm.org/ce/z/qHpmNn The SDAG nodes can have different operand types than the result value. We can see an example of that with AArch64 - the funnel shift amount is an i64 rather than an i32. We may need to make that match even more flexible to handle post-legalization nodes, but I have not stepped into that yet. Differential Revision: https://reviews.llvm.org/D120264
-
Valentin Clement authored
This patch handles allocatable dummy argument lowering in functions and subroutines. This patch is part of the upstreaming effort from the fir-dev branch. Reviewed By: schweitz Differential Revision: https://reviews.llvm.org/D120483
Co-authored-by: Jean Perier <jperier@nvidia.com>
-
Anton Korobeynikov authored
-
Aaron Ballman authored
-
Joseph Huber authored
Summary: We use a section to embed offloading code into the host for later linking. This is normally unique to the translation unit as it is thrown away during linking. However, if the user performs a relocatable link the sections will be merged and we won't be able to access the files stored inside. This patch changes the section variables to have external linkage and a name defined by the section name, so if two sections are combined during linking we get an error.
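A sketch of the pattern at the source level (assumed section and symbol names, not the actual clang-generated code): with external linkage and a name tied to the section, two relocatable objects merged with `ld -r` produce a duplicate-symbol error instead of silently concatenating payloads that can no longer be told apart.

```cpp
// Every translation unit that embeds offloading code would emit a definition
// like this. With "static" the sections from two objects would merge
// silently; with external linkage the second definition is a link error.
__attribute__((section(".llvm.offloading")))
extern const unsigned char llvm_offloading_entry[] = {0xde, 0xad, 0xbe, 0xef};
```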
-
Sanjay Patel authored
The corner case where 'nsz' needs to be removed is very narrow as discussed here: https://reviews.llvm.org/rG3cdd05e519dd If the select condition is not undef, there's no problem with propagating 'nsz': https://alive2.llvm.org/ce/z/4GWJdq
-
Sanjay Patel authored
-
Jay Foad authored
When parsing MachineMemOperands, MIRParser treated the "align" keyword the same as "basealign". Really "basealign" should specify the alignment of the MachinePointerInfo base value, and "align" should specify the alignment of that base value plus the offset. This worked OK when the specified alignment was no larger than the alignment of the offset, but in cases like this it just caused confusion:
  STW killed %18, 4, %stack.1.ap2.i.i :: (store (s32) into %stack.1.ap2.i.i + 4, align 8)
MIRPrinter would never have printed this, with an offset of 4 but an align of 8, so it must have been written by hand. MIRParser would interpret "align 8" as "basealign 8", but I think it is better to give an error and force the user to write "basealign 8" if that is what they really meant. Differential Revision: https://reviews.llvm.org/D120400 Change-Id: I7eeeefc55c2df3554ba8d89f8809a2f45ada32d8
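For reference, a small stand-alone illustration of the arithmetic involved (an assumed helper, not MIRParser code): the alignment provable at `base + offset` is at most the smaller of the base alignment and the largest power of two dividing the offset, so "offset 4, align 8" is only consistent if "basealign 8" was intended.

```cpp
#include <cstdint>

// Best provable alignment of (base + Offset) given the base alignment.
uint64_t alignAtOffset(uint64_t BaseAlign, uint64_t Offset) {
  if (Offset == 0)
    return BaseAlign;
  uint64_t OffsetAlign = Offset & (~Offset + 1); // lowest set bit of Offset
  return BaseAlign < OffsetAlign ? BaseAlign : OffsetAlign;
}

// alignAtOffset(8, 4) == 4: the operand in the example above is at best
// 4-aligned, hence the new error asking the user to write "basealign 8".
```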
-
Marius Brehler authored
This adds a variable op, emitted as a C/C++ local variable, which can be used if the `emitc.constant` op is not sufficient. As an example, the canonicalization pass would transform

```mlir
%0 = "emitc.constant"() {value = 0 : i32} : () -> i32
%1 = "emitc.constant"() {value = 0 : i32} : () -> i32
%2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
%3 = emitc.apply "&"(%1) : (i32) -> !emitc.ptr<i32>
emitc.call "write"(%2, %3) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> ()
```

into

```mlir
%0 = "emitc.constant"() {value = 0 : i32} : () -> i32
%1 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
%2 = emitc.apply "&"(%0) : (i32) -> !emitc.ptr<i32>
emitc.call "write"(%1, %2) : (!emitc.ptr<i32>, !emitc.ptr<i32>) -> ()
```

resulting in pointer aliasing, as %1 and %2 point to the same address. In such a case, the `emitc.variable` operation can be used instead. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D120098
-
Corentin Jabot authored
This adds a diagnostic when an unqualified call is resolved to std::move or std::forward. This follows some C++ committee discussions where some people were concerned that this is a common enough and brittle enough anti-pattern to be worth warning about - both because move is a common name and because these functions accept any value. This warns regardless of whether the current context is in std:: or not, as implementations probably want to always qualify these calls too, to avoid triggering ADL accidentally. Differential Revision: https://reviews.llvm.org/D119670
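A minimal example of the kind of call the new diagnostic flags (hypothetical user code): the unqualified `move` resolves to `std::move` through argument-dependent lookup because the argument is a `std::string`.

```cpp
#include <string>
#include <utility>

std::string take(std::string s) { return s; }

std::string demo(std::string name) {
  // warning: unqualified call to 'std::move' -- write std::move(name) instead
  return take(move(name));
}
```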
-
Sven van Haastregt authored
Until now, any types that had TypeExtensions attached to them were not guarded with those extensions. Extend the OpenCLBuiltinFileEmitter such that all required extensions are emitted for the types of a builtin function. The `clang-tblgen -gen-clang-opencl-builtin-tests` emitter will now produce e.g.:

  #if defined(cl_khr_fp16) && defined(cl_khr_fp64)
  half8 test11802_convert_half8_rtp(double8 arg1) {
    return convert_half8_rtp(arg1);
  }
  #endif // TypeExtension

Differential Revision: https://reviews.llvm.org/D120262
-
Florian Hahn authored
-
Simon Pilgrim authored
We're still better off expanding this once we have PMOVZX
-
Simon Pilgrim authored
-
Shraiysh Vaishay authored
This patch removes binary operator enum which was introduced with `omp.atomic.update`. Now the update operation handles update in a region so this is no longer required. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D120458
-
Benjamin Kramer authored
On some standard library configurations these have a dependency on the complete type of SymbolizableModule. They also do a lot of copying/freeing, so there is no point in inlining them.
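The underlying C++ pattern, sketched with a hypothetical class (only `SymbolizableModule` is taken from the commit): declaring the special members in the header and defining them out of line means includers never need the complete type.

```cpp
#include <memory>
#include <vector>

class SymbolizableModule; // forward declaration is enough for the header

class ModuleCache {
public:
  ModuleCache();
  ModuleCache(ModuleCache &&) noexcept;            // defined out of line in the
  ModuleCache &operator=(ModuleCache &&) noexcept; // .cpp, where the complete
  ~ModuleCache();                                  // type is available

private:
  std::vector<std::unique_ptr<SymbolizableModule>> Modules;
};
```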
-
Alex Zinenko authored
Documentation exists about the details of the API but is missing a description of the overall structure per dialect. Reviewed By: shabalin Differential Revision: https://reviews.llvm.org/D117002
-
Benjamin Kramer authored
This codepath was entirely untested. Differential Revision: https://reviews.llvm.org/D120473
-
Roman Lebedev authored
-
serge-sans-paille authored
Estimation of the impact on preprocessor output: before: 1067349756 after: 1065940348 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120434
-
serge-sans-paille authored
Estimation of the impact on preprocessor output: before: 1067487786 after: 1067349756 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120433
-
Jay Foad authored
-
Florian Hahn authored
The tests show sub-optimal lowering of extend/cmp/select chains starting with v16i8 vectors.
-
Shao-Ce SUN authored
Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120412
-
Sven van Haastregt authored
This simplifies completeness comparisons against OpenCLBuiltins.td and also makes the header no longer "claim" the identifiers "image", "image_array", "coord", "sampler", "sample", "gradientX", "gradientY", "lod", and "color". Continues the direction set out in D119560.
-
Pavel Labath authored
-
Javier Setoain authored
The current implementation of ShuffleVectorOp assumes all vectors are scalable. LLVM IR allows shufflevector operations on scalable vectors, and the current translation between LLVM Dialect and LLVM IR does the right thing when the shuffle mask is all zeroes. This is required to do a splat operation on a scalable vector, but it doesn't make sense for scalable vectors outside of that operation, i.e., with masks that are not all zeroes. Differential Revision: https://reviews.llvm.org/D118371
-