- Mar 10, 2021
-
Mauri Mustonen authored
Add support for widening select instructions in the VPlan native path by using the correct recipe when such instructions are encountered. This recipe is already used by the inner loop vectorizer. Previously, select instructions were handled by the wrong recipe, which resulted in unreachable instruction errors like this one: https://bugs.llvm.org/show_bug.cgi?id=48139. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D97136
-
Giorgis Georgakoudis authored
The patch adds an argument to update_cc_test_checks for replacing a function name matching a regex. This functionality is needed to match generated function signatures that include file hashes.
Example: the function signature for the following function:
`__omp_offloading_50_b84c41e__Z9ftemplateIiET_i_l30_worker`
with `--replace-function-regex "__omp_offloading_[0-9]+_[a-z0-9]+_(.*)"` will become:
`CHECK-LABEL: @{{__omp_offloading_[0-9]+_[a-z0-9]+__Z9ftemplateIiET_i_l30_worker}}(`
Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D97107
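The substitution in the example can be reproduced with a short Python sketch. This is not the code from update_cc_test_checks; `check_label_for` is a hypothetical helper, and it assumes the regex has exactly one simple capture group. It replaces that group in the pattern text with the text it captured and wraps the result in `{{...}}`.

```python
import re


def check_label_for(name: str, pattern: str) -> str:
    """Build a CHECK-LABEL line for `name` (hypothetical helper).

    Assumes `pattern` contains exactly one non-nested capture group.
    """
    match = re.match(pattern, name)
    if match is None:
        # No match: emit the function name literally.
        return "CHECK-LABEL: @" + name + "("
    captured = match.group(1)
    # Swap the first "(...)" group in the pattern text for the captured text.
    # A callable replacement avoids backslash interpretation in `captured`.
    generalized = re.sub(r"\([^()]*\)", lambda _: captured, pattern, count=1)
    return "CHECK-LABEL: @{{" + generalized + "}}("


name = "__omp_offloading_50_b84c41e__Z9ftemplateIiET_i_l30_worker"
pattern = "__omp_offloading_[0-9]+_[a-z0-9]+_(.*)"
print(check_label_for(name, pattern))
```

The hash-dependent prefix stays a regex while the captured suffix is matched literally, so the check line survives changes to the file hash.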
-
Christian Sigg authored
I missed a comment in D98279 that you don't need to copy pass options. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D98366
-
Alexey Lapshin authored
During D88827 it was requested to remove the local implementation of Memory/File Buffers:
    // TODO: refactor the buffer classes in LLVM to enable us to use them here
    // directly.
This patch uses raw_ostream instead of Buffers. Generally, using streams could allow us to reduce memory usage: there is no need to load all data into memory, since the data could be streamed through a smaller buffer. Thus, this patch uses raw_ostream as an interface for output data:
    Error executeObjcopyOnBinary(CopyConfig &Config, object::Binary &In, raw_ostream &Out);
Note 1. This patch does not change the implementation of Writers so that data would be directly stored into raw_ostream. This is assumed to be done later.
Note 2. It would be better if Writers were implemented in such a way that data could be streamed without seeking/updating. If that would be inconvenient, then raw_ostream could be replaced with raw_pwrite_stream to make it possible to seek back and update file headers. This is assumed to be done later if necessary.
Note 3. The current FileOutputBuffer allows using a memory-mapped file. The raw_fd_ostream (which could be used if data should be stored in the file) does not allow us to use a memory-mapped file. Memory map functionality could be implemented for raw_fd_ostream. It is possible to add a resize() method to raw_ostream:
    class raw_ostream { void resize(uint64_t size); }
That method, implemented for raw_fd_ostream, could create a memory-mapped file. The streamed data would then be written into that memory-mapped file. Thus we would be able to use memory-mapped files with raw_fd_ostream. This is assumed to be done later if necessary.
Differential Revision: https://reviews.llvm.org/D91028
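The memory argument for streaming can be illustrated with a small Python analogue. It does not use any LLVM API; `copy_streamed` and its `chunk_size` parameter are hypothetical names, and the in-memory `io.BytesIO` objects stand in for an input binary and an output stream.

```python
import io


def copy_streamed(src, out, chunk_size=4096):
    """Copy `src` to `out` through a fixed-size buffer.

    Only `chunk_size` bytes are held at a time, instead of the whole input.
    Returns the number of bytes written.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes object signals end of input
            break
        out.write(chunk)
        total += len(chunk)
    return total


source = io.BytesIO(b"\x7fELF" + bytes(10000))
sink = io.BytesIO()
written = copy_streamed(source, sink, chunk_size=1024)
print(written)
```

A writer that has to go back and patch headers cannot use this purely forward pattern, which is the case Note 2 covers with a seekable stream.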
-
Weiwei Li authored
Co-authored-by: Alan Liu <alanliu.yf@gmail.com> Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D98270
-
Stanislav Mekhanoshin authored
Differential Revision: https://reviews.llvm.org/D98221
-
Stanislav Mekhanoshin authored
FP atomics in system scope cannot be used and shall always be expanded in a CAS loop. Differential Revision: https://reviews.llvm.org/D98085
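For readers unfamiliar with the expansion, a CAS loop for a floating-point add can be sketched in Python. This is a simulation only: `AtomicWord` is a hypothetical class that fakes a hardware compare-and-swap with a lock, and the 32-bit float is stored as its bit pattern, as an integer CAS would see it.

```python
import struct
import threading


class AtomicWord:
    """A 32-bit cell with compare-and-swap, simulated with a lock."""

    def __init__(self, bits=0):
        self._bits = bits
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._bits

    def compare_exchange(self, expected, desired):
        """Store `desired` only if the cell still holds `expected`."""
        with self._lock:
            if self._bits == expected:
                self._bits = desired
                return True
            return False


def f32_to_bits(x):
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(b):
    return struct.unpack("<f", struct.pack("<I", b))[0]


def atomic_fadd(cell, operand):
    """Add `operand` to the float in `cell`; return the old value."""
    while True:
        old_bits = cell.load()
        new_bits = f32_to_bits(bits_to_f32(old_bits) + operand)
        # Retry if another thread changed the cell since the load.
        if cell.compare_exchange(old_bits, new_bits):
            return bits_to_f32(old_bits)


cell = AtomicWord(f32_to_bits(1.5))
old = atomic_fadd(cell, 2.25)
print(old, bits_to_f32(cell.load()))
```

The add itself is done with ordinary arithmetic; only the final publish step needs to be atomic, which is why an integer compare-and-swap is enough.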
-
Giorgis Georgakoudis authored
Some tests in clang require running non-filechecked commands to generate the actual filecheck input. For example, tests for openmp offloading require generating the host bc without any checking, before running the clang command to actually generate the filechecked IR of the target device. This patch enables `update_cc_test_checks.py` to run non-filechecked run lines in-place. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D97068
-
George Balatsouras authored
Remove hard-coded shadow width references. Separate CHECK lines that only apply to fast16 mode. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D98308
-
Matteo Favaro authored
The isOverwrite function identifies whether two stores are fully overlapping. Ideally we would like to identify all instances of OW_Complete, as they yield possibly killable stores. The current implementation is incapable of spotting instances where the earlier store is offset relative to the later store but still fully overlapped. The limitation seems to lie in the computation of the base pointers with the GetPointerBaseWithConstantOffset API, which often yields different base pointers even if the stores are guaranteed to partially overlap (e.g. the alias analysis is returning AliasResult::PartialAlias). The patch relies on the offsets computed and cached by BatchAAResults (available after D93529) to determine if the offset overlap is OW_Complete. Differential Revision: https://reviews.llvm.org/D97676
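The condition being checked can be written as simple interval containment. The sketch below is not the isOverwrite implementation; `is_complete_overwrite` and its parameters are hypothetical, and it assumes both stores are described by byte offsets from the same underlying object.

```python
def is_complete_overwrite(earlier_off, earlier_size, later_off, later_size):
    """Return True if the later store covers every byte of the earlier one.

    Offsets are in bytes and assumed to be relative to a common base.
    """
    earlier_end = earlier_off + earlier_size
    later_end = later_off + later_size
    return later_off <= earlier_off and earlier_end <= later_end


# Earlier store writes bytes [4, 8); later store writes bytes [0, 8).
print(is_complete_overwrite(4, 4, 0, 8))
# Earlier store writes bytes [4, 12); the later store misses bytes [8, 12).
print(is_complete_overwrite(4, 8, 0, 8))
```

The first case is the one the message describes: the earlier store starts at a nonzero offset inside the later store, yet it is still completely covered.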
-
Greg McGary authored
Pointer and reference induction variables of range-based for loops are often const, and code authors are often lax about qualifying them. Differential Revision: https://reviews.llvm.org/D98317
-
Sriraman Tallam authored
D96109 was recently submitted. It contains the refactored implementation of -funique-internal-linkage-names, which adds the unique suffixes in clang rather than as an LLVM pass. This change deletes the former implementation. Differential Revision: https://reviews.llvm.org/D98234
-
Nikita Popov authored
-
Alex Zinenko authored
This reverts commit 95db7b4a. This breaks vectorize_2d.mlir and vectorize_3d.mlir test under ASAN (use after free).
-
Alex Zinenko authored
This reverts commit 77a9d154. Parent commit is broken.
-
Rafael Auler authored
This patch introduces functionality used by BOLT when re-linking the final binary. It adds new relocation types that are currently unsupported by RuntimeDyldELF. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D97899
-
Dave Lee authored
Call `SetPrivate(true)` for subplans pushed via `PushPlan()`, as described in its docstring. Differential Revision: https://reviews.llvm.org/D96916
-
Quentin Colombet authored
Fix warnings caused by -Wrange-loop-analysis. Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D98298
-
Diego Caballero authored
This patch adds support for vectorizing loops with 'iter_args' when those loops are not a vector dimension. This allows vectorizing outer loops with an inner 'iter_args' loop (e.g., reductions). Vectorizing scenarios where 'iter_args' loops are vector dimensions would require more work (e.g., analysis, generating horizontal reduction, etc.) not included in this patch. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D97892
-
Diego Caballero authored
This patch replaces the root-terminal vectorization approach implemented in the Affine vectorizer with a topological order approach that vectorizes all the operations within the target loop nest. These are the most important changes introduced by the new algorithm:
* Removed tracking of root and terminal ops. Existing vectorization functionality is preserved and extended so that loop nests without root-terminal chains can be vectorized.
* Vectorizing a loop nest now only requires a single topological traversal.
* A new vector loop nest is incrementally built along the vectorization process. The original scalar loop is kept intact. No cloning guard is needed to recover the scalar loop if vectorization fails. This approach also simplifies the challenging task of replacing a loop operation amid the vectorization process without invalidating the analysis information that depends on the original loop.
* Vectorization of specific operations has been implemented as independent, preparing them to be moved to a potential vectorization interface.
Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D97442
-
Amy Kwan authored
This patch adds patterns to select the PC-Relative extloadi1 and zextloadi1 byte loads. Differential Revision: https://reviews.llvm.org/D98042
-
Arthur Eubanks authored
Fixes PR42961. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D97872
-
gbtozers authored
This patch refactors out the salvaging of GEP and BinOp instructions into separate functions, in preparation for further changes to the salvaging of these instructions coming in another patch; there should be no functional change as a result of this refactor. Differential Revision: https://reviews.llvm.org/D92851
-
Craig Topper authored
[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32. On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64. This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization. For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've removed the special case I previously put in for ABS for D97991 as the default expansion is now able to successfully use getConstant. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98004
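The idea of a splat built from two 32-bit parts can be modeled with plain integers. This is only a value-level sketch in Python, not SelectionDAG code; `split_i64`, `splat_vector_parts`, and the fixed `num_elts` are hypothetical, and a real scalable vector has no compile-time element count.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def split_i64(value):
    """Split a 64-bit value into (lo, hi) 32-bit halves."""
    value &= MASK64  # treat negative inputs as two's complement
    return value & MASK32, value >> 32


def splat_vector_parts(lo, hi, num_elts):
    """Model a splat whose every 64-bit element is rebuilt from two i32 parts."""
    element = ((hi & MASK32) << 32) | (lo & MASK32)
    return [element] * num_elts


lo, hi = split_i64(0x1122334455667788)
print(hex(lo), hex(hi))
print([hex(e) for e in splat_vector_parts(lo, hi, 4)])
```

Each half fits in a legal 32-bit scalar, so the 64-bit value never has to exist as a single scalar on the target.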
-
Craig Topper authored
Currently we crash in type legalization any time an intrinsic uses a scalar i64 on RV32. This patch adds support for type legalizing this to prevent crashing. I don't promise that it uses the best possible codegen, just that it is functional. This first version handles 3 cases: the vmv.v.x intrinsic, the vmv.s.x intrinsic, and intrinsics that take a scalar input, splat it and then do some operation. For vmv.v.x we'll either rely on hardware sign extension for constants or we'll convert it to multiple splats and bit manipulation. For vmv.s.x we use a far from optimal sequence inspired by what we do for an INSERT_VECTOR_ELT. For the third case we'll either try to use the .vi form for constants or convert to a complicated splat and bitmanip and use the .vv form of the operation. I've renamed the ExtendOperand field to SplatOperand and now use it specifically for the third case. The first two cases are handled by custom lowering specifically for those intrinsics. I haven't updated all tests yet, but I tried to cover a subset that includes single-width, widening, and narrowing. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D97895
-
Dan Liew authored
When building with `LLVM_BUILD_EXTERNAL_COMPILER_RT=ON` (e.g. Swift does this) we do an "external" build of compiler-rt where we build compiler-rt with the just built clang. Unfortunately building in this mode had a bug where compiler-rt would not get rebuilt if compiler-rt sources changed. This is problematic for incremental builds because it meant that the compiler-rt binaries were stale. The fix is to use the `BUILD_ALWAYS` ExternalProject_Add option which means the build command for compiler-rt is always run. In principle if all of the following are true:
* compiler-rt has already been built.
* there are no compiler-rt source changes.
* the compiler hasn't changed.
* ninja is being used as the generator for the compiler-rt build.
then the overhead for always running the build command for incremental builds is negligible. However, in practice clang gets rebuilt every time the HEAD commit changes (due to the commit hash being embedded in the output of `--version`), which means all of compiler-rt will be rebuilt every time this happens. While this is annoying, it's better to do the slow but correct thing rather than the fast but incorrect thing.
rdar://75150660 Differential Revision: https://reviews.llvm.org/D98291
-
Peter Steinfeld authored
You can define a base type with a type-bound procedure which is erroneously missing a NOPASS attribute and then define another type that extends the base type and overrides the erroneous procedure. In this case, when we perform semantic checking on the overriding procedure, we verify the "pass index" of the overriding procedure. The attempt to get the procedure's pass index fails a call to CHECK(). I fixed this by calling SetError() on the symbol of the overridden procedure in the base type. Then, I check HasError() before executing the code that invokes the failing call to CHECK(). I also added a test that will cause the compiler to fail the call to CHECK() without this change. Differential Revision: https://reviews.llvm.org/D98355
-
Michał Górny authored
-
Michał Górny authored
Split out the common base of Linux hardware breakpoint/watchpoint support for AArch64 into a Utility class, and use it to implement the matching support on FreeBSD. Differential Revision: https://reviews.llvm.org/D96548
-
Daniil Seredkin authored
[InstCombine][SimplifyLibCalls] An extra sqrtf was produced because of transformations in the optimizePow function. See: https://bugs.llvm.org/show_bug.cgi?id=47613
There was an extra sqrt call because shrinking emitted a new powf while optimizePow replaced the previous pow with sqrt. As a result, two instructions end up in the InstCombine worklist even though %powf is not used by anyone (it is alive because of errno):
    %powf = call fast float @powf(float %x, float 5.000000e-01)
    %sqrt = call fast double @sqrt(double %dx)
%powf will be converted to %sqrtf on a later iteration. As a quick fix, I moved shrinking to the end of optimizePow so that pow is replaced with sqrt first, which avoids emitting a new shrunk powf.
Differential Revision: https://reviews.llvm.org/D98235
-
Craig Topper authored
The type legalizer will visit the result before the operands. To avoid creating an illegal target specific node or falling back to scalarization, we need to manually split vector operands. This still doesn't handle the case of non-power of 2 operands which need to be widened. I'm not sure the type legalizer is ready for it. I think we would need to insert an INSERT_SUBVECTOR with the power of 2 type we want, with an undef first operand, and the non-power of 2 original operand as the vector to insert. Then fill the neutral elements into the padded elements. Alternatively we INSERT_SUBVECTOR into a neutral vector. From there we carry on splitting if needed to get to a legal type then do the target specific code. The problem with this is the type legalizer doesn't know how to widen an insert_subvector yet. We would need to add that including the handling for a non-undef first vector. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D98292
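The pad-then-split scheme described above can be modeled on plain lists. This Python sketch is not type legalizer code: `reduce_by_splitting` and `legal_width` are hypothetical, `legal_width` is assumed to be a power of 2, and the operation is assumed to be associative and commutative so that reordering is safe.

```python
import operator


def next_power_of_2(n):
    """Smallest power of 2 that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def reduce_by_splitting(vec, op, neutral, legal_width):
    """Reduce `vec` by padding with neutral elements, then halving.

    `legal_width` models the widest vector the target can reduce directly.
    """
    # Widen to a power-of-2 length; neutral elements do not change the result.
    padded = list(vec) + [neutral] * (next_power_of_2(len(vec)) - len(vec))
    # Split in half and combine elementwise until the width is legal.
    while len(padded) > legal_width:
        half = len(padded) // 2
        padded = [op(a, b) for a, b in zip(padded[:half], padded[half:])]
    # Final reduction of the legal-width vector.
    result = neutral
    for element in padded:
        result = op(result, element)
    return result


print(reduce_by_splitting([1, 2, 3, 4, 5, 6], operator.add, 0, 4))
```

For the six-element input, the vector is padded to eight elements with zeros, halved once to width four, and then reduced.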
-
Stephen Tozer authored
This patch adds handling for DBG_VALUE_LIST in the MIR-passes (after finalize-isel), excluding the debug liveness passes and DWARF emission. This most significantly affects MachineSink, which now needs to consider all used registers of a debug value when sinking, but for most passes this change is simply replacing getDebugOperand(0) with an iteration over all debug operands. Differential Revision: https://reviews.llvm.org/D92578
-
Jianzhou Zhao authored
This is a part of https://reviews.llvm.org/D95835. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D98268
-
Andrzej Warzynski authored
In https://reviews.llvm.org/D98283, the RUN line in pre-fir-tree04.f90 was updated to use `%flang_fc1` instead of `%f18` (so that the test is shared between the old and the new driver). Unfortunately, the new driver does not know yet how to find standard intrinsics modules. As a result, the test fails when `FLANG_BUILD_NEW_DRIVER` is set to On. I'm restoring the original RUN line. This is rather straightforward, so sending without a review. This should make Flang builders happy.
-
Dávid Bolvanský authored
Follow up for fhahn's D98284. Also fixes a case from PR47644. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D98346
-
Florian Hahn authored
-
Jay Foad authored
D57708 changed SIInstrInfo::isReallyTriviallyReMaterializable to reject V_MOVs with extra implicit operands, but it accidentally rejected all V_MOVs because of their implicit use of exec. Fix it but avoid adding a moderately expensive call to MI.getDesc().getNumImplicitUses(). In real graphics shaders this changes quite a few vgpr copies into move-immediates, which is good for avoiding stalls on GFX10. Differential Revision: https://reviews.llvm.org/D98347
-
Stephen Tozer authored
This reverts commit 429c6ecb.
-
Eric Schweitz authored
The PFT has been updated to support Fortran 77. clang-tidy cleanup. Authors: Val Donaldson, Jean Perier, Eric Schweitz, et al. Differential Revision: https://reviews.llvm.org/D98283
-