- Sep 20, 2021
-
Craig Topper authored
For strided accesses the loop vectorizer seems to prefer creating a vector induction variable with a start value of the form <i32 0, i32 1, i32 2, ...>. This value will be incremented each loop iteration by a splat constant equal to the length of the vector. Within the loop, arithmetic using splat values will be done on this vector induction variable to produce indices for a vector GEP. This pass attempts to dig through the arithmetic back to the phi to create a new scalar induction variable and a stride. We push all of the arithmetic out of the loop by folding it into the start, step, and stride values. Then we create a scalar GEP to use as the base pointer for a strided load or store using the computed stride. Loop strength reduce will run after this pass and can do some cleanups to the scalar GEP and induction variable. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D107790
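The folding described above can be illustrated with a small numeric sketch. It is not the pass itself: the function names, the single multiply-then-add arithmetic chain, and the parameter names `vl`, `mul` and `add` are assumptions made for illustration. It only checks that indices derived from a vector induction variable equal a scalar base plus a constant per-lane stride.

```python
def vector_indices(iteration, vl, mul, add):
    """Per-lane indices as computed in the loop from the vector induction variable."""
    # Vector IV: <0, 1, 2, ...> advanced by splat(vl) every iteration.
    iv = [lane + iteration * vl for lane in range(vl)]
    # Splat arithmetic applied to the vector IV (hypothetical: multiply, then add).
    return [i * mul + add for i in iv]


def scalar_form(iteration, vl, mul, add):
    """The same indices as a scalar base plus a constant stride between lanes."""
    start = 0 * mul + add   # arithmetic folded into the start value
    step = vl * mul         # arithmetic folded into the per-iteration step
    stride = mul            # distance between neighbouring lanes
    base = start + iteration * step
    return base, stride


# Both forms agree on every lane for the first few iterations.
for it in range(4):
    base, stride = scalar_form(it, vl=4, mul=3, add=5)
    assert vector_indices(it, 4, 3, 5) == [base + lane * stride for lane in range(4)]
```

With the arithmetic folded this way, only the scalar base has to be updated inside the loop, and the stride stays constant.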
-
Fangrui Song authored
We have the rule to simulate (https://sourceware.org/binutils/docs/ld/Entry-Point.html), but the behavior is questionable (https://sourceware.org/pipermail/binutils/2021-September/117929.html). gold doesn't fall back to .text. Projects are unlikely to rely on this behavior (there is even a warning for executable links), so let's just delete this fallback path. Reviewed By: jhenderson, peter.smith Differential Revision: https://reviews.llvm.org/D110014
-
Nikita Popov authored
Verify that !noalias, !alias.scope and llvm.experimental.noalias.scope arguments have the format specified in https://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata. I've fixed up a lot of broken metadata used by tests in advance. Especially using a scope instead of the expected scope list is a commonly made mistake. Differential Revision: https://reviews.llvm.org/D110026
-
Jonas Devlieghere authored
This moves the logic for adding symbols based on UUID, file and frame into little helper functions. This is in preparation for D110011. Differential revision: https://reviews.llvm.org/D110010
-
Jonas Devlieghere authored
-
Florian Hahn authored
Adds additional tests following comments from D109844. Also removes unused in.ptr arguments and places in the call tests that used loads instead of a getval call.
-
Tobias Gysi authored
This revision depends on https://reviews.llvm.org/D109761 and https://reviews.llvm.org/D109766. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D109774
-
Morten Borup Petersen authored
This pass transforms SCF.ForOp operations to SCF.WhileOp. The for-loop condition is placed in the 'before' region of the while operation, and the induction variable increment plus the loop body go in the 'after' region. The loop-carried values of the while op are the induction variable (IV) of the for-loop plus any iter_args specified for the for-loop. Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable. This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the increment of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable). Differential Revision: https://reviews.llvm.org/D108454
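The shape of the rewrite can be modelled outside MLIR. This is a minimal sketch, assuming a positive step and a single iter_arg; `run_for`, `run_as_while` and `body` are hypothetical names, not part of the pass. The induction variable becomes one more loop-carried value, the condition is evaluated on its own first, and the body and the increment are yielded together.

```python
def run_for(lb, ub, step, init, body):
    """Reference semantics: a for-loop carrying one iter_arg."""
    acc = init
    for iv in range(lb, ub, step):
        acc = body(iv, acc)
    return acc


def run_as_while(lb, ub, step, init, body):
    """The same loop in while form: the IV is now a loop-carried value."""
    carried = (lb, init)            # (induction variable, iter_arg)
    while True:
        iv, acc = carried
        if not iv < ub:             # 'before' region: only the loop condition
            break
        acc = body(iv, acc)         # 'after' region: the original loop body ...
        carried = (iv + step, acc)  # ... plus the IV increment, yielded together
    return carried[1]


def sum_of_squares(iv, acc):
    return acc + iv * iv


print(run_for(0, 10, 3, 0, sum_of_squares), run_as_while(0, 10, 3, 0, sum_of_squares))
```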
-
Tobias Gysi authored
-
Alexey Bataev authored
Reworked the reordering algorithm. Originally, the compiler just tried to detect the most common order in the reorderable nodes (loads, stores, extractelements, extractvalues) and then fully rebuilt the graph in the best order. This was not efficient: it required extra memory and time for building and rebuilding the tree and doubled the use of the scheduling budget, which could lead to missed vectorization due to exhausted scheduling resources. The patch provides a two-way approach to the graph reordering problem. First, all reordering is done in place; it does not require deleting and rebuilding the tree, it just rotates the scalars/orders/reuse masks in the graph nodes. The first step (top-to-bottom) rotates the whole graph, similarly to the previous implementation. The compiler counts the most used orders of the graph nodes with the same vectorization factor and then rotates the subgraph with the given vectorization factor to the most used order, if it is not empty. It then repeats the same procedure for the subgraphs with smaller vectorization factors. We can do this because we still need to reshuffle the smaller subgraph when building operands for the graph nodes with a larger vectorization factor, so we can rotate just the subgraph, not the whole graph. The second step (bottom-to-top) scans through the leaves and tries to detect the users of the leaves that can be reordered. If the leaves can be reordered in the best fashion, they are reordered, and their users too. In many cases this removes double shuffles to the same ordering of the operands and just reorders the user operations instead. It also moves the final shuffles closer to the top of the graph and in many cases removes an extra shuffle, because the same procedure is repeated and we can again merge some reordering masks and reorder user nodes instead of the operands. The patch also improves the cost model for gathering of loads, which improves the x264 benchmark in some cases.
Gives about +2% on AVX512 + LTO (more expected for AVX/AVX2) for {625,525}x264, +3% for 508.namd, and improves most other benchmarks. Compile and link times are almost the same, though in some cases they should be better (we no longer do an extra instruction scheduling), and we may vectorize more code in large basic blocks again because of the saved scheduling budget. Differential Revision: https://reviews.llvm.org/D105020
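The "most used order" selection in the top-to-bottom step can be sketched roughly as follows. This is a toy model, not the SLP vectorizer's data structures: nodes are assumed to be `(vectorization_factor, order)` pairs, an empty tuple stands for "already in identity order", and the direction in which `apply_order` interprets a permutation is an arbitrary choice made for the example.

```python
from collections import Counter


def most_used_order(nodes, vf):
    """Most frequent preferred order among nodes with vectorization factor vf.

    Returns an empty tuple when no node has that factor.
    """
    counts = Counter(order for node_vf, order in nodes if node_vf == vf)
    if not counts:
        return ()
    return counts.most_common(1)[0][0]


def apply_order(scalars, order):
    """Permute scalars in place; position i receives the element formerly at order[i]."""
    if order:  # an empty order means nothing to rotate
        scalars[:] = [scalars[i] for i in order]


nodes = [(4, (1, 0, 3, 2)), (4, (1, 0, 3, 2)), (4, ()), (2, (1, 0))]
lanes = ["a", "b", "c", "d"]
apply_order(lanes, most_used_order(nodes, 4))
print(lanes)
```

The point of the in-place permutation is that no node has to be deleted and rebuilt, which is what saves the scheduling budget mentioned above.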
-
peter klausler authored
Some intrinsic functions weren't findable because the table wasn't strictly in order of names. Also completes a missing generalization of the extension DCONJG to accept any kind of complex argument, as DREAL and DIMAG already do. Differential Revision: https://reviews.llvm.org/D110002
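Why an out-of-order table makes entries unfindable can be shown with a plain binary search. The table contents and their ordering below are made up for illustration, and this sketch assumes the lookup is a binary search over names, which the message implies but does not state.

```python
from bisect import bisect_left


def find(table, name):
    """Binary search by name; only correct when the table is sorted."""
    i = bisect_left(table, name)
    return i < len(table) and table[i] == name


# Hypothetical table with "dconjg" placed after "dimag", breaking the sort order.
out_of_order = ["abs", "dimag", "dconjg", "dreal", "exp"]

print(find(out_of_order, "dconjg"))          # the entry exists but is missed
print(find(sorted(out_of_order), "dconjg"))  # found once the table is sorted
```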
-
Wang, Pengfei authored
D109607 results in a regression in llvm-test-suite. The reason is that we didn't check the size of SourceTy, so we returned the wrong SSE type when SourceTy is overlapped. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D110037
-
Wang, Pengfei authored
-
Justas Janickas authored
Atomics in C++ for OpenCL 2021 are now handled the same way as in OpenCL C 3.0. This is a header-only change. Differential Revision: https://reviews.llvm.org/D109424
-
Kadir Cetinkaya authored
Fixes https://github.com/clangd/clangd/issues/865 Differential Revision: https://reviews.llvm.org/D109894
-
Tobias Gysi authored
Add a new version of fusion on tensors with the following properties: it supports input and output operand fusion; it fuses a producer result passed in via tile loop iteration arguments (updating the tile loop iteration arguments); it supports only linalg operations on tensors; it supports only scf::for; it cannot add an output to the tile loop nest. The LinalgTileAndFuseOnTensors pass tiles the root operation and fuses its producers. Reviewed By: nicolasvasilache, mravishankar Differential Revision: https://reviews.llvm.org/D109766
-
Deep Majumder authored
The docs of alpha.cplusplus.SmartPtr were incorrectly placed under alpha.deadcode. Moved them under alpha.cplusplus. Differential Revision: https://reviews.llvm.org/D110032
-
David Sherwood authored
In ValueTracking.cpp we use a function called computeKnownBitsFromOperator to determine the known bits of a value. For the vscale intrinsic, if the function contains the vscale_range attribute, we can use the maximum and minimum values of vscale to determine some known zero and one bits. This should help to improve code quality by allowing certain optimisations to take place. Tests added here: Transforms/InstCombine/icmp-vscale.ll Differential Revision: https://reviews.llvm.org/D109883
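How a value range yields known bits can be sketched in a few lines. This is an illustration of the idea rather than the code in ValueTracking.cpp; the function name, the `(known_zero, known_one)` return shape, and the assumption of a bounded, non-zero maximum are all made up for the example.

```python
def vscale_known_bits(vmin, vmax, bit_width=64):
    """Return (known_zero, known_one) bit masks for a value in [vmin, vmax].

    Assumes 0 < vmin <= vmax, i.e. the range has a real upper bound.
    """
    mask = (1 << bit_width) - 1
    if vmin == vmax:
        # The value is exact, so every bit is known.
        return (~vmin) & mask, vmin & mask
    # No value in the range needs more bits than vmax does,
    # so every bit at or above vmax.bit_length() is zero.
    first_zero_high_bit = vmax.bit_length()
    known_zero = mask & ~((1 << first_zero_high_bit) - 1)
    return known_zero, 0


zero, one = vscale_known_bits(1, 16, bit_width=8)
print(f"{zero:08b} {one:08b}")
```

Known high zero bits are what let a comparison such as "vscale is less than some large constant" fold away.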
-
Jay Foad authored
-
Stefan Gränitz authored
Following D109516, this patch re-uses the new helper function for ELF relocation traversal in the RISCV backend. Reviewed By: StephenFan Differential Revision: https://reviews.llvm.org/D109522
-
Stefan Gränitz authored
Following D109516, this patch re-uses the new helper function for ELF relocation traversal in the x86-64 backend. Reviewed By: StephenFan Differential Revision: https://reviews.llvm.org/D109520
-
Aaron Puchert authored
Previous changes like D101202 and D104261 have eliminated the special status that break and continue once had, since now we're making decisions purely based on the structure of the CFG without regard for the underlying source code constructs. This means we don't gain anything from deferring handling for these blocks. Dropping it moves some diagnostics, though arguably into a better place. We're working around a "quirk" in the CFG that perhaps wasn't visible before: while loops have an empty "transition block" where continue statements and the regular loop exit meet, before continuing to the loop entry. To get a source location for that, we slightly extend our handling for empty blocks. The source location for the transition ends up being the loop entry then, but formally this isn't a back edge. We pretend it is anyway. (This is safe: we can always treat edges as back edges, it just means we allow less and don't modify the lock set. The other way around it wouldn't be safe.) Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D106715
-
Michał Górny authored
Set value_regs and invalidate_regs in RegisterInfo pushed onto m_regs to nullptr, to ensure that the temporaries passed there are not accidentally used. Differential Revision: https://reviews.llvm.org/D109879
-
Michał Górny authored
Differential Revision: https://reviews.llvm.org/D109906
-
Brain Swift authored
The Clang build fails when the build directory contains a space character. Error messages: [ 95%] Linking CXX executable ../../../../bin/clang clang: error: no such file or directory: 'Space/Net/llvm/Build/tools/clang/tools/driver/Info.plist' make[2]: *** [bin/clang-14] Error 1 make[1]: *** [tools/clang/tools/driver/CMakeFiles/clang.dir/all] Error 2 make[1]: *** Waiting for unfinished jobs.... The path name is actually: 'Dev Space/Net/llvm/Build/tools/clang/tools/driver/Info.plist' Bugzilla issue - https://bugs.llvm.org/show_bug.cgi?id=51884 Reporter and patch author - Brain Swift <bsp2bsp-llvm@yahoo.com> Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D109979
-
David Green authored
The vectorizer can sometimes make reverse shuffles from indices that count down. In MVE, we don't have a 128bit rev instruction, but we can select this to a VREV64 with some lane movs to swap the two halves. Ideally this would use VMOVD's, but only gets as far as VMOVS's at the moment. Differential Revision: https://reviews.llvm.org/D69510
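The decomposition can be checked on plain lists. A minimal sketch, assuming VREV64 reverses the elements within each 64-bit chunk of the register; the function names are hypothetical and the lists stand in for the lanes of a 128-bit vector.

```python
def rev64(lanes, lane_bits):
    """Reverse the lanes inside each 64-bit chunk (assumed effect of VREV64)."""
    per_chunk = 64 // lane_bits
    out = []
    for i in range(0, len(lanes), per_chunk):
        out.extend(reversed(lanes[i:i + per_chunk]))
    return out


def swap_halves(lanes):
    """Exchange the low and high 64-bit halves (the lane moves)."""
    half = len(lanes) // 2
    return lanes[half:] + lanes[:half]


def reverse128(lanes, lane_bits):
    """Full 128-bit reverse built from the two steps above."""
    return swap_halves(rev64(lanes, lane_bits))


print(reverse128([0, 1, 2, 3], 32))
print(reverse128(list(range(8)), 16))
```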
-
Alex Richardson authored
Previously the script emitted output using plain CHECK directives. This can result in a test passing even if there are some instructions between CHECK directives that should have been removed. It also makes debugging tests that have the output in a different order more difficult since FileCheck can match with a later line and then complain about the "wrong" directive not being found. This will cause quite large diffs when updating existing tests, but I'm not sure we need an opt-in flag here. Depends on D109765 (pre-commit tests) Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D109767
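The difference between the two directive styles can be shown with a toy matcher. It is a heavily simplified model of FileCheck (substring matching, one match per line, only two directive kinds), and the instruction strings are invented for the example.

```python
def filecheck(output_lines, directives):
    """Tiny model of CHECK and CHECK-NEXT using substring matching.

    directives is a list of (kind, pattern) pairs.
    """
    pos = -1  # index of the last matched output line
    for kind, pattern in directives:
        if kind == "CHECK-NEXT":
            nxt = pos + 1  # must match exactly the following line
            if nxt >= len(output_lines) or pattern not in output_lines[nxt]:
                return False
            pos = nxt
        else:  # plain CHECK may skip any number of lines
            for i in range(pos + 1, len(output_lines)):
                if pattern in output_lines[i]:
                    pos = i
                    break
            else:
                return False
    return True


# A stray "mv" that should have been removed sits between the expected lines.
output = ["add a0, a0, a1", "mv a2, a0", "ret"]
print(filecheck(output, [("CHECK", "add"), ("CHECK", "ret")]))       # passes anyway
print(filecheck(output, [("CHECK", "add"), ("CHECK-NEXT", "ret")]))  # catches it
```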
-
Alex Richardson authored
Differential Revision: https://reviews.llvm.org/D109765
-
Alex Richardson authored
Since https://reviews.llvm.org/D87118, the StaticAnalyzer directory is added unconditionally. In theory this should not cause the static analyzer sources to be built unless they are referenced by another target. However, the clang-cpp target (defined in clang/tools/clang-shlib) uses the CLANG_STATIC_LIBS global property to determine which libraries need to be included. To solve this issue, this patch avoids adding libraries to that property if EXCLUDE_FROM_ALL is set. In case something like this comes up again: `cmake --graphviz=targets.dot` is quite useful to see why a target is included as part of `ninja all`. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D109611
-
Simon Pilgrim authored
Reported by MSVC static analyzer.
-
Simon Pilgrim authored
(style) Break the if-else chain as they all return.
-
Simon Pilgrim authored
Avoid unnecessary copies, reported by MSVC static analyzer.
-
Valentin Clement authored
Make use of runtime extension for the second reference counter used in structured data region. This extension is implemented in D106510 and D106509. Differential Revision: https://reviews.llvm.org/D106517
-
Michał Górny authored
Always send PID in the detach packet when multiprocess extensions are enabled. This is required by qemu's GDB server, as plain 'D' packet results in an error and the emulated system is not resumed. Differential Revision: https://reviews.llvm.org/D110033
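The two packet forms can be sketched as follows, assuming the usual GDB remote protocol framing (`$payload#checksum`, with the checksum being the byte sum modulo 256 in two hex digits) and a hexadecimal PID in the multiprocess form. The helper names are hypothetical and this is not LLDB's implementation.

```python
def gdb_packet(payload):
    """Frame a payload as $payload#xx, where xx is the byte sum modulo 256 in hex."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def detach_packet(pid=None):
    """Plain 'D' when no PID is given, 'D;pid' (hex PID) for the multiprocess form."""
    payload = "D" if pid is None else f"D;{pid:x}"
    return gdb_packet(payload)


print(detach_packet())        # plain form
print(detach_packet(0x1234))  # multiprocess form with an explicit PID
```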
-
Petar Avramovic authored
Add eraseInstr(s) utility functions. Before deleting an instruction, they collect its use instructions; after the deletion, they delete the use instructions that became trivially dead. This patch clears all dead instructions in existing legalizer mir tests. Differential Revision: https://reviews.llvm.org/D109154
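The collect-then-clean-up pattern can be sketched on a toy instruction graph. The `Instr` class and `erase_instr` below are stand-ins invented for this example, and "use instructions" is read here as the instructions that define the erased instruction's operands; none of this is the GlobalISel API.

```python
class Instr:
    """Toy instruction: a name, operand definitions, and a side-effect flag."""

    def __init__(self, name, operands=(), side_effects=False):
        self.name = name
        self.operands = list(operands)  # instructions whose results this one reads
        self.side_effects = side_effects
        self.users = set()
        for op in self.operands:
            op.users.add(self)


def erase_instr(instr, block):
    """Erase instr, then erase operand definitions that became trivially dead."""
    worklist = [instr]
    while worklist:
        cur = worklist.pop()
        if cur not in block:
            continue
        candidates = list(cur.operands)  # collect before deleting
        for op in candidates:
            op.users.discard(cur)
        block.remove(cur)
        # A definition with no remaining users and no side effects is trivially dead.
        worklist.extend(op for op in candidates
                        if not op.users and not op.side_effects)


a = Instr("a")
b = Instr("b", [a])
c = Instr("c", [b])
d = Instr("d", [a], side_effects=True)
block = [a, b, c, d]
erase_instr(c, block)
print([i.name for i in block])
```

Here erasing `c` also removes `b`, while `a` survives because `d` still uses it.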
-
Bjorn Pettersson authored
In default pipelines the ModuleInlinerWrapperPass adds the InlinerPass to the pipeline twice, once due to MandatoryFirst (passing true in the ctor) and then a second time with false as argument. To make it possible to bisect and reduce opt test cases for this part of the pipeline we need to be able to choose between the two different variants of the InlinerPass when running opt. This patch changes 'inline' to a CGSCC_PASS_WITH_PARAMS in the PassRegistry, making it possible to run opt with both -passes=cgscc(inline) and -passes=cgscc(inline<only-mandatory>). Reviewed By: aeubanks, mtrofin Differential Revision: https://reviews.llvm.org/D109877
-
Andy Wingo authored
Remove code that has no effect in SemaType.cpp:processTypeAttrs. Differential Revision: https://reviews.llvm.org/D108360
-
Alexey Bader authored
-
Justas Janickas authored
Adds support for a feature macro __opencl_c_3d_image_writes in C++ for OpenCL 2021 enabling a respective optional core feature from OpenCL 3.0. This change aims to achieve compatibility between C++ for OpenCL 2021 and OpenCL 3.0. Differential Revision: https://reviews.llvm.org/D109328
-
Tim Northover authored
v8.4 says that normal loads/stores of 128 bits are single-copy atomic if they're properly aligned (which all LLVM atomics are), so we no longer need to do a full RMW operation to guarantee we got a clean read.
-