- Feb 03, 2020
-
-
Marius Brehler authored
The link in the toy example pointed to the 'tensorflow/mlir' repo and is replaced with https://mlir.llvm.org. Differential Revision: https://reviews.llvm.org/D73770
-
Simon Tatham authored
Summary: In big-endian MVE, the simple vector load/store instructions (i.e. both contiguous and non-widening) don't all store the bytes of a register to memory in the same order: it matters whether you did a VSTRB.8, VSTRH.16 or VSTRW.32. Put another way, the in-register formats of different vector types relate to each other in a different way from the in-memory formats.

So, if you want to 'bitcast' or 'reinterpret' one vector type as another, you have to carefully specify which you mean: did you want to reinterpret the //register// format of one type as that of the other, or the //memory// format?

The ACLE `vreinterpretq` intrinsics are specified to reinterpret the register format. But I had implemented them as LLVM IR bitcast, which is specified for all types as a reinterpretation of the memory format. So a `vreinterpretq` intrinsic, applied to values already in registers, would code-generate incorrectly if compiled big-endian: instead of emitting no code, it would emit a `vrev`.

To fix this, I've introduced a new IR intrinsic to perform a register-format reinterpretation: `@llvm.arm.mve.vreinterpretq`. It's implemented by a trivial isel pattern that expects the input in an MQPR register, and just returns it unchanged.

In the clang codegen, I only emit this new intrinsic where it's actually needed: I prefer a bitcast wherever it will have the right effect, because LLVM understands bitcasts better. So we still generate bitcasts in little-endian mode, and even in big-endian when you're casting between two vector types with the same lane size.

For testing, I've moved all the codegen tests of vreinterpretq out into their own file, so that they can have a different set of RUN lines to check both big- and little-endian.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73786
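The register-versus-memory distinction described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from the patch: it assumes that 8-bit lane 4*i+j of a register occupies bits [8j+7:8j] of 32-bit lane i, and that a big-endian 32-bit store writes each lane most-significant byte first. The names `Words`, `Bytes`, `reinterpretRegisterFormat` and `reinterpretMemoryFormatBigEndian` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model of a 128-bit vector viewed as four 32-bit lanes or sixteen 8-bit lanes.
using Words = std::array<std::uint32_t, 4>;
using Bytes = std::array<std::uint8_t, 16>;

// Register-format reinterpretation: the 128 bits are left untouched, so
// (by assumption) 8-bit lane 4*i+j is bits [8j+7:8j] of 32-bit lane i.
Bytes reinterpretRegisterFormat(const Words &w) {
  Bytes out{};
  for (int i = 0; i < 4; ++i)
    for (int j = 0; j < 4; ++j)
      out[4 * i + j] = static_cast<std::uint8_t>(w[i] >> (8 * j));
  return out;
}

// Memory-format reinterpretation on a big-endian target: store each 32-bit
// lane most-significant byte first, then reload the 16 bytes as 8-bit lanes
// (byte lane k is simply the byte at address k).
Bytes reinterpretMemoryFormatBigEndian(const Words &w) {
  Bytes mem{};
  for (int i = 0; i < 4; ++i)
    for (int j = 0; j < 4; ++j)
      mem[4 * i + j] = static_cast<std::uint8_t>(w[i] >> (8 * (3 - j)));
  return mem;
}
```

In this model the two results differ by a byte reversal within each 32-bit lane, which matches the kind of extra `vrev` the message says a bitcast would produce in big-endian mode.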
-
Simon Tatham authored
Summary: These instructions generate a vector of consecutive elements starting from a given base value and incrementing by 1, 2, 4 or 8. The `wdup` versions also wrap the values back to zero when they reach a given limit value. The instruction updates the scalar base register so that another use of the same instruction will continue the sequence from where the previous one left off.

At the IR level, I've represented these instructions as a family of target-specific intrinsics with two return values (the constructed vector and the updated base). The user-facing ACLE API provides a set of intrinsics that throw away the written-back base and another set that receive it as a pointer so they can update it, plus the usual predicated versions.

Because the intrinsics return two values (as do the underlying instructions), the isel has to be done in C++.

This is the first family of MVE intrinsics that use the `imm_1248` immediate type in the clang Tablegen framework, so naturally, I found I'd given it the wrong C integer type. Also added some tests of the check that the immediate has a legal value, because this is the first time those particular checks have been exercised.

Finally, I also had to fix a bug in MveEmitter which failed an assertion when I nested two `seq` nodes (the inner one used to extract the two values from the pair returned by the IR intrinsic, and the outer one put on by the predication multiclass).

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73357
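A scalar model may help make the two-result shape concrete. The sketch below is an illustration only, assuming four 32-bit lanes and assuming the wrapping variant resets the running value to zero exactly when it equals the limit. The function names `vidupModel` and `viwdupModel` are hypothetical and are not the ACLE or IR intrinsic names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <utility>

using Vec4 = std::array<std::uint32_t, 4>;

// Incrementing sequence: lanes are base, base+step, base+2*step, ...
// Returns the constructed vector together with the written-back base.
std::pair<Vec4, std::uint32_t> vidupModel(std::uint32_t base, std::uint32_t step) {
  assert(step == 1 || step == 2 || step == 4 || step == 8);
  Vec4 v{};
  for (auto &lane : v) {
    lane = base;
    base += step;
  }
  return {v, base};
}

// Wrapping variant: as above, but the running value goes back to zero
// when it reaches the limit (assumed semantics for this sketch).
std::pair<Vec4, std::uint32_t> viwdupModel(std::uint32_t base, std::uint32_t limit,
                                           std::uint32_t step) {
  assert(step == 1 || step == 2 || step == 4 || step == 8);
  Vec4 v{};
  for (auto &lane : v) {
    lane = base;
    base += step;
    if (base == limit)
      base = 0;
  }
  return {v, base};
}
```

Feeding the returned base into the next call continues the sequence, which is the behaviour the pointer-taking intrinsics expose to the user.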
-
Simon Tatham authored
Summary: The unpredicated case of this is trivial: the clang codegen just makes a vector splat of the input, and LLVM isel is already prepared to handle that. For the predicated version, I've generated a `select` between the same vector splat and the `inactive` input parameter, and added new Tablegen isel rules to match that pattern into a predicated `MVE_VDUP` instruction. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73356
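The splat-plus-`select` shape described above can be modelled lane by lane. This is a sketch with hypothetical names (`splat`, `vdupPredicatedModel`), assuming four 32-bit lanes and a per-lane boolean predicate; it is not the clang codegen itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::int32_t, 4>;
using Pred4 = std::array<bool, 4>;

// Unpredicated case: every lane receives the same scalar.
Vec4 splat(std::int32_t x) {
  Vec4 v{};
  v.fill(x);
  return v;
}

// Predicated case: a lane-wise select between the splat and `inactive`.
Vec4 vdupPredicatedModel(const Vec4 &inactive, std::int32_t x, const Pred4 &pred) {
  const Vec4 s = splat(x);
  Vec4 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = pred[i] ? s[i] : inactive[i];
  return out;
}
```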
-
Haojian Wu authored
Summary: misc-unused-using clang-tidy check needs this matcher to fix a false positive of C++17 deduced class template types. Reviewers: gribozavr2 Reviewed By: gribozavr2 Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73869
-
Simon Pilgrim authored
-
Raphael Isemann authored
Disable the red zone in the unit test allocator to fix the test errors in sanitizer builds. The red zone changed the amount of allocated bytes which made the test fail as it checked the number of allocated bytes of the allocator.
-
Martin Storsjö authored
This fixes building for mingw with BUILD_SHARED_LIBS. In static builds, the psapi dependency gets linked in transitively from Support, but when linking Support dynamically, it's revealed that these components also need linking against psapi. Differential Revision: https://reviews.llvm.org/D73839
-
Clement Courbet authored
Summary: It turns out that CUR_DIRECTION is just an internal placeholder, not an actual valid encoded value. Reviewers: gchatelet Subscribers: tschuett, mstojanovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73343
-
Alexander Belyaev authored
Differential Revision: https://reviews.llvm.org/D73684
-
Dmitri Gribenko authored
-
Hans Wennborg authored
-
Sam McCall authored
This seems to just be an oversight.
-
Sam McCall authored
-
Raphael Isemann authored
This reverts commit b848b510 as the unit tests fail on the sanitizer bots:
/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/unittests/Support/AllocatorTest.cpp:145: Failure
Expected: SlabSize
Which is: 4096
To be equal to: Alloc.getTotalMemory()
Which is: 4097
-
Raphael Isemann authored
Revert "[lldb] Increase the rate at which ConstString's memory allocator scales the memory chunks it allocates" This reverts commit 500c324f because its parent commit b848b510 is failing on the sanitizer bots.
-
Sergej Jaskiewicz authored
This reverts commit 41f4dfd6. It broke standalone libc++ builds, which now try to use libc++abi from the wrong directory instead of the system instance. (cherry picked from commit 3573526c0286c9461f0459be1a4592b2214594e7)
-
Guillaume Chatelet authored
Summary: A Copy with a source that is zeros is the same as a Set of zeros. This fixes the invariant that SrcAlign should always be non-null. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73791
-
Raphael Isemann authored
[lldb] Increase the rate at which ConstString's memory allocator scales the memory chunks it allocates

Summary: We currently do far more malloc calls than necessary in the ConstString BumpPtrAllocator. This is because our ConstString implementation internally uses 256 BumpPtrAllocators, each of which receives only a small share of the total allocated memory and therefore keeps allocating memory in small chunks for far too long. This patch fixes this by increasing the rate at which we increase the memory chunk size, so that our collection of BumpPtrAllocators behaves, in total, similarly to a single BumpPtrAllocator.

Reviewers: llunak
Reviewed By: llunak
Subscribers: abidh, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D71699
-
Clement Courbet authored
Summary: There are no counters for individual ports, but this is already enough to find a lot of issues in the current model (upcoming patch). Reviewers: dblaikie, gchatelet Subscribers: hiraditya, tschuett, RKSimon, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72032
-
Jay Foad authored
Summary: D68092 introduced a new SIRemoveShortExecBranches optimization pass and broke some graphics shaders. The problem is that it was removing branches over KILL pseudo instructions, and the fix is to explicitly check for that in mustRetainExeczBranch. Reviewers: critson, arsenm, nhaehnle, cdevadas, hakzsam Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73771
-
Stephan Herhut authored
Summary: In the original design, gpu.launch required explicit capture of uses and passing them as operands to the gpu.launch operation. This was motivated by infrastructure restrictions rather than design. This change lifts the requirement and removes the concept of kernel arguments from gpu.launch. Instead, the kernel outlining transformation now does the explicit capturing. This is a breaking change for users of gpu.launch. Differential Revision: https://reviews.llvm.org/D73769
-
Sam Parker authored
Duplicating instructions can lead to code size increases but using a threshold of 3 is good for reducing code size. Differential Revision: https://reviews.llvm.org/D72916
-
Kazuaki Ishizaki authored
Summary: This is also an exercise in merging a change into master myself after a reviewer gives an LGTM. Reviewers: nicolasvasilache, mehdi_amini Reviewed By: mehdi_amini Subscribers: Joonsoo, merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73432
-
Raphael Isemann authored
Summary: In D68549 we noticed that the BumpPtrAllocator we use for LLDB's ConstString implementation grows its slabs at a rate that is too slow for our use case. As a result, we spend a lot of time calling `malloc` for all the tiny slabs that our ConstString BumpPtrAllocators create. We also can't just increase the slab size in the ConstString implementation (which is what D68549 originally did), as this significantly increased the amount of (mostly unused) allocated memory in any process using ConstString.

This patch adds a template argument for the BumpPtrAllocatorImpl that allows specifying a faster rate at which the BumpPtrAllocator increases the slab size. This allows LLDB to specify a faster rate at which the slabs grow, which should keep both memory consumption and time spent calling malloc low.

Reviewers: george.karpenkov, chandlerc, NoQ
Subscribers: NoQ, llvm-commits, llunak
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71654
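To see how a growth-rate template argument changes slab sizes, consider the following sketch. It assumes a doubling scheme in which the slab size doubles every `GrowthDelay` slabs and the multiplier is capped at 2^30. The parameter name `GrowthDelay`, the defaults, and the free function `computeSlabSize` are illustrative assumptions here, not a statement of the actual interface in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Size of the slab with index `slabIdx`: the base size doubles once every
// GrowthDelay slabs, and the shift is capped at 30 to avoid unbounded growth.
template <std::size_t SlabSize = 4096, std::size_t GrowthDelay = 128>
constexpr std::size_t computeSlabSize(std::size_t slabIdx) {
  static_assert(GrowthDelay > 0, "GrowthDelay must be at least 1");
  return SlabSize *
         (std::size_t{1} << std::min<std::size_t>(30, slabIdx / GrowthDelay));
}
```

With a delay of 128, the first 128 slabs all stay at the base size. With a delay of 1, each new slab is twice as large as the previous one, so an allocator that only ever receives a small share of the total allocations still reaches large slabs quickly.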
-
Martin Storsjö authored
This avoids a warning about "suggest parentheses around && within ||".
-
Martin Storsjö authored
Win64 isn't LP64, it's LLP64, but there's no __LLP64__ predefined - just check _WIN64 in addition to __LP64__. This fixes compilation after static asserts about the struct layout were added in f2a43605. Differential Revision: https://reviews.llvm.org/D73838
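The preprocessor check described above might look like the following sketch. The constant name `kHas64BitPointers` is hypothetical, and the struct-layout assertions from the referenced commit are not reproduced here.

```cpp
#include <cassert>

// __LP64__ is predefined on LP64 targets (64-bit long and pointers).
// Win64 is LLP64 (32-bit long, 64-bit pointers) and has no __LLP64__
// macro, so _WIN64 is checked as well.
#if defined(__LP64__) || defined(_WIN64)
constexpr bool kHas64BitPointers = true;
static_assert(sizeof(void *) == 8, "expected 64-bit pointers");
#else
constexpr bool kHas64BitPointers = false;
#endif
```

Layout assertions that depend only on pointer size can then be guarded by this condition rather than by `__LP64__` alone.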
-
Martin Storsjö authored
Remove an extra semicolon, and add llvm_unreachable to avoid warnings about control reaching the end of a non-void function.
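The warning about control reaching the end of a non-void function typically arises after a `switch` that covers every enumerator. The sketch below shows the general pattern with a hypothetical enum and a local `[[noreturn]]` stand-in for `llvm_unreachable`, so that it compiles without LLVM headers; it is not the code touched by this commit.

```cpp
#include <cassert>
#include <cstdlib>

enum class Color { Red, Green };

// Local stand-in for llvm_unreachable: never returns.
[[noreturn]] inline void unreachableStandIn(const char * /*msg*/) {
  std::abort();
}

const char *colorName(Color c) {
  switch (c) {
  case Color::Red:
    return "red";
  case Color::Green:
    return "green";
  }
  // Without this call, some compilers warn that control may reach the end
  // of a non-void function, even though every enumerator is handled.
  unreachableStandIn("unknown Color");
}
```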
-
Martin Storsjö authored
-
Martin Probst authored
Summary: In release notes and the regular docs. Reviewers: MyDeveloperDay Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73768
-
Johannes Doerfert authored
There seems to be another instance of non-determinism which causes the number of iterations to be either 1 or 3 for one benchmark, depending on the system. This needs to be investigated and resolved. In the meantime we do not verify the number of iterations for this benchmark.
-
Johannes Doerfert authored
If all call sites are in `norecurse` functions we can derive `norecurse` as the ReversePostOrderFunctionAttrsPass does. This should make ReversePostOrderFunctionAttrsLegacyPass obsolete once the Attributor is enabled. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D72017
-
Johannes Doerfert authored
If we know that all call sites have been processed we can derive an early fixpoint. The use in this patch is likely not to trigger right now but a follow up patch will make use of it. Reviewed By: uenoku, baziotis Differential Revision: https://reviews.llvm.org/D72016
-
Fangrui Song authored
x86_64-windows and darwin default to PIC. They don't use PIE.
-
Igor Kudrin authored
The method was initially added for DWARFVerifier::verifyUnitHeader() but its results were never actually used. Differential Revision: https://reviews.llvm.org/D73773
-
Craig Topper authored
We only need to call this on floating point comparisons. In this case these are known to be integer compares. One of them even has a SUB opcode instead of CMP.
-
Johannes Doerfert authored
With this patch new trivial edges can be added to an SCC in a CGSCC pass via the updateCGAndAnalysisManagerForCGSCCPass method. It shares almost all the code with the existing updateCGAndAnalysisManagerForFunctionPass method but it implements the first step towards the TODOs. This was initially part of D70927. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D72025
-
Juneyoung Lee authored
Summary: This adds the -keep-const-init option to llvm-extract, which preserves the initializers of used global constants. For example:

```
$ cat a.ll
@g = constant i32 0
define i32 @f() {
  %v = load i32, i32* @g
  ret i32 %v
}
$ llvm-extract --func=f a.ll -S -o -
@g = external constant i32
define i32 @f() { .. }
$ llvm-extract --func=f a.ll -keep-const-init -S -o -
@g = constant i32 0
define i32 @f() { .. }
```

This option is useful in checking whether a function that uses a constant global is optimized correctly.

Reviewers: jsji, MaskRay, david2050
Reviewed By: MaskRay
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73833
-
LLVM GN Syncbot authored
-
Johannes Doerfert authored
If we had `noalias` on an argument, the inliner already created alias scope metadata. However, the call site `noalias` annotation was not considered. Since the Attributor can derive such call site `noalias` annotations, we should treat them the same as argument annotations. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D73528
-