- May 03, 2021
-
-
MaheshRavishankar authored
Convert subtensor and subtensor_insert operations to use their rank-reduced versions to drop unit dimensions. Differential Revision: https://reviews.llvm.org/D101495
-
Valentin Clement authored
Add functions to create the offload_maptypes and the offload_mapnames globals. These two functions are used in clang. They will be used in the Flang/MLIR lowering as well. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101503
-
Tomas Matheson authored
atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(*f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(*i16, f32) and then eventually to f32 ATOMIC_LOAD_FADD_16(*i16, f32). Differential Revision: https://reviews.llvm.org/D101164 Originally submitted as 3338290c. Reverted in c7df6b12.
-
thomasraoux authored
The current implementation had a bug as it was relying on the target vector dimension sizes to calculate where to insert broadcast. If several dimensions have the same size we may insert the broadcast on the wrong dimension. The correct broadcast cannot be inferred from the type of the source and destination vector. Instead when we want to extend transfer ops we calculate an "inverse" map to the projected permutation and insert broadcast in place of the projected dimensions. Differential Revision: https://reviews.llvm.org/D101738
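The point about sizes being ambiguous can be illustrated with a small model. This is only a sketch, not the MLIR implementation: `projectedOutDims` is a hypothetical helper, and the map is modelled as a plain list holding the input dimension selected by each result. It reads the dropped dimensions off the map itself, so a shape such as 4x4 cannot confuse it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not an MLIR API): given the number of input dimensions
// of a projected permutation and the input dimension selected by each of its
// results, return the input dimensions that the map drops. In the scheme the
// commit message describes, these are the positions where a broadcast is needed.
std::vector<unsigned> projectedOutDims(unsigned numDims,
                                       const std::vector<unsigned> &results) {
  std::vector<bool> used(numDims, false);
  for (unsigned d : results) {
    assert(d < numDims && "result refers to a dimension outside the map");
    used[d] = true;
  }
  std::vector<unsigned> dropped;
  for (unsigned d = 0; d < numDims; ++d)
    if (!used[d])
      dropped.push_back(d);
  return dropped;
}
```

For a map modelled as (d0, d1, d2) -> (d2, d0), the helper reports d1 as the dropped dimension, whatever the vector sizes are.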
-
Frederik Gossen authored
Differential Revision: https://reviews.llvm.org/D101771
-
Frederik Gossen authored
Add dedicated pass `convert-linalg-tiled-loops-to-scf` to lower `linalg.tiled_loop`s. Differential Revision: https://reviews.llvm.org/D101768
-
Anirudh Prasad authored
[AsmParser][SystemZ][z/OS] Implement HLASM location counter syntax ("*") for Z PC-relative instructions.

- This patch attempts to implement the location counter syntax (*) for the HLASM variant for PC-relative instructions.
- In the HLASM variant, for purely constant relocatable values, we expect a * token preceding it, with special support for " *" which is parsed as "<pc-rel-insn 0>"
- For combinations of absolute values and relocatable values, we don't expect the "*" preceding the token.

When you have a " * " what’s accepted is:

```
*<space>.*{.*} -> <pc-rel-insn> 0
*[+|-][constant-value] -> <pc-rel-insn> [+|-]constant-value
```

When you don’t have a " * " what’s accepted is:

```
brasl 1,func        is allowed (MCSymbolRef type)
brasl 1,func+4      is allowed (MCBinary type)
brasl 1,4+func      is allowed (MCBinary type)
brasl 1,-4+func     is allowed (MCBinary type)
brasl 1,func-4      is allowed (MCBinary type)
brasl 1,*func       is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+func      is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+func+4    is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+4+func    is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*-4+8+func  is not allowed (* cannot be used for non-MCConstantExprs)
```

Reviewed By: Kai Differential Revision: https://reviews.llvm.org/D100987
-
Mitch Phillips authored
The Scudo C unit tests are currently non-hermetic. In particular, adding or removing a transfer batch is a global state of the allocator that persists between tests. This can cause flakiness in ScudoWrappersCTest.MallInfo, because the creation or teardown of a batch causes mallinfo's uordblks or fordblks to move up or down by the size of a transfer batch on malloc/free. It's my opinion that uordblks and fordblks should track the statistics related to the user's malloc() and free() usage, and not the state of the internal allocator structures. Thus, excluding the transfer batches from stat collection does the trick and makes these tests pass.

Repro instructions of the bug:

1. ninja ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoCUnitTest-x86_64-Test
2. ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoCUnitTest-x86_64-Test --gtest_filter=ScudoWrappersCTest.MallInfo

Reviewed By: cryptoad Differential Revision: https://reviews.llvm.org/D101653
-
Louis Dionne authored
This makes the libc++ tests more portable -- almost all of them should now work on Windows, except for some tests that assume a shell is available on the target. We should probably provide a way to exclude those anyway for the purpose of running tests on embedded targets. Differential Revision: https://reviews.llvm.org/D89495
-
Louis Dionne authored
This fixes the issue by implementing _And using the short-circuiting SFINAE trick that we previously used only in std::tuple. One thing we could look into is using the naive recursive implementation for disjunctions with a small number of arguments, and using that trick with larger numbers of arguments. It might be the case that the constant overhead for setting up the SFINAE trick makes it only worth doing for larger packs, but that's left for further work. This problem was raised in https://reviews.llvm.org/D96523. Differential Revision: https://reviews.llvm.org/D101661
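For readers unfamiliar with this kind of trick, here is a minimal sketch of a conjunction built on overload resolution and SFINAE instead of recursive template instantiation. The names `expand_to_true`, `and_helper` and `And` are illustrative and chosen for this example; the sketch assumes C++14 and is not claimed to match the libc++ source line for line.

```cpp
#include <type_traits>

// Alias that ignores its arguments. It is only well-formed when every
// argument it is given is itself well-formed.
template <class...>
using expand_to_true = std::true_type;

// Preferred overload (an int argument is an exact match). It drops out of
// overload resolution through SFINAE as soon as one Pred::value is false,
// because std::enable_if_t<false> names no type.
template <class... Pred>
expand_to_true<std::enable_if_t<Pred::value>...> and_helper(int);

// Fallback overload, selected only when the one above was discarded.
template <class...>
std::false_type and_helper(...);

// And<P...> is std::true_type if every P::value is true, else std::false_type.
template <class... Pred>
using And = decltype(and_helper<Pred...>(0));
```

With this shape, `And<std::is_integral<int>, std::is_pointer<int*>>` is `std::true_type`, and a single false predicate anywhere in the pack selects the fallback overload.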
-
Stella Laurenzo authored
* NFC but has some fixes for CMake glitches discovered along the way (things not cleaning properly, co-mingled depends).
* Includes previously unsubmitted fix in D98681 and a TODO to fix it more appropriately in a smaller followup.

Differential Revision: https://reviews.llvm.org/D101493
-
Louis Dionne authored
This patch gets rid of technical debt around std::pointer_safety which, I claim, is entirely unnecessary. I don't think anybody has used std::pointer_safety in actual code because we do not implement the underlying garbage collection support. In fact, P2186 even proposes removing these facilities entirely from a future C++ version. As such, I think it's entirely fine to get rid of complex workarounds whose goals were to avoid breaking the ABI back in 2017. I'm putting this up both to get reviews and to discuss this proposal for a breaking change. I think we should be comfortable with making these tiny breaks if we are confident they won't hurt anyone, which I'm fairly confident is the case here. Differential Revision: https://reviews.llvm.org/D100410
-
Paul Robinson authored
The comment about how to make use of debugger tuning within DwarfDebug really belongs inside the DwarfDebug declaration, where it will be easier to find.
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D101511
-
Stanislav Mekhanoshin authored
Extend the legalization of global SADDR loads and stores, which changes them to VADDR, to the FLAT scratch instructions. Differential Revision: https://reviews.llvm.org/D101408
-
Zarko Todorovski authored
The previous implementation of the default AltiVec ABI marked registers V20-V31 as reserved. This failed to prevent reserved VFRC registers from being allocated. In this patch, instead of marking the registers reserved, we remove disallowed registers from the allocation order completely. This is a slight rework of an implementation by @nemanjai. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D100050
-
Duncan P. N. Exon Smith authored
Remove an early return from an `else` block that's immediately followed by an equivalent early return after the `else` block. Differential Revision: https://reviews.llvm.org/D101671
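The shape of this cleanup can be sketched with a made-up function. `before` and `after` are hypothetical and not taken from the patch; they only show why the early return inside the `else` block is redundant when the same check follows immediately.

```cpp
// Before: the else block returns early on a condition that is checked again,
// with the same result, right after the if/else.
int before(bool fast, int value) {
  int result = 0;
  if (fast) {
    result = value;
  } else {
    result = value * 2;
    if (result < 0)
      return -1; // redundant early return
  }
  if (result < 0)
    return -1;
  return result;
}

// After: the inner early return is gone and behaviour is unchanged, since
// control reaches the equivalent check with the same value of result.
int after(bool fast, int value) {
  int result = 0;
  if (fast) {
    result = value;
  } else {
    result = value * 2;
  }
  if (result < 0)
    return -1;
  return result;
}
```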
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D101637
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D101643
-
Matt Morehouse authored
-
Fabian Meumertzheim authored
In the overwrite branch of MutationDispatcher::ApplyDictionaryEntry in FuzzerMutate.cpp, the index Idx at which W.size() bytes are overwritten with the word W is chosen uniformly at random in the interval [0, Size - W.size()). This means that Idx + W.size() will always be strictly less than Size, i.e., the last byte of the current unit will never be overwritten. This is fixed by adding 1 to the exclusive upper bound. Addresses https://bugs.llvm.org/show_bug.cgi?id=49989. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D101625
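The off-by-one can be checked with a little arithmetic. The helpers below are hypothetical and only model the bounds described above; they are not the libFuzzer code. They assume the word is strictly shorter than the unit, so that the old bound is non-zero.

```cpp
#include <cstddef>

// Exclusive upper bound on the start index before the fix: [0, Size - WordSize).
constexpr std::size_t oldIndexBound(std::size_t Size, std::size_t WordSize) {
  return Size - WordSize;
}

// Exclusive upper bound after the fix: [0, Size - WordSize + 1).
constexpr std::size_t newIndexBound(std::size_t Size, std::size_t WordSize) {
  return Size - WordSize + 1;
}

// One past the last byte that can be overwritten when the start index is
// drawn from [0, Bound). Assumes Bound > 0.
constexpr std::size_t maxOverwriteEnd(std::size_t Bound, std::size_t WordSize) {
  return (Bound - 1) + WordSize;
}
```

For a unit of 8 bytes and a 3-byte word, the old bound allows start indices 0 to 4, so writes end at offset 7 at most and byte 7 is never touched. The new bound allows index 5, so the write can cover bytes 5 to 7.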
-
Stanislav Mekhanoshin authored
Instead of legalizing saddr operand with a readfirstlane when address is moved from SGPR to VGPR we can just change the opcode. Differential Revision: https://reviews.llvm.org/D101405
-
Giorgis Georgakoudis authored
Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101739
-
Benjamin Kramer authored
The converter assumed that all operands have the same type, which is not true for select. Differential Revision: https://reviews.llvm.org/D101767
-
thomasraoux authored
Move TransposeOp lowering into its own populate function, as in some cases it is better to keep it during ContractOp lowering to better canonicalize it rather than emitting scalar insert/extract. Differential Revision: https://reviews.llvm.org/D101647
-
Arthur Eubanks authored
Reviewed By: asbirlea, ychen Differential Revision: https://reviews.llvm.org/D100912
-
Uday Bondhugula authored
Add missing check in -test-affine-data-copy without which a test case that has no affine.loads at all would crash this test pass. Fix two clang-tidy warnings in the file while at it. (Not adding a test case given the triviality.) Differential Revision: https://reviews.llvm.org/D101719
-
Stella Laurenzo authored
* This makes them consistent with custom types/attributes, whose constructors will do a type checked conversion. Of course, the base classes can represent everything so never error.
* More importantly, this makes it possible to subclass Type and Attribute out of tree in sensible ways.

Differential Revision: https://reviews.llvm.org/D101734
-
Chris Lattner authored
This avoids the non-trivial overhead of creating a TaskGroup in these degenerate cases, but also exposes parallelism. It turns out that the default executor underlying TaskGroup prevents recursive parallelism - so an instance of a task group being alive will make nested ones become serial. This is a big issue in MLIR in some dialects, if they have a single instance of an outer op (e.g. a firrtl.circuit) that has many parallel ops within it (e.g. a firrtl.module). This patch side-steps the problem by avoiding creating the TaskGroup in the unneeded case. See this issue for more details: https://github.com/llvm/circt/issues/993 Note that this isn't a really great solution for the general case of nested parallelism. A redesign of the TaskGroup stuff would be better, but would be a much more invasive change. Differential Revision: https://reviews.llvm.org/D101699
-
Marek Kurdej authored
This fixes another bogus build error on gcc, e.g. https://lab.llvm.org/buildbot/#/builders/118/builds/2504.

/home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux-perf/llvm/clang/lib/Format/UnwrappedLineFormatter.cpp:424:42: error: binding ‘clang::format::FormatToken* const’ to reference of type ‘clang::format::FormatToken*&’ discards qualifiers
auto IsElseLine = [&First = TheLine->First]() -> bool {
^
-
Frederik Gossen authored
Differential Revision: https://reviews.llvm.org/D101747
-
Marek Kurdej authored
-
Marek Kurdej authored
-
David Green authored
This can come up in rare situations, where a csel is created with identical operands. These can be folded simply to the original value, allowing the csel to be removed and further simplification to happen. This patch also removes FCSEL as it is unused, not being produced anywhere or lowered to anything. Differential Revision: https://reviews.llvm.org/D101687
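The fold described here rests on a simple identity: when both operands of a conditional select are the same value, the condition no longer matters. The sketch below uses a hypothetical, heavily simplified model in which operands are plain integers standing in for values; it is not the SelectionDAG code from the patch.

```cpp
// Hypothetical model of a conditional select: yields TrueVal when the
// condition holds and FalseVal otherwise.
struct CSel {
  int TrueVal;
  int FalseVal;
};

constexpr int evaluate(CSel S, bool Cond) {
  return Cond ? S.TrueVal : S.FalseVal;
}

// A select with identical operands can be replaced by that operand.
constexpr bool canFold(CSel S) { return S.TrueVal == S.FalseVal; }

// The replacement value; only meaningful when canFold(S) is true.
constexpr int foldedValue(CSel S) { return S.TrueVal; }
```

Once the select is replaced by its operand, the node computing the condition may become unused, which is the kind of further simplification the message refers to.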
-
Marek Kurdej authored
-
Marek Kurdej authored
-
Marek Kurdej authored
This fixes the bug http://llvm.org/pr50019. Reviewed By: MyDeveloperDay Differential Revision: https://reviews.llvm.org/D100727
-
Fangrui Song authored
Fix PR50111 Differential Revision: https://reviews.llvm.org/D101698
-
William S. Moses authored
Differential Revision: https://reviews.llvm.org/D101705
-
Anirudh Prasad authored
- Previously, https://reviews.llvm.org/D101308 removed prefixes from registers when printing them out. This was especially needed for inline asm statements which used input/output operands.
- However, the backend SystemZAsmParser accepts both prefixed and prefix-less registers as part of its implementation.
- This patch aims to change that by ensuring that prefixed registers are only allowed for the ATT dialect.

Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D101665
-