- May 03, 2021
-
-
MaheshRavishankar authored
Given the source and destination shapes, if they are static, or if the expanded/collapsed dimensions are unit-extent, it is possible to compute the reassociation maps that can be used to reshape one type into another. Add a utility method to return the reassociation maps when possible. This utility function can be used to fuse a sequence of reshape ops, given the type of the source of the producer and the final result type. This pattern supersedes a more constrained folding pattern added to the DropUnitDims pass. Differential Revision: https://reviews.llvm.org/D101343
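A minimal standalone sketch of the idea for fully static shapes, not the MLIR utility itself: the helper name `computeReassociation` and the type aliases are hypothetical, and all extents are assumed positive. It greedily groups consecutive dimensions of the higher-rank shape so that each group's product matches one dimension of the lower-rank shape, and folds trailing unit dimensions into the last group.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Shape = std::vector<int64_t>;
using Reassociation = std::vector<std::vector<int64_t>>;

// Hypothetical helper: returns true and fills `groups` when `expanded` can be
// collapsed into `collapsed` by merging consecutive dimensions.
// Assumes static, positive extents and a non-empty `collapsed` shape.
bool computeReassociation(const Shape &expanded, const Shape &collapsed,
                          Reassociation &groups) {
  groups.clear();
  std::size_t src = 0;
  for (int64_t target : collapsed) {
    std::vector<int64_t> group;
    int64_t product = 1;
    // Take at least one source dimension, then keep taking dimensions until
    // the running product reaches the target extent.
    while (src < expanded.size() && (group.empty() || product < target)) {
      product *= expanded[src];
      group.push_back(static_cast<int64_t>(src));
      ++src;
    }
    if (group.empty() || product != target)
      return false;
    groups.push_back(group);
  }
  // Trailing unit-extent dimensions belong to the last group.
  while (src < expanded.size() && expanded[src] == 1 && !groups.empty()) {
    groups.back().push_back(static_cast<int64_t>(src));
    ++src;
  }
  return src == expanded.size();
}
```

For example, collapsing `{2, 3, 4}` into `{6, 4}` yields the groups `{0, 1}` and `{2}`, while collapsing it into `{4, 6}` fails because no run of leading dimensions multiplies to 4.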
-
Christopher Di Bella authored
Implements parts of: * P0896R4 The One Ranges Proposal. Depends on D100275. Differential Revision: https://reviews.llvm.org/D100278
-
Aart Bik authored
Test passes either way, but this is the full name of the dialect. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D101774
-
Dimitry Andric authored
This reverts commit ab40c027. Some additional test cases are influenced by the workaround, and I need to do a complete test run to identify and check them all.
-
Dimitry Andric authored
This fixes PR49821, and avoids "ld.lld: error: test.o:(.rodata.str1.1): offset is outside the section" errors when linking MIPS objects with negative R_MIPS_LO16 implicit addends. ld.lld handles R_MIPS_HI16/R_MIPS_LO16 separately, not as a whole, so it doesn't know that an R_MIPS_HI16 with implicit addend 1 and an R_MIPS_LO16 with implicit addend -32768 represents 32768, which is in range of a MergeInputSection. We could introduce a new RelExpr member (like R_RISCV_PC_INDIRECT for R_RISCV_PCREL_HI20 / R_RISCV_PCREL_LO12) but the complexity is unnecessary given that GNU as keeps the original symbol for this case as well. Reviewed By: atanasyan, MaskRay Differential Revision: https://reviews.llvm.org/D101773
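The addend arithmetic in the message above can be checked with a small sketch. The function name `combinedAddend` is hypothetical; it only shows how a paired HI16/LO16 implicit addend combines (high half shifted left by 16, plus the sign-extended low half), which is the information a linker loses when it processes the two relocations separately.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: combines the implicit addends of a paired
// R_MIPS_HI16 / R_MIPS_LO16. The low half is a signed 16-bit value.
int64_t combinedAddend(int64_t hi, int16_t lo) {
  return (hi << 16) + static_cast<int64_t>(lo);
}
```

With `hi = 1` and `lo = -32768` the result is 65536 - 32768 = 32768, even though the LO16 addend viewed in isolation is negative and appears to point before the start of the section.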
-
Fangrui Song authored
Code patterns like this are common: `#` at the line beginning (https://google.github.io/styleguide/cppguide.html#Preprocessor_Directives), one space indentation for if/elif/else directives.

```
#if SANITIZER_LINUX
# if defined(__aarch64__)
# endif
#endif
```

However, currently clang-format wants to reformat the code to

```
#if SANITIZER_LINUX
#if defined(__aarch64__)
#endif
#endif
```

This significantly harms readability in my review. Use `IndentPPDirectives: AfterHash` to defeat the diagnostic. clang-format will now suggest:

```
#if SANITIZER_LINUX
#  if defined(__aarch64__)
#  endif
#endif
```

Unfortunately, there is no clang-format option to use an indent of 1 for just preprocessor directives. However, this is still one step forward from the current behavior. Reviewed By: #sanitizers, vitalybuka Differential Revision: https://reviews.llvm.org/D100238
-
Tomas Matheson authored
This reverts commit 75318503.
-
Christopher Di Bella authored
Implements parts of: * P0896R4 The One Ranges Proposal. Depends on D100271. Differential Revision: https://reviews.llvm.org/D100275
-
Teresa Johnson authored
When passingValueIsAlwaysUndefined scans for an instruction between an inst with a null or undef argument and its first use, it was checking for instructions that may have side effects, which is a superset of the instructions it intended to find (as per the comments, control flow changing instructions that would prevent reaching the uses). Switch to using isGuaranteedToTransferExecutionToSuccessor() instead. Without this change, when enabling -fwhole-program-vtables, which causes assumes to be inserted by clang, we can get different simplification decisions. In particular, when building with instrumentation FDO it can affect the optimization decisions before FDO matching, leading to some mismatches. I had to modify d83507-knowledge-retention-bug.ll since this fix enables more aggressive optimization of that code such that it no longer tested the original bug it was meant to test. I removed the undef which still provokes the original failure (confirmed by temporarily reverting the fix) and also changed it to just invoke the passes of interest to narrow the testing. Similarly, I needed to adjust code for UnreachableEliminate.ll to avoid an undef which was causing the function body to get optimized away with this fix. Differential Revision: https://reviews.llvm.org/D101507
-
Paulo Matos authored
WebAssembly instructions should have their arguments ordered from the deepest to the shallowest on the stack.
-
Sanjay Patel authored
There's a TODO comment in the code and discussion in D99912 about generalizing this, but I wasn't sure how to implement that, so just going with a potential minimal fix to avoid crashing. The test is a reduction beyond useful code (there's no user of %user...), but it is based on https://llvm.org/PR50191, so this is asserting on real code. Differential Revision: https://reviews.llvm.org/D101772
-
MaheshRavishankar authored
Convert subtensor and subtensor_insert operations to use their rank-reduced versions to drop unit dimensions. Differential Revision: https://reviews.llvm.org/D101495
-
Valentin Clement authored
Add functions to create the offload_maptypes and the offload_mapnames globals. These two functions are used in clang. They will be used in the Flang/MLIR lowering as well. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101503
-
Tomas Matheson authored
atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops. To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation. Floating point legalisation: f16 ATOMIC_LOAD_FADD(*f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(*i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32) Differential Revision: https://reviews.llvm.org/D101164 Originally submitted as 3338290c. Reverted in c7df6b12.
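As a source-level analogy only (the patch itself works on machine instructions, and `fetchAddViaCas` is a hypothetical name), a cmpxchg loop for an atomic add has the shape below. On a target that implements the compare-exchange with exclusive load/store pairs, anything that touches memory between the load and the store, such as a register spill, can clear the monitor and force the loop to retry indefinitely.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical illustration: emulate an atomic fetch-add with a
// compare-exchange retry loop. Returns the value seen before the add.
int fetchAddViaCas(std::atomic<int> &value, int delta) {
  int expected = value.load();
  // On failure, compare_exchange_weak reloads `expected` with the current
  // value, so the loop retries with fresh data.
  while (!value.compare_exchange_weak(expected, expected + delta)) {
  }
  return expected;
}
```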
-
thomasraoux authored
The current implementation had a bug as it was relying on the target vector dimension sizes to calculate where to insert broadcast. If several dimensions have the same size we may insert the broadcast on the wrong dimension. The correct broadcast cannot be inferred from the type of the source and destination vector. Instead when we want to extend transfer ops we calculate an "inverse" map to the projected permutation and insert broadcast in place of the projected dimensions. Differential Revision: https://reviews.llvm.org/D101738
-
Frederik Gossen authored
Differential Revision: https://reviews.llvm.org/D101771
-
Frederik Gossen authored
Add dedicated pass `convert-linalg-tiled-loops-to-scf` to lower `linalg.tiled_loop`s. Differential Revision: https://reviews.llvm.org/D101768
-
Anirudh Prasad authored
[AsmParser][SystemZ][z/OS] Implement HLASM location counter syntax ("*") for Z PC-relative instructions.

- This patch attempts to implement the location counter syntax (*) for the HLASM variant for PC-relative instructions.
- In the HLASM variant, for purely constant relocatable values, we expect a * token preceding it, with special support for " *" which is parsed as "<pc-rel-insn 0>"
- For combinations of absolute values and relocatable values, we don't expect the "*" preceding the token.

When you have a " * " what’s accepted is:

```
*<space>.*{.*} -> <pc-rel-insn> 0
*[+|-][constant-value] -> <pc-rel-insn> [+|-]constant-value
```

When you don’t have a " * " what’s accepted is:

```
brasl 1,func is allowed (MCSymbolRef type)
brasl 1,func+4 is allowed (MCBinary type)
brasl 1,4+func is allowed (MCBinary type)
brasl 1,-4+func is allowed (MCBinary type)
brasl 1,func-4 is allowed (MCBinary type)
brasl 1,*func is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+func is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+func+4 is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*+4+func is not allowed (* cannot be used for non-MCConstantExprs)
brasl 1,*-4+8+func is not allowed (* cannot be used for non-MCConstantExprs)
```

Reviewed By: Kai Differential Revision: https://reviews.llvm.org/D100987
-
Mitch Phillips authored
The Scudo C unit tests are currently non-hermetic. In particular, adding or removing a transfer batch is a global state of the allocator that persists between tests. This can cause flakiness in ScudoWrappersCTest.MallInfo, because the creation or teardown of a batch causes mallinfo's uordblks or fordblks to move up or down by the size of a transfer batch on malloc/free. It's my opinion that uordblks and fordblks should track the statistics related to the user's malloc() and free() usage, and not the state of the internal allocator structures. Thus, excluding the transfer batches from stat collection does the trick and makes these tests pass.

Repro instructions of the bug:
1. ninja ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoCUnitTest-x86_64-Test
2. ./projects/compiler-rt/lib/scudo/standalone/tests/ScudoCUnitTest-x86_64-Test --gtest_filter=ScudoWrappersCTest.MallInfo

Reviewed By: cryptoad Differential Revision: https://reviews.llvm.org/D101653
-
Louis Dionne authored
This makes the libc++ tests more portable -- almost all of them should now work on Windows, except for some tests that assume a shell is available on the target. We should probably provide a way to exclude those anyway for the purpose of running tests on embedded targets. Differential Revision: https://reviews.llvm.org/D89495
-
Louis Dionne authored
This fixes the issue by implementing _And using the short-circuiting SFINAE trick that we previously used only in std::tuple. One thing we could look into is to use the naive recursive implementation for disjunctions with a small number of arguments, and use that trick with larger numbers of arguments. It might be the case that the constant overhead for setting up the SFINAE trick makes it only worth doing for larger packs, but that's left for further work. This problem was raised in https://reviews.llvm.org/D96523. Differential Revision: https://reviews.llvm.org/D101661
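A sketch of what such a SFINAE-based conjunction can look like, written with hypothetical names (`ExpandToTrue`, `andHelper`, `And`) rather than libc++'s internal ones. One overload is viable only if every predicate's `::value` is true; otherwise substitution fails and overload resolution falls back to the variadic overload. No recursive template instantiation over the pack is needed.

```cpp
#include <type_traits>

// Maps any list of types to std::true_type. It is only well-formed if every
// type in the list is well-formed.
template <class...>
using ExpandToTrue = std::true_type;

// Preferred overload: substitution fails as soon as some Pred::value is
// false, because std::enable_if_t<false> has no type.
template <class... Pred>
ExpandToTrue<std::enable_if_t<Pred::value>...> andHelper(int);

// Fallback overload, chosen only when the one above is not viable.
template <class...>
std::false_type andHelper(...);

template <class... Pred>
using And = decltype(andHelper<Pred...>(0));

static_assert(And<>::value, "empty conjunction is true");
static_assert(And<std::true_type, std::is_integral<int>>::value, "");
static_assert(!And<std::true_type, std::false_type>::value, "");
```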
-
Stella Laurenzo authored
* NFC but has some fixes for CMake glitches discovered along the way (things not cleaning properly, co-mingled depends).
* Includes previously unsubmitted fix in D98681 and a TODO to fix it more appropriately in a smaller followup.

Differential Revision: https://reviews.llvm.org/D101493
-
Louis Dionne authored
This patch gets rid of technical debt around std::pointer_safety which, I claim, is entirely unnecessary. I don't think anybody has used std::pointer_safety in actual code because we do not implement the underlying garbage collection support. In fact, P2186 even proposes removing these facilities entirely from a future C++ version. As such, I think it's entirely fine to get rid of complex workarounds whose goals were to avoid breaking the ABI back in 2017. I'm putting this up both to get reviews and to discuss this proposal for a breaking change. I think we should be comfortable with making these tiny breaks if we are confident they won't hurt anyone, which I'm fairly confident is the case here. Differential Revision: https://reviews.llvm.org/D100410
-
Paul Robinson authored
The comment about how to make use of debugger tuning within DwarfDebug really belongs inside the DwarfDebug declaration, where it will be easier to find.
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D101511
-
Stanislav Mekhanoshin authored
Extend the legalization of global SADDR loads and stores with changing to VADDR to the FLAT scratch instructions. Differential Revision: https://reviews.llvm.org/D101408
-
Zarko Todorovski authored
The previous implementation of the default AltiVec ABI marked registers V20-V31 as reserved. This failed to prevent reserved VFRC registers being allocated. In this patch instead of marking the registers reserved we remove unallowed registers from the allocation order completely. This is a slight rework of an implementation by @nemanjai Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D100050
-
Duncan P. N. Exon Smith authored
Remove an early return from an `else` block that's immediately followed by an equivalent early return after the `else` block. Differential Revision: https://reviews.llvm.org/D101671
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D101637
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D101643
-
Matt Morehouse authored
-
Fabian Meumertzheim authored
In the overwrite branch of MutationDispatcher::ApplyDictionaryEntry in FuzzerMutate.cpp, the index Idx at which W.size() bytes are overwritten with the word W is chosen uniformly at random in the interval [0, Size - W.size()). This means that Idx + W.size() will always be strictly less than Size, i.e., the last byte of the current unit will never be overwritten. This is fixed by adding 1 to the exclusive upper bound. Addresses https://bugs.llvm.org/show_bug.cgi?id=49989. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D101625
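The off-by-one can be seen from the bounds alone. The sketch below is not the libFuzzer code: `maxOverwriteEnd` is a hypothetical helper that assumes a random generator returning values in `[0, bound)` and reports the furthest position (one past the last byte) an overwrite can reach under the old and the fixed bound.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: for a unit of `size` bytes and a dictionary word of
// `wordSize` bytes, returns one past the last byte that can be overwritten
// when the start index is drawn from [0, exclusiveBound).
std::size_t maxOverwriteEnd(std::size_t size, std::size_t wordSize,
                            bool fixedBound) {
  assert(wordSize < size && "sketch assumes the word is shorter than the unit");
  // Old bound: Size - W.size(). Fixed bound: Size - W.size() + 1.
  std::size_t exclusiveBound = size - wordSize + (fixedBound ? 1 : 0);
  std::size_t maxIdx = exclusiveBound - 1; // largest index that can be drawn
  return maxIdx + wordSize;
}
```

For an 8-byte unit and a 3-byte word, the old bound gives a maximum end of 7, so byte 7 is never written; the fixed bound gives 8, which covers the whole unit without overrunning it.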
-
Stanislav Mekhanoshin authored
Instead of legalizing saddr operand with a readfirstlane when address is moved from SGPR to VGPR we can just change the opcode. Differential Revision: https://reviews.llvm.org/D101405
-
Giorgis Georgakoudis authored
Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101739
-
Benjamin Kramer authored
The converter assumed that all operands have the same type, which is not true for select. Differential Revision: https://reviews.llvm.org/D101767
-
thomasraoux authored
Move TransposeOp lowering into its own populate function, as in some cases it is better to keep it during ContractOp lowering to better canonicalize it rather than emitting scalar insert/extract. Differential Revision: https://reviews.llvm.org/D101647
-
Arthur Eubanks authored
Reviewed By: asbirlea, ychen Differential Revision: https://reviews.llvm.org/D100912
-
Uday Bondhugula authored
Add missing check in -test-affine-data-copy without which a test case that has no affine.loads at all would crash this test pass. Fix two clang-tidy warnings in the file while at it. (Not adding a test case given the triviality.) Differential Revision: https://reviews.llvm.org/D101719
-
Stella Laurenzo authored
* This makes them consistent with custom types/attributes, whose constructors will do a type checked conversion. Of course, the base classes can represent everything so never error.
* More importantly, this makes it possible to subclass Type and Attribute out of tree in sensible ways.

Differential Revision: https://reviews.llvm.org/D101734
-
Chris Lattner authored
This avoids the non-trivial overhead of creating a TaskGroup in these degenerate cases, but also exposes parallelism. It turns out that the default executor underlying TaskGroup prevents recursive parallelism - so an instance of a task group being alive will make nested ones become serial. This is a big issue in MLIR in some dialects, if they have a single instance of an outer op (e.g. a firrtl.circuit) that has many parallel ops within it (e.g. a firrtl.module). This patch side-steps the problem by avoiding creating the TaskGroup in the unneeded case. See this issue for more details: https://github.com/llvm/circt/issues/993 Note that this isn't a really great solution for the general case of nested parallelism. A redesign of the TaskGroup stuff would be better, but would be a much more invasive change. Differential Revision: https://reviews.llvm.org/D101699
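A rough sketch of the fast path, under two loud assumptions: that the degenerate cases are ranges with zero or one element, and that a stand-in type can replace the real TaskGroup. `CountingTaskGroup` and `forEachMaybeParallel` are hypothetical names; the stand-in runs tasks inline and only counts how often a group is constructed, so the example shows the control flow, not real parallelism.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a task group: counts constructions and runs each
// spawned task immediately on the calling thread.
struct CountingTaskGroup {
  static int created;
  CountingTaskGroup() { ++created; }
  template <class Fn> void spawn(Fn fn) { fn(); }
};
int CountingTaskGroup::created = 0;

// Applies `fn` to every element, creating a group only when there are at
// least two elements (assumed meaning of "degenerate cases").
template <class Iter, class Fn>
void forEachMaybeParallel(Iter begin, Iter end, Fn fn) {
  if (begin == end)
    return; // nothing to do, no group needed
  Iter second = begin;
  ++second;
  if (second == end) {
    fn(*begin); // single element: run inline, no group needed
    return;
  }
  CountingTaskGroup group;
  for (; begin != end; ++begin)
    group.spawn([&fn, begin] { fn(*begin); });
}
```

Because no group is alive while the single outer element is processed, any nested call made from inside `fn` is free to create its own group.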
-