- Oct 14, 2021
-
-
Brian Cain authored
This commit adds the system reg/regpair definitions and the corresponding register transfer instructions.
-
Andrew Savonichev authored
-
Andrew Savonichev authored
These registers are used as operands for instructions that expect an integer register, so they should be added to Int32Regs or Int64Regs register classes. Otherwise the machine verifier emits an error for the following LIT tests when LLVM_ENABLE_MACHINE_VERIFIER=1 environment variable is set:

    *** Bad machine code: Illegal physical register for instruction ***
    - function: kernel_func
    - basic block: %bb.0 entry (0x55c8903d5438)
    - instruction: %3:int64regs = LEA_ADDRi64 $vrframelocal, 0
    - operand 1: $vrframelocal
    $vrframelocal is not a Int64Regs register.

    CodeGen/NVPTX/call-with-alloca-buffer.ll
    CodeGen/NVPTX/disable-opt.ll
    CodeGen/NVPTX/lower-alloca.ll
    CodeGen/NVPTX/lower-args.ll
    CodeGen/NVPTX/param-align.ll
    CodeGen/NVPTX/reg-types.ll
    DebugInfo/NVPTX/dbg-declare-alloca.ll
    DebugInfo/NVPTX/dbg-value-const-byref.ll

Differential Revision: https://reviews.llvm.org/D110164
-
Florian Hahn authored
Running -vector-combine early can introduce new vector operations, blocking loop/SLP vectorization. The added test case could be better optimized by the SLPVectorizer if no new vector operations are added early.
-
Jonas Paulsson authored
-
Andrew Savonichev authored
The patch attempts to optimize a sequence of SIMD loads from the same base pointer:

    %0 = gep float*, float* base, i32 4
    %1 = bitcast float* %0 to <4 x float>*
    %2 = load <4 x float>, <4 x float>* %1
    ...
    %n1 = gep float*, float* base, i32 N
    %n2 = bitcast float* %n1 to <4 x float>*
    %n3 = load <4 x float>, <4 x float>* %n2

For AArch64 the compiler generates a sequence of LDR Qt, [Xn, #16]. However, 32-bit NEON VLD1/VST1 lack the [Wn, #imm] addressing mode, so the address is computed before every ld/st instruction:

    add r2, r0, #32
    add r0, r0, #16
    vld1.32 {d18, d19}, [r2]
    vld1.32 {d22, d23}, [r0]

This can be improved by computing address for the first load, and then using a post-indexed form of VLD1/VST1 to load the rest:

    add r0, r0, #16
    vld1.32 {d18, d19}, [r0]!
    vld1.32 {d22, d23}, [r0]

In order to do that, the patch adds more patterns to DAGCombine:

  - (load (add ptr inc1)) and (add ptr inc2) are now folded if inc1 and inc2 are constants.
  - (or ptr inc) is now recognized as a pointer increment if ptr is sufficiently aligned.

In addition to that, we now search for all possible base updates and then pick the best one.

Differential Revision: https://reviews.llvm.org/D108988
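The "(or ptr inc)" pattern relies on a bit-level fact: when the pointer's low bits are known to be zero and the increment fits entirely in those bits, OR and ADD give the same address. The sketch below illustrates that fact on plain integers. The helper names `orActsAsAdd` and `alignedEnough` are hypothetical and are not taken from the patch, which works on DAG nodes and alignment information rather than concrete values.

```cpp
#include <cassert>
#include <cstdint>

// True when no set bit of `inc` overlaps a set bit of `addr`. In that case
// there are no carries, so (addr | inc) == (addr + inc).
constexpr bool orActsAsAdd(std::uint64_t addr, std::uint64_t inc) {
  return (addr & inc) == 0;
}

// A sufficient condition that can be checked from alignment alone
// (assumption: `align` is a power of two). If addr is a multiple of align
// and inc < align, the bits of inc fall entirely in addr's zero low bits.
constexpr bool alignedEnough(std::uint64_t addr, std::uint64_t align,
                             std::uint64_t inc) {
  return (addr % align == 0) && (inc < align);
}

// Aligned base: OR and ADD agree.
static_assert(alignedEnough(0x1000, 32, 16), "0x1000 is 32-byte aligned");
static_assert((0x1000 | 16) == (0x1000 + 16), "OR equals ADD here");

// Misaligned base: bit 4 is already set, so OR would not advance the pointer.
static_assert(!orActsAsAdd(0x1010, 16), "bits overlap");
static_assert((0x1010 | 16) != (0x1010 + 16), "OR differs from ADD here");
```

With a 32-byte-aligned base, an increment of 16 can therefore be treated as a pointer update even when it was written as an OR.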
-
Simon Pilgrim authored
Avoids unused assignment scan-build warning.
-
Simon Pilgrim authored
Fixes scan-build warning about dead initialization
-
Kirill Bobyrev authored
Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D111698
-
Nicolas Vasilache authored
-
Simon Pilgrim authored
Without SSE41 sext/zext instructions the extensions will be split, meaning that the MUL->PMADDWD fold will split the sext_i32(x) into zext_i32(sext_i16(x))
-
Simon Pilgrim authored
2 returns, one after the other - reported by coverity
-
Alex Zinenko authored
Improve support for variadic regions in ODS-generated operation view classes. In particular, make generated constructors take an extra argument that specifies the number of variadic regions if the operation has them. Previously, there was no mechanism to specify a non-zero number of variadic regions. Also generate named accessors to regions. Reviewed By: gysit Differential Revision: https://reviews.llvm.org/D111783
-
Alex Zinenko authored
MemRefType was using a wrong `isa` function in the bindings code, which could lead to invalid IR being constructed. Also run the verifier in memref dialect tests. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D111784
-
Jeremy Morse authored
Some functions get opted out of instruction referencing if they're being compiled with no optimisations, however the LiveDebugValues pass picks one implementation and then sticks with it through the rest of compilation. This leads to a segfault if we encounter a function that doesn't use instr-ref (because it's optnone, for example), but we've already decided to use InstrRefBasedLDV which expects to be passed a DomTree. Solution: keep both implementations around in the pass, and pick whichever one is appropriate to the current function.
-
Uday Bondhugula authored
Fix assert crash when an unregistered dialect op is encountered during parsing and `-allow-unregistered-dialect` isn't on. Instead, emit an error. While on this, clean up "registered" vs "loaded" on `getDialect()` and local clang-tidy warnings. https://llvm.discourse.group/t/assert-behavior-on-unregistered-dialect-ops/4402 Differential Revision: https://reviews.llvm.org/D111628
-
Tobias Gysi authored
Setting the nofold attribute enables packing an operand. At the moment, the attribute is set by default. The patch introduces a callback to control the flag. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D111718
-
Josh Mottley authored
This patch replaces all uses of std::vector with llvm::SmallVector in the flang-omp-report plugin. This is one of several patches focusing on switching containers from STL to LLVM's ADT library. Reviewed By: Leporacanthicus Differential Revision: https://reviews.llvm.org/D111709
-
Tobias Gysi authored
After removing the last LinalgOps that have no region attached we can verify there is a region. The patch performs the following changes:
  - Move the SingleBlockImplicitTerminator trait further up to the structured op base class.
  - Adapt the LinalgOp verification since the trait only checks if there is 0 or 1 block.
  - Introduce a getBlock method on the LinalgOp interface.
  - Access the LinalgOp body using either getBlock() or getBody() if the concrete operation type is known.
This patch is a follow up to https://reviews.llvm.org/D111233. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D111393
-
Pavel Labath authored
-
Jonas Paulsson authored
This reverts 3562076d and includes some refactoring as well. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D111733
-
Jonas Paulsson authored
This patch fixes a bug caused by treating variable-length and immediate-length mem operations (such as memcpy, memset, ...) differently. The variable-length case needs to have the length minus 1 passed due to the use of EXRL target instructions. However, the DAGCombiner can convert a register length argument into a constant one, and whenever that happened one byte too few would be processed. This is also a refactoring that reduces the number of opcodes and variants involved. For any opcode (variable or constant length), only the length minus one is passed on to the ISD node. The rest of the logic is now instead handled during isel pseudo expansion. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D111729
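The off-by-one described in this message can be illustrated with a toy model. This is only a sketch: the function names are hypothetical, and it models the two conventions (plain byte count versus length minus one) as plain integers rather than reproducing the backend's node handling.

```cpp
#include <cassert>
#include <cstdint>

// Producer side: the value attached to the node is the byte count minus one.
// Precondition (assumed here): bytes > 0.
constexpr std::uint64_t encodeLength(std::uint64_t bytes) { return bytes - 1; }

// Model of the bug: the consumer assumes a constant operand is already a
// plain byte count, while a register operand is length minus one. If a
// register operand is later folded to a constant, one byte is lost.
constexpr std::uint64_t bytesProcessedBuggy(std::uint64_t operand,
                                            bool isConstant) {
  return isConstant ? operand : operand + 1;
}

// Model of the fix: every operand, constant or not, is length minus one,
// so the consumer always adds the one back.
constexpr std::uint64_t bytesProcessedFixed(std::uint64_t operand,
                                            bool /*isConstant*/) {
  return operand + 1;
}

static_assert(bytesProcessedBuggy(encodeLength(16), false) == 16,
              "register length: correct before the fix");
static_assert(bytesProcessedBuggy(encodeLength(16), true) == 15,
              "length folded to a constant: one byte short");
static_assert(bytesProcessedFixed(encodeLength(16), true) == 16,
              "single convention: correct in both cases");
```

Using one convention for every opcode removes the case distinction that the constant folding could silently invalidate.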
-
Martin Storsjö authored
This makes the compiler generated code for accessing the thread local variable much simpler (no need for wrapper functions and weak pointers to potential init functions), and can avoid toolchain bugs regarding how to access TLS variables. In particular, this fixes LLDB when built with current GCC/binutils for MinGW, see https://github.com/msys2/MINGW-packages/issues/8868. Differential Revision: https://reviews.llvm.org/D111779
-
Pavel Labath authored
When we know the bounds of the array, print any embedded nuls instead of treating them as terminators. An exception to this rule is made for the nul character at the very end of the string. We don't print that, as otherwise 99% of the strings would end in \0. This way the strings usually come out the same as how the user typed them into the compiler (char foo[] = "with\0nuls"). It also matches how they come out in gdb. This resolves a FIXME left from D111399, and leaves another FIXME for dealing with nul characters in "escape-non-printables=false" mode. In this mode the characters cause the entire summary string to be terminated prematurely. Differential Revision: https://reviews.llvm.org/D111634
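The rule described above (print embedded nuls, drop a single trailing one) can be sketched as a small standalone function. `summarizeCharArray` is a hypothetical helper written for illustration; it is not the debugger's formatter and it only escapes nul characters, nothing else.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds a quoted summary for a char array whose bounds are known.
// Embedded nuls are shown as \0; one nul at the very end is not printed.
std::string summarizeCharArray(const char *data, std::size_t size) {
  // Drop the final terminator, if present, so most strings do not end in \0.
  if (size > 0 && data[size - 1] == '\0')
    --size;

  std::string out = "\"";
  for (std::size_t i = 0; i < size; ++i) {
    if (data[i] == '\0')
      out += "\\0"; // embedded nul: print it instead of stopping
    else
      out += data[i];
  }
  out += "\"";
  return out;
}
```

For `char foo[] = "with\0nuls"`, calling `summarizeCharArray(foo, sizeof(foo))` yields the text `"with\0nuls"`, which matches what was typed in the source.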
-
Max Kazantsev authored
Replace the check `if ((ExitIfTrue && CI->isZero()) || (!ExitIfTrue && CI->isOne()))` with the equivalent and simpler `if (ExitIfTrue == CI->isZero())`.
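The equivalence can be checked by enumerating all four input combinations. The sketch below assumes the constant is a one-bit value, so that `isOne` is exactly the negation of `isZero`; the plain `bool` parameters stand in for the `CI->isZero()` result and are not the real API.

```cpp
#include <cassert>

// Original form of the check. For a one-bit constant (an assumption of this
// sketch), isOne is simply !isZero.
constexpr bool originalCheck(bool exitIfTrue, bool isZero) {
  const bool isOne = !isZero;
  return (exitIfTrue && isZero) || (!exitIfTrue && isOne);
}

// Simplified form.
constexpr bool simplifiedCheck(bool exitIfTrue, bool isZero) {
  return exitIfTrue == isZero;
}

// Exhaustively compares both forms over the four possible inputs.
constexpr bool formsAgree() {
  for (int e = 0; e < 2; ++e)
    for (int z = 0; z < 2; ++z)
      if (originalCheck(e != 0, z != 0) != simplifiedCheck(e != 0, z != 0))
        return false;
  return true;
}

static_assert(formsAgree(), "the two forms must agree on all four inputs");
```

The `static_assert` makes the compiler verify the claim, so the file only builds if the two forms are equivalent under the stated assumption.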
-
Max Kazantsev authored
Check lightweight getter condition before calling all_of.
-
Arthur Eubanks authored
gcc does not support __has_feature(), so this was accidentally changed in D111581 when compiling with gcc.
-
Valentin Clement authored
Remove unused variable that breaks -Werror on some buildbots
-
Ben Shi authored
Optimize immediate materialisation in the following way if profitable: 1. Use BCLRI for upper 32 bits if the lower 32 bits are negative int32. 2. Use BSETI for upper 32 bits if the lower 32 bits are positive int32. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D111508
-
Kazu Hirata authored
-
Abinav Puthan Purayil authored
The 24-bit mul intrinsics yield the low-order 32 bits. We should only do the transformation if the operands are known to be no wider than 24 bits and the result is known to be no wider than 32 bits. Differential Revision: https://reviews.llvm.org/D111523
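The two conditions can be illustrated with a value-based model. This is a sketch under assumptions: `mulU24` emulates an unsigned 24-bit multiply that returns the low 32 bits of the product, and `canUseMulU24` checks concrete values, whereas the actual transformation reasons about known bits at compile time. Both names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Emulates an unsigned 24-bit multiply: only the low 24 bits of each operand
// participate, and only the low 32 bits of the product are returned.
constexpr std::uint32_t mulU24(std::uint32_t a, std::uint32_t b) {
  const std::uint64_t product =
      static_cast<std::uint64_t>(a & 0xFFFFFFu) * (b & 0xFFFFFFu);
  return static_cast<std::uint32_t>(product);
}

// True when `value` needs no more than `bits` bits (bits < 64 assumed).
constexpr bool fitsInBits(std::uint64_t value, unsigned bits) {
  return (value >> bits) == 0;
}

// A wide multiply may be replaced by mulU24 only if both operands fit in
// 24 bits and the full product fits in 32 bits.
constexpr bool canUseMulU24(std::uint64_t a, std::uint64_t b) {
  return fitsInBits(a, 24) && fitsInBits(b, 24) && fitsInBits(a * b, 32);
}

// Operands and result are narrow enough: the replacement is exact.
static_assert(canUseMulU24(0xFFFF, 0xFFFF), "product fits in 32 bits");
static_assert(mulU24(0xFFFF, 0xFFFF) == 0xFFFE0001u, "same as full product");

// Operands fit in 24 bits but the product needs 48 bits: not safe.
static_assert(!canUseMulU24(0xFFFFFF, 0xFFFFFF), "product too wide");
static_assert(mulU24(0xFFFFFF, 0xFFFFFF) == 0xFE000001u, "high bits lost");
```

The second pair of assertions shows why checking the operand width alone is not enough: the truncated result `0xFE000001` differs from the full product `0xFFFFFE000001`.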
-
Tom Stellard authored
Reviewed By: smeenai Differential Revision: https://reviews.llvm.org/D110976
-
Ben Shi authored
Use LUI+SLLI.UW to compose the upper bits instead of LUI+SLLI. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D111705
-
Ben Shi authored
Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D111704
-
Stella Laurenzo authored
* Incorporates a reworked version of D106419 (which I have closed but has comments on it). * Extends the standalone example to include a minimal CAPI (for registering its dialect) and a test which, from out of tree, creates an aggregate dylib and links a little sample program against it. This will likely only work today in *static* MLIR builds (until the TypeID fiasco is finally put to bed). It should work on all platforms, though (including Windows - albeit I haven't tried this exact incarnation there). * This is the biggest pre-requisite to being able to build out of tree MLIR Python-based projects from an installed MLIR/LLVM. * I am rather nauseated by the CMake shenanigans I had to endure to get this working. The primary complexity, above and beyond the previous patch is because (with no reason given), it is impossible to export target properties that contain generator expressions... because, of course it isn't. In this case, the primary reason we use generator expressions on the individual embedded libraries is to support arbitrary ordering. Since that need doesn't apply to out of tree (which import everything via FindPackage at the outset), we fall back to a more imperative way of doing the same thing if we detect that the target was imported. Gross, but I don't expect it to need a lot of maintenance. * There should be a relatively straight-forward path from here to rebase libMLIR.so on top of this facility and also make it include the CAPI. Differential Revision: https://reviews.llvm.org/D111504
-
Lang Hames authored
-
wlei authored
The first LBR entry can be an external branch, in which case we should ignore the whole trace.
```
7f7448e889e4 0x7f7448e889e4/0x7f7448e88826/P/-/-/1 0x7f7448e8899f/0x7f7448e889d8/P/-/-/4 ...
```
Reviewed By: wenlei, hoy Differential Revision: https://reviews.llvm.org/D111749
-
wlei authored
With `ignore-stack-samples`, we can ignore the call stack before sample aggregation, which avoids some redundant computation. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D111577
-
Lang Hames authored
SimpleRemoteEPC notionally allowed subclasses to override the createMemoryManager and createMemoryAccess methods to use custom objects, but could not actually be subclassed in practice (the construction process in SimpleRemoteEPC::Create could not be re-used). Instead of subclassing, this commit adds a SimpleRemoteEPC::Setup class that can be used by clients to set up the memory manager and memory access members. A default-constructed Setup object results in no change from previous behavior (EPCGeneric* memory manager and memory access objects used by default).
-
Lang Hames authored
-