- Feb 08, 2021
-
-
Tres Popp authored
Now the context is the first, rather than the last input. This better matches the rest of the infrastructure and makes it easier to move these types to being declaratively specified. Differential Revision: https://reviews.llvm.org/D96111
-
Vladislav Vinogradov authored
This allows using `NativeOpTrait` and operations declared outside of the `mlir` namespace. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D96128
- Feb 06, 2021
-
-
Tung D. Le authored
This patch fixes the following bug when calling --affine-loop-fusion. Input program:
```mlir
func @should_not_fuse_since_top_level_non_affine_non_result_users(
    %in0 : memref<32xf32>, %in1 : memref<32xf32>) {
  %c0 = constant 0 : index
  %cst_0 = constant 0.000000e+00 : f32
  affine.for %d = 0 to 32 {
    %lhs = affine.load %in0[%d] : memref<32xf32>
    %rhs = affine.load %in1[%d] : memref<32xf32>
    %add = addf %lhs, %rhs : f32
    affine.store %add, %in0[%d] : memref<32xf32>
  }
  store %cst_0, %in0[%c0] : memref<32xf32>
  affine.for %d = 0 to 32 {
    %lhs = affine.load %in0[%d] : memref<32xf32>
    %rhs = affine.load %in1[%d] : memref<32xf32>
    %add = addf %lhs, %rhs: f32
    affine.store %add, %in0[%d] : memref<32xf32>
  }
  return
}
```
Calling --affine-loop-fusion on it produced an incorrect output:
```mlir
func @should_not_fuse_since_top_level_non_affine_non_result_users(%arg0: memref<32xf32>, %arg1: memref<32xf32>) {
  %c0 = constant 0 : index
  %cst = constant 0.000000e+00 : f32
  store %cst, %arg0[%c0] : memref<32xf32>
  affine.for %arg2 = 0 to 32 {
    %0 = affine.load %arg0[%arg2] : memref<32xf32>
    %1 = affine.load %arg1[%arg2] : memref<32xf32>
    %2 = addf %0, %1 : f32
    affine.store %2, %arg0[%arg2] : memref<32xf32>
    %3 = affine.load %arg0[%arg2] : memref<32xf32>
    %4 = affine.load %arg1[%arg2] : memref<32xf32>
    %5 = addf %3, %4 : f32
    affine.store %5, %arg0[%arg2] : memref<32xf32>
  }
  return
}
```
This happened because, when analyzing the source and destination nodes, affine loop fusion ignored non-result ops sandwiched between them. In other words, the MemRefDependencyGraph in the affine loop fusion ignored these non-result ops. This patch solves the issue by adding these non-result ops to the MemRefDependencyGraph. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D95668
-
Mehdi Amini authored
This new invoke packs a list of arguments before calling the `invokePacked` method. It accepts the returned value as an output argument wrapped in `ExecutionEngine::Result<T>`, and delegates the packing of arguments to a trait to allow customization for some types. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D95961
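The packing scheme described above can be sketched in standalone C++. This is only an illustration under stated assumptions: `Result`, `Argument`, `PackedFn`, `packedAdd`, and `invoke` are hypothetical names defined here, not the actual `ExecutionEngine` API, and the packed convention (an array of pointers, with the output last) is assumed for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical wrapper marking an output argument (not the MLIR class).
template <typename T>
struct Result {
  explicit Result(T &ref) : value(ref) {}
  T &value;
};

// Hypothetical packing trait: by default, pack the address of the argument.
template <typename T>
struct Argument {
  static void *pack(T &arg) { return &arg; }
};

// Customization point: a Result<T> is packed as a pointer to the wrapped
// storage, so the callee writes directly into the caller's variable.
template <typename T>
struct Argument<Result<T>> {
  static void *pack(Result<T> &res) { return &res.value; }
};

// Assumed packed calling convention for this sketch.
using PackedFn = void (*)(void **);

// Stand-in for a compiled function: args[0] and args[1] are inputs,
// args[2] is the output slot.
inline void packedAdd(void **args) {
  assert(args != nullptr);
  int lhs = *static_cast<int *>(args[0]);
  int rhs = *static_cast<int *>(args[1]);
  *static_cast<int *>(args[2]) = lhs + rhs;
}

// Packs every argument through the trait, then makes the packed call.
// The by-value copies in `args` stay alive for the duration of the call.
template <typename... Args>
void invoke(PackedFn fn, Args... args) {
  std::vector<void *> packed = {Argument<Args>::pack(args)...};
  fn(packed.data());
}
```

With these definitions, `int sum = 0; invoke(&packedAdd, 2, 3, Result<int>(sum));` leaves `sum` holding the value written by the callee. Routing the packing through a trait means a new argument type only needs a new specialization, not a change to `invoke`.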
-
- Feb 05, 2021
-
-
Lei Zhang authored
These patterns move vector.bitcast ops to be before insert ops or after extract ops where suitable. With them, bitcast will happen on smaller vectors and there are more chances to share extract/insert ops. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D96040
-
Lei Zhang authored
Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D96041
-
Lei Zhang authored
This patch introduces a few more straightforward patterns to convert vector ops operating on 1-4 element vectors to their corresponding SPIR-V counterparts. This patch also enables converting vector<1xT> to T. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D96042
-
Lei Zhang authored
This patch adds patterns to use vector.shape_cast to cast away leading 1-dimensions from a few vector operations. It allows exposing more canonical forms of vector.transfer_read, vector.transfer_write, vector.extract_strided_slice, and vector.insert_strided_slice. With this, we have more opportunities to cancel extract/insert ops or forward write/read ops. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D95873
-
Nicolas Vasilache authored
-
Alex Zinenko authored
Historically, Linalg To LLVM conversion subsumed numerous other conversions, including (affine) loop lowerings to CFG and conversions from the Standard and Vector dialects to the LLVM dialect. This was due to the insufficient support for partial conversions in the infrastructure that essentially required conversions that involve type change (in this case, !linalg.range to !llvm.struct) to be performed in a single conversion sweep. This is no longer the case so remove the subsumed conversions and run them as separate passes when necessary. Depends On D95317 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D96008
-
Vladislav Vinogradov authored
The `AffineMap` class follows the same semantics as Type and Attribute. It is an immutable object, so it makes sense to mark its methods as const. Part of its API is already marked as const; this change just makes the API consistent. Reviewed By: ftynse, bondhugula Differential Revision: https://reviews.llvm.org/D96026
-
Nicolas Vasilache authored
Differential Revision: https://reviews.llvm.org/D96116
-
Nicolas Vasilache authored
Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D96094
-
River Riddle authored
This makes ignoring a result an explicit choice by the user, and helps to prevent accidental errors with dropped results. Marking LogicalResult as nodiscard was always the intention from the beginning, but got lost along the way. Differential Revision: https://reviews.llvm.org/D95841
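A minimal standalone sketch of the effect, using a hypothetical `Status` type rather than MLIR's `LogicalResult`. The standard C++17 `[[nodiscard]]` attribute is an assumption of this example; the actual change may rely on a project-specific macro instead.

```cpp
#include <cassert>

// Hypothetical stand-in for a success/failure result type.
// Compilers warn when a value of this type is silently dropped.
struct [[nodiscard]] Status {
  bool ok;
};

inline Status success() { return Status{true}; }
inline Status failure() { return Status{false}; }

// Hypothetical operation that can fail.
inline Status checkPositive(int value) {
  return value > 0 ? success() : failure();
}

inline void example() {
  // Consuming the result: no diagnostic.
  assert(checkPositive(1).ok);

  // checkPositive(-1);      // would trigger a "discarded result" warning

  // Ignoring the result has to be spelled out explicitly.
  (void)checkPositive(-1);
}
```

The explicit `(void)` cast documents at the call site that the failure case is intentionally not handled.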
-
- Feb 04, 2021
-
-
Diego Caballero authored
Reviewed By: mehdi_amini, rriddle Differential Revision: https://reviews.llvm.org/D95906
-
Mehdi Amini authored
-
Nicolas Vasilache authored
This revision takes advantage of recent extensions to vectorization to refactor contraction detection into a bona fide Linalg interface. The mlir-linalg-ods-gen parser is extended to support adding such interfaces. The detection that was originally enabling vectorization is refactored to serve both as a test on a generic LinalgOp and as a way to verify ops that declare conformance to that interface. This is plugged through Linalg transforms and strategies, but it quickly becomes evident that the complexity and rigidity of the C++ class-based templating does not pay for itself. Therefore, this revision changes the API for vectorization patterns to get rid of templates as much as possible. Variadic templates are relegated to the internals of LinalgTransformationFilter as much as possible and away from the user-facing APIs. It is expected that other patterns / transformations will follow the same path and drop as much C++ templating as possible from the class definition. Differential revision: https://reviews.llvm.org/D95973
-
Alexander Belyaev authored
Differential Revision: https://reviews.llvm.org/D96024
-
Nicolas Vasilache authored
This op is subsumed by rank-reducing SubViewOp and has become useless. Differential revision: https://reviews.llvm.org/D95317
-
Alex Zinenko authored
Historically, the Vector to LLVM dialect conversion subsumed the Standard to LLVM dialect conversion patterns. This was necessary because the conversion infrastructure did not have sufficient support for reconciling type conversions. This support is now available. Only keep the patterns related to the Vector dialect in the Vector to LLVM conversion and require type cast operations to be inserted if necessary. These casts will be removed by following conversions if possible. Update integration tests to also run the Standard to LLVM conversion. There is a significant amount of test churn, which is due to (a) unnecessarily strict tests in VectorToLLVM and (b) many patterns actually targeting Standard dialect ops instead of LLVM dialect ops, leading to tests actually exercising a Vector->Standard->LLVM conversion. This churn is a good illustration of the reason to make the conversion partial: now the tests only check the code in the Vector to LLVM conversion and will not be randomly broken by changes in the Standard to LLVM conversion. Arguably, it may be possible to extract Vector to Standard patterns into a separate pass, but given the ongoing splitting of the Standard dialect, such a pass would be short-lived and would require further refactoring. Depends On D95626 Reviewed By: nicolasvasilache, aartbik Differential Revision: https://reviews.llvm.org/D95685
-
Alex Zinenko authored
In the dialect conversion infrastructure, source materialization applies as part of the finalization procedure to results of the newly produced operations that replace previously existing values with values having a different type. However, such operations may be created to replace operations created in other patterns. At this point, it is possible that the results of the _original_ operation are still in use and have mismatching types, but the results of the _intermediate_ operation that performed the type change are not in use, leading to the absence of source materialization. For example,
    %0 = dialect.produce : !dialect.A
    dialect.use %0 : !dialect.A
can be replaced with
    %0 = dialect.other : !dialect.A
    %1 = dialect.produce : !dialect.A  // replaced, scheduled for removal
    dialect.use %1 : !dialect.A
and then with
    %0 = dialect.final : !dialect.B
    %1 = dialect.other : !dialect.A    // replaced, scheduled for removal
    %2 = dialect.produce : !dialect.A  // replaced, scheduled for removal
    dialect.use %2 : !dialect.A
in the same rewriting, but only the %1->%0 replacement is currently considered. Change the logic in dialect conversion to look up all values that were replaced by the given value and perform source materialization if any of those values is still in use with mismatching types. This is performed by computing the inverse value replacement mapping. This arguably expensive manipulation is performed only if there were some type-changing replacements. An alternative could be to consider all replaced operations and not only those that resulted in type changes, but it would harm pattern-level composability: the pattern that performed the non-type-changing replacement would have to be made aware of the type converter in order to call the materialization hook. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D95626
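The inverse-mapping lookup can be sketched as follows. This is a simplified model, not the dialect conversion code: values are represented by strings, and `Mapping`, `InverseMapping`, `invert`, and `allReplacedBy` are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Simplified model: a value is identified by a name.
using Value = std::string;
// Forward mapping: replaced value -> its replacement.
using Mapping = std::map<Value, Value>;
// Inverse mapping: replacement -> values it directly replaced.
using InverseMapping = std::map<Value, std::vector<Value>>;

// Computes the inverse of the value replacement mapping.
inline InverseMapping invert(const Mapping &mapping) {
  InverseMapping inverse;
  for (const auto &entry : mapping) {
    assert(entry.first != entry.second && "a value cannot replace itself");
    inverse[entry.second].push_back(entry.first);
  }
  return inverse;
}

// Collects every value that was directly or transitively replaced by
// `value`. A caller would then check whether any of them is still in use
// with a mismatching type and, if so, materialize a conversion.
inline std::set<Value> allReplacedBy(const InverseMapping &inverse,
                                     const Value &value) {
  std::set<Value> result;
  std::vector<Value> worklist = {value};
  while (!worklist.empty()) {
    Value current = worklist.back();
    worklist.pop_back();
    auto it = inverse.find(current);
    if (it == inverse.end())
      continue;
    for (const Value &replaced : it->second)
      if (result.insert(replaced).second)
        worklist.push_back(replaced);
  }
  return result;
}
```

For the example in the message, the forward mapping is `produce -> other` and `other -> final`, so querying `final` yields both `other` and `produce`, including the original value that a single-step lookup would miss.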
-
Nicolas Vasilache authored
This revision defines a Linalg contraction in general terms:
1. Has 2 input and 1 output shapes.
2. Has at least one reduction dimension.
3. Has only projected permutation indexing maps.
4. Its body computes `u5(u1(c) + u2(u3(a) * u4(b)))` on some field (AddOpType, MulOpType), where u1, u2, u3, u4 and u5 represent scalar unary operations that may change the type (e.g. for mixed-precision).

As a consequence, when vectorization of such an op occurs, the only special behavior is that the (unique) MulOpType is vectorized into a `vector.contract`. All other ops are handled in a generic fashion. In the future, we may wish to allow more input arguments and elementwise and constant operations that do not involve the reduction dimension(s). A test is added to demonstrate the proper vectorization of matmul_i8_i8_i32. Differential revision: https://reviews.llvm.org/D95939
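As a scalar reference for the body shape in item 4, the sketch below computes a mixed-precision matmul with 8-bit inputs and a 32-bit accumulator. It assumes that `u3` and `u4` are sign extensions from i8 to i32 and that `u1`, `u2`, and `u5` are identities; that choice, the row-major layout, and the function name `matmulI8I8I32` are assumptions of this example, not taken from the revision.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference: C(m, n) = C(m, n) + ext(A(m, k)) * ext(B(k, n)),
// with row-major A (M x K), B (K x N), and C (M x N).
// `k` is the reduction dimension; `m` and `n` are parallel dimensions.
inline void matmulI8I8I32(const std::vector<int8_t> &a,
                          const std::vector<int8_t> &b,
                          std::vector<int32_t> &c, std::size_t M,
                          std::size_t N, std::size_t K) {
  assert(a.size() == M * K && b.size() == K * N && c.size() == M * N);
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      for (std::size_t k = 0; k < K; ++k) {
        // u3 and u4: widen the 8-bit operands before multiplying.
        int32_t lhs = static_cast<int32_t>(a[m * K + k]);
        int32_t rhs = static_cast<int32_t>(b[k * N + n]);
        // u1, u2, u5 are identities in this sketch.
        c[m * N + n] = c[m * N + n] + lhs * rhs;
      }
    }
  }
}
```

Widening before the multiply is what makes the type-changing unary operations matter: `100 * 100` does not fit in 8 bits, but the product is exact once both operands are extended to 32 bits.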
-
Nicolas Vasilache authored
This separation improves the layering and paves the way for more interfaces coming up in the future. Differential revision: https://reviews.llvm.org/D95941
-
Mehdi Amini authored
We could extend this with an interface to allow dialects to perform a type conversion, but that would make the folder create operations, which isn't the case at the moment and isn't necessarily always desirable. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D95991
-
- Feb 03, 2021
-
-
George authored
This just exposes a missing API Differential Revision: https://reviews.llvm.org/D95968
-
Christian Sigg authored
Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D94421
-
Matthew Parkinson authored
The AsyncRuntime declares prototypes for extern "C" functions inside a namespace in the header, but not inside that namespace in the definition. This causes Visual Studio to treat them as different entities, and thus the dllexport is ignored for the definitions. Using the same namespace fixes this issue. Secondly, this commit moves the dllexport to be consistent with the JIT's expectation. This is an update to https://reviews.llvm.org/D95386 that fixes the compile issues in old versions of Visual Studio. Differential Revision: https://reviews.llvm.org/D95933
-
Lei Zhang authored
We should check whether lb + step >= ub to determine whether this is a single iteration. Previously we were checking lb + lb >= ub. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D95440
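A small sketch of the corrected check, assuming constant bounds, a positive step, and a loop of the form `for (i = lb; i < ub; i += step)`. The function name is hypothetical, and the extra `lb < ub` guard (excluding zero-trip loops) is an addition of this example rather than part of the described fix.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if `for (i = lb; i < ub; i += step)` executes exactly once.
inline bool isSingleIteration(int64_t lb, int64_t ub, int64_t step) {
  assert(step > 0 && "this sketch only handles positive steps");
  // The second iteration would start at lb + step; the loop runs once
  // only if that value is already past the upper bound.
  return lb < ub && lb + step >= ub;
}
```

For instance, with `lb = 8`, `ub = 10`, `step = 1` the loop runs twice, and `lb + step >= ub` correctly evaluates to false, whereas the previous `lb + lb >= ub` check evaluates to true.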
-
Diego Caballero authored
Add the conversion pattern for vector.bitcast to lower it to the LLVM Dialect. Reviewed By: ThomasRaoux, aartbik Differential Revision: https://reviews.llvm.org/D95579
-
- Feb 02, 2021
-
-
Mehdi Amini authored
This reverts commit b7d80058. The mlir-windows buildbot is broken.
-
River Riddle authored
This revision adds two new classes, RewriterBase and IRRewriter. RewriterBase is a new shared base class between IRRewriter and PatternRewriter. PatternRewriter will continue to be the base class used to perform rewrites within a rewrite pattern. IRRewriter, on the other hand, is a new class that allows for tracking IR rewrites from outside of a rewrite pattern. In this revision all of the old API from PatternRewriter is moved to RewriterBase, but the distinction between IRRewriter and PatternRewriter is kept on the chance that a necessary API divergence happens in the future. Currently, if you want to have some utility that transforms a piece of IR and share it between pattern and non-pattern code, you have to duplicate it. This revision enables the creation of utilities that can be invoked from rewrite patterns and normal transformation code:
```c++
void someSharedUtility(RewriterBase &rewriter, ...) {
  // Some interesting IR mutation here.
}

// Some RewritePattern
LogicalResult MyPattern::matchAndRewrite(Operation *op, PatternRewriter &rewriter) {
  ...
  someSharedUtility(rewriter, ...);
  ...
}

// Some Pass
void MyPass::runOnOperation() {
  ...
  IRRewriter rewriter(...);
  someSharedUtility(rewriter, ...);
}
```
Differential Revision: https://reviews.llvm.org/D94638
-
Matthew Parkinson authored
The MLIR Async runtime uses different namespacing for the header file, and the definitions of its C API. The header file places the extern "C" functions inside namespace mlir::runtime, and the definitions are not in a namespace. This causes issues in cl.exe. It treats the declaration and definition as different, and thus does not apply dllexport to the definition, which leads to the mlir_async_runtime.dll containing no definitions, and the mlir_async_runtime.lib not being generated. This patch moves the namespace to cover the definitions, and thus generates the dll correctly on Windows with cl.exe. This was tested with Visual Studio C++ 19.28.29336. Differential Revision: https://reviews.llvm.org/D95386
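The declaration/definition pairing described above can be illustrated with a small sketch. All names here (`SKETCH_EXPORT`, `sketch::runtime`, `sketchAddOne`) are hypothetical and not taken from the Async runtime; the export macro is an assumed, simplified form of what a real header would provide.

```cpp
#include <cassert>

// Hypothetical export macro for this sketch.
#if defined(_WIN32)
#define SKETCH_EXPORT __declspec(dllexport)
#else
#define SKETCH_EXPORT __attribute__((visibility("default")))
#endif

// "Header" part: the C API is declared inside a namespace.
namespace sketch {
namespace runtime {
extern "C" SKETCH_EXPORT int sketchAddOne(int value);
} // namespace runtime
} // namespace sketch

// "Source" part: the definition is placed in the same namespace as the
// declaration, so the compiler sees one entity and the export attribute
// from the declaration applies to the definition.
namespace sketch {
namespace runtime {
extern "C" int sketchAddOne(int value) { return value + 1; }
} // namespace runtime
} // namespace sketch

inline void example() {
  assert(sketch::runtime::sketchAddOne(41) == 42);
}
```

Because the function has C linkage, its exported symbol name carries no namespace; wrapping the definition in the namespace only affects how the compiler matches it to the declaration.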
-
Christian Sigg authored
The __resume function trips up LLVM's 'X86 DAG->DAG Instruction Selection' unless optimizations are disabled. Only adding the __resume function when it's needed allows lowering through AsyncToLLVM and LLVM without '-O0' as long as the coroutine functionality is not used. Reviewed By: ezhulenev Differential Revision: https://reviews.llvm.org/D95868
-
Vladislav Vinogradov authored
Include the types into the error message. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D95854
-
Lei Zhang authored
This patch adds basic op definition, parser/printer, and verifier. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D95825
-
Lei Zhang authored
We should check whether lb + step >= ub to determine whether this is a single iteration. Previously we were checking lb + lb >= ub. Differential Revision: https://reviews.llvm.org/D95440
-
Benjamin Kramer authored
-
Nicolas Vasilache authored
This revision unifies Linalg vectorization and paves the way for vectorization of Linalg ops with mixed-precision operations. The new algorithm traverses the ops in the linalg block in order and avoids recursion. It uses a BlockAndValueMapping to keep track of vectorized operations. The revision makes the following modifications but is otherwise NFC:
1. vector.transfer_read ops are created eagerly and may appear in a different order than the original order.
2. A more progressive vectorization to vector.contract results in only the multiply operation being converted to `vector.contract %a, %b, %zero`, where `%zero` is a constant of the proper type. Later vector canonicalizations are assumed to rewrite vector.contract %a, %b, %zero + add to a proper accumulate form.

Differential revision: https://reviews.llvm.org/D95797
-