Commits · 511dd4f4383b1c2873beac4dbea2df302f1f9d0c · Lorenzo Albano / LLVM bpEVL

Feb 08, 2021

Revert "Reorder MLIRContext location in BuiltinAttributes.h" · 511dd4f4
Tres Popp authored Feb 08, 2021

This reverts commit 7827753f.

511dd4f4

Reorder MLIRContext location in BuiltinAttributes.h · 7827753f

Tres Popp authored Feb 05, 2021

Now the context is the first, rather than the last input.

This better matches the rest of the infrastructure and makes
it easier to move these types to being declaratively specified.

Differential Revision: https://reviews.llvm.org/D96111

7827753f

[mlir][ODS] Allow to specify custom namespace for `NativeOpTrait` · 035abe30

Vladislav Vinogradov authored Feb 05, 2021

This allows using `NativeOpTrait` and operations
declared outside of the `mlir` namespace.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D96128

035abe30

Feb 06, 2021

[MLIR] [affine-loop-fusion] Fix a bug about non-result ops in affine-loop-fusion · 05c6c648

Tung D. Le authored Feb 06, 2021

This patch fixes the following bug when calling --affine-loop-fusion

Input program:
 ```mlir
func @should_not_fuse_since_top_level_non_affine_non_result_users(
    %in0 : memref<32xf32>, %in1 : memref<32xf32>) {
  %c0 = constant 0 : index
  %cst_0 = constant 0.000000e+00 : f32

  affine.for %d = 0 to 32 {
    %lhs = affine.load %in0[%d] : memref<32xf32>
    %rhs = affine.load %in1[%d] : memref<32xf32>
    %add = addf %lhs, %rhs : f32
    affine.store %add, %in0[%d] : memref<32xf32>
  }
  store %cst_0, %in0[%c0] : memref<32xf32>
  affine.for %d = 0 to 32 {
    %lhs = affine.load %in0[%d] : memref<32xf32>
    %rhs = affine.load %in1[%d] : memref<32xf32>
    %add = addf %lhs, %rhs: f32
    affine.store %add, %in0[%d] : memref<32xf32>
  }
  return
}
```

After calling --affine-loop-fusion, we got an incorrect output:

```mlir
func @should_not_fuse_since_top_level_non_affine_non_result_users(%arg0: memref<32xf32>, %arg1: memref<32xf32>) {
  %c0 = constant 0 : index
  %cst = constant 0.000000e+00 : f32
  store %cst, %arg0[%c0] : memref<32xf32>
  affine.for %arg2 = 0 to 32 {
    %0 = affine.load %arg0[%arg2] : memref<32xf32>
    %1 = affine.load %arg1[%arg2] : memref<32xf32>
    %2 = addf %0, %1 : f32
    affine.store %2, %arg0[%arg2] : memref<32xf32>
    %3 = affine.load %arg0[%arg2] : memref<32xf32>
    %4 = affine.load %arg1[%arg2] : memref<32xf32>
    %5 = addf %3, %4 : f32
    affine.store %5, %arg0[%arg2] : memref<32xf32>
  }
  return
}
```

This happened because when analyzing the source and destination nodes,
affine loop fusion ignored non-result ops sandwiched between them. In
other words, the MemRefDependencyGraph in the affine loop fusion ignored
these non-result ops.

This patch solves the issue by adding these non-result ops to the
MemRefDependencyGraph.
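
The legality rule behind this fix can be sketched in a few lines of standalone C++. This is a simplified model, not the actual `MemRefDependencyGraph`: the `Node` struct and the `canFuse` helper are hypothetical, and each top-level op is reduced to the set of memrefs it reads and writes.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a top-level op: the memrefs it reads and writes.
struct Node {
  std::set<std::string> reads;
  std::set<std::string> writes;
};

static bool intersects(const std::set<std::string> &a,
                       const std::set<std::string> &b) {
  for (const std::string &name : a)
    if (b.count(name))
      return true;
  return false;
}

// Two ops conflict when they touch a common memref and one access is a write.
static bool conflicts(const Node &a, const Node &b) {
  return intersects(a.writes, b.writes) || intersects(a.writes, b.reads) ||
         intersects(a.reads, b.writes);
}

// Fusing ops[src] into ops[dst] is rejected when any op between them
// (including ops without results, such as a plain store) conflicts with
// either loop.
bool canFuse(const std::vector<Node> &ops, std::size_t src, std::size_t dst) {
  assert(src < dst && dst < ops.size() && "expected src before dst");
  for (std::size_t k = src + 1; k < dst; ++k)
    if (conflicts(ops[k], ops[src]) || conflicts(ops[k], ops[dst]))
      return false;
  return true;
}
```

In the input program above, the `store` between the two loops writes `%in0`, which both loops read and write, so under this model `canFuse` returns false.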

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D95668

05c6c648

Rework ExecutionEngine::invoke() to make it more friendly to use from C++ · d6efb6fc

Mehdi Amini authored Feb 06, 2021

This new invoke packs a list of arguments before calling the
`invokePacked` method. It accepts the returned value as an output argument
wrapped in `ExecutionEngine::Result<T>`, and delegates the packing of
arguments to a trait to allow customization for some types.
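
The packing scheme can be illustrated with a standalone C++17 sketch. It does not use the real `ExecutionEngine`: `packedAdd`, `Argument`, `invoke`, and `addViaInvoke` are hypothetical names, and `Result<T>` is only loosely modeled on the wrapper named above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a JIT-compiled function that follows a packed
// calling convention: every argument, and the result slot, arrives as void*.
static void packedAdd(void **args) {
  int lhs = *static_cast<int *>(args[0]);
  int rhs = *static_cast<int *>(args[1]);
  *static_cast<int *>(args[2]) = lhs + rhs;
}

// Marks an output argument; the callee writes through the wrapped reference.
template <typename T> struct Result {
  explicit Result(T &value) : value(value) {}
  T &value;
};

// Trait deciding how one argument is packed; specialize it to customize.
template <typename T> struct Argument {
  static void pack(std::vector<void *> &packed, T &arg) {
    packed.push_back(&arg);
  }
};

template <typename T> struct Argument<Result<T>> {
  static void pack(std::vector<void *> &packed, Result<T> &result) {
    packed.push_back(&result.value);
  }
};

// Packs the address of each argument, then calls the packed function.
template <typename... Args>
void invoke(void (*packedFn)(void **), Args... args) {
  assert(packedFn && "expected a function to call");
  std::vector<void *> packed;
  (Argument<Args>::pack(packed, args), ...);
  packedFn(packed.data());
}

int addViaInvoke(int a, int b) {
  int sum = 0;
  invoke(packedAdd, a, b, Result<int>(sum));
  return sum;
}
```

Routing the packing through a trait means a new argument type only needs its own `Argument` specialization; `invoke` itself stays unchanged.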

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D95961

d6efb6fc

Feb 05, 2021

[mlir][vector] Add pattern to shuffle bitcast ops · 7630520a

Lei Zhang authored Feb 05, 2021

These patterns move vector.bitcast ops to be before
insert ops or after extract ops where suitable.
With them, bitcast will happen on smaller vectors
and there are more chances to share extract/insert
ops.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D96040

7630520a

[mlir][vector] Add constant folding for fp16 to fp32 bitcast · 8dae9099
Lei Zhang authored Feb 05, 2021

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D96041

8dae9099

[mlir][spirv] Add more vector conversion patterns · 9f622b3d

Lei Zhang authored Feb 05, 2021

This patch introduces a few more straightforward patterns
to convert vector ops operating on 1-4 element vectors
to their corresponding SPIR-V counterparts.

This patch also enables converting vector<1xT> to T.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D96042

9f622b3d

[mlir][vector] Add patterns to cast away leading 1-dim · 874ce9b8

Lei Zhang authored Feb 05, 2021

This patch adds patterns that use vector.shape_cast to cast
away leading 1-dimensions from a few vector operations.
This exposes more canonical forms of vector.transfer_read,
vector.transfer_write, vector.extract_strided_slice, and
vector.insert_strided_slice. With this, there are more
opportunities to cancel extract/insert ops or forward
write/read ops.
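
The shape computation behind "casting away leading 1-dimensions" can be sketched as follows. This is a standalone illustration with a hypothetical helper name; it assumes that at least one dimension must be kept, since a vector type needs a non-empty shape.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns `shape` without its leading unit dimensions.
// Assumption: the last dimension is always kept, even if it is 1.
std::vector<int64_t> trimLeadingOneDims(const std::vector<int64_t> &shape) {
  std::size_t first = 0;
  while (first + 1 < shape.size() && shape[first] == 1)
    ++first;
  return std::vector<int64_t>(shape.begin() + first, shape.end());
}

void trimExample() {
  // A 1x1x4x8 vector is viewed as 4x8; inner unit dims are left alone.
  assert((trimLeadingOneDims({1, 1, 4, 8}) == std::vector<int64_t>{4, 8}));
  assert((trimLeadingOneDims({4, 1, 8}) == std::vector<int64_t>{4, 1, 8}));
}
```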

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D95873

874ce9b8

[mlir][Linalg] NFC - Improve usage of mlir::linalg::isaContractionOpInterface · 6da8d6c6
Nicolas Vasilache authored Feb 05, 2021

6da8d6c6

[mlir] Turn Linalg to LLVM into a partial conversion · 1b101038

Alex Zinenko authored Feb 04, 2021

Historically, Linalg To LLVM conversion subsumed numerous other conversions,
including (affine) loop lowerings to CFG and conversions from the Standard and
Vector dialects to the LLVM dialect. This was due to the insufficient support
for partial conversions in the infrastructure that essentially required
conversions that involve type change (in this case, !linalg.range to
!llvm.struct) to be performed in a single conversion sweep. This is no longer
the case so remove the subsumed conversions and run them as separate passes
when necessary.

Depends On D95317

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96008

1b101038

[mlir] Add `const` qualifiers to `AffineMap` methods · f349abc2

Vladislav Vinogradov authored Feb 04, 2021

The `AffineMap` class follows the same semantics as Type and Attribute.
It is an immutable object, so it makes sense to mark its methods as const.
Part of its API is already marked as const; this change just makes the API consistent.
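
A small standalone sketch of why const-qualified accessors matter for an immutable handle. `MapHandle` and `countResults` are hypothetical and far simpler than `AffineMap`; the point is only that non-const accessors could not be called through a `const` reference.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical immutable handle: copies share storage that never changes.
class MapHandle {
public:
  MapHandle(unsigned numDims, std::vector<int> results)
      : storage(std::make_shared<const Storage>(
            Storage{numDims, std::move(results)})) {}

  // Accessors are const because nothing here can mutate the storage.
  unsigned getNumDims() const { return storage->numDims; }
  unsigned getNumResults() const {
    return static_cast<unsigned>(storage->results.size());
  }

private:
  struct Storage {
    unsigned numDims;
    std::vector<int> results;
  };
  std::shared_ptr<const Storage> storage;
};

// Compiles only because getNumResults() is const-qualified.
unsigned countResults(const MapHandle &map) { return map.getNumResults(); }

void constExample() {
  const MapHandle map(2, {0, 1, 2});
  assert(countResults(map) == 3);
}
```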

Reviewed By: ftynse, bondhugula

Differential Revision: https://reviews.llvm.org/D96026

f349abc2

[mlir][Linalg] NFC - Refactor vectorization to be more composable · 0fcbbde2
Nicolas Vasilache authored Feb 05, 2021

Differential Revision: https://reviews.llvm.org/D96116

0fcbbde2

[mlir][linalg] Linalg.fill on tensor should not have side-effects · 7f58196e
Nicolas Vasilache authored Feb 05, 2021

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D96094

7f58196e

[mlir] Mark LogicalResult as LLVM_NODISCARD · e21adfa3

River Riddle authored Feb 04, 2021

This makes ignoring a result explicit by the user, and helps to prevent accidental errors with dropped results. Marking LogicalResult as no discard was always the intention from the beginning, but got lost along the way.
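
The effect can be sketched with the standard C++17 `[[nodiscard]]` attribute. The `LogicalResult` below is a simplified stand-in, not MLIR's class, and `verifyPositive` is a hypothetical function.

```cpp
#include <cassert>

// Simplified stand-in: marking the type nodiscard makes the compiler warn
// whenever a function's LogicalResult return value is silently dropped.
struct [[nodiscard]] LogicalResult {
  bool isSuccess;
};

inline LogicalResult success() { return {true}; }
inline LogicalResult failure() { return {false}; }
inline bool succeeded(LogicalResult result) { return result.isSuccess; }
inline bool failed(LogicalResult result) { return !result.isSuccess; }

// Hypothetical check used only for illustration.
LogicalResult verifyPositive(int value) {
  return value > 0 ? success() : failure();
}

void nodiscardExample() {
  // verifyPositive(1);       // would warn: result is ignored
  (void)verifyPositive(1);    // ignoring the result must now be explicit
  assert(failed(verifyPositive(-1)));
}
```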

Differential Revision: https://reviews.llvm.org/D95841

e21adfa3

Feb 04, 2021

[mlir] Silence GCC warnings · f9f6b4f3

Diego Caballero authored Feb 04, 2021

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D95906

f9f6b4f3

Remove dead code from Linalg vectorization to fix GCC warning (NFC) · 215441fc
Mehdi Amini authored Feb 04, 2021

215441fc

[mlir][Linalg] Introduce a ContractionOpInterface · e4a503a2

Nicolas Vasilache authored Feb 04, 2021

This revision takes advantage of recent extensions to vectorization to refactor contraction detection into a bona fide Linalg interface.
The mlir-linalg-ods-gen parser is extended to support adding such interfaces.
The detection that was originally enabling vectorization is refactored to serve as both a test on a generic LinalgOp as well as to verify ops that declare to conform to that interface.

This is plugged through Linalg transforms and strategies but it quickly becomes evident that the complexity and rigidity of the C++ class based templating does not pay for itself.
Therefore, this revision changes the API for vectorization patterns to get rid of templates as much as possible.
Variadic templates are relegated to the internals of LinalgTransformationFilter as much as possible and away from the user-facing APIs.

It is expected other patterns / transformations will follow the same path and drop as much C++ templating as possible from the class definition.

Differential revision: https://reviews.llvm.org/D95973

e4a503a2

[mlir] Return scf.parallel ops resulted from tiling. · 09c18a66
Alexander Belyaev authored Feb 04, 2021

Differential Revision: https://reviews.llvm.org/D96024

09c18a66

[mlir][Linalg] Drop SliceOp · f4ac9f03

Nicolas Vasilache authored Feb 04, 2021

This op is subsumed by rank-reducing SubViewOp and has become useless.

Differential revision: https://reviews.llvm.org/D95317

f4ac9f03

[mlir] make vector to llvm conversion truly partial · ba87f991

Alex Zinenko authored Feb 04, 2021

Historically, the Vector to LLVM dialect conversion subsumed the Standard to
LLVM dialect conversion patterns. This was necessary because the conversion
infrastructure did not have sufficient support for reconciling type
conversions. This support is now available. Only keep the patterns related to
the Vector dialect in the Vector to LLVM conversion and require type cast
operations to be inserted if necessary. These casts will be removed by
following conversions if possible. Update integration tests to also run the
Standard to LLVM conversion.

There is a significant amount of test churn, which is due to (a) unnecessarily
strict tests in VectorToLLVM and (b) many patterns actually targeting Standard
dialect ops instead of LLVM dialect ops leading to tests actually exercising a
Vector->Standard->LLVM conversion. This churn is a good illustration of the
reason to make the conversion partial: now the tests only check the code in the
Vector to LLVM conversion and will not be randomly broken by changes in
Standard to LLVM conversion.

Arguably, it may be possible to extract Vector to Standard patterns into a
separate pass, but given the ongoing splitting of the Standard dialect, such
a pass will be short-lived and will require further refactoring.

Depends On D95626

Reviewed By: nicolasvasilache, aartbik

Differential Revision: https://reviews.llvm.org/D95685

ba87f991

[mlir] Apply source materialization in case of transitive conversion · 5b91060d

Alex Zinenko authored Jan 28, 2021

In dialect conversion infrastructure, source materialization applies as part of
the finalization procedure to results of the newly produced operations that
replace previously existing values with values having a different type.
However, such operations may be created to replace operations created in other
patterns. At this point, it is possible that the results of the _original_
operation are still in use and have mismatching types, but the results of the
_intermediate_ operation that performed the type change are not in use leading
to the absence of source materialization. For example,

  %0 = dialect.produce : !dialect.A
  dialect.use %0 : !dialect.A

can be replaced with

  %0 = dialect.other : !dialect.A
  %1 = dialect.produce : !dialect.A  // replaced, scheduled for removal
  dialect.use %1 : !dialect.A

and then with

  %0 = dialect.final : !dialect.B
  %1 = dialect.other : !dialect.A    // replaced, scheduled for removal
  %2 = dialect.produce : !dialect.A  // replaced, scheduled for removal
  dialect.use %2 : !dialect.A

in the same rewriting, but only the %1->%0 replacement is currently considered.

Change the logic in dialect conversion to look up all values that were replaced
by the given value and perform source materialization if any of those values
is still in use with mismatching types. This is performed by computing the
inverse value replacement mapping. This arguably expensive manipulation is
performed only if there were some type-changing replacements. An alternative
could be to consider all replaced operations and not only those that resulted
in type changes, but it would harm pattern-level composability: the pattern
that performed the non-type-changing replacement would have to be made aware of
the type converter in order to call the materialization hook.
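
The inverse-mapping lookup can be sketched in standalone C++. Values are modeled as strings and all names here are hypothetical; the real implementation works on IR values inside the dialect conversion driver.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Value = std::string;              // hypothetical stand-in for an IR value
using Mapping = std::map<Value, Value>; // replaced value -> its replacement

// Inverts the mapping: replacement -> values it directly replaced.
std::map<Value, std::set<Value>> invert(const Mapping &mapping) {
  std::map<Value, std::set<Value>> inverse;
  for (const auto &entry : mapping)
    inverse[entry.second].insert(entry.first);
  return inverse;
}

// Collects every value that `value` replaced, directly or transitively.
std::set<Value> allReplacedBy(const std::map<Value, std::set<Value>> &inverse,
                              const Value &value) {
  std::set<Value> replaced;
  std::vector<Value> worklist{value};
  while (!worklist.empty()) {
    Value current = worklist.back();
    worklist.pop_back();
    auto it = inverse.find(current);
    if (it == inverse.end())
      continue;
    for (const Value &previous : it->second)
      if (replaced.insert(previous).second)
        worklist.push_back(previous);
  }
  return replaced;
}

void inverseExample() {
  // Mirrors the example above: produce -> other -> final.
  Mapping mapping{{"produce", "other"}, {"other", "final"}};
  std::set<Value> replaced = allReplacedBy(invert(mapping), "final");
  assert(replaced.count("produce") == 1 && replaced.count("other") == 1);
}
```

With this lookup, the value produced by `dialect.final` is associated with the original `dialect.produce` result as well, so a still-live use of the original with a mismatching type can trigger source materialization.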

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D95626

5b91060d

[mlir][Linalg] Generalize the definition of a Linalg contraction. · f245b7ad

Nicolas Vasilache authored Feb 04, 2021

This revision defines a Linalg contraction in general terms:

  1. Has 2 input and 1 output shapes.
  2. Has at least one reduction dimension.
  3. Has only projected permutation indexing maps.
  4. Its body computes `u5(u1(c) + u2(u3(a) * u4(b)))` on some field
    (AddOpType, MulOpType), where u1, u2, u3, u4 and u5 represent scalar unary
    operations that may change the type (e.g. for mixed-precision).
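
As a concrete instance of this body, here is a scalar C++ sketch of a mixed-precision matmul in the spirit of `matmul_i8_i8_i32`. It assumes that `u3` and `u4` are sign extensions from `i8` to `i32` and that `u1`, `u2`, and `u5` are identities; the function and type names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using MatrixI8 = std::vector<std::vector<int8_t>>;
using MatrixI32 = std::vector<std::vector<int32_t>>;

// c(m, n) = c(m, n) + sext(a(m, k)) * sext(b(k, n)), reducing over k.
// Two inputs, one output, one reduction dimension, and projected
// permutation indexing: (m, k), (k, n), (m, n).
void matmulI8I8I32(const MatrixI8 &a, const MatrixI8 &b, MatrixI32 &c) {
  for (std::size_t m = 0; m < a.size(); ++m) {
    assert(a[m].size() == b.size() && "reduction dimension mismatch");
    for (std::size_t n = 0; n < c[m].size(); ++n)
      for (std::size_t k = 0; k < b.size(); ++k) {
        int32_t lhs = static_cast<int32_t>(a[m][k]); // u3: i8 -> i32
        int32_t rhs = static_cast<int32_t>(b[k][n]); // u4: i8 -> i32
        c[m][n] = c[m][n] + lhs * rhs;               // u1, u2, u5: identity
      }
  }
}
```

Widening before the multiply is what makes the unary ops matter: `100 * 100` does not fit in `i8` but accumulates correctly in `i32`.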

As a consequence, when vectorization of such an op occurs, the only special
behavior is that the (unique) MulOpType is vectorized into a
`vector.contract`. All other ops are handled in a generic fashion.

In the future, we may wish to allow more input arguments and elementwise and
constant operations that do not involve the reduction dimension(s).

A test is added to demonstrate the proper vectorization of matmul_i8_i8_i32.

Differential revision: https://reviews.llvm.org/D95939

f245b7ad

[mlir][Linalg] NFC - Extract a standalone LinalgInterfaces · 1029c82c

Nicolas Vasilache authored Feb 03, 2021

This separation improves the layering and paves the way for more interfaces coming up in the future.

Differential revision: https://reviews.llvm.org/D95941

1029c82c

Make the folder more robust against op fold() methods that generate a type mismatch · a1d5bdf8

Mehdi Amini authored Feb 04, 2021

We could extend this with an interface to allow dialects to perform a type
conversion, but that would make the folder create operations, which isn't
the case at the moment and isn't necessarily always desirable.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D95991

a1d5bdf8

Feb 03, 2021

Add API for adding arguments to blocks · 9db61142

George authored Feb 03, 2021

This just exposes a missing API

Differential Revision: https://reviews.llvm.org/D95968

9db61142

[mlir] Add gpu async integration test. · 8d73bee4
Christian Sigg authored Feb 03, 2021

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D94421

8d73bee4

Fix MLIR Async Runtime DLL on Windows · dd2dac2f

Matthew Parkinson authored Feb 03, 2021

The AsyncRuntime declares prototypes for extern "C" functions inside a
namespace in the header, but not inside that namespace in the
definition. This causes Visual Studio to treat them as different
entities and thus the dllexport is ignored for the definitions.

Using the same namespace fixes this issue.
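
The shape of the fix can be sketched as follows. The function name `exampleAddRef` and the `EXAMPLE_EXPORT` macro are hypothetical and do not come from the AsyncRuntime sources; the sketch only shows the declaration and the definition sitting in the same namespace.

```cpp
#include <cassert>

// Hypothetical export macro; it expands to nothing outside Windows.
#ifdef _WIN32
#define EXAMPLE_EXPORT __declspec(dllexport)
#else
#define EXAMPLE_EXPORT
#endif

// Header side: the prototype is declared inside the namespace.
namespace mlir {
namespace runtime {
extern "C" EXAMPLE_EXPORT int exampleAddRef(int count);
} // namespace runtime
} // namespace mlir

// Definition side: placed in the same namespace, so the compiler sees one
// entity and the export attribute from the declaration applies to it.
namespace mlir {
namespace runtime {
extern "C" int exampleAddRef(int count) { return count + 1; }
} // namespace runtime
} // namespace mlir

void exportExample() { assert(mlir::runtime::exampleAddRef(1) == 2); }
```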

Secondly, this commit moves the dllexport to be consistent with the
JIT's expectation.

This is an update to https://reviews.llvm.org/D95386 that fixes the
compile issues in old versions of Visual Studio.

Differential Revision: https://reviews.llvm.org/D95933

dd2dac2f

[mlir] Fix scf.for single iteration canonicalization check · 5b7619c9

Lei Zhang authored Feb 02, 2021

We should check whether lb + step >= ub to determine
whether this is a single iteration. Previously we were
checking lb + lb >= ub.
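
A standalone sketch of the corrected check on constant bounds. The helper names are hypothetical; the `lb < ub` guard and the positive-step requirement are assumptions added here so that the zero-iteration case is excluded.

```cpp
#include <cassert>
#include <cstdint>

// True when a loop `for i = lb; i < ub; i += step` runs exactly once.
// Assumes step > 0; `lb < ub` rules out the zero-iteration case.
bool isSingleIteration(int64_t lb, int64_t ub, int64_t step) {
  assert(step > 0 && "expected a positive step");
  return lb < ub && lb + step >= ub;
}

// The pre-fix condition, kept only to show where it goes wrong.
bool isSingleIterationOld(int64_t lb, int64_t ub) { return lb + lb >= ub; }

void singleIterationExample() {
  // 0 to 4 step 4 runs once; the old check missed it (0 + 0 < 4).
  assert(isSingleIteration(0, 4, 4) && !isSingleIterationOld(0, 4));
  // 3 to 5 step 1 runs twice; the old check wrongly accepted it (3 + 3 >= 5).
  assert(!isSingleIteration(3, 5, 1) && isSingleIterationOld(3, 5));
}
```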

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D95440

5b7619c9

[mlir][Vector] Add lowering to LLVM for vector.bitcast · cf5c517c

Diego Caballero authored Feb 03, 2021

Add the conversion pattern for vector.bitcast to lower it to
the LLVM Dialect.

Reviewed By: ThomasRaoux, aartbik

Differential Revision: https://reviews.llvm.org/D95579

cf5c517c

Feb 02, 2021

Revert "Fix namespace for MLIR Async Runtime" · 29fffff8
Mehdi Amini authored Feb 02, 2021

This reverts commit b7d80058.

The mlir-windows buildbot is broken.

29fffff8

[mlir][Pattern] Create a new IRRewriter class to enable sharing code with pattern rewrites · ec10f066

River Riddle authored Feb 02, 2021

This revision adds two new classes, RewriterBase and IRRewriter. RewriterBase is a new shared base class between IRRewriter and PatternRewriter. PatternRewriter will continue to be the base class used to perform rewrites within a rewrite pattern. IRRewriter on the other hand, is a new class that allows for tracking IR rewrites from outside of a rewrite pattern. In this revision all of the old API from PatternRewriter is moved to RewriterBase, but the distinction between IRRewriter and PatternRewriter is kept on the chance that a necessary API divergence happens in the future.

Currently if you want to have some utility that transforms a piece of IR and share it between pattern and non-pattern code, you have to duplicate it. This revision enables the creation of utilities that can be invoked from rewrite patterns and normal transformation code:

```c++
void someSharedUtility(RewriterBase &rewriter, ...) {
  // Some interesting IR mutation here.
}

// Some RewritePattern
LogicalResult MyPattern::matchAndRewrite(Operation *op, PatternRewriter &rewriter) {
  ...
  someSharedUtility(rewriter, ...);
  ...
}

// Some Pass
void MyPass::runOnOperation() {
  ...
  IRRewriter rewriter(...);
  someSharedUtility(rewriter, ...);
}
```

Differential Revision: https://reviews.llvm.org/D94638

ec10f066

Fix namespace for MLIR Async Runtime · b7d80058

Matthew Parkinson authored Feb 02, 2021

The MLIR Async runtime uses different namespacing for the header file,
and the definitions of its C API. The header file places the extern "C"
functions inside namespace mlir::runtime, and the definitions are not
in a namespace. This causes issues in cl.exe. It treats the declaration
and definition as different, and thus does not apply dllexport to the
definition, which leads to the mlir_async_runtime.dll containing no
definitions, and the mlir_async_runtime.lib not being generated.

This patch moves the namespace to cover the definitions, and thus
generates the dll correctly on Windows with cl.exe.

This was tested with Visual Studio C++ 19.28.29336.

Differential Revision: https://reviews.llvm.org/D95386

b7d80058

[mlir] Delay adding the __resume function · 5b388169

Christian Sigg authored Feb 02, 2021

The __resume function trips up LLVM's 'X86 DAG->DAG Instruction Selection' unless optimizations are disabled.

Only adding the __resume function when it's needed allows lowering through AsyncToLLVM and LLVM without '-O0' as long as the coroutine functionality is not used.

Reviewed By: ezhulenev

Differential Revision: https://reviews.llvm.org/D95868

5b388169

[mlir] Print more verbose message in case of type inference error · 95935849
Vladislav Vinogradov authored Feb 02, 2021

Include the types in the error message.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D95854

95935849

Revert "[mlir] Fix scf.for single iteration canonicalization check" · a2e791e3
Lei Zhang authored Feb 02, 2021

This reverts commit b2b35697.
It accidentally landed before LGTM.

a2e791e3

[mlir][spirv] Define sp.VectorShuffle · e901188c

Lei Zhang authored Feb 02, 2021

This patch adds basic op definition, parser/printer, and verifier.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D95825

e901188c

[mlir] Fix scf.for single iteration canonicalization check · b2b35697

Lei Zhang authored Feb 02, 2021

We should check whether lb + step >= ub to determine
whether this is a single iteration. Previously we were
checking lb + lb >= ub.

Differential Revision: https://reviews.llvm.org/D95440

b2b35697

[mlir][Linalg] Fix unused variable warning in Release builds. NFC. · 94f540cc
Benjamin Kramer authored Feb 02, 2021

94f540cc

[mlir][Linalg] Refactor Linalg vectorization for better reuse and extensibility. · 0a2a260a

Nicolas Vasilache authored Feb 02, 2021

This revision unifies Linalg vectorization and paves the way for vectorization of Linalg ops with mixed-precision operations.
The new algorithm traverses the ops in the linalg block in order and avoids recursion.
It uses a BlockAndValueMapping to keep track of vectorized operations.

The revision makes the following modifications but is otherwise NFC:
1. vector.transfer_read ops are created eagerly and may appear in a different order than the original order.
2. A more progressive vectorization to vector.contract results in only the multiply operation being converted to `vector.contract %a, %b, %zero`, where `%zero` is a constant of the proper type. Later vector canonicalizations are assumed to rewrite vector.contract %a, %b, %zero + add to a proper accumulate form.

Differential revision: https://reviews.llvm.org/D95797

0a2a260a