Commits · 2b0c8546ac9fb47e1bf9c5e54f1450420eadeab7 · Lorenzo Albano / LLVM bpEVL

May 28, 2020

[mlir][Linalg] Add pass to remove unit-extent dims from tensor · 2b0c8546

MaheshRavishankar authored May 28, 2020

operands of Generic ops.

Unit-extent dimensions are typically used for achieving broadcasting
behavior. The pattern added (along with canonicalization patterns
added previously) removes the use of unit-extent dimensions, and
instead uses a more canonical representation of the computation.  This
new pattern is not added as a canonicalization for now since it
entails adding additional reshape operations. A pass is added to
exercise these patterns, along with an API entry to populate a
patterns list with these patterns.

Differential Revision: https://reviews.llvm.org/D79766

2b0c8546
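
To illustrate what removing unit-extent dimensions means at the level of shapes only, here is a minimal sketch. `dropUnitDims` is a hypothetical helper, not an MLIR API; the actual pattern also rewrites indexing maps and inserts reshape operations, none of which is modeled here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of MLIR): returns `shape` with every
// unit-extent (size 1) dimension removed.
std::vector<int64_t> dropUnitDims(const std::vector<int64_t> &shape) {
  std::vector<int64_t> result;
  for (int64_t extent : shape) {
    assert(extent >= 0 && "this sketch only handles static extents");
    if (extent != 1)
      result.push_back(extent);
  }
  return result;
}
```

For example, a broadcast-style operand of shape 1x5x1x4 would be viewed as 5x4 under this model.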

[mlir][GPU] Link relevant LLVM components in GPUCommon instead of test · 72ede60b

Alex Zinenko authored May 28, 2020

D80142 restructured MLIR-to-GPU-binary conversion to support multiple
targets. It also modified cmake files to link relevant LLVM components
in test/lib, which broke shared-library builds, and likely made the
conversions unusable outside mlir-opt (or other tools that link in test
library targets). Link these components to GPUCommon instead.

Differential Revision: https://reviews.llvm.org/D80739

72ede60b

[mlir] Use ValueRange instead of ArrayRef<Value> · fefe4366

Jacques Pienaar authored May 28, 2020

This allows constructing an operand adaptor from an existing op (useful for commonalizing verification, as I want to do in a follow-up).

I also add the ability to use member initializers for the generated adaptor constructors for convenience.

Differential Revision: https://reviews.llvm.org/D80667

fefe4366

[mlir][gpu][mlir-cuda-runner] Refactor ConvertKernelFuncToCubin to be generic. · 061fb8eb

Wen-Heng (Jack) Chung authored May 22, 2020

Make the ConvertKernelFuncToCubin pass generic:

- Rename to ConvertKernelFuncToBlob.
- Allow specifying triple, target chip, target features.
- LLVM backend initialization is supplied by a callback function.
- Lowering from an MLIR module to an LLVM module goes through another callback.
- Change mlir-cuda-runner to adopt the revised pass.
- Add new tests for lowering to ROCm HSA code object (HSACO).
- Tests for CUDA and ROCm are kept in separate directories.

Differential Revision: https://reviews.llvm.org/D80142

061fb8eb

[MLIR] Add `num_elements` to the shape dialect · fdaa391e

Frederik Gossen authored May 28, 2020

The operation `num_elements` determines the number of elements for a given
shape.
That is the product of its dimensions.

Differential Revision: https://reviews.llvm.org/D80281

fdaa391e
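
The semantics described for `num_elements` (the product of a shape's dimensions) can be sketched in plain C++. `numElements` is a hypothetical stand-in operating on a vector of static extents; treating a rank-0 shape as having one element (the empty product) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of `num_elements`: the product of all extents.
// An empty (rank-0) shape yields 1, the empty product (assumption).
int64_t numElements(const std::vector<int64_t> &shape) {
  int64_t product = 1;
  for (int64_t extent : shape) {
    assert(extent >= 0 && "this sketch only handles static extents");
    product *= extent;
  }
  return product;
}
```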

[MLIR] Add `index_to_size` and `size_to_index` to the shape dialect · 6594d545

Frederik Gossen authored May 28, 2020

Add the two conversion operations `index_to_size` and `size_to_index` to the
shape dialect.
This facilitates the conversion of index types between the shape and the
standard dialect.

Differential Revision: https://reviews.llvm.org/D80280

6594d545

[MLIR] Add TensorFromElementsOp to Standard ops. · c3098e4f

Alexander Belyaev authored May 28, 2020

Differential Revision: https://reviews.llvm.org/D80705

c3098e4f

[MLIR] Move `ConcatOp` to its lexicographic position · e73bb4fb

Frederik Gossen authored May 28, 2020

Purely cosmetic change.
The operation implementations in `Shape.cpp` are now in lexicographic order.

Differential Revision: https://reviews.llvm.org/D80277

e73bb4fb

Harden MLIR detection of misconfiguration when missing dialect registration · 213c6cdf

Mehdi Amini authored May 28, 2020

This change will catch errors where C++ ops are used without being
registered, either through creation with the OpBuilder or when trying to
cast to the C++ op.

Differential Revision: https://reviews.llvm.org/D80651

213c6cdf

May 27, 2020

[mlir][Linalg] Add missing library linkage for shared library builds. · 0a072b8a

MaheshRavishankar authored May 27, 2020

Differential Revision: https://reviews.llvm.org/D80664

0a072b8a

[mlir][shape] Use IndexElementsAttr in Shape dialect. · 25132b36

Sean Silva authored May 26, 2020

Summary:
Index is the proper type for storing shapes when constant folding, so
this fixes the previous code (which was using i64).

Differential Revision: https://reviews.llvm.org/D80600

25132b36

[mlir][core] Add IndexElementsAttr helpers. · 9546d8b1

Sean Silva authored May 26, 2020

Summary:
In a follow-up, I'll update the Shape dialect to use this instead of
I64ElementsAttr.

Differential Revision: https://reviews.llvm.org/D80601

9546d8b1

[mlir][Linalg] Fix build failure from D80188 · c6fa2efd

MaheshRavishankar authored May 27, 2020

Differential Revision: https://reviews.llvm.org/D80657

c6fa2efd

[mlir] [VectorOps] Add 'vector.flat_transpose' operation · c295a65d

aartbik authored May 27, 2020

Summary:
Provides a representation of the linearized LLVM intrinsic.
With tests and lowering implementation to LLVM IR dialect.
Prepares better lowering for 2-D vector.transpose.

Reviewers: nicolasvasilache, ftynse, reidtatge, bkramer, dcaballe

Reviewed By: ftynse, dcaballe

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80419

c295a65d

[mlir][spirv] Lower allocation/deallocations of workgroup memory. · 4d6f44f5

MaheshRavishankar authored May 27, 2020

The allocation of workgroup memory is lowered to a
spv.globalVariable. Only static size allocation with element type
being int or float is handled. The lowering does account for
element types that are not supported in the lowered spv.module based on
the extensions/capabilities and adjusts the number of elements to get
the same byte length.

Differential Revision: https://reviews.llvm.org/D80411

4d6f44f5
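
The element-count adjustment mentioned above is simple arithmetic, sketched below. `adjustedNumElements` is a hypothetical helper, not the code from this commit; rounding up when the byte length is not a multiple of the new element size is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: how many elements of width `dstBits` are needed to
// cover the same byte length as `numElements` elements of width `srcBits`.
int64_t adjustedNumElements(int64_t numElements, int64_t srcBits,
                            int64_t dstBits) {
  assert(numElements >= 0 && srcBits > 0 && dstBits > 0);
  assert(srcBits % 8 == 0 && dstBits % 8 == 0 && "byte-sized types only");
  int64_t totalBytes = numElements * (srcBits / 8);
  int64_t dstBytes = dstBits / 8;
  // Round up so the new buffer is never smaller (assumption of this sketch).
  return (totalBytes + dstBytes - 1) / dstBytes;
}
```

Under this model, 16 elements of an unsupported 8-bit type would become 4 elements of a 32-bit type, both 16 bytes long.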

[MLIR] [OpenMP] Add basic OpenMP parallel operation · 5ba874e4

David Truby authored May 05, 2020

Summary:
This includes a basic implementation for the OpenMP parallel
operation without a custom pretty-printer and parser.
The if, num_threads, private, shared, first_private, last_private,
proc_bind and default clauses are included in this implementation.

Currently the reduction clause is omitted as it is more complex and
requires analysis to see if we can share implementation with the loop
dialect. The allocate clause is also omitted.

A discussion about the design of this operation can be found here:
https://llvm.discourse.group/t/openmp-parallel-operation-design-issues/686

The current OpenMP Specification can be found here:
https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5.0.pdf

Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>

Reviewers: jdoerfert

Subscribers: mgorny, yaxunl, kristof.beyls, guansong, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79410

5ba874e4

[mlir] Add simple generator for return types · 31f40f60

Jacques Pienaar authored May 27, 2020

Take advantage of equality constraints to generate the type inference interface.
This is used for equality and trivially built types. The type inference method
is only generated when no type inference trait is specified already.

This reorders verification, which changes some test error messages.

Differential Revision: https://reviews.llvm.org/D80484

31f40f60

[mlir] SCF: provide function_ref builders for IfOp · cadb7ccf

Alex Zinenko authored May 25, 2020

Now that OpBuilder is available in `build` functions, it becomes possible to
populate the "then" and "else" regions directly when building the "if"
operation. This is desirable in more structured forms of builders, especially
when conditionals are mixed with loops. Provide new `build` APIs taking
callbacks for body constructors, similarly to scf::ForOp, and replace more
clunky edsc::BlockBuilder uses with these. The original APIs remain available
and go through the new implementation.

Differential Revision: https://reviews.llvm.org/D80527

cadb7ccf
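
The callback-based builder style described above can be sketched with a toy model. Everything here (`ToyBuilder`, `ToyIfOp`, `buildIf`) is hypothetical and only mirrors the shape of the idea; the real API lives on scf::IfOp and uses `llvm::function_ref`, for which `std::function` is a stand-in in this sketch.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for an IR builder: appends op names to the current block.
struct ToyBuilder {
  std::vector<std::string> *block;
  void create(const std::string &opName) {
    assert(block && "builder has no insertion block");
    block->push_back(opName);
  }
};

// Toy "if" operation with two regions, each a flat list of op names.
struct ToyIfOp {
  std::vector<std::string> thenRegion;
  std::vector<std::string> elseRegion;
};

using BodyBuilderFn = std::function<void(ToyBuilder &)>;

// The regions are populated while the op is being built, by invoking the
// callbacks with a builder positioned inside the corresponding region.
ToyIfOp buildIf(BodyBuilderFn thenBuilder, BodyBuilderFn elseBuilder = nullptr) {
  ToyIfOp op;
  ToyBuilder thenB{&op.thenRegion};
  thenBuilder(thenB);
  if (elseBuilder) {
    ToyBuilder elseB{&op.elseRegion};
    elseBuilder(elseB);
  }
  return op;
}
```

A caller passes one lambda per region, so nested conditionals and loops can be written as nested lambdas instead of manually managed insertion points.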

[mlir][linalg] Allow promotion to use callbacks for · 0ed2d4c7

MaheshRavishankar authored May 26, 2020

alloc/dealloc/copies.

Add options to LinalgPromotion to use callbacks for implementing the
allocation, deallocation of buffers used for the promoted subviews,
and to copy data into and from the original subviews to the allocated
buffers.
Also some misc. cleanup of the code.

Differential Revision: https://reviews.llvm.org/D80365

0ed2d4c7

[mlir][Linalg] Avoid using scf.parallel for non-parallel loops in Linalg ops. · 5759e473

MaheshRavishankar authored May 26, 2020

Modify the loop nest builder that generates scf.parallel loops so that
it does not generate scf.parallel loops for non-parallel iterator types in
Linalg operations. The existing implementation incorrectly generated
scf.parallel for all tiled loops. This is rectified by refactoring the
logic used while lowering to loops, which already accounted for this.

Differential Revision: https://reviews.llvm.org/D80188

5759e473
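
The selection rule described above can be sketched as a mapping from iterator types to loop kinds. The enums and `chooseLoopKinds` are hypothetical names for illustration; emitting a sequential scf.for for non-parallel iterators is an assumption, since the message only says scf.parallel is not generated for them.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of Linalg iterator types and the loops emitted for them.
enum class IteratorType { Parallel, Reduction, Window };
enum class LoopKind { ScfParallel, ScfFor };

// Only parallel iterators may become scf.parallel; everything else is
// assumed to become a sequential scf.for in this sketch.
std::vector<LoopKind> chooseLoopKinds(const std::vector<IteratorType> &iterators) {
  std::vector<LoopKind> kinds;
  kinds.reserve(iterators.size());
  for (IteratorType it : iterators)
    kinds.push_back(it == IteratorType::Parallel ? LoopKind::ScfParallel
                                                 : LoopKind::ScfFor);
  return kinds;
}
```

For a matmul-like op with iterators (parallel, parallel, reduction), this model yields two scf.parallel dimensions and one sequential loop.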

[mlir][shape] Add `shape.get_extent`. · cf42b704

Sean Silva authored May 21, 2020

Summary:
This op extracts an extent from a shape.

This also is the first op which constant folds to shape.const_size,
which revealed that shape.const_size needs a folder (ConstantLike ops
seem to always need folders for the constant folding infra to work).

Differential Revision: https://reviews.llvm.org/D80394

cf42b704

May 26, 2020

[mlir][Vector] Add more vector.contract -> outerproduct lowerings and fix... · ba10daa8

Nicolas Vasilache authored May 26, 2020

[mlir][Vector] Add more vector.contract -> outerproduct lowerings and fix vector.contract type inference.

This revision expands the types of vector contractions that can be lowered to vector.outerproduct.
All 8 permutation cases are supported.
The idiomatic manipulation of AffineMap written declaratively makes this straightforward.

In the process a bug with the vector.contract verifier was uncovered.
The vector shape verification part of the contract op is rewritten to use AffineMap composition.
One bug in the vector `ops.mlir` test is fixed and a new case not yet captured is added
to the vector `invalid.mlir` test.

Differential Revision: https://reviews.llvm.org/D80393

ba10daa8
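
The idea behind lowering a contraction to outer products can be shown on a plain matrix multiplication: the result is the sum, over the reduction index, of the outer product of a column of A with a row of B. The function below is a self-contained sketch on `std::array`, not the MLIR lowering; it covers only the plain matmul case, not the 8 permutations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// C = A * B computed as a sum of K outer products:
// C += column k of A (length M) times row k of B (length N).
template <std::size_t M, std::size_t K, std::size_t N>
std::array<std::array<int, N>, M>
matmulViaOuterProducts(const std::array<std::array<int, K>, M> &a,
                       const std::array<std::array<int, N>, K> &b) {
  std::array<std::array<int, N>, M> c{}; // zero-initialized accumulator
  for (std::size_t k = 0; k < K; ++k)   // one outer product per k
    for (std::size_t i = 0; i < M; ++i)
      for (std::size_t j = 0; j < N; ++j)
        c[i][j] += a[i][k] * b[k][j];
  return c;
}
```

Placing the reduction index in the outermost loop is what makes each iteration a rank-1 update, which is the form vector.outerproduct expresses.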

[MLIR] Helper class referencing MemRefType to unify runner implementations. · 222e0e58

Christian Sigg authored May 25, 2020

Summary:
Add DynamicMemRefType which can reference one of the statically ranked StridedMemRefType or an UnrankedMemRefType so that runner utils only need to be implemented once.

There is definitely room for more clean up and unification, but I will keep that for follow-ups.

Reviewers: nicolasvasilache

Reviewed By: nicolasvasilache

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80513

222e0e58
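
The motivation for a rank-erased reference type can be sketched as follows. `StridedView`, `DynamicView` and `elementAt` are hypothetical names and do not reproduce the layout of the MLIR runner utilities; the point is only that a utility written against the rank-erased view needs a single implementation for every static rank.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical statically ranked descriptor.
template <typename T, int N>
struct StridedView {
  T *data;
  int64_t offset;
  int64_t sizes[N];
  int64_t strides[N];
};

// Hypothetical rank-erased, non-owning reference to any StridedView<T, N>.
// The referenced view must outlive this object.
template <typename T>
struct DynamicView {
  template <int N>
  explicit DynamicView(const StridedView<T, N> &view)
      : rank(N), data(view.data), offset(view.offset), sizes(view.sizes),
        strides(view.strides) {}
  int64_t rank;
  T *data;
  int64_t offset;
  const int64_t *sizes;
  const int64_t *strides;
};

// Written once, works for every rank: strided element access.
template <typename T>
T &elementAt(const DynamicView<T> &view, const int64_t *indices) {
  int64_t linear = view.offset;
  for (int64_t d = 0; d < view.rank; ++d) {
    assert(indices[d] >= 0 && indices[d] < view.sizes[d]);
    linear += indices[d] * view.strides[d];
  }
  return view.data[linear];
}
```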

[mlir][Vector] Add vector contraction to outerproduct lowering · 9578a54f

Nicolas Vasilache authored May 26, 2020

This revision adds the additional lowering and exposes the patterns at a finer granularity for better programmatic reuse. The unit test makes use of the finer grained pattern for simpler checks.

As the ContractionOpLowering is exposed programmatically, cleanup opportunities appear and static class methods are turned into free functions with static visibility.

Differential Revision: https://reviews.llvm.org/D80375

9578a54f

May 25, 2020

Make mlir::Value's bool conversion operator explicit · a9b5edc5

Benjamin Kramer authored May 24, 2020

This still allows `if (value)` while requiring an explicit cast when not
in a boolean context. This means things like `std::set<Value>` will no
longer compile.

Differential Revision: https://reviews.llvm.org/D80497

a9b5edc5
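
The effect of an explicit bool conversion can be checked at compile time. `Handle` below is a hypothetical stand-in for a handle type such as mlir::Value, not the real class. Ordered containers are affected presumably because, with an implicit conversion, `a < b` could silently compare the two converted bools.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical handle type wrapping a possibly-null pointer.
class Handle {
public:
  explicit Handle(const void *impl = nullptr) : impl(impl) {}
  // `explicit` still permits contextual conversion (if, !, &&) but blocks
  // implicit conversion to bool and, through bool, to integers.
  explicit operator bool() const { return impl != nullptr; }

private:
  const void *impl;
};

static_assert(!std::is_convertible<Handle, bool>::value,
              "no implicit conversion to bool");
static_assert(std::is_constructible<bool, Handle>::value,
              "explicit conversion is still available");

bool isValid(Handle h) {
  if (h) // contextual conversion: allowed
    return true;
  return false;
}
```

Outside a boolean context, callers would write `static_cast<bool>(h)` to make the conversion visible.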

[mlir] Expand operand adapter to take attributes · 4b8632e1

Jacques Pienaar authored May 24, 2020

* Enables using with more variadic sized operands;
* Generate convenience accessors for attributes;
  - The accessors are named the same as in ODS and return the attribute
    type (not the convenience type); derived attributes are not included.

This is the first step to changing the adapter to support verifying argument
constraints before the op is even created. To keep this change smaller, it does
not change the name of the adaptor, nor does it require it except for ops with variadic operands.

Considered creating a separate adapter but decided against that, given that operands also require attributes in general (and definitely for verification of operands and attributes).

Differential Revision: https://reviews.llvm.org/D80420

4b8632e1

May 21, 2020

[mlir][spirv] Enable composite instructions for cooperative matrix type. · 0712eac7

Thomas Raoux authored May 21, 2020

Enable insert/extract/construct composite ops as well as access chain for
cooperative matrix. ConstantComposite requires more changes and will be done in
a separate patch. Also fix the getNumElements function for coopMatrix per
feedback from Jeff Bolz. The number of elements is implementation dependent so
it cannot be known at compile time.

Differential Revision: https://reviews.llvm.org/D80321

0712eac7

[mlir][spirv] Add remaining cooperative matrix instructions · 15389cdc

Thomas Raoux authored May 21, 2020

Adds cooperative matrix support for arithmetic and cast
instructions. It also adds cooperative matrix store, muladd and matrixlength
instructions which are part of the extension.

Differential Revision: https://reviews.llvm.org/D80181

15389cdc

[mlir][rocdl] Exposing buffer load/store intrinsic · 9c53ac08

jerryyin authored May 19, 2020

Summary:
* Updated ROCDLOps tablegen
* Added parsing and printing function for new intrinsic
* Added unit tests

Reviewers: ftynse

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80233

9c53ac08

[mlir][gpu] Refactor ConvertGpuLaunchFuncToCudaCalls pass. · 2cbbc266

Wen-Heng (Jack) Chung authored May 18, 2020

Due to similar APIs between CUDA and ROCm (HIP),
ConvertGpuLaunchFuncToCudaCalls pass could be used on both platforms with some
refactoring.

In this commit:

- Migrate ConvertLaunchFuncToCudaCalls from GPUToCUDA to GPUCommon, and rename.
- Rename runtime wrapper APIs to be platform-neutral.
- Let GPU binary annotation attribute be specifiable as a PassOption.
- Naming changes within the implementation and tests.

Subsequent patches would introduce ROCm-specific tests and runtime wrapper
APIs.

Differential Revision: https://reviews.llvm.org/D80167

2cbbc266

[mlir] NFC - Add a builder to vector.transpose · 941005f5

Nicolas Vasilache authored May 21, 2020

Summary: Also expose some more vector ops to EDSCs.

Differential Revision: https://reviews.llvm.org/D80333

941005f5

Revert "[mlir][gpu] Refactor ConvertGpuLaunchFuncToCudaCalls pass." · 5c3ebd77

Mehdi Amini authored May 21, 2020

This reverts commit cdb6f05e.

The build is broken with:

You have called ADD_LIBRARY for library obj.MLIRGPUtoCUDATransforms without any source files. This typically indicates a problem with your CMakeLists.txt file

5c3ebd77

May 20, 2020

[mlir] NFC - Appease GCC 5 again.. · 3393cc4c
Nicolas Vasilache authored May 20, 2020

3393cc4c

[mlir][gpu] Refactor functions for workgroup and private buffer attributions. · ad398164

Wen-Heng (Jack) Chung authored May 06, 2020

Summary:

Consolidate interfaces adding workgroup and private buffer attributions in GPU
dialect.

Note all private buffer attributions must follow workgroup buffer attributions.

Reviewers: herhut

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits

Tags: #llvm, #mlir

Differential Revision: https://reviews.llvm.org/D79508

ad398164

[mlir][gpu] Refactor ConvertGpuLaunchFuncToCudaCalls pass. · cdb6f05e

Wen-Heng (Jack) Chung authored May 18, 2020

Due to similar APIs between CUDA and ROCm (HIP),
ConvertGpuLaunchFuncToCudaCalls pass could be used on both platforms with some
refactoring.

In this commit:

- Migrate ConvertLaunchFuncToCudaCalls from GPUToCUDA to GPUCommon, and rename.
- Rename runtime wrapper APIs to be platform-neutral.
- Let GPU binary annotation attribute be specifiable as a PassOption.
- Naming changes within the implementation and tests.

Subsequent patches would introduce ROCm-specific tests and runtime wrapper
APIs.

Differential Revision: https://reviews.llvm.org/D80167

cdb6f05e

[mlir] NFC - Appease GCC 5 again.. · ebf14d9b
Nicolas Vasilache authored May 20, 2020

ebf14d9b

[mlir][spirv] Adapt subview legalization to the updated op semantics. · 0e88eb5c

MaheshRavishankar authored May 20, 2020

The subview semantics changed recently to allow for more natural
representation of constant offsets and strides. The legalization of
subview op for lowering to SPIR-V needs to account for this.
Also change the linearization to use the strides from the affine map
of a memref.

Differential Revision: https://reviews.llvm.org/D80270

0e88eb5c

[mlir][Linalg] Add producer-consumer fusion when producer is a ConstantOp · 071358e0

MaheshRavishankar authored May 20, 2020

and Consumer is a GenericOp.

Differential Revision: https://reviews.llvm.org/D79838

071358e0

[mlir][Vector] Add option to fully unroll for VectorTransfer to SCF lowering · 7c3c5b11

Nicolas Vasilache authored May 20, 2020

Summary:
Previously, the only supported partial lowering from vector transfers to SCF
went through loops. This requires a dedicated allocation and extra memory
roundtrips because LLVM aggregates cannot be indexed dynamically (for more
details see the [deep-dive](https://mlir.llvm.org/docs/Dialects/Vector/#deeperdive)).

This revision allows specifying full unrolling which removes this additional roundtrip.
This should be used carefully though because full unrolling will spill, negating the
benefits of removing the interim alloc in the first place.

Proper heuristics are left for a later time.

Differential Revision: https://reviews.llvm.org/D80100

7c3c5b11

[mlir] ensureRegionTerminator: take OpBuilder · 3ccf4a5b

Alex Zinenko authored May 20, 2020

The SingleBlockImplicitTerminator op trait provides a function
`ensureRegionTerminator` that injects an appropriate terminator into the block
if necessary, which is used during operation construction and parsing.
Currently, this function directly modifies the IR using low-level APIs on
Operation and Block. If this function is called from a conversion pattern,
these manipulations are not reflected in the ConversionPatternRewriter and thus
cannot be undone or, worse, lead to tricky memory errors and malformed IR.
Change `ensureRegionTerminator` to take an instance of `OpBuilder` instead of
`Builder`, and use it to construct the block and the terminator when required.
Maintain overloads taking an instance of `Builder` and creating a simple
`OpBuilder` to use in parsers, which don't have an `OpBuilder` and cannot
interact with the dialect conversion mechanism. This change was one of the
reasons to make `<OpTy>::build` accept an `OpBuilder`.

Differential Revision: https://reviews.llvm.org/D80138

3ccf4a5b