- May 07, 2021
-
MaheshRavishankar authored
The pattern to convert subtensor ops to their rank-reduced versions (by dropping unit-dims in the result) can also produce a zero-rank tensor. Handle that case. This also fixes an OOB access bug in the existing pattern for such cases. Differential Revision: https://reviews.llvm.org/D101949
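For illustration, a hedged sketch (hypothetical operands and shapes) of a subtensor whose rank-reduced result is zero-rank, i.e. every result dimension is a unit extent and gets dropped:
```
// All unit dimensions dropped from the result: a 0-d tensor remains.
%elem = subtensor %t[%i, %j] [1, 1] [1, 1] : tensor<8x8xf32> to tensor<f32>
```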
-
Rob Suderman authored
Nearly complete alignment to spec v0.22:
- Adds Div op
- Concat inputs now variadic
- Removes Placeholder op

Note: TF side PR https://github.com/tensorflow/tensorflow/pull/48921 deletes Concat legalizations to avoid breaking TensorFlow CI. This must be merged only after the TF PR has merged.

Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D101958
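A rough sketch, in generic op syntax with made-up operands and shapes, of what the new Div op and a variadic Concat might look like:
```
// Elementwise integer divide added in this change.
%q = "tosa.div"(%a, %b) : (tensor<4xi32>, tensor<4xi32>) -> tensor<4xi32>
// Concat now takes a variadic list of inputs along the given axis.
%c = "tosa.concat"(%x, %y, %z) {axis = 0 : i64}
       : (tensor<2x3xf32>, tensor<2x3xf32>, tensor<2x3xf32>) -> tensor<6x3xf32>
```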
-
Amy Zhuang authored
Reviewed By: vinayaka-polymage Differential Revision: https://reviews.llvm.org/D101794
-
- May 06, 2021
-
Lei Zhang authored
Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D102009
-
River Riddle authored
It is currently stored in the high bits, which is disallowed on certain platforms (e.g. android). This revision switches the representation to use the low bits instead, fixing crashes/breakages on those platforms. Differential Revision: https://reviews.llvm.org/D101969
-
thomasraoux authored
-
thomasraoux authored
This exposes a lambda instead of just a boolean to control unit-dimension folding, giving the user more control to pick a good heuristic. Folding reshapes helps fusion opportunities but may generate sub-optimal generic ops. Differential Revision: https://reviews.llvm.org/D101917
-
Denys Shabalin authored
Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D101998
-
thomasraoux authored
-
thomasraoux authored
Differential Revision: https://reviews.llvm.org/D101955
-
Christian Sigg authored
Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D101757
-
Navdeep Kumar authored
Add warp synchronous matrix-multiply accumulate ops in the GPU and NVVM dialects.

Ops added to the GPU dialect:
1. subgroup_mma_load_matrix
2. subgroup_mma_store_matrix
3. subgroup_mma_compute

Ops added to the NVVM dialect:
1. wmma.m16n16k16.load.[a,b,c].[f16,f32].row.stride
2. wmma.m16n16k16.store.d.[f16,f32].row.stride
3. wmma.m16n16k16.mma.row.row.[f16,f32].[f16,f32]

Reviewed By: bondhugula, ftynse, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D95330
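A minimal sketch of how the new GPU-dialect ops compose into a 16x16x16 tile computation; operand names, types, and attributes here are illustrative and may differ from the exact syntax landed in the patch:
```
// Load the A, B, and C fragments cooperatively across the subgroup.
%A = gpu.subgroup_mma_load_matrix %srcA[%c0, %c0] {leadDimension = 16 : index}
       : memref<16x16xf16> -> !gpu.mma_matrix<16x16xf16, "AOp">
%B = gpu.subgroup_mma_load_matrix %srcB[%c0, %c0] {leadDimension = 16 : index}
       : memref<16x16xf16> -> !gpu.mma_matrix<16x16xf16, "BOp">
%C = gpu.subgroup_mma_load_matrix %srcC[%c0, %c0] {leadDimension = 16 : index}
       : memref<16x16xf16> -> !gpu.mma_matrix<16x16xf16, "COp">
// D = A * B + C, performed warp-synchronously.
%D = gpu.subgroup_mma_compute %A, %B, %C
       : !gpu.mma_matrix<16x16xf16, "AOp">, !gpu.mma_matrix<16x16xf16, "BOp">
       -> !gpu.mma_matrix<16x16xf16, "COp">
// Store the accumulator fragment back to memory.
gpu.subgroup_mma_store_matrix %D, %dst[%c0, %c0] {leadDimension = 16 : index}
       : !gpu.mma_matrix<16x16xf16, "COp">, memref<16x16xf16>
```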
-
Emilio Cota authored
Instead of just checking that we emit something. Differential Revision: https://reviews.llvm.org/D101940
-
MaheshRavishankar authored
Differential Revision: https://reviews.llvm.org/D101956
-
MaheshRavishankar authored
Fixes a minor bug which led to the element type of the output being modified when folding reshapes with a generic op. Differential Revision: https://reviews.llvm.org/D101942
-
- May 05, 2021
-
Emilio Cota authored
This approximation matches the one in Eigen.

```
name                        old cpu/op   new cpu/op   delta
BM_mlir_Expm1_f32/10        90.9ns ± 4%  52.2ns ± 4%  -42.60%  (p=0.000 n=74+87)
BM_mlir_Expm1_f32/100        837ns ± 3%   231ns ± 4%  -72.43%  (p=0.000 n=79+69)
BM_mlir_Expm1_f32/1k        8.43µs ± 3%  1.58µs ± 5%  -81.30%  (p=0.000 n=77+83)
BM_mlir_Expm1_f32/10k       83.8µs ± 3%  15.4µs ± 5%  -81.65%  (p=0.000 n=83+69)
BM_eigen_s_Expm1_f32/10     68.8ns ±17%  72.5ns ±14%   +5.40%  (p=0.000 n=118+115)
BM_eigen_s_Expm1_f32/100     694ns ±11%   717ns ± 2%   +3.34%  (p=0.000 n=120+75)
BM_eigen_s_Expm1_f32/1k     7.69µs ± 2%  7.97µs ±11%   +3.56%  (p=0.000 n=95+117)
BM_eigen_s_Expm1_f32/10k    88.0µs ± 1%  89.3µs ± 6%   +1.45%  (p=0.000 n=74+106)
BM_eigen_v_Expm1_f32/10     44.3ns ± 6%  45.0ns ± 8%   +1.45%  (p=0.018 n=81+111)
BM_eigen_v_Expm1_f32/100     351ns ± 1%   360ns ± 9%   +2.58%  (p=0.000 n=73+99)
BM_eigen_v_Expm1_f32/1k     3.31µs ± 1%  3.42µs ± 9%   +3.37%  (p=0.000 n=71+100)
BM_eigen_v_Expm1_f32/10k    33.7µs ± 8%  34.1µs ± 9%   +1.04%  (p=0.007 n=99+98)
```

Reviewed By: ezhulenev
Differential Revision: https://reviews.llvm.org/D101852
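Background on why a dedicated expm1 approximation is worth having (standard numerics, not part of this change): for small $x$,

$$e^x - 1 = x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \approx x,$$

so computing exp(x) first and then subtracting 1 loses most of the significant bits of x to rounding, whereas a fused polynomial approximation keeps the leading term accurate.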
-
Rob Suderman authored
Implements support for undilated depthwise convolution using the existing depthwise convolution operation. Once convolutions migrate to yaml-defined versions we can rewrite for a cleaner implementation. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D101579
-
Philipp Krones authored
This untangles the MCContext and the MCObjectFileInfo. There is a circular dependency between MCContext and MCObjectFileInfo. Currently this dependency also exists during construction: you can't construct a MOFI without an MCContext, and that MCContext first has to be constructed with a dummy version of the MOFI. This removes the dependency during construction. In a perfect world, MCObjectFileInfo wouldn't depend on MCContext at all, but would only be stored in the MCContext, like other MC information. This is future work. This also shifts/adds more information to the MCContext, making it more available to the different targets. Namely:
- TargetTriple
- ObjectFileType
- SubtargetInfo

Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101462
-
Javier Setoain authored
These instructions map to SVE-specific intrinsics that accept a predicate operand to support control flow in vector code. Differential Revision: https://reviews.llvm.org/D100982
-
Sergei Grechanik authored
This patch adds support for vectorizing loops with 'iter_args' implementing known reductions along the vector dimension. Compared to the non-vector-dimension case, two additional things are done during vectorization of such loops:
- The resulting vector returned from the loop is reduced to a scalar using `vector.reduce`.
- In some cases a mask is applied to the vector yielded at the end of the loop to prevent garbage values from being written to the accumulator.

Vectorization of reduction loops is disabled by default. To enable it, a map from loops to arrays of reduction descriptors should be explicitly passed to `vectorizeAffineLoops`, or `vectorize-reductions=true` should be passed to the SuperVectorize pass.

Current limitations:
- Loops with a non-unit step size are not supported.
- n-D vectorization with n > 1 is not supported.

Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D100694
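An illustrative reduction loop of the kind this enables vectorizing, with the accumulator carried through `iter_args` (hypothetical buffer and bounds):
```
%zero = constant 0.0 : f32
// Sum the elements of %buf; %acc is the scalar accumulator carried by the loop.
%sum = affine.for %i = 0 to 256 iter_args(%acc = %zero) -> (f32) {
  %x = affine.load %buf[%i] : memref<256xf32>
  %next = addf %acc, %x : f32
  affine.yield %next : f32
}
```
After vectorization, the loop carries a vector accumulator instead, and the final vector is reduced to a scalar as described above.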
-
Tobias Gysi authored
The old index op handling let the new index operations point back to the producer block. As a result, after fusion some index operations in the fused block had back references to the old producer block resulting in illegal IR. The patch now relies on a block and value mapping to avoid such back references. Differential Revision: https://reviews.llvm.org/D101887
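For reference, a hypothetical generic op using index operations; after fusion these must be remapped into the fused block rather than left pointing back at the old producer block:
```
%res = linalg.generic {
    indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>],
    iterator_types = ["parallel", "parallel"]}
    outs(%init : tensor<4x8xi64>) {
  ^bb0(%out: i64):
    // Index operations yield the current iteration coordinates.
    %i = linalg.index 0 : index
    %j = linalg.index 1 : index
    %sum = addi %i, %j : index
    %cast = index_cast %sum : index to i64
    linalg.yield %cast : i64
} -> tensor<4x8xi64>
```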
-
Uday Bondhugula authored
Using a free function verify(<Op>) is error prone. Rename it. Differential Revision: https://reviews.llvm.org/D101886
-
Alexander Belyaev authored
Differential Revision: https://reviews.llvm.org/D101861
-
Javier Setoain authored
While we figure out how to best add Standard support for scalable vectors, these instructions provide a workaround for basic arithmetic between scalable vectors. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D100837
-
William S. Moses authored
Differential Revision: https://reviews.llvm.org/D101798
-
Aart Bik authored
This revision migrates more code from Linalg into the new permanent home of SparseTensor. It replaces the test passes with proper compiler passes. NOTE: the actual removal of the last glue and clutter in Linalg will follow Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D101811
-
- May 04, 2021
-
River Riddle authored
We weren't properly visiting region successors when the terminator wasn't return-like, which could create incorrect results in the analysis. This revision ensures that we properly visit region successors, to avoid optimistically assuming a value is constant when it isn't. Differential Revision: https://reviews.llvm.org/D101783
-
Rob Suderman authored
All linalg.init operations must be fed into a linalg operation before subtensor. The inserted linalg.fill guarantees it executes correctly. Reviewed By: mravishankar Differential Revision: https://reviews.llvm.org/D101848
-
William S. Moses authored
See: https://reviews.llvm.org/D101710
-
William S. Moses authored
Differential Revision: https://reviews.llvm.org/D101801
-
William S. Moses authored
Differential Revision: https://reviews.llvm.org/D101710
-
Tobias Gysi authored
Ensure the index operations are lowered on all linalg loop lowering paths. Differential Revision: https://reviews.llvm.org/D101827
-
Adrian Kuegel authored
Differential Revision: https://reviews.llvm.org/D96776
-
Matthias Springer authored
TransferReadOps that are a scalar read + broadcast are handled by TransferReadToVectorLoadLowering. Differential Revision: https://reviews.llvm.org/D101808
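Such a scalar-read-plus-broadcast is a transfer_read whose permutation map has a constant-zero result; a hedged sketch with made-up operands:
```
// The single vector dimension maps to constant 0: one scalar is read and broadcast.
%v = vector.transfer_read %mem[%i, %j], %pad
       {permutation_map = affine_map<(d0, d1) -> (0)>}
       : memref<?x?xf32>, vector<4xf32>
```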
-
natashaknk authored
Lowers the equal and arithmetic_right_shift elementwise ops to the linalg dialect using linalg.generic. Reviewed By: rsuderman Differential Revision: https://reviews.llvm.org/D101804
-
Eugene Zhulenev authored
This fixes a performance regression in vec-mat vectorization Reviewed By: asaadaldien Differential Revision: https://reviews.llvm.org/D101795
-
Emilio Cota authored
This approximation matches the one in Eigen.

```
name                        old cpu/op   new cpu/op   delta
BM_mlir_Log1p_f32/10        83.2ns ± 7%  34.8ns ± 5%  -58.19%  (p=0.000 n=84+71)
BM_mlir_Log1p_f32/100        664ns ± 4%   129ns ± 4%  -80.57%  (p=0.000 n=82+82)
BM_mlir_Log1p_f32/1k        6.75µs ± 4%  0.81µs ± 3%  -88.07%  (p=0.000 n=88+79)
BM_mlir_Log1p_f32/10k       76.5µs ± 3%   7.8µs ± 4%  -89.84%  (p=0.000 n=80+80)
BM_eigen_s_Log1p_f32/10     70.1ns ±14%  72.6ns ±14%   +3.49%  (p=0.000 n=116+112)
BM_eigen_s_Log1p_f32/100     706ns ± 9%   717ns ± 3%   +1.60%  (p=0.018 n=117+80)
BM_eigen_s_Log1p_f32/1k     8.26µs ± 1%  8.26µs ± 1%      ~    (p=0.567 n=84+86)
BM_eigen_s_Log1p_f32/10k    92.1µs ± 5%  92.6µs ± 6%   +0.60%  (p=0.047 n=115+115)
BM_eigen_v_Log1p_f32/10     31.8ns ±24%  34.9ns ±17%   +9.72%  (p=0.000 n=98+96)
BM_eigen_v_Log1p_f32/100     169ns ±10%   177ns ± 5%   +4.66%  (p=0.000 n=119+81)
BM_eigen_v_Log1p_f32/1k     1.42µs ± 4%  1.46µs ± 8%   +2.70%  (p=0.000 n=93+113)
BM_eigen_v_Log1p_f32/10k    14.4µs ± 5%  14.9µs ± 8%   +3.61%  (p=0.000 n=115+110)
```

Reviewed By: ezhulenev, ftynse
Differential Revision: https://reviews.llvm.org/D101765
-
- May 03, 2021
-
MaheshRavishankar authored
Given the source and destination shapes, if they are static, or if the expanded/collapsed dimensions are unit-extent, it is possible to compute the reassociation maps that can be used to reshape one type into another. Add a utility method to return the reassociation maps when possible. This utility function can be used to fuse a sequence of reshape ops, given the type of the source of the producer and the final result type. This pattern supersedes a more constrained folding pattern added to the DropUnitDims pass. Differential Revision: https://reviews.llvm.org/D101343
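As an illustration (a hedged sketch; the reshape-op assembly of this era may differ), collapsing a unit dimension into its neighbor uses a reassociation map that groups the two source dimensions into one result dimension:
```
// Dims d0 (unit) and d1 collapse into the first result dim; d2 maps to the second.
%r = linalg.tensor_reshape %t [affine_map<(d0, d1, d2) -> (d0, d1)>,
                               affine_map<(d0, d1, d2) -> (d2)>]
       : tensor<1x4x5xf32> into tensor<4x5xf32>
```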
-
Aart Bik authored
The test passes either way, but this is the full name of the dialect. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D101774
-
MaheshRavishankar authored
Convert subtensor and subtensor_insert operations to use their rank-reduced versions to drop unit dimensions. Differential Revision: https://reviews.llvm.org/D101495
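A before/after sketch of the rank reduction with hypothetical shapes and operands:
```
// Before: the result type keeps the leading unit dimension.
%0 = subtensor %t[4, 0, 0] [1, 4, 4] [1, 1, 1]
       : tensor<8x4x4xf32> to tensor<1x4x4xf32>
// After: the rank-reduced result drops the unit dimension.
%1 = subtensor %t[4, 0, 0] [1, 4, 4] [1, 1, 1]
       : tensor<8x4x4xf32> to tensor<4x4xf32>
```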
-