  1. Mar 19, 2021
    • [mlir] Support use-def cycles in graph regions during regionDCE · f178c13f
      Andrew Young authored
      When deleting operations in DCE, the algorithm used a post-order walk of
      the IR to ensure that value uses were erased before value defs. Graph
      regions do not have the same structural invariants as SSA CFG regions,
      and this post-order walk could delete value defs before their uses. This
      problem is guaranteed to occur when there is a cycle in the use-def graph.
      
      This change stops DCE from visiting the operations and blocks in any
      meaningful order.  Instead, we rely on explicitly dropping all uses of a
      value before deleting it.
      
      Reviewed By: mehdi_amini, rriddle
      
      Differential Revision: https://reviews.llvm.org/D98919
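The fix described above can be modeled outside MLIR: rather than erasing ops in a carefully chosen order, first drop every use held by each dead op, then erase in any order. A minimal Python sketch (the `Op` class and its methods are hypothetical stand-ins, not MLIR API):

```python
class Op:
    """Toy operation: tracks operands (ops it uses) and users (ops using it)."""
    def __init__(self, name):
        self.name = name
        self.operands = []   # ops whose results this op uses
        self.users = set()   # ops that use this op's result

    def add_operand(self, op):
        self.operands.append(op)
        op.users.add(self)

    def drop_all_uses(self):
        # Detach this op from everything it uses, so defs can be erased
        # in any order even if the use-def graph contains cycles.
        for op in self.operands:
            op.users.discard(self)
        self.operands.clear()

def erase_dead_ops(dead_ops):
    # Phase 1: drop every use first ...
    for op in dead_ops:
        op.drop_all_uses()
    # Phase 2: ... then erasing is safe in any order, cycles included.
    for op in dead_ops:
        assert not op.users, f"{op.name} still has users"
    return [op.name for op in dead_ops]

# A use-def cycle, as can occur in a graph region: a uses b, b uses a.
a, b = Op("a"), Op("b")
a.add_operand(b)
b.add_operand(a)
erased = erase_dead_ops([a, b])
```

A post-order walk of this two-node cycle has no valid starting point, which is why the commit switches to the use-dropping scheme sketched here.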
  2. Mar 15, 2021
    • [MLIR] Create memref dialect and move dialect-specific ops from std. · e2310704
      Julian Gross authored
      Create the memref dialect and move dialect-specific ops
      from std dialect to this dialect.
      
      Moved ops:
      AllocOp -> MemRef_AllocOp
      AllocaOp -> MemRef_AllocaOp
      AssumeAlignmentOp -> MemRef_AssumeAlignmentOp
      DeallocOp -> MemRef_DeallocOp
      DimOp -> MemRef_DimOp
      MemRefCastOp -> MemRef_CastOp
      MemRefReinterpretCastOp -> MemRef_ReinterpretCastOp
      GetGlobalMemRefOp -> MemRef_GetGlobalOp
      GlobalMemRefOp -> MemRef_GlobalOp
      LoadOp -> MemRef_LoadOp
      PrefetchOp -> MemRef_PrefetchOp
      ReshapeOp -> MemRef_ReshapeOp
      StoreOp -> MemRef_StoreOp
      SubViewOp -> MemRef_SubViewOp
      TransposeOp -> MemRef_TransposeOp
      TensorLoadOp -> MemRef_TensorLoadOp
      TensorStoreOp -> MemRef_TensorStoreOp
      TensorToMemRefOp -> MemRef_BufferCastOp
      ViewOp -> MemRef_ViewOp
      
      The roadmap to split the memref dialect from std is discussed here:
      https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667
      
      Differential Revision: https://reviews.llvm.org/D98041
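The rename table above is mechanical, which suggests script-assisted migration of textual IR. As an illustration only (the real change lives in C++/ODS, and the `std.`-prefixed mnemonic spellings used below are an assumption), a sketch of such a rewrite over a subset of the table:

```python
import re

# Subset of the std -> memref renames listed in the commit, keyed by a
# textual mnemonic (assumed spelling, for illustration only).
RENAMES = {
    "std.alloc": "memref.alloc",
    "std.dealloc": "memref.dealloc",
    "std.load": "memref.load",
    "std.store": "memref.store",
    "std.subview": "memref.subview",
    "std.view": "memref.view",
}

def migrate(ir_text):
    # One alternation over all keys, so each mnemonic is replaced wholesale.
    pattern = re.compile("|".join(re.escape(k) for k in RENAMES))
    return pattern.sub(lambda m: RENAMES[m.group(0)], ir_text)

out = migrate("%0 = std.alloc() : memref<4xf32>\nstd.dealloc %0 : memref<4xf32>")
```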
    • Revert "[Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants." · 40d8e4d3
      Alex Zinenko authored
      This reverts commit b5d9a3c9.
      
      The commit introduced a memory error in canonicalization/operation
      walking that is exposed when compiled with ASAN. It leads to crashes in
      some "release" configurations.
    • [m_Constant] Check #operands/results before hasTrait() · 91a6ad5a
      Chris Lattner authored
      We know that all ConstantLike operations have one result and no operands,
      so check this first before doing the trait check.  This change speeds up
      Canonicalize on a CIRCT testcase by ~5%.
      
      Differential Revision: https://reviews.llvm.org/D98615
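The optimization above is the general "cheapest predicate first" pattern: reject on inexpensive structural facts before paying for the costlier trait lookup. A toy Python model (dict-based ops, not MLIR's matcher API):

```python
def matches_constant_like(op):
    """Match ops that could be ConstantLike: check the cheap structural
    facts (exactly one result, zero operands) before the trait lookup,
    which in MLIR involves a walk of the op's trait list."""
    if len(op["results"]) != 1 or op["operands"]:
        return False  # rejected without touching the (slower) trait check
    return "ConstantLike" in op["traits"]

ops = [
    {"results": ["r"], "operands": [], "traits": {"ConstantLike"}},
    {"results": [], "operands": [], "traits": set()},        # no results
    {"results": ["r"], "operands": ["x"], "traits": set()},  # has operands
]
hits = [matches_constant_like(op) for op in ops]
```

Since most ops in a typical module are not constants, the cheap checks filter out the bulk of candidates, which is where the reported ~5% canonicalization speedup plausibly comes from.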
    • [Canonicalizer] Process regions top-down instead of bottom up & reuse existing constants. · b5d9a3c9
      Chris Lattner authored
      Two changes:
       1) Change the canonicalizer to walk the function in top-down order instead of
          bottom-up order.  This composes well with the "top down" nature of constant
          folding and simplification, reducing iterations and re-evaluation of ops in
          simple cases.
       2) Explicitly enter existing constants into the OperationFolder table before
          canonicalizing.  Previously we would "constant fold" them and rematerialize
          them, wastefully recreating a bunch of constants, which led to pointless
          memory traffic.
      
      Both changes together provide a 33% speedup for canonicalize on some mid-size
      CIRCT examples.
      
      One artifact of this change is that the constants generated in normal pattern
      application get inserted at the top of the function as the patterns are applied.
      Because of this, we get "inverted" constants more often, which is an aesthetic
      change to the IR but does permute some testcases.
      
      Differential Revision: https://reviews.llvm.org/D98609
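Why the top-down order composes with constant folding can be shown with a toy model (plain Python, not the MLIR pass): when defs are visited before uses, one sweep folds a whole chain, whereas a bottom-up sweep encounters each add before its operands are known.

```python
def fold_pass(ops, order):
    """One sweep over `ops` in the given visit order, folding `add` ops
    whose operands are already known constants. Returns the folded map."""
    known = {}  # value name -> constant value
    for i in order:
        name, kind, args = ops[i]
        if kind == "const":
            known[name] = args[0]
        elif kind == "add" and all(a in known for a in args):
            known[name] = sum(known[a] for a in args)
    return known

# %c1 = 1; %c2 = 2; %s1 = %c1 + %c2; %s2 = %s1 + %c1
ops = [("c1", "const", (1,)), ("c2", "const", (2,)),
       ("s1", "add", ("c1", "c2")), ("s2", "add", ("s1", "c1"))]

top_down = fold_pass(ops, range(4))             # folds everything in one sweep
bottom_up = fold_pass(ops, reversed(range(4)))  # sees adds before their consts
```

The bottom-up order would need further iterations to reach the same fixed point, which matches the commit's claim of reduced iterations and re-evaluation.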
  5. Mar 03, 2021
    • [mlir][IR] Refactor the internal implementation of Value · 3dfa8614
      River Riddle authored
      The current implementation of Value involves a pointer-int pair with several different kinds of owners, i.e. BlockArgumentImpl*, Operation*, TrailingOpResult*. This design arose from the desire to save memory overhead for operations that have a very small number of results (generally 0-2). There are, unfortunately, many problematic aspects of the current implementation that make Values difficult to work with or just inefficient.
      
      Operation result types are stored as a separate array on the Operation. This is very inefficient for many reasons: we use TupleType for multiple results, which can lead to huge amounts of memory usage if multi-result operations change types frequently (they do). It also means that simple methods like Value::getType/Value::setType now require complex logic to get to the desired type.
      
      Value only has one pointer bit free, severely limiting the ability to use it in things like PointerUnion/PointerIntPair. Given that we store the kind of a Value along with the "owner" pointer, we only leave one bit free for users of Value. This creates situations where we end up nesting PointerUnions to be able to use Value in one.
      
      As noted above, most of the methods in Value need to branch on at least 3 different cases, which is inefficient, possibly error-prone, and verbose. The current storage of results also creates problems for utilities like ValueRange/TypeRange, which want to efficiently store base pointers to ranges (of which Operation* isn't really useful as one).
      
      This revision greatly simplifies the implementation of Value by the introduction of a new ValueImpl class. This class contains all of the state shared between all of the various derived value classes; i.e. the use list, the type, and the kind. This shared implementation class provides several large benefits:
      
      * Most of the methods on Value are now branchless, and often one-liners.
      
      * The "kind" of the value is now stored in ValueImpl instead of Value
      This frees up all of Value's pointer bits, allowing for users to take full advantage of PointerUnion/PointerIntPair/etc. It also allows for storing more operation results as "inline", 6 now instead of 2, freeing up 1 word per new inline result.
      
      * Operation result types are now stored in the result, instead of a side array
      This drops the size of zero-result operations by 1 word. It also removes the memory-crushing use of TupleType for operation results (which could lead to hundreds of megabytes of "dead" TupleTypes in the context). This also allowed restructuring ValueRange, making it simpler and one word smaller.
      
      This revision does come with two conceptual downsides:
      * Operation::getResultTypes no longer returns an ArrayRef<Type>
      This conceptually makes some usages slower, as the iterator increment is slightly more complex.
      * OpResult::getOwner is slightly more expensive, as it now requires a little bit of arithmetic
      
      From profiling, neither of the conceptual downsides has resulted in any perceivable hit to performance. Given the advantages of the new design, most compiles are slightly faster.
      
      Differential Revision: https://reviews.llvm.org/D97804
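The core of the refactor, a shared impl class so the handle stops branching on its owner kind, can be sketched in a few lines of Python (class shapes are illustrative only, not the C++ layout):

```python
class ValueImpl:
    """Shared state for every value, as in the refactor described above:
    the kind, the type, and the use list all live in one place."""
    def __init__(self, kind, type_):
        self.kind = kind      # e.g. "block_arg" or "op_result"
        self.type = type_
        self.uses = []

class Value:
    """Thin handle: a single reference to the impl, no tag bits needed,
    so accessors become branchless one-liners."""
    def __init__(self, impl):
        self.impl = impl

    def get_type(self):
        return self.impl.type   # no branching on BlockArgument vs OpResult

    def set_type(self, t):
        self.impl.type = t      # likewise: one store, no owner dispatch

arg = Value(ValueImpl("block_arg", "i32"))
res = Value(ValueImpl("op_result", "f64"))
res.set_type("f32")
```

Contrast this with the old design, where `get_type` would first have to decode which of three owner kinds the pointer-int pair encoded before it could even locate the type.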
  6. Mar 02, 2021
    • [MLIR][LinAlg] Detensorize internal function control flow. · 3b021fbd
      KareemErgawy-TomTom authored
      This patch continues detensorizing implementation by detensoring
      internal control flow in functions.
      
      In order to detensorize functions, all non-entry block arguments are
      detensored, and branches between such blocks are properly updated to
      reflect the detensored types as well. The function entry block
      (signature) is left intact.
      
      This continues work towards handling github/google/iree#1159.
      
      Reviewed By: silvas
      
      Differential Revision: https://reviews.llvm.org/D97148
    • [mlir][NFC] Rename `MemRefType::getMemorySpace` to `getMemorySpaceAsInt` · 37eca08e
      Vladislav Vinogradov authored
      Just a pure method renaming.
      
      It is a preparation step for replacing "memory space as raw integer"
      with more generic "memory space as attribute", which will be done in
      separate commit.
      
      The `MemRefType::getMemorySpace` method will return `Attribute` and
      become the main API, while `getMemorySpaceAsInt` will be declared as
      deprecated and will be replaced in all in-tree dialects (also in separate
      commits).
      
      Reviewed By: mehdi_amini, rriddle
      
      Differential Revision: https://reviews.llvm.org/D97476
  15. Feb 16, 2021
    • separate AffineMapAccessInterface from AffineRead/WriteOpInterface · 99c0458f
      Adam Straw authored
      Separate the AffineMapAccessInterface from the AffineRead/WriteOp interface so that dialects which extend Affine capabilities (e.g. PlaidML PXA = parallel extensions for Affine) can utilize relevant passes (e.g. MemRef normalization).
      
      Reviewed By: bondhugula
      
      Differential Revision: https://reviews.llvm.org/D96284
    • [mlir] Drop reliance of SliceAnalysis on specific ops. · d01ea0ed
      Nicolas Vasilache authored
      SliceAnalysis was originally developed in the context of affine.for within mlfunc.
      It predates the notion of region.
      This revision updates it to not hardcode specific ops like scf::ForOp.
      When rooted at an op, the behavior of the slice computation changes as it recurses into the regions of the op. This does not support gathering all values transitively depending on a loop induction variable anymore.
      Additional variants rooted at a Value are added to also support the existing behavior.
      
      Differential revision: https://reviews.llvm.org/D96702
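A backward slice rooted at a value, the op-agnostic variant this revision moves toward, is just a transitive walk over def-operand edges. A small model (the dict-based IR encoding is hypothetical, not the SliceAnalysis API):

```python
def backward_slice(root, defining, operands):
    """Set of ops transitively needed to compute the value `root`:
    depth-first walk from the root's defining op through operand edges,
    with no knowledge of specific op kinds like scf.for."""
    seen, stack = set(), [defining[root]]
    while stack:
        op = stack.pop()
        if op in seen:
            continue
        seen.add(op)
        for v in operands[op]:     # values this op consumes
            if v in defining:      # skip block args / external values
                stack.append(defining[v])
    return seen

# %a = A(); %b = B(%a); %c = C(); %d = D(%b, %c)
defining = {"a": "A", "b": "B", "c": "C", "d": "D"}
operands = {"A": [], "B": ["a"], "C": [], "D": ["b", "c"]}
slice_of_d = backward_slice("d", defining, operands)
slice_of_b = backward_slice("b", defining, operands)
```

Because the traversal only follows operand edges, it naturally stops caring which op defines a value, which is the point of dropping the scf::ForOp hardcoding.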
  19. Feb 09, 2021
    • [mlir][IR] Remove the concept of `OperationProperties` · fe7c0d90
      River Riddle authored
      These properties were useful for a few things before traits had a better integration story, but don't really carry their weight well these days. Most of these properties are already checked via traits in most of the code. It is better to align the system around traits, and improve the performance/cost of traits in general.
      
      Differential Revision: https://reviews.llvm.org/D96088
  20. Feb 06, 2021
    • [MLIR] [affine-loop-fusion] Fix a bug about non-result ops in affine-loop-fusion · 05c6c648
      Tung D. Le authored
      This patch fixes the following bug when calling --affine-loop-fusion
      
      Input program:
       ```mlir
      func @should_not_fuse_since_top_level_non_affine_non_result_users(
          %in0 : memref<32xf32>, %in1 : memref<32xf32>) {
        %c0 = constant 0 : index
        %cst_0 = constant 0.000000e+00 : f32
      
        affine.for %d = 0 to 32 {
          %lhs = affine.load %in0[%d] : memref<32xf32>
          %rhs = affine.load %in1[%d] : memref<32xf32>
          %add = addf %lhs, %rhs : f32
          affine.store %add, %in0[%d] : memref<32xf32>
        }
        store %cst_0, %in0[%c0] : memref<32xf32>
        affine.for %d = 0 to 32 {
          %lhs = affine.load %in0[%d] : memref<32xf32>
          %rhs = affine.load %in1[%d] : memref<32xf32>
          %add = addf %lhs, %rhs: f32
          affine.store %add, %in0[%d] : memref<32xf32>
        }
        return
      }
      ```
      
      After calling --affine-loop-fusion, we got incorrect output:
      
      ```mlir
      func @should_not_fuse_since_top_level_non_affine_non_result_users(%arg0: memref<32xf32>, %arg1: memref<32xf32>) {
        %c0 = constant 0 : index
        %cst = constant 0.000000e+00 : f32
        store %cst, %arg0[%c0] : memref<32xf32>
        affine.for %arg2 = 0 to 32 {
          %0 = affine.load %arg0[%arg2] : memref<32xf32>
          %1 = affine.load %arg1[%arg2] : memref<32xf32>
          %2 = addf %0, %1 : f32
          affine.store %2, %arg0[%arg2] : memref<32xf32>
          %3 = affine.load %arg0[%arg2] : memref<32xf32>
          %4 = affine.load %arg1[%arg2] : memref<32xf32>
          %5 = addf %3, %4 : f32
          affine.store %5, %arg0[%arg2] : memref<32xf32>
        }
        return
      }
      ```
      
      This happened because when analyzing the source and destination nodes,
      affine loop fusion ignored non-result ops sandwiched between them. In
      other words, the MemRefDependencyGraph in the affine loop fusion ignored
      these non-result ops.
      
      This patch solves the issue by adding these non-result ops to the
      MemRefDependencyGraph.
      
      Reviewed By: bondhugula
      
      Differential Revision: https://reviews.llvm.org/D95668
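The legality question behind this fix can be modeled simply: fusion must consider every node between the two loops that touches the fused memref, including nodes that produce no SSA results (like the bare `store` above). A Python sketch under that assumption (not the MemRefDependencyGraph API):

```python
def can_fuse(nodes, src, dst):
    """Fusion legality sketch: src and dst may fuse only if no node between
    them reads or writes a memref that src or dst writes. Crucially, nodes
    with no SSA results (a bare `store`) still appear in `nodes`."""
    i, j = nodes.index(src), nodes.index(dst)
    fused_writes = src["writes"] | dst["writes"]
    for mid in nodes[i + 1:j]:
        touched = mid["reads"] | mid["writes"]
        if touched & fused_writes:
            return False  # an intervening access would be reordered
    return True

loop1 = {"name": "loop1", "reads": {"in0", "in1"}, "writes": {"in0"}}
store = {"name": "store", "reads": set(), "writes": {"in0"}}  # non-result op
loop2 = {"name": "loop2", "reads": {"in0", "in1"}, "writes": {"in0"}}

blocked = can_fuse([loop1, store, loop2], loop1, loop2)  # store intervenes
ok = can_fuse([loop1, loop2], loop1, loop2)              # nothing between
```

The bug was that the `store` node never entered the graph, so the check degenerated to the `ok` case and the loops were fused across it.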
  22. Feb 04, 2021
    • [mlir] Apply source materialization in case of transitive conversion · 5b91060d
      Alex Zinenko authored
      In dialect conversion infrastructure, source materialization applies as part of
      the finalization procedure to results of the newly produced operations that
      replace previously existing values with values having a different type.
      However, such operations may be created to replace operations created in other
      patterns. At this point, it is possible that the results of the _original_
      operation are still in use and have mismatching types, but the results of the
      _intermediate_ operation that performed the type change are not in use, leading
      to the absence of source materialization. For example,
      
        %0 = dialect.produce : !dialect.A
        dialect.use %0 : !dialect.A
      
      can be replaced with
      
        %0 = dialect.other : !dialect.A
        %1 = dialect.produce : !dialect.A  // replaced, scheduled for removal
        dialect.use %1 : !dialect.A
      
      and then with
      
        %0 = dialect.final : !dialect.B
        %1 = dialect.other : !dialect.A    // replaced, scheduled for removal
        %2 = dialect.produce : !dialect.A  // replaced, scheduled for removal
        dialect.use %2 : !dialect.A
      
      in the same rewriting, but only the %1->%0 replacement is currently considered.
      
      Change the logic in dialect conversion to look up all values that were replaced
      by the given value and to perform source materialization if any of those values
      is still in use with mismatching types. This is performed by computing the
      inverse value replacement mapping. This arguably expensive manipulation is
      performed only if there were some type-changing replacements. An alternative
      could be to consider all replaced operations and not only those that resulted
      in type changes, but it would harm pattern-level composability: the pattern
      that performed the non-type-changing replacement would have to be made aware of
      the type converter in order to call the materialization hook.
      
      Reviewed By: rriddle
      
      Differential Revision: https://reviews.llvm.org/D95626
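The inverse-mapping step can be modeled concretely: invert the old-to-new replacement map, then walk each replacement chain back from its final value and flag any still-used predecessor whose type no longer matches. A sketch mirroring the `%2 -> %1 -> %0` example above (the function and its inputs are illustrative, not the dialect conversion API):

```python
def values_needing_materialization(replacements, live_uses, type_of):
    """`replacements` maps old value -> new value (possibly chained).
    Returns (old, final) pairs where `old` is still in use but the chain
    ended at a value of a different type, so a source materialization
    (e.g. a cast) is required."""
    inverse = {}
    for old, new in replacements.items():
        inverse.setdefault(new, []).append(old)

    need = set()
    finals = [v for v in inverse if v not in replacements]  # chain ends
    for final in finals:
        stack = list(inverse[final])
        while stack:
            old = stack.pop()
            if old in live_uses and type_of[old] != type_of[final]:
                need.add((old, final))
            stack.extend(inverse.get(old, []))  # keep walking the chain
    return need

# %2 -> %1 -> %0, as in the commit message; only %2 is still used.
replacements = {"%2": "%1", "%1": "%0"}
type_of = {"%2": "!dialect.A", "%1": "!dialect.A", "%0": "!dialect.B"}
need = values_needing_materialization(replacements, {"%2"}, type_of)
```

Checking only the direct `%1 -> %0` link, as before the fix, would miss `%2` entirely even though its uses see the wrong type.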
    • Make the folder more robust against op fold() methods that generate a type mismatch · a1d5bdf8
      Mehdi Amini authored
      We could extend this with an interface to allow a dialect to perform a type
      conversion, but that would make the folder create operations, which isn't
      the case at the moment and isn't necessarily always desirable.
      
      Reviewed By: rriddle
      
      Differential Revision: https://reviews.llvm.org/D95991
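The robustness guard amounts to: validate the folded result's type against the op's result type, and treat a mismatch as "no fold" rather than propagating broken IR. A toy model (dict-based ops and plain callables, not the OperationFolder API):

```python
def try_fold(op, fold_fn):
    """Apply a fold() hook, but reject a result whose type doesn't match
    the op's result type instead of silently producing mismatched IR."""
    folded = fold_fn(op)
    if folded is None:
        return None               # the hook declined to fold
    if folded["type"] != op["result_type"]:
        return None               # robust path: type-changing fold rejected
    return folded

op = {"name": "widen", "result_type": "i64"}
bad_fold = lambda op: {"value": 1, "type": "i32"}   # buggy hook: wrong type
good_fold = lambda op: {"value": 1, "type": "i64"}

rejected = try_fold(op, bad_fold)
accepted = try_fold(op, good_fold)
```

Rejecting rather than converting keeps the folder from having to create cast operations, which matches the design choice stated in the commit.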
  23. Feb 02, 2021
    • [mlir] Keep track of region signature conversions as argument replacements · 0409eb28
      Alex Zinenko authored
      In dialect conversion, signature conversions essentially perform block argument
      replacement and are added to the general value remapping. However, the replaced
      values were not tracked, so if a signature conversion was rolled back, the
      construction of operand lists for the following patterns could have obtained
      block arguments from the mapping and given them to the pattern, leading to a
      use-after-free. Keep track of signature conversions similarly to normal block
      argument replacement, and erase such replacements from the general mapping when
      the conversion is rolled back.
      
      Reviewed By: rriddle
      
      Differential Revision: https://reviews.llvm.org/D95688
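The underlying pattern is an undo log: every entry added to the value mapping is recorded, so a rolled-back conversion can erase exactly the entries it introduced instead of leaving stale mappings behind. A generic sketch (class and method names are illustrative, not the dialect conversion internals):

```python
class RollbackableMapping:
    """Value remapping with an undo log: each insertion is recorded so a
    failed conversion can erase precisely the entries it added."""
    def __init__(self):
        self.mapping = {}
        self.log = []

    def map(self, old, new):
        self.mapping[old] = new
        self.log.append(old)       # tracked, so rollback can find it later

    def checkpoint(self):
        return len(self.log)       # position to rewind to on failure

    def rollback(self, mark):
        for old in self.log[mark:]:
            self.mapping.pop(old, None)
        del self.log[mark:]

m = RollbackableMapping()
m.map("arg0", "new_arg0")
mark = m.checkpoint()
m.map("arg1", "new_arg1")  # e.g. a block argument from a signature conversion
m.rollback(mark)           # conversion failed: the entry must disappear
```

The bug fixed here was precisely a `map` call (from a signature conversion) that bypassed the log, so rollback left "arg1"-style entries dangling for later patterns to pick up.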
  25. Jan 25, 2021
    • [mlir][Affine] Add support for multi-store producer fusion · c8fc5c03
      Diego Caballero authored
      This patch adds support for producer-consumer fusion scenarios with
      multiple producer stores to the AffineLoopFusion pass. The patch
      introduces some changes to the producer-consumer algorithm, including:
      
      * For a given consumer loop, producer-consumer fusion iterates over its
      producer candidates until a fixed point is reached.
      
      * Producer candidates are gathered beforehand for each iteration of the
      consumer loop and visited in reverse program order (not strictly guaranteed)
      to maximize the number of loops fused per iteration.
      
      In general, these changes were needed to simplify the multi-store producer
      support and remove some of the workarounds that were introduced in the past
      to support more fusion cases under the single-store producer limitation.
      
      This patch also preserves the existing functionality of AffineLoopFusion with
      one minor change in behavior. Producer-consumer fusion didn't fuse scenarios
      with escaping memrefs and multiple outgoing edges (from a single store).
      Multi-store producer scenarios will usually (always?) have multiple outgoing
      edges so we couldn't fuse any with escaping memrefs, which would greatly limit
      the applicability of this new feature. Therefore, the patch enables fusion for
      these scenarios. Please, see modified tests for specific details.
      
      Reviewed By: andydavis1, bondhugula
      
      Differential Revision: https://reviews.llvm.org/D92876
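The two algorithmic changes listed above, sweeping candidates in reverse program order and repeating until a fixed point, can be sketched abstractly (the legality callback below is a made-up stand-in for the pass's real dependence checks):

```python
def fuse_producers(consumer, producers, legal):
    """Fixed-point sketch: repeatedly sweep the producer candidates in
    reverse program order, fusing any that are legal, until a full sweep
    fuses nothing new."""
    fused = []
    changed = True
    while changed:
        changed = False
        for p in sorted(producers, reverse=True):  # reverse program order
            if p not in fused and legal(consumer, p, fused):
                fused.append(p)
                changed = True
    return fused

# Hypothetical legality: producer 2 only becomes fusable once 3 is fused,
# the kind of enabling effect that motivates the fixed-point iteration.
def legal(consumer, p, fused):
    return p != 2 or 3 in fused

result = fuse_producers(consumer=9, producers=[1, 2, 3], legal=legal)
```

Visiting in reverse program order lets producer 3 fuse first, which then unlocks producer 2 within the same sweep, maximizing fusions per iteration as the commit describes.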
  26. Jan 22, 2021
    • [mlir] Support FuncOpSignatureConversion for more FunctionLike ops. · 0a7a1ac7
      mikeurbach authored
      This extracts the implementation of getType, setType, and getBody from
      FunctionSupport.h into the mlir::impl namespace and defines them
      generically in FunctionSupport.cpp. This allows them to be used
      elsewhere for any FunctionLike ops that use FunctionType for their
      type signature.
      
      Using the new helpers, FuncOpSignatureConversion is generalized to
      work with all such FunctionLike ops. Convenience helpers are added to
      configure the pattern for a given concrete FunctionLike op type.
      
      Reviewed By: rriddle
      
      Differential Revision: https://reviews.llvm.org/D95021