Skip to content
  1. Feb 22, 2021
  2. Feb 19, 2021
  3. Feb 18, 2021
  4. Feb 15, 2021
    • Duncan P. N. Exon Smith's avatar
      TransformUtils: Fix metadata handling in CloneModule (and improve CloneFunctionInto) · 22a52dfd
      Duncan P. N. Exon Smith authored
      This commit fixes how metadata is handled in CloneModule to be sound,
      and improves how it's handled in CloneFunctionInto (although the latter
      is still awkward when called within a module).
      
      Ruiling Song pointed out in PR48841 that CloneModule was changed to
      unsoundly use the RF_ReuseAndMutateDistinctMDs flag (renamed in
      fa35c1f8 for clarity). This flag papered
      over a crash caused by other various changes made to CloneFunctionInto
      over the past few years that made it unsound to use cloning between
      different modules.
      
      (This commit partially addresses PR48841, fixing the repro from
      preprocessed source but not textual IR. MDNodeMapper::mapDistinctNode
      became unsound in df763188 and this
      commit does not address that regression.)
      
      RF_ReuseAndMutateDistinctMDs is designed for the IRMover to use,
      avoiding unnecessary clones of all referenced metadata when linking
      between modules (with IRMover, the source module is discarded after
      linking). It never makes sense to use when you're not discarding the
      source. This commit drops its incorrect use in CloneModule.
      
      Sadly, the right thing to do with metadata when cloning a function is
      complicated, and this patch doesn't totally fix it.
      
      The first problem is that there are two different types of referenceable
      metadata and it's not obvious what to with one of them when remapping.
      
      - `!0 = !{!1}` is metadata's version of a constant. Programatically it's
        called "uniqued" (probably a better term would be "constant") because,
        like `ConstantArray`, it's stored in uniquing tables. Once it's
        constructed, it's illegal to change its arguments.
      - `!0 = distinct !{!1}` is a bit closer to a global variable. It's legal
        to change the operands after construction.
      
      What should be done with distinct metadata when cloning functions within
      the same module?
      
      - Should new, cloned nodes be created?
      - Should all references point to the same, old nodes?
      
      The answer depends on whether that metadata is effectively owned by a
      function.
      
      And that's the second problem. Referenceable metadata's ownership model
      is not clear or explicit. Technically, it's all stored on an
      LLVMContext. However, any metadata that is `distinct`, that transitively
      references a `distinct` node, or that transitively references a
      GlobalValue is specific to a Module and is effectively owned by it. More
      specifically, some metadata is effectively owned by a specific Function
      within a module.
      
      Effectively function-local metadata was introduced somewhere around
      c10d0e5c, which made it illegal for two
      functions to share a DISubprogram attachment.
      
      When cloning a function within a module, you need to clone the
      function-local debug info and suppress cloning of global debug info (the
      status quo suppresses cloning some global debug info but not all). When
      cloning a function to a new/different module, you need to clone all of
      the debug info.
      
      Here's what I think we should do (eventually? soon? not this patch
      though):
      - Distinguish explicitly (somehow) between pure constant metadata owned
        by the LLVMContext, global metadata owned by the Module, and local
        metadata owned by a GlobalValue (such as a function).
      - Update CloneFunctionInto to trigger cloning of all "local" metadata
        (only), perhaps by adding a bit to RemapFlag. Alternatively, split
        out a separate function CloneFunctionMetadataInto to prime the
        metadata map that callers are updated to call ahead of time as
        appropriate.
      
      Here's the somewhat more isolated fix in this patch:
      - Converted the `ModuleLevelChanges` parameter to `CloneFunctionInto` to
        an enum called `CloneFunctionChangeType` that is one of
        LocalChangesOnly, GlobalChanges, DifferentModule, and ClonedModule.
      - The code maintaining the "functions uniquely own subprograms"
        invariant is now only active in the first two cases, where a function
        is being cloned within a single module. That's necessary because this
        code inhibits cloning of (some) "global" metadata that's effectively
        owned by the module.
      - The code maintaining the "all compile units must be explicitly
        referenced by !llvm.dbg.cu" invariant is now only active in the
        DifferentModule case, where a function is being cloned into a new
        module in isolation.
      - CoroSplit.cpp's call to CloneFunctionInto in CoroCloner::create
        uses LocalChangeOnly, since fa635d73
        only set `ModuleLevelChanges` to trigger cloning of local metadata.
      - CloneModule drops its unsound use of RF_ReuseAndMutateDistinctMDs
        and special handling of !llvm.dbg.cu.
      - Fixed some outdated header docs and left a couple of FIXMEs.
      
      Differential Revision: https://reviews.llvm.org/D96531
      22a52dfd
  5. Feb 12, 2021
  6. Feb 11, 2021
    • Duncan P. N. Exon Smith's avatar
      ValueMapper: Rename RF_MoveDistinctMDs => RF_ReuseAndMutateDistinctMDs, NFC · fa35c1f8
      Duncan P. N. Exon Smith authored
      Rename the `RF_MoveDistinctMDs` flag passed into `MapValue` and
      `MapMetadata` to `RF_ReuseAndMutateDistinctMDs` in order to more
      precisely describe its effect and clarify the header documentation.
      
      Found this while helping to investigate PR48841, which pointed out an
      unsound use of the flag in `CloneModule()`. For now I've just added a
      FIXME there, but I'm hopeful that the new (more precise) name will
      prevent other similar errors.
      fa35c1f8
  7. Jan 27, 2021
    • Sanjay Patel's avatar
      [LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes) · ab93c18c
      Sanjay Patel authored
      I am trying to untangle the fast-math-flags propagation logic
      in the vectorizers (see a6f02212 for SLP).
      
      The loop vectorizer has a mix of checking FP function attributes,
      IR-level FMF, and just wrong assumptions.
      
      I am trying to avoid regressions while fixing this, and I think
      the IR-level logic is good enough for that, but it's hard to say
      for sure. This would be the 1st step in the clean-up.
      
      The existing test that I changed to include 'fast' actually shows
      a miscompile: the function only had the equivalent of nnan, but we
      created new instructions that had fast (all FMF set). This is
      similar to the example in https://llvm.org/PR35538
      
      Differential Revision: https://reviews.llvm.org/D95452
      ab93c18c
  8. Jan 11, 2021
    • Florian Hahn's avatar
      [VPlan] Unify value/recipe printing after VPDef transition. · eb0371e4
      Florian Hahn authored
      This patch unifies the way recipes and VPValues are printed after the
      transition to VPDef.
      
      VPSlotTracker has been updated to iterate over all recipes and all
      their defined values to number those. There is no need to number
      values in Value2VPValue.
      
      It also updates a few places that only used slot numbers for
      VPInstruction. All recipes now can produce numbered VPValues.
      eb0371e4
  9. Jan 08, 2021
    • David Green's avatar
      [LV] Don't sink into replication regions · 72fb5ba0
      David Green authored
      The new test case here contains a first order recurrences and an
      instruction that is replicated. The first order recurrence forces an
      instruction to be sunk _into_, as opposed to after the replication
      region. That causes several things to go wrong including registering
      vector instructions multiple times and failing to create dominance
      relations correctly.
      
      Instead we should be sinking to after the replication region, which is
      what this patch makes sure happens.
      
      Differential Revision: https://reviews.llvm.org/D93629
      72fb5ba0
  10. Jan 01, 2021
  11. Dec 22, 2020
    • Ta-Wei Tu's avatar
      [LoopNest] Extend `LPMUpdater` and adaptor to handle loop-nest passes · d7a6f3a1
      Ta-Wei Tu authored
      This is a follow-up patch of D87045.
      
      The patch implements "loop-nest mode" for `LPMUpdater` and `FunctionToLoopPassAdaptor` in which only top-level loops are operated.
      
      `createFunctionToLoopPassAdaptor` decides whether the returned adaptor is in loop-nest mode or not based on the given pass. If the pass is a loop-nest pass or the pass is a `LoopPassManager` which contains only loop-nest passes, the loop-nest version of adaptor is returned; otherwise, the normal (loop) version of adaptor is returned.
      
      Reviewed By: Whitney
      
      Differential Revision: https://reviews.llvm.org/D87531
      d7a6f3a1
  12. Dec 21, 2020
  13. Dec 18, 2020
    • Whitney Tsang's avatar
      Ensure SplitEdge to return the new block between the two given blocks · 2a814cd9
      Whitney Tsang authored
      This PR implements the function splitBasicBlockBefore to address an
      issue
      that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore.
      The issue occurs in SplitEdge when the Succ has a single predecessor
      and the edge between the BB and Succ is not critical. This produces
      the result ‘BB->Succ->New’. The new function splitBasicBlockBefore
      was added to splitBlockBefore to handle the issue and now produces
      the correct result ‘BB->New->Succ’.
      
      Below is an example of splitting the block bb1 at its first instruction.
      
      /// Original IR
      bb0:
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlock
      bb0:
      	br bb1
      bb1:
      	br bb1.split
      bb1.split:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore
      bb0:
      	br bb1.split
      bb1.split
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      
      Differential Revision: https://reviews.llvm.org/D92200
      2a814cd9
  14. Dec 17, 2020
    • Bangtian Liu's avatar
      511cfe94
    • Bangtian Liu's avatar
      Ensure SplitEdge to return the new block between the two given blocks · d20e0c34
      Bangtian Liu authored
      This PR implements the function splitBasicBlockBefore to address an
      issue
      that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore.
      The issue occurs in SplitEdge when the Succ has a single predecessor
      and the edge between the BB and Succ is not critical. This produces
      the result ‘BB->Succ->New’. The new function splitBasicBlockBefore
      was added to splitBlockBefore to handle the issue and now produces
      the correct result ‘BB->New->Succ’.
      
      Below is an example of splitting the block bb1 at its first instruction.
      
      /// Original IR
      bb0:
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlock
      bb0:
      	br bb1
      bb1:
      	br bb1.split
      bb1.split:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore
      bb0:
      	br bb1.split
      bb1.split
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      
      Differential Revision: https://reviews.llvm.org/D92200
      d20e0c34
  15. Dec 16, 2020
    • Roman Lebedev's avatar
      [SimplifyCFG] MergeBlockIntoPredecessor() already knows how to preserve DomTree · 49dac4ac
      Roman Lebedev authored
      ... so just ensure that we pass DomTreeUpdater it into it.
      
      Fixes DomTree preservation for a large number of tests,
      all of which are marked as such so that they do not regress.
      49dac4ac
    • Whitney Tsang's avatar
      [LoopNest] Handle loop-nest passes in LoopPassManager · fa3693ad
      Whitney Tsang authored
      Per http://llvm.org/OpenProjects.html#llvm_loopnest, the goal of this
      patch (and other following patches) is to create facilities that allow
      implementing loop nest passes that run on top-level loop nests for the
      New Pass Manager.
      
      This patch extends the functionality of LoopPassManager to handle
      loop-nest passes by specializing the definition of LoopPassManager that
      accepts both kinds of passes in addPass.
      
      Only loop passes are executed if L is not a top-level one, and both
      kinds of passes are executed if L is top-level. Currently, loop nest
      passes should have the following run method:
      
      PreservedAnalyses run(LoopNest &, LoopAnalysisManager &,
      LoopStandardAnalysisResults &, LPMUpdater &);
      
      Reviewed By: Whitney, ychen
      Differential Revision: https://reviews.llvm.org/D87045
      fa3693ad
    • Bangtian Liu's avatar
      c1075720
    • Bangtian Liu's avatar
      Ensure SplitEdge to return the new block between the two given blocks · cf638d79
      Bangtian Liu authored
      This PR implements the function splitBasicBlockBefore to address an
      issue
      that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore.
      The issue occurs in SplitEdge when the Succ has a single predecessor
      and the edge between the BB and Succ is not critical. This produces
      the result ‘BB->Succ->New’. The new function splitBasicBlockBefore
      was added to splitBlockBefore to handle the issue and now produces
      the correct result ‘BB->New->Succ’.
      
      Below is an example of splitting the block bb1 at its first instruction.
      
      /// Original IR
      bb0:
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlock
      bb0:
      	br bb1
      bb1:
      	br bb1.split
      bb1.split:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore
      bb0:
      	br bb1.split
      bb1.split
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      
      Differential Revision: https://reviews.llvm.org/D92200
      cf638d79
  16. Dec 15, 2020
  17. Dec 03, 2020
  18. Nov 29, 2020
  19. Nov 25, 2020
  20. Nov 17, 2020
    • Florian Hahn's avatar
      [VPlan] Add VPDef class. · 52f3714d
      Florian Hahn authored
      This patch introduces a new VPDef class, which can be used to
      manage VPValues defined by recipes/VPInstructions.
      
      The idea here is to mirror VPUser for values defined by a recipe. A
      VPDef can produce either zero (e.g. a store recipe), one (most recipes)
      or multiple (VPInterleaveRecipe) result VPValues.
      
      To traverse the def-use chain from a VPDef to its users, one has to
      traverse the users of all values defined by a VPDef.
      
      VPValues now contain a pointer to their corresponding VPDef, if one
      exists. To traverse the def-use chain upwards from a VPValue, we first
      need to check if the VPValue is defined by a VPDef. If it does not have
      a VPDef, this means we have a VPValue that is not directly defined
      iniside the plan and we are done.
      
      If we have a VPDef, it is defined inside the region by a recipe, which
      is a VPUser, and the upwards def-use chain traversal continues by
      traversing all its operands.
      
      Note that we need to add an additional field to to VPVAlue to link them
      to their defs. The space increase is going to be offset by being able to
      remove the SubclassID field in future patches.
      
      Reviewed By: Ayal
      
      Differential Revision: https://reviews.llvm.org/D90558
      52f3714d
  21. Nov 06, 2020
    • Giorgis Georgakoudis's avatar
      [CodeExtractor] Replace uses of extracted bitcasts in out-of-region lifetime markers · 700d2417
      Giorgis Georgakoudis authored
      CodeExtractor handles bitcasts in the extracted region that have
      lifetime markers users in the outer region as outputs. That
      creates unnecessary alloca/reload instructions and extra lifetime
      markers. The patch identifies those cases, and replaces uses in
      out-of-region lifetime markers with new bitcasts in the outer region.
      
      **Example**
      ```
      define void @foo() {
      entry:
        %0 = alloca i32
        br label %extract
      
      extract:
        %1 = bitcast i32* %0 to i8*
        call void @llvm.lifetime.start.p0i8(i64 4, i8* %1)
        call void @use(i32* %0)
        br label %exit
      
      exit:
        call void @use(i32* %0)
        call void @llvm.lifetime.end.p0i8(i64 4, i8* %1)
        ret void
      }
      ```
      
      **Current extraction**
      ```
      define void @foo() {
      entry:
        %.loc = alloca i8*, align 8
        %0 = alloca i32, align 4
        br label %codeRepl
      
      codeRepl:                                         ; preds = %entry
        %lt.cast = bitcast i8** %.loc to i8*
        call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast)
        %lt.cast1 = bitcast i32* %0 to i8*
        call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
        call void @foo.extract(i32* %0, i8** %.loc)
        %.reload = load i8*, i8** %.loc, align 8
        call void @llvm.lifetime.end.p0i8(i64 -1, i8* %lt.cast)
        br label %exit
      
      exit:                                             ; preds = %codeRepl
        call void @use(i32* %0)
        call void @llvm.lifetime.end.p0i8(i64 4, i8* %.reload)
        ret void
      }
      
      define internal void @foo.extract(i32* %0, i8** %.out) {
      newFuncRoot:
        br label %extract
      
      exit.exitStub:                                    ; preds = %extract
        ret void
      
      extract:                                          ; preds = %newFuncRoot
        %1 = bitcast i32* %0 to i8*
        store i8* %1, i8** %.out, align 8
        call void @use(i32* %0)
        br label %exit.exitStub
      }
      ```
      
      **Extraction with patch**
      ```
      define void @foo() {
      entry:
        %0 = alloca i32, align 4
        br label %codeRepl
      
      codeRepl:                                         ; preds = %entry
        %lt.cast1 = bitcast i32* %0 to i8*
        call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
        call void @foo.extract(i32* %0)
        br label %exit
      
      exit:                                             ; preds = %codeRepl
        call void @use(i32* %0)
        %lt.cast = bitcast i32* %0 to i8*
        call void @llvm.lifetime.end.p0i8(i64 4, i8* %lt.cast)
        ret void
      }
      
      define internal void @foo.extract(i32* %0) {
      newFuncRoot:
        br label %extract
      
      exit.exitStub:                                    ; preds = %extract
        ret void
      
      extract:                                          ; preds = %newFuncRoot
        %1 = bitcast i32* %0 to i8*
        call void @use(i32* %0)
        br label %exit.exitStub
      }
      ```
      
      Reviewed By: vsk
      
      Differential Revision: https://reviews.llvm.org/D90689
      700d2417
  22. Oct 05, 2020
    • Florian Hahn's avatar
      [VPlan] Clean up uses/operands on VPBB deletion. · 348d85a6
      Florian Hahn authored
      Update the code responsible for deleting VPBBs and recipes to properly
      update users and release operands.
      
      This is another preparation for D84680 & following patches towards
      enabling modeling def-use chains in VPlan.
      348d85a6
  23. Oct 04, 2020
  24. Oct 03, 2020
  25. Sep 30, 2020
    • Florian Hahn's avatar
      [VPlan] Change recipes to inherit from VPUser instead of a member var. · d8563654
      Florian Hahn authored
      Now that VPUser is not inheriting from VPValue, we can take the next
      step and turn the recipes that already manage their operands via VPUser
      into VPUsers directly. This is another small step towards traversing
      def-use chains in VPlan.
      
      This is NFC with respect to the generated code, but makes the interface
      more powerful.
      d8563654
  26. Sep 22, 2020
  27. Sep 19, 2020
    • Florian Hahn's avatar
      [SCEVExpander] Support expanding nonintegral pointers with constant base. · 1d8f2e52
      Florian Hahn authored
      Currently SCEVExpander creates inttoptr for non-integral pointers if the
      base is a null constant for example. This results in invalid IR.
      
      This patch changes InsertNoopCastOfTo to emit a GEP & bitcast to convert
      to a non-integral pointer. First, a GEP of i8* null is generated and the
      integral value is used as index. The GEP is then bitcasted to the target
      type.
      
      This was exposed by D71539.
      
      Reviewed By: efriedma
      
      Differential Revision: https://reviews.llvm.org/D87827
      1d8f2e52
  28. Sep 16, 2020
    • Wenlei He's avatar
      [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults · 2ea4c2c5
      Wenlei He authored
      ~~D65060 uncovered that trying to use BFI in loop passes can lead to non-deterministic behavior when blocks are re-used while retaining old BFI data.~~
      
      ~~To make sure BFI is preserved through loop passes a Value Handle (VH) callback is registered on blocks themselves. When a block is freed it now also wipes out the accompanying BFI entry such that stale BFI data can no longer persist resolving the determinism issue. ~~
      
      ~~An optimistic approach would be to incrementally update BFI information throughout the loop passes rather than only invalidating them on removed blocks. The issues with that are:~~
      ~~1. It is not clear how BFI information should be incrementally updated: If a block is duplicated does its BFI information come with? How about if it's split/modified/moved around? ~~
      ~~2. Assuming we can address these problems the implementation here will be a massive undertaking. ~~
      
      ~~There's a known need of BFI in LICM analysis which requires correct but not incrementally updated BFI data. A follow-up change can register BFI in all loop passes so this preserved but potentially lossy data is available to any loop pass that wants it.~~
      
      See: D75341 for an identical implementation of preserving BFI via VH callbacks. The previous statements do still apply but this change no longer has to be in this diff because it's already upstream 😄 .
      
      This diff also moves BFI to be a part of LoopStandardAnalysisResults since the previous method using getCachedResults now (correctly!) statically asserts (D72893) that this data isn't static through the loop passes.
      
      Testing
      Ninja check
      
      Reviewed By: asbirlea, nikic
      
      Differential Revision: https://reviews.llvm.org/D86156
      2ea4c2c5
  29. Aug 21, 2020
    • Amy Huang's avatar
      [Cloning] Fix to cloning DISubprograms. · 5e3fd471
      Amy Huang authored
      When trying to enable -debug-info-kind=constructor there was an assert
      that occurs during debug info cloning ("mismatched subprogram between
      llvm.dbg.value variable and !dbg attachment").
      It appears that during llvm::CloneFunctionInto, a DISubprogram could be
      duplicated when MapMetadata is called, and then added to the MD map again
      when DIFinder gets a list of subprograms. This results in two different
      versions of the DISubprogram.
      
      This patch switches the order so that the DIFinder subprograms are
      added before MapMetadata is called.
      
      Fixes https://bugs.llvm.org/show_bug.cgi?id=46784
      
      Differential Revision: https://reviews.llvm.org/D86185
      5e3fd471
  30. Jul 30, 2020
  31. Jul 29, 2020
    • Florian Hahn's avatar
      Reland "[SCEVExpander] Add option to preserve LCSSA directly." · f75564ad
      Florian Hahn authored
      This reverts the revert commit dc286757.
      
      It includes a fix for Polly, which uses SCEVExpander on IR that is not
      in LCSSA form. Set PreserveLCSSA = false in that case, to ensure we do
      not introduce LCSSA phis where there were none before.
      f75564ad
    • Florian Hahn's avatar
      Revert "[SCEVExpander] Add option to preserve LCSSA directly." · dc286757
      Florian Hahn authored
      This reverts commit 99166fd4, because it
      breaks the polly builders.
      
      polly/test/Isl/CodeGen/invariant_load_escaping_second_scop.ll fails
      because a apparently unnecessary LCSSA phi node is introduced.
      
      Make the bots green again, while I take a closer look.
      dc286757
    • Florian Hahn's avatar
      [SCEVExpander] Add option to preserve LCSSA directly. · 99166fd4
      Florian Hahn authored
      This patch teaches SCEVExpander to directly preserve LCSSA.
      
      As it is currently, SCEV does not look through PHI nodes in loops,
      as it might break LCSSA form. Once SCEVExpander can preserve
      LCSSA form, it should be safe for SCEV to look through PHIs.
      
      To preserve LCSSA form, this patch uses formLCSSAForInstructions
      on operands of newly created instructions, if the definition is inside
      a different loop than the new instruction.
      
      The final value we return from expandCodeFor may also need LCSSA
      phis, depending on the insert point. As no user for it exists there yet,
      create a temporary instruction at the insert point, which can be passed
      to formLCSSAForInstructions. This temporary instruction is removed
      after LCSSA construction.
      
      Reviewed By: mkazantsev
      
      Differential Revision: https://reviews.llvm.org/D71538
      99166fd4
Loading