  1. Mar 19, 2021
    • [VPlan] Add plain text (not DOT's digraph) dumps · 93a9d2de
      Andrei Elovikov authored
      I foresee two uses for this:
      1) It's easier to use those in a debugger.
      2) Once we start implementing more VPlan-to-VPlan transformations (especially
         inner loop massaging stuff), using the vectorized LLVM IR as CHECK targets in
         LIT tests would become too obscure. I can imagine that we'd want to CHECK
         against VPlan dumps after multiple transformations instead. That would be
         easier with plain text dumps than with DOT format.
      
      Reviewed By: fhahn
      
      Differential Revision: https://reviews.llvm.org/D96628
  2. Mar 18, 2021
    • Revert "[VPlan] Add plain text (not DOT's digraph) dumps" · 3614df35
      Mehdi Amini authored
      This reverts commit 6b053c98.
      The build is broken:
      
      ld.lld: error: undefined symbol: llvm::VPlan::printDOT(llvm::raw_ostream&) const
      >>> referenced by LoopVectorize.cpp
      >>>               LoopVectorize.cpp.o:(llvm::LoopVectorizationPlanner::printPlans(llvm::raw_ostream&)) in archive lib/libLLVMVectorize.a
    • [VPlan] Add plain text (not DOT's digraph) dumps · 6b053c98
      Andrei Elovikov authored
      I foresee two uses for this:
      1) It's easier to use those in a debugger.
      2) Once we start implementing more VPlan-to-VPlan transformations (especially
         inner loop massaging stuff), using the vectorized LLVM IR as CHECK targets in
         LIT tests would become too obscure. I can imagine that we'd want to CHECK
         against VPlan dumps after multiple transformations instead. That would be
         easier with plain text dumps than with DOT format.
      
      Reviewed By: fhahn
      
      Differential Revision: https://reviews.llvm.org/D96628
  3. Mar 17, 2021
    • [BasicAA] Drop dependency on Loop Info. PR43276 · a6074b09
      Max Kazantsev authored
      BasicAA stores a reference to LoopInfo inside. This imposes an implicit
      requirement of keeping it up to date whenever we modify the IR (in particular,
      whenever we modify terminators of blocks that belong to loops). Failing
      to do so leads to incorrect state of the LoopInfo.
      
      Because general AA does not require loop info updates and provides no API to
      update it properly, the users of AA reasonably assume that there is no need to
      update the loop info. This can lead to bugs, as the example in PR43276 shows.
      
      This patch drops dependence of BasicAA on LoopInfo to avoid this problem.
      
      This may potentially pessimize the result of queries to BasicAA.
      
      Differential Revision: https://reviews.llvm.org/D98627
      Reviewed By: nikic
  4. Mar 13, 2021
  5. Mar 12, 2021
    • [SCEV] Improve modelling for (null) pointer constants · 61f006ac
      Roman Lebedev authored
      This is a continuation of D89456.
      
      As it was suggested there, now that SCEV models `PtrToInt`,
      we can try to improve SCEV's pointer handling.
      In particular, I believe I will need this in the future
      to further fix `SCEVAddExpr` operation type handling.
      
      This removes special handling of `ConstantPointerNull`
      from `ScalarEvolution::createSCEV()`, and adds constant folding
      to `ScalarEvolution::getPtrToIntExpr()`.
      This way, `null` constants stay as such in SCEVs,
      but gracefully become zero integers when asked.
      
      Reviewed By: Meinersbur
      
      Differential Revision: https://reviews.llvm.org/D98147
  6. Mar 11, 2021
  7. Mar 10, 2021
  8. Mar 08, 2021
    • Fix 2: [DebugInfo] Support DIArgList in DbgVariableIntrinsic · 57a0e0d4
      Stephen Tozer authored
      Changes to function calls in LocalTest resulted in comparisons between
      unsigned values and signed literals; the latter have been updated to be
      unsigned to prevent this warning.
    • [DebugInfo] Support DIArgList in DbgVariableIntrinsic · e5d958c4
      gbtozers authored
      This patch updates DbgVariableIntrinsics to support use of a DIArgList for the
      location operand, resulting in a significant change to its interface. This patch
      does not update all IR passes to support multiple location operands in a
      dbg.value; the only change is to update the DbgVariableIntrinsic interface and
      its uses. All code outside of the intrinsic classes assumes that an intrinsic
      will always have exactly one location operand; they will still support
      DIArgLists, but only if they contain exactly one Value.
      
      Among other changes, the setOperand and setArgOperand functions in
      DbgVariableIntrinsic have been made private. This is to prevent code from
      setting the operands of these intrinsics directly, which could easily result in
      incorrect/invalid operands being set. This does not prevent these functions from
      being called on a debug intrinsic at all, as they can still be called on any
      CallInst pointer; it is assumed that any code directly setting the operands on a
      generic call instruction is doing so safely. The intention for making these
      functions private is to prevent DIArgLists from being overwritten by code that's
      naively trying to replace one of the Values it points to, and also to fail fast
      if a DbgVariableIntrinsic is updated to use a DIArgList without a valid
      corresponding DIExpression.
  9. Feb 22, 2021
  10. Feb 19, 2021
  11. Feb 18, 2021
  12. Feb 15, 2021
    • TransformUtils: Fix metadata handling in CloneModule (and improve CloneFunctionInto) · 22a52dfd
      Duncan P. N. Exon Smith authored
      This commit fixes how metadata is handled in CloneModule to be sound,
      and improves how it's handled in CloneFunctionInto (although the latter
      is still awkward when called within a module).
      
      Ruiling Song pointed out in PR48841 that CloneModule was changed to
      unsoundly use the RF_ReuseAndMutateDistinctMDs flag (renamed in
      fa35c1f8 for clarity). This flag papered
      over a crash caused by other various changes made to CloneFunctionInto
      over the past few years that made it unsound to use cloning between
      different modules.
      
      (This commit partially addresses PR48841, fixing the repro from
      preprocessed source but not textual IR. MDNodeMapper::mapDistinctNode
      became unsound in df763188 and this
      commit does not address that regression.)
      
      RF_ReuseAndMutateDistinctMDs is designed for the IRMover to use,
      avoiding unnecessary clones of all referenced metadata when linking
      between modules (with IRMover, the source module is discarded after
      linking). It never makes sense to use when you're not discarding the
      source. This commit drops its incorrect use in CloneModule.
      
      Sadly, the right thing to do with metadata when cloning a function is
      complicated, and this patch doesn't totally fix it.
      
      The first problem is that there are two different types of referenceable
      metadata and it's not obvious what to do with one of them when remapping.
      
      - `!0 = !{!1}` is metadata's version of a constant. Programmatically it's
        called "uniqued" (probably a better term would be "constant") because,
        like `ConstantArray`, it's stored in uniquing tables. Once it's
        constructed, it's illegal to change its arguments.
      - `!0 = distinct !{!1}` is a bit closer to a global variable. It's legal
        to change the operands after construction.
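
The difference between the two kinds of node can be pictured with a toy model. This is only a sketch under loud assumptions: `ToyContext`, `ToyNode`, `getUniqued`, and `getDistinct` are hypothetical names, operands are plain integers, and none of this is LLVM's actual metadata API.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a metadata node; operands are plain ints here.
struct ToyNode {
  std::vector<int> Ops;
  bool IsDistinct;
};

// Hypothetical context that owns every node it hands out.
class ToyContext {
  std::map<std::vector<int>, ToyNode *> UniquingTable;
  std::vector<std::unique_ptr<ToyNode>> Storage;

public:
  // Uniqued: identical operands always yield the same node, which is why
  // its operands must not change after construction (returned as const).
  const ToyNode *getUniqued(const std::vector<int> &Ops) {
    auto It = UniquingTable.find(Ops);
    if (It != UniquingTable.end())
      return It->second;
    Storage.emplace_back(new ToyNode{Ops, false});
    UniquingTable[Ops] = Storage.back().get();
    return Storage.back().get();
  }

  // Distinct: every call creates a fresh node whose operands may be
  // mutated later without affecting any other node.
  ToyNode *getDistinct(std::vector<int> Ops) {
    Storage.emplace_back(new ToyNode{std::move(Ops), true});
    return Storage.back().get();
  }
};
```

In this model, two requests for a uniqued node with the same operands return the same pointer, while two requests for a distinct node never do. That is what makes the cloning question below non-trivial for distinct nodes.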
      
      What should be done with distinct metadata when cloning functions within
      the same module?
      
      - Should new, cloned nodes be created?
      - Should all references point to the same, old nodes?
      
      The answer depends on whether that metadata is effectively owned by a
      function.
      
      And that's the second problem. Referenceable metadata's ownership model
      is not clear or explicit. Technically, it's all stored on an
      LLVMContext. However, any metadata that is `distinct`, that transitively
      references a `distinct` node, or that transitively references a
      GlobalValue is specific to a Module and is effectively owned by it. More
      specifically, some metadata is effectively owned by a specific Function
      within a module.
      
      Effectively function-local metadata was introduced somewhere around
      c10d0e5c, which made it illegal for two
      functions to share a DISubprogram attachment.
      
      When cloning a function within a module, you need to clone the
      function-local debug info and suppress cloning of global debug info (the
      status quo suppresses cloning some global debug info but not all). When
      cloning a function to a new/different module, you need to clone all of
      the debug info.
      
      Here's what I think we should do (eventually? soon? not this patch
      though):
      - Distinguish explicitly (somehow) between pure constant metadata owned
        by the LLVMContext, global metadata owned by the Module, and local
        metadata owned by a GlobalValue (such as a function).
      - Update CloneFunctionInto to trigger cloning of all "local" metadata
        (only), perhaps by adding a bit to RemapFlag. Alternatively, split
        out a separate function CloneFunctionMetadataInto to prime the
        metadata map that callers are updated to call ahead of time as
        appropriate.
      
      Here's the somewhat more isolated fix in this patch:
      - Converted the `ModuleLevelChanges` parameter to `CloneFunctionInto` to
        an enum called `CloneFunctionChangeType` that is one of
        LocalChangesOnly, GlobalChanges, DifferentModule, and ClonedModule.
      - The code maintaining the "functions uniquely own subprograms"
        invariant is now only active in the first two cases, where a function
        is being cloned within a single module. That's necessary because this
        code inhibits cloning of (some) "global" metadata that's effectively
        owned by the module.
      - The code maintaining the "all compile units must be explicitly
        referenced by !llvm.dbg.cu" invariant is now only active in the
        DifferentModule case, where a function is being cloned into a new
        module in isolation.
      - CoroSplit.cpp's call to CloneFunctionInto in CoroCloner::create
        uses LocalChangesOnly, since fa635d73
        only set `ModuleLevelChanges` to trigger cloning of local metadata.
      - CloneModule drops its unsound use of RF_ReuseAndMutateDistinctMDs
        and special handling of !llvm.dbg.cu.
      - Fixed some outdated header docs and left a couple of FIXMEs.
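
The case split described in the list above can be summarized in a few lines. This is a hedged sketch, not the real declaration: the enumerator names are taken from this message, but the `Toy` prefix and the two helper predicates are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Enumerator names come from the commit message; the "Toy" prefix marks
// this as an illustration rather than the actual LLVM declaration.
enum class ToyCloneFunctionChangeType {
  LocalChangesOnly,
  GlobalChanges,
  DifferentModule,
  ClonedModule
};

// Hypothetical helper: the "functions uniquely own subprograms" invariant
// is only maintained when cloning within a single module, which the
// message identifies as the first two cases.
bool maintainsUniqueSubprograms(ToyCloneFunctionChangeType T) {
  return T == ToyCloneFunctionChangeType::LocalChangesOnly ||
         T == ToyCloneFunctionChangeType::GlobalChanges;
}

// Hypothetical helper: the "all compile units must be explicitly
// referenced by !llvm.dbg.cu" invariant is only maintained when cloning
// into a different module in isolation.
bool maintainsCompileUnitList(ToyCloneFunctionChangeType T) {
  return T == ToyCloneFunctionChangeType::DifferentModule;
}
```

Note that `ClonedModule` triggers neither piece of bookkeeping in this sketch, matching the statement that CloneModule drops its special handling of !llvm.dbg.cu.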
      
      Differential Revision: https://reviews.llvm.org/D96531
  13. Feb 12, 2021
  14. Feb 11, 2021
    • ValueMapper: Rename RF_MoveDistinctMDs => RF_ReuseAndMutateDistinctMDs, NFC · fa35c1f8
      Duncan P. N. Exon Smith authored
      Rename the `RF_MoveDistinctMDs` flag passed into `MapValue` and
      `MapMetadata` to `RF_ReuseAndMutateDistinctMDs` in order to more
      precisely describe its effect and clarify the header documentation.
      
      Found this while helping to investigate PR48841, which pointed out an
      unsound use of the flag in `CloneModule()`. For now I've just added a
      FIXME there, but I'm hopeful that the new (more precise) name will
      prevent other similar errors.
  15. Jan 27, 2021
    • [LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes) · ab93c18c
      Sanjay Patel authored
      I am trying to untangle the fast-math-flags propagation logic
      in the vectorizers (see a6f02212 for SLP).
      
      The loop vectorizer has a mix of checking FP function attributes,
      IR-level FMF, and just wrong assumptions.
      
      I am trying to avoid regressions while fixing this, and I think
      the IR-level logic is good enough for that, but it's hard to say
      for sure. This would be the 1st step in the clean-up.
      
      The existing test that I changed to include 'fast' actually shows
      a miscompile: the function only had the equivalent of nnan, but we
      created new instructions that had fast (all FMF set). This is
      similar to the example in https://llvm.org/PR35538
      
      Differential Revision: https://reviews.llvm.org/D95452
  16. Jan 11, 2021
    • [VPlan] Unify value/recipe printing after VPDef transition. · eb0371e4
      Florian Hahn authored
      This patch unifies the way recipes and VPValues are printed after the
      transition to VPDef.
      
      VPSlotTracker has been updated to iterate over all recipes and all
      their defined values to number those. There is no need to number
      values in Value2VPValue.
      
      It also updates a few places that only used slot numbers for
      VPInstruction. All recipes now can produce numbered VPValues.
  17. Jan 08, 2021
    • [LV] Don't sink into replication regions · 72fb5ba0
      David Green authored
      The new test case here contains a first-order recurrence and an
      instruction that is replicated. The first-order recurrence forces an
      instruction to be sunk _into_ the replication region, as opposed to
      after it. That causes several things to go wrong, including registering
      vector instructions multiple times and failing to create dominance
      relations correctly.
      
      Instead we should be sinking to after the replication region, which is
      what this patch makes sure happens.
      
      Differential Revision: https://reviews.llvm.org/D93629
  18. Jan 01, 2021
  19. Dec 22, 2020
    • [LoopNest] Extend `LPMUpdater` and adaptor to handle loop-nest passes · d7a6f3a1
      Ta-Wei Tu authored
      This is a follow-up patch of D87045.
      
      The patch implements "loop-nest mode" for `LPMUpdater` and `FunctionToLoopPassAdaptor`, in which only top-level loops are operated on.
      
      `createFunctionToLoopPassAdaptor` decides whether the returned adaptor is in loop-nest mode or not based on the given pass. If the pass is a loop-nest pass or the pass is a `LoopPassManager` which contains only loop-nest passes, the loop-nest version of adaptor is returned; otherwise, the normal (loop) version of adaptor is returned.
      
      Reviewed By: Whitney
      
      Differential Revision: https://reviews.llvm.org/D87531
  20. Dec 21, 2020
  21. Dec 18, 2020
    • Ensure SplitEdge to return the new block between the two given blocks · 2a814cd9
      Whitney Tsang authored
      This PR implements the function splitBasicBlockBefore to address an
      issue that occurred during SplitEdge(BB, Succ, ...), inside
      splitBlockBefore.
      The issue occurs in SplitEdge when the Succ has a single predecessor
      and the edge between the BB and Succ is not critical. This produces
      the result ‘BB->Succ->New’. The new function splitBasicBlockBefore
      was added to splitBlockBefore to handle the issue and now produces
      the correct result ‘BB->New->Succ’.
      
      Below is an example of splitting the block bb1 at its first instruction.
      
      /// Original IR
      bb0:
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlock
      bb0:
      	br bb1
      bb1:
      	br bb1.split
      bb1.split:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore
      bb0:
      	br bb1.split
      bb1.split:
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
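
The two behaviours in the example can be mimicked with a toy model. This is a sketch only: `Block`, `Chain`, `splitAfter`, and `splitBefore` are hypothetical names, a "CFG" here is just a straight-line chain of fall-through blocks, and nothing below is the LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a basic block: a name plus its non-terminator
// instructions. Each block in a Chain branches to the next element.
struct Block {
  std::string Name;
  std::vector<std::string> Insts;
};

using Chain = std::vector<Block>;

// Models the splitBasicBlock behaviour: instructions [At, end) move into a
// new block placed AFTER the original. Returns the new block's index.
std::size_t splitAfter(Chain &C, std::size_t Idx, std::size_t At) {
  Block New{C[Idx].Name + ".split", {}};
  New.Insts.assign(C[Idx].Insts.begin() + At, C[Idx].Insts.end());
  C[Idx].Insts.erase(C[Idx].Insts.begin() + At, C[Idx].Insts.end());
  C.insert(C.begin() + Idx + 1, New);
  return Idx + 1;
}

// Models the splitBasicBlockBefore behaviour: instructions [0, At) move
// into a new block placed BEFORE the original. Returns the new block's
// index; the original block shifts to Idx + 1.
std::size_t splitBefore(Chain &C, std::size_t Idx, std::size_t At) {
  Block New{C[Idx].Name + ".split", {}};
  New.Insts.assign(C[Idx].Insts.begin(), C[Idx].Insts.begin() + At);
  C[Idx].Insts.erase(C[Idx].Insts.begin(), C[Idx].Insts.begin() + At);
  C.insert(C.begin() + Idx, New);
  return Idx;
}
```

Splitting `bb1` at its first instruction with `splitAfter` leaves the new block behind `bb1`, whereas `splitBefore` places it between `bb0` and `bb1`, which is the position SplitEdge is expected to return.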
      
      Differential Revision: https://reviews.llvm.org/D92200
  22. Dec 17, 2020
    • Bangtian Liu's avatar
      511cfe94
    • Ensure SplitEdge to return the new block between the two given blocks · d20e0c34
      Bangtian Liu authored
      This PR implements the function splitBasicBlockBefore to address an
      issue that occurred during SplitEdge(BB, Succ, ...), inside
      splitBlockBefore.
      The issue occurs in SplitEdge when the Succ has a single predecessor
      and the edge between the BB and Succ is not critical. This produces
      the result ‘BB->Succ->New’. The new function splitBasicBlockBefore
      was added to splitBlockBefore to handle the issue and now produces
      the correct result ‘BB->New->Succ’.
      
      Below is an example of splitting the block bb1 at its first instruction.
      
      /// Original IR
      bb0:
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlock
      bb0:
      	br bb1
      bb1:
      	br bb1.split
      bb1.split:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore
      bb0:
      	br bb1.split
      bb1.split:
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      
      Differential Revision: https://reviews.llvm.org/D92200
  23. Dec 16, 2020
    • [SimplifyCFG] MergeBlockIntoPredecessor() already knows how to preserve DomTree · 49dac4ac
      Roman Lebedev authored
      ... so just ensure that we pass DomTreeUpdater into it.
      
      Fixes DomTree preservation for a large number of tests,
      all of which are marked as such so that they do not regress.
    • [LoopNest] Handle loop-nest passes in LoopPassManager · fa3693ad
      Whitney Tsang authored
      Per http://llvm.org/OpenProjects.html#llvm_loopnest, the goal of this
      patch (and other following patches) is to create facilities that allow
      implementing loop nest passes that run on top-level loop nests for the
      New Pass Manager.
      
      This patch extends the functionality of LoopPassManager to handle
      loop-nest passes by specializing the definition of LoopPassManager that
      accepts both kinds of passes in addPass.
      
      Only loop passes are executed if L is not a top-level one, and both
      kinds of passes are executed if L is top-level. Currently, loop nest
      passes should have the following run method:
      
      PreservedAnalyses run(LoopNest &, LoopAnalysisManager &,
      LoopStandardAnalysisResults &, LPMUpdater &);
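
The scheduling rule stated above (loop passes always run, loop-nest passes only on top-level loops) can be sketched with a toy pass list. `ToyPass` and `passesToRun` are hypothetical names introduced for illustration; the real LoopPassManager is considerably more involved.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical pass descriptor: a name and whether it is a loop-nest pass.
struct ToyPass {
  std::string Name;
  bool IsLoopNestPass;
};

// Returns the names of the passes that would run on a loop, following the
// rule in the message: loop-nest passes are skipped unless the loop is
// top-level, while ordinary loop passes always run.
std::vector<std::string> passesToRun(const std::vector<ToyPass> &PM,
                                     bool IsTopLevelLoop) {
  std::vector<std::string> Ran;
  for (const ToyPass &P : PM)
    if (!P.IsLoopNestPass || IsTopLevelLoop)
      Ran.push_back(P.Name);
  return Ran;
}
```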
      
      Reviewed By: Whitney, ychen
      Differential Revision: https://reviews.llvm.org/D87045
    • Bangtian Liu's avatar
      c1075720
    • Ensure SplitEdge to return the new block between the two given blocks · cf638d79
      Bangtian Liu authored
      This PR implements the function splitBasicBlockBefore to address an
      issue that occurred during SplitEdge(BB, Succ, ...), inside
      splitBlockBefore.
      The issue occurs in SplitEdge when the Succ has a single predecessor
      and the edge between the BB and Succ is not critical. This produces
      the result ‘BB->Succ->New’. The new function splitBasicBlockBefore
      was added to splitBlockBefore to handle the issue and now produces
      the correct result ‘BB->New->Succ’.
      
      Below is an example of splitting the block bb1 at its first instruction.
      
      /// Original IR
      bb0:
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlock
      bb0:
      	br bb1
      bb1:
      	br bb1.split
      bb1.split:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore
      bb0:
      	br bb1.split
      bb1.split:
      	br bb1
      bb1:
              %0 = mul i32 1, 2
      	br bb2
      bb2:
      
      Differential Revision: https://reviews.llvm.org/D92200
  24. Dec 15, 2020
  25. Dec 03, 2020
  26. Nov 29, 2020
  27. Nov 25, 2020
  28. Nov 17, 2020
    • [VPlan] Add VPDef class. · 52f3714d
      Florian Hahn authored
      This patch introduces a new VPDef class, which can be used to
      manage VPValues defined by recipes/VPInstructions.
      
      The idea here is to mirror VPUser for values defined by a recipe. A
      VPDef can produce either zero (e.g. a store recipe), one (most recipes)
      or multiple (VPInterleaveRecipe) result VPValues.
      
      To traverse the def-use chain from a VPDef to its users, one has to
      traverse the users of all values defined by a VPDef.
      
      VPValues now contain a pointer to their corresponding VPDef, if one
      exists. To traverse the def-use chain upwards from a VPValue, we first
      need to check if the VPValue is defined by a VPDef. If it does not have
      a VPDef, this means we have a VPValue that is not directly defined
      inside the plan and we are done.
      
      If we have a VPDef, it is defined inside the region by a recipe, which
      is a VPUser, and the upwards def-use chain traversal continues by
      traversing all its operands.
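
The upward traversal described above can be sketched with a toy model. All names here (`ToyValue`, `ToyDef`, `collectDefsUpwards`) are hypothetical and much simpler than the real VPValue/VPDef/VPUser classes; the sketch only illustrates the "stop at values without a def, otherwise continue through the operands" rule.

```cpp
#include <cassert>
#include <set>
#include <vector>

struct ToyDef;

// Toy value: points at the recipe that defines it, or nullptr when the
// value is not defined inside the plan.
struct ToyValue {
  ToyDef *Def = nullptr;
};

// Toy recipe: acts as a user (it has operands) and as a def that produces
// zero, one, or several result values.
struct ToyDef {
  std::vector<ToyValue *> Operands;
  std::vector<ToyValue> Results;

  explicit ToyDef(unsigned NumResults) : Results(NumResults) {
    for (ToyValue &V : Results)
      V.Def = this;
  }
  // Results point back at this object, so copying is disallowed.
  ToyDef(const ToyDef &) = delete;
  ToyDef &operator=(const ToyDef &) = delete;
};

// Walks the def-use chain upwards from V, recording every def reached.
void collectDefsUpwards(const ToyValue &V, std::set<const ToyDef *> &Seen) {
  if (!V.Def)
    return; // Not defined inside the plan: traversal ends here.
  if (!Seen.insert(V.Def).second)
    return; // Already visited.
  for (const ToyValue *Op : V.Def->Operands)
    collectDefsUpwards(*Op, Seen);
}
```

Starting from one result of a multi-result def, the walk visits that def once and then continues through its operands until it reaches values with no def.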
      
      Note that we need to add an additional field to VPValues to link them
      to their defs. The space increase is going to be offset by being able to
      remove the SubclassID field in future patches.
      
      Reviewed By: Ayal
      
      Differential Revision: https://reviews.llvm.org/D90558
  29. Nov 06, 2020
    • [CodeExtractor] Replace uses of extracted bitcasts in out-of-region lifetime markers · 700d2417
      Giorgis Georgakoudis authored
      CodeExtractor treats bitcasts in the extracted region that have
      lifetime marker users in the outer region as outputs. That
      creates unnecessary alloca/reload instructions and extra lifetime
      markers. The patch identifies those cases and replaces uses in
      out-of-region lifetime markers with new bitcasts in the outer region.
      
      **Example**
      ```
      define void @foo() {
      entry:
        %0 = alloca i32
        br label %extract
      
      extract:
        %1 = bitcast i32* %0 to i8*
        call void @llvm.lifetime.start.p0i8(i64 4, i8* %1)
        call void @use(i32* %0)
        br label %exit
      
      exit:
        call void @use(i32* %0)
        call void @llvm.lifetime.end.p0i8(i64 4, i8* %1)
        ret void
      }
      ```
      
      **Current extraction**
      ```
      define void @foo() {
      entry:
        %.loc = alloca i8*, align 8
        %0 = alloca i32, align 4
        br label %codeRepl
      
      codeRepl:                                         ; preds = %entry
        %lt.cast = bitcast i8** %.loc to i8*
        call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast)
        %lt.cast1 = bitcast i32* %0 to i8*
        call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
        call void @foo.extract(i32* %0, i8** %.loc)
        %.reload = load i8*, i8** %.loc, align 8
        call void @llvm.lifetime.end.p0i8(i64 -1, i8* %lt.cast)
        br label %exit
      
      exit:                                             ; preds = %codeRepl
        call void @use(i32* %0)
        call void @llvm.lifetime.end.p0i8(i64 4, i8* %.reload)
        ret void
      }
      
      define internal void @foo.extract(i32* %0, i8** %.out) {
      newFuncRoot:
        br label %extract
      
      exit.exitStub:                                    ; preds = %extract
        ret void
      
      extract:                                          ; preds = %newFuncRoot
        %1 = bitcast i32* %0 to i8*
        store i8* %1, i8** %.out, align 8
        call void @use(i32* %0)
        br label %exit.exitStub
      }
      ```
      
      **Extraction with patch**
      ```
      define void @foo() {
      entry:
        %0 = alloca i32, align 4
        br label %codeRepl
      
      codeRepl:                                         ; preds = %entry
        %lt.cast1 = bitcast i32* %0 to i8*
        call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1)
        call void @foo.extract(i32* %0)
        br label %exit
      
      exit:                                             ; preds = %codeRepl
        call void @use(i32* %0)
        %lt.cast = bitcast i32* %0 to i8*
        call void @llvm.lifetime.end.p0i8(i64 4, i8* %lt.cast)
        ret void
      }
      
      define internal void @foo.extract(i32* %0) {
      newFuncRoot:
        br label %extract
      
      exit.exitStub:                                    ; preds = %extract
        ret void
      
      extract:                                          ; preds = %newFuncRoot
        %1 = bitcast i32* %0 to i8*
        call void @use(i32* %0)
        br label %exit.exitStub
      }
      ```
      
      Reviewed By: vsk
      
      Differential Revision: https://reviews.llvm.org/D90689
  30. Oct 05, 2020
    • [VPlan] Clean up uses/operands on VPBB deletion. · 348d85a6
      Florian Hahn authored
      Update the code responsible for deleting VPBBs and recipes to properly
      update users and release operands.
      
      This is another preparation for D84680 & following patches towards
      enabling modeling def-use chains in VPlan.