  1. Mar 15, 2021
  2. Mar 14, 2021
  3. Mar 13, 2021
    • Nikita Popov's avatar
      [MemCpyOpt] Handle read from lifetime.start with offset · 55566609
      Nikita Popov authored
      This fixes a regression from the MemDep-based implementation:
      MemDep completely ignores lifetime.start intrinsics that aren't
      MustAlias -- this is probably unsound, but it does mean that the
      MemDep based implementation successfully eliminated memcpy's from
      lifetime.start if the memcpy happens at an offset, rather than
      the base address of the alloca.
      
      Add a special case for when the lifetime.start spans the
      whole alloca (which is pretty much the only kind of lifetime.start
      that frontends ever emit). In that case we don't need to figure out
      the exact aliasing relationship: the whole alloca is dead prior
      to the call.
      
      If this doesn't cover all practically relevant cases, then it
      would be possible to make use of the recently added PartialAlias
      clobber offsets to make this more precise.
      55566609
    • Sanjay Patel's avatar
      [InstCombine] avoid creating an extra instruction in zext fold and possible inf-loop · 4224a369
      Sanjay Patel authored
      The structure of this fold is suspect vs. most of instcombine
      because it creates instructions and tries to delete them
      immediately after.
      
      If we don't have the operand types for the icmps, then we are
      not behaving as assumed. And as shown in PR49475, we can inf-loop.
      4224a369
    • Roman Lebedev's avatar
      [LSR] Don't try to fixup uses in 'EH pad' instructions · 6e9b9978
      Roman Lebedev authored
      The added test case crashes before this fix:
      ```
      opt: /repositories/llvm-project/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:5172: BasicBlock::iterator (anonymous namespace)::LSRInstance::AdjustInsertPositionForExpand(BasicBlock::iterator, const (anonymous namespace)::LSRFixup &, const (anonymous namespace)::LSRUse &, llvm::SCEVExpander &) const: Assertion `!isa<PHINode>(LowestIP) && !LowestIP->isEHPad() && !isa<DbgInfoIntrinsic>(LowestIP) && "Insertion point must be a normal instruction"' failed.
      ```
      This is fully analogous to the previous commit,
      with the pointer constant replaced by something non-null.
      
      The comparison here can be strength-reduced,
      but the second operand of the comparison happens to be identical
      to the constant pointer in the `catch` case of `landingpad`.
      
      While LSRInstance::CollectLoopInvariantFixupsAndFormulae()
      already gave up on uses in blocks ending up with EH pads,
      it didn't consider this case.
      
      Eventually, `LSRInstance::AdjustInsertPositionForExpand()`
      will be called, but the original insertion point it will get
      is the user instruction itself, and it doesn't want to
      deal with EH pads, and asserts as much.
      
      It would seem that this basically never happens in-the-wild,
      otherwise it would have been reported already,
      so it seems safe to take the cautious approach,
      and just not deal with such users.
      6e9b9978
    • Nikita Popov's avatar
      [MemCpyOpt] Use AA to check for MustAlias between memset and memcpy · 2902bdee
      Nikita Popov authored
      Rather than checking for simple equality, check for MustAlias, as
      we do in other transforms. This catches equivalent GEPs.
      2902bdee
    • Nikita Popov's avatar
      [MemCpyOpt] Don't generate zero-size memset · 9080444f
      Nikita Popov authored
      If a memset destination is overwritten by a memcpy and the sizes
      are exactly the same, then the memset is simply dead. We can
      directly drop it, instead of replacing it with a memset of zero
      size, which is particularly ugly for the case of a dynamic size.
      9080444f
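      The size reasoning above can be sketched in a few lines of C++. This is
      only an illustration, not the MemCpyOpt code: `MemsetTail` and
      `tailAfterMemcpy` are hypothetical names, and the sketch assumes the
      memcpy starts at the memset destination.

      ```cpp
      #include <cstdint>

      // Hypothetical result type: whether a memset is still needed, and its size.
      struct MemsetTail {
        bool Needed;
        std::uint64_t Size;
      };

      // Assumes the memcpy overwrites the first MemcpySize bytes of the memset
      // destination, so only the bytes past the memcpy still need the memset.
      MemsetTail tailAfterMemcpy(std::uint64_t MemsetSize,
                                 std::uint64_t MemcpySize) {
        if (MemsetSize <= MemcpySize)
          return {false, 0}; // Fully overwritten: the memset is dead, emit nothing.
        return {true, MemsetSize - MemcpySize};
      }
      ```

      With equal sizes the helper reports that no memset is needed at all, which
      corresponds to dropping the instruction rather than emitting a zero-size one.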
  4. Mar 12, 2021
    • Wei Mi's avatar
      [IndirectCallPromotion] Recommit "Don't strip ".__uniq." suffix when it strips · ef9d7db7
      Wei Mi authored
      ".llvm." suffix".
      
      The recommit fixes a bug in the previous commit where symbols
      beginning with "." were not handled properly.
      
      Original commit message:
      Currently IndirectCallPromotion simply strips everything after the first "."
      in LTO mode, in order to match the symbol name and the name with ".llvm."
      suffix in the value profile. However, if -funique-internal-linkage-names
      and thinlto are both enabled, the name may have both ".__uniq." suffix and
      ".llvm." suffix, and the current mechanism will strip them both, which is
      unexpected. The patch fixes the problem.
      
      Differential Revision: https://reviews.llvm.org/D98389
      ef9d7db7
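      The intended name handling can be sketched as follows. This is a minimal
      illustration under stated assumptions, not the code from the patch:
      `stripLTOSuffixKeepUniq` is a hypothetical helper, and it assumes the
      ".llvm." suffix, if present, follows the ".__uniq." part.

      ```cpp
      #include <cstddef>
      #include <string>

      // Hypothetical helper: drop everything from the first "." that comes after
      // the ".__uniq." marker (if any), so the unique suffix itself is kept.
      std::string stripLTOSuffixKeepUniq(const std::string &Name) {
        const std::string UniqSuffix = ".__uniq.";
        std::size_t Start = 0;
        std::size_t Uniq = Name.find(UniqSuffix);
        if (Uniq != std::string::npos)
          Start = Uniq + UniqSuffix.size();
        std::size_t Dot = Name.find('.', Start);
        // A "." at position 0 belongs to the symbol name, not to a suffix.
        if (Dot == std::string::npos || Dot == 0)
          return Name;
        return Name.substr(0, Dot);
      }
      ```

      Under these assumptions, "foo.__uniq.456.llvm.123" becomes
      "foo.__uniq.456", while "foo.llvm.123" becomes "foo" and ".hidden" is
      left unchanged.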
    • Nikita Popov's avatar
      [OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC) · 42eb658f
      Nikita Popov authored
      This removes some (but not all) uses of type-less CreateGEP()
      and CreateInBoundsGEP() APIs, which are incompatible with opaque
      pointers.
      
      There are still a number of tricky uses left, as well as many
      more variant APIs for CreateGEP.
      42eb658f
    • Florian Hahn's avatar
      [LV] Account IV recipes being uniform in VPTransformState::get(). · fb3ca707
      Florian Hahn authored
      This patch fixes a crash when trying to get a scalar value using
      VPTransformState::get() for uniform induction values or truncated
      induction values. IVs and truncated IVs can be uniform and the updated
      code accounts for that, fixing the crash.
      
      This should fix
      https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=31981
      fb3ca707
    • Sanjay Patel's avatar
      [SimplifyCFG] avoid sinking insts within an infinite-loop · bd197ed0
      Sanjay Patel authored
      The test is reduced from a C source example in:
      https://llvm.org/PR49541
      
      It's possible that the test could be reduced further or
      the predicate generalized further, but it seems to require
      a few ingredients (including the "late" SimplifyCFG options
      on the RUN line) to fall into the infinite-loop trap.
      bd197ed0
    • Hans Wennborg's avatar
      Revert "[InstrProfiling] Don't generate __llvm_profile_runtime_user" · f50aef74
      Hans Wennborg authored
      This broke the check-profile tests on Mac, see comment on the code
      review.
      
      > This is no longer needed, we can add __llvm_profile_runtime directly
      > to llvm.compiler.used or llvm.used to achieve the same effect.
      >
      > Differential Revision: https://reviews.llvm.org/D98325
      
      This reverts commit c7712087.
      
      Also reverting the dependent follow-up commit:
      
      Revert "[InstrProfiling] Generate runtime hook for ELF platforms"
      
      > When using -fprofile-list to selectively apply instrumentation only
      > to certain files or functions, we may end up with a binary that doesn't
      > have any counters in the case where no files were selected. However,
      > because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the
      > runtime would still be pulled in and incur some non-trivial overhead,
      > especially in the case when the continuous or runtime counter relocation
      > mode is being used. A better way would be to pull in the profile runtime
      > only when needed by declaring the __llvm_profile_runtime symbol in the
      > translation unit only when needed.
      >
      > This approach was already used prior to 9a041a75, but we changed it
      > to always generate the __llvm_profile_runtime due to a TAPI limitation.
      > Since TAPI is only used on Mach-O platforms, we could use the early
      > emission of __llvm_profile_runtime there, and on other platforms we
      > could change back to the earlier approach where the symbol is generated
      > later only when needed. We can stop passing -u__llvm_profile_runtime to
      > the linker on Linux and Fuchsia since the generated undefined symbol in
      > each translation unit that needed it serves the same purpose.
      >
      > Differential Revision: https://reviews.llvm.org/D98061
      
      This reverts commit 87fd09b2.
      f50aef74
    • Serguei Katkov's avatar
      Revert "Mark gc.relocate and gc.result as readnone" · cfe8f8e0
      Serguei Katkov authored
      As readnone functions they become movable, and LICM can hoist them
      out of a loop. As a result, a phi node of token type is created in
      LCSSA form. Nothing is prepared for the first operand of GCRelocate
      to be a phi node; it is expected to be a token.
      
      The GVN test was also updated; it does not seem to do what is expected.
      A test for LICM is also added.
      
      This reverts commit f352463a.
      cfe8f8e0
    • Johannes Doerfert's avatar
      [Attributor] Derive `willreturn` based on `mustprogress` · ff256c13
      Johannes Doerfert authored
      Since D86233 we have `mustprogress` which, in combination with
      `readonly`, implies `willreturn`. The idea is that every side-effect
      has to be modeled as a "write". Consequently, `readonly` means there
      is no side-effect, and `mustprogress` guarantees that we cannot "loop"
      forever without side-effect.
      
      Reviewed By: fhahn
      
      Differential Revision: https://reviews.llvm.org/D94125
      ff256c13
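      The implication described above can be written as a one-line predicate.
      This is a sketch only: `FnAttrs` and `canDeriveWillReturn` are hypothetical
      names and do not reflect the Attributor's actual data structures.

      ```cpp
      // Hypothetical attribute summary for one function.
      struct FnAttrs {
        bool MustProgress = false;
        bool ReadOnly = false;
        bool WillReturn = false;
      };

      // `readonly` rules out side effects, and `mustprogress` rules out looping
      // forever without side effects, so together they imply `willreturn`.
      bool canDeriveWillReturn(const FnAttrs &A) {
        return A.WillReturn || (A.MustProgress && A.ReadOnly);
      }
      ```

      Either attribute alone is not enough in this sketch: a `mustprogress`
      function that writes memory may still loop forever while making progress.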
  5. Mar 11, 2021
    • Nikita Popov's avatar
      [Attributor] Don't access pointer elem type in constructPointer (NFC) · 2fe85dd2
      Nikita Popov authored
      Splitting this out as the change is non-trivial: The way this code
      handled pointer types doesn't really make sense, as GEPs can only
      apply an offset to the outermost pointer, but can't drill down
      into interior pointer types (which would require dereferencing
      memory).
      
      Instead give special treatment to the first (pointer) index.
      I've hardcoded it to zero as that's the only way the function is
      used right now, but handling non-zero indexes would be
      straightforward.
      
      The original goal here was to have an element type for CreateGEP.
      2fe85dd2
    • Petr Hosek's avatar
      [InstrProfiling] Generate runtime hook for ELF platforms · 87fd09b2
      Petr Hosek authored
      When using -fprofile-list to selectively apply instrumentation only
      to certain files or functions, we may end up with a binary that doesn't
      have any counters in the case where no files were selected. However,
      because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the
      runtime would still be pulled in and incur some non-trivial overhead,
      especially in the case when the continuous or runtime counter relocation
      mode is being used. A better way would be to pull in the profile runtime
      only when needed by declaring the __llvm_profile_runtime symbol in the
      translation unit only when needed.
      
      This approach was already used prior to 9a041a75, but we changed it
      to always generate the __llvm_profile_runtime due to a TAPI limitation.
      Since TAPI is only used on Mach-O platforms, we could use the early
      emission of __llvm_profile_runtime there, and on other platforms we
      could change back to the earlier approach where the symbol is generated
      later only when needed. We can stop passing -u__llvm_profile_runtime to
      the linker on Linux and Fuchsia since the generated undefined symbol in
      each translation unit that needed it serves the same purpose.
      
      Differential Revision: https://reviews.llvm.org/D98061
      87fd09b2
    • Valery N Dmitriev's avatar
      [SLP] Fix crash when matching associative reduction for integer min/max. · 73f94969
      Valery N Dmitriev authored
      The associative reduction matcher in SLP begins with a select instruction, but when
      it reached a call to llvm.umax (or similar) via the def-use chain, the call was also
      matched as the UMax kind. The routine's later code assumes the matched instruction is a
      select, and thus it crashed on the first cast that did not fit.
      
      Differential Revision: https://reviews.llvm.org/D98432
      73f94969
    • Wenlei He's avatar
      [SamplePGO] Skip inlinee profile scaling for sample loader inlining · 051f2c14
      Wenlei He authored
      For CGSCC inline, we need to scale down a function's branch weights and entry counts when it's inlined at a callsite. This is done through updateCallProfile. Additionally, we also scale the weights for the inlined clone based on call site count in updateCallerBFI. Neither is needed for inlining during sample profile loading, as it uses a context profile that is separate from the inlinee's own profile. This change skips the inlinee profile scaling for sample loader inlining.
      
      Differential Revision: https://reviews.llvm.org/D98187
      051f2c14
    • Hiroshi Yamauchi's avatar
      [PGO] Fix two issues in PGOMemOPSizeOpt. · 365b225d
      Hiroshi Yamauchi authored
      1. PGOMemOPSizeOpt grabs only the first, up to five (by default) entries from
      the value profile metadata and preserves the remaining entries for the fallback
      memop call site. If there are more than five entries, the rest of the entries
      would get dropped. This is fine for PGOMemOPSizeOpt itself as it only promotes
      up to 3 (by default) values, but potentially not for other downstream passes
      that may use the value profile metadata.
      
      2. PGOMemOPSizeOpt originally assumed that only values 0 through 8 are kept
      track of. When the range buckets were introduced, it was changed to skip the
      range buckets, but since it does not grab all entries (only five), if some range
      buckets exist in the first five entries, it could potentially cause fewer
      promotion opportunities (e.g. if 4 out of 5 were range buckets, it may be able to
      promote up to one non-range bucket, as opposed to 3.) Also, combined with 1, it
      means that wrong entries may be preserved, as it didn't correctly keep track of
      which entries were skipped.
      
      To fix this, PGOMemOPSizeOpt now grabs all the entries (up to the maximum number
      of value profile buckets), keeps track of which entries were skipped, and
      preserves all the remaining entries.
      
      Differential Revision: https://reviews.llvm.org/D97592
      365b225d
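      The fixed behaviour (scan all entries, promote up to a limit, skip range
      buckets, preserve everything else) can be sketched as below. This is an
      illustration rather than the pass's code: `Entry`, `Split`, `splitEntries`
      and the caller-supplied `IsRangeBucket` predicate are all hypothetical.

      ```cpp
      #include <cstddef>
      #include <cstdint>
      #include <functional>
      #include <vector>

      struct Entry {
        std::uint64_t Value; // Profiled size value (or range bucket id).
        std::uint64_t Count; // Number of times it was observed.
      };

      struct Split {
        std::vector<Entry> Promoted;  // Candidates for size-specialized calls.
        std::vector<Entry> Remaining; // Kept on the fallback call site.
      };

      // Walk every entry: range buckets and anything past the promotion limit
      // are preserved, so no profile data is dropped.
      Split splitEntries(const std::vector<Entry> &All, std::size_t MaxPromoted,
                         const std::function<bool(std::uint64_t)> &IsRangeBucket) {
        Split S;
        for (const Entry &E : All) {
          if (!IsRangeBucket(E.Value) && S.Promoted.size() < MaxPromoted)
            S.Promoted.push_back(E);
          else
            S.Remaining.push_back(E);
        }
        return S;
      }
      ```

      Because skipped range buckets no longer use up scanned slots, a limit of
      3 can still be reached even if several of the first entries are range
      buckets.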
    • Stephen Tozer's avatar
      Revert "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs... · f40976bd
      Stephen Tozer authored
      Revert "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
      
      This reverts commit c0f3dfb9.
      
      Reverted due to an error on the clang-x64-windows-msvc buildbot.
      f40976bd
    • Nikita Popov's avatar
      [OpaquePtrs] Remove some uses of type-less CreateLoad APIs (NFC) · 46354bac
      Nikita Popov authored
      Explicitly pass loaded type when creating loads, in preparation
      for the deprecation of these APIs.
      
      There are still a couple of uses left.
      46354bac
    • gbtozers's avatar
      [DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands · c0f3dfb9
      gbtozers authored
      This patch improves salvageDebugInfoImpl by allowing it to salvage arithmetic
      operations with two or more non-const operands; this includes the GetElementPtr
      instruction, and most Binary Operator instructions. These salvages produce
      DIArgList locations and are only valid for dbg.values, as currently variadic
      DIExpressions must use DW_OP_stack_value. This functionality is also only added
      for salvageDebugInfoForDbgValues; other functions that directly call
      salvageDebugInfoImpl (such as in ISel or Coroutine frame building) can be
      updated in a later patch.
      
      Differential Revision: https://reviews.llvm.org/D91722
      c0f3dfb9
    • Nikita Popov's avatar
      Reapply [LICM] Make promotion faster · 403da6a6
      Nikita Popov authored
      Relative to the previous implementation, this always uses
      aliasesUnknownInst() instead of aliasesPointer() to correctly
      handle atomics. The added test case was previously miscompiled.
      
      -----
      
      Even when MemorySSA-based LICM is used, an AST is still populated
      for scalar promotion. As the AST has quadratic complexity, a lot
      of time is spent in this step despite the existing access count
      limit. This patch optimizes the identification of promotable stores.
      
      The idea here is pretty simple: We're only interested in must-alias
      mod sets of loop invariant pointers. As such, only populate the AST
      with loop-invariant loads and stores (anything else is definitely
      not promotable) and then discard any sets which alias with any of
      the remaining, definitely non-promotable accesses.
      
      If we promoted something, check whether this has made some other
      accesses loop invariant and thus possible promotion candidates.
      
      This is much faster in practice, because we need to perform AA
      queries for O(NumPromotable^2 + NumPromotable*NumNonPromotable)
      instead of O(NumTotal^2), and NumPromotable tends to be small.
      Additionally, promotable accesses have loop invariant pointers,
      for which AA is cheaper.
      
      This has a significant positive compile-time impact. We save ~1.8%
      geomean on CTMark at O3, with 6% on lencod in particular and 25%
      on individual files.
      
      Conceptually, this change is NFC, but may not be so in practice,
      because the AST is only an approximation, and can produce
      different results depending on the order in which accesses are
      added. However, there is at least no impact on the number of promotions
      (licm.NumPromoted) in test-suite O3 configuration with this change.
      
      Differential Revision: https://reviews.llvm.org/D89264
      403da6a6
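      To make the complexity argument above concrete, the two bounds can be
      compared directly. The helpers below are hypothetical and ignore constant
      factors; they only evaluate the two expressions quoted in the message.

      ```cpp
      #include <cstdint>

      // Old bound: every access is compared against every other access.
      std::uint64_t oldQueryBound(std::uint64_t NumPromotable,
                                  std::uint64_t NumNonPromotable) {
        std::uint64_t NumTotal = NumPromotable + NumNonPromotable;
        return NumTotal * NumTotal;
      }

      // New bound: NumPromotable^2 + NumPromotable * NumNonPromotable.
      std::uint64_t newQueryBound(std::uint64_t NumPromotable,
                                  std::uint64_t NumNonPromotable) {
        return NumPromotable * NumPromotable + NumPromotable * NumNonPromotable;
      }
      ```

      For illustrative (made-up) numbers of 5 promotable and 95 non-promotable
      accesses, the bounds are 500 and 10000 queries respectively.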
    • Djordje Todorovic's avatar
      [Debugify][OriginalDIMode] Export the report into JSON file · 9f41c03f
      Djordje Todorovic authored
      Combining the original-di check in debugify with
      llvm/utils/llvm-original-di-preservation.py makes for a very
      user-friendly tool. An example of the HTML page with the issues
      related to debug info can be found at [0].
      
      [0] https://djolertrk.github.io/di-checker-html-report-example/
      
      Differential Revision: https://reviews.llvm.org/D82546
      9f41c03f
    • Petr Hosek's avatar
      [InstrProfiling] Don't generate __llvm_profile_runtime_user · c7712087
      Petr Hosek authored
      This is no longer needed, we can add __llvm_profile_runtime directly
      to llvm.compiler.used or llvm.used to achieve the same effect.
      
      Differential Revision: https://reviews.llvm.org/D98325
      c7712087
    • Ruiling Song's avatar
      [ValueMapper] Add debug output for metadata remapping · 8b7d3bed
      Ruiling Song authored
      This is useful for debugging which pointers are updated during the
      remapping process.
      
      Differential Revision: https://reviews.llvm.org/D95775
      8b7d3bed
  6. Mar 10, 2021