  1. Nov 16, 2018
  2. Nov 15, 2018
  3. Nov 13, 2018
  4. Nov 11, 2018
  5. Nov 10, 2018
  6. Nov 09, 2018
    • Florian Hahn's avatar
      [IPSCCP,PM] Preserve DT in the new pass manager. · a1062f4b
      Florian Hahn authored
      After D45330, Dominators are required for IPSCCP and can be preserved.
      
      This patch preserves DominatorTreeAnalysis in the new pass manager. AFAIK the legacy pass manager cannot preserve function analysis required by a module analysis.
      
      Reviewers: davide, dberlin, chandlerc, efriedma, kuhar, NutshellySima
      
      Reviewed By: chandlerc, kuhar, NutshellySima
      
      Differential Revision: https://reviews.llvm.org/D47259
      
      llvm-svn: 346486
      a1062f4b
  7. Nov 08, 2018
    • Pirama Arumuga Nainar's avatar
      [LTO] Drop non-prevailing definitions only if linkage is not local or appending · e61652a3
      Pirama Arumuga Nainar authored
      Summary:
      This fixes PR 37422
      
      In ELF, non-weak symbols can also be non-prevailing.  In this particular
      PR, the __llvm_profile_* symbols are non-prevailing but weren't getting
      dropped - causing multiply-defined errors with lld.
      
      Also add a test, strong_non_prevailing.ll, to ensure that multiple
      copies of a strong symbol are dropped.
      
      To fix the test regressions exposed by this fix,
      - do not mark prevailing copies for symbols with 'appending' linkage.
      There's no one prevailing copy for such symbols.
      - fix the prevailing version in dead-strip-fulllto.ll
      - explicitly pass exported symbols to llvm-lto in funcimport.ll and
      funcimport_var.ll
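The rule in the commit title can be sketched as a small predicate. This is a standalone illustration, not the LLVM code: the `Linkage` enum and the `isLocal` and `shouldDropDefinition` helpers are hypothetical names that only model the linkage kinds mentioned above.

```cpp
#include <cassert>

// Hypothetical, simplified model of a few linkage kinds.
enum class Linkage { External, Weak, Internal, Private, Appending };

// Local symbols are not visible to other modules.
bool isLocal(Linkage L) {
  return L == Linkage::Internal || L == Linkage::Private;
}

// A non-prevailing copy is dropped unless its linkage is local or appending.
// Strong (External) copies are dropped too, which is the case the PR hit.
bool shouldDropDefinition(Linkage L, bool IsPrevailing) {
  if (IsPrevailing)
    return false;
  return !isLocal(L) && L != Linkage::Appending;
}
```

Under this model a non-prevailing strong symbol is dropped, while local and appending symbols are always kept.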
      
      Reviewers: tejohnson, pcc
      
      Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith,
      dang, srhines, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D54125
      
      llvm-svn: 346436
      e61652a3
    • whitequark's avatar
      [MergeFuncs] Improve ordering of equal functions · 73cb9784
      whitequark authored
      Summary:
      MergeFunctions currently tries to process strong functions before
      weak functions, because weak functions can simply call strong
      functions, while a strong/weak function cannot call a weak function
      (a backing strong function is needed).
      
      This patch additionally tries to process external functions before
      local functions, because we definitely have to keep the external
      function, but may be able to drop the local one (and definitely
      can if it is also unnamed_addr).
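The processing order described above can be sketched as a comparator. This is a standalone illustration with hypothetical names (`Fn`, `processedBefore`, `processingOrder`). It assumes that the strong/weak criterion takes precedence over the external/local one, which the message does not state explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a function under consideration.
struct Fn {
  std::string Name;
  bool IsWeak;   // weak definition
  bool IsLocal;  // local linkage
};

// Strong before weak; within each group, external before local.
bool processedBefore(const Fn &A, const Fn &B) {
  if (A.IsWeak != B.IsWeak)
    return !A.IsWeak;
  if (A.IsLocal != B.IsLocal)
    return !A.IsLocal;
  return false; // Equivalent keys: stable_sort keeps the existing order.
}

// Return the function names in the order they would be processed.
std::vector<std::string> processingOrder(std::vector<Fn> Fns) {
  std::stable_sort(Fns.begin(), Fns.end(), processedBefore);
  std::vector<std::string> Names;
  for (const Fn &F : Fns)
    Names.push_back(F.Name);
  return Names;
}
```

Using a stable sort keeps functions with identical properties in their original relative order, so the result is deterministic.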
      
      Unfortunately, this exposes an existing bug in the implementation:
      The FnTree and FNodesInTree structures can currently go out of
      sync in the case where two weak functions are merged, because the
      function in FnTree/FNodesInTree is RAUWed. This leaves it behind in
      FnTree (this is intended, as it is the strong backing function which
      should be used for further merges), while it is replaced in
      FNodesInTree (this is not intended).
      
      This is fixed by switching FNodesInTree from using a ValueMap to
      using a DenseMap of AssertingVH.
      
      This exposes another minor issue: Currently FNodesInTree is not
      cleared after MergeFunctions finishes running. Currently, this is
      potentially dangerous (e.g. if something else wants to RAUW a function
      with a non-function), but at the very least it is unnecessary/inefficient.
      After the change to use AssertingVH it becomes more problematic,
      because there are certainly passes that remove functions.
      
      This issue is fixed by clearing FNodesInTree at the end of the pass.
      
      Reviewers: jfb, whitequark
      
      Reviewed By: whitequark
      
      Subscribers: rkruppe, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D53271
      
      llvm-svn: 346386
      73cb9784
    • whitequark's avatar
      [MergeFuncs] Call removeUsers() prior to unnamed_addr RAUW · 3580ac61
      whitequark authored
      Summary:
      For unnamed_addr functions we RAUW instead of only replacing direct callers. However, functions in which replacements were performed currently are not added back to the worklist, resulting in missed merging opportunities.
      
      Fix this by calling removeUsers() prior to RAUW.
      
      Reviewers: jfb, whitequark
      
      Reviewed By: whitequark
      
      Subscribers: rkruppe, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D53262
      
      llvm-svn: 346385
      3580ac61
  8. Nov 06, 2018
    • Teresa Johnson's avatar
      [ThinLTO] Split NotEligibleToImport into legality and inlinability flags · cb397461
      Teresa Johnson authored
      Summary:
      The NotEligibleToImport flag on the GlobalValueSummary was set if it
      isn't legal to import (e.g. because it references unpromotable locals)
      and when it can't be inlined (in which case importing is pointless).
      
      I split out the inlinable piece into a separate flag on the
      FunctionSummary (doesn't make sense for aliases or global variables),
      because in the future we may want to import for reasons other than
      inlining.
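A minimal sketch of the split, using hypothetical record types rather than the real summary classes. The field name `NoInline` and both helper functions are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical, simplified summary records; not the real LLVM classes.
struct GlobalValueSummarySketch {
  bool NotEligibleToImport = false; // legality only
};

struct FunctionSummarySketch : GlobalValueSummarySketch {
  bool NoInline = false; // inlinability, meaningful for functions only
};

// A client that imports purely to enable inlining checks both flags.
bool worthImportingForInlining(const FunctionSummarySketch &S) {
  return !S.NotEligibleToImport && !S.NoInline;
}

// A client that imports for other reasons checks legality alone.
bool legalToImport(const GlobalValueSummarySketch &S) {
  return !S.NotEligibleToImport;
}
```

With the two properties separated, a function that cannot be inlined still remains legal to import.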
      
      Reviewers: davidxl
      
      Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D53345
      
      llvm-svn: 346261
      cb397461
  9. Nov 05, 2018
  10. Oct 31, 2018
  11. Oct 29, 2018
    • Vedant Kumar's avatar
      [HotColdSplitting] Allow outlining single-block cold regions · dd4be53b
      Vedant Kumar authored
      It can be profitable to outline single-block cold regions because they
      may be large.
      
      Allow outlining single-block regions if they have over some threshold of
      non-debug, non-terminator instructions. I chose 3 as the threshold after
      experimenting with several internal frameworks.
      
      In practice, reducing the threshold further did not give much
      improvement, whereas increasing it resulted in substantial regressions.
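The profitability check can be sketched as follows. This is a standalone illustration: `InstKind` and both functions are hypothetical, and the strict comparison is an assumption based on the wording "over some threshold".

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified instruction kinds.
enum class InstKind { Regular, Debug, Terminator };

// Count the instructions that matter for the profitability check.
int countRelevantInsts(const std::vector<InstKind> &Block) {
  int N = 0;
  for (InstKind K : Block)
    if (K == InstKind::Regular)
      ++N;
  return N;
}

// Outline a single-block region only if it has more than `Threshold`
// non-debug, non-terminator instructions.
bool worthOutliningSingleBlock(const std::vector<InstKind> &Block,
                               int Threshold = 3) {
  assert(Threshold >= 0 && "threshold must be non-negative");
  return countRelevantInsts(Block) > Threshold;
}
```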
      
      Differential Revision: https://reviews.llvm.org/D53824
      
      llvm-svn: 345524
      dd4be53b
  12. Oct 25, 2018
    • Vedant Kumar's avatar
      [HotColdSplitting] Identify larger cold regions using domtree queries · c2990068
      Vedant Kumar authored
      The current splitting algorithm works in three stages:
      
        1) Identify cold blocks, then
        2) Use forward/backward propagation to mark hot blocks, then
        3) Grow a SESE region of blocks *outside* of the set of hot blocks and
        start outlining.
      
      While testing this pass on Apple internal frameworks I noticed that some
      kinds of control flow (e.g. loops) are never outlined, even though they
      unconditionally lead to / follow cold blocks. I noticed two other issues
      related to how cold regions are identified:
      
        - An inconsistency can arise in the internal state of the hotness
        propagation stage, as a block may end up in both the ColdBlocks set
        and the HotBlocks set. Further inconsistencies can arise as these sets
        do not match what's in ProfileSummaryInfo.
      
        - It isn't necessary to limit outlining to single-exit regions.
      
      This patch teaches the splitting algorithm to identify maximal cold
      regions and outline them. A maximal cold region is defined as the set of
      blocks post-dominated by a cold sink block, or dominated by that sink
      block. This approach can successfully outline loops in the cold path. As
      a side benefit, it maintains less internal state than the current
      approach.
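The definition of a maximal cold region can be illustrated on a toy CFG. This sketch does not use dominator trees. It tests dominance and post-dominance by removing the sink block and checking reachability, which is equivalent only under the assumptions that every block is reachable from the entry (block 0) and can reach some exit (a block without successors). All names are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <vector>

using CFG = std::vector<std::vector<int>>; // CFG[B] = successors of block B

// Blocks reachable from `Roots` along the edges of G without entering `Removed`.
std::set<int> reachableWithout(const CFG &G, const std::vector<int> &Roots,
                               int Removed) {
  std::set<int> Seen;
  std::vector<int> Work;
  for (int R : Roots)
    if (R != Removed && Seen.insert(R).second)
      Work.push_back(R);
  while (!Work.empty()) {
    int B = Work.back();
    Work.pop_back();
    for (int S : G[B])
      if (S != Removed && Seen.insert(S).second)
        Work.push_back(S);
  }
  return Seen;
}

// Build the graph with every edge reversed.
CFG reversed(const CFG &G) {
  CFG R(G.size());
  for (int B = 0; B < (int)G.size(); ++B)
    for (int S : G[B])
      R[S].push_back(B);
  return R;
}

std::set<int> maximalColdRegion(const CFG &G, int Sink) {
  // Dominated by Sink: unreachable from the entry once Sink is removed.
  std::set<int> FromEntry = reachableWithout(G, {0}, Sink);
  // Post-dominated by Sink: cannot reach any exit once Sink is removed.
  std::vector<int> Exits;
  for (int B = 0; B < (int)G.size(); ++B)
    if (G[B].empty())
      Exits.push_back(B);
  std::set<int> ToExit = reachableWithout(reversed(G), Exits, Sink);

  std::set<int> Region;
  for (int B = 0; B < (int)G.size(); ++B)
    if (!FromEntry.count(B) || !ToExit.count(B))
      Region.insert(B);
  return Region;
}
```

For a CFG where block 2 leads into a loop at block 3 that exits only to the cold sink 4, the region is {2, 3, 4}, so the loop on the cold path is included.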
      
      Due to a limitation in CodeExtractor, blocks within the maximal cold
      region which aren't dominated by a single entry point (a so-called "max
      ancestor") are filtered out.
      
      Results:
        - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to
        47KB pre-patch, or a ~3x improvement. Did not see a performance impact
        across two runs.
        - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB
        of TEXT were outlined. Ditto re: performance impact.
        - Outlining results improve marginally in the internal frameworks I
        tested.
      
      Follow-ups:
        - Outline more than once per function, outline large single basic
        blocks, & try to remove unconditional branches in outlined functions.
      
      Differential Revision: https://reviews.llvm.org/D53627
      
      llvm-svn: 345209
      c2990068
  13. Oct 24, 2018
    • Teresa Johnson's avatar
      [hot-cold-split] Name split functions with ".cold" suffix · c8dba682
      Teresa Johnson authored
      Summary:
      The current default of appending "_"+entry block label to the new
      extracted cold function breaks demangling. Change the delimiter from
      "_" to "." to enable demangling. Because the header block label will
      be empty for release compile code, use "extracted" after the "." when
      the label is empty.
      
      Additionally, add a mechanism for the client to pass in an alternate
      suffix applied after the ".", and have the hot cold split pass use
      "cold."+Count, where the Count is currently 1 but can be used to
      uniquely number multiple cold functions split out from the same function
      with D53588.
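The naming scheme can be sketched with plain strings. This is a standalone illustration: `extractedName` and `coldName` are hypothetical helpers, not the extractor's interface.

```cpp
#include <cassert>
#include <string>

// Build the name of an extracted function. `Suffix` models the alternate
// suffix passed in by the client; when it is empty, fall back to the header
// block label, or to "extracted" if that label is empty too.
std::string extractedName(const std::string &Parent,
                          const std::string &HeaderLabel,
                          const std::string &Suffix = "") {
  assert(!Parent.empty() && "parent function must have a name");
  if (!Suffix.empty())
    return Parent + "." + Suffix;
  return Parent + "." +
         (HeaderLabel.empty() ? std::string("extracted") : HeaderLabel);
}

// The hot/cold splitting pass uses "cold." + Count as the suffix.
std::string coldName(const std::string &Parent, int Count = 1) {
  return extractedName(Parent, "", "cold." + std::to_string(Count));
}
```

Because everything after the first "." is a suffix, the original mangled name stays intact at the front of the new name.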
      
      Reviewers: sebpop, hiraditya
      
      Subscribers: llvm-commits, erik.pilkington
      
      Differential Revision: https://reviews.llvm.org/D53534
      
      llvm-svn: 345178
      c8dba682
    • Wei Mi's avatar
      [PM] keeping history when original SCC split and then merge into itself · 80a0c97e
      Wei Mi authored
      in the same round of SCC update.
      
      In https://reviews.llvm.org/rL309784, inline history was added to prevent
      infinite inlining across multiple runs of the inliner and SCC update, but the
      history is only kept when a new SCC is actually generated during SCC update.
      
      We found a case where an SCC can be split and then merged into itself in the
      same round of SCC update, so the same SCC is popped from UR.CWorklist and
      then added back immediately without any new SCC being generated. That is why
      the existing patch cannot catch this infinite-inlining case.
      
      With this patch, the inline history is also kept when no new SCC is generated
      but the current SCC appears in UR.CWorklist again.
      
      Differential Revision: https://reviews.llvm.org/D52915
      
      llvm-svn: 345103
      80a0c97e
  14. Oct 23, 2018
    • Vedant Kumar's avatar
      [HotColdSplitting] Attach MinSize to outlined code · 50315461
      Vedant Kumar authored
      Outlined code is cold by assumption, so it makes sense to optimize it
      for minimal code size rather than performance.
      
      After r344869 moved the splitting pass to the end of the IR pipeline,
      this does not result in much of a code size reduction. This is probably
      because a comparatively small number of backend transforms make use of the
      MinSize hint.
      
      Running LNT on x86_64, I see that 33/1020 binaries shrink for a total of
      919 bytes of TEXT reduction. I didn't measure a significant performance
      impact.
      
      Differential Revision: https://reviews.llvm.org/D53518
      
      llvm-svn: 345072
      50315461
    • Jordan Rupprecht's avatar
      [DebugInfo][GlobalOpt] Fix -debugify for globalopt shrinking globals to booleans. · 2fed6ac1
      Jordan Rupprecht authored
      Summary:
      TryToShrinkGlobalToBoolean, when possible, will split store <value> + load <value> into store <bool> + select <bool ? value : 0>. This preserves DebugLoc during that pass.
      
      Fixes PR37959. The test case here is the simplified .ll for:
      
      ```
      static int foo;
      int bar() {
        foo = 5;
        return foo;
      }
      ```
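At the source level, the effect of the transform on this test case can be pictured roughly as below. This is a conceptual sketch only: the pass works on IR, and `foo_is_set` and `bar_shrunk` are hypothetical names.

```cpp
#include <cassert>

// Before: the global holds either its initializer (0) or the one stored value.
static int foo;
int bar() {
  foo = 5;
  return foo;
}

// After (conceptually): the global shrinks to a flag, and each load becomes
// a select between the stored value and the initializer.
static bool foo_is_set;
int bar_shrunk() {
  foo_is_set = true;         // store <bool>
  return foo_is_set ? 5 : 0; // select <bool ? value : 0>
}
```

The new store and select replace the original store and load, which is why they need to inherit the original debug locations.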
      
      Reviewers: dblaikie, gbedwell, aprantl
      
      Reviewed By: dblaikie
      
      Subscribers: mehdi_amini, JDevlieghere, dexonsmith, llvm-commits
      
      Tags: #debug-info
      
      Differential Revision: https://reviews.llvm.org/D53531
      
      llvm-svn: 345046
      2fed6ac1
  15. Oct 22, 2018
  16. Oct 21, 2018
  17. Oct 18, 2018
  18. Oct 17, 2018
  19. Oct 16, 2018
  20. Oct 15, 2018
    • Sebastian Pop's avatar
      [hot-cold-split] fix static analysis of cold regions · 542e522b
      Sebastian Pop authored
      Make blockEndsInUnreachable match the function of the same name in
      CodeGen/BranchFolding.cpp. I have also added a note so that this
      function is not modified unless the back-end version is modified
      as well.
      
      An early return before outlining has been added to avoid
      outlining the full function body when the first block in the
      function is marked cold.
      
      The static analysis of cold code has been amended to avoid
      marking the whole function as cold by back-propagation
      because the back-propagation would mark blocks with return
      statements as cold.
      
      The patch adds debug statements to help discover these problems.
      
      Differential Revision: https://reviews.llvm.org/D52904
      
      llvm-svn: 344558
      542e522b
    • Chandler Carruth's avatar
      [TI removal] Make variables declared as `TerminatorInst` and initialized · edb12a83
      Chandler Carruth authored
      by `getTerminator()` calls instead be declared as `Instruction`.
      
      This is the biggest remaining chunk of the usage of `getTerminator()`
      that insists on the narrow type and so is an easy batch of updates.
      Several files saw more extensive updates where this would cascade to
      requiring API updates within the file to use `Instruction` instead of
      `TerminatorInst`. All of these were trivial in nature (pervasively using
      `Instruction` instead just worked).
      
      llvm-svn: 344502
      edb12a83
  21. Oct 13, 2018
  22. Oct 12, 2018
  23. Oct 11, 2018
    • Richard Smith's avatar
      Add a flag to remap manglings when reading profile data information. · 6c676628
      Richard Smith authored
      This can be used to preserve profiling information across codebase
      changes that have widespread impact on mangled names, but across which
      most profiling data should still be usable. For example, when switching
      from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
      or even from a 32-bit to a 64-bit build.
      
      The user can provide a remapping file specifying parts of mangled names
      that should be treated as equivalent (eg, std::__1 should be treated as
      equivalent to std::__cxx11), and profile data will be treated as
      applying to a particular function if its name is equivalent to the name
      of a function in the profile data under the provided equivalences. See
      the documentation change for a description of how this is configured.
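The idea of matching names under a set of equivalences can be approximated with plain text substitution. This is a crude standalone sketch and an assumption throughout: it treats the equivalences as substrings to rewrite, whereas the message speaks of "parts of mangled names", and it does not reproduce the remapping file format. `Equivalences`, `canonicalize` and `profileApplies` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Each pair lists two name fragments to treat as equivalent; the second is
// rewritten to the first so that both spellings share one canonical form.
using Equivalences = std::vector<std::pair<std::string, std::string>>;

std::string canonicalize(std::string Name, const Equivalences &Eq) {
  for (const auto &E : Eq) {
    const std::string &Canonical = E.first;
    const std::string &Alias = E.second;
    if (Alias.empty())
      continue; // Nothing to search for.
    std::string::size_type Pos = 0;
    while ((Pos = Name.find(Alias, Pos)) != std::string::npos) {
      Name.replace(Pos, Alias.size(), Canonical);
      Pos += Canonical.size(); // Continue after the inserted text.
    }
  }
  return Name;
}

// Profile data applies to a function if the two names agree once both have
// been rewritten to their canonical form.
bool profileApplies(const std::string &FunctionName,
                    const std::string &ProfileName, const Equivalences &Eq) {
  return canonicalize(FunctionName, Eq) == canonicalize(ProfileName, Eq);
}
```

With the fragment for `std::__cxx11` mapped to the fragment for `std::__1`, a profile entry recorded under one namespace matches the same function mangled under the other.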
      
      Remapping is supported for both sample-based profiling and instruction
      profiling. We do not support remapping indirect branch target
      information, but all other profile data should be remapped
      appropriately.
      
      Support is only added for the new pass manager. If someone wants to also
      add support for this for the old pass manager, doing so should be
      straightforward.
      
      This is the LLVM side of Clang r344199.
      
      Reviewers: davidxl, tejohnson, dlj, erik.pilkington
      
      Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D51249
      
      llvm-svn: 344200
      6c676628
  24. Oct 10, 2018
    • George Burgess IV's avatar
      Replace most users of UnknownSize with LocationSize::unknown(); NFC · 6ef8002c
      George Burgess IV authored
      Moving away from UnknownSize is part of the effort to migrate us to
      LocationSizes (e.g. the cleanup promised in D44748).
      
      This doesn't entirely remove all of the uses of UnknownSize; some uses
      require tweaks to assume that UnknownSize isn't just some kind of int.
      This patch is intended to just be a trivial replacement for all places
      where LocationSize::unknown() will Just Work.
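The motivation for replacing an integer sentinel with a dedicated type can be sketched in miniature. `SizeSketch` below is a hypothetical class written for illustration and is not the real `LocationSize`; only the `unknown()` factory name is taken from the message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of a size type with an explicit "unknown" state.
class SizeSketch {
  static constexpr std::uint64_t UnknownValue = ~std::uint64_t(0);
  std::uint64_t Value;
  explicit constexpr SizeSketch(std::uint64_t V) : Value(V) {}

public:
  static constexpr SizeSketch unknown() { return SizeSketch(UnknownValue); }
  static constexpr SizeSketch precise(std::uint64_t V) { return SizeSketch(V); }

  constexpr bool isUnknown() const { return Value == UnknownValue; }

  // Reading the value of an unknown size is a programming error.
  std::uint64_t getValue() const {
    assert(!isUnknown() && "size is not known");
    return Value;
  }
};
```

Because the constructor is private and explicit, callers cannot treat the sentinel as "just some kind of int" or do arithmetic on it by accident.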
      
      llvm-svn: 344186
      6ef8002c
  25. Oct 08, 2018
    • Xin Tong's avatar
      [ThinLTO] Keep non-prevailing (linkonce|weak)_odr symbols live · bfdad33b
      Xin Tong authored
      Summary:
      If we have a symbol with (linkonce|weak)_odr linkage, we do not want
      to dead strip it even if it is not prevailing.
      
      IR level (linkonce|weak)_odr symbol can become non-prevailing when we mix
      ELF objects and IR objects where the (linkonce|weak)_odr symbol in the ELF
      object is prevailing and the ones in the IR objects are not. Stripping
      them will prevent us from doing optimizations with them.
      
      By not dead stripping them, we will convert these symbols to
      available_externally linkage because they are non-prevailing, and
      eventually drop them after inlining.
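The intended behaviour can be sketched as two small decisions. This is a simplified standalone model with hypothetical names (`Linkage`, `isODR`, `keptLive`, `linkageAfterResolution`). Real liveness depends on reachability from the roots, which is reduced here to a single `IsReferenced` flag.

```cpp
#include <cassert>

// Hypothetical, simplified linkage kinds.
enum class Linkage { External, LinkOnceODR, WeakODR, AvailableExternally };

bool isODR(Linkage L) {
  return L == Linkage::LinkOnceODR || L == Linkage::WeakODR;
}

// A referenced copy stays live if it prevails, or if it is an ODR copy whose
// body is still useful for optimizations such as inlining.
bool keptLive(bool IsReferenced, Linkage L, bool IsPrevailing) {
  return IsReferenced && (IsPrevailing || isODR(L));
}

// A non-prevailing ODR copy becomes available_externally, so it can be
// inlined and is dropped afterwards instead of being emitted.
Linkage linkageAfterResolution(Linkage L, bool IsPrevailing) {
  if (!IsPrevailing && isODR(L))
    return Linkage::AvailableExternally;
  return L;
}
```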
      
      I modified cache-prevailing.ll to use linkonce linkage, as it tests
      whether the cache prevailing bit is effective, not whether
      linkonce_odr symbols should be treated as live.
      
      Reviewers: tejohnson, pcc
      
      Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D52893
      
      llvm-svn: 343970
      bfdad33b
  26. Oct 03, 2018
  27. Oct 01, 2018
  28. Sep 28, 2018
    • whitequark's avatar
      Revert "[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints" · 29b29801
      whitequark authored
      This reverts commit c4baf7c2f06ff5459c4f5998ce980346e72bff97.
      
      Broke the bots, and should really be in Transforms/Coroutines
      instead.
      
      llvm-svn: 343337
      29b29801
    • whitequark's avatar
      [LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints · 937afbc3
      whitequark authored
      Summary: This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder.
      
      Reviewers: whitequark, deadalnix
      
      Reviewed By: whitequark
      
      Subscribers: mehdi_amini, modocache, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D51642
      
      llvm-svn: 343336
      937afbc3
    • Florian Hahn's avatar
      Revert r343308: [LoopInterchange] Turn into a loop pass. · 8d72ecc3
      Florian Hahn authored
      llvm-svn: 343310
      8d72ecc3
    • Florian Hahn's avatar
      [LoopInterchange] Turn into a loop pass. · 0694c159
      Florian Hahn authored
      This patch turns LoopInterchange into a loop pass. It now only
      considers top-level loops and tries to move the innermost loop to the
      optimal position within the loop nest. By only looking at top-level
      loops, we might miss a few opportunities the function pass would get
      (e.g. if we have a loop nest of 3 loops, in the function pass
      we might process loops at level 1 and 2 and move the innermost loop to
      level 1, and then we process loops at levels 0, 1, 2 and interchange
      again, because we now have a different inner loop). But I think it would
      be better to handle such cases by picking the best inner loop from the
      start and avoid re-visiting the same loops again.
      
      The biggest advantage of it being a loop pass is that it interacts
      nicely with the other loop passes. Without this patch, there are some
      performance regressions on AArch64 with loop interchanging enabled,
      where no loops were interchanged, but we missed out on some other loop
      optimizations.
      
      It also removes the SimplifyCFG run. We are just changing branches, so
      the CFG should not be more complicated, besides the additional 'unique'
      preheaders this pass might create.
      
      
      Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00
      
      Reviewed By: xbolva00
      
      Differential Revision: https://reviews.llvm.org/D51702
      
      llvm-svn: 343308
      0694c159