  1. Dec 28, 2016
    • Chandler Carruth's avatar
      [PM] Teach the inliner's call graph update to handle inserting new edges · 9900d18b
      Chandler Carruth authored
      when they are call edges at the leaf but may (transitively) be reached
      via ref edges.
      
      It turns out there is a simple rule: insert everything as a ref edge
      which is a safe conservative default. Then we let the existing update
      logic handle promoting some of those to call edges.
      
      Note that it would be fairly cheap to make these call edges right away,
      if that is desirable, by testing whether there is some existing call path
      from the source to the target. It just seemed like slightly more
      complexity in this code path that isn't strictly necessary. If anyone
      feels strongly about handling this differently I'm happy to change it.
      
      llvm-svn: 290649
      9900d18b
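The rule described in this commit message can be illustrated with a toy graph. This is only a sketch under stated assumptions: `ToyCallGraph`, `insertRefEdge`, and `promoteDirectCallees` are hypothetical names invented for illustration and are not the LazyCallGraph API. It shows a new edge being inserted conservatively as a ref edge, with a later update step promoting it to a call edge.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy stand-in for a call graph with two edge kinds; not LLVM's LazyCallGraph.
enum class EdgeKind { Ref, Call };

struct ToyCallGraph {
  // Edges[From][To] is the kind of the edge From -> To.
  std::map<std::string, std::map<std::string, EdgeKind>> Edges;

  // Conservative default: a new edge starts as a ref edge. An edge that
  // already exists (including an existing call edge) is left untouched.
  void insertRefEdge(const std::string &From, const std::string &To) {
    Edges[From].insert({To, EdgeKind::Ref});
  }

  // Later update step: promote the edges whose targets a scan of From's
  // body found to be called directly.
  void promoteDirectCallees(const std::string &From,
                            const std::set<std::string> &DirectCallees) {
    for (auto &Entry : Edges[From])
      if (DirectCallees.count(Entry.first))
        Entry.second = EdgeKind::Call;
  }

  EdgeKind kind(const std::string &From, const std::string &To) const {
    auto FromIt = Edges.find(From);
    assert(FromIt != Edges.end() && "unknown source node");
    auto ToIt = FromIt->second.find(To);
    assert(ToIt != FromIt->second.end() && "no such edge");
    return ToIt->second;
  }
};
```

Inserting as a ref edge first keeps the insertion path simple, since the separate promotion step is the only place that has to reason about call edges.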
  2. Dec 27, 2016
    • Chandler Carruth's avatar
      [PM] Add one of the features left out of the initial inliner patch: · 141bf5d1
      Chandler Carruth authored
      skipping indirectly recursive inline chains.
      
      To do this, we implicitly build an inline stack for each callsite and
      check prior to inlining that doing so would not form a cycle. This uses
      the exact same technique and even shares some code with the legacy PM
      inliner.
      
      This solution remains deeply unsatisfying to me because it means we
      cannot actually iterate the inliner externally. Doing so would not be
      able to easily detect and avoid such cycles. Some day I would very much
      like to have a solution that works without this internal state to detect
      cycles, but this is not that day.
      
      llvm-svn: 290590
      141bf5d1
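A minimal sketch of the inline-stack check described above, assuming each inlined call site records the callee it came from plus the index of its parent record. The names `InlineHistory`, `recordInline`, and `historyIncludes` are hypothetical and functions are modeled as plain strings; the real code operates on LLVM functions.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Each record is (function that was inlined, index of the parent record),
// with -1 meaning "no parent". Following parents yields the inline stack.
using InlineHistory = std::vector<std::pair<std::string, int>>;

// Record that Callee was inlined at a call site whose own history is ParentID.
// Returns the ID to attach to call sites that this inlining exposes.
int recordInline(InlineHistory &History, const std::string &Callee,
                 int ParentID) {
  History.push_back({Callee, ParentID});
  return static_cast<int>(History.size()) - 1;
}

// Returns true if F already appears on the inline stack ending at ID, in
// which case inlining F again would form a cycle and should be skipped.
bool historyIncludes(const InlineHistory &History, int ID,
                     const std::string &F) {
  while (ID != -1) {
    assert(ID >= 0 && ID < static_cast<int>(History.size()) &&
           "history ID out of range");
    if (History[ID].first == F)
      return true;
    ID = History[ID].second;
  }
  return false;
}
```

The check walks only the chain for one call site, so unrelated inlining decisions elsewhere in the same function do not block each other.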
    • Chandler Carruth's avatar
      [PM] Teach the inliner in the new PM to merge attributes after inlining. · 03130d98
      Chandler Carruth authored
      Also enable the new PM in the attributes test case which caught this
      issue.
      
      llvm-svn: 290572
      03130d98
    • Chandler Carruth's avatar
      [PM] Teach the always inliner in the new pass manager to support · 6e9bb7e0
      Chandler Carruth authored
      removing fully-dead comdats without removing dead entries in comdats
      with live members.
      
      This factors the core logic out of the current inliner's internals to
      a reusable utility and leverages that in both places. The factored out
      code should also be (slightly) more efficient in cases where we have very
      few dead functions or dead comdats to consider.
      
      I've added a test case to cover this behavior of the always inliner.
      This is the last significant bug in the new PM's always inliner I've
      found (so far).
      
      llvm-svn: 290557
      6e9bb7e0
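The comdat rule in this commit message can be sketched as a two-pass filter. This is an illustration under simplifying assumptions, not the factored-out LLVM utility: `ToyFunction` and `removableFunctions` are hypothetical names, and comdats are modeled as containing functions only.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ToyFunction {
  std::string Name;
  std::string Comdat; // empty when the function is not in a comdat
  bool Dead;
};

// A dead function outside any comdat can be removed. A dead function inside
// a comdat can be removed only if every member of that comdat is dead.
std::set<std::string>
removableFunctions(const std::vector<ToyFunction> &Functions) {
  // Pass 1: count the live members of each comdat.
  std::map<std::string, int> LiveMembers;
  for (const ToyFunction &F : Functions) {
    assert(!F.Name.empty() && "functions need a name");
    if (!F.Comdat.empty() && !F.Dead)
      ++LiveMembers[F.Comdat];
  }

  // Pass 2: keep dead entries that share a comdat with a live member.
  std::set<std::string> Removable;
  for (const ToyFunction &F : Functions) {
    if (!F.Dead)
      continue;
    if (F.Comdat.empty() || LiveMembers.count(F.Comdat) == 0)
      Removable.insert(F.Name);
  }
  return Removable;
}
```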
  3. Dec 26, 2016
  4. Dec 24, 2016
    • Chandler Carruth's avatar
      [PM] Teach the always inlining test case to be much more strict about · 4eaff12b
      Chandler Carruth authored
      whether functions are removed, and fix the new PM's always inliner to
      actually pass this test.
      
      Without this, the new PM's always inliner leaves all the functions
      kicking around which won't work out very well given the semantics of
      always inline.
      
      Doing this really highlights how frustrating the current alwaysinline
      semantic contract is though -- why can we put it on *external*
      functions, etc?
      
      Also I've added a number of tricky and interesting test cases for
      removing functions with the always inliner. There is one remaining case
      not handled -- fully removing comdats -- and I've left a FIXME about
      this.
      
      llvm-svn: 290457
      4eaff12b
  5. Dec 23, 2016
  6. Dec 22, 2016
  7. Dec 21, 2016
    • Adam Nemet's avatar
      [LDist] Match behavior between invoking via optimization pipeline or opt -loop-distribute · 32e6a34c
      Adam Nemet authored
      In r267672, where the loop distribution pragma was introduced, I tried
      hard to keep the old behavior for opt: when opt is invoked
      with -loop-distribute, it should distribute the loop (it's off by
      default when run via the optimization pipeline).
      
      As MichaelZ has discovered, this has the unintended consequence of
      breaking a very common developer workflow for reproducing compilations
      using opt: first you print the pass pipeline of clang
      with -debug-pass=Arguments, and then you invoke opt with the returned
      arguments.
      
      clang -debug-pass will include -loop-distribute, but the pass is invoked
      with default=off, so nothing happens unless the loop carries the pragma.
      Through opt (default=on), however, we will try to distribute all loops.
      
      This changes opt's default to off as well to match clang.  The tests are
      modified to explicitly enable the transformation.
      
      llvm-svn: 290235
      32e6a34c
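The mismatch described above comes down to one decision with a driver-dependent default. A minimal sketch, assuming a per-loop pragma that can force the transformation on or off; `LoopPragma` and `shouldDistribute` are hypothetical names, not the pass's actual interface.

```cpp
#include <cassert>

// Hypothetical per-loop pragma state.
enum class LoopPragma { None, Enable, Disable };

// An explicit pragma wins; otherwise the driver's default decides.
bool shouldDistribute(LoopPragma Pragma, bool DefaultOn) {
  if (Pragma == LoopPragma::Enable)
    return true;
  if (Pragma == LoopPragma::Disable)
    return false;
  assert(Pragma == LoopPragma::None && "unexpected pragma state");
  return DefaultOn;
}
```

Before this change the same loop without a pragma corresponds to `shouldDistribute(LoopPragma::None, false)` under clang but `shouldDistribute(LoopPragma::None, true)` under opt; after it, both drivers pass `false`.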
    • Peter Collingbourne's avatar
      IPO: Remove the ModuleSummary argument to the FunctionImport pass. NFCI. · 598bd2a2
      Peter Collingbourne authored
      No existing client is passing a non-null value here. This will come back
      in a slightly different form as part of the type identifier summary work.
      
      Differential Revision: https://reviews.llvm.org/D28006
      
      llvm-svn: 290222
      598bd2a2
  8. Dec 20, 2016
    • Chandler Carruth's avatar
      [PM] Provide an initial, minimal port of the inliner to the new pass manager. · 1d963114
      Chandler Carruth authored
      This doesn't implement *every* feature of the existing inliner, but
      tries to implement the most important ones for building a functional
      optimization pipeline and beginning to sort out bugs, regressions, and
      other problems.
      
      Notable, but intentional omissions:
      - No alloca merging support. Why? Because it isn't clear we want to do
        this at all. Active discussion and investigation is going on to remove
        it, so for simplicity I omitted it.
      - No support for trying to iterate on "internally" devirtualized calls.
        Why? Because it adds what I suspect is inappropriate coupling for
        little or no benefit. We will have an outer iteration system that
        tracks devirtualization including that from function passes and
        iterates already. We should improve that rather than approximate it
        here.
      - Optimization remarks. Why? Purely to make the patch smaller, no other
        reason at all.
      
      The last one I'll probably work on almost immediately. But I wanted to
      skip it in the initial patch to try to focus the change as much as
      possible as there is already a lot of code moving around and both of
      these *could* be skipped without really disrupting the core logic.
      
      A summary of the different things happening here:
      
      1) Adding the usual new PM class and rigging.
      
      2) Fixing minor underlying assumptions in the inline cost analysis or
         inline logic that don't generally hold in the new PM world.
      
      3) Adding the core pass logic which is in essence a loop over the calls
         in the nodes in the call graph. This is a bit duplicated from the old
         inliner, but only a handful of lines could realistically be shared.
         (I tried at first, and it really didn't help anything.) All told,
         this is only about 100 lines of code, and most of that is the
         mechanics of wiring up analyses from the new PM world.
      
      4) Updating the LazyCallGraph (in the new PM) based on the *newly
         inlined* calls and references. This is very minimal because we cannot
         form cycles.
      
      5) When inlining removes the last use of a function, eagerly nuking the
         body of the function so that any "one use remaining" inline cost
         heuristics are immediately refined, and queuing these functions to be
         completely deleted once inlining is complete and the call graph
         updated to reflect that they have become dead.
      
      6) After all the inlining for a particular function, updating the
         LazyCallGraph and the CGSCC pass manager to reflect the
         function-local simplifications that are done immediately and
         internally by the inline utilities. These are the exact same
         fundamental set of CG updates done by arbitrary function passes.
      
      7) Adding a bunch of test cases to specifically target CGSCC and other
         subtle aspects in the new PM world.
      
      Many thanks to the careful review from Easwaran and Sanjoy and others!
      
      Differential Revision: https://reviews.llvm.org/D24226
      
      llvm-svn: 290161
      1d963114
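Step 5 above can be illustrated with a toy model. This is a sketch under strong assumptions: every function is treated as having local linkage, call sites are modeled as callee names, and `ToyModule` and `inlineOneCall` are hypothetical names. It shows why dropping the dead callee's body right away matters: the use counts of the functions it called fall back immediately instead of staying inflated until the end.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct ToyModule {
  // Calls[F] lists the direct callees of F, one entry per call site.
  std::map<std::string, std::vector<std::string>> Calls;
  // Uses[F] is the number of call sites that target F.
  std::map<std::string, int> Uses;
  // Functions whose bodies were dropped, to be deleted after inlining.
  std::vector<std::string> DeadQueue;
};

// Inline one call from Caller to Callee. If that was the last use of Callee,
// drop its body now and queue it for deletion later.
void inlineOneCall(ToyModule &M, const std::string &Caller,
                   const std::string &Callee) {
  assert(Caller != Callee && "direct recursion is not modeled");
  std::vector<std::string> &CallerCalls = M.Calls[Caller];
  auto Site = std::find(CallerCalls.begin(), CallerCalls.end(), Callee);
  assert(Site != CallerCalls.end() && "caller must call callee");
  CallerCalls.erase(Site);
  --M.Uses[Callee];

  // The callee's call sites are copied into the caller.
  const std::vector<std::string> Inlined = M.Calls[Callee];
  for (const std::string &Inner : Inlined) {
    CallerCalls.push_back(Inner);
    ++M.Uses[Inner];
  }

  if (M.Uses[Callee] != 0)
    return;

  // Last use removed: drop the body so its call sites stop counting as uses.
  for (const std::string &Inner : Inlined)
    --M.Uses[Inner];
  M.Calls[Callee].clear();
  M.DeadQueue.push_back(Callee);
}
```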
    • Adrian Prantl's avatar
      [IR] Remove the DIExpression field from DIGlobalVariable. · bceaaa96
      Adrian Prantl authored
      This patch implements PR31013 by introducing a
      DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
      DIExpression.
      
      Currently, DIGlobalVariable holds a DIExpression. This is not the
      best way to model this:
      
      (1) The DIGlobalVariable should describe the source level variable,
          not how to get to its location.
      
      (2) It makes it unsafe/hard to update the expressions when we call
          replaceExpression on the DIGlobalVariable.
      
      (3) It makes it impossible to represent a global variable that is in
          more than one location (e.g., a variable with multiple
          DW_OP_LLVM_fragment-s).  We also moved away from attaching the
          DIExpression to DILocalVariable for the same reasons.
      
      This reapplies r289902 with additional testcase upgrades and a change
      to the Bitcode record for DIGlobalVariable, that makes upgrading the
      old format unambiguous also for variables without DIExpressions.
      
      <rdar://problem/29250149>
      https://llvm.org/bugs/show_bug.cgi?id=31013
      Differential Revision: https://reviews.llvm.org/D26769
      
      llvm-svn: 290153
      bceaaa96
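The modeling change can be pictured with plain structs. This is only an illustrative sketch: the `Toy*` types and `locationCount` are hypothetical and do not mirror the real metadata classes. The point is that once the expression lives in a separate pairing node, one variable can appear in several pairs, one per location.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Describes the source-level variable only; no location information.
struct ToyGlobalVariable {
  std::string Name;
};

// Describes how to get to (part of) the variable's location.
struct ToyExpression {
  std::vector<std::string> Ops;
};

// The pairing node: one variable together with one expression.
struct ToyGlobalVariableExpression {
  const ToyGlobalVariable *Var;
  ToyExpression Expr;
};

// Counts how many locations are recorded for Var.
std::size_t
locationCount(const std::vector<ToyGlobalVariableExpression> &Entries,
              const ToyGlobalVariable &Var) {
  std::size_t Count = 0;
  for (const ToyGlobalVariableExpression &Entry : Entries) {
    assert(Entry.Var != nullptr && "every pair must name a variable");
    if (Entry.Var == &Var)
      ++Count;
  }
  return Count;
}
```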
  9. Dec 19, 2016
  10. Dec 17, 2016
  11. Dec 16, 2016
    • Adrian Prantl's avatar
      Revert "[IR] Remove the DIExpression field from DIGlobalVariable." · 73ec0656
      Adrian Prantl authored
      This reverts commit 289920 (again).
      I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable
      has no DIExpression. Unfortunately it is not possible to safely upgrade
      these variables without adding a flag to the bitcode record indicating which
      version they are.
      My plan of record is to roll the planned follow-up patch that adds a
      unit: field to DIGlobalVariable into this patch before recommitting.
      This way we only need one Bitcode upgrade for both changes (with a
      version flag in the bitcode record to safely distinguish the record
      formats).
      
      Sorry for the churn!
      
      llvm-svn: 289982
      73ec0656
    • Adrian Prantl's avatar
      [IR] Remove the DIExpression field from DIGlobalVariable. · 74a835cd
      Adrian Prantl authored
      This patch implements PR31013 by introducing a
      DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
      DIExpression.
      
      Currently, DIGlobalVariable holds a DIExpression. This is not the
      best way to model this:
      
      (1) The DIGlobalVariable should describe the source level variable,
          not how to get to its location.
      
      (2) It makes it unsafe/hard to update the expressions when we call
          replaceExpression on the DIGlobalVariable.
      
      (3) It makes it impossible to represent a global variable that is in
          more than one location (e.g., a variable with multiple
          DW_OP_LLVM_fragment-s).  We also moved away from attaching the
          DIExpression to DILocalVariable for the same reasons.
      
      This reapplies r289902 with additional testcase upgrades.
      
      <rdar://problem/29250149>
      https://llvm.org/bugs/show_bug.cgi?id=31013
      Differential Revision: https://reviews.llvm.org/D26769
      
      llvm-svn: 289920
      74a835cd
    • Teresa Johnson's avatar
      [ThinLTO] Thin link efficiency: More efficient export list computation · edddca22
      Teresa Johnson authored
      Summary:
      Instead of checking whether a global referenced by a function being
      imported is defined in the same module, speculatively always add the
      referenced globals to the module's export list. After all imports are
      computed, for each module prune any not in its defined set from its
      export list.
      
      For a huge C++ app with aggressive importing thresholds, even with
      D27687 we spent a lot of time invoking modulePath() from
      exportGlobalInModule (modulePath() was still the 2nd hottest routine in
      profile). The reason is that with comdat/linkonce the summary lists for
      each GUID can be long. For the app in question, for example, we were
      invoking exportGlobalInModule almost 2 million times, and we traversed
      an average of 63 entries in the summary list each time.
      
      This patch reduced the thin link time for the app by about 10% (on top
      of D27687) when using aggressive importing thresholds, and about 3.5% on
      average with default importing thresholds.
      
      Reviewers: mehdi_amini
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D27755
      
      llvm-svn: 289918
      edddca22
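The speculate-then-prune approach from this summary can be sketched as follows. The names `ModuleSets`, `exportReferenced`, and `pruneExportLists` are hypothetical, and modules are keyed by plain strings; the actual data structures in the thin link differ.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

using GUID = std::uint64_t;
using ModuleSets = std::map<std::string, std::set<GUID>>;

// Speculative step: add every referenced global to the exporting module's
// list without checking where it is defined.
void exportReferenced(ModuleSets &ExportLists, const std::string &Module,
                      const std::vector<GUID> &Referenced) {
  assert(!Module.empty() && "module path must not be empty");
  ExportLists[Module].insert(Referenced.begin(), Referenced.end());
}

// Pruning step, run once after all imports are computed: drop anything a
// module exports that is not in its defined set.
void pruneExportLists(ModuleSets &ExportLists, const ModuleSets &DefinedSets) {
  for (auto &Entry : ExportLists) {
    auto DefinedIt = DefinedSets.find(Entry.first);
    std::set<GUID> &Exports = Entry.second;
    for (auto It = Exports.begin(); It != Exports.end();) {
      bool Defined =
          DefinedIt != DefinedSets.end() && DefinedIt->second.count(*It) != 0;
      if (Defined)
        ++It;
      else
        It = Exports.erase(It);
    }
  }
}
```

The design trades a per-reference lookup during importing for a single set intersection per module at the end.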
    • Adrian Prantl's avatar
      Revert "[IR] Remove the DIExpression field from DIGlobalVariable." · 03c6d31a
      Adrian Prantl authored
      This reverts commit 289902 while investigating bot breakage.
      
      llvm-svn: 289906
      03c6d31a
    • Peter Collingbourne's avatar
      Add missing library dep. · 7a4be21d
      Peter Collingbourne authored
      llvm-svn: 289903
      7a4be21d
    • Adrian Prantl's avatar
      [IR] Remove the DIExpression field from DIGlobalVariable. · ce139357
      Adrian Prantl authored
      This patch implements PR31013 by introducing a
      DIGlobalVariableExpression that holds a pair of DIGlobalVariable and
      DIExpression.
      
      Currently, DIGlobalVariable holds a DIExpression. This is not the
      best way to model this:
      
      (1) The DIGlobalVariable should describe the source level variable,
          not how to get to its location.
      
      (2) It makes it unsafe/hard to update the expressions when we call
          replaceExpression on the DIGlobalVariable.
      
      (3) It makes it impossible to represent a global variable that is in
          more than one location (e.g., a variable with multiple
          DW_OP_LLVM_fragment-s).  We also moved away from attaching the
          DIExpression to DILocalVariable for the same reasons.
      
      <rdar://problem/29250149>
      https://llvm.org/bugs/show_bug.cgi?id=31013
      Differential Revision: https://reviews.llvm.org/D26769
      
      llvm-svn: 289902
      ce139357
    • Peter Collingbourne's avatar
      IPO: Introduce ThinLTOBitcodeWriter pass. · 1398a32e
      Peter Collingbourne authored
      This pass prepares a module containing type metadata for ThinLTO by splitting
      it into regular and thin LTO parts if possible, and writing both parts to
      a multi-module bitcode file. Modules that do not contain type metadata are
      written unmodified as a single module.
      
      All globals with type metadata are added to the regular LTO module, and
      the rest are added to the thin LTO module.
      
      Differential Revision: https://reviews.llvm.org/D27324
      
      llvm-svn: 289899
      1398a32e
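The partitioning rule can be sketched with a toy module. This illustration assumes the split is always possible when type metadata is present, which the commit message does not guarantee ("if possible"); `ToyGlobal`, `SplitResult`, and `splitForThinLTO` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ToyGlobal {
  std::string Name;
  bool HasTypeMetadata;
};

struct SplitResult {
  std::vector<std::string> RegularLTO; // globals with type metadata
  std::vector<std::string> ThinLTO;    // everything else
  bool Split;                          // false: written as a single module
};

// Globals with type metadata go to the regular LTO part, the rest to the
// thin LTO part. With no type metadata at all, nothing is split.
SplitResult splitForThinLTO(const std::vector<ToyGlobal> &Globals) {
  SplitResult Result;
  Result.Split = false;
  for (const ToyGlobal &G : Globals) {
    if (G.HasTypeMetadata) {
      Result.RegularLTO.push_back(G.Name);
      Result.Split = true;
    } else {
      Result.ThinLTO.push_back(G.Name);
    }
  }
  assert(Result.RegularLTO.size() + Result.ThinLTO.size() == Globals.size() &&
         "every global lands in exactly one part");
  return Result;
}
```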
    • Teresa Johnson's avatar
      [ThinLTO] Thin link efficiency improvement: don't re-export globals (NFC) · 19f2aa78
      Teresa Johnson authored
      Summary:
      We were reinvoking exportGlobalInModule numerous times redundantly.
      No need to re-export globals referenced by a global that was already
      imported from its module. This resulted in a large speedup in the thin
      link for a big application, particularly when importing aggressiveness
      was cranked up.
      
      Reviewers: mehdi_amini
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D27687
      
      llvm-svn: 289896
      19f2aa78
  12. Dec 15, 2016
    • Teresa Johnson's avatar
      [ThinLTO] Revert part of r289843 that belonged to another patch. · eb0ac241
      Teresa Johnson authored
      The code change for D27687 accidentally got committed along with the
      main change in r289843. Revert it temporarily, so that I can recommit it
      along with its test as intended.
      
      llvm-svn: 289875
      eb0ac241
    • Teresa Johnson's avatar
      [ThinLTO] Remove stale comment (NFC) · 0c3f57b1
      Teresa Johnson authored
      This should have been removed with r288446.
      
      llvm-svn: 289871
      0c3f57b1
    • Teresa Johnson's avatar
      [ThinLTO] Thin link efficiency: skip candidate added later with higher threshold (NFC) · 475b51a7
      Teresa Johnson authored
      Summary:
      Thin link efficiency improvement. After adding an importing candidate to
      the worklist we might have later added it again with a higher threshold.
      Skip it when popped from the worklist if we recorded a higher threshold
      than the current worklist entry; it will get processed again at the
      higher threshold when that entry is popped.
      
      This required adding the summary's GUID to the worklist, so that it can
      be used to query the recorded highest threshold for it when we pop from the
      worklist.
      
      Reviewers: mehdi_amini
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D27696
      
      llvm-svn: 289867
      475b51a7
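A minimal sketch of the skip described in this summary, assuming the worklist stores the GUID alongside the threshold it was queued with and that a separate map records the highest threshold seen per GUID. `WorklistEntry` and `drainWorklist` are hypothetical names, and the "processing" here just records which entries survive.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using GUID = std::uint64_t;

struct WorklistEntry {
  GUID Id;
  unsigned Threshold; // threshold in effect when the entry was queued
};

// Pops entries from the back of the worklist and returns the ones that are
// actually processed. An entry is skipped when a higher threshold was later
// recorded for the same GUID, because that later entry will handle it.
std::vector<WorklistEntry>
drainWorklist(std::vector<WorklistEntry> Worklist,
              const std::map<GUID, unsigned> &HighestThreshold) {
  std::vector<WorklistEntry> Processed;
  while (!Worklist.empty()) {
    WorklistEntry Entry = Worklist.back();
    Worklist.pop_back();
    auto It = HighestThreshold.find(Entry.Id);
    assert(It != HighestThreshold.end() &&
           "every queued GUID has a recorded threshold");
    if (It->second > Entry.Threshold)
      continue; // stale entry
    Processed.push_back(Entry);
  }
  return Processed;
}
```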
    • Teresa Johnson's avatar
      [ThinLTO] Ensure callees get hot threshold when first seen on cold path · 1b859a23
      Teresa Johnson authored
      This is split out from D27696, since it turned out to be a bug fix and
      not part of the NFC efficiency change.
      
      Keep the same adjusted (possibly decayed) threshold in both the worklist
      and the ImportList. Otherwise if we encountered it first along a cold
      path, the callee would be added to the worklist with a lower decayed
      threshold than when it is later encountered along a hot path. But the
      logic uses the threshold recorded in the ImportList entry to check if
      we should re-add it, and without this patch the threshold recorded there
      is the same along both paths so we don't re-add it. Using the
      same possibly decayed threshold in the ImportList ensures we re-add it
      later with the higher non-decayed hot path threshold.
      
      llvm-svn: 289843
      1b859a23
    • Hal Finkel's avatar
      Remove the AssumptionCache · 3ca4a6bc
      Hal Finkel authored
      After r289755, the AssumptionCache is no longer needed. Variables affected by
      assumptions are now found by using the new operand-bundle-based scheme. This
      new scheme is more computationally efficient, and it also needs much less
      code.
      
      llvm-svn: 289756
      3ca4a6bc
  13. Dec 14, 2016
    • Dehao Chen's avatar
      Only sets profile summary when it was not preset. · 40dd8c51
      Dehao Chen authored
      Summary: SampleProfileLoader pass may be invoked twice by LTO. The 2nd pass should not append more summary info as it is already preset by the 1st pass.
      
      Reviewers: eraman, davidxl
      
      Subscribers: mehdi_amini, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D27733
      
      llvm-svn: 289725
      40dd8c51
    • Dehao Chen's avatar
      Fix the bug in r289714 (NFC). · fb699619
      Dehao Chen authored
      llvm-svn: 289724
      fb699619
    • Dehao Chen's avatar
      Create SampleProfileLoader pass in llvm instead of clang · a99e082e
      Dehao Chen authored
      Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder.
      
      Reviewers: tejohnson, davidxl, dnovillo
      
      Subscribers: llvm-commits, mehdi_amini
      
      Differential Revision: https://reviews.llvm.org/D27743
      
      llvm-svn: 289714
      a99e082e
    • Geoff Berry's avatar
      [GVNHoist] Move GVNHoist to function simplification part of pipeline. · ca11a1e1
      Geoff Berry authored
      Summary:
      Move GVNHoist to later in the optimization pipeline, specifically, to
      the function simplification part of the pipeline.  The new pipeline
      location allows GVNHoist to run on a function after its callees have
      been inlined but before the function has been considered for inlining
      into its callers, exposing more opportunities for hoisting.
      
      Performance results on AArch64 kryo:
      Improvements:
        Benchmarks/CoyoteBench/fftbench  -24.952%
        spec2006/bzip2                    -4.071%
        internal bmark                    -3.177%
        Benchmarks/PAQ8p/paq8p            -1.754%
        spec2000/perlbmk                  -1.328%
        spec2006/h264ref                  -1.140%
      
      Regressions:
        internal bmark                    +1.818%
        Benchmarks/mafft/pairlocalalign   +1.084%
      
      Reviewers: sebpop, dberlin, hiraditya
      
      Subscribers: aemerson, mehdi_amini, mcrosier, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D27722
      
      llvm-svn: 289696
      ca11a1e1
    • Dehao Chen's avatar
      revert r289669 which breaks bots · 23025f84
      Dehao Chen authored
      llvm-svn: 289676
      23025f84
    • Dehao Chen's avatar
      Create SampleProfileLoader pass in llvm instead of clang · cb61c94d
      Dehao Chen authored
      Summary: We used to create SampleProfileLoader pass in clang. This makes LTO/ThinLTO unable to add this pass in the linker plugin. This patch moves the SampleProfileLoader pass creation from clang to llvm pass manager builder.
      
      Reviewers: tejohnson, davidxl, dnovillo
      
      Subscribers: llvm-commits, mehdi_amini
      
      Differential Revision: https://reviews.llvm.org/D27743
      
      llvm-svn: 289669
      cb61c94d
  14. Dec 13, 2016
  15. Dec 12, 2016
  16. Dec 09, 2016