  1. Jan 21, 2017
    • MergeFunctions: Preserve debug info in thunks, under option -mergefunc-preserve-debug-info · 910dc8de
      Anmol P. Paralkar authored
      Summary:
      Under option -mergefunc-preserve-debug-info we:
      - Do not create a new function for a thunk.
      - Retain the debug info for a thunk's parameters (and associated
        instructions for the debug info) from the entry block.
        Note: -debug will display the algorithm at work.
      - Create debug-info for the call (to the shared implementation) made by
        a thunk and its return value.
      - Erase the rest of the function, retaining the (minimally sized) entry
        block to create a thunk.
      - Preserve a thunk's call site to point to the thunk even when both occur
        within the same translation unit, to aid debuggability. Note that this
        behaviour differs from the underlying -mergefunc implementation, which
        modifies the thunk's call site to point to the shared implementation
        when both occur within the same translation unit. (A source-level
        sketch of the resulting thunk follows below.)
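      
      A hypothetical source-level analogy (not LLVM IR, and not code from this
      patch) of the shape a merged thunk takes; the names F and G are
      illustrative only:
      
        // Shared implementation chosen by MergeFunctions.
        int F(int X) { return X * 2 + 1; }
        
        // Thunk: a minimal entry block that forwards to F and returns its
        // result. Under -mergefunc-preserve-debug-info, G's parameter debug
        // info is retained and G's callers keep calling G.
        int G(int X) { return F(X); }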
      
      Reviewers: echristo, eeckstein, dblaikie, aprantl, friss
      
      Reviewed By: aprantl
      
      Subscribers: davide, fhahn, jfb, mehdi_amini, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28075
      
      llvm-svn: 292702
    • LowerTypeTests: Fix use-after-free. Found by asan/msan. · b365d921
      Peter Collingbourne authored
      llvm-svn: 292700
    • LowerTypeTests: Simplify; always create SizeM1 with type IntPtrTy, move initialization out of if statement. · 67addbca
      Peter Collingbourne authored
      
      llvm-svn: 292674
  2. Jan 12, 2017
    • [ThinLTO] Import static functions from the same module as caller · 83aaf358
      Teresa Johnson authored
      Summary:
      We can sometimes end up with multiple copies of a local function that
      have the same GUID in the index. This happens when local functions with
      the same name are defined in source files that also share a name but
      live in different directories, and each file was compiled from within
      its own directory, so both modules recorded the same relative source
      path at compile time.
      
      In this case, make sure we import the copy in the caller's module. While
      this isn't a correctness problem (the renamed reference, which is based
      on the module IR hash, will be unique, since the module must have had an
      externally visible function that was imported), importing the wrong copy
      is a lost performance opportunity, since that copy won't be referenced
      and inlined.
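      
      A minimal, hypothetical C++ illustration of the collision; the
      "<path>:<name>" identifier scheme and std::hash stand in for LLVM's real
      global-identifier/GUID hashing, and the file and function names are made
      up:
      
        #include <cassert>
        #include <cstdint>
        #include <functional>
        #include <string>
        
        // Toy model: local (internal) symbols are qualified by the module's
        // recorded source path, so identical relative paths collide.
        static uint64_t toyGUID(const std::string &ModulePath,
                                const std::string &Name) {
          return std::hash<std::string>{}(ModulePath + ":" + Name);
        }
        
        int main() {
          // dir1/util.c and dir2/util.c each define "static int helper()".
          // Compiled from within their own directories, both modules record
          // the relative path "util.c".
          uint64_t A = toyGUID("util.c", "helper"); // copy from dir1
          uint64_t B = toyGUID("util.c", "helper"); // copy from dir2
          assert(A == B && "two distinct local functions, one GUID");
          return 0;
        }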
      
      Reviewers: mehdi_amini
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28440
      
      llvm-svn: 291841
  3. Jan 05, 2017
    • ThinLTO: add early "dead-stripping" on the Index · 6c475a75
      Teresa Johnson authored
      Summary:
      Using the linker-supplied list of "preserved" symbols, we can compute
      the list of "dead" symbols, i.e. the ones that are not transitively
      reachable from a "preserved" symbol on the reference graph.
      Right now we are using this information to mark these functions as
      non-eligible for import.
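      
      A minimal sketch (not the actual index code) of the reachability
      computation described above, using stand-in types for the summary
      index; anything not reached from a preserved symbol is treated as dead:
      
        #include <cstdint>
        #include <map>
        #include <set>
        #include <vector>
        
        using GUID = uint64_t;
        
        // Walk the reference graph from the linker-preserved roots; the
        // returned set is the "live" symbols, everything else is "dead".
        std::set<GUID>
        computeLiveSymbols(const std::map<GUID, std::vector<GUID>> &RefEdges,
                           const std::vector<GUID> &Preserved) {
          std::set<GUID> Live(Preserved.begin(), Preserved.end());
          std::vector<GUID> Worklist(Preserved.begin(), Preserved.end());
          while (!Worklist.empty()) {
            GUID G = Worklist.back();
            Worklist.pop_back();
            auto It = RefEdges.find(G);
            if (It == RefEdges.end())
              continue;
            for (GUID Ref : It->second)
              if (Live.insert(Ref).second) // newly reached symbol
                Worklist.push_back(Ref);
          }
          return Live;
        }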
      
      The impact is twofold:
      - Reduction of compile time: we don't import these functions anywhere,
        nor do we import the functions these symbols are calling.
      - The reduced number of imports/exports leads to better internalization.
      
      Patch originally by Mehdi Amini.
      
      Reviewers: mehdi_amini, pcc
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D23488
      
      llvm-svn: 291177
    • [ThinLTO] Subsume all importing checks into a single flag · 519465b9
      Teresa Johnson authored
      Summary:
      This adds a new summary flag NotEligibleToImport that subsumes
      several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal
      and IsNotViableToInline). It also subsumes the checking of references
      on the summary that was being done during the thin link by
      eligibleForImport() for each candidate. It is much more efficient to
      do that checking once during the per-module summary build and record
      it in the summary.
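      
      A hypothetical sketch of the idea: fold several per-summary predicates
      into one bit computed once when the per-module summary is built, so the
      thin link only has to test a single flag per candidate. The field and
      parameter names are illustrative, not the exact ones in the summary
      classes:
      
        struct SummaryFlags {
          unsigned NotEligibleToImport : 1;
          // ... other per-summary flags ...
        };
        
        // Computed once at per-module summary build time, instead of
        // re-checking each candidate during the thin link.
        SummaryFlags buildFlags(bool NoRename, bool RefsInlineAsmInternal,
                                bool NotViableToInline, bool HasBadRefs) {
          SummaryFlags F{};
          F.NotEligibleToImport = NoRename || RefsInlineAsmInternal ||
                                  NotViableToInline || HasBadRefs;
          return F;
        }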
      
      Reviewers: mehdi_amini
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28169
      
      llvm-svn: 291108
    • IR: Module summary representation for type identifiers; summary test scaffolding for lowertypetests. · b2ce2b68
      Peter Collingbourne authored
      
      Set up basic YAML I/O support for module summaries, plumb the summary into
      the pass and add a few command line flags to test YAML I/O support. Bitcode
      support to come separately, as will the code in LowerTypeTests that actually
      uses the summary. Also add a couple of tests that pass by virtue of the pass
      doing nothing with the summary (which happens to be the correct thing to do
      for those tests).
      
      Differential Revision: https://reviews.llvm.org/D28041
      
      llvm-svn: 291069
  4. Jan 04, 2017
    • Use lazy-loading of Metadata in MetadataLoader when importing is enabled (NFC) · 19ef4fad
      Mehdi Amini authored
      Summary:
      This is a relatively simple scheme: we use the index emitted in the
      bitcode to avoid loading all the global metadata. Instead we load the
      index, which records each record's position in the bitcode, so that we
      can load each record individually. Materializing the global metadata
      block in this mode only triggers loading the named metadata and the
      nodes referenced from there (transitively). When materializing a
      function, metadata from the global block are loaded lazily as they are
      referenced.
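      
      A simplified sketch (not the MetadataLoader implementation) of the
      lazy-loading scheme: keep only each global metadata record's position,
      and parse a record the first time something asks for it. The types and
      the parse callback are placeholders:
      
        #include <cstddef>
        #include <cstdint>
        #include <functional>
        #include <optional>
        #include <vector>
        
        struct MDNodeStub {};
        
        class LazyMetadataIndex {
          std::vector<uint64_t> BitOffsets;              // index from the bitcode
          std::vector<std::optional<MDNodeStub>> Loaded; // materialized on demand
          std::function<MDNodeStub(uint64_t)> ParseAt;   // reads one record
        
        public:
          LazyMetadataIndex(std::vector<uint64_t> Offsets,
                            std::function<MDNodeStub(uint64_t)> Parser)
              : BitOffsets(std::move(Offsets)), Loaded(BitOffsets.size()),
                ParseAt(std::move(Parser)) {}
        
          // Called when materialization references metadata number ID.
          MDNodeStub &get(size_t ID) {
            if (!Loaded[ID])
              Loaded[ID] = ParseAt(BitOffsets[ID]); // load just this record
            return *Loaded[ID];
          }
        };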
      
      Two main current limitations are:
      
      1) Global values other than functions are not materialized on demand,
      so we need to eagerly load METADATA_GLOBAL_DECL_ATTACHMENT records
      (and their transitive dependencies).
      2) When we load a single metadata node, we don't recurse on its
      operands; instead we use a placeholder or a temporary metadata node.
      Unfortunately temporary nodes are very expensive. This is why this mode
      is not always enabled and is currently used only when importing.
      
      These two limitations can be lifted in a subsequent improvement if
      needed.
      
      With this change, the total link time of opt with ThinLTO and Debug
      Info enabled goes down from 282s to 224s (~20%).
      
      Reviewers: pcc, tejohnson, dexonsmith
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D28113
      
      llvm-svn: 291027
  5. Dec 28, 2016
    • [PM] Teach the inliner's call graph update to handle inserting new edges · 9900d18b
      Chandler Carruth authored
      when they are call edges at the leaf but may (transitively) be reached
      via ref edges.
      
      It turns out there is a simple rule: insert everything as a ref edge,
      which is a safe, conservative default. Then we let the existing update
      logic handle promoting some of those to call edges.
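      
      A hypothetical sketch of that rule, with stand-in types rather than the
      real LazyCallGraph API: new edges start life as ref edges, and a later
      pass over actual call sites promotes the ones that are real calls:
      
        #include <map>
        #include <set>
        #include <string>
        #include <utility>
        
        enum class EdgeKind { Ref, Call };
        
        struct Graph {
          std::map<std::string, std::map<std::string, EdgeKind>> Edges;
        
          // Safe conservative default: every newly inserted edge is a ref edge.
          void insertNewEdge(const std::string &From, const std::string &To) {
            Edges[From].emplace(To, EdgeKind::Ref);
          }
        
          // Existing update logic: promote edges that are direct calls.
          void promoteCallEdges(
              const std::set<std::pair<std::string, std::string>> &DirectCalls) {
            for (const auto &[From, To] : DirectCalls) {
              auto FIt = Edges.find(From);
              if (FIt == Edges.end())
                continue;
              auto EIt = FIt->second.find(To);
              if (EIt != FIt->second.end())
                EIt->second = EdgeKind::Call;
            }
          }
        };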
      
      Note that it would be fairly cheap to make these call edges right away
      if that is desirable by testing whether there is some existing call path
      from the source to the target. It just seemed like slightly more
      complexity in this code path that isn't strictly necessary. If anyone
      feels strongly about handling this differently I'm happy to change it.
      
      llvm-svn: 290649
  6. Dec 27, 2016
    • [PM] Add one of the features left out of the initial inliner patch: · 141bf5d1
      Chandler Carruth authored
      skipping indirectly recursive inline chains.
      
      To do this, we implicitly build an inline stack for each callsite and
      check prior to inlining that doing so would not form a cycle. This uses
      the exact same technique and even shares some code with the legacy PM
      inliner.
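      
      A small sketch of the cycle check, simplified and with placeholder
      types; each call site carries the chain of functions already inlined
      through to reach it, and a callee already on that chain is skipped:
      
        #include <string>
        #include <vector>
        
        using InlineHistory = std::vector<std::string>; // functions inlined through
        
        // Inlining Callee here would close an indirect recursion if Callee is
        // already somewhere on this call site's inline history.
        bool wouldFormInlineCycle(const InlineHistory &History,
                                  const std::string &Callee) {
          for (const std::string &F : History)
            if (F == Callee)
              return true;
          return false;
        }
        
        // Call sites pulled in from Callee's body inherit History + {Callee}.
        InlineHistory extendHistory(InlineHistory History,
                                    const std::string &Callee) {
          History.push_back(Callee);
          return History;
        }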
      
      This solution remains deeply unsatisfying to me because it means we
      cannot actually iterate the inliner externally; an external iteration
      would not be able to easily detect and avoid such cycles. Some day
      I would very much like to have a solution that works without this
      internal state to detect cycles, but this is not that day.
      
      llvm-svn: 290590
    • [PM] Teach the inliner in the new PM to merge attributes after inlining. · 03130d98
      Chandler Carruth authored
      Also enable the new PM in the attributes test case which caught this
      issue.
      
      llvm-svn: 290572
    • [PM] Teach the always inliner in the new pass manager to support · 6e9bb7e0
      Chandler Carruth authored
      removing fully-dead comdats without removing dead entries in comdats
      with live members.
      
      This factors the core logic out of the current inliner's internals into
      a reusable utility and leverages it in both places. The factored-out
      code should also be slightly more efficient in cases where we have very
      few dead functions or dead comdats to consider.
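      
      A rough sketch of the comdat rule described above (not the factored-out
      utility itself), using plain strings as stand-ins: a dead function in a
      comdat may only be erased when every member of that comdat is dead:
      
        #include <map>
        #include <set>
        #include <string>
        #include <vector>
        
        // Returns the dead functions that are actually safe to erase. Assumes
        // ComdatMembers has an entry for every comdat named in ComdatOf.
        std::vector<std::string> removableDeadFunctions(
            const std::set<std::string> &DeadFunctions,
            const std::map<std::string, std::string> &ComdatOf,
            const std::map<std::string, std::vector<std::string>> &ComdatMembers) {
          std::vector<std::string> Removable;
          for (const std::string &F : DeadFunctions) {
            auto It = ComdatOf.find(F);
            if (It == ComdatOf.end()) { // not in a comdat: always removable
              Removable.push_back(F);
              continue;
            }
            bool AllMembersDead = true;
            for (const std::string &M : ComdatMembers.at(It->second))
              if (!DeadFunctions.count(M))
                AllMembersDead = false; // a live member keeps the comdat alive
            if (AllMembersDead)
              Removable.push_back(F);
          }
          return Removable;
        }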
      
      I've added a test case to cover this behavior of the always inliner.
      This is the last significant bug in the new PM's always inliner I've
      found (so far).
      
      llvm-svn: 290557
  7. Dec 24, 2016
    • [PM] Teach the always inlining test case to be much more strict about · 4eaff12b
      Chandler Carruth authored
      whether functions are removed, and fix the new PM's always inliner to
      actually pass this test.
      
      Without this, the new PM's always inliner leaves all the functions
      kicking around, which won't work out very well given the semantics of
      always inline.
      
      Doing this really highlights how frustrating the current alwaysinline
      semantic contract is though -- why can we put it on *external*
      functions, etc?
      
      Also I've added a number of tricky and interesting test cases for
      removing functions with the always inliner. There is one remaining case
      not handled -- fully removing comdats -- and I've left a FIXME about
      this.
      
      llvm-svn: 290457
  8. Dec 21, 2016
    • [LDist] Match behavior between invoking via optimization pipeline or opt -loop-distribute · 32e6a34c
      Adam Nemet authored
      In r267672, where the loop distribution pragma was introduced, I tried
      hard to keep the old behavior for opt: when opt is invoked
      with -loop-distribute, it should distribute the loop (it's off by
      default when run via the optimization pipeline).
      
      As MichaelZ has discovered, this has the unintended consequence of
      breaking a very common developer workflow for reproducing compilations
      with opt: first you print clang's pass pipeline
      with -debug-pass=Arguments, and then you invoke opt with the returned
      arguments.
      
      clang -debug-pass will include -loop-distribute, but the pass is invoked
      with its default off, so nothing happens unless the loop carries the
      pragma, whereas through opt (default on) we would try to distribute all
      loops.
      
      This changes opt's default to off as well to match clang.  The tests are
      modified to explicitly enable the transformation.
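      
      A hedged sketch of the resulting behavior (not the actual LoopDistribute
      code, and the flag name here is made up): the switch defaults to off in
      opt as well, so a loop is only distributed when the switch is explicitly
      enabled or the loop carries the distribution pragma:
      
        #include "llvm/Support/CommandLine.h"
        using namespace llvm;
        
        static cl::opt<bool> DistributeAllLoops(
            "example-distribute-all-loops", cl::init(false), // off by default
            cl::desc("Distribute all loops, not only pragma-marked ones"));
        
        static bool shouldDistributeLoop(bool LoopHasDistributePragma) {
          return DistributeAllLoops || LoopHasDistributePragma;
        }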
      
      llvm-svn: 290235
    • IPO: Remove the ModuleSummary argument to the FunctionImport pass. NFCI. · 598bd2a2
      Peter Collingbourne authored
      No existing client is passing a non-null value here. This will come back
      in a slightly different form as part of the type identifier summary work.
      
      Differential Revision: https://reviews.llvm.org/D28006
      
      llvm-svn: 290222
  9. Dec 20, 2016
    • [PM] Provide an initial, minimal port of the inliner to the new pass manager. · 1d963114
      Chandler Carruth authored
      This doesn't implement *every* feature of the existing inliner, but
      tries to implement the most important ones for building a functional
      optimization pipeline and beginning to sort out bugs, regressions, and
      other problems.
      
      Notable but intentional omissions:
      - No alloca merging support. Why? Because it isn't clear we want to do
        this at all. Active discussion and investigation is going on to remove
        it, so for simplicity I omitted it.
      - No support for trying to iterate on "internally" devirtualized calls.
        Why? Because it adds what I suspect is inappropriate coupling for
        little or no benefit. We will have an outer iteration system that
        tracks devirtualization including that from function passes and
        iterates already. We should improve that rather than approximate it
        here.
      - Optimization remarks. Why? Purely to make the patch smaller, no other
        reason at all.
      
      The last one I'll probably work on almost immediately, but I wanted to
      skip it in the initial patch to keep the change as focused as possible:
      there is already a lot of code moving around, and both of these *could*
      be skipped without really disrupting the core logic.
      
      A summary of the different things happening here:
      
      1) Adding the usual new PM class and rigging.
      
      2) Fixing minor underlying assumptions in the inline cost analysis or
         inline logic that don't generally hold in the new PM world.
      
      3) Adding the core pass logic which is in essence a loop over the calls
         in the nodes in the call graph. This is a bit duplicated from the old
         inliner, but only a handful of lines could realistically be shared.
         (I tried at first, and it really didn't help anything.) All told,
         this is only about 100 lines of code, and most of that is the
         mechanics of wiring up analyses from the new PM world. (A simplified
         sketch of this loop follows after the list.)
      
      4) Updating the LazyCallGraph (in the new PM) based on the *newly
         inlined* calls and references. This is very minimal because we cannot
         form cycles.
      
      5) When inlining removes the last use of a function, eagerly nuking the
         body of the function so that any "one use remaining" inline cost
         heuristics are immediately refined, and queuing these functions to be
         completely deleted once inlining is complete and the call graph
         updated to reflect that they have become dead.
      
      6) After all the inlining for a particular function, updating the
         LazyCallGraph and the CGSCC pass manager to reflect the
         function-local simplifications that are done immediately and
         internally by the inline utilities. These are the exact same
         fundamental set of CG updates done by arbitrary function passes.
      
      7) Adding a bunch of test cases to specifically target CGSCC and other
         subtle aspects in the new PM world.
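      
      A highly simplified sketch of the shape of items 3) through 5), with
      stand-in types rather than the real CGSCC/LazyCallGraph API: loop over
      the call sites in the SCC's functions, inline the profitable ones, and
      remember callees whose last use disappeared so they can be deleted once
      the call graph has been updated:
      
        #include <functional>
        #include <set>
        #include <vector>
        
        struct Func;
        struct CallSite { Func *Caller; Func *Callee; };
        
        struct Func {
          std::vector<CallSite> Calls; // call sites in this function
          unsigned Uses = 0;           // remaining uses of this function
        };
        
        void runInlinerOnSCC(
            std::vector<Func *> &SCCNodes,
            const std::function<bool(const CallSite &)> &ShouldInline,
            const std::function<void(const CallSite &)> &InlineBody) {
          std::set<Func *> DeadCallees;
          for (Func *F : SCCNodes) {
            // Core logic: a loop over the calls in the nodes of the call
            // graph. (The real pass also visits call sites created by the
            // inlining itself; this copy keeps the sketch simple.)
            std::vector<CallSite> Work = F->Calls;
            for (const CallSite &CS : Work) {
              if (!ShouldInline(CS))
                continue;
              InlineBody(CS); // splice the callee's body into the caller
              if (CS.Callee->Uses > 0 && --CS.Callee->Uses == 0)
                DeadCallees.insert(CS.Callee); // nuke body now, delete later
            }
          }
          // ... update the call graph for the newly inlined calls/references,
          // then delete the functions in DeadCallees once the graph reflects
          // that they are dead ...
        }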
      
      Many thanks to the careful review from Easwaran and Sanjoy and others!
      
      Differential Revision: https://reviews.llvm.org/D24226
      
      llvm-svn: 290161