Skip to content
  1. Sep 28, 2015
    • Sean Silva's avatar
      [GlobalOpt] Sort members of llvm.used deterministically · ace7818c
      Sean Silva authored
      Patch by Jake VanAdrighem!
      
      Summary:
      Fix the way we sort the llvm.used and llvm.compiler.used members.
      
      This bug seems to have been introduced in rL183756 through a set of improper casts to GlobalValue*. In subsequent patches this problem was missed and transformed into a getName call on a ConstantExpr.
      
      Reviewers: silvas
      
      Subscribers: silvas, llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D12851
      
      llvm-svn: 248728
      ace7818c
  2. Sep 24, 2015
    • Michael Zolotukhin's avatar
      Add CFG Simplification pass after Loop Unswitching. · 74621cce
      Michael Zolotukhin authored
      Loop unswitching produces conditional branches with constant condition,
      and it's beneficial for later passes to clean this up with simplify-cfg.
      We do this after the second invocation of loop-unswitch, but not after
      the first one. Not doing so might cause problem for passes like
      LoopUnroll, whose estimate of loop body size would be less accurate.
      
      Reviewers: hfinkel
      
      Differential Revision: http://reviews.llvm.org/D13064
      
      llvm-svn: 248460
      74621cce
  3. Sep 23, 2015
    • David Majnemer's avatar
      [DeadArgElim] Split the invoke successor edge · fa36bde2
      David Majnemer authored
      Invoking a function which returns an aggregate can sometimes be
      transformed to return a scalar value.  However, this means that we need
      to create an insertvalue instruction(s) to recreate the correct
      aggregate type.  We achieved this by inserting an insertvalue
      instruction at the invoke's normal successor.  However, this is not
      feasible if the normal successor uses the invoke's return value inside a
      PHI node.
      
      Instead, split the edge between the invoke and the unwind successor and
      create the insertvalue instruction in the new basic block.  The new
      basic block's successor will be the old invoke successor which leaves
      us with IR which is well behaved.
      
      This fixes PR24906.
      
      llvm-svn: 248387
      fa36bde2
  4. Sep 21, 2015
  5. Sep 15, 2015
  6. Sep 14, 2015
    • David Blaikie's avatar
      [opaque pointer types] Switch a few cases of getElementType over, since I had... · 6614d8d2
      David Blaikie authored
      [opaque pointer types] Switch a few cases of getElementType over, since I had them lying around anyway
      
      llvm-svn: 247610
      6614d8d2
    • David Blaikie's avatar
      Revert "[opaque pointer type] Pass GlobalAlias the actual pointer type rather... · 16a2f3e3
      David Blaikie authored
      Revert "[opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space"
      
      This was a flawed change - it just caused the getElementType call to be
      deferred until later, when we really need to remove it. Now that the IR
      for GlobalAliases has been updated, the root cause is addressed that way
      instead and this change is no longer needed (and in fact gets in the way
      - because we want to pass the pointee type directly down further).
      
      Follow up patches to push this through GlobalValue, bitcode format, etc,
      will come along soon.
      
      This reverts commit 236160.
      
      llvm-svn: 247585
      16a2f3e3
    • JF Bastien's avatar
      [MergeFuncs] Fix bug in merging GetElementPointers · 26aca14b
      JF Bastien authored
      GetElementPointers must have the first argument's type compared
      for structural equivalence. Previously the code erroneously compared the
      pointer's type, but this code was dead because all pointer types (of the
      same address space) are the same. The pointee must be compared instead
      (using the type stored in the GEP, not from the pointer type which will
      be erased anyway).
      
      Author: jrkoenig
      Reviewers: dschuff, nlewycky, jfb
      Subscribers: nlewycky, llvm-commits
      Differential revision: http://reviews.llvm.org/D12820
      
      llvm-svn: 247570
      26aca14b
  7. Sep 13, 2015
  8. Sep 10, 2015
    • JF Bastien's avatar
      [MergeFuncs] Fix callsite attributes in thunk generation · fa946233
      JF Bastien authored
      This change correctly sets the attributes on the callsites
      generated in thunks. This makes sure things such as sret, sext, etc.
      are correctly set, so that the call can be a proper tailcall.
      
      Also, the transfer of attributes in the replaceDirectCallers function
      appears to be unnecessary, but until this is confirmed it will remain.
      
      Author: jrkoenig
      Reviewers: dschuff, jfb
      Subscribers: llvm-commits, nlewycky
      Differential revision: http://reviews.llvm.org/D12581
      
      llvm-svn: 247313
      fa946233
    • James Molloy's avatar
      Enable GlobalsAA by default · d47634d7
      James Molloy authored
      This can give significant improvements to alias analysis in some situations, and improves its testing coverage in all situations.
      
      llvm-svn: 247264
      d47634d7
    • Peter Collingbourne's avatar
      LowerBitSets: Fix non-determinism bug. · 1cbc91ec
      Peter Collingbourne authored
      Visit disjoint sets in a deterministic order based on the maximum BitSetNM
      index, otherwise the order in which we visit them will depend on pointer
      comparisons. This was being exposed by MSan.
      
      llvm-svn: 247201
      1cbc91ec
  9. Sep 09, 2015
    • Chandler Carruth's avatar
      [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible · 7b560d40
      Chandler Carruth authored
      with the new pass manager, and no longer relying on analysis groups.
      
      This builds essentially a ground-up new AA infrastructure stack for
      LLVM. The core ideas are the same that are used throughout the new pass
      manager: type erased polymorphism and direct composition. The design is
      as follows:
      
      - FunctionAAResults is a type-erasing alias analysis results aggregation
        interface to walk a single query across a range of results from
        different alias analyses. Currently this is function-specific as we
        always assume that aliasing queries are *within* a function.
      
      - AAResultBase is a CRTP utility providing stub implementations of
        various parts of the alias analysis result concept, notably in several
        cases in terms of other more general parts of the interface. This can
        be used to implement only a narrow part of the interface rather than
        the entire interface. This isn't really ideal, this logic should be
        hoisted into FunctionAAResults as currently it will cause
        a significant amount of redundant work, but it faithfully models the
        behavior of the prior infrastructure.
      
      - All the alias analysis passes are ported to be wrapper passes for the
        legacy PM and new-style analysis passes for the new PM with a shared
        result object. In some cases (most notably CFL), this is an extremely
        naive approach that we should revisit when we can specialize for the
        new pass manager.
      
      - BasicAA has been restructured to reflect that it is much more
        fundamentally a function analysis because it uses dominator trees and
        loop info that need to be constructed for each function.
      
      All of the references to getting alias analysis results have been
      updated to use the new aggregation interface. All the preservation and
      other pass management code has been updated accordingly.
      
      The way the FunctionAAResultsWrapperPass works is to detect the
      available alias analyses when run, and add them to the results object.
      This means that we should be able to continue to respect when various
      passes are added to the pipeline, for example adding CFL or adding TBAA
      passes should just cause their results to be available and to get folded
      into this. The exception to this rule is BasicAA which really needs to
      be a function pass due to using dominator trees and loop info. As
      a consequence, the FunctionAAResultsWrapperPass directly depends on
      BasicAA and always includes it in the aggregation.
      
      This has significant implications for preserving analyses. Generally,
      most passes shouldn't bother preserving FunctionAAResultsWrapperPass
      because rebuilding the results just updates the set of known AA passes.
      The exception to this rule are LoopPass instances which need to preserve
      all the function analyses that the loop pass manager will end up
      needing. This means preserving both BasicAAWrapperPass and the
      aggregating FunctionAAResultsWrapperPass.
      
      Now, when preserving an alias analysis, you do so by directly preserving
      that analysis. This is only necessary for non-immutable-pass-provided
      alias analyses though, and there are only three of interest: BasicAA,
      GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
      preserved when needed because it (like DominatorTree and LoopInfo) is
      marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
      set everywhere we previously were preserving all of AliasAnalysis, and
      I've added SCEVAA in the intersection of that with where we preserve
      SCEV itself.
      
      One significant challenge to all of this is that the CGSCC passes were
      actually using the alias analysis implementations by taking advantage of
      a pretty amazing set of loop holes in the old pass manager's analysis
      management code which allowed analysis groups to slide through in many
      cases. Moving away from analysis groups makes this problem much more
      obvious. To fix it, I've leveraged the flexibility the design of the new
      PM components provides to just directly construct the relevant alias
      analyses for the relevant functions in the IPO passes that need them.
      This is a bit hacky, but should go away with the new pass manager, and
      is already in many ways cleaner than the prior state.
      
      Another significant challenge is that various facilities of the old
      alias analysis infrastructure just don't fit any more. The most
      significant of these is the alias analysis 'counter' pass. That pass
      relied on the ability to snoop on AA queries at different points in the
      analysis group chain. Instead, I'm planning to build printing
      functionality directly into the aggregation layer. I've not included
      that in this patch merely to keep it smaller.
      
      Note that all of this needs a nearly complete rewrite of the AA
      documentation. I'm planning to do that, but I'd like to make sure the
      new design settles, and to flesh out a bit more of what it looks like in
      the new pass manager first.
      
      Differential Revision: http://reviews.llvm.org/D12080
      
      llvm-svn: 247167
      7b560d40
    • Peter Collingbourne's avatar
      Re-apply r247080 with order of evaluation fix. · 8d24ae94
      Peter Collingbourne authored
      llvm-svn: 247095
      8d24ae94
    • Peter Collingbourne's avatar
      Revert r247080, "LowerBitSets: Extend pass to support functions as bitset · 07f3af2e
      Peter Collingbourne authored
      members." as it causes test failures on a number of bots.
      
      llvm-svn: 247088
      07f3af2e
  10. Sep 08, 2015
    • Peter Collingbourne's avatar
      LowerBitSets: Extend pass to support functions as bitset members. · c634ed0b
      Peter Collingbourne authored
      This change extends the bitset lowering pass to support bitsets that may
      contain either functions or global variables. A function bitset is lowered to
      a jump table that is laid out before one of the functions in the bitset.
      
      Also add support for non-string bitset identifier names. This allows for
      distinct metadata nodes to stand in for names with internal linkage,
      as done in D11857.
      
      Differential Revision: http://reviews.llvm.org/D11856
      
      llvm-svn: 247080
      c634ed0b
  11. Sep 04, 2015
  12. Sep 03, 2015
    • JF Bastien's avatar
      [MergeFuncs] Efficiently defer functions on merge · 3a4ad61c
      JF Bastien authored
      Summary:
      This patch introduces a side table in Merge Functions to
      efficiently remove functions from the function set when functions
      they refer to are merged. Previously these functions would need to
      be compared lg(N) times to find the appropriate FunctionNode in the
      tree to defer. With the recent determinism changes, this comparison
      is more expensive. In addition, the removal function would not always
      actually remove the function from the set (i.e. after remove(F),
      there would sometimes still be a node in the tree which contains F).
      
      With these changes, these functions are properly deferred, and so more
      functions can be merged. In addition, when there are many merged
      functions (and thus more deferred functions), there is a speedup:
      
      chromium: 48678 merged -> 49380 merged; 6.58s -> 5.49s
      libxul.so: 41004 merged -> 41030 merged; 8.02s -> 6.94s
      mysqld: 1607 merged -> 1607 merged (same); 0.215s -> 0.212s (probably noise)
      
      Author: jrkoenig
      Reviewers: jfb, dschuff
      Subscribers: llvm-commits, nlewycky
      Differential revision: http://reviews.llvm.org/D12537
      
      llvm-svn: 246735
      3a4ad61c
  13. Sep 02, 2015
  14. Sep 01, 2015
  15. Aug 31, 2015
    • Hans Wennborg's avatar
      Fix Windows build by including raw_ostream.h · 043bf5b2
      Hans Wennborg authored
      llvm-svn: 246486
      043bf5b2
    • Philip Reames's avatar
      [FunctionAttr] Infer nonnull attributes on returns · a88caeab
      Philip Reames authored
      Teach FunctionAttr to infer the nonnull attribute on return values of functions which never return a potentially null value. This is done both via a conservative local analysis for the function itself and a optimistic per-SCC analysis. If no function in the SCC returns anything which could be null (other than values from other functions in the SCC), we can conclude no function returned a null pointer. Even if some function within the SCC returns a null pointer, we may be able to locally conclude that some don't.
      
      Differential Revision: http://reviews.llvm.org/D9688
      
      llvm-svn: 246476
      a88caeab
  16. Aug 28, 2015
    • JF Bastien's avatar
      Remove Merge Functions pointer comparisons · f5aa1ca6
      JF Bastien authored
      Summary:
      This patch removes two remaining places where pointer value comparisons
      are used to order functions: comparing range annotation metadata, and comparing
      block address constants. (These are both rare cases, and so no actual
      non-determinism was observed from either case).
      
      The fix for range metadata is simple: the annotation always consists of a pair
      of integers, so we just order by those integers.
      
      The fix for block addresses is more subtle. Two constants are the same if they
      are the same basic block in the same function, or if they refer to corresponding
      basic blocks in each respective function. Note that in the first case, merging
      is trivially correct. In the second, the correctness of merging relies on the
      fact that the the values of block addresses cannot be compared. This change is
      actually an enhancement, as these functions could not previously be merged (see
      merge-block-address.ll).
      
      There is still a problem with cross function block addresses, in that constants
      pointing to a basic block in a merged function is not updated.
      
      This also more robustly compares floating point constants by all fields of their
      semantics, and fixes a dyn_cast/cast mixup.
      
      Author: jrkoenig
      Reviewers: dschuff, nlewycky, jfb
      Subscribers llvm-commits
      Differential revision: http://reviews.llvm.org/D12376
      
      llvm-svn: 246305
      f5aa1ca6
  17. Aug 26, 2015
    • Diego Novillo's avatar
      Fix memory leak in sample profile pass. · 7732ae4a
      Diego Novillo authored
      The problem here were the function analyses invoked by the function pass
      manager from the new IPO pass. I looked at other IPO passes needing
      dominance information and the only one that requires it (partial
      inliner) does not use the standard dependency mechanism.
      
      This patch mimics what the partial inliner does to compute dominance,
      post-dominance and loop info. One thing I like about this approach is
      that I can delay the computation of all this until I actually need it.
      
      This should bring the ASAN buildbot back to green. If there's a better
      way to fix this, I'll do it in a follow-up patch.
      
      llvm-svn: 246066
      7732ae4a
    • JF Bastien's avatar
      Comparing operands should not require the same ValueID · 9dc042a0
      JF Bastien authored
      Summary: When comparing basic blocks, there is an additional check that two Value*'s should have the same ID, which interferes with merging equivalent constants of different kinds (such as a ConstantInt and a ConstantPointerNull in the included testcase). The cmpValues function already ensures that the two values in each function are the same, so removing this check should not cause incorrect merging.
      
      Also, the type comparison is redundant, based on reviewing the code and testing on the test suite and several large LTO bitcodes.
      
      Author: jrkoenig
      Reviewers: nlewycky, jfb, dschuff
      Subscribers: llvm-commits
      Differential revision: http://reviews.llvm.org/D12302
      
      llvm-svn: 246001
      9dc042a0
  18. Aug 25, 2015
  19. Aug 24, 2015
    • Mehdi Amini's avatar
      Require Dominator Tree For SROA, improve compile-time · d134a67c
      Mehdi Amini authored
      TL-DR: SROA is followed by EarlyCSE which requires the DominatorTree.
      There is no reason not to require it up-front for SROA.
      
      Some history is necessary to understand why we ended-up here.
      
      r123437 switched the second (Legacy)SROA in the optimizer pipeline to
      use SSAUpdater in order to avoid recomputing the costly
      DominanceFrontier. The purpose was to speed-up the compile-time.
      
      Later r123609 removed the need for the DominanceFrontier in
      (Legacy)SROA.
      
      Right after, some cleanup was made in r123724 to remove any reference
      to the DominanceFrontier. SROA existed in two flavors: SROA_SSAUp and
      SROA_DT (the latter replacing SROA_DF).
      The second argument of `createScalarReplAggregatesPass` was renamed
      from `UseDomFrontier` to `UseDomTree`.
      I believe this is were a mistake was made. The pipeline was not
      updated and the call site was still:
          PM->add(createScalarReplAggregatesPass(-1, false));
      
      At that time, SROA was immediately followed in the pipeline by
      EarlyCSE which required alread the DominatorTree. Not requiring
      the DominatorTree in SROA didn't save anything, but unfortunately
      it was lost at this point.
      
      When the new SROA Pass was introduced in r163965, I believe the goal
      was to have an exact replacement of the existing SROA, this bug
      slipped through.
      
      You can see currently:
      
      $ echo "" | clang -x c++  -O3 -c - -mllvm -debug-pass=Structure
      ...
      ...
            FunctionPass Manager
              SROA
              Dominator Tree Construction
              Early CSE
      
      After this patch:
      
      $ echo "" | clang -x c++  -O3 -c - -mllvm -debug-pass=Structure
      ...
      ...
            FunctionPass Manager
              Dominator Tree Construction
              SROA
              Early CSE
      
      This improves the compile time from 88s to 23s for PR17855.
      https://llvm.org/bugs/show_bug.cgi?id=17855
      
      And from 113s to 12s for PR16756
      https://llvm.org/bugs/show_bug.cgi?id=16756
      
      Reviewers: chandlerc
      
      Differential Revision: http://reviews.llvm.org/D12267
      
      From: Mehdi Amini <mehdi.amini@apple.com>
      llvm-svn: 245820
      d134a67c
  20. Aug 22, 2015
    • JF Bastien's avatar
      Improve the determinism of MergeFunctions · 057292a7
      JF Bastien authored
      Summary:
      
      Merge functions previously relied on unsigned comparisons of pointer values to
      order functions. This caused observable non-determinism in the compiler for
      large bitcode programs. Basically, opt -mergefuncs program.bc | md5sum produces
      different hashes when run repeatedly on the same machine. Differing output was
      observed on three large bitcodes, but it was less frequent on the smallest file.
      It is possible that this only manifests on the large inputs, hence remaining
      undetected until now.
      
      This patch fixes this by removing (almost, see below) all places where
      comparisons between pointers are used to order functions. Most of these changes
      are local, but the comparison of global values requires assigning an identifier
      to each local in the order it is visited. This is very similar to the way the
      comparison function identifies Value*'s defined within a function. Because the
      order of visiting the functions and their subparts is deterministic, the
      identifiers assigned to the globals will be as well, and the order of functions
      will be deterministic.
      
      With these changes, there is no more observed non-determinism. There is also
      only minor slowdowns (negligible to 4%) compared to the baseline, which is
      likely a result of the fact that global comparisons involve hash lookups and not
      just pointer comparisons.
      
      The one caveat so far is that programs containing BlockAddress constants can
      still be non-deterministic. It is not clear what the right solution is here. In
      particular, even if the global numbers are used to order by function, we still
      need a way to order the BasicBlock*'s. Unfortunately, we cannot just bail out
      and fail to order the functions or consider them equal, because we require a
      total order over functions. Note that programs with BlockAddress constants are
      relatively rare, so the impact of leaving this in is minor as long as this pass
      is opt-in.
      
      Author: jrkoenig
      
      Reviewers: nlewycky, jfb, dschuff
      
      Subscribers: jevinskie, llvm-commits, chapuni
      
      Differential revision: http://reviews.llvm.org/D12168
      
      llvm-svn: 245762
      057292a7
  21. Aug 18, 2015
    • Chandler Carruth's avatar
      [PM/AA] Remove the last relics of the separate IPA library from LLVM, · 7adc3a2b
      Chandler Carruth authored
      folding the code into the main Analysis library.
      
      There already wasn't much of a distinction between Analysis and IPA.
      A number of the passes in Analysis are actually IPA passes, and there
      doesn't seem to be any advantage to separating them.
      
      Moreover, it makes it hard to have interactions between analyses that
      are both local and interprocedural. In trying to make the Alias Analysis
      infrastructure work with the new pass manager, it becomes particularly
      awkward to navigate this split.
      
      I've tried to find all the places where we referenced this, but I may
      have missed some. I have also adjusted the C API to continue to be
      equivalently functional after this change.
      
      Differential Revision: http://reviews.llvm.org/D12075
      
      llvm-svn: 245318
      7adc3a2b
  22. Aug 16, 2015
  23. Aug 15, 2015
    • JF Bastien's avatar
      Accelerate MergeFunctions with hashing · 5e4303dc
      JF Bastien authored
      This patch makes the Merge Functions pass faster by calculating and comparing
      a hash value which captures the essential structure of a function before
      performing a full function comparison.
      
      The hash is calculated by hashing the function signature, then walking the basic
      blocks of the function in the same order as the main comparison function. The
      opcode of each instruction is hashed in sequence, which means that different
      functions according to the existing total order cannot have the same hash, as
      the comparison requires the opcodes of the two functions to be the same order.
      
      The hash function is a static member of the FunctionComparator class because it
      is tightly coupled to the exact comparison function used. For example, functions
      which are equivalent modulo a single variant callsite might be merged by a more
      aggressive MergeFunctions, and the hash function would need to be insensitive to
      these differences in order to exploit this.
      
      The hashing function uses a utility class which accumulates the values into an
      internal state using a standard bit-mixing function. Note that this is a different interface
      than a regular hashing routine, because the values to be hashed are scattered
      amongst the properties of a llvm::Function, not linear in memory. This scheme is
      fast because only one word of state needs to be kept, and the mixing function is
      a few instructions.
      
      The main runOnModule function first computes the hash of each function, and only
      further processes functions which do not have a unique function hash. The hash
      is also used to order the sorted function set. If the hashes differ, their
      values are used to order the functions, otherwise the full comparison is done.
      
      Both of these are helpful in speeding up MergeFunctions. Together they result in
      speedups of 9% for mysqld (a mostly C application with little redundancy), 46%
      for libxul in Firefox, and 117% for Chromium. (These are all LTO builds.) In all
      three cases, the new speed of MergeFunctions is about half that of the module
      verifier, making it relatively inexpensive even for large LTO builds with
      hundreds of thousands of functions. The same functions are merged, so this
      change is free performance.
      
      Author: jrkoenig
      
      Reviewers: nlewycky, dschuff, jfb
      
      Subscribers: llvm-commits, aemerson
      
      Differential revision: http://reviews.llvm.org/D11923
      
      llvm-svn: 245140
      5e4303dc
  24. Aug 14, 2015
    • David Majnemer's avatar
      [IR] Add token types · b611e3f5
      David Majnemer authored
      This introduces the basic functionality to support "token types".
      The motivation stems from the need to perform operations on a Value
      whose provenance cannot be obscured.
      
      There are several applications for such a type but my immediate
      motivation stems from WinEH.  Our personality routine enforces a
      single-entry - single-exit regime for cleanups.  After several rounds of
      optimizations, we may be left with a terminator whose "cleanup-entry
      block" is not entirely clear because control flow has merged two
      cleanups together.  We have experimented with using labels as operands
      inside of instructions which are not terminators to indicate where we
      came from but found that LLVM does not expect such exotic uses of
      BasicBlocks.
      
      Instead, we can use this new type to clearly associate the "entry point"
      and "exit point" of our cleanup.  This is done by having the cleanuppad
      yield a Token and consuming it at the cleanupret.
      The token type makes it impossible to obscure or otherwise hide the
      Value, making it trivial to track the relationship between the two
      points.
      
      What is the burden to the optimizer?  Well, it turns out we have already
      paid down this cost by accepting that there are certain calls that we
      are not permitted to duplicate, optimizations have to watch out for
      such instructions anyway.  There are additional places in the optimizer
      that we will probably have to update but early examination has given me
      the impression that this will not be heroic.
      
      Differential Revision: http://reviews.llvm.org/D11861
      
      llvm-svn: 245029
      b611e3f5
Loading