  1. Oct 02, 2013
    • Remove the very substantial, largely unmaintained legacy PGO infrastructure. · ea564946
      Chandler Carruth authored
      
      This was essentially work toward PGO based on a design that had several
      flaws, partially dating from a time when LLVM had a different
      architecture, and with an effort to modernize it abandoned without being
      completed. Since then, it has bitrotted for several years further. The
      result is nearly unusable, and isn't helping any of the modern PGO
      efforts. Instead, it is getting in the way, adding confusion about PGO
      in LLVM and distracting everyone with maintenance on essentially dead
      code. Removing it paves the way for modern efforts around PGO.
      
      Among other effects, this removes the last of the runtime libraries from
      LLVM. Those are being developed in the separate 'compiler-rt' project
      now, with somewhat different licensing specifically more appropriate for
      runtimes.
      
      llvm-svn: 191835
  2. Oct 01, 2013
  3. Sep 30, 2013
  4. Sep 28, 2013
  5. Sep 24, 2013
  6. Sep 22, 2013
  7. Sep 16, 2013
  8. Sep 14, 2013
  9. Sep 10, 2013
  10. Sep 09, 2013
    • Revert patches to add case-range support for PR1255. · e407736a
      Bob Wilson authored
      The work on this project was left in an unfinished and inconsistent state.
      Hopefully someone will eventually get a chance to implement this feature, but
      in the meantime, it is better to put things back the way they were.  I have
      left support in the bitcode reader to handle the case-range bitcode format,
      so that we do not lose bitcode compatibility with the llvm 3.3 release.
      
      This reverts the following commits: 155464, 156374, 156377, 156613, 156704,
      156757, 156804, 156808, 156985, 157046, 157112, 157183, 157315, 157384, 157575,
      157576, 157586, 157612, 157810, 157814, 157815, 157880, 157881, 157882, 157884,
      157887, 157901, 158979, 157987, 157989, 158986, 158997, 159076, 159101, 159100,
      159200, 159201, 159207, 159527, 159532, 159540, 159583, 159618, 159658, 159659,
      159660, 159661, 159703, 159704, 160076, 167356, 172025, 186736
      
      llvm-svn: 190328
  11. Sep 06, 2013
  12. Aug 31, 2013
  13. Aug 24, 2013
  14. Aug 22, 2013
  15. Aug 19, 2013
  16. Aug 15, 2013
  17. Aug 14, 2013
    • Fix a really terrifying but improbable bug in mem2reg. · 2de93afe
      Chandler Carruth authored
      If you have seen extremely subtle miscompilations (such as a load
      getting replaced with the value stored *below* the load within a basic
      block) related to promoting an alloca to an SSA value, there is the dim
      possibility that you hit this. Please let me know if you won this
      unfortunate lottery.
      
      The first half of mem2reg's core logic (as it is used both in the
      standalone mem2reg pass and in SROA) builds up a mapping from
      'Instruction *' to the index of that instruction within its basic block.
      This allows quickly establishing which stores dominate a particular load
      even for large basic blocks. We cache this information throughout the
      run of mem2reg over a function in order to amortize the cost of
      computing it.
      
      This is not in and of itself a strange pattern in LLVM. However, it
      introduces a very important constraint: absolutely no instruction can be
      deleted from the program without updating the mapping. Otherwise a newly
      allocated instruction might get the same pointer address, and then end
      up with a wrong index. Yes, LLVM routinely suffers from a *single
      threaded* variant of the ABA problem. Most places in LLVM don't find
      avoiding this an imposition because they don't both delete and create
      new instructions iteratively, but mem2reg *loves* to do this... All the
      time. Fortunately, the mem2reg code was really careful about updating
      this cache to handle this eventuality... except when it comes to the
      debug declare intrinsic. Oops. The fix is to invalidate that pointer in
      the cache when we delete it, the same as we do when deleting alloca
      instructions and other instructions.
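
The failure mode described above can be reproduced in miniature. The following is only a sketch, not LLVM code: `Inst`, `InstPool`, and `IndexCache` are hypothetical stand-ins, and the pool deliberately reuses the most recently freed slot so that the address reuse a real heap produces only occasionally happens every time.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an IR instruction; not llvm::Instruction.
struct Inst {
  int opcode = 0;
};

// Toy allocator (an assumption for this demo) that hands the most recently
// freed slot back first, making address reuse deterministic.
class InstPool {
  std::vector<Inst> slots_;  // sized once, so element addresses never move
  std::vector<Inst *> free_;
  std::size_t used_ = 0;

public:
  explicit InstPool(std::size_t capacity) : slots_(capacity) {}

  Inst *create(int opcode) {
    Inst *inst = nullptr;
    if (!free_.empty()) {
      inst = free_.back();
      free_.pop_back();
    } else {
      assert(used_ < slots_.size() && "pool exhausted");
      inst = &slots_[used_++];
    }
    inst->opcode = opcode;
    return inst;
  }

  void destroy(Inst *inst) { free_.push_back(inst); }
};

// Pointer-keyed cache of each instruction's position within its block.
class IndexCache {
  std::unordered_map<const Inst *, std::size_t> index_;

public:
  void numberBlock(const std::vector<Inst *> &block) {
    for (std::size_t i = 0; i < block.size(); ++i)
      index_[block[i]] = i;
  }

  // Returns true and sets `out` if the cache has an entry for `inst`.
  bool lookup(const Inst *inst, std::size_t &out) const {
    auto it = index_.find(inst);
    if (it == index_.end())
      return false;
    out = it->second;
    return true;
  }

  // Drop the entry before the instruction is destroyed.
  void forget(const Inst *inst) { index_.erase(inst); }
};
```

If an instruction is destroyed without calling `forget`, the next `create` returns the same address and `lookup` reports the dead instruction's index for the new one. Calling `forget` first makes the lookup miss, which is the behavior the fix restores for the debug declare intrinsic.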
      
      I've also caused the same bug in new code while working on a fix to
      PR16867, so this seems to be a really unfortunate pattern. Hopefully in
      subsequent patches the deletion of dead instructions can be consolidated
      sufficiently to make it less likely that we'll see future occurrences of
      this bug.
      
      Sorry for not having a test case, but I have literally no idea how to
      reliably trigger this kind of thing. It may be single-threaded, but it
      remains an ABA problem. It would require a really amazing number of
      stars to align.
      
      llvm-svn: 188367
    • Revert r187191, which broke opt -mem2reg on the testcases included in PR16867. · c7776f73
      Nick Lewycky authored
      However, opt -O2 doesn't run mem2reg directly so nobody noticed until r188146
      when SROA started sending more things directly down the PromoteMemToReg path.
      
      In order to revert r187191, I also revert dependent revisions r187296, r187322
      and r188146. Fixes PR16867. Does not add the testcases from that PR, but both
      of them should get added for both mem2reg and sroa when this revert gets
      unreverted.
      
      llvm-svn: 188327
  18. Aug 13, 2013
  19. Aug 12, 2013
  20. Aug 10, 2013
  21. Aug 06, 2013
  22. Aug 05, 2013
    • Introduce an optimisation for special case lists with large numbers of literal entries. · bace6066
      Peter Collingbourne authored
      Our internal regex implementation does not cope with large numbers
      of anchors very efficiently.  Given a ~3600-entry special case list,
      regex compilation can take on the order of seconds.  This patch solves
      the problem for the special case of patterns matching literal global
      names (i.e. patterns with no regex metacharacters).  Rather than
      forming regexes from literal global name patterns, add them to
      a StringSet which is checked before matching against the regex.
      This reduces regex compilation time by a factor on the order of
      thousands when reading the aforementioned special case list, according to a
      completely unscientific study.
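
A minimal sketch of this idea, using the standard library rather than LLVM's own classes: `SpecialCaseMatcher` and `isLiteral` are hypothetical names, `std::unordered_set` stands in for the StringSet, and `std::regex` stands in for the internal regex implementation. The metacharacter set is an assumption about what counts as "literal".

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical matcher illustrating the literal fast path.
class SpecialCaseMatcher {
  std::unordered_set<std::string> literals_;
  std::regex regex_;
  bool hasRegex_ = false;

  // Assumed rule: a pattern is literal if it has no regex metacharacter.
  static bool isLiteral(const std::string &pattern) {
    return pattern.find_first_of("()^$|*+?.[]\\{}") == std::string::npos;
  }

public:
  explicit SpecialCaseMatcher(const std::vector<std::string> &patterns) {
    std::string alternation;
    for (const std::string &pattern : patterns) {
      assert(!pattern.empty() && "empty pattern");
      if (isLiteral(pattern)) {
        literals_.insert(pattern);  // never reaches the regex compiler
        continue;
      }
      if (!alternation.empty())
        alternation += '|';
      alternation += "(?:" + pattern + ")";
    }
    if (!alternation.empty()) {
      regex_ = std::regex(alternation);
      hasRegex_ = true;
    }
  }

  // Check the literal set first, then fall back to the combined regex.
  bool matches(const std::string &name) const {
    if (literals_.count(name) != 0)
      return true;
    return hasRegex_ && std::regex_match(name, regex_);
  }
};
```

With a list that is mostly literal names, only the few real patterns end up in the alternation, so the compiled regex stays small regardless of how many literal entries the list contains.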
      
      No test cases.  I figure that any new tests for this code should
      check that regex metacharacters are properly recognised.  However,
      I could not find any documentation stating that the
      syntax of global names in special case lists is based on regexes.
      The extent to which regex syntax is supported in special case lists
      should probably be decided on/documented before writing tests.
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D1150
      
      llvm-svn: 187732
  23. Aug 02, 2013
  24. Jul 29, 2013
  25. Jul 28, 2013
  26. Jul 27, 2013
    • Merge the removal of dead instructions and lifetime markers with the analysis of the alloca. · e8f5812a
      Chandler Carruth authored
      We don't need to visit all the users twice for this. We build up a
      kill list during the analysis and then just process
      it afterward. This recovers the tiny bit of performance lost by moving
      to the visitor based analysis system as it removes one entire use-list
      walk from mem2reg. In some cases, this is now faster than mem2reg was
      previously.
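
The kill-list pattern can be sketched as follows. This is a simplified model, not the mem2reg code: `UseKind`, `Analysis`, `analyzeUsers`, and `processKillList` are hypothetical names, and users are modelled as integer ids in a `std::map` instead of a real use list.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical classification of the users of one alloca.
enum class UseKind { Load, Store, LifetimeMarker, DeadCast, Escape };

struct Analysis {
  bool promotable = true;
  std::vector<int> killList;  // ids of users to delete afterward
};

// Single walk over the users: decide promotability and queue dead users.
Analysis analyzeUsers(const std::map<int, UseKind> &users) {
  Analysis result;
  for (const auto &entry : users) {
    switch (entry.second) {
    case UseKind::Load:
    case UseKind::Store:
      break;  // real accesses, kept for promotion
    case UseKind::LifetimeMarker:
    case UseKind::DeadCast:
      result.killList.push_back(entry.first);
      break;
    case UseKind::Escape:
      result.promotable = false;
      break;
    }
  }
  return result;
}

// Process the queue built above; no second walk over the users is needed.
void processKillList(std::map<int, UseKind> &users, const Analysis &analysis) {
  for (int id : analysis.killList) {
    std::size_t erased = users.erase(id);
    assert(erased == 1 && "kill list entry must name a live user");
    (void)erased;
  }
}
```

The analysis already looks at every user, so recording the doomed ones as a side effect costs one `push_back` each, and the cleanup afterward touches only those entries.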
      
      llvm-svn: 187296
    • Reimplement isPotentiallyReachable to make nocapture deduction much stronger. · 0b68245e
      Nick Lewycky authored
      Adds unit tests for it too.
      
      Split BasicBlockUtils into an analysis-half and a transforms-half, and put the
      analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable
      into llvm::isPotentiallyReachable and move it into Analysis/CFG.
      
      llvm-svn: 187283
    • SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions · 8b1e021e
      Tom Stellard authored
      Merge consecutive if-regions if they contain identical statements.
      Both transformations reduce the number of branches.  The transformation
      is guarded by a target-hook, and is currently enabled only for +R600,
      but the correctness has been tested on the X86 target using a variety of
      CPU benchmarks.
      
      Patch by: Mei Ye
      
      llvm-svn: 187278
  27. Jul 26, 2013
    • Re-implement the analysis of uses in mem2reg to be significantly more robust. · 9af38fc2
      Chandler Carruth authored
      It now uses an InstVisitor and worklist to actually walk the
      uses of the Alloca transitively and detect the pattern which we can
      directly promote: loads & stores of the whole alloca and instructions we
      can completely ignore.
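
A rough sketch of such a worklist walk, under stated assumptions: `Kind`, `Node`, and `isPromotable` are hypothetical, instructions are modelled as indices into a vector, and the rule that a load or store reached through a cast is not a whole-alloca access is a simplification chosen for this example, not taken from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical instruction kinds for a toy use graph.
enum class Kind { Alloca, Load, Store, Cast, LifetimeMarker, Call };

struct Node {
  Kind kind;
  std::vector<std::size_t> users;  // indices of instructions using this one
};

// Walks the transitive users of nodes[root] and reports whether every one is
// either a direct load/store or something that can be ignored.
bool isPromotable(const std::vector<Node> &nodes, std::size_t root) {
  assert(root < nodes.size() && nodes[root].kind == Kind::Alloca);
  std::vector<bool> seen(nodes.size(), false);
  // Each entry is (instruction index, reached through a cast?).
  std::vector<std::pair<std::size_t, bool>> worklist;
  for (std::size_t user : nodes[root].users)
    worklist.emplace_back(user, false);

  while (!worklist.empty()) {
    std::pair<std::size_t, bool> item = worklist.back();
    worklist.pop_back();
    if (seen[item.first])
      continue;
    seen[item.first] = true;

    const Node &node = nodes[item.first];
    switch (node.kind) {
    case Kind::Load:
    case Kind::Store:
      if (item.second)
        return false;  // assumed: access through a cast is not whole-alloca
      break;
    case Kind::LifetimeMarker:
      break;  // ignorable
    case Kind::Cast:
      // Ignorable only if everything using the cast is ignorable too.
      for (std::size_t user : node.users)
        worklist.emplace_back(user, true);
      break;
    default:
      return false;  // anything else blocks promotion
    }
  }
  return true;
}
```

Because the predicate is a plain function over the use graph, the same walk can serve both the "can we promote?" check and the promotion step itself, which is the divergence the commit sets out to remove.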
      
      Also, with this new implementation teach both the predicate for testing
      whether we can promote and the promotion engine itself to use the same
      code so we no longer have strange divergence between the two code paths.
      
      I've added some silly test cases to demonstrate that we can handle
      slightly more degenerate code patterns now. See below for why this
      is even interesting.
      
      Performance impact: roughly 1% regression in the performance of SROA or
      ScalarRepl on a large C++-ish test case where most of the allocas are
      basically ready for promotion. The cause is silly redundant
      work that I've left FIXMEs for and which I'll address in the next
      commit. I wanted to separate this commit as it changes the behavior.
      Once the redundant work in removing the dead uses of the alloca is
      fixed, this code appears to be faster than the old version. =]
      
      So why is this useful? Because the previous requirement for promotion
      required a *specific* visit pattern of the uses of the alloca to verify:
      we *had* to look for no more than 1 intervening use. The end goal is to
      have SROA automatically detect when an alloca is already promotable and
      directly hand it to the mem2reg machinery rather than trying to
      partition and rewrite it. This is a 25% or more performance improvement
      for SROA, and a significant chunk of the delta between it and
      ScalarRepl. To get there, we need to make mem2reg actually capable of
      promoting allocas which *look* promotable to SROA without having SROA do
      tons of work to massage the code into just the right form.
      
      This is actually the tip of the iceberg. There are tremendous potential
      savings we can realize here by de-duplicating work between mem2reg and
      SROA.
      
      llvm-svn: 187191
  28. Jul 25, 2013
    • Respect llvm.used in Internalize. · 17600e29
      Rafael Espindola authored
      The language reference says that:
      
      "If a symbol appears in the @llvm.used list, then the compiler,
      assembler, and linker are required to treat the symbol as if there is
      a reference to the symbol that it cannot see"
      
      Since even the linker cannot see the reference, we must assume that
      the reference can be using the symbol table. For example, a user can add
      __attribute__((used)) to a debug helper function like dump and use it from
      a debugger.
      
      llvm-svn: 187103