  1. Mar 24, 2012
  2. Mar 16, 2012
    • Chandler Carruth's avatar
      Rip out support for 'llvm.noinline'. This thing has a strange history... · b37fc13a
      Chandler Carruth authored
      It was added in 2007 as the first cut at supporting no-inline
      attributes, but we didn't have function attributes of any form at the
      time. However, it was added without any mention in the LangRef or other
      documentation.
      
      Later on, in 2008, Devang added function notes for 'inline=never' and
      then turned them into proper function attributes. From that point
      onward, as far as I can tell, the world moved on, and no one has touched
      'llvm.noinline' in any meaningful way since.
      
      Its time has now come. We have had better mechanisms for doing this for
      a long time, all the frontends I'm aware of use them, and this is just
      holding back progress. Given that it was never a documented feature of
      the IR, I've provided no auto-upgrade support. If people know of real,
      in-the-wild bitcode that relies on this, yell at me and I'll add it, but
      I *seriously* doubt anyone cares.
      
      llvm-svn: 152904
      b37fc13a
    • Chandler Carruth's avatar
      Start removing the use of an ad-hoc 'never inline' set and instead · d7a5f2ad
      Chandler Carruth authored
      directly query the function information which this set was representing.
      This simplifies the interface of the inline cost analysis, and makes the
      always-inline pass significantly more efficient.
      
      Previously, always-inline would first make a single set of every
      function in the module *except* those marked with the always-inline
      attribute. It would then query this set at every call site to see if the
      function was a member of the set, and if so, refuse to inline it. This
      is quite wasteful. Instead, simply check the function attribute directly
      when looking at the callsite.
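
To make the difference concrete, here is a minimal sketch of the two query styles. The `Fn` struct and all function names are hypothetical stand-ins for illustration only; they are not LLVM's `Function` class or the inliner's real interface.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a function in a module (not LLVM's Function).
struct Fn {
  std::string name;
  bool alwaysInline;
};

// Old style: walk the whole module up front and collect every function
// that is NOT marked always-inline.
std::unordered_set<const Fn *>
buildNeverInlineSet(const std::vector<Fn> &module) {
  std::unordered_set<const Fn *> never;
  for (const Fn &f : module)
    if (!f.alwaysInline)
      never.insert(&f);
  return never;
}

// Old style query: one set lookup per call site.
bool shouldInlineViaSet(const std::unordered_set<const Fn *> &never,
                        const Fn &callee) {
  return never.count(&callee) == 0;
}

// New style query: read the attribute straight off the callee.
bool shouldInlineDirect(const Fn &callee) { return callee.alwaysInline; }
```

Both queries give the same answer for every callee, but the direct check needs neither the up-front module walk nor the set's storage.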
      
      The normal inliner also had similar redundancy. It added every function
      in the module with the noinline attribute to its set to ignore, even
      though inside the cost analysis function we *already tested* the
      noinline attribute and produced the same result.
      
      The only tricky part of removing this is that we have to be able to
      correctly remove only the functions inlined by the always-inline pass
      when finalizing, which requires a bit of a hack. Still, much less of
      a hack than the set of all non-always-inline functions was. While I was
      touching this function, I switched a heavy-weight set to a vector with
      sort+unique. The algorithm already had a two-phase insert and removal
      pattern; we were just needlessly paying the uniquing cost on every
      insert.
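
The set-to-vector switch described above can be sketched in standard C++ as follows. The element type (`int`) and the function names are assumptions for illustration; the actual pass stores different data.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// With a set, every insert pays for ordering and duplicate detection.
std::vector<int> collectWithSet(const std::vector<int> &inputs) {
  std::set<int> s(inputs.begin(), inputs.end());
  return std::vector<int>(s.begin(), s.end());
}

// With a vector, inserts are plain appends; sorting and de-duplication
// happen once, after the insert phase has finished.
std::vector<int> collectWithVector(const std::vector<int> &inputs) {
  std::vector<int> v;
  v.reserve(inputs.size());
  for (int x : inputs)
    v.push_back(x);
  std::sort(v.begin(), v.end());
  v.erase(std::unique(v.begin(), v.end()), v.end());
  return v;
}
```

This only pays off when all inserts happen before any lookups or removals, which is the two-phase pattern the message describes.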
      
      This probably speeds up some compiles by a small amount (-O0 compiles
      with lots of always-inline, so potentially heavy libc++ users), but I've
      not tried to measure it.
      
      I believe there is no functional change here, but yell if you spot one.
      None are intended.
      
      Finally, the direction this is going in is to greatly simplify the
      inline cost query interface so that we can replace its implementation
      with a much more clever one. Along the way, all the APIs get simplified,
      so it seems incrementally good.
      
      llvm-svn: 152903
      d7a5f2ad
  3. Mar 14, 2012
    • Chandler Carruth's avatar
      Change where we enable the heuristic that delays inlining into functions · 30b8416d
      Chandler Carruth authored
      which are small enough to themselves be inlined. Delaying in this manner
      can be harmful if the function is ineligible for inlining in some (or
      many) contexts as it pessimizes the code of the function itself in the
      event that inlining does not eventually happen.
      
      Previously the check was written to only do this delaying of inlining
      for static functions in the hope that they could be entirely deleted and
      in the knowledge that all callers of static functions will have the
      opportunity to inline if it is in fact profitable. However, with C++ we
      get two other important sources of functions where the definition is
      always available for inlining: inline functions and templated functions.
      This patch generalizes the inliner to allow linkonce-ODR (the linkage
      such C++ routines receive) to also qualify for this delay-based
      inlining.
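
As a rough sketch of the generalization, assuming a simplified linkage enum (LLVM has many more linkage kinds, and the names below are illustrative rather than the inliner's actual code):

```cpp
// Hypothetical, simplified set of linkage kinds.
enum class Linkage { External, Internal, LinkOnceODR };

// Before: only static functions (internal linkage) qualified for the
// delay heuristic.
bool qualifiesForDelayOld(Linkage l) { return l == Linkage::Internal; }

// After: linkonce-ODR functions (C++ inline and templated functions)
// qualify as well, since their definition is always available to callers.
bool qualifiesForDelayNew(Linkage l) {
  return l == Linkage::Internal || l == Linkage::LinkOnceODR;
}
```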
      
      Benchmarking across a range of large real-world applications shows
      roughly 2% size increase across the board, but an average speedup of
      about 0.5%. Some benchmarks improved over 2%, and the 'clang' binary
      itself (when bootstrapped with this feature) shows a 1% -O0 performance
      improvement when run over all Sema, Lex, and Parse source code smashed
      into a single file. A clean re-build of Clang+LLVM with a bootstrapped
      Clang shows approximately 2% improvement, but that measurement is often
      noisy.
      
      llvm-svn: 152737
      30b8416d
  4. Mar 13, 2012
  5. Mar 12, 2012
    • Chandler Carruth's avatar
      When inlining a function and adding its inner call sites to the · 595fda84
      Chandler Carruth authored
      candidate set for subsequent inlining, try to simplify the arguments to
      the inner call site now that inlining has been performed.
      
      The goal here is to propagate and fold constants through deeply nested
      call chains. Without doing this, we lose the inliner bonus that should
      be applied because the arguments don't match the exact pattern the cost
      estimator uses.
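
A small source-level illustration of the kind of call chain involved. The functions are hypothetical, and `constexpr` is used only so the folded result can be checked at compile time; it is not how the inliner itself works.

```cpp
// Innermost callee: becomes trivial once its argument is a known constant.
constexpr int inner(int n) { return n == 4 ? 100 : n; }

// Passes a computed argument on to inner().
constexpr int middle(int n) { return inner(n + 1); }

// After middle() is inlined here, the inner call site reads inner(3 + 1).
// Only once that argument is simplified to the constant 4 does the call
// site look like "call with a constant argument".
constexpr int outer() { return middle(3); }

// The whole chain folds to a constant.
static_assert(outer() == 100, "nested call chain folds to a constant");
```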
      
      Reviewed on IRC by Benjamin Kramer.
      
      llvm-svn: 152556
      595fda84
  6. Mar 08, 2012
    • Stepan Dyatkovskiy's avatar
      Take into account Duncan's comments on r149481, dated 2nd Feb 2012: · 5b648afb
      Stepan Dyatkovskiy authored
      http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120130/136146.html
      
      Implemented CaseIterator, which solves almost all of the described issues: we no longer need to mix operand/case/successor indexing. The base iterator class is implemented as a template since it may be initialized either from "const SwitchInst*" or from "SwitchInst*".
      
      ConstCaseIt is a read-only iterator.
      CaseIt is a read-write iterator; it allows changing the case successor and case value.
      
      Using the iterator allows the resolveXXXX methods to be removed entirely. All index conversions are done automatically inside the iterator's getters.
      
      Typical usage of the iterator looks like this:
      SwitchInst *SI = ... // initialize it somehow
      
      for (SwitchInst::CaseIt i = SI->caseBegin(), e = SI->caseEnd(); i != e; ++i) {
        BasicBlock *BB = i.getCaseSuccessor();
        ConstantInt *V = i.getCaseValue();
        // Do something.
      }
      
      If you want to convert a case number to a TerminatorInst successor index, use the iterator's getSuccessorIndex method.
      If you want to initialize an iterator from a TerminatorInst successor index, use the CaseIt::fromSuccessorIndex(...) method.
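
The index conversions can be illustrated with a toy model that compiles without LLVM. `ToySwitch` and its members are hypothetical, and the layout it uses (successor 0 is the default destination, case i is successor i + 1) is an assumption made for this sketch, not something stated in the message.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a switch; NOT LLVM's SwitchInst.
struct ToySwitch {
  struct Case {
    int value;
    std::string successor;
  };
  std::string defaultSuccessor;
  std::vector<Case> cases;

  class CaseIt {
    ToySwitch *sw;
    std::size_t index; // case index, 0 = first case
  public:
    CaseIt(ToySwitch *s, std::size_t i) : sw(s), index(i) {}
    int getCaseValue() const { return sw->cases[index].value; }
    const std::string &getCaseSuccessor() const {
      return sw->cases[index].successor;
    }
    void setSuccessor(const std::string &s) { sw->cases[index].successor = s; }
    // Assumed layout: successor 0 is the default, case i is successor i + 1.
    std::size_t getSuccessorIndex() const { return index + 1; }
    // Inverse mapping; succ must be >= 1 (0 is the default destination).
    static CaseIt fromSuccessorIndex(ToySwitch *s, std::size_t succ) {
      return CaseIt(s, succ - 1);
    }
    CaseIt &operator++() {
      ++index;
      return *this;
    }
    bool operator!=(const CaseIt &o) const { return index != o.index; }
  };

  CaseIt caseBegin() { return CaseIt(this, 0); }
  CaseIt caseEnd() { return CaseIt(this, cases.size()); }
};
```

Callers only ever see case indices through the iterator; the offset to successor indices lives in one place, so the underlying layout could change without touching client loops.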
      
      There are also related changes in llvm-clients: klee and clang.
      
      llvm-svn: 152297
      5b648afb
  7. Feb 27, 2012
  8. Feb 25, 2012
  9. Feb 23, 2012
  10. Feb 21, 2012
  11. Feb 20, 2012
  12. Feb 17, 2012
  13. Feb 12, 2012
  14. Feb 09, 2012
  15. Feb 06, 2012
  16. Feb 05, 2012
    • Nick Lewycky's avatar
      Teach GlobalOpt to handle atomic accesses to globals. · 52da72b1
      Nick Lewycky authored
       * Most of the transforms come through intact by having each transformed load or
      store copy the ordering and synchronization scope of the original.
       * The transform that turns a global only accessed in main() into an alloca
      (since main is non-recursive) with a store of the initial value uses an
      unordered store, since it's guaranteed to be the first thing to happen in main.
      (Threads may have started before main (!) but they can't have the address of a
      function local before the point in the entry block we insert our code.)
       * The heap-SRoA transforms are disabled in the face of atomic operations. This
      can probably be improved; it seems odd to have atomic accesses to an alloca
      that doesn't have its address taken.
      
      AnalyzeGlobal keeps track of the strongest ordering found in any use of the
      global. This is more information than we need right now, but it's cheap to
      compute and likely to be useful.
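
A minimal sketch of tracking the strongest ordering across all uses. The enum mirrors the usual atomic ordering ladder, but the helper names and the rule for combining acquire with release are assumptions for illustration, not a copy of AnalyzeGlobal.

```cpp
#include <algorithm>
#include <vector>

// Orderings from weakest to strongest (simplified ladder for this sketch).
enum Ordering {
  NotAtomic,
  Unordered,
  Monotonic,
  Acquire,
  Release,
  AcquireRelease,
  SequentiallyConsistent
};

// Combine two orderings into one at least as strong as both.
Ordering strongerOf(Ordering a, Ordering b) {
  // Acquire and Release do not dominate each other; together they need
  // AcquireRelease.
  if ((a == Acquire && b == Release) || (a == Release && b == Acquire))
    return AcquireRelease;
  return std::max(a, b);
}

// Fold over every use of the global, keeping the strongest ordering seen.
Ordering strongestOrdering(const std::vector<Ordering> &uses) {
  Ordering result = NotAtomic;
  for (Ordering o : uses)
    result = strongerOf(result, o);
  return result;
}
```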
      
      llvm-svn: 149847
      52da72b1
    • Nick Lewycky's avatar
      Clean up some whitespace and comments. No functionality change. · bbd1156b
      Nick Lewycky authored
      llvm-svn: 149845
      bbd1156b
  17. Feb 01, 2012
    • Stepan Dyatkovskiy's avatar
      SwitchInst refactoring. · 513aaa56
      Stepan Dyatkovskiy authored
      The purpose of this refactoring is to hide operand roles from the SwitchInst user (the programmer). If you want to work with operands directly, you will probably need lower-level methods than those of SwitchInst (TerminatorInst or maybe User). After this patch we can reorganize SwitchInst operands and successors as we want.
      
      What was done:
      
      1. Changed the semantics of the index in the getCaseValue method:
      getCaseValue(0) means "get the first case", not the condition. Use getCondition() if you want the condition. I propose not mixing SwitchInst case indexing with low-level indexing (TI successor indexing, User operand indexing), since that may be dangerous.
      2. For the same reason, findCaseValue(ConstantInt*) returns the actual case number. 0 means the first case, not the default. If there is no case with the given value, ErrorIndex is returned.
      3. Added the getCaseSuccessor method. I propose avoiding TerminatorInst::getSuccessor when you want a case's successor BB. Use getCaseSuccessor instead, since the internal SwitchInst organization of operands/successors is hidden and may change at any moment.
      4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to expose how case successors are really mapped in TerminatorInst.
      4.1 "resolveSuccessorIndex" is for stepping down from SwitchInst to TerminatorInst. It returns the TerminatorInst successor index for a given case successor.
      4.2 "resolveCaseIndex" converts a low-level successor index to the case index that corresponds to the given successor.
      
      Note: There are also related compatibility fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.
      llvm-svn: 149481
      513aaa56
    • Hal Finkel's avatar
      Add a basic-block autovectorization pass. · c34e5113
      Hal Finkel authored
      This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
      Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).
      
      llvm-svn: 149468
      c34e5113
  18. Jan 27, 2012
  19. Jan 26, 2012
  20. Jan 25, 2012
  21. Jan 20, 2012
  22. Jan 17, 2012
  23. Jan 11, 2012
  24. Jan 06, 2012
  25. Jan 05, 2012