  1. May 14, 2012
  2. May 10, 2012
  3. May 09, 2012
  4. May 08, 2012
    • Duncan Sands authored · 3bbb1d50
      Calling ReassociateExpression recursively is extremely dangerous since it will
      replace the operands of expressions with only one use with undef and generate
      a new expression for the original without using RAUW to update the original.
      Thus any copies of the original expression held in a vector may end up
      referring to some bogus value - and using a ValueHandle won't help since there
      is no RAUW.  There is already a mechanism for getting the effect of recursion
      non-recursively: adding the value to be recursed on to RedoInsts.  But it wasn't
      being used systematically.  Have various places where recursion had snuck in at
      some point use the RedoInsts mechanism instead.  Fixes PR12169.
      
      llvm-svn: 156379
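
A minimal sketch of the general idea behind the commit above, not the Reassociate pass's actual code: pending work goes onto an explicit list that is drained in a loop, instead of being handled by a recursive call. `Expr`, `sumLeaves`, and the index-based worklist `redo` are hypothetical names chosen for illustration; the real pass works on LLVM instructions and its own RedoInsts container.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy expression node (hypothetical): a leaf carries a value, an interior
// node lists the indices of its operand expressions.
struct Expr {
  int leafValue;                      // only meaningful when operands is empty
  std::vector<std::size_t> operands;  // indices into the owning vector
};

// Sums every leaf reachable from `root` without recursion. Operands that
// still need work are pushed onto `redo` and handled by the same loop.
int sumLeaves(const std::vector<Expr> &exprs, std::size_t root) {
  assert(root < exprs.size());
  std::vector<std::size_t> redo;
  redo.push_back(root);
  int total = 0;
  while (!redo.empty()) {
    std::size_t idx = redo.back();
    redo.pop_back();
    const Expr &e = exprs[idx];
    if (e.operands.empty()) {
      total += e.leafValue;
      continue;
    }
    for (std::size_t op : e.operands) {
      assert(op < exprs.size());
      redo.push_back(op);  // defer instead of recursing
    }
  }
  return total;
}
```

Because the worklist stores indices and each one is looked up only when it is popped, no stack frame keeps a reference to a node across the processing of another node. That loosely mirrors the hazard the commit message describes with copies of expressions held in a vector.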
  5. May 07, 2012
  6. May 06, 2012
    • Benjamin Kramer authored · 3d38c17b
      Switch the select to branch transformation on by default.
      The primitive conservative heuristic seems to give a slight overall
      improvement while not regressing stuff. Make it available to wider
      testing. If you notice any speed regressions (or significant code
      size regressions) let me know!
      
      llvm-svn: 156258
  7. May 05, 2012
    • Benjamin Kramer authored · 047d7ca0
      CodeGenPrepare: Add a transform to turn selects into branches in some cases.
      This came up when a change in block placement formed a cmov and slowed down a
      hot loop by 50%:
      
      	ucomisd	(%rdi), %xmm0
      	cmovbel	%edx, %esi
      
      cmov is a really bad choice in this context because it doesn't get branch
      prediction. If we emit it as a branch, an out-of-order CPU can do a better job
      (if the branch is predicted right) and avoid waiting for the slow load+compare
      instruction to finish. Of course it won't help if the branch is unpredictable,
      but those are really rare in practice.
      
      This patch uses a dumb conservative heuristic: it turns all cmovs that have one
      use and a direct memory operand into branches. cmovs usually save some code
      size, so we disable the transform in -Os mode. In-order architectures are
      unlikely to benefit either; those are covered by the
      "predictableSelectIsExpensive" flag.
      
      It would be better to reuse branch probability info here, but BPI doesn't
      support select instructions currently. It would make sense to use the same
      heuristics as the if-converter pass, which does the opposite direction of this
      transform.
      
      
      Test suite shows a small improvement here and there on corei7-level machines,
      but the actual results depend a lot on the microarchitecture used. The
      transformation is currently disabled by default and available by passing the
      -enable-cgp-select2branch flag to the code generator.
      
      Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing
      me with comments and test-suite numbers that were more stable than mine :)
      
      llvm-svn: 156234
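
The heuristic described in the commit above can be sketched as a simple predicate. This is an illustration under stated assumptions, not the CodeGenPrepare implementation: `SelectCandidate`, `CodeGenContext`, and `shouldTurnSelectIntoBranch` are hypothetical names, and only the `predictableSelectIsExpensive` flag name comes from the commit message.

```cpp
#include <cassert>

// Hypothetical summary of the select/cmov being considered.
struct SelectCandidate {
  unsigned numUses;             // how many users the value has
  bool hasDirectMemoryOperand;  // fed directly by a load, as in the ucomisd example
};

// Hypothetical summary of the compilation mode and target.
struct CodeGenContext {
  bool optimizeForSize;               // -Os: keep the smaller cmov form
  bool predictableSelectIsExpensive;  // set for targets expected to benefit
};

// Returns true when the conservative heuristic would form a branch.
bool shouldTurnSelectIntoBranch(const SelectCandidate &sel,
                                const CodeGenContext &ctx) {
  if (ctx.optimizeForSize)
    return false;  // cmovs usually save code size
  if (!ctx.predictableSelectIsExpensive)
    return false;  // e.g. in-order cores are unlikely to benefit
  return sel.numUses == 1 && sel.hasDirectMemoryOperand;
}
```

Every condition must hold before a branch is formed, which is what makes the heuristic conservative: any doubt leaves the select in place.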
  8. May 04, 2012
    • Bill Wendling authored · fa0ebcd1
      Add 'landingpad' instructions to the list of instructions to ignore.
      Also combine the code in the 'assert' statement.
      
      llvm-svn: 156155
    • Chandler Carruth authored · da7513a8
      A pile of long over-due refactorings here. There are some very, *very*
      minor behavior changes with this, but nothing I have seen evidence of in
      the wild or expect to be meaningful. The real goal is unifying our logic
      and simplifying the interfaces. A summary of the changes follows:
      
      - Make 'callIsSmall' actually accept a callsite so it can handle
        intrinsics, and simplify callers appropriately.
      - Nuke a completely bogus declaration of 'callIsSmall' that was still
        lurking in InlineCost.h... No idea how this got missed.
      - Teach 'isInstructionFree' about the various more intelligent
        'free' heuristics that got added to the inline cost analysis during
        review and testing. This mostly surrounds int->ptr and ptr->int casts.
      - Switch most of the interesting parts of the inline cost analysis that
        were essentially computing 'is this instruction free?' to use the code
        metrics routine instead. This way we won't keep duplicating logic.
      
      All of this is motivated by the desire to allow other passes to compute
      a roughly equivalent 'cost' metric for a particular basic block as the
      inline cost analysis. Sadly, re-using the same analysis for both is
      really messy because only the actual inline cost analysis is ever going
      to go to the contortions required for simplification, SROA analysis,
      etc.
      
      llvm-svn: 156140
  9. May 03, 2012
  10. May 02, 2012
  11. May 01, 2012
  12. Apr 30, 2012
  13. Apr 27, 2012
  14. Apr 26, 2012
    • Chandler Carruth authored · 739ef80f
      Teach the reassociate pass to fold chains of multiplies with repeated
      elements to minimize the number of multiplies required to compute the
      final result. This uses a heuristic to attempt to form near-optimal
      binary exponentiation-style multiply chains. While there are some cases
      it misses, it seems to do at least a decent job on a very diverse range of
      inputs.
      
      Initial benchmarks show no interesting regressions, and an 8%
      improvement on SPASS. Let me know if any other interesting results (in
      either direction) crop up!
      
      Credit to Richard Smith for the core algorithm, and helping code the
      patch itself.
      
      llvm-svn: 155616
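
To illustrate why a binary exponentiation-style chain needs fewer multiplies than a straight chain, here is a minimal sketch of exponentiation by squaring that counts the multiplies it performs. It is not the algorithm from the patch; `powBySquaring` and its `multiplies` out-parameter are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Computes base^exponent (exponent > 0) by repeated squaring and reports
// how many multiplications were actually performed.
std::uint64_t powBySquaring(std::uint64_t base, unsigned exponent,
                            unsigned &multiplies) {
  assert(exponent > 0);
  multiplies = 0;
  std::uint64_t result = 0;
  bool haveResult = false;
  while (exponent != 0) {
    if (exponent & 1u) {
      if (haveResult) {
        result *= base;  // fold the current power into the result
        ++multiplies;
      } else {
        result = base;  // first factor: a copy, not a multiply
        haveResult = true;
      }
    }
    exponent >>= 1;
    if (exponent != 0) {
      base *= base;  // square for the next bit
      ++multiplies;
    }
  }
  return result;
}
```

For x^8 this uses 3 multiplies where a straight chain needs 7. The plain binary method is not always optimal: it uses 6 multiplies for x^15, while the chain x^2, x^3, x^6, x^12, x^15 needs only 5.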
  15. Apr 25, 2012
  16. Apr 20, 2012
  17. Apr 19, 2012
  18. Apr 18, 2012
    • Bill Wendling authored · 4d4d0257
      Use a heavy hammer to fix PR12573.
      If the loop contains invoke instructions, whose unwind edge escapes the loop,
      then don't try to unswitch the loop. Doing so may cause the unwind edge to be
      split, which not only is non-trivial but doesn't preserve loop simplify
      information.
      
      Fixes PR12573
      
      llvm-svn: 154987
    • Andrew Trick authored · 19f80c1e
      loop-reduce: Add an early bailout to catch extremely large loops.
      This introduces a threshold of 200 IV Users, which is very
      conservative but should be sufficient to avoid serious compile time
      sink or stack overflow. The llvm test-suite with LTO never exceeds 190
      users per loop.
      
      The bug doesn't relate to a specific type of loop. Checking in an
      arbitrary giant loop as a unit test would be silly.
      
      Fixes rdar://11262507.
      
      llvm-svn: 154983
    • Joe Groff authored · a81bcbb9
      fix pr12559: mark unavailable win32 math libcalls
      also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint
      
      llvm-svn: 154960
  19. Apr 13, 2012
  20. Apr 11, 2012
  21. Apr 10, 2012