  1. May 12, 2012
  2. May 11, 2012
  3. May 10, 2012
  4. May 09, 2012
  5. May 08, 2012
    • Calling ReassociateExpression recursively is extremely dangerous since it will · 3bbb1d50
      Duncan Sands authored
      replace the operands of expressions with only one use with undef and generate
      a new expression for the original without using RAUW to update the original.
      Thus any copies of the original expression held in a vector may end up
      referring to some bogus value - and using a ValueHandle won't help since there
      is no RAUW.  There is already a mechanism for getting the effect of recursion
      non-recursively: adding the value to be recursed on to RedoInsts.  But it wasn't
      being used systematically.  Have various places where recursion had snuck in at
      some point use the RedoInsts mechanism instead.  Fixes PR12169.
      
      llvm-svn: 156379
    • Allow NULL LoopPassManager argument in UnrollLoop. PR12734. · d29cd732
      Andrew Trick authored
      llvm-svn: 156358
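The idea behind 3bbb1d50 above, queueing values on a redo list instead of recursing, can be illustrated with a toy model. The sketch below is not LLVM's Reassociate code: `Node`, `foldWithRedoList`, and the index-based node table are hypothetical names chosen for illustration, and the expression graph is assumed to be acyclic. Nodes are referred to by stable indices and updated in place, so nothing held in the work list can end up referring to a stale value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy expression node: a leaf holds a constant, otherwise the node is a sum
// of its operands. Operands are indices into the node table, not pointers.
struct Node {
  bool isLeaf;
  long value;                        // meaningful only when isLeaf is true
  std::vector<std::size_t> operands; // empty for leaves
};

// Folds every sum reachable from `root` into a leaf without recursion.
// Assumes the graph is acyclic; a cycle would make this loop forever.
long foldWithRedoList(std::vector<Node> &nodes, std::size_t root) {
  assert(root < nodes.size() && "root must index into the node table");
  std::vector<std::size_t> redo; // hypothetical stand-in for a redo list
  redo.push_back(root);
  while (!redo.empty()) {
    const std::size_t id = redo.back();
    Node &n = nodes[id];
    if (n.isLeaf) { // already folded, possibly via another user
      redo.pop_back();
      continue;
    }
    bool ready = true;
    for (std::size_t op : n.operands) {
      if (!nodes[op].isLeaf) {
        redo.push_back(op); // queue the operand instead of recursing into it
        ready = false;
      }
    }
    if (!ready)
      continue; // `id` stays queued beneath its operands and is revisited
    long sum = 0;
    for (std::size_t op : n.operands)
      sum += nodes[op].value;
    n.isLeaf = true; // update in place: the index `id` stays valid
    n.value = sum;
    n.operands.clear();
    redo.pop_back();
  }
  return nodes[root].value;
}

// Usage: node 4 is (node 2) + 4 + (node 2), where node 2 is 1 + 2.
void redoListExample() {
  std::vector<Node> nodes = {{true, 1, {}},
                             {true, 2, {}},
                             {false, 0, {0, 1}},
                             {true, 4, {}},
                             {false, 0, {2, 3, 2}}};
  assert(foldWithRedoList(nodes, 4) == 10);
}
```

A node that is not ready stays on the list beneath its operands, so it is processed again only after they have been folded. This gives the ordering of a recursive walk without nested calls.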
  6. May 07, 2012
  7. May 06, 2012
  8. May 05, 2012
    • CodeGenPrepare: Add a transform to turn selects into branches in some cases. · 047d7ca0
      Benjamin Kramer authored
      This came up when a change in block placement formed a cmov and slowed down a
      hot loop by 50%:
      
      	ucomisd	(%rdi), %xmm0
      	cmovbel	%edx, %esi
      
      cmov is a really bad choice in this context because it doesn't get branch
      prediction. If we emit it as a branch, an out-of-order CPU can do a better job
      (if the branch is predicted right) and avoid waiting for the slow load+compare
      instruction to finish. Of course it won't help if the branch is unpredictable,
      but those are really rare in practice.
      
      This patch uses a dumb conservative heuristic: it turns all cmovs that have one
      use and a direct memory operand into branches. cmovs usually save some code
      size, so we disable the transform in -Os mode. In-order architectures are
      unlikely to benefit either; they are covered by the
      "predictableSelectIsExpensive" flag.
      
      It would be better to reuse branch probability info here, but BPI doesn't
      support select instructions currently. It would make sense to use the same
      heuristics as the if-converter pass, which does the opposite direction of this
      transform.
      
      
      The test suite shows a small improvement here and there on corei7-level machines,
      but the actual results depend a lot on the microarchitecture used. The
      transformation is currently disabled by default and is available by passing the
      -enable-cgp-select2branch flag to the code generator.
      
      Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing
      me with comments and test-suite numbers that were more stable than mine :)
      
      llvm-svn: 156234
    • Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for... · cb2a1a34
      Stepan Dyatkovskiy authored
      Small fix in InstCombineCasts.cpp. Restored the "alloca + bitcast" reduction for the case where the alloca's size is calculated within an "add/sub/... nsw".
      Also added a fix to the 2011-06-13-nsw-alloca.ll test.
      
      llvm-svn: 156231
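The heuristic described in 047d7ca0 above can be written down as a small predicate. This is a minimal sketch based only on the commit message, not the code from CodeGenPrepare: `SelectCandidate`, `TransformOptions`, and `shouldTurnSelectIntoBranch` are hypothetical names, and the two boolean facts about the select are assumed to have been computed elsewhere.

```cpp
#include <cassert>

// Facts about one select/cmov candidate (hypothetical, precomputed elsewhere).
struct SelectCandidate {
  bool hasOneUse;              // "cmovs that have one use"
  bool hasDirectMemoryOperand; // "... and a direct memory operand"
};

// Facts about the compilation (hypothetical names).
struct TransformOptions {
  bool optimizeForSize;              // -Os: cmov usually saves code size
  bool predictableSelectIsExpensive; // target flag; unset for in-order cores
};

// Returns true when the commit message's rule would form a branch.
bool shouldTurnSelectIntoBranch(const SelectCandidate &sel,
                                const TransformOptions &opts) {
  if (opts.optimizeForSize)
    return false; // keep the smaller cmov form
  if (!opts.predictableSelectIsExpensive)
    return false; // target is unlikely to benefit from a branch
  return sel.hasOneUse && sel.hasDirectMemoryOperand;
}

// Usage: the hot-loop case from the commit message, then the -Os case.
void selectToBranchExample() {
  assert(shouldTurnSelectIntoBranch({true, true}, {false, true}));
  assert(!shouldTurnSelectIntoBranch({true, true}, {true, true}));
}
```

The sketch deliberately ignores branch probability, which the commit message notes is not available for select instructions.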
  9. May 04, 2012
    • Teach the code extractor how to extract a sequence of blocks from · 6781821c
      Chandler Carruth authored
      RegionInfo's RegionNode. This mirrors the logic for automating the
      extraction from a Loop.
      
      llvm-svn: 156208
    • Factor the computation of input and output sets into a public interface · 14316fcf
      Chandler Carruth authored
      of the CodeExtractor utility. This allows speculatively computing input
      and output sets to measure the likely size impact of the code
      extraction.
      
      These sets cannot be reused sadly -- we mutate the function prior to
      forming the final sets used by the actual extraction.
      
      The interface has been revamped slightly to make it easier to use
      correctly by making the interface const and sinking the computation of
      the number of exit blocks into the full extraction function and away
      from the rest of this logic which just computed two output parameters.
      
      llvm-svn: 156168
    • Rather than trying to gracefully handle input sequences with repeated · 44e13911
      Chandler Carruth authored
      blocks, assert that this doesn't happen. We don't want to bother trying
      to support this call pattern as it isn't necessary.
      
      llvm-svn: 156167
    • Fix a goof with my previous commit by completely returning when we · 0a570552
      Chandler Carruth authored
      detect an ineligible block rather than just breaking out of the loop.
      
      llvm-svn: 156166
    • Hoist a safety assert from the extraction method into the construction · 2f5d0191
      Chandler Carruth authored
      of the extractor itself.
      
      llvm-svn: 156164
    • Move the CodeExtractor utility to a dedicated header file / source file, · 0fde0015
      Chandler Carruth authored
      and expose it as a utility class rather than as free function wrappers.
      
      The simple free-function interface works well for the bugpoint-specific
      pass's uses of code extraction, but in an upcoming patch for more
      advanced code extraction, they simply don't expose a rich enough
      interface. I need to expose various stages of the process of doing the
      code extraction and query information to decide whether or not to
      actually complete the extraction or give up.
      
      Rather than build up a new predicate model and pass that into these
      functions, just take the class that was actually implementing the
      functions and lift it up into a proper interface that can be used to
      perform code extraction. The interface is cleaned up and re-documented
      to work better in a header. It is also now set up to accept the blocks to
      be extracted in the constructor rather than in a method.
      
      In passing this essentially reverts my previous commit here exposing
      a block-level query for eligibility of extraction. That is no longer
      necessary with the richer interface, as clients can query the
      extraction object for eligibility directly. This will reduce the number
      of walks of the input basic block sequence by quite a bit which is
      useful if this enters the normal optimization pipeline.
      
      llvm-svn: 156163
    • Add 'landingpad' instructions to the list of instructions to ignore. · fa0ebcd1
      Bill Wendling authored
      Also combine the code in the 'assert' statement.
      
      llvm-svn: 156155
    • A pile of long overdue refactorings here. There are some very, *very* · da7513a8
      Chandler Carruth authored
      minor behavior changes with this, but nothing I have seen evidence of in
      the wild or expect to be meaningful. The real goal is unifying our logic
      and simplifying the interfaces. A summary of the changes follows:
      
      - Make 'callIsSmall' actually accept a callsite so it can handle
        intrinsics, and simplify callers appropriately.
      - Nuke a completely bogus declaration of 'callIsSmall' that was still
        lurking in InlineCost.h... No idea how this got missed.
      - Teach the 'isInstructionFree' about the various more intelligent
        'free' heuristics that got added to the inline cost analysis during
        review and testing. This mostly surrounds int->ptr and ptr->int casts.
      - Switch most of the interesting parts of the inline cost analysis that
        were essentially computing 'is this instruction free?' to use the code
        metrics routine instead. This way we won't keep duplicating logic.
      
      All of this is motivated by the desire to allow other passes to compute
      a roughly equivalent 'cost' metric for a particular basic block as the
      inline cost analysis. Sadly, re-using the same analysis for both is
      really messy because only the actual inline cost analysis is ever going
      to go to the contortions required for simplification, SROA analysis,
      etc.
      
      llvm-svn: 156140
    • Factor the logic for testing whether a basic block is viable for code · a46e6242
      Chandler Carruth authored
      extraction into a public interface. Also clean it up and apply it more
      consistently such that we check for landing pads *anywhere* in the
      extracted code, not just in single-block extraction.
      
      This will be used to guide decisions in passes that are planning to
      eventually perform a round of code extraction.
      
      llvm-svn: 156114
    • remove calls to calloc if the allocated memory is not used (it was already being done for malloc) · d4cf35d7
      Nuno Lopes authored
      fix a few typos found by Chad in my previous commit
      
      llvm-svn: 156110
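The input and output sets mentioned in 14316fcf above can be illustrated on a toy representation. This is a sketch under stated assumptions, not the CodeExtractor interface: `Inst`, `InOut`, and `computeInputsAndOutputs` are hypothetical names, values are plain strings, and blocks are integers. An input is a value used inside the region but defined outside it; an output is a value defined inside the region and used outside it. The function takes everything by const reference, so it can be called speculatively without changing anything.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy instruction: lives in a block, may define one value, uses some values.
struct Inst {
  int block;
  std::string def;               // empty if the instruction defines nothing
  std::vector<std::string> uses; // names of the values it reads
};

struct InOut {
  std::set<std::string> inputs;  // used inside the region, defined outside
  std::set<std::string> outputs; // defined inside the region, used outside
};

// Computes both sets for the blocks in `region` without mutating anything.
InOut computeInputsAndOutputs(const std::vector<Inst> &function,
                              const std::set<int> &region) {
  std::set<std::string> definedInside;
  for (const Inst &inst : function)
    if (region.count(inst.block) != 0 && !inst.def.empty())
      definedInside.insert(inst.def);

  InOut result;
  for (const Inst &inst : function) {
    const bool inside = region.count(inst.block) != 0;
    for (const std::string &use : inst.uses) {
      const bool defInside = definedInside.count(use) != 0;
      if (inside && !defInside)
        result.inputs.insert(use);
      if (!inside && defInside)
        result.outputs.insert(use);
    }
  }
  return result;
}

// Usage: extract block 1 from a three-block function.
void inputsOutputsExample() {
  const std::vector<Inst> function = {{0, "a", {"arg"}},
                                      {1, "b", {"a", "arg"}},
                                      {1, "c", {"b", "b"}},
                                      {2, "", {"c", "a"}}};
  const InOut sets = computeInputsAndOutputs(function, {1});
  assert((sets.inputs == std::set<std::string>{"a", "arg"}));
  assert((sets.outputs == std::set<std::string>{"c"}));
}
```

The sizes of the two sets give a rough measure of how many parameters and results an extracted function would need, which is the kind of size estimate the commit message has in mind.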
  10. May 03, 2012
  11. May 02, 2012
  12. May 01, 2012
  13. Apr 30, 2012
  14. Apr 27, 2012
    • Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.). · 27c32461
      Hal Finkel authored
      Target specific types should not be vectorized. As a practical matter,
      these types are already register matched (at least in the x86 case),
      and codegen does not always work correctly (at least in the ppc case,
      and this is not worth fixing because ppc_fp128 is currently broken and
      will probably go away soon).
      
      llvm-svn: 155729