  1. Feb 03, 2012
  2. Feb 01, 2012
    • Add a basic-block autovectorization pass. · c34e5113
      Hal Finkel authored
      This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
      Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).
      
      llvm-svn: 149468
    • Disable InstCombine unsafe folding bitcasts of calls w/ varargs. · 9fa04815
      Jim Grosbach authored
      Changing arguments from being passed as fixed to varargs is unsafe, as
      the ABI may require they be handled differently (stack vs. register, for
      example).
      
      Remove two tests which rely on the bitcast being folded into the direct
      call, which is exactly the transformation that's unsafe.
      
      llvm-svn: 149457
  3. Jan 31, 2012
  4. Jan 28, 2012
  5. Jan 27, 2012
  6. Jan 25, 2012
  7. Jan 23, 2012
  8. Jan 20, 2012
    • Handle a corner case with IV chain collection with bailout instead of assert. · b9c822ab
      Andrew Trick authored
      Fixes PR11783: bad cast to AddRecExpr.
      
      llvm-svn: 148572
    • Test case comments missing from my previous checkin. · 16abc8a1
      Andrew Trick authored
      llvm-svn: 148571
    • Fix CountCodeReductionForAlloca to more accurately represent what SROA can and · e8415fea
      Nick Lewycky authored
      can't handle. Also don't produce non-zero results for things which won't be
      transformed by SROA at all just because we saw the loads/stores before we saw
      the use of the address.
      
      llvm-svn: 148536
    • SCEVExpander fixes. Affects LSR and indvars. · c908b43d
      Andrew Trick authored
      LSR has gradually been improved to more aggressively reuse existing code, particularly existing phi cycles. This exposed problems with the SCEVExpander's sloppy treatment of its insertion point. I applied some rigor to the insertion point problem that will hopefully avoid an endless bug cycle in this area. Changes:
      
      - Always use properlyDominates to check safe code hoisting.
      
      - The insertion point provided to SCEV is now considered a lower bound. This is usually a block terminator or the use itself. Under no circumstance may SCEVExpander insert below this point.
      
      - LSR is responsible for finding a "canonical" insertion point across expansion of different expressions.
      
      - Robust logic to determine whether IV increments are in "expanded" form and/or can be safely hoisted above some insertion point.
      
      Fixes PR11783: SCEVExpander assert.
      
      llvm-svn: 148535
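
The dominance rule in the first bullet above can be illustrated with a toy model. This is a minimal sketch, not LLVM code: the immediate-dominator map, the block names, and the helper `can_hoist_to` are all hypothetical, and the check is done at basic-block granularity only.

```python
# Toy dominator tree: each block maps to its immediate dominator.
# Block names are made up for illustration.
IDOM = {
    "entry": None,
    "preheader": "entry",
    "header": "preheader",
    "latch": "header",
}

def dominates(idom, a, b):
    """True if block a dominates block b (a block dominates itself)."""
    node = b
    while node is not None:
        if node == a:
            return True
        node = idom.get(node)
    return False

def properly_dominates(idom, a, b):
    """Strict dominance: a dominates b and a is not b."""
    return a != b and dominates(idom, a, b)

def can_hoist_to(idom, operand_blocks, target_block):
    """Hypothetical safety check: treat hoisting into target_block as safe
    only if every operand is defined in a block that properly dominates it."""
    return all(
        properly_dominates(idom, blk, target_block) for blk in operand_blocks
    )
```

Under this model, an expression whose operands live in `entry` and `preheader` may be hoisted into `header`, while one that depends on a value defined in `latch` may not.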
  9. Jan 19, 2012
  10. Jan 18, 2012
  11. Jan 17, 2012
  12. Jan 14, 2012
  13. Jan 13, 2012
  14. Jan 11, 2012
    • Don't try to create a GEP when the pointee type is unsized (such GEPs · 0bf46b53
      Duncan Sands authored
      are invalid).  Fixes a crash on array1.C from the GCC testsuite when
      compiled with dragonegg.
      
      llvm-svn: 147946
    • Improved compile time: · 82165698
      Stepan Dyatkovskiy authored
      1. Size heuristics changed. We now calculate the number of unswitching
      branches only once per loop.
      2. Some checks were moved from UnswitchIfProfitable to
      processCurrentLoop, since their results do not change during a
      processCurrentLoop iteration. This lets us skip some loops at an early stage.
      Extended statistics:
      - Added total number of instructions analyzed.
      
      llvm-svn: 147935
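
The "only once per loop" idea above amounts to caching a per-loop measurement instead of recomputing it for every candidate branch. The following is a rough sketch under that assumption; `LoopSizeCache` and `candidates_within_budget` are hypothetical names, and the size metric (instruction count) stands in for whatever the pass actually measures.

```python
class LoopSizeCache:
    """Hypothetical helper: measures each loop body once and reuses the result."""

    def __init__(self):
        self._sizes = {}
        # Mirrors the "total number of instructions analyzed" statistic.
        self.total_instructions_analyzed = 0

    def size(self, loop_name, body):
        # Only the first query for a loop walks its body.
        if loop_name not in self._sizes:
            self._sizes[loop_name] = len(body)
            self.total_instructions_analyzed += len(body)
        return self._sizes[loop_name]

def candidates_within_budget(cache, loop_name, body, candidates, threshold):
    """Reject the whole loop early if it is too large; otherwise keep
    every candidate branch for later profitability checks."""
    if cache.size(loop_name, body) > threshold:
        return []
    return list(candidates)
```

Querying the same loop repeatedly leaves `total_instructions_analyzed` unchanged after the first call, which is the compile-time saving the sketch is meant to show.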
    • If the global variable is removed by the linker, then don't constant merge it · c7915519
      Bill Wendling authored
      with other symbols.
      
      An object in the __cfstring section is supposed to be filled with CFString
      objects, which have a pointer to ___CFConstantStringClassReference followed by a
      pointer to a __cstring. If we allow the object in the __cstring section to be
      merged with another global, then it could end up in any section. Because the
      linker is going to remove these symbols in the final executable, we shouldn't
      bother to merge them.
      <rdar://problem/10564621>
      
      llvm-svn: 147899
  15. Jan 10, 2012
    • Enable LSR IV Chains with sufficient heuristics. · d5d2db9a
      Andrew Trick authored
      These heuristics are sufficient for enabling IV chains by
      default. Performance analysis has been done for i386, x86_64, and
      thumbv7. The optimization is rarely important, but can significantly
      speed up certain cases by eliminating spill code within the
      loop. Unrolled loops are prime candidates for IV chains. In many
      cases, the final code could still be improved with more target
      specific optimization following LSR. The goal of this feature is for
      LSR to make the best choice of induction variables.
      
      Instruction selection may not completely take advantage of this
      feature yet. As a result, there could be cases of slight code size
      increase.
      
      Code size can be worse on x86 because it doesn't support postincrement
      addressing. In fact, when chains are formed, you may see redundant
      address plus stride addition in the addressing mode. GenerateIVChains
      tries to compensate for the common cases.
      
      On ARM, code size increase can be mitigated by using postincrement
      addressing, but downstream codegen currently misses some opportunities.
      
      llvm-svn: 147826
  16. Jan 09, 2012
    • Adding IV chain generation to LSR. · 248d410e
      Andrew Trick authored
      After collecting chains, check if any should be materialized. If so,
      hide the chained IV users from the LSR solver. LSR will only solve for
      the head of the chain. GenerateIVChains will then materialize the
      chained IV users by computing the IV relative to its previous value in
      the chain.
      
      In theory, chained IV users could be exposed to LSR's solver. This
      would be considerably complicated to implement and I'm not aware of a
      case where we need it. In practice it's more important to
      intelligently prune the search space of nontrivial loops before
      running the solver; otherwise the solver is often forced to prune the
      best solutions. Hiding the chained users does this well, so
      that LSR is more likely to find the best IV for the chain as a whole.
      
      llvm-svn: 147801
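
Computing each chained user "relative to its previous value in the chain" can be pictured with plain address arithmetic. This is only a sketch of the idea, assuming a base address and a list of constant offsets such as an unrolled loop might produce; the function names are hypothetical and do not correspond to GenerateIVChains.

```python
def addresses_independent(base, offsets):
    """Every user recomputes its address from the base."""
    return [base + off for off in offsets]

def addresses_chained(base, offsets):
    """Only the head of the chain is computed from the base; each later
    user adds the delta from the previous element of the chain."""
    result = []
    prev_addr = None
    prev_off = None
    for off in offsets:
        if prev_addr is None:
            addr = base + off                    # head of the chain
        else:
            addr = prev_addr + (off - prev_off)  # increment from previous user
        result.append(addr)
        prev_addr, prev_off = addr, off
    return result
```

Both forms yield the same addresses. The chained form keeps a single running value live instead of one expression per user, which is where the reduced register pressure described above would come from.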
    • InstCombine: Teach foldLogOpOfMaskedICmpsHelper that sign bit tests are bit tests. · f9d0cc01
      Benjamin Kramer authored
      This subsumes several other transforms while enabling us to catch more cases.
      
      llvm-svn: 147777
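
Why a sign-bit test counts as a bit test can be checked with a small model. The sketch below assumes 32-bit two's-complement integers and uses hypothetical helper names; it shows the equivalence and one kind of fold it permits, not the actual InstCombine logic.

```python
# Illustrative only: models 32-bit integers in Python; not LLVM code.
SIGN_BIT = 0x80000000
MASK32 = 0xFFFFFFFF

def to_signed32(bits):
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    bits &= MASK32
    return bits - (1 << 32) if bits & SIGN_BIT else bits

def is_negative_by_compare(bits):
    # The signed comparison form: x < 0
    return to_signed32(bits) < 0

def is_negative_by_mask(bits):
    # The equivalent masked form: (x & SIGN_BIT) != 0
    return (bits & SIGN_BIT) != 0

def combined_test(bits, other_mask):
    # Once the sign test is viewed as a mask test,
    # "x < 0 or (x & other_mask) != 0" becomes one test
    # against the union of the two masks.
    return (bits & (SIGN_BIT | other_mask)) != 0
```

Because `x < 0` and `(x & SIGN_BIT) != 0` agree on every 32-bit pattern, a sign test can be merged with neighbouring mask tests in the same way as any other single-bit test.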
  17. Jan 08, 2012
  18. Jan 07, 2012
  19. Jan 06, 2012
  20. Jan 05, 2012
  21. Jan 04, 2012
  22. Jan 02, 2012
  23. Dec 31, 2011