  1. Apr 22, 2008
    • Dig through multiple levels of AND to thread jumps if needed. · d5425e8f
      Chris Lattner authored
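      A sketch of the shape this now handles (hypothetical value names): the
      phi of i1 constants may sit several ANDs below the branch rather than
      feeding the condition directly, and the pass now digs through them.
      
      	%a = and i1 %x, %phival		; %phival is a phi of i1 constants
      	%b = and i1 %a, %y
      	br i1 %b, label %L1, label %L2
      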
      llvm-svn: 50106
    • Teach jump threading to thread through blocks like: · 3df4c15d
      Chris Lattner authored
        br (and X, phi(Y, Z, false)), label L1, label L2
      
      This triggers once on 252.eon and 6 times on 176.gcc.  The blocks
      in question often look like this:
      
      bb262:		; preds = %bb261, %bb248
      	%iftmp.251.0 = phi i1 [ true, %bb261 ], [ false, %bb248 ]		; <i1> [#uses=4]
      	%tmp270 = icmp eq %struct.rtx_def* %tmp.0.i, null		; <i1> [#uses=1]
      	%bothcond = or i1 %iftmp.251.0, %tmp270		; <i1> [#uses=1]
      	br i1 %bothcond, label %bb288, label %bb273
      
      In this case, it is clear that it doesn't matter whether tmp.0.i is null when coming from bb261: on that path %iftmp.251.0 is true, so the branch always goes to bb288.  When coming from bb248, the null test is all that matters.
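      
      After threading, the result looks roughly like this (a sketch; the
      exact block layout may differ): the %bb261 edge is redirected straight
      to %bb288, and the remaining block, now reached only from %bb248,
      folds the phi to false so the OR reduces to the compare.
      
      bb262:		; preds = %bb248
      	%tmp270 = icmp eq %struct.rtx_def* %tmp.0.i, null
      	br i1 %tmp270, label %bb288, label %bb273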
      
      
      Another random example:
      
      check_asm_operands.exit:		; preds = %check_asm_operands.exit.thr_comm, %bb30.i, %bb12.i, %bb6.i413
      	%tmp.0.i420 = phi i1 [ true, %bb6.i413 ], [ true, %bb12.i ], [ true, %bb30.i ], [ false, %check_asm_operands.exit.thr_comm ]		; <i1> [#uses=1]
      	call void @llvm.stackrestore( i8* %savedstack ) nounwind 
      	%tmp4389 = icmp eq i32 %added_sets_1.0, 0		; <i1> [#uses=1]
      	%tmp4394 = icmp eq i32 %added_sets_2.0, 0		; <i1> [#uses=1]
      	%bothcond80 = and i1 %tmp4389, %tmp4394		; <i1> [#uses=1]
      	%bothcond81 = and i1 %bothcond80, %tmp.0.i420		; <i1> [#uses=1]
      	br i1 %bothcond81, label %bb4398, label %bb4397
      
      Here is the case from 252.eon:
      
      bb290.i.i:		; preds = %bb23.i57.i.i, %bb8.i39.i.i, %bb100.i.i, %bb100.i.i, %bb85.i.i110
      	%myEOF.1.i.i = phi i1 [ true, %bb100.i.i ], [ true, %bb100.i.i ], [ true, %bb85.i.i110 ], [ true, %bb8.i39.i.i ], [ false, %bb23.i57.i.i ]		; <i1> [#uses=2]
      	%i.4.i.i = phi i32 [ %i.1.i.i, %bb85.i.i110 ], [ %i.0.i.i, %bb100.i.i ], [ %i.0.i.i, %bb100.i.i ], [ %i.3.i.i, %bb8.i39.i.i ], [ %i.3.i.i, %bb23.i57.i.i ]		; <i32> [#uses=3]
      	%tmp292.i.i = load i8* %tmp16.i.i100, align 1		; <i8> [#uses=1]
      	%tmp293.not.i.i = icmp ne i8 %tmp292.i.i, 0		; <i1> [#uses=1]
      	%bothcond.i.i = and i1 %tmp293.not.i.i, %myEOF.1.i.i		; <i1> [#uses=1]
      	br i1 %bothcond.i.i, label %bb202.i.i, label %bb301.i.i
      (The pass reports: Factoring out 3 common predecessors.)
      
      On the path from bb23.i57.i.i, %myEOF.1.i.i is known false, so the
      branch must go to bb301.i.i and the load and compare are dead.
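      
      Sketched result of threading that edge (the name of the duplicated
      block is hypothetical): %bothcond.i.i folds to false there, the load
      and icmp drop out, and %i.4.i.i is simply %i.3.i.i on this path.
      
      bb290.i.i.thread:		; preds = %bb23.i57.i.i
      	br label %bb301.i.i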
      
      llvm-svn: 50096
    • refactor some code, no functionality change. · e369c35a
      Chris Lattner authored
      llvm-svn: 50094
    • remove dead code. · 8fb13cbe
      Chris Lattner authored
      llvm-svn: 50080
    • optimize "p != gep p, ..." better. This allows us to compile · c3a43935
      Chris Lattner authored
      getelementptr-seteq.ll into:
      
      define i1 @test(i64 %X, %S* %P) {
      	%C = icmp eq i64 %X, -1		; <i1> [#uses=1]
      	ret i1 %C
      }
      
      instead of:
      
      define i1 @test(i64 %X, %S* %P) {
      	%A.idx.mask = and i64 %X, 4611686018427387903		; <i64> [#uses=1]
      	%C = icmp eq i64 %A.idx.mask, 4611686018427387903		; <i1> [#uses=1]
      	ret i1 %C
      }
      
      And fixes the second half of PR2235.  This speeds up the insertion sort
      case by 45%, from 1.12s to 0.77s.  In practice, this will significantly
      speed up for loops structured like:
      
      for (double *P = Base + N; P != Base; --P)
        ...
      
      Which happens frequently for C++ iterators.
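      
      The core fold, in minimal form (a sketch using a hypothetical i32
      element type; the actual test uses a struct): a pointer compared
      against a getelementptr off that same pointer becomes a compare of
      the index, with no address arithmetic at all.
      
      define i1 @cmp(i64 %X, i32* %P) {
      	%A = getelementptr i32* %P, i64 %X		; &P[X]
      	%C = icmp eq i32* %A, %P		; &P[X] == &P[0]
      	ret i1 %C
      }
      
      Instcombine can now reduce this to a direct compare of %X against 0
      instead of materializing the scaled, masked offset.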
      
      llvm-svn: 50079
  2. Apr 10, 2008
    • Teach InstCombine's ComputeMaskedBits to handle pointer expressions · 99b7b3f0
      Dan Gohman authored
      in addition to integer expressions. Rewrite GetOrEnforceKnownAlignment
      as a ComputeMaskedBits problem, moving all of its special alignment
      knowledge to ComputeMaskedBits as low-zero-bits knowledge.
      
      Also, teach ComputeMaskedBits a few basic things about Mul and PHI
      instructions.
      
      This improves ComputeMaskedBits-based simplifications in a few cases,
      but more noticeably it significantly improves instcombine's alignment
      detection for loads, stores, and memory intrinsics.
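      
      For instance (a hypothetical illustration): with a 16-byte-aligned
      base and an index scaled by 4, ComputeMaskedBits now knows the low
      two bits of the address are zero, so instcombine can raise the
      alignment on the load.
      
      @G = global [8 x i32] zeroinitializer, align 16
      
      define i32 @f(i64 %i) {
      	%p = getelementptr [8 x i32]* @G, i64 0, i64 %i
      	%v = load i32* %p, align 1		; can be retagged align 4
      	ret i32 %v
      }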
      
      llvm-svn: 49492
  3. Apr 02, 2008
    • · 586740f4
      David Greene authored
      Iterators following an erased SmallVector element are invalidated, so
      don't access cached iterators that point after the erased element.
      
      Re-apply 49056 with SmallVector support.
      
      llvm-svn: 49106
    • Reverting 49056 due to the build being broken. · 052838c5
      Tanya Lattner authored
      llvm-svn: 49060
    • · 7f7edc38
      David Greene authored
      Iterators following an erased SmallVector element are invalidated, so
      don't access cached iterators that point after the erased element.
      
      llvm-svn: 49056
  4. Mar 28, 2008
    • make memset inference significantly more powerful: it can now handle · d62964a7
      Chris Lattner authored
      memsets that initialize "structs of arrays" and other store sequences
      that are not sequential.  This is still only enabled if you pass 
      -form-memset-from-stores.  The flag is not heavily tested, and I haven't
      analyzed the performance impact of passing it either, but it causes no
      make check regressions.
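      
      A sketch of the new capability (hypothetical IR; this only fires with
      -form-memset-from-stores): stores that together cover a contiguous
      zeroed region, but are not emitted in address order, can be merged.
      
      define void @zero4(i32* %P) {
      	%p3 = getelementptr i32* %P, i64 3
      	store i32 0, i32* %p3
      	%p0 = getelementptr i32* %P, i64 0
      	store i32 0, i32* %p0
      	%p2 = getelementptr i32* %P, i64 2
      	store i32 0, i32* %p2
      	%p1 = getelementptr i32* %P, i64 1
      	store i32 0, i32* %p1
      	ret void
      }
      
      These four stores can become a single 16-byte llvm.memset even though
      they never appear in sequential order.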
      
      llvm-svn: 48909