  1. Jan 09, 2011
  2. Jan 08, 2011
    • Cameron Zwarich's avatar
      Fix coding style. · 0939bc37
      Cameron Zwarich authored
      llvm-svn: 123093
      0939bc37
    • Chris Lattner's avatar
      Chris Lattner authored · 7d6433ae
      fix a latent bug in memcpyoptimizer that my recent patches exposed: it wasn't
      updating memdep when fusing stores together.  This fixes the crash optimizing
      the bullet benchmark.
      
      llvm-svn: 123091
      7d6433ae
    • Chris Lattner's avatar
      tryMergingIntoMemset can only handle constant length memsets. · ff6ed2ac
      Chris Lattner authored
      llvm-svn: 123090
      ff6ed2ac
    • Chris Lattner's avatar
      Chris Lattner authored · 9a1d63ba
      Merge memsets followed by neighboring memsets and other stores into
      larger memsets.  Among other things, this fixes rdar://8760394 and
      allows us to handle "Example 2" from http://blog.regehr.org/archives/320,
      compiling it into a single 4096-byte memset:
      
      _mad_synth_mute:                        ## @mad_synth_mute
      ## BB#0:                                ## %entry
      	pushq	%rax
      	movl	$4096, %esi             ## imm = 0x1000
      	callq	___bzero
      	popq	%rax
      	ret
      
      llvm-svn: 123089
      9a1d63ba
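
The merging described in this commit can be pictured as coalescing byte ranges. The following is a minimal sketch of that idea only, not the MemCpyOptimizer code: `ByteRange` and `mergeRanges` are hypothetical names, and the model assumes every write stores one repeated byte value at a constant offset from a shared base pointer, and that ranges writing different values do not overlap.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical model: one write of `value` to bytes [start, end) relative to
// a shared base pointer. A 1-byte store and a constant-length memset both fit.
struct ByteRange {
  int64_t start;
  int64_t end;    // exclusive
  uint8_t value;  // byte value written to every position in the range
};

// Coalesce ranges that overlap or touch and write the same byte value.
// Each surviving range corresponds to one candidate memset.
// Assumption: ranges with different values never overlap, so write order
// does not matter in this simplified model.
std::vector<ByteRange> mergeRanges(std::vector<ByteRange> ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const ByteRange &a, const ByteRange &b) {
              return a.start < b.start;
            });
  std::vector<ByteRange> merged;
  for (const ByteRange &r : ranges) {
    if (!merged.empty() && merged.back().value == r.value &&
        r.start <= merged.back().end) {
      // Extends (or is contained in) the previous range: grow it.
      merged.back().end = std::max(merged.back().end, r.end);
    } else {
      merged.push_back(r);
    }
  }
  return merged;
}
```

Under these assumptions, three zeroing writes covering [0, 4), [4, 8) and [8, 4096) collapse into one range of 4096 bytes, which matches the shape of the single `___bzero` call in the assembly above.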
    • Chris Lattner's avatar
      Chris Lattner authored · 5120ebf1
      fix an issue in IsPointerOffset that prevented us from recognizing that
      P and P+1 are relative to the same base pointer.
      
      llvm-svn: 123087
      5120ebf1
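
To illustrate the property this fix is about, here is a toy model rather than the real `IsPointerOffset`, which operates on LLVM values. `PtrExpr` and the lowercase `isPointerOffset` below are hypothetical stand-ins: a pointer is represented as a base id plus a constant byte offset, and the plain pointer P is simply "base + 0".

```cpp
#include <cstdint>

// Hypothetical toy model of a pointer value: a base pointer (identified by
// an id) plus a constant byte offset, roughly what a constant-index
// getelementptr expresses. The plain pointer P has offset 0.
struct PtrExpr {
  int baseId;
  int64_t offset;
};

// If both pointers share a base, store p2 - p1 (in bytes) in `delta` and
// return true. Because a plain base pointer is modeled as "base + 0",
// P and P+1 are recognized as relative to the same base.
bool isPointerOffset(const PtrExpr &p1, const PtrExpr &p2, int64_t &delta) {
  if (p1.baseId != p2.baseId)
    return false;
  delta = p2.offset - p1.offset;
  return true;
}
```

In this model, comparing P (`{base, 0}`) with P+1 (`{base, 1}`) yields a delta of 1, which is the information a store-merging pass needs in order to treat the two accesses as neighbors.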
    • Chris Lattner's avatar
      Chris Lattner authored · 4dc1fd93
      enhance memcpyopt to merge a store and a subsequent
      memset into a single larger memset.
      
      llvm-svn: 123086
      4dc1fd93
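
At the source level, the store-plus-memset pattern might look like the sketch below. The function names and the 16-byte size are made up for illustration; the commit message does not give a concrete test case.

```cpp
#include <cstring>

// Before: a single byte store followed by a memset of the neighboring bytes.
void zeroSeparately(char *p) {
  p[0] = 0;
  std::memset(p + 1, 0, 15);
}

// After: the one 16-byte memset that merging the two writes would produce.
void zeroMerged(char *p) {
  std::memset(p, 0, 16);
}
```

Both functions leave the same 16 bytes zeroed, which is why the two writes can be replaced by one call.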
    • Chris Lattner's avatar
      fit in 80 cols · 2f2c3351
      Chris Lattner authored
      llvm-svn: 123085
      2f2c3351
    • Chris Lattner's avatar
      merge two tests and filecheckify · 9dbbc49f
      Chris Lattner authored
      llvm-svn: 123082
      9dbbc49f
    • Chris Lattner's avatar
      constify TargetData references. · c638147e
      Chris Lattner authored
      Split memset formation logic out into its own
      "tryMergingIntoMemset" helper function.
      
      llvm-svn: 123081
      c638147e
    • Chris Lattner's avatar
      Chris Lattner authored · 59c82f85
      When loop rotation happens, it is *very* common for the duplicated condbr
      to be foldable into an uncond branch.  When this happens, we can make a
      much simpler CFG for the loop, which is important for nested loop cases
      where we want the outer loop to be aggressively optimized.
      
      Handle this case more aggressively.  For example, previously on
      phi-duplicate.ll we would get this:
      
      
      define void @test(i32 %N, double* %G) nounwind ssp {
      entry:
        %cmp1 = icmp slt i64 1, 1000
        br i1 %cmp1, label %bb.nph, label %for.end
      
      bb.nph:                                           ; preds = %entry
        br label %for.body
      
      for.body:                                         ; preds = %bb.nph, %for.cond
        %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
        %arrayidx = getelementptr inbounds double* %G, i64 %j.02
        %tmp3 = load double* %arrayidx
        %sub = sub i64 %j.02, 1
        %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
        %tmp7 = load double* %arrayidx6
        %add = fadd double %tmp3, %tmp7
        %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
        store double %add, double* %arrayidx10
        %inc = add nsw i64 %j.02, 1
        br label %for.cond
      
      for.cond:                                         ; preds = %for.body
        %cmp = icmp slt i64 %inc, 1000
        br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
      
      for.cond.for.end_crit_edge:                       ; preds = %for.cond
        br label %for.end
      
      for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
        ret void
      }
      
      Now we get the much nicer:
      
      define void @test(i32 %N, double* %G) nounwind ssp {
      entry:
        br label %for.body
      
      for.body:                                         ; preds = %entry, %for.body
        %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
        %arrayidx = getelementptr inbounds double* %G, i64 %j.01
        %tmp3 = load double* %arrayidx
        %sub = sub i64 %j.01, 1
        %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
        %tmp7 = load double* %arrayidx6
        %add = fadd double %tmp3, %tmp7
        %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
        store double %add, double* %arrayidx10
        %inc = add nsw i64 %j.01, 1
        %cmp = icmp slt i64 %inc, 1000
        br i1 %cmp, label %for.body, label %for.end
      
      for.end:                                          ; preds = %for.body
        ret void
      }
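
For readers less used to IR, the two listings above correspond roughly to the C++ shapes below. This is an assumed reconstruction from the IR, since the original source of phi-duplicate.ll is not shown in the commit message, and the function names are hypothetical.

```cpp
// Assumed source-level shape of the loop: the condition is tested at the
// top of every iteration.
void topTested(double *G) {
  for (long j = 1; j < 1000; ++j)
    G[j] = G[j] + G[j - 1];
}

// After rotation: the loop condition is duplicated into a guard, and the
// loop itself tests at the bottom. This mirrors the first IR listing.
void rotatedWithGuard(double *G) {
  long j = 1;
  if (j < 1000) {  // guard compares two constants, so it is always true
    do {
      G[j] = G[j] + G[j - 1];
      ++j;
    } while (j < 1000);
  }
}

// After folding the constant guard into an unconditional branch: a single
// bottom-tested loop. This mirrors the second, simpler IR listing.
void rotatedGuardFolded(double *G) {
  long j = 1;
  do {
    G[j] = G[j] + G[j - 1];
    ++j;
  } while (j < 1000);
}
```

All three functions compute the same result. The difference is only in control flow: the folded version has no guard branch and no extra preheader or exit blocks.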
      
      With all of these recent changes, we are now able to compile:
      
      void foo(char *X) {
       for (int i = 0; i != 100; ++i) 
         for (int j = 0; j != 100; ++j)
           X[j+i*100] = 0;
      }
      
      into a single memset of 10000 bytes.  This series of changes
      should also help other nested loop scenarios.
      
      llvm-svn: 123079
      59c82f85
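
The equivalence claimed for `foo` in the commit message above can be checked with a small sketch. `fooLoops` is the function from the message under a different name, and `fooMemset` is a hypothetical hand-written version of what the optimized code amounts to.

```cpp
#include <cstring>

// The nested loop from the commit message, unchanged apart from the name.
void fooLoops(char *X) {
  for (int i = 0; i != 100; ++i)
    for (int j = 0; j != 100; ++j)
      X[j + i * 100] = 0;
}

// j + i*100 visits every index in [0, 10000) exactly once, so the loops
// zero 10000 contiguous bytes, which is what one memset does.
void fooMemset(char *X) {
  std::memset(X, 0, 10000);
}
```

Because the inner loop covers 100 consecutive bytes and the outer loop advances by exactly 100 each time, the inner loop can become a 100-byte memset and the outer loop can then be merged into a single 10000-byte one.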