  1. Jan 08, 2011
enhance memcpyopt to merge a store and a subsequent memset into a single larger memset · 4dc1fd93
Chris Lattner authored
      
      llvm-svn: 123086
      merge two tests and filecheckify · 9dbbc49f
      Chris Lattner authored
      llvm-svn: 123082
When loop rotation happens, it is *very* common for the duplicated condbr to be foldable into an uncond branch · 59c82f85
Chris Lattner authored
When this happens, we can make a much simpler CFG for the loop, which
is important for nested loop cases where we want the outer loop to be
aggressively optimized.
      
      Handle this case more aggressively.  For example, previously on
      phi-duplicate.ll we would get this:
      
      
      define void @test(i32 %N, double* %G) nounwind ssp {
      entry:
        %cmp1 = icmp slt i64 1, 1000
        br i1 %cmp1, label %bb.nph, label %for.end
      
      bb.nph:                                           ; preds = %entry
        br label %for.body
      
      for.body:                                         ; preds = %bb.nph, %for.cond
        %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
        %arrayidx = getelementptr inbounds double* %G, i64 %j.02
        %tmp3 = load double* %arrayidx
        %sub = sub i64 %j.02, 1
        %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
        %tmp7 = load double* %arrayidx6
        %add = fadd double %tmp3, %tmp7
        %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
        store double %add, double* %arrayidx10
        %inc = add nsw i64 %j.02, 1
        br label %for.cond
      
      for.cond:                                         ; preds = %for.body
        %cmp = icmp slt i64 %inc, 1000
        br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
      
      for.cond.for.end_crit_edge:                       ; preds = %for.cond
        br label %for.end
      
      for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
        ret void
      }
      
      Now we get the much nicer:
      
      define void @test(i32 %N, double* %G) nounwind ssp {
      entry:
        br label %for.body
      
      for.body:                                         ; preds = %entry, %for.body
        %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
        %arrayidx = getelementptr inbounds double* %G, i64 %j.01
        %tmp3 = load double* %arrayidx
        %sub = sub i64 %j.01, 1
        %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
        %tmp7 = load double* %arrayidx6
        %add = fadd double %tmp3, %tmp7
        %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
        store double %add, double* %arrayidx10
        %inc = add nsw i64 %j.01, 1
        %cmp = icmp slt i64 %inc, 1000
        br i1 %cmp, label %for.body, label %for.end
      
      for.end:                                          ; preds = %for.body
        ret void
      }
      
      With all of these recent changes, we are now able to compile:
      
      void foo(char *X) {
       for (int i = 0; i != 100; ++i) 
         for (int j = 0; j != 100; ++j)
           X[j+i*100] = 0;
      }
      
into a single memset of 10000 bytes.  This series of changes
should also help other nested loop scenarios.
      
      llvm-svn: 123079
      Three major changes: · 063dca0f
      Chris Lattner authored
      1. Rip out LoopRotate's domfrontier updating code.  It isn't
         needed now that LICM doesn't use DF and it is super complex
         and gross.
      2. Make DomTree updating code a lot simpler and faster.  The 
         old loop over all the blocks was just to find a block??
      3. Change the code that inserts the new preheader to just use
         SplitCriticalEdge instead of doing an overcomplex 
         reimplementation of it.
      
      No behavior change, except for the name of the inserted preheader.
      
      llvm-svn: 123072
      First step in fixing PR8927: · 45e6c195
      Rafael Espindola authored
Add an unnamed_addr bit to global variables and functions. This will be used
      to indicate that the address is not significant and therefore the constant
      or function can be merged with others.
      
      If an optimization pass can show that an address is not used, it can set this.
      
      Examples of things that can have this set by the FE are globals created to
      hold string literals and C++ constructors.
      
      Adding unnamed_addr to a non-const global should have no effect unless
      an optimization can transform that global into a constant.
      
      Aliases are not allowed to have unnamed_addr since I couldn't figure
      out any use for it.
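A hypothetical IR sketch of the new attribute (global names invented for illustration): two identical string-literal constants marked unnamed_addr, telling the optimizer their addresses are not significant and the constants may be merged:

```llvm
; both globals are candidates for merging into one constant,
; since unnamed_addr says their addresses are not significant
@.str1 = private unnamed_addr constant [6 x i8] c"hello\00"
@.str2 = private unnamed_addr constant [6 x i8] c"hello\00"
```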
      
      llvm-svn: 123063
Have loop-rotate simplify instructions (yay instsimplify!) as it clones them into the loop preheader, eliminating silly instructions like "icmp i32 0, 100" in fixed tripcount loops · 8c5defd0
Chris Lattner authored
This also better exposes the
      bigger problem with loop rotate that I'd like to fix: once this has been
      folded, the duplicated conditional branch *often* turns into an uncond branch.
      
Not handling this aggressively pessimizes later loop optimizations
somethin' fierce by making "dominates all exit blocks" checks fail.
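A hedged IR sketch of the effect (simplified, not taken from the commit): rotating a fixed-tripcount loop clones the entry test into the preheader as a compare of constants, which instsimplify now folds, turning the cloned conditional branch into an unconditional one:

```llvm
; cloned guard before simplification: a compare of two constants
entry:
  %cmp1 = icmp slt i32 0, 100
  br i1 %cmp1, label %for.body, label %for.end

; after instsimplify folds %cmp1 to true, the branch is unconditional
entry:
  br label %for.body
```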
      
      llvm-svn: 123060
Recognize inline asm 'rev $0, $1' as a bswap intrinsic call. · 078b0b09
      Evan Cheng authored
      llvm-svn: 123048
      Do not model all INLINEASM instructions as having unmodelled side effects. · 6eb516db
      Evan Cheng authored
      Instead encode llvm IR level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstr::hasUnmodeledSideEffects() to check
      the operand when the instruction is an INLINEASM.
      
      This allows memory instructions to be moved around INLINEASM instructions.
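A hedged IR-level sketch (not from the commit): only asm marked sideeffect keeps the unmodeled-side-effects bit, so memory operations may now be scheduled across the second, side-effect-free asm call:

```llvm
call void asm sideeffect "nop", ""()   ; hasUnmodeledSideEffects() is true
call void asm "nop", ""()              ; no side effects modeled
%v = load i32* %p                      ; free to move across the second call
```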
      
      llvm-svn: 123044