  1. Jan 08, 2011
    • Chris Lattner's avatar
      constify TargetData references. · c638147e
      Chris Lattner authored
      Split memset formation logic out into its own
      "tryMergingIntoMemset" helper function.
      
      llvm-svn: 123081
      c638147e
    • Chris Lattner's avatar
      When loop rotation happens, it is *very* common for the duplicated condbr · 59c82f85
      Chris Lattner authored
      to be foldable into an uncond branch.  When this happens, we can make a
      much simpler CFG for the loop, which is important for nested loop cases
      where we want the outer loop to be aggressively optimized.
      
      Handle this case more aggressively.  For example, previously on
      phi-duplicate.ll we would get this:
      
      
      define void @test(i32 %N, double* %G) nounwind ssp {
      entry:
        %cmp1 = icmp slt i64 1, 1000
        br i1 %cmp1, label %bb.nph, label %for.end
      
      bb.nph:                                           ; preds = %entry
        br label %for.body
      
      for.body:                                         ; preds = %bb.nph, %for.cond
        %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
        %arrayidx = getelementptr inbounds double* %G, i64 %j.02
        %tmp3 = load double* %arrayidx
        %sub = sub i64 %j.02, 1
        %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
        %tmp7 = load double* %arrayidx6
        %add = fadd double %tmp3, %tmp7
        %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
        store double %add, double* %arrayidx10
        %inc = add nsw i64 %j.02, 1
        br label %for.cond
      
      for.cond:                                         ; preds = %for.body
        %cmp = icmp slt i64 %inc, 1000
        br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
      
      for.cond.for.end_crit_edge:                       ; preds = %for.cond
        br label %for.end
      
      for.end:                                          ; preds = %for.cond.for.end_crit_edge, %entry
        ret void
      }
      
      Now we get the much nicer:
      
      define void @test(i32 %N, double* %G) nounwind ssp {
      entry:
        br label %for.body
      
      for.body:                                         ; preds = %entry, %for.body
        %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
        %arrayidx = getelementptr inbounds double* %G, i64 %j.01
        %tmp3 = load double* %arrayidx
        %sub = sub i64 %j.01, 1
        %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
        %tmp7 = load double* %arrayidx6
        %add = fadd double %tmp3, %tmp7
        %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
        store double %add, double* %arrayidx10
        %inc = add nsw i64 %j.01, 1
        %cmp = icmp slt i64 %inc, 1000
        br i1 %cmp, label %for.body, label %for.end
      
      for.end:                                          ; preds = %for.body
        ret void
      }
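
Read as C, the two IR listings above correspond roughly to the functions below. This is a sketch reconstructed from the IR, not the source of phi-duplicate.ll; the function names are made up for illustration. The second function shows the rotated loop with its duplicated guard, and the third shows what remains once the guard (1 < 1000, always true) is folded away.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical source form: G[j] = G[j] + G[j-1] for j in [1, 1000). */
void prefix_sum_original(double *G) {
    assert(G != NULL);
    for (long j = 1; j < 1000; ++j)
        G[j] = G[j] + G[j - 1];
}

/* Rotated form: the loop test is duplicated as a guard in front of the
   loop, and the loop itself tests at the bottom. */
void prefix_sum_rotated(double *G) {
    assert(G != NULL);
    long j = 1;
    if (j < 1000) {              /* duplicated test, "icmp slt i64 1, 1000" */
        do {
            G[j] = G[j] + G[j - 1];
            ++j;
        } while (j < 1000);
    }
}

/* After folding: the guard is known to be true, so only the bottom-tested
   loop remains, which fits in a single basic block. */
void prefix_sum_folded(double *G) {
    assert(G != NULL);
    long j = 1;
    do {
        G[j] = G[j] + G[j - 1];
        ++j;
    } while (j < 1000);
}
```

All three functions perform the same stores in the same order; only the control flow around the loop body differs.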
      
      With all of these recent changes, we are now able to compile:
      
      void foo(char *X) {
       for (int i = 0; i != 100; ++i) 
         for (int j = 0; j != 100; ++j)
           X[j+i*100] = 0;
      }
      
      into a single memset of 10000 bytes.  This series of changes
      should also help other nested loop scenarios.
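
As a rough illustration of the equivalence, the nested loop and a single memset can be written side by side. The names foo_loops and foo_memset are hypothetical and exist only for this sketch; the second function is a hand-written stand-in for what the optimizer produces, not compiler output.

```c
#include <assert.h>
#include <string.h>

/* The nested loop from the commit message, renamed for the comparison. */
void foo_loops(char *X) {
    assert(X != NULL);
    for (int i = 0; i != 100; ++i)
        for (int j = 0; j != 100; ++j)
            X[j + i * 100] = 0;
}

/* Hand-written equivalent: the index j + i*100 covers 0..9999 exactly
   once, so the whole nest zeroes 100 * 100 contiguous bytes. */
void foo_memset(char *X) {
    assert(X != NULL);
    memset(X, 0, 100 * 100);
}
```

Both functions leave the first 10000 bytes zeroed and touch nothing beyond them.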
      
      llvm-svn: 123079
      59c82f85
    • Chris Lattner's avatar
      make domtree verification print something useful on failure. · 5f7734c4
      Chris Lattner authored
      llvm-svn: 123078
      5f7734c4
    • Chris Lattner's avatar
      split ssa updating code out to its own helper function. Don't bother · 30f318e5
      Chris Lattner authored
      moving the OrigHeader block anymore: we just merge it away anyway so
      its code layout doesn't matter.
      
      llvm-svn: 123077
      30f318e5
    • Chris Lattner's avatar
      Implement a TODO: Enhance loopinfo to merge away the unconditional branch · 2615130e
      Chris Lattner authored
      that it was leaving in loops after rotation (between the original latch
      block and the original header).
      
      With this change, it is possible for rotated loops to have just a single
      basic block, which is useful.
      
      llvm-svn: 123075
      2615130e
    • Chris Lattner's avatar
      various code cleanups, enhance MergeBlockIntoPredecessor to preserve · 930b716e
      Chris Lattner authored
      loop info.
      
      llvm-svn: 123074
      930b716e
    • Chris Lattner's avatar
      inline preserveCanonicalLoopForm now that it is simple. · fee37c5f
      Chris Lattner authored
      llvm-svn: 123073
      fee37c5f
    • Chris Lattner's avatar
      Three major changes: · 063dca0f
      Chris Lattner authored
      1. Rip out LoopRotate's domfrontier updating code.  It isn't
         needed now that LICM doesn't use DF and it is super complex
         and gross.
      2. Make DomTree updating code a lot simpler and faster.  The 
         old loop over all the blocks was just to find a block??
      3. Change the code that inserts the new preheader to just use
         SplitCriticalEdge instead of doing an overcomplex 
         reimplementation of it.
      
      No behavior change, except for the name of the inserted preheader.
      
      llvm-svn: 123072
      063dca0f
    • Chris Lattner's avatar
      reduce nesting. · 30d95f9f
      Chris Lattner authored
      llvm-svn: 123071
      30d95f9f
    • Chris Lattner's avatar
      LoopRotate requires canonical loop form, so it always has preheaders · 7fab23bc
      Chris Lattner authored
      and latch blocks.  Reorder entry conditions to make the pass faster
      and more logical.
      
      llvm-svn: 123069
      7fab23bc
    • Chris Lattner's avatar
      use the LI ivar. · d62691f4
      Chris Lattner authored
      llvm-svn: 123068
      d62691f4
    • Chris Lattner's avatar
      some cleanups: remove dead arguments and eliminate ivars · 385f2ec6
      Chris Lattner authored
      that are just passed to one function.
      
      llvm-svn: 123067
      385f2ec6
    • Chris Lattner's avatar
      fix an issue duncan pointed out, which could cause loop rotate · 25ba40a0
      Chris Lattner authored
      to violate LCSSA form
      
      llvm-svn: 123066
      25ba40a0
    • Cameron Zwarich's avatar
      Fix coding style issues. · b4ab257b
      Cameron Zwarich authored
      llvm-svn: 123065
      b4ab257b
    • Cameron Zwarich's avatar
      Make more passes preserve dominators (or state that they preserve dominators if · 84986b29
      Cameron Zwarich authored
      they already do). This removes two dominator recomputations prior to isel,
      which is a 1% improvement in total llc time for 403.gcc.
      
      The only potentially suspect thing is making GCStrategy recompute dominators if
      it used a custom lowering strategy.
      
      llvm-svn: 123064
      84986b29
    • Rafael Espindola's avatar
      First step in fixing PR8927: · 45e6c195
      Rafael Espindola authored
      Add an unnamed_addr bit to global variables and functions. This will be used
      to indicate that the address is not significant and therefore the constant
      or function can be merged with others.
      
      If an optimization pass can show that an address is not used, it can set this.
      
      Examples of things that can have this set by the FE are globals created to
      hold string literals and C++ constructors.
      
      Adding unnamed_addr to a non-const global should have no effect unless
      an optimization can transform that global into a constant.
      
      Aliases are not allowed to have unnamed_addr since I couldn't figure
      out any use for it.
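
A C-level analogy may help show what "the address is not significant" means. In the sketch below (all names are hypothetical), same_text only looks at contents, so its answer would not change if the two constants were merged into one; same_object compares addresses, which is the kind of use that makes an address significant.

```c
#include <string.h>

/* Two constants with identical contents. */
static const char greeting_a[] = "hello";
static const char greeting_b[] = "hello";

/* Content-only comparison: unaffected by whether the two arrays
   share storage. */
int same_text(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}

/* Address comparison: the result depends on object identity, so
   merging the two constants would change what this returns. */
int same_object(const void *a, const void *b) {
    return a == b;
}
```

A constant that is only ever used the way same_text uses it is a candidate for the bit described above.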
      
      llvm-svn: 123063
      45e6c195
    • Cameron Zwarich's avatar
      Contract subloop bodies. However, it is still important to visit the phis at the · 80bd9af7
      Cameron Zwarich authored
      top of subloop headers, as the phi uses logically occur outside of the subloop.
      
      llvm-svn: 123062
      80bd9af7
    • Frits van Bommel's avatar
    • Chris Lattner's avatar
      Have loop-rotate simplify instructions (yay instsimplify!) as it clones · 8c5defd0
      Chris Lattner authored
      them into the loop preheader, eliminating silly instructions like
      "icmp i32 0, 100" in fixed tripcount loops.  This also better exposes the 
      bigger problem with loop rotate that I'd like to fix: once this has been
      folded, the duplicated conditional branch *often* turns into an uncond branch.
      
      Not aggressively handling this is pessimizing later loop optimizations 
      somethin' fierce by making "dominates all exit blocks" checks fail.
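
The kind of folding involved can be sketched in a few lines. This is not instsimplify's code: the types and names are hypothetical, and the predicate is assumed to be signed less-than, since the message does not name one.

```c
#include <stdint.h>

/* Hypothetical operand: either a known constant or a runtime value. */
struct operand {
    int is_constant;   /* nonzero when `value` is known at compile time */
    int32_t value;
};

enum fold_result { FOLD_FALSE = 0, FOLD_TRUE = 1, FOLD_UNKNOWN = 2 };

/* Fold a signed less-than compare when both operands are constants;
   otherwise report that the result is not known. */
enum fold_result fold_icmp_slt(struct operand lhs, struct operand rhs) {
    if (lhs.is_constant && rhs.is_constant)
        return lhs.value < rhs.value ? FOLD_TRUE : FOLD_FALSE;
    return FOLD_UNKNOWN;
}
```

Once the compare folds to a constant, the branch that consumes it has only one possible target, which is why the duplicated conditional branch can become unconditional.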
      
      llvm-svn: 123060
      8c5defd0
    • Chris Lattner's avatar
      Revamp the ValueMapper interfaces in a couple ways: · 43f8d164
      Chris Lattner authored
      1. Take a flags argument instead of a bool.  This makes
         it more clear to the reader what it is used for.
      2. Add a flag that says that "remapping a value not in the
         map is ok".
      3. Reimplement MapValue to share a bunch of code and be a lot
         more efficient.  For lookup failures, don't drop null values
         into the map.
      4. Using the new flag a bunch of code can vaporize in LinkModules
         and LoopUnswitch, kill it.
      
      No functionality change.
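
Points 1 to 3 can be sketched with a toy lookup. The enumerator and function names below are hypothetical, not LLVM's, and the sketch assumes that a value missing from the map is simply left unchanged when the flag is set.

```c
#include <stddef.h>

/* Hypothetical flag set; a named flag reads better at call sites
   than a bare true/false argument. */
enum remap_flags {
    REMAP_NONE = 0,
    REMAP_IGNORE_MISSING = 1 << 0   /* a value absent from the map is ok */
};

struct map_entry { int from; int to; };

/* Look up `value` in `map`. Returns 1 and writes *out on success.
   On a miss, REMAP_IGNORE_MISSING maps the value to itself; without
   it the function returns 0. The map is const, so a failed lookup
   never inserts a placeholder entry. */
int remap_value(const struct map_entry *map, size_t n, int value,
                unsigned flags, int *out) {
    for (size_t i = 0; i < n; ++i) {
        if (map[i].from == value) {
            *out = map[i].to;
            return 1;
        }
    }
    if (flags & REMAP_IGNORE_MISSING) {
        *out = value;
        return 1;
    }
    return 0;
}
```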
      
      llvm-svn: 123058
      43f8d164
    • Chris Lattner's avatar
      two minor changes: switch to the standard ValueToValueMapTy · 2b3f20e6
      Chris Lattner authored
      map from ValueMapper.h (giving us access to its utilities)
      and add a fastpath in the loop rotation code, avoiding expensive
      ssa updator manipulation for values with nothing to update.
      
      llvm-svn: 123057
      2b3f20e6
    • Evan Cheng's avatar
      Recognize inline asm 'rev $0, $1' as a bswap intrinsic call. · 078b0b09
      Evan Cheng authored
      llvm-svn: 123048
      078b0b09
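
For reference, a 32-bit byte swap can be written in portable C as below. This assumes the inline asm in question is the ARM rev instruction operating on a 32-bit register; the function name bswap32 is chosen for this sketch.

```c
#include <stdint.h>

/* Reverse the order of the four bytes in a 32-bit value. */
uint32_t bswap32(uint32_t x) {
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}
```

Applying the function twice returns the original value, as expected of a byte swap.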
    • Evan Cheng's avatar
      Do not model all INLINEASM instructions as having unmodelled side effects. · 6eb516db
      Evan Cheng authored
      Instead encode the LLVM IR level property "HasSideEffects" in an operand (shared
      with IsAlignStack). Added MachineInstr::hasUnmodeledSideEffects() to check
      the operand when the instruction is an INLINEASM.
      
      This allows memory instructions to be moved around INLINEASM instructions.
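
Sharing one operand between two boolean properties usually means packing them as bits. The sketch below shows the idea; the bit positions and all names are assumptions made for illustration, not the encoding used by the commit.

```c
/* Hypothetical bit layout for the shared operand. */
enum {
    EXTRA_HAS_SIDE_EFFECTS = 1u << 0,
    EXTRA_IS_ALIGN_STACK   = 1u << 1
};

/* Pack both properties into a single immediate value. */
unsigned pack_extra_info(int has_side_effects, int is_align_stack) {
    unsigned extra = 0;
    if (has_side_effects)
        extra |= EXTRA_HAS_SIDE_EFFECTS;
    if (is_align_stack)
        extra |= EXTRA_IS_ALIGN_STACK;
    return extra;
}

/* Query one property without disturbing the other. */
int has_unmodeled_side_effects(unsigned extra) {
    return (extra & EXTRA_HAS_SIDE_EFFECTS) != 0;
}

int is_align_stack(unsigned extra) {
    return (extra & EXTRA_IS_ALIGN_STACK) != 0;
}
```

With the property stored per instruction, an inline asm without side effects no longer has to act as a barrier for memory instructions.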
      
      llvm-svn: 123044
      6eb516db
    • Bob Wilson's avatar
      Add an explanatory message for an assertion. · 3fa9c064
      Bob Wilson authored
      llvm-svn: 123042
      3fa9c064
  2. Jan 07, 2011