- Jan 08, 2011
-
-
Chris Lattner authored
P and P+1 are relative to the same base pointer. llvm-svn: 123087
-
Chris Lattner authored
memset into a single larger memset. llvm-svn: 123086
-
Chris Lattner authored
Split memset formation logic out into its own "tryMergingIntoMemset" helper function. llvm-svn: 123081
-
Chris Lattner authored
to be foldable into an uncond branch. When this happens, we can make a much simpler CFG for the loop, which is important for nested loop cases where we want the outer loop to be aggressively optimized. Handle this case more aggressively. For example, previously on phi-duplicate.ll we would get this: define void @test(i32 %N, double* %G) nounwind ssp { entry: %cmp1 = icmp slt i64 1, 1000 br i1 %cmp1, label %bb.nph, label %for.end bb.nph: ; preds = %entry br label %for.body for.body: ; preds = %bb.nph, %for.cond %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ] %arrayidx = getelementptr inbounds double* %G, i64 %j.02 %tmp3 = load double* %arrayidx %sub = sub i64 %j.02, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.02, 1 br label %for.cond for.cond: ; preds = %for.body %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge for.cond.for.end_crit_edge: ; preds = %for.cond br label %for.end for.end: ; preds = %for.cond.for.end_crit_edge, %entry ret void } Now we get the much nicer: define void @test(i32 %N, double* %G) nounwind ssp { entry: br label %for.body for.body: ; preds = %entry, %for.body %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ] %arrayidx = getelementptr inbounds double* %G, i64 %j.01 %tmp3 = load double* %arrayidx %sub = sub i64 %j.01, 1 %arrayidx6 = getelementptr inbounds double* %G, i64 %sub %tmp7 = load double* %arrayidx6 %add = fadd double %tmp3, %tmp7 %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01 store double %add, double* %arrayidx10 %inc = add nsw i64 %j.01, 1 %cmp = icmp slt i64 %inc, 1000 br i1 %cmp, label %for.body, label %for.end for.end: ; preds = %for.body ret void } With all of these recent changes, we are now able to compile: void foo(char *X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i*100] = 0; } into a single memset of 10000 bytes. This series of changes should also be helpful for other nested loop scenarios as well. llvm-svn: 123079
-
Chris Lattner authored
moving the OrigHeader block anymore: we just merge it away anyway so its code layout doesn't matter. llvm-svn: 123077
-
Chris Lattner authored
that it was leaving in loops after rotation (between the original latch block and the original header. With this change, it is possible for rotated loops to have just a single basic block, which is useful. llvm-svn: 123075
-
Chris Lattner authored
loop info. llvm-svn: 123074
-
Chris Lattner authored
llvm-svn: 123073
-
Chris Lattner authored
1. Rip out LoopRotate's domfrontier updating code. It isn't needed now that LICM doesn't use DF and it is super complex and gross. 2. Make DomTree updating code a lot simpler and faster. The old loop over all the blocks was just to find a block?? 3. Change the code that inserts the new preheader to just use SplitCriticalEdge instead of doing an overcomplex reimplementation of it. No behavior change, except for the name of the inserted preheader. llvm-svn: 123072
-
Chris Lattner authored
llvm-svn: 123071
-
Chris Lattner authored
and latch blocks. Reorder entry conditions to make hte pass faster and more logical. llvm-svn: 123069
-
Chris Lattner authored
llvm-svn: 123068
-
Chris Lattner authored
that are just passed to one function. llvm-svn: 123067
-
Chris Lattner authored
to violate LCSSA form llvm-svn: 123066
-
Cameron Zwarich authored
llvm-svn: 123065
-
Cameron Zwarich authored
they all ready do). This removes two dominator recomputations prior to isel, which is a 1% improvement in total llc time for 403.gcc. The only potentially suspect thing is making GCStrategy recompute dominators if it used a custom lowering strategy. llvm-svn: 123064
-
Cameron Zwarich authored
top of subloop headers, as the phi uses logically occur outside of the subloop. llvm-svn: 123062
-
Frits van Bommel authored
llvm-svn: 123061
-
Chris Lattner authored
them into the loop preheader, eliminating silly instructions like "icmp i32 0, 100" in fixed tripcount loops. This also better exposes the bigger problem with loop rotate that I'd like to fix: once this has been folded, the duplicated conditional branch *often* turns into an uncond branch. Not aggressively handling this is pessimizing later loop optimizations somethin' fierce by making "dominates all exit blocks" checks fail. llvm-svn: 123060
-
Chris Lattner authored
1. Take a flags argument instead of a bool. This makes it more clear to the reader what it is used for. 2. Add a flag that says that "remapping a value not in the map is ok". 3. Reimplement MapValue to share a bunch of code and be a lot more efficient. For lookup failures, don't drop null values into the map. 4. Using the new flag a bunch of code can vaporize in LinkModules and LoopUnswitch, kill it. No functionality change. llvm-svn: 123058
-
Chris Lattner authored
map from ValueMapper.h (giving us access to its utilities) and add a fastpath in the loop rotation code, avoiding expensive ssa updator manipulation for values with nothing to update. llvm-svn: 123057
-
- Jan 07, 2011
-
-
Tobias Grosser authored
X = sext x; x >s c ? X : C+1 --> X = sext x; X <s C+1 ? C+1 : X X = sext x; x <s c ? X : C-1 --> X = sext x; X >s C-1 ? C-1 : X X = zext x; x >u c ? X : C+1 --> X = zext x; X <u C+1 ? C+1 : X X = zext x; x <u c ? X : C-1 --> X = zext x; X >u C-1 ? C-1 : X X = sext x; x >u c ? X : C+1 --> X = sext x; X <u C+1 ? C+1 : X X = sext x; x <u c ? X : C-1 --> X = sext x; X >u C-1 ? C-1 : X Instead of calculating this with mixed types promote all to the larger type. This enables scalar evolution to analyze this expression. PR8866 llvm-svn: 123034
-
Tobias Grosser authored
llvm-svn: 123033
-
Benjamin Kramer authored
llvm-svn: 123030
-
Jay Foad authored
llvm-svn: 123025
-
- Jan 06, 2011
-
-
Benjamin Kramer authored
This happens when we take the (non-constant) length from a malloc. llvm-svn: 122961
-
Benjamin Kramer authored
InstCombine: If we call llvm.objectsize on a malloc call we can replace it with the size passed to malloc. llvm-svn: 122959
-
Benjamin Kramer authored
llvm-svn: 122958
-
Cameron Zwarich authored
OptimizeInst() so that they can be used on a worklist instruction. llvm-svn: 122945
-
Cameron Zwarich authored
llvm-svn: 122944
-
Cameron Zwarich authored
into a separate function, so that it can be called from a loop using a worklist rather than a loop traversing a whole basic block. llvm-svn: 122943
-
Jakob Stoklund Olesen authored
Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there. llvm-svn: 122940
-
Cameron Zwarich authored
worklist, the key will need to become std::pair<BasicBlock*, Value*>. llvm-svn: 122932
-
- Jan 05, 2011
-
-
Cameron Zwarich authored
llvm-svn: 122891
-
Cameron Zwarich authored
regressing code quality. llvm-svn: 122887
-
Cameron Zwarich authored
llvm-svn: 122876
-
Cameron Zwarich authored
step is to only process instructions in subloops if they have been modified by an earlier simplification. llvm-svn: 122869
-
Cameron Zwarich authored
skipping them, but it should probably use a worklist and only revisit those instructions in subloops that have actually changed. It should probably also use a worklist after the first iteration like instsimplify now does. Regardless, it's only 0.3% of opt -O2 time on 403.gcc if it replaces the instcombine placed in the middle of the loop passes. llvm-svn: 122868
-
- Jan 04, 2011
-
-
Owen Anderson authored
Don't bother value numbering instructions with void types in GVN. In theory this should allow us to insert fewer things into the value numbering maps, but any speedup is beneath the noise threshold on my machine on 403.gcc. llvm-svn: 122844
-
Owen Anderson authored
llvm-svn: 122828
-