- Jan 12, 2011
-
-
Chris Lattner authored
llvm-svn: 123302
-
Chris Lattner authored
of the bootstrap miscompare issue. llvm-svn: 123299
-
Chris Lattner authored
the source of the bootstrap problem. llvm-svn: 123298
-
- Jan 11, 2011
-
-
Jakob Stoklund Olesen authored
llvm-svn: 123288
-
Jakob Stoklund Olesen authored
DT->changeImmediateDominator() trivially ignores identity updates, so there is really no need for the uniquing provided by SmallPtrSet. I expect this to fix PR8954. llvm-svn: 123286
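The reasoning in the message above (identity updates are ignored, so duplicates in the worklist are harmless) can be illustrated with a toy model. This is a minimal sketch, not LLVM code: `ToyDomTree`, `realUpdates`, and `applyUpdates` are hypothetical names, and blocks are plain integers rather than `BasicBlock` pointers.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical toy dominator tree: maps a block id to its immediate dominator.
struct ToyDomTree {
  std::unordered_map<int, int> idom;
  int realUpdates = 0; // counts updates that actually changed an entry

  // Models the property described above: an identity update is a no-op.
  void changeImmediateDominator(int block, int newIdom) {
    auto it = idom.find(block);
    assert(it != idom.end() && "block must already be in the tree");
    if (it->second == newIdom)
      return; // nothing to do, so a repeated request costs almost nothing
    it->second = newIdom;
    ++realUpdates;
  }
};

// Applies a worklist that may contain duplicates, with no uniquing step.
inline void applyUpdates(ToyDomTree &dt,
                         const std::vector<std::pair<int, int>> &worklist) {
  for (const auto &update : worklist)
    dt.changeImmediateDominator(update.first, update.second);
}
```

Under these assumptions, a worklist that requests the same change several times ends in the same state as a deduplicated one, which is why a plain sequence container would be enough.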
-
Cameron Zwarich authored
once at the beginning of GVN instead of once per iteration. llvm-svn: 123278
-
Cameron Zwarich authored
llvm-svn: 123270
-
Owen Anderson authored
llvm-svn: 123248
-
Chris Lattner authored
llvm-svn: 123247
-
Frits van Bommel authored
Factor the actual simplification out of SimplifyIndirectBrOnSelect and into a new helper function so it can be reused in e.g. an upcoming SimplifySwitchOnSelect. No functional change. llvm-svn: 123234
-
Chris Lattner authored
actually reached in the testcase in PR8954, but it's safe and good practice. llvm-svn: 123224
-
Chris Lattner authored
is floating around in the ether. llvm-svn: 123223
-
Chris Lattner authored
phi nodes. It is called from MergeBlockIntoPredecessor which is called from GVN, which claims to preserve these. I'm skeptical that this is the actual problem behind PR8954, but this is a stab in the right direction. llvm-svn: 123222
-
Chris Lattner authored
llvm-svn: 123221
-
Chris Lattner authored
necessarily an uncond branch to the header. This fixes PR8955 (the assertion tripping). llvm-svn: 123219
-
Owen Anderson authored
Fix a random missed optimization by making InstCombine more aggressive when determining which bits are demanded by a comparison against a constant. llvm-svn: 123203
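The message above does not say which comparisons or which bits are affected, so the following is only one illustrative instance of "demanded bits" reasoning for a comparison against a constant, chosen as an assumption: for an unsigned `x < C`, the low bits of `x` that line up with the trailing zeros of `C` cannot change the result. The sketch checks that claim by brute force over 8-bit values; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Number of trailing zero bits in c. Precondition: c is non-zero.
inline unsigned trailingZeros(std::uint8_t c) {
  assert(c != 0 && "trailing zero count is undefined for 0 here");
  unsigned n = 0;
  while ((c & 1u) == 0) {
    c = static_cast<std::uint8_t>(c >> 1);
    ++n;
  }
  return n;
}

// Returns true if, for every 8-bit x, clearing the low trailingZeros(c) bits
// of x leaves the outcome of the unsigned comparison (x < c) unchanged.
inline bool lowBitsNotDemandedForULT(std::uint8_t c) {
  if (c == 0)
    return true; // x < 0 is false for every unsigned x, no bits matter
  const std::uint8_t mask =
      static_cast<std::uint8_t>(~((1u << trailingZeros(c)) - 1u));
  for (unsigned x = 0; x < 256; ++x) {
    const bool full = x < c;
    const bool masked = (x & mask) < c;
    if (full != masked)
      return false;
  }
  return true;
}
```

The check passes because `C = m * 2^k` implies `x < C` exactly when `floor(x / 2^k) < m`, so the low `k` bits of `x` never participate.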
-
- Jan 10, 2011
-
-
Chandler Carruth authored
intrinsics element dependencies. Reviewed by Nick. llvm-svn: 123161
-
Chris Lattner authored
llvm-svn: 123149
-
Chris Lattner authored
back to life. llvm-svn: 123146
-
Chris Lattner authored
llvm-svn: 123144
-
- Jan 09, 2011
-
-
Chris Lattner authored
without informing memdep. This could cause nondeterministic weirdness based on where instructions happen to get allocated, and will hopefully breathe some life into some broken testers. llvm-svn: 123124
-
Tobias Grosser authored
llvm-svn: 123121
-
Cameron Zwarich authored
llvm-svn: 123117
-
Chris Lattner authored
that have the bit set. llvm-svn: 123104
-
- Jan 08, 2011
-
-
Chris Lattner authored
updating memdep when fusing stores together. This fixes the crash optimizing the bullet benchmark. llvm-svn: 123091
-
Chris Lattner authored
llvm-svn: 123090
-
Chris Lattner authored
larger memsets. Among other things, this fixes rdar://8760394 and allows us to handle "Example 2" from http://blog.regehr.org/archives/320, compiling it into a single 4096-byte memset:

    _mad_synth_mute:                ## @mad_synth_mute
    ## BB#0:                        ## %entry
        pushq   %rax
        movl    $4096, %esi         ## imm = 0x1000
        callq   ___bzero
        popq    %rax
        ret

llvm-svn: 123089
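The general idea of folding neighbouring writes into one larger memset can be sketched as interval merging. This is a simplified illustration under stated assumptions, not the pass's implementation: it assumes every store writes the same byte value relative to the same base pointer, and `ByteRange` and `mergeAdjacentRanges` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one store: the byte range [start, end)
// it writes, measured from a single shared base pointer.
struct ByteRange {
  std::size_t start;
  std::size_t end;
};

// Merges ranges that touch or overlap. Each range left in the result could
// be written by one memset, assuming all stores use the same byte value.
inline std::vector<ByteRange> mergeAdjacentRanges(std::vector<ByteRange> ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const ByteRange &a, const ByteRange &b) {
              return a.start < b.start;
            });
  std::vector<ByteRange> merged;
  for (const ByteRange &r : ranges) {
    assert(r.start <= r.end && "a range must not be inverted");
    if (!merged.empty() && r.start <= merged.back().end)
      merged.back().end = std::max(merged.back().end, r.end); // extend
    else
      merged.push_back(r); // a gap: start a new candidate memset
  }
  return merged;
}
```

For instance, ranges covering bytes 0 to 8, 8 to 16, and 16 to 4096 collapse into a single 4096-byte range, while a store separated by a gap stays on its own.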
-
Chris Lattner authored
P and P+1 are relative to the same base pointer. llvm-svn: 123087
-
Chris Lattner authored
memset into a single larger memset. llvm-svn: 123086
-
Chris Lattner authored
Split memset formation logic out into its own "tryMergingIntoMemset" helper function. llvm-svn: 123081
-
Chris Lattner authored
to be foldable into an uncond branch. When this happens, we can make a much simpler CFG for the loop, which is important for nested loop cases where we want the outer loop to be aggressively optimized. Handle this case more aggressively. For example, previously on phi-duplicate.ll we would get this:

    define void @test(i32 %N, double* %G) nounwind ssp {
    entry:
      %cmp1 = icmp slt i64 1, 1000
      br i1 %cmp1, label %bb.nph, label %for.end

    bb.nph:  ; preds = %entry
      br label %for.body

    for.body:  ; preds = %bb.nph, %for.cond
      %j.02 = phi i64 [ 1, %bb.nph ], [ %inc, %for.cond ]
      %arrayidx = getelementptr inbounds double* %G, i64 %j.02
      %tmp3 = load double* %arrayidx
      %sub = sub i64 %j.02, 1
      %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
      %tmp7 = load double* %arrayidx6
      %add = fadd double %tmp3, %tmp7
      %arrayidx10 = getelementptr inbounds double* %G, i64 %j.02
      store double %add, double* %arrayidx10
      %inc = add nsw i64 %j.02, 1
      br label %for.cond

    for.cond:  ; preds = %for.body
      %cmp = icmp slt i64 %inc, 1000
      br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

    for.cond.for.end_crit_edge:  ; preds = %for.cond
      br label %for.end

    for.end:  ; preds = %for.cond.for.end_crit_edge, %entry
      ret void
    }

Now we get the much nicer:

    define void @test(i32 %N, double* %G) nounwind ssp {
    entry:
      br label %for.body

    for.body:  ; preds = %entry, %for.body
      %j.01 = phi i64 [ 1, %entry ], [ %inc, %for.body ]
      %arrayidx = getelementptr inbounds double* %G, i64 %j.01
      %tmp3 = load double* %arrayidx
      %sub = sub i64 %j.01, 1
      %arrayidx6 = getelementptr inbounds double* %G, i64 %sub
      %tmp7 = load double* %arrayidx6
      %add = fadd double %tmp3, %tmp7
      %arrayidx10 = getelementptr inbounds double* %G, i64 %j.01
      store double %add, double* %arrayidx10
      %inc = add nsw i64 %j.01, 1
      %cmp = icmp slt i64 %inc, 1000
      br i1 %cmp, label %for.body, label %for.end

    for.end:  ; preds = %for.body
      ret void
    }

With all of these recent changes, we are now able to compile:

    void foo(char *X) {
      for (int i = 0; i != 100; ++i)
        for (int j = 0; j != 100; ++j)
          X[j+i*100] = 0;
    }

into a single memset of 10000 bytes. This series of changes should also be helpful for other nested loop scenarios as well.

llvm-svn: 123079
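The equivalence behind that last transformation can be checked directly. A minimal sketch: `fooLoops` is the loop nest from the message, while `fooMemset` and `loopsMatchMemset` are hypothetical helpers added for the comparison. It shows only that the two forms write the same bytes, not how the optimizer proves it.

```cpp
#include <cstring>
#include <vector>

// The nested loop from the commit message, unchanged in behavior.
inline void fooLoops(char *X) {
  for (int i = 0; i != 100; ++i)
    for (int j = 0; j != 100; ++j)
      X[j + i * 100] = 0;
}

// The form the loop nest is equivalent to: one memset over 100 * 100 bytes.
inline void fooMemset(char *X) { std::memset(X, 0, 100 * 100); }

// Runs both versions on buffers pre-filled with a non-zero byte and
// reports whether they end up identical.
inline bool loopsMatchMemset() {
  std::vector<char> a(10000, 'x');
  std::vector<char> b(10000, 'x');
  fooLoops(a.data());
  fooMemset(b.data());
  return a == b;
}
```

The index `j + i * 100` visits every offset from 0 to 9999 exactly once, which is what makes the single contiguous write valid.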
-
Chris Lattner authored
moving the OrigHeader block anymore: we just merge it away anyway so its code layout doesn't matter. llvm-svn: 123077
-
Chris Lattner authored
that it was leaving in loops after rotation (between the original latch block and the original header). With this change, it is possible for rotated loops to have just a single basic block, which is useful. llvm-svn: 123075
-
Chris Lattner authored
loop info. llvm-svn: 123074
-
Chris Lattner authored
llvm-svn: 123073
-
Chris Lattner authored
1. Rip out LoopRotate's domfrontier updating code. It isn't needed now that LICM doesn't use DF and it is super complex and gross.
2. Make DomTree updating code a lot simpler and faster. The old loop over all the blocks was just to find a block??
3. Change the code that inserts the new preheader to just use SplitCriticalEdge instead of doing an overcomplex reimplementation of it.

No behavior change, except for the name of the inserted preheader. llvm-svn: 123072
-
Chris Lattner authored
llvm-svn: 123071
-
Chris Lattner authored
and latch blocks. Reorder entry conditions to make the pass faster and more logical. llvm-svn: 123069
-
Chris Lattner authored
llvm-svn: 123068
-
Chris Lattner authored
that are just passed to one function. llvm-svn: 123067
-