- Apr 18, 2004
-
Chris Lattner authored
llvm-svn: 13036
-
Chris Lattner authored
block. The primary motivation for doing this is that we can now unroll nested loops. This makes a pretty big difference in some cases. For example, in 183.equake, we are now beating the native compiler with the CBE, and we are a lot closer with LLC. I'm now going to play around a bit with the unroll factor and see what effect it really has. llvm-svn: 13034
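To make the nesting point concrete, here is a hand-worked sketch (mine, not output from the pass; the function is hypothetical) of why merging the unrolled body into the parent basic block enables unrolling the outer loop too:

    // Hypothetical input: the inner loop is a single basic block that
    // executes a constant 3 times.
    int sum(int A[4][3]) {
      int S = 0;
      for (int i = 0; i != 4; ++i)
        for (int j = 0; j != 3; ++j)
          S += A[i][j];
      return S;
    }

    // Once the inner loop is fully unrolled and its body merged into the
    // enclosing block, the outer loop is itself a single-basic-block loop
    // with a constant trip count, so it can be unrolled in turn.
    int sum_after_inner_unroll(int A[4][3]) {
      int S = 0;
      for (int i = 0; i != 4; ++i) {
        S += A[i][0];
        S += A[i][1];
        S += A[i][2];
      }
      return S;
    }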
-
Chris Lattner authored
While we're at it, add support for updating loop information correctly. llvm-svn: 13033
-
Chris Lattner authored
have a canonical indvar llvm-svn: 13032
-
Chris Lattner authored
limited. Even in its extremely simple state (it can only *fully* unroll single basic block loops that execute a constant number of times), it already helps improve performance a LOT on some benchmarks, particularly with the native code generators. llvm-svn: 13028
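For a concrete (hypothetical) picture of what fully unrolling a single basic block loop with a constant trip count means, a sketch:

    // Before: one-block loop body, trip count known to be 4.
    int scale(int X) {
      int R = X;
      for (int i = 0; i != 4; ++i)
        R = R * 3 + 1;
      return R;
    }

    // After full unrolling: the loop, its induction variable, and its
    // branch are gone, and each copy of the body is now exposed to
    // constant folding and the other scalar optimizations.
    int scale_unrolled(int X) {
      int R = X;
      R = R * 3 + 1;
      R = R * 3 + 1;
      R = R * 3 + 1;
      R = R * 3 + 1;
      return R;
    }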
-
Chris Lattner authored
llvm-svn: 13025
-
Chris Lattner authored
llvm-svn: 13023
-
Chris Lattner authored
operations. This allows us to compile this testcase:

    int main() {
      int h = 1;
      do h = 3 * h + 1; while (h <= 256);
      printf("%d\n", h);
      return 0;
    }

into this:

    int %main() {
    entry:
            call void %__main( )
            %tmp.6 = call int (sbyte*, ...)* %printf( sbyte* getelementptr ([4 x sbyte]* %.str_1, long 0, long 0), int 364 )            ; <int> [#uses=0]
            ret int 0
    }

This testcase was taken directly from 256.bzip2, believe it or not. This code is not as general as I would like. Next up is to refactor it a bit to handle more cases.
llvm-svn: 13019
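(Checking the folded constant by hand: starting from h = 1, the recurrence h = 3*h + 1 yields 4, 13, 40, 121, 364; 364 is the first value greater than 256, so the loop exits after five iterations and the printf argument is exactly the 364 shown above.)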
-
- Apr 17, 2004
-
Chris Lattner authored
exit values. llvm-svn: 13018
-
Chris Lattner authored
even if the loop is using expressions that we can't compute as a closed-form. This allows us to calculate that this function always returns 55:

    int test() {
      double X;
      int Count = 0;
      for (X = 100; X > 1; X = sqrt(X), ++Count)
        /*empty*/;
      return Count;
    }

And allows us to compute trip counts for loops like:

    int h = 1;
    do h = 3 * h + 1; while (h <= 256);

(which occurs in bzip2), and for this function, which occurs after inlining and other optimizations:

    int popcount() {
      int x = 666;
      int result = 0;
      while (x != 0) {
        result = result + (x & 0x1);
        x = x >> 1;
      }
      return result;
    }

We still cannot compute the exit values of result or h in the two loops above, which means we cannot delete the loop, but we are getting closer. Being able to compute a constant trip count for these two loops will allow us to unroll them completely though.
llvm-svn: 13017
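The brute-force idea is simple enough to sketch outside the compiler (this is an illustration, not the actual LLVM code; the iteration cap is an assumption): start from the known constant initial values, fold the loop's updates one iteration at a time, and count iterations until the exit test folds to a constant true.

    #include <cstdio>

    // Brute-force trip-count evaluation for:
    //   int h = 1; do h = 3 * h + 1; while (h <= 256);
    int main() {
      long h = 1;                            // known constant initial value
      unsigned TripCount = 0;
      const unsigned MaxIterations = 10000;  // assumed give-up threshold
      do {
        h = 3 * h + 1;                       // fold the body on constants
        ++TripCount;
      } while (h <= 256 && TripCount < MaxIterations);
      std::printf("trip count = %u, exit value of h = %ld\n", TripCount, h);
      // prints: trip count = 5, exit value of h = 364
      return 0;
    }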
-
Chris Lattner authored
(familiar) function:

    int _strlen(const char *str) {
      int len = 0;
      while (*str++) len++;
      return len;
    }

And transforming it to use a ulong induction variable, because the type of the pointer index was left as a constant long. This is obviously very bad. The fix is to shrink long constants in getelementptr instructions to intptr_t, making the indvars pass insert a uint induction variable, which is much more efficient. Here's the before code for this function:

    int %_strlen(sbyte* %str) {
    entry:
            %tmp.13 = load sbyte* %str              ; <sbyte> [#uses=1]
            %tmp.24 = seteq sbyte %tmp.13, 0                ; <bool> [#uses=1]
            br bool %tmp.24, label %loopexit, label %no_exit
    no_exit:                ; preds = %entry, %no_exit
    ***     %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ]           ; <uint> [#uses=2]
    ***     %indvar = phi ulong [ %indvar.next, %no_exit ], [ 0, %entry ]          ; <ulong> [#uses=2]
            %indvar1 = cast ulong %indvar to uint           ; <uint> [#uses=1]
            %inc.02.sum = add uint %indvar1, 1              ; <uint> [#uses=1]
            %inc.0.0 = getelementptr sbyte* %str, uint %inc.02.sum          ; <sbyte*> [#uses=1]
            %tmp.1 = load sbyte* %inc.0.0           ; <sbyte> [#uses=1]
            %tmp.2 = seteq sbyte %tmp.1, 0          ; <bool> [#uses=1]
            %indvar.next = add ulong %indvar, 1             ; <ulong> [#uses=1]
            %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
            br bool %tmp.2, label %loopexit.loopexit, label %no_exit
    loopexit.loopexit:              ; preds = %no_exit
            %indvar = cast uint %indvar to int              ; <int> [#uses=1]
            %inc.1 = add int %indvar, 1             ; <int> [#uses=1]
            ret int %inc.1
    loopexit:               ; preds = %entry
            ret int 0
    }

Here's the after code:

    int %_strlen(sbyte* %str) {
    entry:
            %inc.02 = getelementptr sbyte* %str, uint 1             ; <sbyte*> [#uses=1]
            %tmp.13 = load sbyte* %str              ; <sbyte> [#uses=1]
            %tmp.24 = seteq sbyte %tmp.13, 0                ; <bool> [#uses=1]
            br bool %tmp.24, label %loopexit, label %no_exit
    no_exit:                ; preds = %entry, %no_exit
    ***     %indvar = phi uint [ %indvar.next, %no_exit ], [ 0, %entry ]           ; <uint> [#uses=3]
            %indvar = cast uint %indvar to int              ; <int> [#uses=1]
            %inc.0.0 = getelementptr sbyte* %inc.02, uint %indvar           ; <sbyte*> [#uses=1]
            %inc.1 = add int %indvar, 1             ; <int> [#uses=1]
            %tmp.1 = load sbyte* %inc.0.0           ; <sbyte> [#uses=1]
            %tmp.2 = seteq sbyte %tmp.1, 0          ; <bool> [#uses=1]
            %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
            br bool %tmp.2, label %loopexit, label %no_exit
    loopexit:               ; preds = %entry, %no_exit
            %len.0.1 = phi int [ 0, %entry ], [ %inc.1, %no_exit ]          ; <int> [#uses=1]
            ret int %len.0.1
    }
llvm-svn: 13016
-
Chris Lattner authored
the trip count for the loop, insert one so that we can canonicalize the exit condition. llvm-svn: 13015
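A sketch of what inserting a canonical induction variable buys (hypothetical loop and names, not from the commit): once a 0, 1, 2, ... counter exists alongside the loop's own induction variable, the exit test can be rewritten as a comparison of that counter against the trip count.

    // Before: the only induction variable halves on every iteration,
    // which makes the exit condition hard to canonicalize.
    void before(int *A, int N) {
      for (int P = N; P > 1; P /= 2)
        ++A[P];
    }

    // After inserting a canonical induction variable IV (assuming the
    // trip count is computable by the pass):
    void after(int *A, int N, int TripCount) {
      int P = N;
      for (int IV = 0; IV != TripCount; ++IV) {
        ++A[P];
        P /= 2;
      }
    }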
-
Chris Lattner authored
llvm-svn: 13011
-
- Apr 16, 2004
-
Chris Lattner authored
make the verifier more strict. This fixes building zlib. llvm-svn: 13002
-
Misha Brukman authored
llvm-svn: 13001
-
Brian Gaeke authored
Debian.) llvm-svn: 12986
-
Misha Brukman authored
llvm-svn: 12983
-
Chris Lattner authored
llvm-svn: 12980
-
Chris Lattner authored
that does not dominate all of its users, but is in the same basic block as its users. This class of error is what caused the mysterious CBE-only failures last night. llvm-svn: 12979
-
Chris Lattner authored
that didn't exist, missing the ones that do :( llvm-svn: 12978
-
Chris Lattner authored
Basically we were using SimplifyCFG as a huge sledgehammer for a simple optimization. Because SimplifyCFG does so many things, we can't use it for this purpose. llvm-svn: 12977
-
- Apr 15, 2004
-
Chris Lattner authored
the back-edge block, we must check the preincremented value. llvm-svn: 12968
-
Brian Gaeke authored
llvm-svn: 12967
-
Chris Lattner authored
Instead of producing code like this:

    Loop:
            X = phi 0, X2
            ...
            X2 = X + 1
            if (X != N-1) goto Loop

We now generate code that looks like this:

    Loop:
            X = phi 0, X2
            ...
            X2 = X + 1
            if (X2 != N) goto Loop

This has two big advantages:
  1. The trip count of the loop is now explicit in the code, allowing the direct implementation of Loop::getTripCount()
  2. This reduces register pressure in the loop, and allows X and X2 to be put into the same register.

As a consequence of the second point, the code we generate for loops went from:

    .LBB2:  # no_exit.1
            ...
            mov %EDI, %ESI
            inc %EDI
            cmp %ESI, 2
            mov %ESI, %EDI
            jne .LBB2 # PC rel: no_exit.1

To:

    .LBB2:  # no_exit.1
            ...
            inc %ESI
            cmp %ESI, 3
            jne .LBB2 # PC rel: no_exit.1

...which has two fewer moves, and uses one less register.
llvm-svn: 12961
-
Chris Lattner authored
llvm-svn: 12960
-
Chris Lattner authored
llvm-svn: 12958
-
Chris Lattner authored
llvm-svn: 12956
-
Chris Lattner authored
insert it once! llvm-svn: 12955
-
- Apr 14, 2004
-
John Criswell authored
The iterator is pointing at the next instruction, which should not disappear when doing the load/store replacement. llvm-svn: 12954
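The pattern at issue can be sketched with a std::list standing in for a basic block's instruction list (types and names here are hypothetical, not the actual LLVM code): the iterator is advanced past the current instruction before the replacement runs, so the replacement must not delete that next instruction.

    #include <list>
    #include <string>

    // Walk the "instructions", replacing loads and stores. The erase is
    // safe only because I was advanced first and now points at the *next*
    // instruction -- which is exactly the instruction that must not
    // disappear while the replacement is performed.
    void replaceLoadsAndStores(std::list<std::string> &Insts) {
      for (auto I = Insts.begin(); I != Insts.end(); ) {
        auto Cur = I++;                    // advance before mutating
        if (*Cur == "load" || *Cur == "store")
          Insts.erase(Cur);                // stand-in for the replacement
      }
    }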
-
Brian Gaeke authored
llvm-svn: 12953
-
Chris Lattner authored
at the bottom of the loop instead of the top. This reduces the number of overlapping live ranges a lot. For example, it eliminates a spill in an important loop in 183.equake with linear scan. I still need to make the exit comparison of the loop use the post-incremented version of this variable, but this is an easy first step. llvm-svn: 12952
-
Brian Gaeke authored
This should unbreak the Sparc JIT again. llvm-svn: 12949
-
John Criswell authored
functions and is not needed here. Simplify the pointer type check per Chris's suggestions. llvm-svn: 12945
-
John Criswell authored
that matches its return type. llvm-svn: 12944
-
John Criswell authored
Sorry these didn't get in yesterday. llvm-svn: 12942
-
Chris Lattner authored
llvm-svn: 12940
-
Chris Lattner authored
even when the "optimization" I added before is turned off. It generates this extremely pointless code:

    test:
            fld QWORD PTR [%ESP + 4]
            mov %AL, 0
            test %AL, %AL
            fcmove %ST(0), %ST(0)
            ret

Good thing the optimizer will have removed this before code generation anyway. :)
llvm-svn: 12939
-
John Criswell authored
On x86, memory operations occur in-order, so these are just lowered into volatile loads and stores. llvm-svn: 12936
-
- Apr 13, 2004
-
Chris Lattner authored
X86/2004-04-13-FPCMOV-Crash.llx. A more robust fix is to follow. llvm-svn: 12935
-
Chris Lattner authored
test/Regression/Transforms/SCCP/calltest.ll llvm-svn: 12921
-