- Jan 02, 2011
-
-
Chris Lattner authored
header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for loops" into 2 basic block loops that loop-idiom was ignoring.

With this fix, we form many *many* more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this:

    for (j=0; j<MAX_history; ++j) {
      history_new[i][j+1] = history[2*i][j];
    }

Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. llvm-svn: 122685
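As a rough illustration of why such a loop can be rewritten, the sketch below places an element-by-element copy shaped like the "history" loop next to the single memcpy it is equivalent to. The table sizes, the `unsigned char` element type, and the function names are assumptions made for the example; they are not taken from the benchmark or from the pass itself.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Hypothetical sizes and element type, chosen only for illustration.
constexpr std::size_t kMaxHistory = 8;
constexpr std::size_t kStates = 4;

using SrcTable = std::array<std::array<unsigned char, kMaxHistory>, 2 * kStates>;
using DstTable = std::array<std::array<unsigned char, kMaxHistory + 1>, kStates>;

// Element-by-element copy, shaped like the loop quoted in the commit message.
void copy_row_loop(DstTable &history_new, const SrcTable &history, std::size_t i) {
  for (std::size_t j = 0; j < kMaxHistory; ++j)
    history_new[i][j + 1] = history[2 * i][j];
}

// The same copy expressed as one memcpy. This is valid here because the
// source and destination are distinct objects, so the ranges cannot overlap.
void copy_row_memcpy(DstTable &history_new, const SrcTable &history, std::size_t i) {
  std::memcpy(history_new[i].data() + 1, history[2 * i].data(),
              kMaxHistory * sizeof(unsigned char));
}
```

Both functions leave `history_new[i][0]` untouched and fill the remaining `kMaxHistory` elements of row `i` from row `2*i` of the source.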
-
Cameron Zwarich authored
compile, and everyone's tests have shown it to be slower in practice, even for quite large graphs. I also hope to do an optimization that is only correct with the simpler data structure, which would break this even further. llvm-svn: 122684
-
Chris Lattner authored
llvm-svn: 122683
-
Chris Lattner authored
llvm-svn: 122682
-
Chris Lattner authored
size of a loop header instead of its own code size estimator. This allows it to handle bitcasts etc more precisely. llvm-svn: 122681
-
Cameron Zwarich authored
naively implemented, the Lengauer-Tarjan algorithm requires a separate bucket for each vertex. However, this is unnecessary, because each vertex is only placed into a single bucket (that of its semidominator), and each vertex's bucket is processed before it is added to any bucket itself.

Instead of using a bucket per vertex, we use a single array Buckets that has two purposes. Before the vertex V with DFS number i is processed, Buckets[i] stores the index of the first element in V's bucket. After V's bucket is processed, Buckets[i] stores the index of the next element in the bucket to which V now belongs, if any.

Reading from the buckets can also be optimized. Instead of processing the bucket of V's parent at the end of processing V, we process the bucket of V itself at the beginning of processing V. This means that the case of the root vertex can be simplified somewhat. It also means that we don't need to look up the DFS number of the semidominator of every node in the bucket we are processing, since we know it is the current index being processed.

This is a 6.5% speedup running -domtree on test-suite + SPEC2000/2006, with larger speedups of around 12% on the larger benchmarks like GCC. llvm-svn: 122680
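The dual use of the Buckets array can be sketched in isolation. The code below is not the dominator tree implementation: it assumes the semidominator of every vertex is already known (the real algorithm computes it inside the same loop) and only shows how a single array serves first as a bucket head and later as a next link. The function name `visit_buckets`, the `semi` input, and the use of 0 as the "no element" marker are all assumptions for this example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Vertices are identified by DFS number 1..n; index 0 is unused and the
// value 0 means "no element". semi[i] is the DFS number of i's
// semidominator, assumed precomputed, with semi[i] < i for every i > 1.
// Returns the (bucket owner, bucket member) pairs in the order visited.
std::vector<std::pair<unsigned, unsigned>>
visit_buckets(const std::vector<unsigned> &semi) {
  std::vector<std::pair<unsigned, unsigned>> visited;
  if (semi.size() < 2)
    return visited;
  const unsigned n = static_cast<unsigned>(semi.size()) - 1;
  std::vector<unsigned> buckets(n + 1, 0);

  for (unsigned i = n; i >= 1; --i) {
    // Process i's own bucket first. buckets[i] is the head of the bucket,
    // and each member's slot holds the next member of the same bucket.
    for (unsigned v = buckets[i]; v != 0; v = buckets[v])
      visited.emplace_back(i, v);

    // i's bucket is finished, so slot i can be reused as a link. Push i
    // onto the front of its semidominator's bucket. The root (i == 1)
    // has no semidominator and is never added to a bucket.
    if (i > 1) {
      buckets[i] = buckets[semi[i]];
      buckets[semi[i]] = i;
    }
  }
  return visited;
}
```

Because `semi[i] < i` and vertices are handled in decreasing DFS order, a slot is only turned into a link after its own bucket has been read, which is the property the commit message relies on.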
-
Rafael Espindola authored
statements using the "x" constraint. llvm-svn: 122679
-
Chris Lattner authored
llvm-svn: 122678
-
Nick Lewycky authored
maintains the guarantee that the DenseSet expects two elements it contains to not go from unequal to equal under its nose. As a side-effect, this also lets us switch from iterating to a fixed point to actually maintaining a work queue of functions to look at again, and we don't add thunks to our work queue so we don't need to detect and ignore them. llvm-svn: 122677
-
- Jan 01, 2011
-
-
Chris Lattner authored
llvm-svn: 122676
-
Chris Lattner authored
llvm-svn: 122675
-
Chris Lattner authored
loop idiom pass exposed. llvm-svn: 122674
-
Rafael Espindola authored
llvm-svn: 122672
-
Benjamin Kramer authored
llvm-svn: 122671
-
Rafael Espindola authored
llvm-svn: 122670
-
Rafael Espindola authored
llvm-svn: 122669
-
Benjamin Kramer authored
llvm-svn: 122668
-
Rafael Espindola authored
llvm-svn: 122667
-
Anton Korobeynikov authored
llvm-svn: 122666
-
Nick Lewycky authored
llvm-svn: 122665
-
Chris Lattner authored
limitations, this kicks in dozens of times in the 4 specfp2000 benchmarks, and hundreds of times in the int part. It also kicks in hundreds of times in multisource. This kicks in right before loop deletion, which has the pleasant effect of deleting loops that *just* do a memset. llvm-svn: 122664
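For readers unfamiliar with the idiom, a loop that "just does a memset" looks roughly like the first function below, and the second function is what it amounts to. This is a hand-written sketch of the equivalence, not output from the pass; the function names and the byte-sized element type are assumptions.

```cpp
#include <cstddef>
#include <cstring>

// A loop whose only effect is storing the same byte to consecutive addresses.
void fill_loop(unsigned char *p, std::size_t n, unsigned char value) {
  for (std::size_t i = 0; i < n; ++i)
    p[i] = value;
}

// The equivalent single library call. Once the stores are replaced by this
// call, nothing is left in the loop body, so the loop itself can be deleted.
void fill_memset(unsigned char *p, std::size_t n, unsigned char value) {
  std::memset(p, value, n);
}
```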
-
Anton Korobeynikov authored
earlyclobber stuff. This should fix PRs 2313 and 8157. Unfortunately, no testcase, since it'd be dependent on register assignments. llvm-svn: 122663
-
Chris Lattner authored
new testcase. llvm-svn: 122662
-
Duncan Sands authored
is the wrong hammer for this nail, and is probably right. llvm-svn: 122661
-
Chris Lattner authored
aggressively. In practice, this doesn't help anything though, see the todo. llvm-svn: 122660
-
Chris Lattner authored
should be correct now. llvm-svn: 122659
-
Rafael Espindola authored
llvm-svn: 122658
-
Duncan Sands authored
even compile, let alone work. llvm-svn: 122657
-
Duncan Sands authored
using a separate objects directory. llvm-svn: 122656
-
Duncan Sands authored
not locally. llvm-svn: 122655
-
Duncan Sands authored
numbering, in which it considers (for example) "%a = add i32 %x, %y" and "%b = add i32 %x, %y" to be equal because the operands are equal and the result of the instructions only depends on the values of the operands. This has almost no effect (it removes 4 instructions from gcc-as-one-file), and perhaps slows down compilation: I measured a 0.4% slowdown on the large gcc-as-one-file testcase, but it wasn't statistically significant. llvm-svn: 122654
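The idea of treating two instructions as equal when their opcode and operands are equal can be sketched with a small table that hands out value numbers. This is a simplified illustration under assumed names (`ValueTable`, `lookup_leaf`, `lookup_expr`), not the code from the commit; it handles only binary expressions and does not account for commutativity.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical value-numbering table for illustration only.
class ValueTable {
 public:
  // Number for a leaf value such as a function argument, keyed by name.
  unsigned lookup_leaf(const std::string &name) {
    auto it = leaves_.find(name);
    if (it != leaves_.end())
      return it->second;
    leaves_[name] = next_;
    return next_++;
  }

  // Number for a binary expression. Two expressions with the same opcode
  // and the same operand numbers receive the same value number.
  unsigned lookup_expr(const std::string &opcode, unsigned lhs, unsigned rhs) {
    auto key = std::make_tuple(opcode, lhs, rhs);
    auto it = exprs_.find(key);
    if (it != exprs_.end())
      return it->second;
    exprs_[key] = next_;
    return next_++;
  }

 private:
  unsigned next_ = 0;
  std::map<std::string, unsigned> leaves_;
  std::map<std::tuple<std::string, unsigned, unsigned>, unsigned> exprs_;
};
```

With this table, `%a = add i32 %x, %y` and `%b = add i32 %x, %y` map to the same number, because both look up the key `("add", number of %x, number of %y)`.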
-
Che-Liang Chiou authored
llvm-svn: 122653
-
Che-Liang Chiou authored
llvm-svn: 122652
-
Erick Tryzelaar authored
llvm-svn: 122651
-
Erick Tryzelaar authored
llvm-svn: 122650
-
- Dec 31, 2010
-
-
Oscar Fuentes authored
is necessary for executing the custom command that runs the assembler. Fixes PR8877. llvm-svn: 122649
-
Oscar Fuentes authored
options. If we are building with exceptions/rtti disabled, we replace /EHsc with /EHs-c- and /GR with /GR-, respectively. If we just add the disabling options we get warnings like this: cl : Command line warning D9025 : overriding '/EHs' with '/EHs-' llvm-svn: 122648
-
Duncan Sands authored
operands are visited before the instructions themselves. llvm-svn: 122647
-
Nick Lewycky authored
open them in fundamental-mode instead of c++-mode. Also twiddle whitespace for consistency in ToolChains.cpp. llvm-svn: 122646
-
Duncan Sands authored
llvm-svn: 122645
-