- Feb 15, 2011
-
-
Duncan Sands authored
llvm-svn: 125563
-
- Jan 04, 2011
-
-
Chris Lattner authored
when safe. The testcase is basically this nested loop: void foo(char *X) { for (int i = 0; i != 100; ++i) for (int j = 0; j != 100; ++j) X[j+i*100] = 0; } which gets turned into a single memset now. clang -O3 doesn't optimize this yet though due to a phase ordering issue I haven't analyzed yet. llvm-svn: 122806
-
Chris Lattner authored
instruction *after* the store. The store will always be deleted if the transformation kicks in, so we'd do an N^2 scan of every loop block. Whoops. llvm-svn: 122805
-
Chris Lattner authored
stop setting NSW: signed overflow is possible. Thanks to Dan for pointing these out. llvm-svn: 122790
-
Owen Anderson authored
llvm-svn: 122788
-
- Jan 03, 2011
-
-
Chris Lattner authored
llvm-svn: 122720
-
Chris Lattner authored
Teach it to CSE the rest of the non-side-effecting instructions. llvm-svn: 122716
-
- Jan 02, 2011
-
-
Chris Lattner authored
sure that the loop we're promoting into a memcpy doesn't mutate the input of the memcpy. Before we were just checking that the dest of the memcpy wasn't mod/ref'd by the loop. llvm-svn: 122712
-
Chris Lattner authored
mess with it. We'd rather peel/unroll it than convert all of its stores into memsets. llvm-svn: 122711
-
Chris Lattner authored
blocks in a loop, instead of just the header block. This makes it more aggressive, able to handle Duncan's Ada examples. llvm-svn: 122704
-
Chris Lattner authored
llvm-svn: 122701
-
Chris Lattner authored
header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many *many* more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this: for (j=0; j<MAX_history; ++j) { history_new[i][j+1] = history[2*i][j]; } Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. llvm-svn: 122685
-
Chris Lattner authored
llvm-svn: 122683
-
Chris Lattner authored
llvm-svn: 122682
-
Chris Lattner authored
llvm-svn: 122678
-
- Jan 01, 2011
-
-
Chris Lattner authored
new testcase. llvm-svn: 122662
-
Chris Lattner authored
aggressively. In practice, this doesn't help anything though, see the todo. llvm-svn: 122660
-
Chris Lattner authored
should be correct now. llvm-svn: 122659
-
- Dec 28, 2010
-
-
Chris Lattner authored
check for "multiple of a byte" in size to make it clear that the >> 3 below is safe. llvm-svn: 122604
-
Duncan Sands authored
llvm-svn: 122593
-
- Dec 27, 2010
-
-
Chris Lattner authored
llvm-svn: 122585
-
Chris Lattner authored
llvm-svn: 122574
-
Chris Lattner authored
memsets. This is still missing one important validity check, but this is enough to compile stuff like this: void test0(std::vector<char> &X) { for (std::vector<char>::iterator I = X.begin(), E = X.end(); I != E; ++I) *I = 0; } void test1(std::vector<int> &X) { for (long i = 0, e = X.size(); i != e; ++i) X[i] = 0x01010101; } With: $ clang t.cpp -S -o - -O2 -emit-llvm | opt -loop-idiom | opt -O3 | llc to: __Z5test0RSt6vectorIcSaIcEE: ## @_Z5test0RSt6vectorIcSaIcEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rsi cmpq %rsi, %rax je LBB0_2 ## BB#1: ## %bb.nph subq %rax, %rsi movq %rax, %rdi callq ___bzero LBB0_2: ## %for.end addq $8, %rsp ret ... __Z5test1RSt6vectorIiSaIiEE: ## @_Z5test1RSt6vectorIiSaIiEE ## BB#0: ## %entry subq $8, %rsp movq (%rdi), %rax movq 8(%rdi), %rdx subq %rax, %rdx cmpq $4, %rdx jb LBB1_2 ## BB#1: ## %for.body.preheader andq $-4, %rdx movl $1, %esi movq %rax, %rdi callq _memset LBB1_2: ## %for.end addq $8, %rsp ret llvm-svn: 122573
-
- Dec 26, 2010
-
-
Chris Lattner authored
llvm-svn: 122567
-
Chris Lattner authored
llvm-svn: 122563
-