- Jan 07, 2011
-
Evan Cheng authored
Revert r122955. It seems using movups to lower memcpy can cause massive regression (even on Nehalem) in edge cases. I also didn't see any real performance benefit. llvm-svn: 123015
-
Bob Wilson authored
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle vectors from being translated to EXTRACT_SUBVECTOR. Patch by Tim Northover. The test changes are needed to keep those spill-q tests from testing aligned spills and restores. If the only aligned stack objects are spill slots, we no longer realign the stack frame. Prior to this patch, an EXTRACT_SUBVECTOR was legalized by loading from the stack, which created an aligned frame index. Now, however, there is nothing except the spill slot in the stack frame, so I added an aligned alloca. llvm-svn: 122995
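For context, a minimal sketch (not from the commit, using Clang's vector extensions; type and function names are assumptions) of the kind of source-level shuffle that SelectionDAGBuilder can now recognize as an EXTRACT_SUBVECTOR instead of legalizing through the stack:

```c
/* Illustrative only: a shuffle that keeps the low lanes of a wider
 * vector is a subvector extract. */
typedef float v4f32 __attribute__((vector_size(16)));
typedef float v2f32 __attribute__((vector_size(8)));

v2f32 low_half(v4f32 v) {
    /* Clang's __builtin_shufflevector: indices 0,1 select the low half. */
    return __builtin_shufflevector(v, v, 0, 1);
}
```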
-
- Jan 06, 2011
-
Chris Lattner authored
llvm-svn: 122978
-
Bob Wilson authored
llvm-svn: 122970
-
Bob Wilson authored
llvm-svn: 122969
-
Bob Wilson authored
llvm-svn: 122968
-
Benjamin Kramer authored
llvm-svn: 122966
-
Rafael Espindola authored
Patch by Richard Smith. llvm-svn: 122962
-
Benjamin Kramer authored
llvm-svn: 122960
-
Benjamin Kramer authored
InstCombine: If we call llvm.objectsize on a malloc call we can replace it with the size passed to malloc. llvm-svn: 122959
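A hedged C-level sketch of the fold (the function name is illustrative; `__builtin_object_size` is the usual route to `llvm.objectsize`):

```c
#include <stdlib.h>

size_t known_size(void) {
    char *p = malloc(32);
    /* With this change, InstCombine can fold the objectsize query to the
     * constant argument passed to malloc, i.e. 32. */
    size_t n = __builtin_object_size(p, 0);
    free(p);
    return n;
}
```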
-
Benjamin Kramer authored
llvm-svn: 122957
-
Evan Cheng authored
The theory is it's still faster than a pair of movq / a quad of movl. This will probably hurt older chips like P4 but should run faster on current and future Intel processors. rdar://8817010 llvm-svn: 122955
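As a rough illustration (not from the commit), a fixed 16-byte copy is the kind of memcpy this lowering targets: a single unaligned 16-byte SSE move pair instead of two 8-byte moves (or four 4-byte moves in 32-bit mode):

```c
#include <string.h>

/* Illustrative only: a known 16-byte copy that the x86 backend may now
 * lower with a movups load/store. */
void copy16(void *dst, const void *src) {
    memcpy(dst, src, 16);
}
```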
-
Chris Lattner authored
llvm-svn: 122954
-
Chris Lattner authored
llvm-svn: 122953
-
Evan Cheng authored
etc. takes an option OptSize. If OptSize is true, it would return the inline limit for functions with attribute OptSize. llvm-svn: 122952
-
Bill Wendling authored
works only on MinGW32. On 64-bit, the function to call is "__chkstk". Patch by KS Sreeram! llvm-svn: 122934
-
Bill Wendling authored
beginning of the "main" function. The assembler complains about the invalid suffix for the 'call' instruction. The right instruction is "callq __main". Patch by KS Sreeram! llvm-svn: 122933
-
- Jan 05, 2011
-
Chris Lattner authored
llvm-svn: 122921
-
Chris Lattner authored
llvm-svn: 122920
-
Chris Lattner authored
llvm-svn: 122893
-
Wesley Peck authored
Commit 122778 broke DWARF debug output when using the MBlaze backend. Fixed by overriding TargetFrameInfo::getFrameIndexOffset to take into account the new frame index information. llvm-svn: 122889
-
- Jan 04, 2011
-
Jakob Stoklund Olesen authored
bundles in the pass. llvm-svn: 122833
-
Jakob Stoklund Olesen authored
The analysis will be needed by both the greedy register allocator and the X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't change. This pass is very fast, usually showing up as 0.0% wall time. llvm-svn: 122832
-
Dale Johannesen authored
warning is overzealous but gcc is what it is.) llvm-svn: 122829
-
Andrew Trick authored
llvm-svn: 122794
-
Bill Wendling authored
llvm-svn: 122789
-
- Jan 03, 2011
-
Evan Cheng authored
prologue and epilogue if the adjustment is 8. Similarly, use pushl / popl if the adjustment is 4 in 32-bit mode. In the epilogue, takes care to pop to a caller-saved register that's not live at the exit (either return or tailcall instruction). rdar://8771137 llvm-svn: 122783
-
Wesley Peck authored
llvm-svn: 122778
-
- Jan 02, 2011
-
Benjamin Kramer authored
This allows us to compile: void test(char *s, int a) { __builtin_memset(s, a, 15); } into 1 mul + 3 stores instead of 3 muls + 3 stores. llvm-svn: 122710
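A rough C-level sketch of the idea (not the actual generated code): the byte value is splatted across a word with one multiply, and that value is then reused by every store, here covering 15 bytes with three overlapping stores:

```c
#include <stdint.h>
#include <string.h>

void memset15(char *s, int a) {
    uint64_t splat = (uint8_t)a * UINT64_C(0x0101010101010101);
    uint32_t splat32 = (uint32_t)splat;
    memcpy(s, &splat, 8);        /* bytes 0..7   */
    memcpy(s + 8, &splat32, 4);  /* bytes 8..11  */
    memcpy(s + 11, &splat32, 4); /* bytes 11..14 (overlaps byte 11) */
}
```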
-
Oscar Fuentes authored
llvm-svn: 122706
-
Chris Lattner authored
llvm-svn: 122700
-
Chris Lattner authored
header for now for memset/memcpy opportunities. It turns out that loop-rotate is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for loops" into 2 basic block loops that loop-idiom was ignoring. With this fix, we form many *many* more memcpy and memsets than before, including on the "history" loops in the viterbi benchmark, which look like this: for (j=0; j<MAX_history; ++j) { history_new[i][j+1] = history[2*i][j]; } Transforming these loops into memcpy's speeds up the viterbi benchmark from 11.98s to 3.55s on my machine. Woo. llvm-svn: 122685
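For illustration, the memcpy that loop-idiom can now form for the history loop above would look roughly like this (assuming a byte element type; the signature and names exist only for the sketch):

```c
#include <string.h>

void copy_history_row(char **history_new, char **history,
                      int i, int MAX_history) {
    /* Replaces: for (j = 0; j < MAX_history; ++j)
     *               history_new[i][j+1] = history[2*i][j];  */
    memcpy(&history_new[i][1], &history[2*i][0], (size_t)MAX_history);
}
```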
-
- Jan 01, 2011
-
Chris Lattner authored
llvm-svn: 122676
-
Chris Lattner authored
llvm-svn: 122675
-
Rafael Espindola authored
llvm-svn: 122667
-
Anton Korobeynikov authored
earlyclobber stuff. This should fix PRs 2313 and 8157. Unfortunately, no testcase, since it'd be dependent on register assignments. llvm-svn: 122663
-
Duncan Sands authored
is the wrong hammer for this nail, and is probably right. llvm-svn: 122661
-
Duncan Sands authored
numbering, in which it considers (for example) "%a = add i32 %x, %y" and "%b = add i32 %x, %y" to be equal because the operands are equal and the result of the instructions only depends on the values of the operands. This has almost no effect (it removes 4 instructions from gcc-as-one-file), and perhaps slows down compilation: I measured a 0.4% slowdown on the large gcc-as-one-file testcase, but it wasn't statistically significant. llvm-svn: 122654
-
Che-Liang Chiou authored
llvm-svn: 122653
-
Che-Liang Chiou authored
llvm-svn: 122652
-