  1. Jan 30, 2011
      Teach DAGCombine to fold (sra (trunc (sr x, c1)), c2) -> (trunc (sra x,... · 946e1522
      Benjamin Kramer authored
      Teach DAGCombine to fold (sra (trunc (sr x, c1)), c2) -> (trunc (sra x, c1+c2)) when c1 equals the number of bits that are truncated off.
      
      This happens all the time when a smul is promoted to a larger type.
      
      On x86-64 we now compile "int test(int x) { return x/10; }" into
        movslq  %edi, %rax
        imulq $1717986919, %rax, %rax
        movq  %rax, %rcx
        shrq  $63, %rcx
        sarq  $34, %rax <- used to be "shrq $32, %rax; sarl $2, %eax"
        addl  %ecx, %eax
      
      This fires 96 times in gcc.c on x86-64.
      
      llvm-svn: 124559
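The fold above can be checked numerically. The following is a minimal C sketch, not the DAGCombine code itself: the function names are hypothetical, the 64-to-32-bit truncation matches the x86-64 example, and it assumes a compiler where `>>` on negative signed values is an arithmetic shift and out-of-range conversion to `int32_t` wraps (true for GCC and Clang, but implementation-defined in ISO C).

```c
#include <assert.h>
#include <stdint.h>

/* Unfolded form: shift right by c1 = 32, truncate to 32 bits, then
 * arithmetic-shift the truncated value by c2.
 * Mirrors "shrq $32, %rax; sarl $2, %eax" when c2 == 2. */
static int32_t shift_unfolded(int64_t x, unsigned c2) {
    assert(c2 < 32);
    uint32_t hi = (uint32_t)((uint64_t)x >> 32); /* srl + trunc */
    return (int32_t)hi >> c2;                    /* sra on the narrow type */
}

/* Folded form: one arithmetic shift by c1 + c2 on the wide type, then
 * truncate. Mirrors "sarq $34, %rax" when c2 == 2. */
static int32_t shift_folded(int64_t x, unsigned c2) {
    assert(c2 < 32);
    return (int32_t)(x >> (32 + c2));
}

/* Signed division by 10 following the instruction sequence in the commit
 * message: sign-extend, multiply by the magic constant, shift, add sign bit. */
static int32_t div10(int32_t x) {
    int64_t q = (int64_t)x * 1717986919;          /* movslq + imulq */
    int32_t sign = (int32_t)((uint64_t)q >> 63);  /* shrq $63       */
    return shift_folded(q, 2) + sign;             /* sarq $34; addl */
}
```

The two shift forms agree because the low 32 bits that the truncate keeps are exactly the bits the first shift moved down from the top half, so the sign bit seen by the narrow `sra` is the sign bit of the original 64-bit value.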
  2. Jan 24, 2011
  3. Jan 23, 2011
  4. Jan 18, 2011
  5. Jan 16, 2011
  6. Jan 13, 2011
  7. Jan 11, 2011
  8. Jan 10, 2011
  9. Jan 09, 2011
  10. Jan 07, 2011
  11. Jan 06, 2011
  12. Jan 02, 2011
      update a bunch of entries. · 51415d26
      Chris Lattner authored
      llvm-svn: 122700
      Allow loop-idiom to run on multiple BB loops, but still only scan the loop · ddf58010
      Chris Lattner authored
      header for now for memset/memcpy opportunities.  It turns out that loop-rotate
      is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for
      loops" into two-basic-block loops that loop-idiom was ignoring.
      
      With this fix, we form many *many* more memcpys and memsets than before, including
      on the "history" loops in the viterbi benchmark, which look like this:
      
              for (j=0; j<MAX_history; ++j) {
                history_new[i][j+1] = history[2*i][j];
              }
      
      Transforming these loops into memcpy's speeds up the viterbi benchmark from
      11.98s to 3.55s on my machine.  Woo.
      
      llvm-svn: 122685
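To illustrate the "history" loop transformation described above, here is a minimal C sketch of the loop next to the memcpy it is equivalent to. The element type (`int`), the array dimensions, and the function names are assumptions for illustration; the commit message does not show the benchmark's declarations. The rewrite is only valid when source and destination do not overlap, which is something the pass has to establish and this sketch simply assumes.

```c
#include <string.h>

#define MAX_HISTORY 8   /* hypothetical size; stands in for MAX_history */
#define NUM_STATES  4   /* hypothetical number of rows in history_new  */

/* The loop as written in the commit message: copy one row of `history`
 * into `history_new`, shifted right by one element. */
static void copy_row_loop(int history_new[][MAX_HISTORY + 1],
                          int history[][MAX_HISTORY], int i) {
    for (int j = 0; j < MAX_HISTORY; ++j) {
        history_new[i][j + 1] = history[2 * i][j];
    }
}

/* What the loop amounts to once recognized as a copy idiom: a single
 * memcpy of MAX_HISTORY contiguous elements. */
static void copy_row_memcpy(int history_new[][MAX_HISTORY + 1],
                            int history[][MAX_HISTORY], int i) {
    memcpy(&history_new[i][1], &history[2 * i][0],
           MAX_HISTORY * sizeof(int));
}
```

Both functions write the same MAX_HISTORY elements; the memcpy form lets the library use wide block copies instead of one element per iteration.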
  13. Jan 01, 2011
  14. Dec 28, 2010
  15. Dec 23, 2010
  16. Dec 19, 2010
      recognize an unsigned add with overflow idiom and turn it into uadd. · 5e0c0c72
      Chris Lattner authored
      This resolves a README entry and technically resolves PR4916,
      but we still get poor code for the testcase in that PR because
      GVN isn't CSE'ing uadd with add, filed as PR8817.
      
      Previously we got:
      
      _test7:                                 ## @test7
      	addq	%rsi, %rdi
      	cmpq	%rdi, %rsi
      	movl	$42, %eax
      	cmovaq	%rsi, %rax
      	ret
      
      Now we get:
      
      _test7:                                 ## @test7
      	addq	%rsi, %rdi
      	movl	$42, %eax
      	cmovbq	%rsi, %rax
      	ret
      
      llvm-svn: 122182
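For reference, the kind of source idiom involved can be sketched in C. This is a guess at a shape consistent with the assembly above, not the actual test7 testcase; the function names are hypothetical. The idiom detects unsigned overflow by checking whether the wrapped sum is smaller than an operand, which is the same condition as the carry flag produced by the add itself.

```c
#include <stdint.h>

/* Overflow check written as the common idiom: an unsigned sum that
 * wrapped around is smaller than either operand. Recognizing this lets
 * the compiler reuse the carry flag from the add instead of emitting a
 * separate compare. */
static uint64_t test7_idiom(uint64_t a, uint64_t b) {
    uint64_t sum = a + b;        /* wraps modulo 2^64 on overflow */
    return (sum < b) ? b : 42;
}

/* Reference version that tests for overflow without relying on
 * wraparound: a + b overflows exactly when a > UINT64_MAX - b. */
static uint64_t test7_reference(uint64_t a, uint64_t b) {
    int overflow = a > UINT64_MAX - b;
    return overflow ? b : 42;
}
```

The two functions return the same value for all inputs, which is why the compare in the "Previously" listing is redundant once the add's carry is available.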
  17. Dec 15, 2010