  1. Dec 23, 2010
    • DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. · 1f4dfbbc
      Benjamin Kramer authored
      The latter usually compiles into smaller code.
      
      example code:
      unsigned foo(unsigned x, unsigned y) {
        if (x != 0) y--;
        return y;
      }
      
      before:
        _foo:                           ## @foo
          cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
          sbbl  %eax, %eax              ## encoding: [0x19,0xc0]
          notl  %eax                    ## encoding: [0xf7,0xd0]
          addl  8(%esp), %eax           ## encoding: [0x03,0x44,0x24,0x08]
          ret                           ## encoding: [0xc3]
      
      after:
        _foo:                           ## @foo
          cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
          movl  8(%esp), %eax           ## encoding: [0x8b,0x44,0x24,0x08]
          adcl  $-1, %eax               ## encoding: [0x83,0xd0,0xff]
          ret                           ## encoding: [0xc3]
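
      For intuition, here is a minimal C check of the identity behind the
      combine (illustrative only, not part of the commit): sign-extending
      an i1 yields 0 or -1, so adding it is the same as subtracting the
      zero-extended bit.

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
          for (uint32_t x = 0; x < 8; ++x) {
            for (int c = 0; c <= 1; ++c) {
              uint32_t sext = c ? UINT32_MAX : 0; /* sext i1 -> 0 or -1 */
              uint32_t zext = (uint32_t)c;        /* zext i1 -> 0 or 1  */
              assert(x + sext == x - zext);       /* add(sext) == sub(zext) */
            }
          }
          return 0;
        }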
      
      llvm-svn: 122455
    • X86: Lower a select directly to a setcc_carry if possible. · 6020ed9d
      Benjamin Kramer authored
        int test(unsigned long a, unsigned long b) { return -(a < b); }
      compiles to
        _test:                              ## @test
          cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
          sbbl  %eax, %eax                  ## encoding: [0x19,0xc0]
          ret                               ## encoding: [0xc3]
      instead of
        _test:                              ## @test
          xorl  %ecx, %ecx                  ## encoding: [0x31,0xc9]
          cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
          movl  $-1, %eax                   ## encoding: [0xb8,0xff,0xff,0xff,0xff]
          cmovael %ecx, %eax                ## encoding: [0x0f,0x43,0xc1]
          ret                               ## encoding: [0xc3]
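
      As a sanity check (illustrative C, not from the commit): the cmpq
      sets the carry flag exactly when a < b, and sbbl %eax, %eax computes
      eax - eax - CF = -CF, which is precisely -(a < b).

        #include <assert.h>

        static int test(unsigned long a, unsigned long b) { return -(a < b); }

        int main(void) {
          assert(test(1, 2) == -1); /* a < b: CF set, sbb yields -1 */
          assert(test(2, 1) == 0);  /* a > b: CF clear, sbb yields 0 */
          assert(test(1, 1) == 0);  /* a == b: CF clear as well     */
          return 0;
        }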
      
      llvm-svn: 122451
  2. Dec 14, 2010
    • Fix a minor bug in the two-address pass. It was missing a commute opportunity. · 19dc77ce
      Evan Cheng authored
      regB = move RCX
      regA = op regB, regC
      RAX  = move regA
      where both regB and regC are killed. If regB is constrained to
      incompatible physical registers but regC is not constrained at all,
      then it's better to commute the instruction.
             movl    %edi, %eax
             shlq    $32, %rcx
             leaq    (%rcx,%rax), %rax
      =>
             movl    %edi, %eax
             shlq    $32, %rcx
             orq     %rcx, %rax
      rdar://8762995
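
      A hypothetical sketch of the commute heuristic (names and structure
      invented for illustration; this is not the actual two-address pass
      code):

        /* Commute "regA = op regB, regC" when only regB's physical-register
           constraints clash with the surrounding copies. */
        struct OpInfo {
          int commutable;     /* op regB, regC == op regC, regB      */
          int killsB, killsC; /* both sources die at the instruction */
          int constrainedB;   /* regB tied to incompatible physregs  */
          int constrainedC;   /* regC similarly constrained          */
        };

        int shouldCommute(const struct OpInfo *op) {
          return op->commutable && op->killsB && op->killsC &&
                 op->constrainedB && !op->constrainedC;
        }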
      
      llvm-svn: 121793
  3. Dec 13, 2010
    • rename test · 8e21a02c
      Chris Lattner authored
      llvm-svn: 121697
    • Add a couple dag combines to transform mulhi/mullo into a wider multiply · 10bd29f1
      Chris Lattner authored
      when the wider type is legal.  This allows us to compile:
      
      define zeroext i16 @test1(i16 zeroext %x) nounwind {
      entry:
      	%div = udiv i16 %x, 33
      	ret i16 %div
      }
      
      into:
      
      test1:                                  # @test1
      	movzwl	4(%esp), %eax
      	imull	$63551, %eax, %eax      # imm = 0xF83F
      	shrl	$21, %eax
      	ret
      
      instead of:
      
      test1:                                  # @test1
              movw    $-1985, %ax             # imm = 0xFFFFFFFFFFFFF83F
              mulw    4(%esp)
              andl    $65504, %edx            # imm = 0xFFE0
              movl    %edx, %eax
              shrl    $5, %eax
              ret
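
      The magic constant checks out exhaustively: 63551 is ceil(2^21 / 33),
      and the widened multiply-and-shift agrees with the 16-bit division
      for every input (illustrative C, not part of the commit):

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
          for (uint32_t x = 0; x <= 0xFFFF; ++x)
            assert(((x * 63551u) >> 21) == x / 33); /* imull+shrl == udiv */
          return 0;
        }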
      
      Implementing rdar://8760399 and example #4 from:
      http://blog.regehr.org/archives/320
      
      We should implement the same thing for [su]mul_lohi, but I don't
      have immediate plans to do this.
      
      llvm-svn: 121696
  4. Dec 10, 2010
    • Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view. · 8b08f523
      Nate Begeman authored
      Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and we take "HasAVX" to mean "always use VEX". Passing -mattr=-avx,+sse42 should restore legacy SSE support when desirable.
      
      llvm-svn: 121439
  5. Dec 05, 2010
    • Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags result. · 68861717
      Chris Lattner authored
      This allows us to compile:
      
      void *test12(long count) {
            return new int[count];
      }
      
      into:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	movq	$-1, %rdi
      	cmovnoq	%rax, %rdi
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	seto	%cl
      	testb	%cl, %cl
      	movq	$-1, %rdi
      	cmoveq	%rax, %rdi
      	jmp	__Znam
      
      Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
      which would eliminate the need for the 'movq %rdi, %rax'.
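
      An illustrative C equivalent of what the mulq/cmovnoq sequence
      computes (the function name is invented; __builtin_mul_overflow is a
      GCC/Clang builtin used here for clarity):

        #include <stdint.h>

        uint64_t array_bytes(uint64_t count) {
          uint64_t bytes;
          /* mulq sets OF/CF on unsigned overflow; cmovnoq keeps the
             product only when no overflow occurred, else -1 survives. */
          if (__builtin_mul_overflow(count, (uint64_t)4, &bytes))
            return UINT64_MAX;
          return bytes;
        }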
      
      llvm-svn: 120936
    • It turns out that when ".with.overflow" intrinsics were added to the X86 backend, they were all implemented except umul. · 364bb0a0
      Chris Lattner authored
      umul fell back to the default implementation, which did a hi/lo
      multiply and compared the top half. Fix this to check the overflow
      flag that the 'mul' instruction sets, so we can avoid an explicit
      test. Now we compile:
      
      void *func(long count) {
            return new int[count];
      }
      
      into:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
      	testb	%cl, %cl                ## encoding: [0x84,0xc9]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      Other than the silly seto+test, this is using the o bit directly, so it's going in the right
      direction.
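
      For contrast, an illustrative C version of the old generic lowering
      (the __uint128_t widening is my sketch, not the backend's code): it
      tests the high half of a double-width product, which is what the
      testq %rdx, %rdx was doing.

        #include <stdint.h>

        int umul_overflows(uint64_t a, uint64_t b, uint64_t *lo) {
          __uint128_t wide = (__uint128_t)a * b;
          *lo = (uint64_t)wide;
          return (uint64_t)(wide >> 64) != 0; /* nonzero high half */
        }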
      
      llvm-svn: 120935
    • fix the rest of the linux miscompares :) · 183ddd8e
      Chris Lattner authored
      llvm-svn: 120933
    • generalize the previous check to handle -1 on either side of the select, inserting a not to compensate. · 116580a1
      Chris Lattner authored
      Add a missing isZero check that I lost somehow.
      
      This improves codegen of:
      
      void *func(long count) {
            return new int[count];
      }
      
      from:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      to:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	cmpq	$1, %rdx                ## encoding: [0x48,0x83,0xfa,0x01]
      	sbbq	%rdi, %rdi              ## encoding: [0x48,0x19,0xff]
      	notq	%rdi                    ## encoding: [0x48,0xf7,0xd7]
      	orq	%rax, %rdi              ## encoding: [0x48,0x09,0xc7]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
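
      An illustrative C rendering of the branchless select the new sequence
      computes (function name invented): cmpq $1, %rdx sets CF exactly when
      %rdx is zero, sbb materializes that as a 0/-1 value, not turns it into
      an all-ones mask precisely when the high half is nonzero, and orq
      folds in the product.

        #include <stdint.h>

        uint64_t select_or_minus1(uint64_t product, uint64_t hi) {
          uint64_t mask = (hi != 0) ? UINT64_MAX : 0; /* cmp+sbb+not */
          return product | mask;                      /* orq: product or -1 */
        }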
      
      llvm-svn: 120932