  1. Jan 27, 2011
    • David Greene · 34f7c0d8
      [AVX] Clean up the code to configure target lowering for AVX.  Specify
      how to lower more/new operations.  This is a prerequisite for adding
      additional AVX lowering.
      
      llvm-svn: 124447
  2. Jan 26, 2011
    • David Greene · bab5e6ed
      [AVX] Add INSERT_SUBVECTOR and support it on x86.  This provides a
      default implementation for x86, going through the stack in a similar
      fashion to how the codegen implements BUILD_VECTOR.  Eventually this
      will get matched to VINSERTF128 if AVX is available.
      
      llvm-svn: 124307
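
      A minimal sketch of where this is headed (illustration only, not part
      of the commit): once the lowering matches VINSERTF128, a 128-bit lane
      insert written with the AVX intrinsics compiles to a single
      instruction instead of a stack round trip:

        #include <immintrin.h>

        /* Insert `lane` into the upper 128-bit half of `wide`; under AVX
           this is a single vinsertf128 with immediate lane index 1. */
        __m256 insert_high_lane(__m256 wide, __m128 lane) {
            return _mm256_insertf128_ps(wide, lane, 1);
        }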
    • David Greene · b6f16119
      [AVX] Support EXTRACT_SUBVECTOR on x86.  This provides a default
      implementation of EXTRACT_SUBVECTOR for x86, going through the stack
      in a similar fashion to how the codegen implements BUILD_VECTOR.
      Eventually this will get matched to VEXTRACTF128 if AVX is available.
      
      llvm-svn: 124292
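
      The matching extract, again as an illustrative sketch: with AVX
      available, pulling out a 128-bit half should become one VEXTRACTF128
      rather than a store/reload through the stack:

        #include <immintrin.h>

        /* Extract the upper 128-bit half of a 256-bit vector; under AVX
           this is a single vextractf128 with immediate lane index 1. */
        __m128 extract_high_lane(__m256 wide) {
            return _mm256_extractf128_ps(wide, 1);
        }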
    • NAKAMURA Takumi · 0cfdac07
      Target/X86: Tweak win64's tailcall.
      llvm-svn: 124272
    • NAKAMURA Takumi · 9d29eff1
      Fix whitespace.
      llvm-svn: 124270
  3. Dec 23, 2010
    • Benjamin Kramer · 6020ed9d
      X86: Lower a select directly to a setcc_carry if possible.
        int test(unsigned long a, unsigned long b) { return -(a < b); }
      compiles to
        _test:                              ## @test
          cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
          sbbl  %eax, %eax                  ## encoding: [0x19,0xc0]
          ret                               ## encoding: [0xc3]
      instead of
        _test:                              ## @test
          xorl  %ecx, %ecx                  ## encoding: [0x31,0xc9]
          cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
          movl  $-1, %eax                   ## encoding: [0xb8,0xff,0xff,0xff,0xff]
          cmovael %ecx, %eax                ## encoding: [0x0f,0x43,0xc1]
          ret                               ## encoding: [0xc3]
      
      llvm-svn: 122451
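
      Why the sbb form works, as a short sketch (the underlying identities,
      not code from the commit): after cmpq %rsi, %rdi the carry flag holds
      the unsigned comparison a < b, and sbb of a register with itself
      computes reg - reg - CF = -CF:

        int test(unsigned long a, unsigned long b) {
            int cf = (a < b);  /* carry flag set by cmpq %rsi, %rdi */
            return -cf;        /* sbbl %eax, %eax: eax - eax - CF = -CF */
        }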
  4. Dec 10, 2010
    • Nate Begeman · 8b08f523
      Formalize the notion that AVX and SSE are non-overlapping extensions
      from the compiler's point of view.  Per email discussion, we either
      want to always use VEX-prefixed instructions or never use them, and
      are taking "HasAVX" to mean "Always use VEX".  Passing
      -mattr=-avx,+sse42 should serve to restore legacy SSE support when
      desirable.
      
      llvm-svn: 121439
  5. Dec 05, 2010
    • Chris Lattner · 68861717
      Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags
      result.  This allows us to compile:
      
      void *test12(long count) {
            return new int[count];
      }
      
      into:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	movq	$-1, %rdi
      	cmovnoq	%rax, %rdi
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	seto	%cl
      	testb	%cl, %cl
      	movq	$-1, %rdi
      	cmoveq	%rax, %rdi
      	jmp	__Znam
      
      Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
      which would eliminate the need for the 'movq %rdi, %rax'.
      
      llvm-svn: 120936
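
      A sketch of the semantics being selected here (helper name
      hypothetical, malloc-free pseudocode via the GCC/Clang overflow
      builtin, assuming an LP64 target): the size for new int[count] is
      count * sizeof(int), clamped to -1 -- an impossible allocation size --
      when the multiply overflows, which is exactly what the mulq/cmovnoq
      pair encodes before the tail call to __Znam:

        unsigned long array_new_size(unsigned long count) {
            unsigned long size;            /* mulq result in %rax */
            if (__builtin_umull_overflow(count, sizeof(int), &size))
                return (unsigned long)-1;  /* movq $-1, %rdi survives */
            return size;                   /* cmovnoq %rax, %rdi */
        }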
    • Chris Lattner · 364bb0a0
      It turns out that when the ".with.overflow" intrinsics were added to the
      X86 backend, they were all implemented except umul.  This one fell back
      to the default implementation that did a hi/lo multiply and compared the
      top.  Fix this to check the overflow flag that the 'mul' instruction
      sets, so we can avoid an explicit test.  Now we compile:
      
      void *func(long count) {
            return new int[count];
      }
      
      into:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
      	testb	%cl, %cl                ## encoding: [0x84,0xc9]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      Other than the silly seto+test, this is using the o bit directly, so it's going in the right
      direction.
      
      llvm-svn: 120935
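
      Spelled out as C (a sketch assuming a 64-bit target with GCC/Clang's
      unsigned __int128), the default lowering's check was:

        int overflows_by_high_half(unsigned long a, unsigned long b) {
            unsigned __int128 wide = (unsigned __int128)a * b; /* mulq: %rdx:%rax */
            return (unsigned long)(wide >> 64) != 0;           /* testq %rdx, %rdx */
        }

      whereas the mul instruction's overflow flag already records the same
      fact, so checking it directly drops the explicit test.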
    • Chris Lattner · 116580a1
      Generalize the previous check to handle -1 on either side of the
      select, inserting a not to compensate.  Add a missing isZero check
      that I lost somehow.
      
      This improves codegen of:
      
      void *func(long count) {
            return new int[count];
      }
      
      from:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      to:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	cmpq	$1, %rdx                ## encoding: [0x48,0x83,0xfa,0x01]
      	sbbq	%rdi, %rdi              ## encoding: [0x48,0x19,0xff]
      	notq	%rdi                    ## encoding: [0x48,0xf7,0xd7]
      	orq	%rax, %rdi              ## encoding: [0x48,0x09,0xc7]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      llvm-svn: 120932
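
      The branch-free select the new listing implements, as a hedged C
      sketch (function name hypothetical): cmpq $1, %rdx sets the carry
      flag exactly when %rdx == 0, sbb materializes -CF, and the not/or
      pair turns that into "product on success, -1 on overflow":

        unsigned long select_result(unsigned long hi, unsigned long product) {
            unsigned long borrow = 0ul - (hi == 0); /* cmpq $1 + sbbq: -CF  */
            unsigned long mask   = ~borrow;         /* notq: ~0 iff hi != 0 */
            return mask | product;                  /* orq: product or -1   */
        }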
    • Chris Lattner · 342e6ea5
      Improve an integer select optimization in two ways:
      1. generalize 
          (select (x == 0), -1, 0) -> (sign_bit (x - 1))
      to:
          (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y
      
      2. Handle the identical pattern that happens with !=:
         (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y
      
      cmov is often high latency and can't fold immediates or
      memory operands.  For example for (x == 0) ? -1 : 1, before 
      we got:
      
      < 	testb	%sil, %sil
      < 	movl	$-1, %ecx
      < 	movl	$1, %eax
      < 	cmovel	%ecx, %eax
      
      now we get:
      
      > 	cmpb	$1, %sil
      > 	sbbl	%eax, %eax
      > 	orl	$1, %eax
      
      llvm-svn: 120929
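
      The same idea in C, as an illustrative sketch: the sbb-based sequence
      materializes an all-ones mask exactly when x == 0 and ors the other
      select arm into it, so no cmov and no extra immediate register are
      needed:

        /* (x == 0) ? -1 : y  ==  mask | y, where mask = -(x == 0);
           on x86 the mask is the cmp $1 / sbb pair from the listing. */
        int select_eq_zero(unsigned x, int y) {
            int mask = -(x == 0);
            return mask | y;
        }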