  1. Feb 05, 2011
    • David Greene's avatar
      · 96d07a82
      David Greene authored
      [AVX] Revert 124910 until clients are ready.
      
      llvm-svn: 124912
      96d07a82
    • David Greene's avatar
      · bdd48150
      David Greene authored
      [AVX] Add some utilities to insert and extract 128-bit subvectors.
      This allows us to easily support 256-bit operations that don't have
      native 256-bit support.  This applies to integer operations, certain
      types of shuffles and various other things.
      
      llvm-svn: 124910
      bdd48150
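The idea in this commit, building a 256-bit operation out of 128-bit pieces, can be sketched outside the compiler. The following is only a scalar model, not the LLVM utilities themselves: the type aliases `V8i32`/`V4i32` and the helpers `extract128`, `insert128`, `add128`, and `add256` are hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a 256-bit and a 128-bit vector of 32-bit lanes.
using V8i32 = std::array<std::uint32_t, 8>;
using V4i32 = std::array<std::uint32_t, 4>;

// Extract the 128-bit half that starts at element index idx (0 or 4).
V4i32 extract128(const V8i32 &v, std::size_t idx) {
  V4i32 out{};
  for (std::size_t i = 0; i < 4; ++i)
    out[i] = v[idx + i];
  return out;
}

// Return a copy of v with the 128-bit half at idx replaced by sub.
V8i32 insert128(V8i32 v, const V4i32 &sub, std::size_t idx) {
  for (std::size_t i = 0; i < 4; ++i)
    v[idx + i] = sub[i];
  return v;
}

// Stand-in for an operation that is assumed to exist only at 128 bits.
V4i32 add128(const V4i32 &a, const V4i32 &b) {
  V4i32 out{};
  for (std::size_t i = 0; i < 4; ++i)
    out[i] = a[i] + b[i];
  return out;
}

// A 256-bit add assembled from two 128-bit adds.
V8i32 add256(const V8i32 &a, const V8i32 &b) {
  V8i32 result{};
  result = insert128(result, add128(extract128(a, 0), extract128(b, 0)), 0);
  result = insert128(result, add128(extract128(a, 4), extract128(b, 4)), 4);
  return result;
}
```

Splitting on element index 0 and 4 mirrors the low and high 128-bit halves of a 256-bit register holding 32-bit elements.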
  2. Feb 04, 2011
    • David Greene's avatar
      · 653f1eed
      David Greene authored
      [AVX] Support VINSERTF128 with more patterns and appropriate
      infrastructure.  This makes lowering 256-bit vectors to 128-bit
      vectors simple when 256-bit vector support is not available.
      
      llvm-svn: 124868
      653f1eed
  3. Feb 03, 2011
    • David Greene's avatar
      · c4da110f
      David Greene authored
      [AVX] VEXTRACTF128 support.  This commit includes patterns for
      matching EXTRACT_SUBVECTOR to VEXTRACTF128 along with support routines
      to examine and translate index values.  VINSERTF128 comes next.  With
      these two in place we can begin supporting more AVX operations as
      INSERT/EXTRACT can be used as a fallback when 256-bit support is not
      available.
      
      llvm-svn: 124797
      c4da110f
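The "examine and translate index values" step can be illustrated with a little arithmetic. This is a sketch under the assumption that the element index is checked for 128-bit alignment and then converted to a chunk number for the instruction's immediate; the function names `startsOn128BitBoundary` and `chunkImmediate` are hypothetical and are not taken from the commit.

```cpp
// Hypothetical helper: does element eltIdx begin a 128-bit chunk?
// eltBits is the width of one vector element in bits.
bool startsOn128BitBoundary(unsigned eltIdx, unsigned eltBits) {
  return (eltIdx * eltBits) % 128 == 0;
}

// Hypothetical helper: translate an element index into the number of the
// 128-bit chunk that contains it (0 = low half, 1 = high half of 256 bits).
unsigned chunkImmediate(unsigned eltIdx, unsigned eltBits) {
  return (eltIdx * eltBits) / 128;
}
```

For a vector of eight 32-bit elements, index 4 is the first element of the high half, so it maps to chunk 1, while index 2 falls in the middle of the low half and would not qualify as a subvector start.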
    • Rafael Espindola's avatar
      Fix PR9127 by reversing the operands even if they have more than one use. · d11311f2
      Rafael Espindola authored
      Reversing the operands allows us to fold, but doesn't force us to. Also, at
      this point the DAG is still being optimized, so the check for hasOneUse is not
      very precise.
      
      llvm-svn: 124773
      d11311f2
  4. Feb 01, 2011
  5. Jan 31, 2011
  6. Jan 27, 2011
    • David Greene's avatar
      · 34f7c0d8
      David Greene authored
      [AVX] Clean up the code to configure target lowering for AVX.  Specify
      how to lower more/new operations.  This is a prerequisite for adding
      additional AVX lowering.
      
      llvm-svn: 124447
      34f7c0d8
  7. Jan 26, 2011
    • David Greene's avatar
      · bab5e6ed
      David Greene authored
      [AVX] Add INSERT_SUBVECTOR and support it on x86.  This provides a
      default implementation for x86, going through the stack in a similar
      fashion to how the codegen implements BUILD_VECTOR.  Eventually this
      will get matched to VINSERTF128 if AVX is available.
      
      llvm-svn: 124307
      bab5e6ed
    • David Greene's avatar
      · b6f16119
      David Greene authored
      [AVX] Support EXTRACT_SUBVECTOR on x86.  This provides a default
      implementation of EXTRACT_SUBVECTOR for x86, going through the stack
      in a similar fashion to how the codegen implements BUILD_VECTOR.
      Eventually this will get matched to VEXTRACTF128 if AVX is available.
      
      llvm-svn: 124292
      b6f16119
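"Going through the stack" for EXTRACT_SUBVECTOR, and likewise for INSERT_SUBVECTOR, can be pictured as spilling the whole vector to a memory slot and then reading or overwriting part of it. The sketch below models that with `std::memcpy`. It is an illustration only: `extractViaStack`, `insertViaStack`, and the array aliases are hypothetical names, and the real lowering operates on SelectionDAG nodes rather than C++ arrays.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

using V8f32 = std::array<float, 8>;  // models a 256-bit vector of floats
using V4f32 = std::array<float, 4>;  // models a 128-bit vector of floats

// Spill the wide vector to a slot, then reload four elements starting at idx.
V4f32 extractViaStack(const V8f32 &v, std::size_t idx) {
  unsigned char slot[8 * sizeof(float)];
  std::memcpy(slot, v.data(), 8 * sizeof(float));
  V4f32 out{};
  std::memcpy(out.data(), slot + idx * sizeof(float), 4 * sizeof(float));
  return out;
}

// Spill the wide vector, overwrite four elements at idx, reload the result.
V8f32 insertViaStack(const V8f32 &v, const V4f32 &sub, std::size_t idx) {
  unsigned char slot[8 * sizeof(float)];
  std::memcpy(slot, v.data(), 8 * sizeof(float));
  std::memcpy(slot + idx * sizeof(float), sub.data(), 4 * sizeof(float));
  V8f32 out{};
  std::memcpy(out.data(), slot, 8 * sizeof(float));
  return out;
}
```

The round trip through memory is what the later VEXTRACTF128 and VINSERTF128 patterns are meant to avoid when AVX is available.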
    • NAKAMURA Takumi's avatar
      Target/X86: Tweak win64's tailcall. · 0cfdac07
      NAKAMURA Takumi authored
      llvm-svn: 124272
      0cfdac07
    • NAKAMURA Takumi's avatar
      Fix whitespace. · 9d29eff1
      NAKAMURA Takumi authored
      llvm-svn: 124270
      9d29eff1
  8. Jan 16, 2011
  9. Jan 10, 2011
  10. Jan 08, 2011
  11. Jan 07, 2011
  12. Jan 06, 2011
  13. Dec 23, 2010
    • Benjamin Kramer's avatar
      X86: Lower a select directly to a setcc_carry if possible. · 6020ed9d
      Benjamin Kramer authored
        int test(unsigned long a, unsigned long b) { return -(a < b); }
      compiles to
        _test:                              ## @test
          cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
          sbbl  %eax, %eax                  ## encoding: [0x19,0xc0]
          ret                               ## encoding: [0xc3]
      instead of
        _test:                              ## @test
          xorl  %ecx, %ecx                  ## encoding: [0x31,0xc9]
          cmpq  %rsi, %rdi                  ## encoding: [0x48,0x39,0xf7]
          movl  $-1, %eax                   ## encoding: [0xb8,0xff,0xff,0xff,0xff]
          cmovael %ecx, %eax                ## encoding: [0x0f,0x43,0xc1]
          ret                               ## encoding: [0xc3]
      
      llvm-svn: 122451
      6020ed9d
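Why `cmp` followed by `sbb %eax, %eax` is enough here: subtracting a register from itself with borrow yields 0 minus the carry flag, which is all-ones exactly when the unsigned compare borrowed. The following sketch models that in plain C++. `viaNegate` is the function from the commit message under a different name; `viaSelect` and `viaBorrow` are hypothetical names used only for this illustration.

```cpp
#include <cstdint>

// Source-level form from the commit message.
int viaNegate(unsigned long a, unsigned long b) { return -(a < b); }

// The same value written as a select between -1 and 0.
int viaSelect(unsigned long a, unsigned long b) { return a < b ? -1 : 0; }

// Model of "cmp; sbb %eax, %eax": the compare sets carry when a < b
// (unsigned borrow), and sbb computes eax - eax - carry.
std::uint32_t viaBorrow(unsigned long a, unsigned long b) {
  std::uint32_t carry = a < b ? 1u : 0u;
  std::uint32_t eax = 0;    // the starting value is irrelevant: eax - eax is 0
  eax = eax - eax - carry;  // 0 when no borrow, 0xFFFFFFFF when borrow
  return eax;
}
```

All three produce the same bit pattern, which is why the select can be lowered without the `xor`, `mov`, and `cmov` sequence.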
  14. Dec 21, 2010
  15. Dec 20, 2010
  16. Dec 19, 2010
  17. Dec 17, 2010
  18. Dec 10, 2010
    • Nate Begeman's avatar
      Formalize the notion that AVX and SSE are non-overlapping extensions from the... · 8b08f523
      Nate Begeman authored
      Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view.  Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX".  Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable.
      
      llvm-svn: 121439
      8b08f523
  19. Dec 09, 2010
  20. Dec 05, 2010
    • Chris Lattner's avatar
      Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags · 68861717
      Chris Lattner authored
      result.  This allows us to compile:
      
      void *test12(long count) {
            return new int[count];
      }
      
      into:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	movq	$-1, %rdi
      	cmovnoq	%rax, %rdi
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	seto	%cl
      	testb	%cl, %cl
      	movq	$-1, %rdi
      	cmoveq	%rax, %rdi
      	jmp	__Znam
      
      Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
      which would eliminate the need for the 'movq %rdi, %rax'.
      
      llvm-svn: 120936
      68861717
    • Chris Lattner's avatar
      It turns out that when the ".with.overflow" intrinsics were added to the X86 · 364bb0a0
      Chris Lattner authored
      backend, they were all implemented except umul.  This one fell back
      to the default implementation that did a hi/lo multiply and compared the
      top.  Fix this to check the overflow flag that the 'mul' instruction
      sets, so we can avoid an explicit test.  Now we compile:
      
      void *func(long count) {
            return new int[count];
      }
      
      into:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
      	testb	%cl, %cl                ## encoding: [0x84,0xc9]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      Other than the silly seto+test, this is using the o bit directly, so it's going in the right
      direction.
      
      llvm-svn: 120935
      364bb0a0
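The computation the assembly above performs for `new int[count]` is a multiply with an overflow check, passing an all-ones size to the allocator when the multiply overflows. A portable model of that logic might look as follows. It assumes a 4-byte `int` and treats the count as unsigned; `MulResult`, `umulWithOverflow`, and `arrayAllocSize` are hypothetical names, and the division-based check stands in for the hardware overflow flag.

```cpp
#include <cstdint>
#include <limits>

// Result of an unsigned multiply together with its overflow indication.
struct MulResult {
  std::uint64_t value;  // low 64 bits of the product
  bool overflow;        // true when the full product does not fit in 64 bits
};

// Portable model of an unsigned multiply-with-overflow operation.
MulResult umulWithOverflow(std::uint64_t a, std::uint64_t b) {
  bool overflow = b != 0 && a > std::numeric_limits<std::uint64_t>::max() / b;
  return {a * b, overflow};
}

// Size handed to the allocator for an array of count 4-byte elements:
// the product normally, all-ones if the multiply overflowed.
std::uint64_t arrayAllocSize(std::uint64_t count) {
  MulResult r = umulWithOverflow(count, 4);
  return r.overflow ? ~std::uint64_t{0} : r.value;
}
```

On x86 the `mul` instruction already reports this condition in the overflow flag, which is what lets the backend drop the separate test of the high half.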
    • Chris Lattner's avatar
      generalize the previous check to handle -1 on either side of the · 116580a1
      Chris Lattner authored
      select, inserting a not to compensate.  Add a missing isZero check
      that I lost somehow.
      
      This improves codegen of:
      
      void *func(long count) {
            return new int[count];
      }
      
      from:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      to:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	cmpq	$1, %rdx                ## encoding: [0x48,0x83,0xfa,0x01]
      	sbbq	%rdi, %rdi              ## encoding: [0x48,0x19,0xff]
      	notq	%rdi                    ## encoding: [0x48,0xf7,0xd7]
      	orq	%rax, %rdi              ## encoding: [0x48,0x09,0xc7]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      llvm-svn: 120932
      116580a1
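The `cmpq $1` / `sbbq` / `notq` / `orq` sequence in the improved output computes a select with -1 on one side without a conditional move. The step-by-step model below is a sketch for illustration: `lo` and `hi` stand for the low and high halves of the product in `%rax` and `%rdx`, and the function names `viaSelect` and `viaMask` are hypothetical.

```cpp
#include <cstdint>

// The select being lowered: keep lo when the high half is zero, else -1.
std::uint64_t viaSelect(std::uint64_t lo, std::uint64_t hi) {
  return hi == 0 ? lo : ~std::uint64_t{0};
}

// The same value computed the way the branch-free sequence does it.
std::uint64_t viaMask(std::uint64_t lo, std::uint64_t hi) {
  std::uint64_t carry = hi < 1 ? 1 : 0;           // cmpq $1, %rdx: borrow iff hi == 0
  std::uint64_t mask = std::uint64_t{0} - carry;  // sbbq %rdi, %rdi: all-ones iff hi == 0
  mask = ~mask;                                   // notq %rdi: all-ones iff hi != 0
  return mask | lo;                               // orq %rax, %rdi
}
```

The inserted `not` is the compensation the commit message mentions: `sbb` naturally produces all-ones for the "no overflow" case, but the -1 is wanted in the other arm of the select.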