Skip to content
  1. Mar 11, 2009
  2. Mar 07, 2009
    • Dan Gohman's avatar
      Arithmetic instructions don't set EFLAGS bits OF and CF bits · ff659b5b
      Dan Gohman authored
      the same say the "test" instruction does in overflow cases,
      so eliminating the test is only safe when those bits aren't
      needed, as is the case for COND_E and COND_NE, or if it
      can be proven that no overflow will occur. For now, just
      restrict the optimization to COND_E and COND_NE and don't
      do any overflow analysis.
      
      llvm-svn: 66318
      ff659b5b
  3. Mar 05, 2009
  4. Mar 04, 2009
  5. Feb 27, 2009
    • Rafael Espindola's avatar
      Refactor TLS code and add some tests. The tests and expected results are: · 000421ea
      Rafael Espindola authored
       pic |  declaration | linkage  | visibility |
      
      !pic |  declaration | external | default    | tls1.ll     tls2.ll     | local exec
       pic |  declaration | external | default    | tls1-pic.ll tls2-pic.ll | general dynamic
      !pic | !declaration | external | default    | tls3.ll     tls4.ll     | initial exec
       pic | !declaration | external | default    | tls3-pic.ll tls4-pic.ll | general dynamic
      
      !pic |  declaration | external | hidden     | tls7.ll     tls8.ll     | local exec
       pic |  declaration | external | hidden     | X                       | local dynamic
      !pic | !declaration | external | hidden     | tls9.ll     tls10.ll    | local exec
       pic | !declaration | external | hidden     | X                       | local dynamic
      
      !pic |  declaration | internal | default    | tls5.ll     tls6.ll     | local exec
       pic |  declaration | internal | default    | X                       | local dynamic
      
      The ones marked with an X have not been implemented since local dynamic is not implemented.
      
      llvm-svn: 65632
      000421ea
  6. Feb 25, 2009
  7. Feb 23, 2009
    • Evan Cheng's avatar
      Only v1i16 (i.e. _m64) is returned via RAX / RDX. · 9f8fddee
      Evan Cheng authored
      llvm-svn: 65313
      9f8fddee
    • Nate Begeman's avatar
      Generate better code for v8i16 shuffles on SSE2 · e684da3e
      Nate Begeman authored
      Generate better code for v16i8 shuffles on SSE2 (avoids stack)
      Generate pshufb for v8i16 and v16i8 shuffles on SSSE3 where it is fewer uops.
      Document the shuffle matching logic and add some FIXMEs for later further
        cleanups.
      New tests that test the above.
      
      Examples:
      
      New:
      _shuf2:
      	pextrw	$7, %xmm0, %eax
      	punpcklqdq	%xmm1, %xmm0
      	pshuflw	$128, %xmm0, %xmm0
      	pinsrw	$2, %eax, %xmm0
      
      Old:
      _shuf2:
      	pextrw	$2, %xmm0, %eax
      	pextrw	$7, %xmm0, %ecx
      	pinsrw	$2, %ecx, %xmm0
      	pinsrw	$3, %eax, %xmm0
      	movd	%xmm1, %eax
      	pinsrw	$4, %eax, %xmm0
      	ret
      
      =========
      
      New:
      _shuf4:
      	punpcklqdq	%xmm1, %xmm0
      	pshufb	LCPI1_0, %xmm0
      
      Old:
      _shuf4:
      	pextrw	$3, %xmm0, %eax
      	movsd	%xmm1, %xmm0
      	pextrw	$3, %xmm1, %ecx
      	pinsrw	$4, %ecx, %xmm0
      	pinsrw	$5, %eax, %xmm0
      
      ========
      
      New:
      _shuf1:
      	pushl	%ebx
      	pushl	%edi
      	pushl	%esi
      	pextrw	$1, %xmm0, %eax
      	rolw	$8, %ax
      	movd	%xmm0, %ecx
      	rolw	$8, %cx
      	pextrw	$5, %xmm0, %edx
      	pextrw	$4, %xmm0, %esi
      	pextrw	$3, %xmm0, %edi
      	pextrw	$2, %xmm0, %ebx
      	movaps	%xmm0, %xmm1
      	pinsrw	$0, %ecx, %xmm1
      	pinsrw	$1, %eax, %xmm1
      	rolw	$8, %bx
      	pinsrw	$2, %ebx, %xmm1
      	rolw	$8, %di
      	pinsrw	$3, %edi, %xmm1
      	rolw	$8, %si
      	pinsrw	$4, %esi, %xmm1
      	rolw	$8, %dx
      	pinsrw	$5, %edx, %xmm1
      	pextrw	$7, %xmm0, %eax
      	rolw	$8, %ax
      	movaps	%xmm1, %xmm0
      	pinsrw	$7, %eax, %xmm0
      	popl	%esi
      	popl	%edi
      	popl	%ebx
      	ret
      
      Old:
      _shuf1:
      	subl	$252, %esp
      	movaps	%xmm0, (%esp)
      	movaps	%xmm0, 16(%esp)
      	movaps	%xmm0, 32(%esp)
      	movaps	%xmm0, 48(%esp)
      	movaps	%xmm0, 64(%esp)
      	movaps	%xmm0, 80(%esp)
      	movaps	%xmm0, 96(%esp)
      	movaps	%xmm0, 224(%esp)
      	movaps	%xmm0, 208(%esp)
      	movaps	%xmm0, 192(%esp)
      	movaps	%xmm0, 176(%esp)
      	movaps	%xmm0, 160(%esp)
      	movaps	%xmm0, 144(%esp)
      	movaps	%xmm0, 128(%esp)
      	movaps	%xmm0, 112(%esp)
      	movzbl	14(%esp), %eax
      	movd	%eax, %xmm1
      	movzbl	22(%esp), %eax
      	movd	%eax, %xmm2
      	punpcklbw	%xmm1, %xmm2
      	movzbl	42(%esp), %eax
      	movd	%eax, %xmm1
      	movzbl	50(%esp), %eax
      	movd	%eax, %xmm3
      	punpcklbw	%xmm1, %xmm3
      	punpcklbw	%xmm2, %xmm3
      	movzbl	77(%esp), %eax
      	movd	%eax, %xmm1
      	movzbl	84(%esp), %eax
      	movd	%eax, %xmm2
      	punpcklbw	%xmm1, %xmm2
      	movzbl	104(%esp), %eax
      	movd	%eax, %xmm1
      	punpcklbw	%xmm1, %xmm0
      	punpcklbw	%xmm2, %xmm0
      	movaps	%xmm0, %xmm1
      	punpcklbw	%xmm3, %xmm1
      	movzbl	127(%esp), %eax
      	movd	%eax, %xmm0
      	movzbl	135(%esp), %eax
      	movd	%eax, %xmm2
      	punpcklbw	%xmm0, %xmm2
      	movzbl	155(%esp), %eax
      	movd	%eax, %xmm0
      	movzbl	163(%esp), %eax
      	movd	%eax, %xmm3
      	punpcklbw	%xmm0, %xmm3
      	punpcklbw	%xmm2, %xmm3
      	movzbl	188(%esp), %eax
      	movd	%eax, %xmm0
      	movzbl	197(%esp), %eax
      	movd	%eax, %xmm2
      	punpcklbw	%xmm0, %xmm2
      	movzbl	217(%esp), %eax
      	movd	%eax, %xmm4
      	movzbl	225(%esp), %eax
      	movd	%eax, %xmm0
      	punpcklbw	%xmm4, %xmm0
      	punpcklbw	%xmm2, %xmm0
      	punpcklbw	%xmm3, %xmm0
      	punpcklbw	%xmm1, %xmm0
      	addl	$252, %esp
      	ret
      
      llvm-svn: 65311
      e684da3e
    • Scott Michel's avatar
      Introduce the BuildVectorSDNode class that encapsulates the ISD::BUILD_VECTOR · 9d31aca6
      Scott Michel authored
      instruction. The class also consolidates the code for detecting constant
      splats that's shared across PowerPC and the CellSPU backends (and might be
      useful for other backends.) Also introduces SelectionDAG::getBUID_VECTOR() for
      generating new BUILD_VECTOR nodes.
      
      llvm-svn: 65296
      9d31aca6
  8. Feb 22, 2009
  9. Feb 20, 2009
  10. Feb 17, 2009
  11. Feb 13, 2009
  12. Feb 12, 2009
  13. Feb 07, 2009
  14. Feb 06, 2009
  15. Feb 04, 2009
  16. Feb 03, 2009
  17. Feb 02, 2009
  18. Feb 01, 2009
  19. Jan 31, 2009
  20. Jan 30, 2009
Loading