Skip to content
  1. Oct 01, 2010
    • Dale Johannesen's avatar
      Massive rewrite of MMX: · dd224d23
      Dale Johannesen authored
      The x86_mmx type is used for MMX intrinsics, parameters and
      return values where these use MMX registers, and is also
      supported in load, store, and bitcast.
      
      Only the above operations generate MMX instructions, and optimizations
      do not operate on or produce MMX intrinsics. 
      
      MMX-sized vectors <2 x i32> etc. are lowered to XMM or split into
      smaller pieces.  Optimizations may occur on these forms and the
      result casted back to x86_mmx, provided the result feeds into a
      previous existing x86_mmx operation.
      
      The point of all this is prevent optimizations from introducing
      MMX operations, which is unsafe due to the EMMS problem.
      
      llvm-svn: 115243
      dd224d23
  2. Sep 22, 2010
  3. Sep 21, 2010
  4. Sep 13, 2010
  5. Sep 01, 2010
  6. Aug 31, 2010
  7. Aug 21, 2010
  8. Aug 11, 2010
    • Bruno Cardoso Lopes's avatar
      Add AVX matching patterns to Packed Bit Test intrinsics. · 91d61df3
      Bruno Cardoso Lopes authored
      Apply the same approach of SSE4.1 ptest intrinsics but
      create a new x86 node "testp" since AVX introduces
      vtest{ps}{pd} instructions which set ZF and CF depending
      on sign bit AND and ANDN of packed floating-point sources.
      
      This is slightly different from what the "ptest" does.
      Tests comming with the other 256 intrinsics tests.
      
      llvm-svn: 110744
      91d61df3
  9. Jul 28, 2010
    • Nate Begeman's avatar
      ~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller... · 269a6da0
      Nate Begeman authored
      ~40% faster vector shl <4 x i32> on SSE 4.1  Larger improvements for smaller types coming in future patches.
      
      For:
      
      define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
      entry:
        %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
        %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
        ret <2 x i64> %tmp2
      }
      
      We get:
      
      _shl:                                   ## @shl
      	pslld	$23, %xmm1
      	paddd	LCPI0_0, %xmm1
      	cvttps2dq	%xmm1, %xmm1
      	pmulld	%xmm1, %xmm0
      	ret
      
      Instead of:
      
      _shl:                                   ## @shl
      	pshufd	$3, %xmm0, %xmm2
      	movd	%xmm2, %eax
      	pshufd	$3, %xmm1, %xmm2
      	movd	%xmm2, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm2
      	pshufd	$1, %xmm0, %xmm3
      	movd	%xmm3, %eax
      	pshufd	$1, %xmm1, %xmm3
      	movd	%xmm3, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm3
      	punpckldq	%xmm2, %xmm3
      	movd	%xmm0, %eax
      	movd	%xmm1, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm2
      	movhlps	%xmm0, %xmm0
      	movd	%xmm0, %eax
      	movhlps	%xmm1, %xmm1
      	movd	%xmm1, %ecx
      	shll	%cl, %eax
      	movd	%eax, %xmm0
      	punpckldq	%xmm0, %xmm2
      	movdqa	%xmm2, %xmm0
      	punpckldq	%xmm3, %xmm0
      	ret
      
      llvm-svn: 109549
      269a6da0
  10. Jul 26, 2010
  11. Jul 24, 2010
    • Evan Cheng's avatar
      Add an ILP scheduler. This is a register pressure aware scheduler that's · 37b740c4
      Evan Cheng authored
      appropriate for targets without detailed instruction iterineries.
      The scheduler schedules for increased instruction level parallelism in
      low register pressure situation; it schedules to reduce register pressure
      when the register pressure becomes high.
      
      On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
      by 16%.
      
      llvm-svn: 109300
      37b740c4
  12. Jul 22, 2010
  13. Jul 21, 2010
  14. Jul 15, 2010
  15. Jul 10, 2010
  16. Jul 09, 2010
    • Bob Wilson's avatar
      --- Reverse-merging r107947 into '.': · 6586e9b2
      Bob Wilson authored
      U    utils/TableGen/FastISelEmitter.cpp
      --- Reverse-merging r107943 into '.':
      U    test/CodeGen/X86/fast-isel.ll
      U    test/CodeGen/X86/fast-isel-loads.ll
      U    include/llvm/Target/TargetLowering.h
      U    include/llvm/Support/PassNameParser.h
      U    include/llvm/CodeGen/FunctionLoweringInfo.h
      U    include/llvm/CodeGen/CallingConvLower.h
      U    include/llvm/CodeGen/FastISel.h
      U    include/llvm/CodeGen/SelectionDAGISel.h
      U    lib/CodeGen/LLVMTargetMachine.cpp
      U    lib/CodeGen/CallingConvLower.cpp
      U    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
      U    lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
      U    lib/CodeGen/SelectionDAG/FastISel.cpp
      U    lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
      U    lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
      U    lib/CodeGen/SelectionDAG/InstrEmitter.cpp
      U    lib/CodeGen/SelectionDAG/TargetLowering.cpp
      U    lib/Target/XCore/XCoreISelLowering.cpp
      U    lib/Target/XCore/XCoreISelLowering.h
      U    lib/Target/X86/X86ISelLowering.cpp
      U    lib/Target/X86/X86FastISel.cpp
      U    lib/Target/X86/X86ISelLowering.h
      
      llvm-svn: 107987
      6586e9b2
    • Dan Gohman's avatar
      Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting · 0b5aa1cd
      Dan Gohman authored
      a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL.
      
      llvm-svn: 107943
      0b5aa1cd
  17. Jul 08, 2010
  18. Jul 07, 2010
  19. Jul 06, 2010
  20. Jun 25, 2010
  21. Jun 03, 2010
  22. May 21, 2010
  23. May 11, 2010
  24. May 01, 2010
  25. Apr 26, 2010
  26. Apr 25, 2010
  27. Apr 23, 2010
Loading