  1. Aug 30, 2010
  2. Aug 29, 2010
  3. Aug 28, 2010
    • I have manually decoded the imm field of an insertps one too many · 7a05e6dc
      Chris Lattner authored
      times.  This patch causes llc and llvm-mc (which both default to
      verbose-asm) to print out comments after a few common shuffle
      instructions that indicate the shuffle mask, e.g.:
      
      	insertps	$113, %xmm3, %xmm0     ## xmm0 = zero,xmm0[1,2],xmm3[1]
      	unpcklps	%xmm1, %xmm0    ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
      	pshufd	$1, %xmm1, %xmm1        ## xmm1 = xmm1[1,0,0,0]
      
      This is carefully factored to keep the information extraction (of the
      shuffle mask) separate from the printing logic.  I plan to move the
      extraction part out somewhere else at some point for other parts of
      the x86 backend that want to introspect on the behavior of shuffles.
      
      llvm-svn: 112387
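The split described above, mask extraction kept separate from printing, can be sketched roughly as follows. This is only an illustrative model, not the code from the patch: `decodeInsertPS`, `decodePSHUFD`, `decodeUNPCKLPS`, `formatMask`, and `kZeroLane` are hypothetical names, and the immediate layouts are the ones documented for the instructions (insertps: bits 7:6 source lane, bits 5:4 destination lane, bits 3:0 zero mask; pshufd: two bits per result lane).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sentinel meaning "this result lane is zeroed" (not an LLVM name).
constexpr int kZeroLane = -1;

// A mask entry indexes the concatenation of two 4-lane operands:
// 0..3 select lanes of operand 0, 4..7 select lanes of operand 1.

// Extraction only: insertps with operand 0 = destination, operand 1 = source.
std::vector<int> decodeInsertPS(unsigned imm) {
  assert(imm <= 0xFF && "insertps takes an 8-bit immediate");
  std::vector<int> mask = {0, 1, 2, 3};  // start from "destination unchanged"
  unsigned srcLane = (imm >> 6) & 3;
  unsigned dstLane = (imm >> 4) & 3;
  mask[dstLane] = 4 + static_cast<int>(srcLane);
  for (unsigned i = 0; i < 4; ++i)
    if (imm & (1u << i)) mask[i] = kZeroLane;  // zero mask is applied last
  return mask;
}

// Extraction only: pshufd reads a single source (operand 0).
std::vector<int> decodePSHUFD(unsigned imm) {
  assert(imm <= 0xFF && "pshufd takes an 8-bit immediate");
  std::vector<int> mask;
  for (unsigned i = 0; i < 4; ++i)
    mask.push_back(static_cast<int>((imm >> (2 * i)) & 3));
  return mask;
}

// Extraction only: unpcklps interleaves the low two lanes of both operands.
std::vector<int> decodeUNPCKLPS() { return {0, 4, 1, 5}; }

// Printing only: renders a mask in the comment style shown above, grouping
// consecutive lanes that come from the same operand.
std::string formatMask(const std::vector<int>& mask, const std::string& op0,
                       const std::string& op1) {
  std::string out;
  for (std::size_t i = 0; i < mask.size();) {
    if (!out.empty()) out += ',';
    if (mask[i] == kZeroLane) {
      out += "zero";
      ++i;
      continue;
    }
    bool fromOp1 = mask[i] >= 4;
    out += (fromOp1 ? op1 : op0) + "[";
    bool first = true;
    while (i < mask.size() && mask[i] != kZeroLane &&
           (mask[i] >= 4) == fromOp1) {
      if (!first) out += ',';
      out += std::to_string(mask[i] % 4);
      first = false;
      ++i;
    }
    out += ']';
  }
  return out;
}
```

With these helpers, `formatMask(decodeInsertPS(113), "xmm0", "xmm3")` reproduces the insertps comment from the commit message, since $113 is 0b01110001: source lane 1, destination lane 3, and lane 0 zeroed.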
    • fix the buildvector->insertp[sd] logic to not always create a redundant · 94656b1c
      Chris Lattner authored
      insertp[sd] $0, which is a noop.  Before:
      
      _f32:                                   ## @f32
      	pshufd	$1, %xmm1, %xmm2
      	pshufd	$1, %xmm0, %xmm3
      	addss	%xmm2, %xmm3
      	addss	%xmm1, %xmm0
                                              ## kill: XMM0<def> XMM0<kill> XMM0<def>
      	insertps	$0, %xmm0, %xmm0
      	insertps	$16, %xmm3, %xmm0
      	ret
      
      after:
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm3
      	addss	%xmm1, %xmm3
      	movdqa	%xmm2, %xmm0
      	insertps	$16, %xmm3, %xmm0
      	ret
      
      The extra movs are due to a random (poor) scheduling decision.
      
      llvm-svn: 112379
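To see why the `insertps $0, %xmm0, %xmm0` in the "before" listing is a no-op, here is a small software model of the register-to-register form of insertps. It is a sketch under the documented immediate layout (bits 7:6 source lane, bits 5:4 destination lane, bits 3:0 zero mask); `Vec4` and `insertps` are illustrative names, not LLVM code.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Models insertps with a register source: copy one source lane into one
// destination lane, then zero any lanes selected by the low four bits.
Vec4 insertps(Vec4 dst, const Vec4& src, unsigned imm) {
  assert(imm <= 0xFF && "insertps takes an 8-bit immediate");
  unsigned srcLane = (imm >> 6) & 3;
  unsigned dstLane = (imm >> 4) & 3;
  dst[dstLane] = src[srcLane];
  for (unsigned i = 0; i < 4; ++i)
    if (imm & (1u << i)) dst[i] = 0.0f;
  return dst;
}
```

With an immediate of 0 and the same register as both operands, lane 0 is overwritten with its own value and nothing is zeroed, so the result equals the input. An immediate of 16 copies source lane 0 into destination lane 1, which is the one insert the "after" listing still needs.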
    • fix the BuildVector -> unpcklps logic to not do pointless shuffles · bcb6090a
      Chris Lattner authored
      when the top elements of a vector are undefined.  This happens all
      the time for X86-64 ABI stuff because only the low 2 elements of
      a 4 element vector are defined.  For example, on:
      
      _Complex float f32(_Complex float A, _Complex float B) {
        return A+B;
      }
      
      We used to produce (with SSE2; SSE4.1+ uses insertps):
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$16, %xmm2, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm0
      	addss	%xmm1, %xmm0
      	pshufd	$16, %xmm0, %xmm1
      	movdqa	%xmm2, %xmm0
      	unpcklps	%xmm1, %xmm0
      	ret
      
      We now produce:
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm3
      	addss	%xmm1, %xmm3
      	movaps	%xmm2, %xmm0
      	unpcklps	%xmm3, %xmm0
      	ret
      
      This implements rdar://8368414
      
      llvm-svn: 112378
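The reason the two `pshufd $16` shuffles were pointless can be checked with a small software model. The sketch below assumes the documented lane semantics of pshufd and unpcklps; `Vec4`, `buildLowPairOld`, and `buildLowPairNew` are hypothetical names chosen for illustration. When only the low two lanes of the result are defined, both sequences agree on every lane that matters.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// Models pshufd: result lane i is the source lane named by bits 2i+1:2i.
Vec4 pshufd(const Vec4& src, unsigned imm) {
  assert(imm <= 0xFF && "pshufd takes an 8-bit immediate");
  Vec4 out{};
  for (unsigned i = 0; i < 4; ++i) out[i] = src[(imm >> (2 * i)) & 3];
  return out;
}

// Models unpcklps: interleave the low two lanes of both operands.
Vec4 unpcklps(const Vec4& dst, const Vec4& src) {
  return {dst[0], src[0], dst[1], src[1]};
}

// Old sequence: shuffle each scalar-in-lane-0 vector first, then interleave.
Vec4 buildLowPairOld(const Vec4& re, const Vec4& im) {
  return unpcklps(pshufd(re, 16), pshufd(im, 16));
}

// New sequence: interleave directly; the upper lanes are undefined anyway.
Vec4 buildLowPairNew(const Vec4& re, const Vec4& im) {
  return unpcklps(re, im);
}
```

Both functions leave `re[0]` in lane 0 and `im[0]` in lane 1; they differ only in lanes 2 and 3, which the caller never reads for a `_Complex float` return value.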
    • improve comments in the unpcklps generating logic, introduce · 96db6e66
      Chris Lattner authored
      a new EltStride variable instead of reusing NumElems variable
      for a non-obvious purpose.  No functionality change.
      
      llvm-svn: 112377
    • remove the MSIL backend. It isn't maintained, is buggy, has no testcases · bd244047
      Chris Lattner authored
      and hasn't kept up with ToT.  Approved by Anton.
      
      llvm-svn: 112375
    • Use pseudo instructions for VST1 and VST2. · 950882be
      Bob Wilson authored
      llvm-svn: 112357
    • remove unions from LLVM IR. They are severely buggy and not · 13ee795c
      Chris Lattner authored
      being actively maintained, improved, or extended.
      
      llvm-svn: 112356
    • Clean up the logic of vector shuffles -> vector shifts. · a982aa24
      Bruno Cardoso Lopes authored
      Also teach this logic how to handle target-specific shuffles if
      needed; this is necessary when searching recursively for zeroed
      scalar elements in vector shuffle operands.
      
      llvm-svn: 112348
    • We don't need to custom-select VLDMQ and VSTMQ anymore. · 8ee93947
      Bob Wilson authored
      llvm-svn: 112336
    • When merging Thumb2 loads/stores, do not give up when the offset is one of · ca5af129
      Bob Wilson authored
      the special values that for ARM would be used with IB or DA modes.  Fall
      through and consider materializing a new base address if it would be
      profitable.
      
      llvm-svn: 112329
    • Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like · 13ce07fa
      Bob Wilson authored
      all the other LDM/STM instructions.  This fixes asm printer crashes when
      compiling with -O0.  I've changed one of the NEON tests (vst3.ll) to run
      with -O0 to check this in the future.
      
      Prior to this change VLDM/VSTM used addressing mode #5, but not really.
      The offset field was used to hold a count of the number of registers being
      loaded or stored, and the AM5 opcode field was expanded to specify the IA
      or DB mode, instead of the standard ADD/SUB specifier.  Much of the backend
      was not aware of these special cases.  The crashes occurred when rewriting
      a frameindex caused the AM5 offset field to be changed so that it did not
      have a valid submode.  I don't know exactly what changed to expose this now.
      Maybe we've never done much with -O0 and NEON.  Regardless, there's no longer
      any reason to keep a count of the VLDM/VSTM registers, so we can use
      addressing mode #4 and clean things up in a lot of places.
      
      llvm-svn: 112322
  4. Aug 27, 2010
  5. Aug 26, 2010