  1. Sep 01, 2010
    • licm is wasting time hoisting constant foldable operations, · 030f0202
      Chris Lattner authored
      instead of hoisting them, just fold them away.  This occurs in the
      testcase for PR8041, for example.
      
      llvm-svn: 112669
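As a rough illustration of the decision this commit describes, the C sketch below classifies a toy instruction: when every operand is a constant it is folded to a value, and only otherwise is it considered for hoisting. The `toy_add_t` type, the `action_t` enum, and the `classify` function are hypothetical names invented for this sketch; they are not LLVM's LICM API.

```c
#include <assert.h>
#include <stdbool.h>

/* What the toy pass decides to do with an instruction found in a loop. */
typedef enum { ACTION_FOLD, ACTION_HOIST, ACTION_KEEP } action_t;

/* Hypothetical stand-in for an integer add inside a loop body. */
typedef struct {
    bool lhs_is_const;     /* left operand is a compile-time constant  */
    bool rhs_is_const;     /* right operand is a compile-time constant */
    bool loop_invariant;   /* all operands are defined outside the loop */
    int lhs;
    int rhs;
} toy_add_t;

/* Prefer folding over hoisting: a fully constant add needs no instruction
   at all, so moving it to the preheader would be wasted work. */
action_t classify(const toy_add_t *inst, int *folded) {
    assert(inst && folded);
    if (inst->lhs_is_const && inst->rhs_is_const) {
        *folded = inst->lhs + inst->rhs;   /* written only when folding */
        return ACTION_FOLD;
    }
    if (inst->loop_invariant)
        return ACTION_HOIST;
    return ACTION_KEEP;
}
```

The ordering of the two checks is the point of the sketch: the constant test runs first, so a foldable instruction never reaches the hoisting path.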
    • We have a chance for an optimization. Consider this code: · 6789f8b6
      Bill Wendling authored
      int x(int t) {
        if (t & 256)
          return -26;
        return 0;
      }
      
      We generate this:
      
           tst.w   r0, #256
           mvn     r0, #25
           it      eq
           moveq   r0, #0
      
      while gcc generates this:
      
           ands    r0, r0, #256
           it      ne
           mvnne   r0, #25
           bx      lr
      
      Scandalous really!
      
      During ISel, we can look for this particular pattern: a "MOVCC" that uses the
      flag from a CMPZ that is itself comparing the result of an AND instruction
      to 0. Something like this (greatly simplified):
      
        %r0 = ISD::AND ...
        ARMISD::CMPZ %r0, 0         @ sets [CPSR]
        %r0 = ARMISD::MOVCC 0, -26  @ reads [CPSR]
      
      All we have to do is convert the "ISD::AND" into an "ARM::ANDS", which sets
      the Z flag in [CPSR] when its result is zero. The zero value will already be in
      the %r0 register, and we only need to change it if the AND wasn't zero. Easy!
      
      llvm-svn: 112664
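To make the equivalence of the two instruction sequences concrete, here is a small C model of each, checked against the original source function. This is only a sketch: the function names are hypothetical, the register and the Z flag are modelled as plain C variables, and nothing here reflects how the ARM backend itself is written.

```c
#include <assert.h>
#include <stdint.h>

/* The source-level function from the commit message. */
int32_t reference(int32_t t) {
    return (t & 256) ? -26 : 0;
}

/* Model of the original output: tst.w / mvn / it eq / moveq. */
int32_t select_with_tst(int32_t r0) {
    int z = ((r0 & 256) == 0);  /* tst.w r0, #256: sets Z, r0 unchanged */
    r0 = ~25;                   /* mvn r0, #25: r0 = -26 */
    if (z)
        r0 = 0;                 /* it eq; moveq r0, #0 */
    return r0;
}

/* Model of the gcc output: ands / it ne / mvnne. */
int32_t select_with_ands(int32_t r0) {
    r0 &= 256;                  /* ands r0, r0, #256: result kept in r0 */
    int z = (r0 == 0);          /* ...and Z reflects that result */
    if (!z)
        r0 = ~25;               /* it ne; mvnne r0, #25 */
    return r0;
}

/* Assert that both modelled sequences match the source function. */
void check_sequences_agree(int32_t t) {
    assert(select_with_tst(t) == reference(t));
    assert(select_with_ands(t) == reference(t));
}
```

In the second model the zero result of the AND is already the value to return, which is why only the non-zero case needs a conditional move.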
    • Reapply r112623. Included additional check for unused byval argument. · 86ec8b3a
      Devang Patel authored
      llvm-svn: 112659
  2. Aug 31, 2010
  3. Aug 30, 2010
  4. Aug 29, 2010
  5. Aug 28, 2010
    • fixme accomplished · 112b6ee3
      Chris Lattner authored
      llvm-svn: 112386
    • fix the buildvector->insertp[sd] logic to not always create a redundant · 94656b1c
      Chris Lattner authored
      insertp[sd] $0, which is a noop.  Before:
      
      _f32:                                   ## @f32
      	pshufd	$1, %xmm1, %xmm2
      	pshufd	$1, %xmm0, %xmm3
      	addss	%xmm2, %xmm3
      	addss	%xmm1, %xmm0
                                              ## kill: XMM0<def> XMM0<kill> XMM0<def>
      	insertps	$0, %xmm0, %xmm0
      	insertps	$16, %xmm3, %xmm0
      	ret
      
      after:
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm3
      	addss	%xmm1, %xmm3
      	movdqa	%xmm2, %xmm0
      	insertps	$16, %xmm3, %xmm0
      	ret
      
      The extra movs are due to a random (poor) scheduling decision.
      
      llvm-svn: 112379
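The reason `insertps $0, %xmm0, %xmm0` is a no-op follows from how the immediate is decoded: bits 7:6 select the source element, bits 5:4 select the destination element, and bits 3:0 are a zero mask. The C model below is a sketch for the register-to-register form only; `vec4f` and `insertps_model` are hypothetical names for this illustration.

```c
#include <assert.h>

/* Hypothetical model of a 128-bit XMM register holding four floats. */
typedef struct { float lane[4]; } vec4f;

/* Model of "insertps $imm8, src, dst" (AT&T operand order), register form. */
vec4f insertps_model(vec4f dst, vec4f src, unsigned imm8) {
    assert(imm8 <= 255u);
    unsigned src_sel  = (imm8 >> 6) & 3u;  /* bits 7:6: source element      */
    unsigned dst_sel  = (imm8 >> 4) & 3u;  /* bits 5:4: destination element */
    unsigned zeromask = imm8 & 15u;        /* bits 3:0: lanes to clear      */

    dst.lane[dst_sel] = src.lane[src_sel];
    for (unsigned i = 0; i < 4; i++) {
        if (zeromask & (1u << i))
            dst.lane[i] = 0.0f;
    }
    return dst;
}
```

With an immediate of 0 and the same register as source and destination, element 0 is copied onto itself and no lane is cleared, so the register is unchanged. With an immediate of 16 (0x10), element 0 of the source lands in element 1 of the destination, which is the insert the "after" listing keeps.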
    • fix the BuildVector -> unpcklps logic to not do pointless shuffles · bcb6090a
      Chris Lattner authored
      when the top elements of a vector are undefined.  This happens all
      the time for X86-64 ABI stuff because only the low 2 elements of
      a 4 element vector are defined.  For example, on:
      
      _Complex float f32(_Complex float A, _Complex float B) {
        return A+B;
      }
      
      We used to produce (with SSE2; SSE4.1+ uses insertps):
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$16, %xmm2, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm0
      	addss	%xmm1, %xmm0
      	pshufd	$16, %xmm0, %xmm1
      	movdqa	%xmm2, %xmm0
      	unpcklps	%xmm1, %xmm0
      	ret
      
      We now produce:
      
      _f32:                                   ## @f32
      	movdqa	%xmm0, %xmm2
      	addss	%xmm1, %xmm2
      	pshufd	$1, %xmm1, %xmm1
      	pshufd	$1, %xmm0, %xmm3
      	addss	%xmm1, %xmm3
      	movaps	%xmm2, %xmm0
      	unpcklps	%xmm3, %xmm0
      	ret
      
      This implements rdar://8368414
      
      llvm-svn: 112378
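The point about undefined top elements can be illustrated in C: a `_Complex float` occupies only two of the four float lanes of a vector register, so the result depends on lanes 0 and 1 alone. The sketch below repeats the `f32` function from the commit message next to a lane-wise model; `vec4f`, `f32_lanes`, and `check_f32_agrees` are hypothetical names, and the model fills the upper lanes with zero purely for determinism.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical model of a 4 x float vector register. */
typedef struct { float lane[4]; } vec4f;

/* The function from the commit message. */
float _Complex f32(float _Complex A, float _Complex B) {
    return A + B;
}

/* Lane-wise model: only lanes 0 (real) and 1 (imaginary) carry data. */
vec4f f32_lanes(vec4f a, vec4f b) {
    vec4f r = {{0.0f, 0.0f, 0.0f, 0.0f}};
    r.lane[0] = a.lane[0] + b.lane[0];  /* addss on the real parts      */
    r.lane[1] = a.lane[1] + b.lane[1];  /* addss on the imaginary parts */
    /* Lanes 2 and 3 are unspecified in the real code; zero is arbitrary. */
    return r;
}

/* Assert that the lane model matches the complex addition. */
void check_f32_agrees(float ar, float ai, float br, float bi) {
    float _Complex sum = f32(ar + ai * I, br + bi * I);
    vec4f a = {{ar, ai, 0.0f, 0.0f}};
    vec4f b = {{br, bi, 0.0f, 0.0f}};
    vec4f r = f32_lanes(a, b);
    assert(crealf(sum) == r.lane[0]);
    assert(cimagf(sum) == r.lane[1]);
}
```

Because nothing reads lanes 2 and 3, any shuffle whose only effect is to populate them is wasted work, which is what the removed `pshufd $16` instructions were doing.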
    • Update ocaml test. · 2e5c1471
      Benjamin Kramer authored
      llvm-svn: 112364
    • remove unions from LLVM IR. They are severely buggy and not · 13ee795c
      Chris Lattner authored
      being actively maintained, improved, or extended.
      
      llvm-svn: 112356
    • remove the ABCD and SSI passes. They don't have any clients that · 504e5100
      Chris Lattner authored
      I'm aware of, aren't maintained, and LVI will be replacing their value.
      nlewycky approved this on irc.
      
      llvm-svn: 112355
    • handle the constant case of vector insertion. For something · d0214f3e
      Chris Lattner authored
      like this:
      
      struct S { float A, B, C, D; };
      
      struct S g;
      struct S bar() { 
        struct S A = g;
        ++A.B;
        A.A = 42;
        return A;
      }
      
      we now generate:
      
      _bar:                                   ## @bar
      ## BB#0:                                ## %entry
      	movq	_g@GOTPCREL(%rip), %rax
      	movss	12(%rax), %xmm0
      	pshufd	$16, %xmm0, %xmm0
      	movss	4(%rax), %xmm2
      	movss	8(%rax), %xmm1
      	pshufd	$16, %xmm1, %xmm1
      	unpcklps	%xmm0, %xmm1
      	addss	LCPI1_0(%rip), %xmm2
      	pshufd	$16, %xmm2, %xmm2
      	movss	LCPI1_1(%rip), %xmm0
      	pshufd	$16, %xmm0, %xmm0
      	unpcklps	%xmm2, %xmm0
      	ret
      
      instead of:
      
      _bar:                                   ## @bar
      ## BB#0:                                ## %entry
      	movq	_g@GOTPCREL(%rip), %rax
      	movss	12(%rax), %xmm0
      	pshufd	$16, %xmm0, %xmm0
      	movss	4(%rax), %xmm2
      	movss	8(%rax), %xmm1
      	pshufd	$16, %xmm1, %xmm1
      	unpcklps	%xmm0, %xmm1
      	addss	LCPI1_0(%rip), %xmm2
      	movd	%xmm2, %eax
      	shlq	$32, %rax
      	addq	$1109917696, %rax       ## imm = 0x42280000
      	movd	%rax, %xmm0
      	ret
      
      llvm-svn: 112345
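The immediate in the older listing, 1109917696 (0x42280000), is the bit pattern of the constant 42.0f assigned to `A.A`, and the `movd` / `shlq` / `addq` sequence packs it together with the incremented `A.B` in an integer register. The C sketch below reproduces that arithmetic; it assumes `float` is a 32-bit IEEE-754 value, and `float_bits` and `pack_pair` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bits of a float (assumes 32-bit IEEE-754 float). */
uint32_t float_bits(float f) {
    uint32_t u;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Model of the older sequence: low 32 bits hold `low`, high 32 bits `high`. */
uint64_t pack_pair(float low, float high) {
    uint64_t rax = float_bits(high);  /* movd %xmm2, %eax       */
    rax <<= 32;                       /* shlq $32, %rax         */
    rax += float_bits(low);           /* addq $0x42280000, %rax */
    return rax;
}
```

For example, `float_bits(42.0f)` yields 0x42280000, matching the `## imm = 0x42280000` comment in the listing. The newer sequence avoids this round trip through a general-purpose register by loading the constant with `movss` and combining the lanes with `unpcklps`.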