  1. Aug 28, 2010
    • remove unions from LLVM IR. They are severely buggy and not
      being actively maintained, improved, or extended.
      Chris Lattner authored · 13ee795c
      llvm-svn: 112356
    • remove the ABCD and SSI passes. They don't have any clients that
      I'm aware of, aren't maintained, and LVI will be replacing their value.
      nlewycky approved this on irc.
      Chris Lattner authored · 504e5100
      llvm-svn: 112355
    • remove dead proto
      Chris Lattner authored · a5217a19
      llvm-svn: 112354
    • for completeness, allow undef also.
      Chris Lattner authored · 50df36ac
      llvm-svn: 112351
    • squish dead code.
      Chris Lattner authored · 95bb297c
      llvm-svn: 112350
    • zap dead code
      Chris Lattner authored · ca936ac9
      llvm-svn: 112349
    • Clean up the logic of vector shuffles -> vector shifts.
      Also teach this logic how to handle target-specific shuffles if
      needed; this is necessary while searching recursively for zeroed
      scalar elements in vector shuffle operands. (A short sketch of the
      shuffle-to-shift pattern follows this entry.)
      Bruno Cardoso Lopes authored · a982aa24
      llvm-svn: 112348
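      A minimal C sketch of the kind of shuffle this combine turns into a
      whole-vector shift (my own illustration, not from the commit; it relies
      on Clang's vector extensions, and the function name is made up):

      typedef int v4si __attribute__((vector_size(16)));

      /* Illustration only: a shuffle that slides x's lanes up by one and
         pulls a zero into the vacated lane behaves like a 128-bit left
         shift by 4 bytes (pslldq $4 on x86), which is the kind of
         shuffle -> shift pattern the combine looks for. */
      v4si slide_in_zero(v4si x) {
          v4si zero = {0, 0, 0, 0};
          /* Lane indices 4,0,1,2: index 4 selects zero[0]. */
          return __builtin_shufflevector(x, zero, 4, 0, 1, 2);
      }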
    • handle the constant case of vector insertion. For something
      like this:
      
      struct S { float A, B, C, D; };
      
      struct S g;
      struct S bar() { 
        struct S A = g;
        ++A.B;
        A.A = 42;
        return A;
      }
      
      we now generate:
      
      _bar:                                   ## @bar
      ## BB#0:                                ## %entry
      	movq	_g@GOTPCREL(%rip), %rax
      	movss	12(%rax), %xmm0
      	pshufd	$16, %xmm0, %xmm0
      	movss	4(%rax), %xmm2
      	movss	8(%rax), %xmm1
      	pshufd	$16, %xmm1, %xmm1
      	unpcklps	%xmm0, %xmm1
      	addss	LCPI1_0(%rip), %xmm2
      	pshufd	$16, %xmm2, %xmm2
      	movss	LCPI1_1(%rip), %xmm0
      	pshufd	$16, %xmm0, %xmm0
      	unpcklps	%xmm2, %xmm0
      	ret
      
      instead of:
      
      _bar:                                   ## @bar
      ## BB#0:                                ## %entry
      	movq	_g@GOTPCREL(%rip), %rax
      	movss	12(%rax), %xmm0
      	pshufd	$16, %xmm0, %xmm0
      	movss	4(%rax), %xmm2
      	movss	8(%rax), %xmm1
      	pshufd	$16, %xmm1, %xmm1
      	unpcklps	%xmm0, %xmm1
      	addss	LCPI1_0(%rip), %xmm2
      	movd	%xmm2, %eax
      	shlq	$32, %rax
      	addq	$1109917696, %rax       ## imm = 0x42280000
      	movd	%rax, %xmm0
      	ret
      
      Chris Lattner authored · d0214f3e
      llvm-svn: 112345
    • optimize bitcasts from large integers to vector into vector
      element insertion from the pieces that feed into the vector.
      This handles a pattern that occurs frequently due to code
      generated for the x86-64 abi.  We now compile something like
      this:
      
      struct S { float A, B, C, D; };
      struct S g;
      struct S bar() { 
        struct S A = g;
        ++A.A;
        ++A.C;
        return A;
      }
      
      into all nice vector operations:
      
      _bar:                                   ## @bar
      ## BB#0:                                ## %entry
      	movq	_g@GOTPCREL(%rip), %rax
      	movss	LCPI1_0(%rip), %xmm1
      	movss	(%rax), %xmm0
      	addss	%xmm1, %xmm0
      	pshufd	$16, %xmm0, %xmm0
      	movss	4(%rax), %xmm2
      	movss	12(%rax), %xmm3
      	pshufd	$16, %xmm2, %xmm2
      	unpcklps	%xmm2, %xmm0
      	addss	8(%rax), %xmm1
      	pshufd	$16, %xmm1, %xmm1
      	pshufd	$16, %xmm3, %xmm2
      	unpcklps	%xmm2, %xmm1
      	ret
      
      instead of icky integer operations:
      
      _bar:                                   ## @bar
      	movq	_g@GOTPCREL(%rip), %rax
      	movss	LCPI1_0(%rip), %xmm1
      	movss	(%rax), %xmm0
      	addss	%xmm1, %xmm0
      	movd	%xmm0, %ecx
      	movl	4(%rax), %edx
      	movl	12(%rax), %esi
      	shlq	$32, %rdx
      	addq	%rcx, %rdx
      	movd	%rdx, %xmm0
      	addss	8(%rax), %xmm1
      	movd	%xmm1, %eax
      	shlq	$32, %rsi
      	addq	%rax, %rsi
      	movd	%rsi, %xmm1
      	ret
      
      This resolves rdar://8360454
      
      Chris Lattner authored · dd660104
      llvm-svn: 112343
    • Completely disable tail calls when fast-isel is enabled, as fast-isel
      doesn't currently support dealing with this.
      Dan Gohman authored · e06905d1
      llvm-svn: 112341
    • Trim a #include.
      Dan Gohman authored · 1e06dbf8
      llvm-svn: 112340
    • Fix an index calculation thinko.
      Dan Gohman authored · fe22f1d3
      llvm-svn: 112337
    • We don't need to custom-select VLDMQ and VSTMQ anymore.
      Bob Wilson authored · 8ee93947
      llvm-svn: 112336
    • Update CMake build. Add newline at end of file.
      Benjamin Kramer authored · 83f9ff04
      llvm-svn: 112332
    • When merging Thumb2 loads/stores, do not give up when the offset is one of
      the special values that for ARM would be used with IB or DA modes.  Fall
      through and consider materializing a new base address if it would be
      profitable.
      Bob Wilson authored · ca5af129
      llvm-svn: 112329
    • Add a prototype of a new peephole optimizing pass that uses LazyValue info
      to simplify PHIs and selects.
      This pass addresses the missed optimizations from PR2581 and PR4420. (A
      hypothetical example of the kind of code it targets follows this entry.)
      Owen Anderson authored · cf7f9411
      llvm-svn: 112325
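      A hypothetical C example of the flavor of code such an LVI-driven
      peephole can clean up (my own sketch, not taken from PR2581 or PR4420;
      the function name is made up):

      /* After the branch, value-range information proves x <= 10 on every
         path reaching the comparison, so the conditional expression below
         folds to the constant 0. */
      int select_folds_to_zero(int x) {
          if (x > 10)
              x = 10;              /* on this path x == 10       */
                                   /* on the other path x <= 10  */
          return (x > 20) ? 1 : 0; /* x <= 10 everywhere, so always 0 */
      }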
    • Improve the precision of getConstant().
      Owen Anderson authored · 38f6b7fe
      llvm-svn: 112323
    • Change ARM VFP VLDM/VSTM instructions to use addressing mode #4, just like
      all the other LDM/STM instructions.  This fixes asm printer crashes when
      compiling with -O0.  I've changed one of the NEON tests (vst3.ll) to run
      with -O0 to check this in the future.
      
      Prior to this change VLDM/VSTM used addressing mode #5, but not really.
      The offset field was used to hold a count of the number of registers being
      loaded or stored, and the AM5 opcode field was expanded to specify the IA
      or DB mode, instead of the standard ADD/SUB specifier.  Much of the backend
      was not aware of these special cases.  The crashes occurred when rewriting
      a frameindex caused the AM5 offset field to be changed so that it did not
      have a valid submode.  I don't know exactly what changed to expose this now.
      Maybe we've never done much with -O0 and NEON.  Regardless, there's no longer
      any reason to keep a count of the VLDM/VSTM registers, so we can use
      addressing mode #4 and clean things up in a lot of places.
      
      Bob Wilson authored · 13ce07fa
      llvm-svn: 112322
    • Enhance the shift propagator to handle the case when you have:
      A = shl x, 42
      ...
      B = lshr ..., 38
      
      which can be transformed into:
      A = shl x, 4
      ...
      
      iff we can prove that the would-be-shifted-in bits
      are already zero.  This eliminates two shifts in the testcase
      and allows elimination of the whole i128 chain in the real example.
      (A C analogue of this transformation follows this entry.)
      Chris Lattner authored · 6c1395f6
      llvm-svn: 112314
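      A C analogue of the transformation (my own sketch; unsigned __int128 is
      a Clang/GCC extension standing in for the i128 chain, and the function
      names are made up):

      /* (x << 42) >> 38 keeps bits 0..85 of x and re-places them four
         positions higher; when the top 42 bits of x are known to be zero,
         that is exactly x << 4, so one of the two shifts disappears. */
      unsigned __int128 shift_pair(unsigned __int128 x)   { return (x << 42) >> 38; }
      unsigned __int128 single_shift(unsigned __int128 x) { return x << 4; }
      /* shift_pair(x) == single_shift(x) whenever
         x < ((unsigned __int128)1 << 86). */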
    • Simplify.
      Devang Patel authored · f2855b14
      llvm-svn: 112305
    • Implement a pretty general logical shift propagation
      framework, which is good at ripping through bitfield
      operations.  This generalizes a bunch of the existing
      xforms that instcombine does, such as
        (x << c) >> c -> and
      to handle intermediate logical nodes.  This is useful for
      ripping up the "promote to large integer" code produced by
      SRoA. (A small illustration of this xform follows this entry.)
      Chris Lattner authored · 18d7fc8f
      llvm-svn: 112304
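      A small C illustration of the existing (x << c) >> c -> and xform that
      the new framework generalizes (my own sketch; the function names are
      made up):

      #include <stdint.h>

      /* For unsigned (logical) shifts, shifting left and then right by the
         same amount just clears the high bits, so the shift pair can be
         replaced by a single mask. */
      uint32_t via_shifts(uint32_t x) { return (x << 24) >> 24; }
      uint32_t via_mask(uint32_t x)   { return x & 0xFFu; }
      /* via_shifts(x) == via_mask(x) for every x. */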
  2. Aug 27, 2010