  1. Mar 14, 2009
  2. Mar 13, 2009
    • Fix PR3784: If the source of a phi comes from a bb ended with an invoke, make... · 94419d6f
      Evan Cheng authored
      Fix PR3784: If the source of a phi comes from a bb ended with an invoke, make sure the copy is inserted before the try range (unless it's used as an input to the invoke, in which case insert it after the last use), not at the end of the bb.
      
      Also re-apply r66140 which was disabled as a workaround.
      
      llvm-svn: 66976
    • Fix FastISel's assumption that i1 values are always zero-extended · c0bb9595
      Dan Gohman authored
      by inserting explicit zero extensions where necessary. Included
      is a testcase where SelectionDAG produces a virtual register
      holding an i1 value which FastISel previously mistakenly assumed
      to be zero-extended.
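As a hedged illustration (not FastISel's actual code): the fix corresponds to masking the low bit of the wider register instead of trusting that the producer left the upper bits zero.

```c
#include <stdint.h>

/* Hypothetical sketch: an i1 value living in a wider virtual register
 * may carry garbage in its upper bits. An explicit zero extension
 * masks everything except bit 0, rather than assuming the producer
 * already zero-extended the value. */
uint32_t zext_i1(uint32_t reg_with_garbage) {
    return reg_with_garbage & 1u; /* explicit zext of the i1 */
}
```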
      
      llvm-svn: 66941
    • Fix some significant problems with constant pools that resulted in unnecessary... · 1fb8aedd
      Evan Cheng authored
      Fix some significant problems with constant pools that resulted in unnecessary padding between constant pool entries, larger-than-necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues.
      
      1. ConstantPoolSDNode's alignment field holds the log2 value of the alignment requirement. This is not consistent with other SDNode variants.
      2. MachineConstantPool's alignment field is also a log2 value.
      3. However, some places create ConstantPoolSDNode with the raw alignment value rather than its log2. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
      4. Constant pool entry offsets are computed when the entries are created. However, the asm printer groups them by section, so the offsets are no longer valid; nevertheless, the asm printer uses them to determine the size of the padding between entries.
      5. The asm printer uses an expensive multimap data structure to track constant pool entries by section.
      6. The asm printer iterates over a SmallPtrSet when emitting constant pool entries. This is non-deterministic.
      
      
      Solutions:
      1. ConstantPoolSDNode's alignment field is changed to keep the non-log2 value.
      2. MachineConstantPool's alignment field is also changed to keep the non-log2 value.
      3. Functions that create ConstantPool nodes now pass in non-log2 alignments.
      4. MachineConstantPoolEntry no longer keeps an offset field; it's replaced with an alignment field. Offsets are no longer computed when constant pool entries are created; they are computed on the fly in the asm printer and JIT.
      5. The asm printer uses a cheaper data structure to group constant pool entries.
      6. The asm printer computes entry offsets after grouping is done.
      7. The JIT code is changed to compute entry offsets on the fly.
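The on-the-fly offset computation in solutions 4, 6, and 7 boils down to rounding a running offset up to each entry's (non-log2) alignment as it is emitted. A minimal sketch in C, with hypothetical helper names rather than LLVM's actual code:

```c
#include <stddef.h>

/* Round offset up to the next multiple of alignment
 * (alignment must be a power of two). */
static size_t align_up(size_t offset, size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Compute each constant pool entry's offset on the fly, after
 * grouping, from its size and raw alignment; returns total size. */
size_t layout_entries(const size_t *sizes, const size_t *aligns,
                      int count, size_t *offsets) {
    size_t offset = 0;
    for (int i = 0; i < count; ++i) {
        offset = align_up(offset, aligns[i]); /* pad to entry alignment */
        offsets[i] = offset;
        offset += sizes[i];
    }
    return offset;
}
```

Passing a raw alignment where a log2 value is expected (problem 3 above) would inflate the padding enormously, which is exactly the bug being fixed.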
      
      llvm-svn: 66875
    • Convert VirtRegMap to a MachineFunctionPass. · d37ddf5b
      Owen Anderson authored
      llvm-svn: 66870
    • Oops...I committed too much. · fa54bc20
      Bill Wendling authored
      llvm-svn: 66867
    • Temporarily XFAIL this test. · b02eadf6
      Bill Wendling authored
      llvm-svn: 66866
    • Fix a typo in a comment. · a19c662a
      Dan Gohman authored
      llvm-svn: 66843
  3. Mar 12, 2009
    • Reorganize some #include's. · 36a99378
      Owen Anderson authored
      llvm-svn: 66780
    • Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))" · 4147f08e
      Chris Lattner authored
      related transformations out of target-specific dag combine into the
      ARM backend.  These were added by Evan in r37685 with no testcases
      and only seem to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).
      
      Add some simple X86-specific (for now) DAG combines that turn things
      like cond ? 8 : 0  -> (zext(cond) << 3).  This happens frequently
      with the recently added cp constant select optimization, but is a
      very general xform.  For example, we now compile the second example
      in const-select.ll to:
      
      _test:
              movsd   LCPI2_0, %xmm0
              ucomisd 8(%esp), %xmm0
              seta    %al
              movzbl  %al, %eax
              movl    4(%esp), %ecx
              movsbl  (%ecx,%eax,4), %eax
              ret
      
      instead of:
      
      _test:
              movl    4(%esp), %eax
              leal    4(%eax), %ecx
              movsd   LCPI2_0, %xmm0
              ucomisd 8(%esp), %xmm0
              cmovbe  %eax, %ecx
              movsbl  (%ecx), %eax
              ret
      
      This passes multisource and dejagnu.
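The equivalence behind the new combine can be checked in plain C, where the zext is the implicit bool-to-int conversion (an illustration of the transform, not the DAG combine code itself):

```c
/* cond ? 8 : 0 is equivalent to zext(cond != 0) << 3: the branch or
 * cmov disappears into a shift of the zero-extended condition flag. */
int select_form(int cond) { return cond ? 8 : 0; }
int shift_form(int cond)  { return (int)(cond != 0) << 3; }
```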
      
      llvm-svn: 66779
    • Enable Chris' value propagation change. It makes available known sign, zero,... · 44659546
      Evan Cheng authored
      Enable Chris' value propagation change. It makes known sign, zero, and one bit information available for values that are live out of basic blocks. The goal is to eliminate unnecessary sext, zext, and truncate instructions on values that are live-in to blocks. This does not handle PHI nodes yet.
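As a hedged illustration of the known-bits idea (not LLVM's implementation): once one block computes `x & 0xFF`, the upper bits of the result are known zero, so a zero extension from 8 bits in a consuming block is redundant and can be removed.

```c
#include <stdint.h>

/* The producing block computes v = x & 0xFF, so bits 8..31 of v are
 * known zero; the consuming block's zext-from-8-bits is then a no-op. */
uint32_t producer(uint32_t x)       { return x & 0xFFu; }
uint32_t redundant_zext(uint32_t v) { return v & 0xFFu; /* zext i8 -> i32 */ }
```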
      
      llvm-svn: 66777
  4. Mar 11, 2009
  5. Mar 10, 2009
  6. Mar 09, 2009
  7. Mar 08, 2009
    • Evan Cheng · de22116f
    • implement an optimization to codegen c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4. · ab5a4431
      Chris Lattner authored
      For 2009-03-07-FPConstSelect.ll we now produce:
      
      _f:
      	xorl	%eax, %eax
      	testl	%edi, %edi
      	movl	$4, %ecx
      	cmovne	%rax, %rcx
      	leaq	LCPI1_0(%rip), %rax
      	movss	(%rcx,%rax), %xmm0
      	ret
      
      previously we produced:
      
      _f:
      	subl	$4, %esp
      	cmpl	$0, 8(%esp)
      	movss	LCPI1_0, %xmm0
      	je	LBB1_2	## entry
      LBB1_1:	## entry
      	movss	LCPI1_1, %xmm0
      LBB1_2:	## entry
      	movss	%xmm0, (%esp)
      	flds	(%esp)
      	addl	$4, %esp
      	ret
      
      on PPC the code also improves to:
      
      _f:
      	cntlzw r2, r3
      	srwi r2, r2, 5
      	li r3, lo16(LCPI1_0)
      	slwi r2, r2, 2
      	addis r3, r3, ha16(LCPI1_0)
      	lfsx f1, r3, r2
      	blr 
      
      from:
      
      _f:
      	li r2, lo16(LCPI1_1)
      	cmplwi cr0, r3, 0
      	addis r2, r2, ha16(LCPI1_1)
      	beq cr0, LBB1_2	; entry
      LBB1_1:	; entry
      	li r2, lo16(LCPI1_0)
      	addis r2, r2, ha16(LCPI1_0)
      LBB1_2:	; entry
      	lfs f1, 0(r2)
      	blr 
      
      This also improves the existing pic-cpool case from:
      
      foo:
      	subl	$12, %esp
      	call	.Lllvm$1.$piclabel
      .Lllvm$1.$piclabel:
      	popl	%eax
      	addl	$_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
      	cmpl	$0, 16(%esp)
      	movsd	.LCPI1_0@GOTOFF(%eax), %xmm0
      	je	.LBB1_2	# entry
      .LBB1_1:	# entry
      	movsd	.LCPI1_1@GOTOFF(%eax), %xmm0
      .LBB1_2:	# entry
      	movsd	%xmm0, (%esp)
      	fldl	(%esp)
      	addl	$12, %esp
      	ret
      
      to:
      
      foo:
      	call	.Lllvm$1.$piclabel
      .Lllvm$1.$piclabel:
      	popl	%eax
      	addl	$_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
      	xorl	%ecx, %ecx
      	cmpl	$0, 4(%esp)
      	movl	$8, %edx
      	cmovne	%ecx, %edx
      	fldl	.LCPI1_0@GOTOFF(%eax,%edx)
      	ret
      
      This triggers a few dozen times in spec FP 2000.
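In C terms, the optimization replaces the branch with an indexed load from a two-entry constant table, with the false value at offset 0 and the true value at offset 4 (a sketch of the idea, not the codegen itself):

```c
/* c ? 1.0f : 2.0f becomes a load from { 2.0f, 1.0f } at offset c*4:
 * index 0 (condition false) holds 2.0f, index 1 (true) holds 1.0f. */
static const float table[2] = { 2.0f, 1.0f };

float select_branchy(int c)  { return c ? 1.0f : 2.0f; }
float select_tablized(int c) { return table[c != 0]; }
```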
      
      llvm-svn: 66358
    • random cleanups. · 21cf4bf2
      Chris Lattner authored
      llvm-svn: 66357
  8. Mar 07, 2009
    • Introduce new linkage types linkonce_odr, weak_odr, common_odr · 12da8ce3
      Duncan Sands authored
      and extern_weak_odr.  These are the same as the non-odr versions,
      except that they indicate that the global will only be overridden
      by an *equivalent* global.  In C, a function with weak linkage can
      be overridden by a function which behaves completely differently.
      This means that IP passes have to skip weak functions, since any
      deductions made from the function definition might be wrong: the
      definition could be replaced by something completely different
      at link time.  This is not allowed in C++, thanks to the ODR
      (One Definition Rule): if a function is replaced by another at
      link-time, then the new function must be the same as the original
      function.  If a language knows that a function or other global can
      only be overridden by an equivalent global, it can give it the
      weak_odr linkage type, and the optimizers will understand that it
      is alright to make deductions based on the function body.  The
      code generators on the other hand map weak and weak_odr linkage
      to the same thing.
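A hedged C-level illustration of the difference, using the GCC/Clang `weak` attribute rather than LLVM IR: with plain weak linkage the optimizer must treat the call as opaque, while weak_odr semantics would permit folding it.

```c
/* Under plain 'weak', this definition may be replaced at link time by
 * one that behaves completely differently, so interprocedural passes
 * must not fold calls based on its body. Under weak_odr semantics any
 * replacement is guaranteed equivalent, so folding would be safe.
 * (GCC/Clang extension on ELF targets; hypothetical example names.) */
__attribute__((weak)) int answer(void) { return 42; }

int caller(void) {
    /* With 'weak', this call stays opaque to IPO; with weak_odr
     * semantics it could legitimately be folded to 43. */
    return answer() + 1;
}
```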
      
      llvm-svn: 66339