Mar 18, 2009
• Disable the "call to immediate" optimization on x86-64. · a6bed3e9
  Chris Lattner authored

  It is not safe in general because the immediate could be an arbitrary
  value that does not fit in a 32-bit pc-relative displacement.
  Conservatively fall back to loading the value into a register and
  calling through it.

  We still do the optimization on X86-32.

  llvm-svn: 67142
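  A sketch of the kind of source that hits this path (a hypothetical
  example, not the commit's testcase): a call through a function pointer
  whose value is a compile-time constant address. On x86-64 a direct
  CALL carries only a 32-bit pc-relative displacement, so an arbitrary
  64-bit constant target must go through a register.

	/* Illustrative only: 0x1000 is an arbitrary address, not from
	   the commit. Calling it would fault; the point is the codegen. */
	typedef void (*handler_t)(void);

	void invoke_fixed_handler(void) {
	    ((handler_t)0x1000)();
	}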
Mar 17, 2009
• CellSPU: Revert inadvertent mis-fix of fneg. · df52d3d4
  Scott Michel authored

  llvm-svn: 67084
• Recognize bswapl as bswap too. · d6e571b2
  Dan Gohman authored

  llvm-svn: 67072
• 77a9279d
  Dan Gohman authored
• CellSPU: · 839ad0a5
  Scott Michel authored

  - Fix fabs and fneg for f32 and f64.
  - Use BuildVectorSDNode.isConstantSplat, now that the functionality
    exists.
  - Continue to improve i64 constant lowering. Lower certain special
    constants to the constant pool when they correspond to the special
    mask values of SPU's shufb instruction. This avoids the overhead of
    performing a shuffle on a zero-filled vector just to get the special
    constant when a memory load suffices.

  llvm-svn: 67067
Mar 13, 2009
• Fix FastISel's assumption that i1 values are always zero-extended · c0bb9595
  Dan Gohman authored

  by inserting explicit zero extensions where necessary. Included is a
  testcase where SelectionDAG produces a virtual register holding an i1
  value which FastISel previously mistakenly assumed to be
  zero-extended.

  llvm-svn: 66941
• Add 8 and 16 bit TLS moves. · 997b74ac
  Rafael Espindola authored

  Also add a FIXME note on how to remove code duplication.

  llvm-svn: 66932
• Improve sext and zext of TLS variables. · 71144973
  Rafael Espindola authored

  llvm-svn: 66922
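  The two TLS commits above concern access patterns like the following
  (an assumed shape using the GCC/Clang __thread extension, not the
  actual testcases): narrow thread-local loads and stores, plus sign-
  and zero-extending uses of them.

	__thread signed char    tls_byte;
	__thread unsigned short tls_half;

	/* 8-bit TLS store */
	void store_byte(signed char v) { tls_byte = v; }

	/* sign-extending 8-bit TLS load */
	int sext_byte(void) { return tls_byte; }

	/* zero-extending 16-bit TLS load */
	unsigned zext_half(void) { return tls_half; }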
• generalize this code so that FastISel handles integer truncates to i1 · 3fb71c8f
  Chris Lattner authored

  These codegen to the same thing as integer truncates to i8 (the top
  bits are just undefined). This implements rdar://6667338.

  llvm-svn: 66902
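  A minimal sketch of the pattern (my illustration, not the commit's
  testcase): extracting the low bit of an integer, which can lower to a
  trunc to i1 where only the low bit is meaningful and the remaining
  bits are don't-cares, just as with a trunc to i8.

	/* keep only the low bit; the bits above it are undefined
	   as far as the i1 consumer is concerned */
	_Bool low_bit(int x) {
	    return (_Bool)(x & 1);
	}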
• These instructions have special lowering that may lower them to SSE · 798fd56d
  Bill Wendling authored

  instructions. Prevent that if we don't want implicit uses of SSE.

  llvm-svn: 66877
• Fix some significant problems with constant pools. · 1fb8aedd
  Evan Cheng authored

  Fix some significant problems with constant pools that resulted in
  unnecessary padding between constant pool entries, larger than
  necessary alignments (e.g. 8-byte alignment for .literal4 sections),
  and potentially other issues.

  Problems:
  1. The ConstantPoolSDNode alignment field is the log2 value of the
     alignment requirement. This is not consistent with other SDNode
     variants.
  2. The MachineConstantPool alignment field is also a log2 value.
  3. However, some places create ConstantPoolSDNodes with plain
     alignment values rather than log2 values. This creates entries
     with artificially large alignments, e.g. 256 for SSE vector
     values.
  4. Constant pool entry offsets are computed when the entries are
     created. However, the asm printer groups them by section, so those
     offsets are no longer valid; the asm printer nevertheless uses
     them to determine the size of the padding between entries.
  5. The asm printer uses an expensive multimap data structure to track
     constant pool entries by section.
  6. The asm printer iterates over a SmallPtrSet when emitting constant
     pool entries. This is non-deterministic.

  Solutions:
  1. The ConstantPoolSDNode alignment field is changed to keep a
     non-log2 value.
  2. The MachineConstantPool alignment field is also changed to keep a
     non-log2 value.
  3. Functions that create ConstantPool nodes now pass in non-log2
     alignments.
  4. MachineConstantPoolEntry no longer keeps an offset field; it is
     replaced with an alignment field. Offsets are not computed when
     constant pool entries are created; they are computed on the fly
     in the asm printer and the JIT.
  5. The asm printer uses a cheaper data structure to group constant
     pool entries.
  6. The asm printer computes entry offsets after grouping is done.
  7. The JIT code is changed to compute entry offsets on the fly.

  llvm-svn: 66875
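  A toy illustration of the log2-vs-raw alignment mixup described in
  problem 3 (hypothetical code, not LLVM's actual API): if a field is
  documented to hold log2(alignment) but a caller stores the raw
  alignment, the effective alignment explodes.

	#include <stdio.h>

	/* the parameter is supposed to be log2(alignment) */
	static unsigned effective_alignment(unsigned log2_align) {
	    return 1u << log2_align;
	}

	int main(void) {
	    /* correct: log2(8) = 3 -> 8-byte alignment */
	    printf("%u\n", effective_alignment(3));
	    /* buggy: raw value 8 misread as log2 -> 256-byte alignment */
	    printf("%u\n", effective_alignment(8));
	    return 0;
	}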
• generalize the previous code to use the full generality of LEA · 99cc1337
  Chris Lattner authored

  for i32/i64 expressions (we could also do i16 on CPUs where i16 lea
  is fast, but I didn't add this). On the example, we now generate:

  _test:
  	movl	4(%esp), %eax
  	cmpl	$42, (%eax)
  	setl	%al
  	movzbl	%al, %eax
  	leal	4(%eax,%eax,8), %eax
  	ret

  instead of:

  _test:
  	movl	4(%esp), %eax
  	cmpl	$41, (%eax)
  	movl	$4, %ecx
  	movl	$13, %eax
  	cmovg	%ecx, %eax
  	ret

  llvm-svn: 66869
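  Reconstructing the likely source from the generated assembly (a
  guess, not the commit's actual testcase): a select between the
  constants 13 and 4, which the new code folds into a single LEA
  computing 9*cond + 4.

	int test(int *p) {
	    return *p < 42 ? 13 : 4;
	}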
• optimize the case of cond ? 42 : 41 and friends. This compiles the · 4be6df5d
  Chris Lattner authored

  example to:

  _test:
  	movl	4(%esp), %eax
  	cmpl	$41, (%eax)
  	setg	%al
  	movzbl	%al, %eax
  	orl	$4294967294, %eax
  	ret

  instead of:

  _test:
  	movl	4(%esp), %eax
  	cmpl	$41, (%eax)
  	movl	$4294967294, %ecx
  	movl	$4294967295, %eax
  	cmova	%ecx, %eax
  	ret

  which is smaller in code size and faster. rdar://6668608

  llvm-svn: 66868
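  The identity behind the shorter sequence, as I read the assembly
  (illustrative, not text from the commit): when the two constants
  differ only in the low bit, the select reduces to an OR of the
  condition bit with the even constant, here -2 (4294967294).

	/* cond is 0 or 1; returns -2 or -1 with no cmov */
	int select_via_or(int cond) {
	    return cond | -2;   /* 0 | -2 = -2, 1 | -2 = -1 */
	}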
• Enhance address-mode folding of ISD::ADD to handle cases where the · a1d92423
  Dan Gohman authored

  operands can't both be fully folded at the same time. For example, in
  the included testcase, a global variable's address is added to the
  sum of two values. The global variable wants RIP-relative addressing,
  so it can't share the address with another base register, but it's
  still possible to fold the initial add.

  llvm-svn: 66865
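  A plausible shape for such a testcase (my assumption, not the actual
  one): the address computed is global + (a + b), where the global
  needs RIP-relative addressing on x86-64.

	extern int table[];

	int load_indexed(long a, long b) {
	    /* the inner a + b can still fold into the address mode */
	    return table[a + b];
	}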
Mar 12, 2009
• Re-apply 66024 with fixes. · 2a332aa8
  Evan Cheng authored

  1. Fixed indirect call to immediate address assembly.
  2. Fixed JIT encoding by making the address pc-relative.

  llvm-svn: 66803
• Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))" · 4147f08e
  Chris Lattner authored

  related transformations out of target-independent dag combine into
  the ARM backend. These were added by Evan in r37685 with no testcases
  and only seem to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).

  Add some simple X86-specific (for now) DAG combines that turn things
  like cond ? 8 : 0 into (zext(cond) << 3). This happens frequently
  with the recently added cp constant select optimization, but is a
  very general xform. For example, we now compile the second example
  in const-select.ll to:

  _test:
          movsd   LCPI2_0, %xmm0
          ucomisd 8(%esp), %xmm0
          seta    %al
          movzbl  %al, %eax
          movl    4(%esp), %ecx
          movsbl  (%ecx,%eax,4), %eax
          ret

  instead of:

  _test:
          movl    4(%esp), %eax
          leal    4(%eax), %ecx
          movsd   LCPI2_0, %xmm0
          ucomisd 8(%esp), %xmm0
          cmovbe  %eax, %ecx
          movsbl  (%ecx), %eax
          ret

  This passes MultiSource and dejagnu.

  llvm-svn: 66779
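  The core of the new combine, as an arithmetic identity (a sketch,
  not LLVM code): a select between 0 and a power of two becomes a
  shift of the zero-extended condition.

	/* cond is 0 or 1: computes cond ? 8 : 0 with no branch or cmov */
	unsigned select_pow2(unsigned cond) {
	    return cond << 3;   /* zext(cond) << 3 */
	}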