  1. May 07, 2008
  2. May 06, 2008
  3. May 05, 2008
  4. May 02, 2008
  5. May 01, 2008
  6. Apr 30, 2008
    • Arnold Schwaighofer's avatar
      Tail call optimization improvements: · be0de34e
      Arnold Schwaighofer authored
      Move platform independent code (lowering of possibly overwritten
      arguments, check for tail call optimization eligibility) from
      target X86ISelectionLowering.cpp to TargetLowering.h and
      SelectionDAGISel.cpp.
      
      Initial PowerPC tail call implementation:
      
      ppc32 support is implemented and tested (passes my tests and
      test-suite llvm-test).
      ppc64 support is implemented and half tested (passes my tests).
      On ppc, tail call optimization is performed if:
       * caller and callee are fastcc
       * call is a tail call (in tail call position, call followed by ret)
       * there are no variable argument lists or byval arguments
       * option -tailcallopt is enabled
      Supported:
       * non pic tail calls on linux/darwin
       * module-local tail calls on linux(PIC/GOT)/darwin(PIC)
       * inter-module tail calls on darwin(PIC)
      If constraints are not met a normal call will be emitted.
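
The eligibility conditions listed in this commit message can be read as a single predicate. The following is a minimal sketch of that reading, not the code from the patch: `CallConv`, `CallInfo`, and `isEligibleForTailCall` are hypothetical names invented for illustration, and the platform and PIC restrictions from the "Supported" list are deliberately left out.

```cpp
// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class CallConv { C, Fast };

struct CallInfo {
  CallConv callerCC;    // calling convention of the calling function
  CallConv calleeCC;    // calling convention of the called function
  bool inTailPosition;  // the call is immediately followed by ret
  bool isVarArg;        // callee takes a variable argument list
  bool hasByValArg;     // at least one argument is passed byval
};

// Returns true only when every condition from the commit message holds.
// When it returns false, a normal call would be emitted instead.
constexpr bool isEligibleForTailCall(const CallInfo &c,
                                     bool tailCallOptEnabled) {
  return tailCallOptEnabled &&
         c.callerCC == CallConv::Fast && c.calleeCC == CallConv::Fast &&
         c.inTailPosition && !c.isVarArg && !c.hasByValArg;
}
```

Any single failing condition, such as a vararg callee or a caller that is not fastcc, makes the predicate false, which matches the fallback to a normal call described above.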
      
      A test checking the argument lowering behaviour on x86-64 was added.
      
      llvm-svn: 50477
      be0de34e
    • Dale Johannesen's avatar
      Add comments for previous patch as requested. · c110c4a5
      Dale Johannesen authored
      llvm-svn: 50463
      c110c4a5
    • Scott Michel's avatar
      Fix custom target lowering for zero/any/sign_extend: make sure that · be940424
      Scott Michel authored
      DAG.UpdateNodeOperands() is called before (not after) the call to
      TLI.LowerOperation().
      
      llvm-svn: 50461
      be940424
    • Dale Johannesen's avatar
      Make eh_frame objects be 8-byte aligned on 64-bit · fc3e3ad7
      Dale Johannesen authored
      targets.
      
      llvm-svn: 50451
      fc3e3ad7
  7. Apr 29, 2008
    • Roman Levenstein's avatar
      Use std::set instead of std::priority_queue for the RegReductionPriorityQueue. · 6b371145
      Roman Levenstein authored
      This removes the existing bottleneck related to the removal of elements from 
      the middle of the queue.
      
      Also fixes a subtle bug in ScheduleDAGRRList::CapturePred:
      It was updating the state of the SUnit before removing it. As a result, the
      comparison operators were working incorrectly and this SUnit could not be removed 
      from the queue properly.
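
Both points in this commit message can be illustrated with a small ordered-set queue. This is a sketch under assumptions rather than the scheduler's actual code: `Unit`, `ByPriority`, and `ReadyQueue` are hypothetical names, and the priority rule is a plain integer compare. The point it shows is that an element in a `std::set` must be erased before its sort key changes, because the set locates it using the comparator.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical scheduling unit: `priority` is the sort key, `id` breaks ties.
struct Unit {
  int id;
  int priority;
};

// Orders units by descending priority, then ascending id.
struct ByPriority {
  bool operator()(const Unit *a, const Unit *b) const {
    if (a->priority != b->priority)
      return a->priority > b->priority;
    return a->id < b->id;
  }
};

class ReadyQueue {
  std::set<Unit *, ByPriority> queue_;

public:
  void push(Unit *u) { queue_.insert(u); }
  bool empty() const { return queue_.empty(); }
  std::size_t size() const { return queue_.size(); }

  // Removes and returns the highest-priority unit. Precondition: !empty().
  Unit *pop() {
    assert(!queue_.empty());
    Unit *u = *queue_.begin();
    queue_.erase(queue_.begin());
    return u;
  }

  // Logarithmic-time removal from anywhere in the queue.
  bool remove(Unit *u) { return queue_.erase(u) == 1; }

  // Erase first, then mutate the key, then reinsert. Mutating the key
  // while the unit is still in the set would make the lookup unreliable.
  void updatePriority(Unit *u, int newPriority) {
    bool wasQueued = remove(u);
    u->priority = newPriority;
    if (wasQueued)
      queue_.insert(u);
  }
};
```

Changing `u->priority` before calling `remove(u)` would reproduce the kind of bug described for CapturePred: the comparator would search for the unit at its new position and could fail to find it.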
      
      Reviewed by Evan and Dan. Approved by Dan.
      
      llvm-svn: 50412
      6b371145
    • Chris Lattner's avatar
      make the vector conversion magic handle multiple results. · 5c88f7b1
      Chris Lattner authored
      We now compile test2/test3 to:
      
      _test2:
      	## InlineAsm Start
      	set %xmm0, %xmm1
      	## InlineAsm End
      	addps	%xmm1, %xmm0
      	ret
      _test3:
      	## InlineAsm Start
      	set %xmm0, %xmm1
      	## InlineAsm End
      	paddd	%xmm1, %xmm0
      	ret
      
      as expected.
      
      llvm-svn: 50389
      5c88f7b1
    • Chris Lattner's avatar
      add support for multiple return values in inline asm. This is a step · f9a49c43
      Chris Lattner authored
      towards PR2094.  It now compiles the attached .ll file to:
      
      _sad16_sse2:
      	movslq	%ecx, %rax
      	## InlineAsm Start
      	%ecx %rdx %rax %rax %r8d %rdx %rsi
      	## InlineAsm End
      	## InlineAsm Start
      	set %eax
      	## InlineAsm End
      	ret
      
      which is pretty decent for a 3 output, 4 input asm.
      
      llvm-svn: 50386
      f9a49c43
    • Evan Cheng's avatar
      Another extract_subreg coalescing bug. · 11b98b66
      Evan Cheng authored
      e.g.
      vr1024<2> extract_subreg vr1025, 2
      If vr1024 does not have the same register class as vr1025, it's not safe to coalesce this away. For example, vr1024 might be a GPR32 while vr1025 might be a GPR64.
      
      llvm-svn: 50385
      11b98b66
    • Evan Cheng's avatar
      Fix a bug in RegsForValue::getCopyToRegs() that causes cyclical scheduling... · b96782ec
      Evan Cheng authored
      Fix a bug in RegsForValue::getCopyToRegs() that causes cyclical scheduling units. If it's creating multiple CopyToReg nodes that are "flagged" together, it should not create a TokenFactor for its chain outputs:
      
      c1, f1 = CopyToReg
      c2, f2 = CopyToReg
      c3     = TokenFactor c1, c2
       ...
             = user c3, ..., f2
      
      Now that the two CopyToReg's and the user are "flagged" together, they effectively form a single scheduling unit. The TokenFactor is now both an operand and a successor of the flagged nodes.
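
The cycle described here can be reproduced on a toy graph. The sketch below is an illustration only and does not use SelectionDAG: `Node` and `hasTwoUnitCycle` are hypothetical, flagged nodes are modelled by giving them the same `unit` number, and the check only looks for the two-unit cycle from the example rather than cycles of arbitrary length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical DAG node. Nodes flagged together share one `unit` number;
// `operands` holds indices of the nodes this node uses.
struct Node {
  int unit;
  std::vector<std::size_t> operands;
};

// Returns true if two distinct scheduling units depend on each other,
// i.e. unit A uses a node of unit B and unit B uses a node of unit A.
bool hasTwoUnitCycle(const std::vector<Node> &nodes) {
  for (const Node &user : nodes) {
    for (std::size_t opIdx : user.operands) {
      const Node &def = nodes[opIdx];
      if (def.unit == user.unit)
        continue; // dependence inside one unit is fine
      // user.unit depends on def.unit; look for the reverse dependence.
      for (const Node &other : nodes) {
        if (other.unit != def.unit)
          continue;
        for (std::size_t o : other.operands)
          if (nodes[o].unit == user.unit)
            return true;
      }
    }
  }
  return false;
}
```

With nodes 0 and 1 as the two CopyToRegs and node 3 as the user (all unit 0), and node 2 as the TokenFactor (unit 1) using nodes 0 and 1 while being used by node 3, the function reports a cycle. Chaining the CopyToRegs directly, with no TokenFactor, leaves a single unit and no cycle.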
      
      llvm-svn: 50376
      b96782ec
  8. Apr 28, 2008
  9. Apr 27, 2008
    • Chris Lattner's avatar
      typo · 58b9ece3
      Chris Lattner authored
      llvm-svn: 50316
      58b9ece3
    • Chris Lattner's avatar
      Implement a significant optimization for inline asm: · 22379734
      Chris Lattner authored
      When choosing between constraints with multiple options,
      like "ir", test to see if we can use the 'i' constraint and
      go with that if possible.  This produces more optimal ASM in
      all cases (sparing a register and an instruction to load it),
      and fixes inline asm like this:
      
      void test () {
        asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14));
      }
      
      Previously we would dump "42" into a memory location (which
      is ok for the 'm' constraint) which would cause a problem
      because the 'c' modifier is not valid on memory operands.
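
The selection rule described above can be sketched as a small function. This is a simplified illustration and not the code from the patch: `Operand` and `chooseConstraint` are hypothetical names, and the fallback used when 'i' does not apply (take the first non-'i' alternative in the order written) is an assumption made only to keep the example complete.

```cpp
#include <cassert>
#include <string>

// Hypothetical operand description; not an LLVM type.
struct Operand {
  bool isConstant; // true if the operand is a compile-time constant
};

// Picks one letter from a multi-option constraint string such as "imr".
// Prefers 'i' when the operand is a constant and 'i' is offered.
// Otherwise returns the first non-'i' alternative (an assumed fallback).
// If only 'i' is offered for a non-constant operand, 'i' is returned and
// the caller is expected to reject it.
char chooseConstraint(const std::string &options, const Operand &op) {
  assert(!options.empty());
  if (op.isConstant && options.find('i') != std::string::npos)
    return 'i';
  for (char c : options)
    if (c != 'i')
      return c;
  return options.front();
}
```

For the `"imr" (42)` operand in the test function above, this picks 'i', so the constant stays an immediate and the 'c' modifier remains valid.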
      
      Isn't it great how inline asm turns 'missed optimization'
      into 'compile failed'??
      
      Incidentally, this was the todo in 
      PowerPC/2007-04-24-InlineAsm-I-Modifier.ll
      
      Please do NOT pull this into Tak.
      
      llvm-svn: 50315
      22379734
    • Chris Lattner's avatar
      isa+cast -> dyn_cast · a937baeb
      Chris Lattner authored
      llvm-svn: 50314
      a937baeb