  1. Mar 25, 2010
    • Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings. · 49e121d5
      Jakob Stoklund Olesen authored
      On Nehalem and newer CPUs there is a 2-cycle latency penalty for using a register
      in a different domain than the one where it was defined. Some instructions have
      equivalents for different domains, like por/orps/orpd.
      
      The SSEDomainFix pass tries to minimize the number of domain crossings by
      switching between equivalent opcodes where possible.
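
      The opcode substitution described above can be pictured as a table lookup.
      The following is only a sketch, not the code from this commit: the Domain
      numbering, the kEquivalents table, and equivalentIn() are hypothetical names
      made up for illustration, and the and/xor rows are added examples beyond the
      por/orps/orpd group mentioned in the message.

      ```cpp
      #include <cassert>
      #include <string_view>

      // Hypothetical domain numbering, chosen so it can index a table column.
      enum Domain : unsigned { PackedSingle = 0, PackedDouble = 1, PackedInt = 2 };

      // Each row lists mnemonics that compute the same bitwise result,
      // ordered as {PackedSingle, PackedDouble, PackedInt}.
      constexpr std::string_view kEquivalents[][3] = {
          {"orps", "orpd", "por"},
          {"andps", "andpd", "pand"},
          {"xorps", "xorpd", "pxor"},
      };

      // Return the mnemonic equivalent to `opcode` in domain `want`,
      // or `opcode` unchanged when the table has no equivalent for it.
      inline std::string_view equivalentIn(std::string_view opcode, Domain want) {
        assert(want <= PackedInt && "domain must index a table column");
        for (const auto &row : kEquivalents)
          for (std::string_view candidate : row)
            if (candidate == opcode)
              return row[want];
        return opcode;
      }
      ```

      A pass built on such a table would still have to decide which domain each
      register lives in; the lookup only answers "what is the same operation in the
      other domain".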
      
      This is a work in progress; in particular, the pass doesn't do anything yet. SSE
      instructions are tagged with their execution domain in TableGen using the last
      two bits of TSFlags. Note that not all instructions are tagged correctly. Life
      just isn't that simple.
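
      Packing a small tag into a flags word, as described above, might look roughly
      like the sketch below. Everything here is an assumption for illustration: the
      32-bit width, the shift of 30, the Domain values, and the helper names
      withDomain()/domainOf() are not taken from the commit or from the real TSFlags
      layout.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Hypothetical domain encoding; two bits are enough for four values.
      enum class Domain : std::uint32_t {
        Generic = 0,
        PackedSingle = 1,
        PackedDouble = 2,
        PackedInt = 3
      };

      // Assumed layout: the tag occupies the top two bits of a 32-bit word.
      constexpr unsigned kDomainShift = 30;
      constexpr std::uint32_t kDomainMask = 0x3;

      // Store `d` in the two tag bits, leaving all other flag bits untouched.
      inline std::uint32_t withDomain(std::uint32_t flags, Domain d) {
        const auto bits = static_cast<std::uint32_t>(d);
        assert(bits <= kDomainMask && "domain must fit in two bits");
        return (flags & ~(kDomainMask << kDomainShift)) | (bits << kDomainShift);
      }

      // Read the two tag bits back out of the flags word.
      inline Domain domainOf(std::uint32_t flags) {
        return static_cast<Domain>((flags >> kDomainShift) & kDomainMask);
      }
      ```

      With a value of 0 reserved for "no domain", untagged instructions need no
      special handling when the flags are read back.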
      
      The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
      issue handled by NEONMoveFixPass. This pass may become target independent to
      handle both.
      
      llvm-svn: 99524
    • Reapply Kevin's change 94440, now that Chris has fixed the limitation on opcode values fitting in one byte (svn r99494). · e543e7fc
      Bob Wilson authored
      
      llvm-svn: 99514
    • eliminate a bunch more parallels now that scheduling handles dead implicit results more aggressively. · 23bf99a9
      Chris Lattner authored
      More to come, I think this is now just a data entry problem.
      
      llvm-svn: 99486
    • Disable folding loads into tail call in 32-bit PIC mode. · b07a29ec
      Evan Cheng authored
      It can introduce illegal code like this:
              addl    $12, %esp
              popl    %esi
              popl    %edi
              popl    %ebx
              popl    %ebp
              jmpl    *__Block_deallocator-L1$pb(%esi)  # TAILCALL
      
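      The problem is that the global base register is assigned the GR32 register class, while TCRETURNmi needs the registers making up the addressing mode to be in the GR32_TC register class. In the code above, %esi is restored by popl before the jmpl uses it as the PIC base.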
      
      The *proper* fix is for X86DAGToDAGISel::getGlobalBaseReg() to return a copy of the machine function's global base register rather than the register itself. But that copy could then be coalesced into a more restrictive register class (GR32_TC), which can introduce additional copies and spills. For something as important as the PIC base, that is not worth it, especially since this is not an issue on 64-bit.
      
      llvm-svn: 99455
    • Speculatively revert r99440 to see if it fixes buildbot failures. · 5b2da69f
      Bob Wilson authored
      --- Reverse-merging r99440 into '.':
      U    test/MC/AsmParser/X86/x86_32-bit_cat.s
      U    test/MC/AsmParser/X86/x86_32-encoding.s
      U    include/llvm/IntrinsicsX86.td
      U    include/llvm/CodeGen/SelectionDAGNodes.h
      U    lib/Target/X86/X86InstrSSE.td
      U    lib/Target/X86/X86ISelLowering.h
      
      llvm-svn: 99450