    Add a late SSEDomainFix pass that twiddles SSE instructions to avoid domain crossings. · 49e121d5
    Jakob Stoklund Olesen authored
    On Nehalem and newer CPUs there is a 2-cycle latency penalty for using a
    register in a different execution domain than the one where it was defined.
    Some instructions have equivalents for different domains, like por/orps/orpd.
    
    The SSEDomainFix pass tries to minimize the number of domain crossings by
    switching between equivalent opcodes where possible.
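
    A rough sketch of the opcode swap, not the code in this commit. The Domain
    enum, kEquivTable and retargetOpcode are hypothetical names, opcodes are
    modelled as mnemonic strings rather than LLVM opcode numbers, and the and/xor
    rows are added for illustration next to the por/orps/orpd example above.

```cpp
#include <array>
#include <string>

// Hypothetical domain labels, used here only as column indices.
enum Domain { PackedSingle, PackedDouble, PackedInt };

// Each row lists one bitwise operation in its three domain flavours,
// ordered to match the Domain enum above.
using EquivRow = std::array<const char *, 3>;

static const EquivRow kEquivTable[] = {
    {"orps", "orpd", "por"},
    {"andps", "andpd", "pand"},
    {"xorps", "xorpd", "pxor"},
};

// Return the equivalent of `opcode` in the `wanted` domain, or the opcode
// unchanged when it has no known equivalents.
std::string retargetOpcode(const std::string &opcode, Domain wanted) {
  for (const EquivRow &row : kEquivTable)
    for (const char *name : row)
      if (opcode == name)
        return row[wanted];
  return opcode;
}
```

    With such a table, a por whose operands were produced by single-precision
    instructions could be rewritten as orps, while an opcode with no equivalents
    is left alone.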
    
    This is a work in progress; in particular, the pass doesn't do anything yet.
    SSE instructions are tagged with their execution domain in TableGen using the
    last two bits of TSFlags. Note that not all instructions are tagged correctly.
    Life just isn't that simple.
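
    For illustration only, a two-bit domain tag can be packed into a flags word
    as sketched below. The shift value, the 32-bit width, the tag numbering and
    the helper names are all assumptions made for this sketch, not the actual
    TSFlags layout.

```cpp
#include <cstdint>

// Hypothetical tag values; 0 stands for "no domain recorded".
enum DomainTag : unsigned { Untagged = 0, TagSingle = 1, TagDouble = 2, TagInt = 3 };

// Assumption: the tag occupies the top two bits of a 32-bit flags word.
constexpr unsigned kDomainShift = 30;
constexpr uint32_t kDomainMask = 0x3;

// Store a two-bit domain tag, leaving all other flag bits untouched.
constexpr uint32_t withDomain(uint32_t flags, unsigned domain) {
  return (flags & ~(kDomainMask << kDomainShift)) |
         ((domain & kDomainMask) << kDomainShift);
}

// Read the two-bit domain tag back out of the flags word.
constexpr unsigned domainOf(uint32_t flags) {
  return (flags >> kDomainShift) & kDomainMask;
}
```

    A pass can then read the tag with a shift and a mask, and treat the zero
    value as "domain unknown" for instructions that have not been tagged.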
    
    The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
    issue handled by NEONMoveFixPass. This pass may become target independent to
    handle both.
    
    llvm-svn: 99524