    Optimized FCMP_OEQ and FCMP_UNE for x86. · 97d95d6d
    Dan Gohman authored
    Where previously LLVM might emit code like this:
    
            ucomisd %xmm1, %xmm0
            setne   %al
            setp    %cl
            orb     %al, %cl
            jne     .LBB4_2
    
    it now emits this:
    
            ucomisd %xmm1, %xmm0
            jne     .LBB4_2
            jp      .LBB4_2
    
    The new sequence has fewer instructions and uses fewer registers,
    but it does have more branches. If the block ends in a
    non-fallthrough edge, the two conditional branches may be followed
    by a jmp instruction, resulting in three branch instructions in
    sequence. Some effort is made to avoid this situation.
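
    As a sanity check on the claim that the two sequences branch under
    the same conditions, here is a minimal C++ sketch that models only
    the ZF and PF results of ucomisd. The names Flags, ucomisd,
    branchTakenOld, branchTakenNew and checkAll are hypothetical and
    exist only for this illustration; nothing here is LLVM code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model of the two flags that matter after ucomisd.
struct Flags { bool zf; bool pf; };

// ucomisd sets ZF=1, PF=1 when either operand is NaN (unordered),
// ZF=1, PF=0 when the operands are equal, and ZF=0, PF=0 otherwise.
Flags ucomisd(double a, double b) {
    if (std::isnan(a) || std::isnan(b)) return {true, true};
    return {a == b, false};
}

// Old sequence: setne %al ; setp %cl ; orb %al, %cl ; jne
bool branchTakenOld(double a, double b) {
    Flags f = ucomisd(a, b);
    unsigned char al = f.zf ? 0 : 1;  // setne
    unsigned char cl = f.pf ? 1 : 0;  // setp
    cl |= al;                         // orb
    return cl != 0;                   // jne
}

// New sequence: jne ; jp (both to the same label)
bool branchTakenNew(double a, double b) {
    Flags f = ucomisd(a, b);
    if (!f.zf) return true;  // jne
    if (f.pf) return true;   // jp
    return false;
}

// Compare both sequences against the C++ != operator, which for IEEE
// doubles is true when the operands are unordered or not equal.
void checkAll() {
    const double vals[] = {0.0, -0.0, 1.0, 2.0, std::nan("")};
    for (double a : vals) {
        for (double b : vals) {
            assert(branchTakenOld(a, b) == branchTakenNew(a, b));
            assert(branchTakenNew(a, b) == (a != b));
        }
    }
}
```

    In this model the branch is taken exactly when the operands are
    unordered or not equal, which is the FCMP_UNE condition. FCMP_OEQ
    is its negation.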
    
    To achieve this, X86ISelLowering.cpp now recognizes FCMP_OEQ and
    FCMP_UNE in lowered form, and replaces them with code that emits
    two branches, except in the case where it would require converting
    a fall-through edge to an explicit branch.
    
    Also, X86InstrInfo.cpp's branch analysis and transform code now
    knows how to handle blocks with multiple conditional branches. It
    uses loops instead of having fixed checks for up to two
    instructions. It can now analyze and transform code generated
    from FCMP_OEQ and FCMP_UNE.
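
    The loop-based idea can be pictured with a toy model. The sketch
    below is not the X86InstrInfo.cpp code: Kind, Inst, BranchInfo and
    analyzeTerminators are hypothetical names, and the rule that all
    conditional branches must share one target is an assumption made
    to keep the example small. It walks the block's terminators from
    the bottom in a loop, so any number of conditional branches can be
    collected.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified instruction model (not LLVM's MachineInstr).
enum class Kind { Other, CondBr, UncondBr };

struct Inst {
    Kind kind;
    std::string cond;    // condition code, e.g. "ne" or "p"; empty otherwise
    std::string target;  // branch target label; empty for non-branches
};

struct BranchInfo {
    bool analyzable = true;
    std::string uncondTarget;        // empty if the block falls through
    std::vector<Inst> condBranches;  // kept in program order
};

// Scan terminators from the end of the block with a loop rather than
// checking a fixed number of trailing instructions.
BranchInfo analyzeTerminators(const std::vector<Inst>& block) {
    BranchInfo info;
    for (auto it = block.rbegin(); it != block.rend(); ++it) {
        if (it->kind == Kind::Other) break;  // reached the non-terminators
        if (it->kind == Kind::UncondBr) {
            // Only a trailing unconditional branch is understood here.
            if (it != block.rbegin()) { info.analyzable = false; break; }
            info.uncondTarget = it->target;
            continue;
        }
        // Assumption of this sketch: all conditional branches share a target.
        if (!info.condBranches.empty() &&
            info.condBranches.front().target != it->target) {
            info.analyzable = false;
            break;
        }
        info.condBranches.insert(info.condBranches.begin(), *it);
    }
    return info;
}
```

    Applied to a block ending in jne, jp and jmp, this model reports
    two conditional branches to one label plus one unconditional
    target.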
    
    llvm-svn: 57873