  1. Dec 17, 2018
  2. Dec 16, 2018
  3. Dec 15, 2018
    • [X86] Begin cleaning up combineOr -> SHLD/SHRD. NFCI. · 52c98240
      Simon Pilgrim authored
      In preparation for converting to funnel shifts.
      
      llvm-svn: 349286
      52c98240
    • [X86] Lower to SHLD/SHRD on slow machines for optsize · ef7b5949
      Simon Pilgrim authored
      Use consistent rules for when to lower to SHLD/SHRD for slow machines - fixes a weird issue where funnel shift gets expanded but then X86ISelLowering's combineOr sees the optsize and combines to SHLD/SHRD, but now with the modulo amount guard......
      
      llvm-svn: 349285
      ef7b5949
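
      For reference, the funnel-shift semantics that SHLD/SHRD correspond to can be modelled as below. This is a minimal Python sketch, not LLVM code; the names `fshl` and `fshr` and the `width` parameter are illustrative assumptions. It shows where the modulo amount guard comes from: without it, an amount of 0 (modulo the width) would make the expansion shift one operand by the full width.

```python
def fshl(hi: int, lo: int, amt: int, width: int = 32) -> int:
    """Funnel shift left: top `width` bits of (hi:lo) << (amt mod width)."""
    mask = (1 << width) - 1
    amt %= width  # the modulo guard on the shift amount
    if amt == 0:
        return hi & mask  # avoid shifting `lo` right by the full width
    return ((hi << amt) | ((lo & mask) >> (width - amt))) & mask


def fshr(hi: int, lo: int, amt: int, width: int = 32) -> int:
    """Funnel shift right: low `width` bits of (hi:lo) >> (amt mod width)."""
    mask = (1 << width) - 1
    amt %= width  # the modulo guard on the shift amount
    if amt == 0:
        return lo & mask  # avoid shifting `hi` left by the full width
    return (((lo & mask) >> amt) | (hi << (width - amt))) & mask
```

      Passing the same value for both operands gives a rotate, for example `fshl(x, x, 4)`.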
    • [X86] Rename hasNoSignedComparisonUses to hasNoSignFlagUses. Add the... · 1fc257d9
      Craig Topper authored
      [X86] Rename hasNoSignedComparisonUses to hasNoSignFlagUses. Add the instructions that only use the O flag to the waiver list.
      
      The only caller of this turns CMP with 0 into TEST. CMP with 0 and TEST both set OF to 0, so we should have no issues with instructions that only use OF.
      
      Though I don't think there's any reason we would read just OF after a compare with 0 anyway. So this probably isn't an observable change.
      
      llvm-svn: 349223
      1fc257d9
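
      The claim that CMP with 0 and TEST agree on OF can be checked with a small model. The sketch below is illustrative Python, not LLVM code; the function names are hypothetical and only ZF, SF, CF and OF are modelled, for the plain `CMP x, 0` versus `TEST x, x` case.

```python
def flags_cmp_zero(x: int, width: int = 32) -> dict:
    """Flags after `CMP x, 0`, i.e. the subtraction x - 0."""
    mask = (1 << width) - 1
    result = ((x & mask) - 0) & mask
    return {
        "ZF": result == 0,
        "SF": bool(result >> (width - 1)),
        "CF": False,  # subtracting 0 never borrows
        "OF": False,  # subtracting 0 never overflows
    }


def flags_test_self(x: int, width: int = 32) -> dict:
    """Flags after `TEST x, x`, i.e. the bitwise AND x & x."""
    mask = (1 << width) - 1
    result = (x & x) & mask
    return {
        "ZF": result == 0,
        "SF": bool(result >> (width - 1)),
        "CF": False,  # logical operations clear CF
        "OF": False,  # logical operations clear OF
    }
```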
    • [X86] Make hasNoCarryFlagUses/hasNoSignedComparisonUses take an SDValue that... · 5c304eac
      Craig Topper authored
      [X86] Make hasNoCarryFlagUses/hasNoSignedComparisonUses take an SDValue that indicates which result is the flag result. NFCI
      
      hasNoCarryFlagUses hardcoded that the flag result is 1 and used that to filter which uses were of interest. hasNoSignedComparisonUses just assumes the only result is flags and checks whether any user of the node is a CopyToReg instruction.
      
      After this patch we now do a result number check in both and rely on the caller to provide the result number.
      
      This shouldn't change behavior; it was just an odd difference between the two functions that I noticed.
      
      llvm-svn: 349222
      5c304eac
  4. Dec 14, 2018
    • [DAGCombiner][X86] Prevent visitSIGN_EXTEND from returning N when (sext... · 257ce387
      Craig Topper authored
      [DAGCombiner][X86] Prevent visitSIGN_EXTEND from returning N when (sext (setcc)) already has the target desired type for the setcc
      
      Summary:
      If the setcc already has the target desired type we can reach the getSetCC/getSExtOrTrunc after the MatchingVecType check with the exact same types as the nodes we started with. This causes VsetCC to be CSEd to N0, and the getSExtOrTrunc will CSE to N. When we return N, the caller will think that meant we called CombineTo and did our own worklist management. But that's not what happened. This prevents target hooks from being called for the node.
      
      To fix this, I've now returned SDValue if the setcc is already the desired type. But to avoid some regressions in X86 I've had to disable one of the target combines that wasn't being reached before in the case of a (sext (setcc)). If we get vector widening legalization enabled that entire function will be deleted anyway so hopefully this is only for the short term.
      
      Reviewers: RKSimon, spatel
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D55459
      
      llvm-svn: 349137
      257ce387
    • [X86] Demote EmitTest to a helper function of EmitCmp. Route all callers... · 178abc59
      Craig Topper authored
      [X86] Demote EmitTest to a helper function of EmitCmp. Route all callers except EmitCmp through EmitCmp.
      
      This requires the two callers to manifest a 0 to make EmitCmp call EmitTest.
      
      I'm looking into changing how we combine TEST and flag setting instructions so that it is not part of lowering and is instead part of DAG combine or isel. That will mean EmitTest will probably be gutted and may disappear entirely.
      
      llvm-svn: 349094
      178abc59
  5. Dec 13, 2018
  6. Dec 12, 2018
    • [X86] Don't emit MULX by default with BMI2 · d1c61861
      Craig Topper authored
      MULX has somewhat improved register allocation constraints compared to the legacy MUL instruction. Both output registers are encoded instead of fixed to EAX/EDX, but EDX is used as input. It also doesn't touch flags. Unfortunately, the encoding is longer.
      
      Preferring it whenever BMI2 is enabled is probably not optimal. Choosing it should somehow be a function of register allocation constraints, like converting adds to three address. gcc and icc definitely don't pick MULX by default. Not sure what, if any, rules they have for using it.
      
      Differential Revision: https://reviews.llvm.org/D55565
      
      llvm-svn: 348975
      d1c61861
    • [X86] Emit SBB instead of SETCC_CARRY from LowerSELECT. Break false dependency on the SBB input. · 4937adf7
      Craig Topper authored
      I'm hoping we can just replace SETCC_CARRY with SBB. This is another step towards that.
      
      I've explicitly used zero as the input to the setcc to avoid a false dependency that we've had with the SETCC_CARRY. I changed one of the patterns that used NEG to instead use an explicit compare with 0 on the LHS. We needed the zero anyway to avoid the false dependency. The negate would clobber its input register. By using a CMP we can avoid that which could be useful.
      
      Differential Revision: https://reviews.llvm.org/D55414
      
      llvm-svn: 348959
      4937adf7
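
      The arithmetic behind this can be sketched as follows. This is an illustrative Python model, not the actual lowering; the helper names are hypothetical, and feeding SBB two zero inputs is a reading of the "explicit zero" described above. A compare with 0 on the LHS borrows exactly when the operand is nonzero, and SBB of zero with zero then yields 0 or all-ones.

```python
MASK32 = 0xFFFFFFFF


def sub_with_borrow(a: int, b: int, carry_in: int = 0):
    """Model of 32-bit SUB/SBB/CMP arithmetic: returns (result, carry_out)."""
    full = (a & MASK32) - (b & MASK32) - carry_in
    return full & MASK32, int(full < 0)


def mask_if_nonzero(x: int) -> int:
    """Return 0 if x == 0, else 0xFFFFFFFF, without modifying x."""
    # CMP 0, x computes 0 - x for its flags only; it borrows iff x != 0.
    _, carry = sub_with_borrow(0, x)
    # SBB zero, zero computes 0 - 0 - CF, i.e. 0 or all-ones.
    mask, _ = sub_with_borrow(0, 0, carry)
    return mask
```

      NEG would produce the same carry flag, since it also computes 0 - x, but it overwrites its operand with the result.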
    • [SelectionDAG] Add a generic isSplatValue function · eb508f8c
      Simon Pilgrim authored
      This patch introduces a generic function to determine whether a given vector is known to be a splat value for the specified demanded elements, recursing up the DAG looking for BUILD_VECTOR or VECTOR_SHUFFLE splat patterns.
      
      It also keeps track of the elements that are known to be UNDEF - it returns true if all the demanded elements are UNDEF (as this may be useful under some circumstances), so this needs to be handled by the caller.
      
      A wrapper variant is also provided that doesn't take the DemandedElts or UndefElts arguments for cases where we just want to know if the SDValue is a splat or not (with/without UNDEFS).
      
      I had hoped to completely remove the X86 local version of this function, but I'm seeing some regressions in shift/rotate codegen that will take a little longer to fix and I hope to get this in sooner so I can continue work on PR38243 which needs more capable splat detection.
      
      Differential Revision: https://reviews.llvm.org/D55426
      
      llvm-svn: 348953
      eb508f8c
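
      The contract described above (demanded elements in, undef elements out, and "all demanded elements UNDEF" still counting as a splat) can be illustrated with a toy model. This Python sketch is not the SelectionDAG implementation: it handles only a flat list of lanes, uses `None` for UNDEF, and the names `is_splat_value` and `is_splat` are hypothetical.

```python
from typing import List, Optional, Set, Tuple


def is_splat_value(elts: List[Optional[int]],
                   demanded: Set[int]) -> Tuple[bool, Set[int]]:
    """Check whether all demanded, defined lanes hold the same value.

    Returns (is_splat, undef_lanes), where undef_lanes are the demanded
    lanes that are UNDEF (None).
    """
    undef_lanes = {i for i in demanded if elts[i] is None}
    defined = {elts[i] for i in demanded if elts[i] is not None}
    # Zero distinct defined values means every demanded lane is UNDEF. That
    # still reports a splat, so the caller has to inspect undef_lanes.
    return len(defined) <= 1, undef_lanes


def is_splat(elts: List[Optional[int]], allow_undefs: bool = True) -> bool:
    """Wrapper that demands every lane and hides the undef bookkeeping."""
    ok, undef_lanes = is_splat_value(elts, set(range(len(elts))))
    return ok and (allow_undefs or not undef_lanes)
```

      For example, `[7, None, 7, 3]` is a splat when only lanes 0 to 2 are demanded, with lane 1 reported as undef.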
    • [x86] allow 8-bit adds to be promoted by convertToThreeAddress() to form LEA · 44eaa492
      Sanjay Patel authored
      This extends the code that handles 16-bit add promotion to form LEA to also allow 8-bit adds. 
      That allows us to combine add ops with register moves and save some instructions. This is 
      another step towards allowing add truncation in generic DAGCombiner (see D54640).
      
      Differential Revision: https://reviews.llvm.org/D55494
      
      llvm-svn: 348946
      44eaa492
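
      Promoting a narrow add to a wider LEA is safe because the low bits of a sum depend only on the low bits of the operands. The Python sketch below is an illustration of that arithmetic fact, not of convertToThreeAddress() itself; the function names and the "garbage upper bits" parameters are assumptions made for the example.

```python
def add8(a: int, b: int) -> int:
    """8-bit add: wraps modulo 2**8."""
    return (a + b) & 0xFF


def add8_via_wide_add(a: int, b: int, upper_a: int = 0, upper_b: int = 0) -> int:
    """Do the add at 32 bits, as a promoted LEA would, and keep the low byte.

    upper_a / upper_b stand for whatever happens to sit in the upper 24 bits
    of the wider registers; they must not influence the low byte.
    """
    wide_a = ((upper_a << 8) | (a & 0xFF)) & 0xFFFFFFFF
    wide_b = ((upper_b << 8) | (b & 0xFF)) & 0xFFFFFFFF
    wide_sum = (wide_a + wide_b) & 0xFFFFFFFF
    return wide_sum & 0xFF
```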
    • [X86] Combine vpmovdw+vpacksswb into vpmovdb. · 1fe46668
      Craig Topper authored
      This is similar to the combine we already have for vpmovdw+vpackuswb.
      
      llvm-svn: 348910
      1fe46668
  7. Dec 11, 2018
    • Fix incorrect imm operand assertion for SUB32ri in X86CondBrFolding::analyzeCompare · b51283bf
      Craig Topper authored
      Summary:
      When running X86CondBrFolding::analyzeCompare, it can encounter a SUB32ri instruction like the one below, which uses a global address as its operand,
        %733:gr32 = SUB32ri %62:gr32(tied-def 0), @img2buf_normal, implicit-def $eflags
        JNE_1 %bb.41, implicit $eflags
      
      so the assertion "assert(MI.getOperand(ValueIndex).isImm() && "Expecting Imm operand")" is not correct. Change the assert to an if so that X86CondBrFolding::analyzeCompare returns false, as no compare was found in this case.
      
      Patch by Jianping Chen
      
      Reviewers: smaslov, LuoYuanke, liutianle, Jianping
      
      Reviewed By: Jianping
      
      Subscribers: lebedev.ri, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D54250
      
      llvm-svn: 348853
      b51283bf
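
      The shape of the fix, turning an assertion about the operand kind into an early "not found" return, can be sketched as below. This is a loose Python analogy rather than the C++ in X86CondBrFolding; `GlobalAddress`, `Operand` and `analyze_compare` are hypothetical names, and `None` plays the role of the `false` return.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class GlobalAddress:
    name: str  # e.g. "@img2buf_normal"


Operand = Union[int, GlobalAddress]


def analyze_compare(opcode: str, value: Operand) -> Optional[int]:
    """Return the immediate being compared against, or None if not found."""
    if opcode != "SUB32ri":
        return None
    # Before the fix this was, in effect, an assertion that `value` is an
    # immediate. A global address is a legal operand here, so bail out instead.
    if not isinstance(value, int):
        return None
    return value
```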
    • [x86] clean up code for converting 16-bit ops to LEA; NFC · 05e36982
      Sanjay Patel authored
      As discussed in D55494, we want to extend this to handle 8-bit
      ops too, but that could be extended further to enable this on
      32-bit systems too.
      
      llvm-svn: 348851
      05e36982
    • [x86] remove dead code for 16-bit LEA formation; NFC · 9765ba5f
      Sanjay Patel authored
      As discussed in:
      D55494
      ...this code has been disabled/dead for a long time (the code references
      Athlon and Pentium 4), and there's almost no chance that it will be used 
      given the last decade of uarch evolution. Also, in SDAG we promote 16-bit 
      ops to 32-bit, so there's almost no way to test this code any more.
      
      llvm-svn: 348845
      9765ba5f
  8. Dec 10, 2018
  9. Dec 09, 2018
    • [X86] If the carry input to an addcarry/subborrow intrinsic is known to be 0,... · 2b09d17d
      Craig Topper authored
      [X86] If the carry input to an addcarry/subborrow intrinsic is known to be 0, emit a flag setting ADD/SUB instead of ADC/SBB.
      
      Previously we had to take the carry in and add -1 to it to set the carry flag so we could use it with ADC/SBB. But if we know it's 0 then we don't need to bother.
      
      This should go a long way towards fixing PR24545.
      
      llvm-svn: 348727
      2b09d17d
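
      The two lowerings can be compared with a small model. This Python sketch is illustrative only; the function names are hypothetical and the carry-in is treated as an 8-bit value. Adding -1 (0xFF) to the carry-in carries out exactly when the carry-in is nonzero, and when the carry-in is 0 the ADC degenerates to a plain ADD.

```python
MASK32 = 0xFFFFFFFF


def add_set_flags(a: int, b: int, carry_in: int = 0):
    """Model of 32-bit ADD/ADC: returns (result, carry_out)."""
    full = (a & MASK32) + (b & MASK32) + carry_in
    return full & MASK32, full >> 32


def addcarry_general(c_in: int, a: int, b: int):
    """General case: add -1 to the 8-bit carry-in to set CF, then ADC."""
    cf = int(((c_in & 0xFF) + 0xFF) > 0xFF)  # carries out iff c_in != 0
    return add_set_flags(a, b, cf)


def addcarry_known_zero(a: int, b: int):
    """Carry-in known to be 0: a flag-setting ADD is enough."""
    return add_set_flags(a, b)
```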
    • Remove unneeded dependency from lib/Target/X86/Utils/ to lib/IR (aka Core). · b9616619
      Nico Weber authored
      The dependency was added in r213995 in response to r213986 which did make
      X86/Utils depend on IR, but r256680 later removed that dependency again.
      
      llvm-svn: 348724
      b9616619
    • [x86] don't try to convert add with undef operands to LEA · 19bc8502
      Sanjay Patel authored
      The existing code tries to handle an undef operand while transforming an add to an LEA, 
      but it's incomplete because we will crash on the i16 test with the debug output shown below. 
      It's better to just give up instead. Really, GlobalISel should have folded these before we 
      could get into trouble.
      
      # Machine code for function add_undef_i16: NoPHIs, TracksLiveness, Legalized, RegBankSelected, Selected
      
      bb.0 (%ir-block.0):
        liveins: $edi
        %1:gr32 = COPY killed $edi
        %0:gr16 = COPY %1.sub_16bit:gr32
        %5:gr64_nosp = IMPLICIT_DEF
        %5.sub_16bit:gr64_nosp = COPY %0:gr16
        %6:gr64_nosp = IMPLICIT_DEF
        %6.sub_16bit:gr64_nosp = COPY %2:gr16
        %4:gr32 = LEA64_32r killed %5:gr64_nosp, 1, killed %6:gr64_nosp, 0, $noreg
        %3:gr16 = COPY killed %4.sub_16bit:gr32
        $ax = COPY killed %3:gr16
        RET 0, implicit killed $ax
      
      # End machine code for function add_undef_i16.
      
      *** Bad machine code: Reading virtual register without a def ***
      - function:    add_undef_i16
      - basic block: %bb.0  (0x7fe6cd83d940)
      - instruction: %6.sub_16bit:gr64_nosp = COPY %2:gr16
      - operand 1:   %2:gr16
      LLVM ERROR: Found 1 machine code errors.
      
      Differential Revision: https://reviews.llvm.org/D54710
      
      llvm-svn: 348722
      19bc8502
    • [X86] Extend pfm counter coverage for llvm-exegesis · e9d8275e
      Simon Pilgrim authored
      Extension to rL348617. It turns out llvm-exegesis doesn't need to match the perf counter name against a scheduler model resource name, so I've added a few more counters that I could find in the libpfm4 source code (and fixed a typo in the knl/knm retired_uops counter, which uses 'all' instead of 'any').
      
      llvm-svn: 348721
      e9d8275e
  10. Dec 07, 2018