  1. Oct 25, 2021
    • Danila Malyutin's avatar
      7b102fcc
    • Jeremy Morse's avatar
      [DebugInfo][InstrRef] Track values fused into stack spills · ee3eee71
      Jeremy Morse authored
      During register allocation, some instructions can have stack spills fused
      into them. It means that when vregs are allocated on the stack we can
      convert:
      
          SETCCr %0
          DBG_VALUE %0
      
      to
      
          SETCCm %stack.0
          DBG_VALUE %stack.0
      
      Unfortunately instruction referencing finds this harder: a store to the
      stack doesn't have a specific operand number, therefore we don't substitute
      the old operand for a new operand, and the location is dropped. This patch
      implements a solution: just recognise the memory operand attached to an
      instruction with a Special Number (TM), and record a substitution between
      the old value and the new one.
      
      This patch adds substitution code to InlineSpiller to record such fused
      spills, and tracking in InstrRefBasedLDV to recognise such values, and
      produce the value numbers for them. Everything to do with the movement of
      stack-defined values is already handled in InstrRefBasedLDV.
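
The substitution idea described above can be sketched as a small lookup table. This is only an illustrative model, not the InlineSpiller or InstrRefBasedLDV code: the `Substitutions` class, the instruction numbers, and the `MEM_OPERAND` sentinel value are all hypothetical stand-ins for the "Special Number" the message mentions.

```python
# Hypothetical sentinel standing in for "the memory operand of this instruction".
MEM_OPERAND = 1_000_000


class Substitutions:
    """Maps an old (instruction number, operand) pair to its replacement."""

    def __init__(self):
        self.table = {}

    def record(self, old, new):
        # Remember that a value once defined by `old` is now defined by `new`.
        self.table[old] = new

    def resolve(self, ref):
        # Follow the chain of substitutions until no further entry exists.
        seen = set()
        while ref in self.table and ref not in seen:
            seen.add(ref)
            ref = self.table[ref]
        return ref


subs = Substitutions()
# SETCCr (instruction 1, operand 0) was fused into SETCCm (instruction 2),
# which defines its value through a store to the stack.
subs.record((1, 0), (2, MEM_OPERAND))
resolved = subs.resolve((1, 0))
```

A debug reference to the old register operand then resolves to the memory operand of the fused instruction instead of being dropped.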
      
      Differential Revision: https://reviews.llvm.org/D111317
      ee3eee71
    • Jeremy Morse's avatar
      [DebugInfo][NFC] Avoid a use-after-free · 2eb96e17
      Jeremy Morse authored
      This patch swaps two lines -- the CurSucc reference can be invalidated
      by the call to DFS.push_back, therefore that should happen last. The
      usual hat-tip to asan for catching this.
      
      This patch also swaps an earlier call to ToAdd.insert and DFS.push_back,
      where a stable iterator (from successors()) is being used. This isn't
      strictly necessary, but is good for consistency and avoiding readers
      asking themselves why the two code portions have a different order.
      2eb96e17
    • Sanjay Patel's avatar
      [DAGCombiner] make matching bit-hack form of usubsat more flexible · 6e46b66e
      Sanjay Patel authored
      (i8 X ^ 128) & (i8 X s>> 7) --> usubsat X, 128
      
      As suggested in D112085, we can substitute 'xor' with 'add'
      in this pattern, and it is logically equivalent:
      https://alive2.llvm.org/ce/z/eJtWWC
      
      We canonicalize to 'xor' in IR, but SDAG does not do that
      (and it probably should not - https://llvm.org/PR52267 ), so
      it is possible to see either pattern in codegen. Note that
      'sub' is another potential pattern, but that is
      canonicalized to 'add' in DAGCombiner, so we don't need to
      worry about that variation.
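
The equivalence of the 'xor' and 'add' forms can also be checked exhaustively for i8. The following is a minimal sketch that models 8-bit arithmetic with Python integers; the helper names are made up for illustration and are not DAGCombiner functions.

```python
def usubsat8(x: int, y: int) -> int:
    """Unsigned saturating subtraction on 8-bit values."""
    return max(x - y, 0)


def ashr8(x: int, amount: int) -> int:
    """Arithmetic shift right of an 8-bit value, returned as unsigned."""
    signed = x - 256 if x & 0x80 else x
    return (signed >> amount) & 0xFF


def xor_form(x: int) -> int:
    # (i8 X ^ 128) & (i8 X s>> 7)
    return (x ^ 128) & ashr8(x, 7)


def add_form(x: int) -> int:
    # (i8 X + 128) & (i8 X s>> 7), with the add wrapping at 8 bits
    return ((x + 128) & 0xFF) & ashr8(x, 7)


# Compare both forms against usubsat for every possible i8 input.
all_match = all(
    xor_form(x) == add_form(x) == usubsat8(x, 128) for x in range(256)
)
```

Both forms agree because the sign-bit mask is zero whenever X is below 128, and flipping or adding the top bit gives the same 8-bit result otherwise.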
      
      Differential Revision: https://reviews.llvm.org/D112377
      6e46b66e
    • Tim Northover's avatar
      CodeGenPrep: remove all copies of GEP from list if there are duplicates. · f9089acc
      Tim Northover authored
      Unfortunately ToT has changed enough from the revision where this actually
      caused problems that the test no longer triggers an assertion failure.
      f9089acc
    • Kazu Hirata's avatar
      Use llvm::any_of and llvm::none_of (NFC) · 4bd46501
      Kazu Hirata authored
      4bd46501
  2. Oct 24, 2021
  3. Oct 23, 2021
  4. Oct 22, 2021
    • Jay Foad's avatar
      [ScheduleDAGInstrs] Call adjustSchedDependency in more cases · 2915889d
      Jay Foad authored
      This removes a condition and the corresponding FIXME comment, because
      the Hexagon assertion it refers to has apparently been fixed, probably
      by D76134.
      
      NFCI. This just gives targets the opportunity to adjust latencies that
      were set to 0 by the generic code because they involve "implicit pseudo"
      operands.
      
      Differential Revision: https://reviews.llvm.org/D112306
      2915889d
    • Jeremy Morse's avatar
      [DebugInfo][Instr] Track subregisters across stack spills/restores · e7084cea
      Jeremy Morse authored
      Sometimes we generate code that writes to a subregister, then spills /
      restores a super-register to the stack, for example:
      
          $eax = MOV32ri 0
          MOV64mr $rsp, 1, $noreg, 16, $noreg, $rax
          $rcx = MOV64rm $rsp, 1, $noreg, 8, $noreg
      
      This patch takes a different approach: it adds another index to
      MLocTracker that identifies a size/offset within a stack slot. A location
      on the stack is then a pair of {FrameIndex, SlotNum}. Spilling and
      restoring now involves pairing up the src/dest register numbers, and the
      dest/src stack position to be transferred to/from. Location coverage
      improves as a result; compile-time performance decreases, alas.
      
      One limitation is that if a PHI occurs inside a stack slot:
      
          DBG_PHI %stack.0, 1
      
      We don't know how large the resulting value is, and so might have
      difficulty picking which value to use. DBG_PHI might need to be augmented
      in the future with such a size.
      
      Unit tests added ensure that spills and restores correctly transfer to
      positions in the Location => Value map, and that different register classes
      written to the stack will correctly clobber all other positions in the
      stack slot.
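
The slot-position scheme described in this message can be modelled roughly as follows. This is a simplified sketch under stated assumptions, not the MLocTracker implementation: the `StackSlotTracker` class, the fixed `POSITIONS` list, and the string value names are hypothetical, and every write is assumed to start at offset 0 of the slot.

```python
# Hypothetical (size_bits, offset_bits) positions tracked inside each slot.
POSITIONS = [(8, 0), (16, 0), (32, 0), (32, 32), (64, 0)]


class StackSlotTracker:
    """Tracks one value per (frame index, position) pair."""

    def __init__(self):
        self.values = {}
        self.clobber_count = 0

    def spill(self, frame_index, size_bits, reg_values):
        # reg_values maps positions within the register to known values.
        for pos in POSITIONS:
            size, offset = pos
            if pos in reg_values and size + offset <= size_bits:
                # The register has a tracked value for this position.
                self.values[(frame_index, pos)] = reg_values[pos]
            elif offset < size_bits:
                # The write overlaps this position: give it a fresh value.
                self.clobber_count += 1
                self.values[(frame_index, pos)] = f"clobber{self.clobber_count}"

    def restore(self, frame_index, size_bits):
        # Return the values of all positions that fit in the loaded width.
        return {
            pos: self.values[(frame_index, pos)]
            for pos in POSITIONS
            if pos[0] + pos[1] <= size_bits and (frame_index, pos) in self.values
        }


tracker = StackSlotTracker()
# Spill a 64-bit register whose low 32 bits hold a separately tracked value.
tracker.spill(0, 64, {(32, 0): "eax_def", (64, 0): "rax_val"})
restored = tracker.restore(0, 64)
```

In this model the restored register sees the subregister's value at position (32, 0), and a later narrower spill to the same slot replaces every overlapping position with a new value.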
      
      Differential Revision: https://reviews.llvm.org/D112133
      e7084cea
    • Craig Topper's avatar
      [LegalizeTypes] Only expand CTLZ/CTTZ/CTPOP during type promotion if the new type is legal. · 93139a3c
      Craig Topper authored
      We might be promoting a large non-power of 2 type and the new type
      may need to be split. Once we split it we may have a ctlz/cttz/ctpop
      instruction for the split type.
      
      I'm also concerned that we may create large shifts with shift amounts
      that are too small.
      93139a3c
    • Simon Pilgrim's avatar
      [DAG] narrowExtractedVectorLoad - EXTRACT_SUBVECTOR indices are always constant · a5f56342
      Simon Pilgrim authored
      EXTRACT_SUBVECTOR indices are always constant, so we don't need to check for ConstantSDNode; we should just use getConstantOperandVal, which will assert that the operand is constant.
      a5f56342
    • Jeremy Morse's avatar
      [DebugInfo][InstrRef] Add unit tests for transfer-function building · d9eebe3c
      Jeremy Morse authored
      This patch adds some unit tests for the machine-location transfer-function
      building parts of InstrRefBasedLDV: i.e., they test that feeding some MIR
      into the transfer-function building code creates the correct
      transfer function.
      
      There are a number of minor defects that get corrected in the process:
       * The unit test was selecting the x86 (i.e. 32 bit) backend rather than
         x86_64's 64 bit backend,
       * COPY instructions weren't actually having their subregister values
         correctly represented in the transfer function. Subregisters were being
         defined by the COPY, rather than taking the value in the source register.
       * SP aliases were at risk of being clobbered, if an SP subregister was
         clobbered.
      
      Differential Revision: https://reviews.llvm.org/D112006
      d9eebe3c
    • Craig Topper's avatar
      [TargetLowering] Simplify the interface of expandABS. NFC · 04c184bb
      Craig Topper authored
      Instead of returning a bool to indicate success and a separate
      SDValue, return the SDValue and have the callers check if it is
      null.
      
      Reviewed By: RKSimon
      
      Differential Revision: https://reviews.llvm.org/D112331
      04c184bb
    • Craig Topper's avatar
      [LegalizeTypes][RISCV][PowerPC] Expand CTLZ/CTTZ/CTPOP instead of promoting if... · 0766aef3
      Craig Topper authored
      [LegalizeTypes][RISCV][PowerPC] Expand CTLZ/CTTZ/CTPOP instead of promoting if they'll be expanded later.
      
      Expanding these requires multiple constants. If we promote during type
      legalization when they'll end up getting expanded in LegalizeDAG, we'll
      use larger constants. These constants may be harder to materialize.
      For example, 64-bit constants on 64-bit RISCV are very expensive.
      
      This is similar to what has already been done to BSWAP and BITREVERSE.
      
      Reviewed By: RKSimon
      
      Differential Revision: https://reviews.llvm.org/D112268
      0766aef3
    • Craig Topper's avatar
      [TargetLowering] Simplify the interface for expandCTPOP/expandCTLZ/expandCTTZ. · 996123e5
      Craig Topper authored
      There is no need to return a bool and have an SDValue output
      parameter. Just return the SDValue and let the caller check if it
      is null.
      
      I have another patch to add more callers of these so I thought
      I'd clean up the interface first.
      
      Reviewed By: RKSimon
      
      Differential Revision: https://reviews.llvm.org/D112267
      996123e5
    • Craig Topper's avatar
      [LegalizeVectorOps][X86] Don't defer BITREVERSE expansion to LegalizeDAG. · ff37b110
      Craig Topper authored
      By expanding early it allows the shifts to be custom lowered in
      LegalizeVectorOps. Then a DAG combine is able to run on them before
      LegalizeDAG handles the BUILD_VECTORS for the masks used.
      
      v16i8 shift lowering on X86 requires a mask to be applied to a v8i16
      shift. The BITREVERSE expansion applied an AND mask before SHL ops and
      after SRL ops. This was done to share the same mask constant for both shifts.
      It looks like this patch allows DAG combine to remove the AND mask added
      after v16i8 SHL by X86 lowering. This maintains the mask sharing that
      BITREVERSE was trying to achieve. Prior to this patch it looks like
      we kept the mask after the SHL instead which required an extra constant
      pool or a PANDN to invert it.
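
The mask sharing mentioned above can be seen in a scalar model of the per-byte part of a bit reversal. This is only a sketch of the general shift-and-mask technique on one 8-bit value; it is not the LegalizeVectorOps expansion, and `bitreverse8` is a name chosen for illustration.

```python
def bitreverse8(x: int) -> int:
    """Reverse the bits of an 8-bit value with three shift-and-mask steps."""
    x &= 0xFF
    for shift, mask in ((4, 0x0F), (2, 0x33), (1, 0x55)):
        hi = (x >> shift) & mask  # AND mask applied after the SRL
        lo = (x & mask) << shift  # same AND mask applied before the SHL
        x = hi | lo
    return x


def bitreverse8_reference(x: int) -> int:
    """Reference implementation via string reversal."""
    return int(format(x & 0xFF, "08b")[::-1], 2)


# Compare against the reference for every 8-bit input.
all_match = all(bitreverse8(x) == bitreverse8_reference(x) for x in range(256))
```

Applying the mask before the left shift and after the right shift lets each step use a single constant for both halves.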
      
      This is dependent on D112248 because RISCV will end up scalarizing the BSWAP
      portion of the BITREVERSE expansion if we don't disable BSWAP scalarization in
      LegalizeVectorOps first.
      
      Reviewed By: RKSimon
      
      Differential Revision: https://reviews.llvm.org/D112254
      ff37b110
  5. Oct 21, 2021
  6. Oct 20, 2021
    • Stanislav Mekhanoshin's avatar
      [AMDGPU] MachineLICM cannot hoist VALU · c80d8a8c
      Stanislav Mekhanoshin authored
      MachineLoop::isLoopInvariant() returns false for all VALU
      because of the exec use. Check TII::isIgnorableUse() to
      allow hoisting.
      
      That unfortunately results in higher register consumption
      since MachineLICM does not adequately estimate pressure.
      Therefore I think it shall only be enabled after D107677 even
      though it does not depend on it.
      
      Differential Revision: https://reviews.llvm.org/D107859
      c80d8a8c
    • Itay Bookstein's avatar
      [IR] Refactor GlobalIFunc to inherit from GlobalObject, Remove GlobalIndirectSymbol · 08ed2160
      Itay Bookstein authored
      As discussed in:
      * https://reviews.llvm.org/D94166
      * https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html
      
      The GlobalIndirectSymbol class lost most of its meaning in
      https://reviews.llvm.org/D109792, which disambiguated getBaseObject
      (now getAliaseeObject) between GlobalIFunc and everything else.
      In addition, as long as GlobalIFunc is not a GlobalObject and
      getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee
      is a GlobalIFunc cannot currently be modeled properly. Creating
      aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition,
      calling getAliaseeObject on a GlobalIFunc will currently return nullptr,
      which is undesirable because it should return the object itself for
      non-aliases.
      
      This patch refactors the GlobalIFunc class to inherit directly from
      GlobalObject, and removes GlobalIndirectSymbol (while inlining the
      relevant parts into GlobalAlias and GlobalIFunc). This allows for
      calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc
      itself, making getAliaseeObject() more consistent and enabling
      alias-to-ifunc to be properly modeled in the IR.
      
      I exercised some judgement in the API clients of GlobalIndirectSymbol:
      some were 'monomorphized' for GlobalAlias and GlobalIFunc, and
      some remained shared (with the type adapted to become GlobalValue).
      
      Reviewed By: MaskRay
      
      Differential Revision: https://reviews.llvm.org/D108872
      08ed2160
    • Fraser Cormack's avatar
      [CodeGenPrepare] Avoid a scalable-vector crash in ctlz/cttz · eabf11f9
      Fraser Cormack authored
      This patch fixes a crash when despeculating ctlz/cttz intrinsics with
      scalable-vector types. It is not safe to speculatively get the size of
      the vector type in bits in case the vector type is not a fixed-length type. As
      it happens this isn't required as vector types are skipped anyway.
      
      Reviewed By: RKSimon
      
      Differential Revision: https://reviews.llvm.org/D112141
      eabf11f9
    • Craig Topper's avatar
      [RISCV][WebAssembly][TargetLowering] Allow expandCTLZ/expandCTTZ to rely on... · fe1f0de0
      Craig Topper authored
      [RISCV][WebAssembly][TargetLowering] Allow expandCTLZ/expandCTTZ to rely on CTPOP expansion for vectors.
      
      Our fallback expansion for CTLZ/CTTZ relies on CTPOP. If CTPOP
      isn't legal or custom for a vector type we would scalarize the
      CTLZ/CTTZ. This differs from CTPOP itself, which would use a
      vector expansion.
      
      This patch teaches expandCTLZ/CTTZ to rely on the vector CTPOP
      expansion instead of scalarizing. To do this I had to add additional
      checks to make sure the operations used by CTPOP expansions are all
      supported. Some of the operations were already needed for the CTLZ/CTTZ
      expansion.
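
For reference, the usual way to express CTLZ and CTTZ in terms of CTPOP looks like the sketch below. It is a scalar model of one 16-bit lane, written on the assumption that the fallback follows the well-known bit-smearing identities; the function names are illustrative and the code is not taken from TargetLowering.

```python
BITS = 16
MASK = (1 << BITS) - 1


def ctpop(x: int) -> int:
    """Count the set bits of a BITS-wide value."""
    return bin(x & MASK).count("1")


def cttz_via_ctpop(x: int) -> int:
    # ~x & (x - 1) sets exactly the bits below the lowest set bit of x.
    return ctpop(~x & (x - 1))


def ctlz_via_ctpop(x: int) -> int:
    # Smear the highest set bit into every lower position, then count
    # the zero bits that remain above it.
    x &= MASK
    shift = 1
    while shift < BITS:
        x |= x >> shift
        shift *= 2
    return ctpop(~x)


def cttz_reference(x: int) -> int:
    return BITS if x == 0 else (x & -x).bit_length() - 1


def ctlz_reference(x: int) -> int:
    return BITS - x.bit_length()


# Check both identities for every 16-bit input, including zero.
all_match = all(
    cttz_via_ctpop(x) == cttz_reference(x)
    and ctlz_via_ctpop(x) == ctlz_reference(x)
    for x in range(1 << BITS)
)
```

Besides CTPOP, these identities need only AND, OR, NOT, subtraction, and shifts, all of which have to be available on the vector type for the expansion to stay in vector form.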
      
      This is a huge improvement for RISCV, which doesn't have a scalar
      ctlz or cttz in the base ISA.
      
      For WebAssembly, I've added Custom lowering to keep the scalarizing
      behavior. I've also extended the scalarizing to CTPOP.
      
      Differential Revision: https://reviews.llvm.org/D111919
      fe1f0de0
    • Jeremy Morse's avatar
      [DebugInfo][InstrRef] Track a single variable at a time · 89950ade
      Jeremy Morse authored
      Here's another performance patch for InstrRefBasedLDV: rather than
      processing all variable values in a scope at once, process one
      variable at a time. The benefits are twofold:
       * It's easier to reason about one variable at a time in your mind,
       * It improves performance, apparently from increased locality.
      
      The downside is that the value-propagation code gets indented one level
      further, plus there's some churn in the unit tests.
      
      Differential Revision: https://reviews.llvm.org/D111799
      89950ade
    • Sander de Smalen's avatar
      [SelectionDAG] Fix getVectorSubVecPointer for scalable subvectors. · be6c8dc7
      Sander de Smalen authored
      When inserting a scalable subvector into a scalable vector through
      the stack, the index to store to needs to be scaled by vscale.
      Before this patch, that didn't yet happen, so it would generate the
      wrong offset, thus storing a subvector to the incorrect address
      and overwriting the wrong lanes.
      
      For some insert:
        nxv8f16 insert_subvector(nxv8f16 %vec, nxv2f16 %subvec, i64 2)
      
      The offset was not scaled by vscale:
        orr     x8, x8, #0x4
        st1h    { z0.h }, p0, [sp]
        st1h    { z1.d }, p1, [x8]
        ld1h    { z0.h }, p0/z, [sp]
      
      And is changed to:
        mov x8, sp
        st1h { z0.h }, p0, [sp]
        st1h { z1.d }, p1, [x8, #1, mul vl]
        ld1h { z0.h }, p0/z, [sp]
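
The offset arithmetic behind the example above can be sketched as follows. This is a simplified model, assuming 2-byte f16 elements and a byte offset computed as index times element size; `subvector_byte_offset` is a hypothetical helper, not the getVectorSubVecPointer implementation.

```python
def subvector_byte_offset(
    index: int, elt_bytes: int, vscale: int, scalable_subvec: bool
) -> int:
    """Byte offset of a subvector inserted at `index` via a stack temporary."""
    offset = index * elt_bytes
    # For a scalable subvector the index counts in units of vscale elements,
    # so the byte offset has to be multiplied by vscale as well.
    return offset * vscale if scalable_subvec else offset


# Index 2 with 2-byte elements, matching the insert_subvector example.
unscaled = subvector_byte_offset(2, 2, vscale=4, scalable_subvec=False)
scaled = subvector_byte_offset(2, 2, vscale=4, scalable_subvec=True)
```

With vscale equal to 1 the two results coincide, so in this model the missing scaling only shows up when vscale is larger than 1.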
      
      Differential Revision: https://reviews.llvm.org/D111633
      be6c8dc7
  7. Oct 19, 2021
  8. Oct 18, 2021