  1. Oct 22, 2021
  2. Oct 21, 2021
  3. Oct 20, 2021
    • [AMDGPU] MachineLICM cannot hoist VALU · c80d8a8c
      Stanislav Mekhanoshin authored
      MachineLoop::isLoopInvariant() returns false for all VALU
      because of the exec use. Check TII::isIgnorableUse() to
      allow hoisting.
      
      That unfortunately results in higher register consumption
      since MachineLICM does not adequately estimate pressure.
      Therefore I think it should only be enabled after D107677 even
      though it does not depend on it.
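The hoisting rule described above can be sketched with a toy model. This is only an illustration under stated assumptions: `Operand`, `Instr`, and the `HonorIgnorableUses` flag are hypothetical stand-ins, not LLVM's `MachineOperand`, `MachineInstr`, or the actual `MachineLoop::isLoopInvariant()` logic. The `Ignorable` field stands in for whatever `TII::isIgnorableUse()` would report for a use such as exec.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy operand, not LLVM's MachineOperand.
struct Operand {
  std::string Reg;
  bool IsPhysReg;     // physical register use (e.g. exec)
  bool DefinedInLoop; // register is written somewhere inside the loop
  bool Ignorable;     // stand-in for what TII::isIgnorableUse() would say
};

// Hypothetical toy instruction: only its register uses matter here.
struct Instr {
  std::vector<Operand> Uses;
};

// Simplified invariance check (an assumption, not the real algorithm):
// a physical register use blocks hoisting unless it is ignorable, and a
// virtual register use blocks hoisting if it is defined inside the loop.
bool isLoopInvariant(const Instr &MI, bool HonorIgnorableUses) {
  for (const Operand &Op : MI.Uses) {
    if (HonorIgnorableUses && Op.Ignorable)
      continue; // skip uses the target declares harmless
    if (Op.IsPhysReg || Op.DefinedInLoop)
      return false;
  }
  return true;
}
```

In this model, an instruction that reads a loop-external virtual register plus an ignorable physical register only becomes hoistable once ignorable uses are honored.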
      
      Differential Revision: https://reviews.llvm.org/D107859
    • [IR] Refactor GlobalIFunc to inherit from GlobalObject, Remove GlobalIndirectSymbol · 08ed2160
      Itay Bookstein authored
      As discussed in:
      * https://reviews.llvm.org/D94166
      * https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html
      
      The GlobalIndirectSymbol class lost most of its meaning in
      https://reviews.llvm.org/D109792, which disambiguated getBaseObject
      (now getAliaseeObject) between GlobalIFunc and everything else.
      In addition, as long as GlobalIFunc is not a GlobalObject and
      getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee
      is a GlobalIFunc cannot currently be modeled properly. Creating
      aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition,
      calling getAliaseeObject on a GlobalIFunc will currently return nullptr,
      which is undesirable because it should return the object itself for
      non-aliases.
      
      This patch refactors the GlobalIFunc class to inherit directly from
      GlobalObject, and removes GlobalIndirectSymbol (while inlining the
      relevant parts into GlobalAlias and GlobalIFunc). This allows for
      calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc
      itself, making getAliaseeObject() more consistent and enabling
      alias-to-ifunc to be properly modeled in the IR.
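A minimal sketch of the class relationship after the refactoring, assuming heavily simplified stand-ins for the IR classes. The struct names mirror the LLVM ones for readability, but the members and bodies here are hypothetical and omit everything except alias resolution.

```cpp
#include <cassert>

struct GlobalObject;

// Simplified stand-in for llvm::GlobalValue.
struct GlobalValue {
  virtual ~GlobalValue() = default;
  // Strip aliases and return the underlying object.
  virtual const GlobalObject *getAliaseeObject() const = 0;
};

// Simplified stand-in for llvm::GlobalObject.
struct GlobalObject : GlobalValue {
  // A non-alias resolves to itself.
  const GlobalObject *getAliaseeObject() const override { return this; }
};

struct Function : GlobalObject {};

// After the refactoring an ifunc is itself a GlobalObject, so it
// inherits the "return this" behaviour instead of yielding nullptr.
struct GlobalIFunc : GlobalObject {
  const Function *Resolver = nullptr;
};

// An alias forwards to whatever its aliasee resolves to.
struct GlobalAlias : GlobalValue {
  const GlobalValue *Aliasee = nullptr;
  const GlobalObject *getAliaseeObject() const override {
    return Aliasee ? Aliasee->getAliaseeObject() : nullptr;
  }
};
```

With this shape, an alias whose aliasee is an ifunc resolves to the ifunc, which is the alias-to-ifunc case the message says could not be modeled before.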
      
      I exercised some judgement in the API clients of GlobalIndirectSymbol:
      some were 'monomorphized' for GlobalAlias and GlobalIFunc, and
      some remained shared (with the type adapted to become GlobalValue).
      
      Reviewed By: MaskRay
      
      Differential Revision: https://reviews.llvm.org/D108872
    • [CodeGenPrepare] Avoid a scalable-vector crash in ctlz/cttz · eabf11f9
      Fraser Cormack authored
      This patch fixes a crash when despeculating ctlz/cttz intrinsics with
      scalable-vector types. It is not safe to speculatively get the size of
      the vector type in bits in case the vector type is not a fixed-length type. As
      it happens this isn't required as vector types are skipped anyway.
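The ordering issue can be illustrated with a toy type model. This is a sketch under assumptions: `Type`, `getFixedSizeInBits`, and `shouldDespeculate` are hypothetical names for illustration, not the CodeGenPrepare code. The point is only that the vector check comes before any query that is invalid for scalable types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy type, not llvm::Type.
struct Type {
  bool IsVector;
  bool IsScalable;  // size is a runtime multiple of MinBits
  uint64_t MinBits; // known minimum size in bits

  // Only valid for fixed-size types; asserts otherwise, mimicking a
  // crash when a scalable size is treated as a compile-time constant.
  uint64_t getFixedSizeInBits() const {
    assert(!IsScalable && "scalable size is not a compile-time constant");
    return MinBits;
  }
};

// Decide whether a ctlz/cttz of this type is a despeculation candidate.
// Vectors are rejected first, so the fixed-size query below is never
// reached for a scalable vector.
bool shouldDespeculate(const Type &Ty, uint64_t LargestLegalIntBits) {
  if (Ty.IsVector)
    return false;
  return Ty.getFixedSizeInBits() <= LargestLegalIntBits;
}
```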
      
      Reviewed By: RKSimon
      
      Differential Revision: https://reviews.llvm.org/D112141
    • [RISCV][WebAssembly][TargetLowering] Allow expandCTLZ/expandCTTZ to rely on CTPOP expansion for vectors. · fe1f0de0
      Craig Topper authored
      
      Our fallback expansion for CTLZ/CTTZ relies on CTPOP. If CTPOP
      isn't legal or custom for a vector type we would scalarize the
      CTLZ/CTTZ. This is different from CTPOP itself, which would use a
      vector expansion.
      
      This patch teaches expandCTLZ/CTTZ to rely on the vector CTPOP
      expansion instead of scalarizing. To do this I had to add additional
      checks to make sure the operations used by CTPOP expansions are all
      supported. Some of the operations were already needed for the CTLZ/CTTZ
      expansion.
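For reference, the well-known bit identities behind a CTPOP-based fallback can be written out on 32-bit scalars. This is a sketch of the general technique, not the SelectionDAG code; the function names are made up for this example. The same operations (shifts, AND, OR, NOT, add, subtract, multiply) apply lane-wise to vectors, which is why the expansion needs all of them to be supported.

```cpp
#include <cassert>
#include <cstdint>

// Bit-parallel population count using shifts, masks, adds and a multiply.
uint32_t ctpop32(uint32_t X) {
  X = X - ((X >> 1) & 0x55555555u);                 // 2-bit sums
  X = (X & 0x33333333u) + ((X >> 2) & 0x33333333u); // 4-bit sums
  X = (X + (X >> 4)) & 0x0F0F0F0Fu;                 // 8-bit sums
  return (X * 0x01010101u) >> 24;                   // add the four bytes
}

// cttz(x) = ctpop(~x & (x - 1)): the mask selects exactly the zero bits
// below the lowest set bit. For x == 0 the mask is all ones.
uint32_t cttz32(uint32_t X) { return ctpop32(~X & (X - 1u)); }

// ctlz: smear the highest set bit into every lower position, then count
// the zero bits that remain above it.
uint32_t ctlz32(uint32_t X) {
  X |= X >> 1;
  X |= X >> 2;
  X |= X >> 4;
  X |= X >> 8;
  X |= X >> 16;
  return ctpop32(~X);
}
```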
      
      This is a huge improvement for RISCV, which doesn't have a scalar
      ctlz or cttz in the base ISA.
      
      For WebAssembly, I've added Custom lowering to keep the scalarizing
      behavior. I've also extended the scalarizing to CTPOP.
      
      Differential Revision: https://reviews.llvm.org/D111919
    • [DebugInfo][InstrRef] Track a single variable at a time · 89950ade
      Jeremy Morse authored
      Here's another performance patch for InstrRefBasedLDV: rather than
      processing all variable values in a scope at a time, instead, process one
      variable at a time. The benefits are twofold:
       * It's easier to reason about one variable at a time in your mind,
       * It improves performance, apparently from increased locality.
      
      The downside is that the value-propagation code gets indented one level
      further, plus there's some churn in the unit tests.
      
      Differential Revision: https://reviews.llvm.org/D111799
    • [SelectionDAG] Fix getVectorSubVecPointer for scalable subvectors. · be6c8dc7
      Sander de Smalen authored
      When inserting a scalable subvector into a scalable vector through
      the stack, the index to store to needs to be scaled by vscale.
      Before this patch, that didn't yet happen, so it would generate the
      wrong offset, thus storing a subvector to the incorrect address
      and overwriting the wrong lanes.
      
      For some insert:
        nxv8f16 insert_subvector(nxv8f16 %vec, nxv2f16 %subvec, i64 2)
      
      The offset was not scaled by vscale:
        orr     x8, x8, #0x4
        st1h    { z0.h }, p0, [sp]
        st1h    { z1.d }, p1, [x8]
        ld1h    { z0.h }, p0/z, [sp]
      
      And is changed to:
        mov x8, sp
        st1h { z0.h }, p0, [sp]
        st1h { z1.d }, p1, [x8, #1, mul vl]
        ld1h { z0.h }, p0/z, [sp]
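The arithmetic behind the fix can be sketched as follows. This is an illustration only: `SubVecInfo` and `subVecByteOffset` are hypothetical names, and vscale is passed in as a plain number here even though it is a runtime value in generated code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of the subvector being inserted.
struct SubVecInfo {
  uint64_t EltBytes; // size of one element in memory, e.g. 2 for f16
  bool IsScalable;   // true for types such as nxv2f16
};

// Byte offset of the subvector within the stack slot. For a scalable
// subvector the element index refers to a multiple of vscale elements,
// so the offset must be multiplied by vscale as well.
uint64_t subVecByteOffset(uint64_t Idx, SubVecInfo Sub, uint64_t VScale) {
  assert(VScale >= 1 && "vscale is a positive integer");
  uint64_t Offset = Idx * Sub.EltBytes;
  if (Sub.IsScalable)
    Offset *= VScale;
  return Offset;
}
```

For the insert at index 2 of f16 elements shown above, the unscaled offset is 4 bytes, matching the `#0x4` in the old code. With vscale equal to 4 the scaled offset would be 16 bytes.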
      
      Differential Revision: https://reviews.llvm.org/D111633
  4. Oct 19, 2021
  5. Oct 18, 2021
  6. Oct 15, 2021
  7. Oct 14, 2021
    • [DebugInfo][InstrRef] Place variable-values PHI using LLVM utilities · b5426ced
      Jeremy Morse authored
      This patch is very similar to D110173 / a3936a6c, but for variable
      values rather than machine values. This is for the second instr-ref
      problem, calculating the correct variable value on entry to each block.
      The previous lattice-based implementation was broken; we now use LLVM's
      existing PHI placement utilities to work out where values need to merge,
      then eliminate unnecessary ones through value propagation.
      
      Most of the deletions here happen in vlocJoin: it was trying to pick a
      location for PHIs to happen in, badly, leading to an infinite loop in the
      MIR test added, where it would repeatedly switch between register
      locations. The new approach is simpler: either PHIs can be eliminated, or
      they can't, and the location of the value is a different problem.
      
      Various bits and pieces move to the header so that they can be tested in
      the unit tests. The DbgValue class grows a "VPHI" kind to represent
      variable value PHIs that haven't been eliminated yet.
      
      Differential Revision: https://reviews.llvm.org/D110630
    • [Codegen] TargetLowering::getCanonicalIndexType - early out scaled MVT::i8 indices. NFCI. · 88487662
      Simon Pilgrim authored
      Avoids unused assignment scan-build warning.
    • Follow up to a3936a6c, correctly select LiveDebugValues implementation · e3e1da20
      Jeremy Morse authored
      Some functions get opted out of instruction referencing if they're being
      compiled with no optimisations, however the LiveDebugValues pass picks one
      implementation and then sticks with it through the rest of compilation.
      This leads to a segfault if we encounter a function that doesn't use
      instr-ref (because it's optnone, for example), but we've already decided
      to use InstrRefBasedLDV which expects to be passed a DomTree.
      
      Solution: keep both implementations around in the pass, and pick whichever
      one is appropriate to the current function.
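The "keep both, pick per function" approach might look roughly like the sketch below. All types here are hypothetical toys (`FunctionInfo`, `LDVImpl`, and the pass class are not the LLVM classes); only the selection pattern is the point.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical per-function summary.
struct FunctionInfo {
  std::string Name;
  bool UsesInstrRef; // false e.g. for optnone functions
};

// Hypothetical common interface for the two implementations.
struct LDVImpl {
  virtual ~LDVImpl() = default;
  virtual std::string name() const = 0;
};

struct VarLocImpl : LDVImpl {
  std::string name() const override { return "VarLocBasedLDV"; }
};

struct InstrRefImpl : LDVImpl {
  std::string name() const override { return "InstrRefBasedLDV"; }
};

// The pass owns both implementations for its whole lifetime and chooses
// one each time it is handed a function, rather than once up front.
class LiveDebugValuesPass {
  std::unique_ptr<LDVImpl> VarLoc = std::make_unique<VarLocImpl>();
  std::unique_ptr<LDVImpl> InstrRef = std::make_unique<InstrRefImpl>();

public:
  const LDVImpl &implFor(const FunctionInfo &F) const {
    return F.UsesInstrRef ? *InstrRef : *VarLoc;
  }
};
```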
  8. Oct 13, 2021
    • [DebugInfo][InstrRef] Only calculate IDF for reg units · fbf269c7
      Jeremy Morse authored
      In D110173 we started using the existing LLVM IDF calculator to place PHIs as
      we reconstruct an SSA form of the machine-code program. Sadly that's slower
      than the old (but broken) way; this patch attempts to recover some of that
      performance.
      
      The key observation: every time we def a register, we also have to def its
      register units. If we def'd $rax, in the current implementation we
      independently calculate PHI locations for {al, ah, ax, eax, hax, rax}, and
      they will all have the same PHI positions. Instead of doing that, we can
      calculate the PHI positions for {al, ah} and place PHIs for any aliasing
      registers in the same positions. Any def of a super-register has to def
      the unit, and vice versa, so this is sound. It cuts down the SSA placement
      we need to do significantly.
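The idea of computing PHI positions once per register unit and reusing them for every aliasing register can be sketched as below. This is a toy under assumptions: registers and units are plain strings, blocks are integer ids, and `placePHIsViaUnits` is a hypothetical helper, not part of InstrRefBasedLDV.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using BlockSet = std::set<int>;

// RegUnitsOf: for each register, the register units it is made of.
// UnitPHIBlocks: PHI positions computed once per register unit.
// Returns, for each register, the union of its units' PHI positions,
// so the expensive placement runs per unit instead of per register.
std::map<std::string, BlockSet> placePHIsViaUnits(
    const std::map<std::string, std::vector<std::string>> &RegUnitsOf,
    const std::map<std::string, BlockSet> &UnitPHIBlocks) {
  std::map<std::string, BlockSet> Result;
  for (const auto &Entry : RegUnitsOf) {
    BlockSet &Blocks = Result[Entry.first];
    for (const std::string &Unit : Entry.second) {
      auto It = UnitPHIBlocks.find(Unit);
      if (It != UnitPHIBlocks.end())
        Blocks.insert(It->second.begin(), It->second.end());
    }
  }
  return Result;
}
```

In the $rax example, placement would run only for the units al and ah, and ax, eax and rax would inherit the resulting block set.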
      
      This doesn't work for stack slots, or registers we only ever read, so place
      PHIs normally for those. LiveDebugValues chooses to ignore writes to SP at
      calls, and now has to ignore writes to SP register units too.
      
      Differential Revision: https://reviews.llvm.org/D111627
    • Follow up a3936a6c to work around an old compiler bug · e845ca2f
      Jeremy Morse authored
      Old versions of gcc want template specialisations to happen within the
      namespace where the template lives; this is still present in gcc 5.1, which
      we officially support, so it has to be worked around.
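A small illustration of the portable form, using made-up names (`demo`, `Traits`) rather than the code touched by the commit. Reopening the namespace for the specialization is accepted by old and new compilers alike; the qualified form from outside the namespace is what older gcc releases reject.

```cpp
#include <cassert>

namespace demo {
// Primary template lives in namespace demo.
template <typename T> struct Traits {
  static int id() { return 0; }
};
} // namespace demo

// Portable workaround: reopen the namespace and specialise inside it.
namespace demo {
template <> struct Traits<int> {
  static int id() { return 1; }
};
} // namespace demo

// The form that older gcc versions reject, shown for comparison only:
//   template <> struct demo::Traits<int> { static int id() { return 1; } };
```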
    • [DebugInfo][InstrRef] Use PHI placement utilities for machine locations · a3936a6c
      Jeremy Morse authored
      InstrRefBasedLDV used to try and determine which values are in which
      registers using a lattice approach; however this is hard to understand, and
      broken in various ways. This patch replaces that approach with a standard
      SSA approach using existing LLVM utilities. PHIs are placed at dominance
      frontiers; value propagation then eliminates unnecessary PHIs.
      
      This patch also adds a bunch of unit tests that should cover many of the
      weirder forms of control flow.
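The PHI elimination step mentioned in this message follows a standard rule: a PHI whose incoming values all agree (ignoring references to itself) carries no new information. The sketch below shows that rule in isolation. The `PHI` struct and integer value ids are hypothetical, and the real pass propagates values across blocks until nothing changes rather than looking at one PHI at a time.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy PHI: values are identified by integer ids.
struct PHI {
  int Id;                    // id of the value this PHI defines
  std::vector<int> Incoming; // one incoming value id per predecessor
};

// Returns the value the PHI can be replaced with, or the PHI's own id
// if it merges genuinely different values and must be kept.
int simplifyPHI(const PHI &P) {
  assert(!P.Incoming.empty() && "a PHI needs at least one predecessor");
  int Common = P.Id; // sentinel: no non-self incoming value seen yet
  for (int V : P.Incoming) {
    if (V == P.Id)
      continue; // a loop back-edge feeding the PHI into itself
    if (Common == P.Id)
      Common = V;
    else if (Common != V)
      return P.Id; // two distinct values meet here: keep the PHI
  }
  return Common;
}
```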
      
      Differential Revision: https://reviews.llvm.org/D110173