  1. Oct 20, 2021
  2. Oct 19, 2021
    • [x86] add special-case lowering for usubsat for pre-SSE4 · 92a0389b
      Sanjay Patel authored
      usubsat X, SMIN --> (X ^ SMIN) & (X s>> BW-1)
      
      This would be a regression with D112085 where we combine to
      usubsat more aggressively, so avoid that by matching the
      special-case where we are subtracting SMIN (signmask):
      https://alive2.llvm.org/ce/z/4_3gBD
      
      Differential Revision: https://reviews.llvm.org/D112095
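
The identity in this commit message can be checked exhaustively at a small bit width. The sketch below is a plain-Python model, not LLVM code; the helper names (`usubsat`, `ashr`, `lowered`) and the choice of 8 bits are assumptions made for illustration.

```python
BW = 8                      # assumed bit width for the exhaustive check
MASK = (1 << BW) - 1
SMIN = 1 << (BW - 1)        # sign mask, e.g. 0x80 for 8 bits


def usubsat(x, y):
    """Unsigned saturating subtract: clamps at zero instead of wrapping."""
    return max(x - y, 0)


def ashr(x, n):
    """Arithmetic shift right of a BW-bit value stored as an unsigned int."""
    signed = x - (1 << BW) if x & SMIN else x
    return (signed >> n) & MASK


def lowered(x):
    """The special-case form from the commit: (X ^ SMIN) & (X s>> BW-1)."""
    return (x ^ SMIN) & ashr(x, BW - 1)


# Collect every input where the two forms disagree.
mismatches = [x for x in range(1 << BW) if usubsat(x, SMIN) != lowered(x)]
```

The shift produces an all-ones mask exactly when the sign bit of X is set, which is also the only case where X - SMIN does not saturate to zero.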
    • [SCEV] Fix formatting error introduced by D112080 · 9c44a099
      Bjorn Pettersson authored
      Accidentally pushed D112080 without this clang-format cleanup.
    • [SCEV] Avoid compile time explosion in ScalarEvolution::isImpliedCond · 08619006
      Bjorn Pettersson authored
      As seen in PR51869 the ScalarEvolution::isImpliedCond function might
      end up spending lots of time when doing the isKnownPredicate checks.
      
      Calling isKnownPredicate can, for example, result in isKnownViaInduction
      being called, which might result in isLoopBackedgeGuardedByCond being
      called, and then we might get one or more new calls to isImpliedCond.
      Even if the scenario described here isn't an infinite loop, using
      some randomly generated C programs as input indicates that those
      isKnownPredicate checks quite often return true. On the other hand,
      the third condition that needs to be fulfilled in order to "prove
      implications via truncation", i.e. the isImpliedCondBalancedTypes
      check, is rarely fulfilled.
      I also made some similar experiments to look at how often we would
      get the same result when using isKnownViaNonRecursiveReasoning instead
      of isKnownPredicate. So far I haven't seen a single case when codegen
      is negatively impacted by using isKnownViaNonRecursiveReasoning. On
      the other hand, it seems like we get rid of the compile time explosion
      seen in PR51869 that way. Hence this patch.
      
      Reviewed By: nikic
      
      Differential Revision: https://reviews.llvm.org/D112080
    • Extend transform introduced in D111896 to multiple exits · 0836a105
      Philip Reames authored
      This is trivial.  It was left out of the original review only because we had multiple copies of the same code in review at the same time, and keeping them in sync was easiest if the structure was kept in sync.
    • [indvars] Canonicalize exit conditions to unsigned using range info · fca02188
      Philip Reames authored
      This patch duplicates a bit of logic we apply to comparisons encountered during the IV users walk to conditions which feed exit conditions. Why? simplifyAndExtend has a very limited list of users it walks. In particular, in the examples it stops at the zext and never visits the icmp. (Because we can't fold the zext to an addrec yet in SCEV.) Being willing to visit when we haven't simplified regresses multiple tests (seemingly because of less optimal results when computing trip counts).
      
      Note that this can be trivially extended to multiple exiting blocks. I'm leaving that to a future patch (solely to cut down on the number of versions of the same code in review at once.)
      
      Differential Revision: https://reviews.llvm.org/D111896
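
Why range information justifies switching a comparison to unsigned can be shown with a small model. The sketch below is plain Python, not the indvars code; the bit widths and helper names are assumptions, and the commit message does not say which predicates the patch handles. It checks that a signed less-than and an unsigned less-than agree whenever both operands are known non-negative, which is always true of a zext result.

```python
NARROW = 4   # assumed width of the value before zext
WIDE = 8     # assumed width after zext


def as_signed(v, bits):
    """Interpret an unsigned bit pattern as a two's-complement signed value."""
    return v - (1 << bits) if v & (1 << (bits - 1)) else v


def slt(a, b, bits):
    """Signed less-than on bit patterns."""
    return as_signed(a, bits) < as_signed(b, bits)


def ult(a, b, bits):
    """Unsigned less-than on bit patterns."""
    return a < b


def agree_when_nonnegative():
    """Exhaustively compare slt and ult over non-negative WIDE-bit operands."""
    for x in range(1 << NARROW):           # zext result: sign bit is never set
        for n in range(1 << (WIDE - 1)):   # bound known to be non-negative
            if slt(x, n, WIDE) != ult(x, n, WIDE):
                return False
    return True


def without_range_info():
    """0xFF is -1 when signed but 255 when unsigned, so the predicates differ."""
    return slt(0xFF, 1, WIDE), ult(0xFF, 1, WIDE)
```

The second helper shows the case the range check rules out: once an operand may have its sign bit set, the two predicates can give different answers.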
    • [LoopPredication] Calculate profitability without BPI · 9403514e
      Anna Thomas authored
      Using BPI within loop predication is non-trivial because BPI is only
      preserved lossily in loop pass manager (one fix exposed by lossy
      preservation is up for review at D111448). However, since loop
      predication is only used in downstream pipelines, it is hard to keep BPI
      from breaking for incomplete state with upstream changes in BPI.
      Also, correctly preserving BPI for all loop passes is a non-trivial
      undertaking (D110438 does this lossily), while the benefit of using it
      in loop predication isn't clear.
      
      In this patch, we rely on profile metadata to get nearly the same benefit as
      BPI, without actually using the complete heuristics provided by BPI.
      This avoids the compile time explosion we tried to fix with D110438 and
      also avoids fragile bugs because BPI can be lossy in loop passes
      (D111448).
      
      Reviewed-By: asbirlea, apilipenko
      Differential Revision: https://reviews.llvm.org/D111668
    • [Verifier] Add context for assume operand bundles verifier errors · ac0561eb
      Arthur Eubanks authored
      And fix a typo.
    • Changes to print-changed classes in preparation for DotCfg change printer · 3af474c0
      Jamie Schmeiser authored
      Summary:
      Break out non-functional changes to the print-changed classes that are needed
      for reuse with the DotCfg change printer in https://reviews.llvm.org/D87202.
      
      Various changes to the change printers to facilitate reuse with the
      upcoming DotCfg change printer. This includes changing several of
      the classes and their support classes to being templates. Also,
      some template parameter names were simplified to avoid confusion
      with planned identifiers in the DotCfg change printer to come. A
      virtual function in the class for comparing functions was changed
      to a lambda. The virtual function same was replaced with calls to
      operator==. The only intentional functional change was to add the exe name
      as the first parameter to llvm::sys::ExecuteAndWait
      
      Author: Jamie Schmeiser <schmeise@ca.ibm.com>
      Reviewed By: aeubanks (Arthur Eubanks)
      Differential Revision: https://reviews.llvm.org/D110737
    • [AArch64] Split out processor/tuning features · 5ea35791
      David Sherwood authored
      Following on from an earlier patch that introduced support for -mtune
      for AArch64 backends, this patch splits out the tuning features
      from the processor features. This gives us the ability to enable
      architectural feature set A for a given processor with "-mcpu=A"
      and define the set of tuning features B with "-mtune=B".
      
      It's quite difficult to write a test that proves we select the
      right features according to the tuning attribute because most
      of these relate to scheduling. I have created a test here:
      
        CodeGen/AArch64/misched-fusion-addr-tune.ll
      
      that demonstrates the different scheduling choices based upon
      the tuning.
      
      Differential Revision: https://reviews.llvm.org/D111551
    • [AArch64] Always add -tune-cpu argument to -cc1 driver · 607fb1bb
      David Sherwood authored
      This patch ensures that we always tune for a given CPU on AArch64
      targets when the user specifies the "-mtune=xyz" flag. In the
      AArch64Subtarget if the tune flag is unset we use the CPU value
      instead.
      
      I've updated the release notes here:
      
        llvm/docs/ReleaseNotes.rst
      
      and added tests here:
      
        clang/test/Driver/aarch64-mtune.c
      
      Differential Revision: https://reviews.llvm.org/D110258
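
The fallback described in this commit message (use the CPU value when no tune flag is given) amounts to a one-line selection rule. The sketch below is a plain-Python illustration only; the function name and the CPU names `cpu-a` and `cpu-b` are hypothetical and do not come from AArch64Subtarget.

```python
from typing import Optional


def effective_tune_cpu(cpu: str, tune: Optional[str]) -> str:
    """Return the CPU to tune for: the -mtune value if set, else the -mcpu value."""
    # An unset or empty tune value falls back to the CPU value.
    return tune if tune else cpu


# Hypothetical CPU names, used only to exercise the rule.
only_cpu = effective_tune_cpu("cpu-a", None)
cpu_and_tune = effective_tune_cpu("cpu-a", "cpu-b")
```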
    • [ADT] Add APInt::isNegatedPowerOf2() helper · 71e39e3f
      Simon Pilgrim authored
      Inspired by D111968, provide an isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos.
      
      Differential Revision: https://reviews.llvm.org/D111998
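
The `(-Value).isPowerOf2()` pattern that the helper wraps can be modelled on fixed-width integers. The sketch below is plain Python and is not the APInt implementation; the function names and the explicit `bits` parameter are assumptions made for illustration.

```python
def is_power_of_2(value, bits):
    """True if the bits-wide pattern has exactly one bit set."""
    v = value & ((1 << bits) - 1)
    return v != 0 and (v & (v - 1)) == 0


def is_negated_power_of_2(value, bits):
    """True if the two's-complement negation of value is a power of two."""
    mask = (1 << bits) - 1
    return is_power_of_2((-value) & mask, bits)


# 0xF0 is -16 in 8 bits, and 16 is a power of two.
minus_sixteen = is_negated_power_of_2(0xF0, 8)
# 0x10 is +16; its negation 0xF0 has four bits set.
plus_sixteen = is_negated_power_of_2(0x10, 8)
```

In this model the minimum signed value (0x80 for 8 bits) also qualifies, because it is its own two's-complement negation and has a single bit set.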
    • [DebugInfo][InstrRef] Avoid un-necessary densemap copies and comparisons · 849b1794
      Jeremy Morse authored
      This is purely a performance patch: InstrRefBasedLDV used to use three
      DenseMaps to store variable values, two for long term storage and one as a
      working set. This patch eliminates the working set, and updates the long
      term storage in place, thus avoiding two DenseMap comparisons and two
      DenseMap assignments, which can be expensive.
      
      Differential Revision: https://reviews.llvm.org/D111716
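
The general trade-off between a working-set copy and an in-place update can be sketched with ordinary dictionaries standing in for DenseMaps. This is not the InstrRefBasedLDV code; both function names and the update shape are assumptions, and the sketch shows only one copy, one comparison, and one assignment rather than the exact counts in the commit message.

```python
def merge_with_working_set(stored, updates):
    """Copy the map, modify the copy, compare whole maps, then assign back."""
    working = dict(stored)            # whole-map copy
    working.update(updates)
    changed = working != stored       # whole-map comparison
    if changed:
        stored.clear()
        stored.update(working)        # whole-map assignment
    return changed


def merge_in_place(stored, updates):
    """Update long-term storage directly, tracking changes per key."""
    changed = False
    for key, value in updates.items():
        if key not in stored or stored[key] != value:
            stored[key] = value
            changed = True
    return changed


# Both strategies should reach the same final map and the same changed flag.
a = {"x": 1, "y": 2}
b = {"x": 1, "y": 2}
changed_a = merge_with_working_set(a, {"y": 3, "z": 4})
changed_b = merge_in_place(b, {"y": 3, "z": 4})
```

The in-place version touches only the keys that are updated, so its cost scales with the size of the update rather than the size of the stored map.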
    • [lldb] change name demangling to be consistent between windows and linux · 134e1817
      Lasse Folger authored
      When printing names in lldb on Windows, the names contain the full type information, while on Linux they contain only the name.
      
      This change introduces a flag in the Microsoft demangler to control whether the type information is included.
      With the flag enabled, the demangled name contains only the qualified name, e.g.:
      without flag -> with flag
      int (*array2d)[10] -> array2d
      int (*abc::array2d)[10] -> abc::array2d
      const int *x -> x
      
      For globals there is a second inconsistency which is not yet addressed by this change. On linux globals (in global namespace) are prefixed with :: while on windows they are not.
      
      Reviewed By: teemperor, rnk
      
      Differential Revision: https://reviews.llvm.org/D111715
    • [DebugInfo][NFC] Zero-initialize a class field · cf033bb2
      Jeremy Morse authored
      This field gets assigned when the relevant object starts being used; but it
      remains uninitialized beforehand. This risks introducing hard-to-detect
      bugs if something changes, so zero-initialize the field.
    • [JITLink][x86-64] Lift GOT, PLT table managers into x86_64.h; reuse for MachO. · cc3115cd
      Lang Hames authored
      This lifts the global offset table and procedure linkage table builders out of
      ELF_x86_64.h and into x86_64.h, renaming them with generic names
      x86_64::GOTTableBuilder and x86_64::PLTTableBuilder. MachO_x86_64.cpp is updated
      to use these classes instead of the older PerGraphGOTAndStubsBuilder tool.
    • [Support][ThinLTO] Move ThinLTO caching to LLVM Support library · e678c511
      Noah Shutty authored
      We would like to move ThinLTO’s battle-tested file caching mechanism to
      the LLVM Support library so that we can use it elsewhere in LLVM.
      
      Patch By: noajshu
      
      Differential Revision: https://reviews.llvm.org/D111371
    • [RISCV] Reorder the vector register allocation order. · facff468
      Hsiangkai Wang authored
      GPR uses argument registers as the first group of registers to allocate.
      This patch uses vector argument registers, v8 to v23, as the first group
      to allocate.
      
      Differential Revision: https://reviews.llvm.org/D111304
    • Simplify the TableManager class and move it into a public header. · bc03a9c0
      Lang Hames authored
      Moves visitEdge into the TableManager derivatives, replacing the fixEdgeKind
      methods in those classes. The visitEdge method takes on responsibility for
      updating the edge target, as well as its kind.
    • [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols · 0567f033
      Anshil Gandhi authored
      By default clang emits complete constructors as aliases of base constructors if they are the same.
      The backend is supposed to emit symbols for the aliases; otherwise, undefined symbols result.
      @yaxunl observed that this issue is related to the llvm options `-amdgpu-early-inline-all=true`
      and `-amdgpu-function-calls=false`. This issue is resolved by only inlining global values
      with internal linkage. The `getCalleeFunction()` in AMDGPUResourceUsageAnalysis also had
      to be extended to support aliases to functions. inline-calls.ll was corrected appropriately.
      
      Reviewed By: yaxunl, #amdgpu
      
      Differential Revision: https://reviews.llvm.org/D109707
  3. Oct 18, 2021