  1. Jan 17, 2021
    • mydeveloperday's avatar
      [clang-format] Revert e9e6e3b3 · 9af03864
      mydeveloperday authored
      Reverting {D92753} due to issues with #pragma indentation inside #ifdef/#endif blocks.
      9af03864
    • Nikita Popov's avatar
      Reapply [BasicAA] Handle recursive queries more efficiently · 0b84afa5
      Nikita Popov authored
      There are no changes relative to the original commit. However, an issue
      this exposed in BasicAA assumption tracking has been fixed in the
      previous commit.
      
      -----
      
      An alias query currently works out roughly like this:
      
       * Look up location pair in cache.
       * Perform BasicAA logic (including cache lookup and insertion...)
       * Perform a recursive query using BestAAResults.
         * Look up location pair in cache (and thus do not recurse into BasicAA)
         * Query all the other AA providers.
       * Query all the other AA providers.
      
      This is a lot of unnecessary work, all ultimately caused by the
      BestAAResults query at the end of aliasCheck(). We perform that query
      because aliasCheck() is called recursively, and we of course want
      those recursive queries to also make use of other AA providers, not
      just BasicAA. We can solve this by making the recursive queries
      directly use BestAAResults (which will check both BasicAA and other
      providers), rather than recursing into aliasCheck().
      
      There are some tradeoffs:
      
       * We can no longer pass through the precomputed underlying object
         to aliasCheck(). This is not a major concern, because nowadays
         getUnderlyingObject() is quite cheap.
       * Results from other AA providers are no longer cached inside
         BasicAA. The way this worked was already a bit iffy, in that a
         result could be cached, but if it was MayAlias, we'd still end
         up re-querying other providers anyway. If we want to cache
         non-BasicAA results, we should do that in a more principled manner.
      
      In any case, despite those tradeoffs, this works out to be a decent
      compile-time improvement. I think it also simplifies the mental model
      of how BasicAA works. It took me quite a while to fully understand
      how these things interact.
      
      Differential Revision: https://reviews.llvm.org/D90094
      0b84afa5
    • Nikita Popov's avatar
      [BasicAA] Move assumption tracking into AAQI · b1c2f128
      Nikita Popov authored
      D91936 placed the tracking for the assumptions into BasicAA.
      However, when recursing over phis, we may use fresh AAQI instances.
      In this case AssumptionBasedResults from an inner AAQI can result
      in the removal of an element from the outer AAQI.
      
      To avoid this, move the tracking into AAQI. This generally makes
      more sense, as the NoAlias assumptions themselves are also stored
      in AAQI.
      
      The test case only produces an assertion failure with D90094
      reapplied. I think the issue exists independently of that change
      as well, but I wasn't able to come up with a reproducer.
      b1c2f128
    • Fangrui Song's avatar
      3809f4eb
    • Kazushi (Jam) Marukawa's avatar
      [VE] Support VE in libunwind · 3cbd476c
      Kazushi (Jam) Marukawa authored
      Modify libunwind to support SjLj exception handling routines for VE.
      In order to do that, we need to implement not only SjLj exception
      handling routines but also a Registers_ve class.  This implementation
      of Registers_ve is incomplete.  We will work on it later when we need
      backtrace in libunwind.
      
      Reviewed By: #libunwind, compnerd
      
      Differential Revision: https://reviews.llvm.org/D94591
      3cbd476c
    • Craig Topper's avatar
      [RISCV] Remove an extra map lookup from RISCVCompressInstEmitter. NFC · 061f681c
      Craig Topper authored
      When we looked up the map to see if the entry already existed,
      this created the new entry for us. So save a reference to it so
      we can use it to update the entry instead of looking it up again.
      
      Also remove unnecessary StringRef constructors around string
      literals on calls to this function.
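
      The single-lookup pattern described above can be sketched with a
      standard container. This is only an illustration: std::map stands in
      for the map used by the emitter, and getOrAssignIndex and its 1-based
      numbering are hypothetical names and choices, not the code from this
      commit.

      ```cpp
      #include <cassert>
      #include <map>
      #include <string>

      // Hypothetical helper: returns the index assigned to Name, assigning the
      // next free 1-based index the first time a name is seen.
      unsigned getOrAssignIndex(std::map<std::string, unsigned> &Table,
                                const std::string &Name) {
        // operator[] value-initializes the mapped value (0) when the key is
        // missing, so this one lookup both tests for existence and creates
        // the entry.
        unsigned &Entry = Table[Name];
        if (Entry == 0)
          // The new entry is already counted in size(); reuse the saved
          // reference instead of looking the key up a second time.
          Entry = static_cast<unsigned>(Table.size());
        return Entry;
      }
      ```

      Calling the helper twice with the same name returns the same index and
      leaves the map size unchanged.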
      061f681c
    • Craig Topper's avatar
      [RISCV] Few more minor cleanups to RISCVCompressInstEmitter. NFC · 1327c730
      Craig Topper authored
      -Use StringRef instead of std::string.
      -Const correct a parameter.
      -Don't call StringRef::data() before printing. Just pass the StringRef.
      1327c730
    • Craig Topper's avatar
      [RISCV] Simplify mergeCondAndCode in RISCVCompressInstEmitter.cpp. NFC · 2b6a9262
      Craig Topper authored
      Instead of forming a std::string and returning it to pass into another
      raw_ostream, just pass the raw_ostream as a parameter.
      
      Take StringRef as arguments instead of raw_string_ostream references,
      making the caller responsible for converting to strings. Use
      StringRef operations instead of std::string::substr.
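
      A rough sketch of the two styles, assuming std::ostream as a stand-in
      for raw_ostream so the example builds without LLVM. The function names
      and the emitted text are hypothetical and do not reproduce the real
      mergeCondAndCode.

      ```cpp
      #include <cassert>
      #include <ostream>
      #include <sstream>
      #include <string>

      // Old style (hypothetical): build a temporary string and return it so
      // the caller can stream it into yet another stream.
      std::string mergeToString(const std::string &Cond, const std::string &Code) {
        std::ostringstream Tmp;
        Tmp << "if (" << Cond << ") {\n" << Code << "}\n";
        return Tmp.str();
      }

      // New style (hypothetical): the destination stream is a parameter, so
      // the pieces are written directly and no intermediate string is formed.
      void mergeToStream(std::ostream &OS, const std::string &Cond,
                         const std::string &Code) {
        OS << "if (" << Cond << ") {\n" << Code << "}\n";
      }
      ```

      Both functions produce the same text; the second one avoids the
      temporary.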
      2b6a9262
    • Craig Topper's avatar
      [RISCV] Remove unneeded StringRef to std::string conversions in RISCVCompressInstEmitter. NFC · 633c5afc
      Craig Topper authored
      Stop concatenating std::string before streaming into a raw_ostream.
      Just stream the pieces.
      
      Remove some newlines from asserts. Remove std::string concatenation
      from an assert. Assert strings aren't really evaluated like this at
      runtime; an assertion failure will just print exactly what's between
      the parentheses in the source.
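
      The point about assert text can be illustrated with the preprocessor's
      stringification operator, which is how assert implementations typically
      capture their argument. SOURCE_TEXT is a hypothetical macro written for
      this sketch, not something from the LLVM tree.

      ```cpp
      #include <cassert>
      #include <cstring>

      // Hypothetical macro: # turns the argument into its source text without
      // evaluating it, which is the text an assertion failure would print.
      #define SOURCE_TEXT(Expr) #Expr

      // The concatenation is never executed; only its spelling is kept.
      const char *concatenatedMessage() {
        return SOURCE_TEXT(std::string("a") + "b");
      }

      // The usual idiom: a plain string literal joined to the condition.
      const char *literalMessage() {
        return SOURCE_TEXT(Ok && "bad operand");
      }
      ```

      Since the message is taken from the source spelling, building it with
      std::string operations adds nothing to the diagnostic.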
      633c5afc
    • Fangrui Song's avatar
      [X86] Default to -x86-pad-for-align=false to drop assembler difference with or w/o -g · a048ce13
      Fangrui Song authored
      Fix PR48742: the D75203 assembler optimization locates MCRelaxableFragments
      between two MCSymbols and relaxes some of them to reduce the size of an
      MCAlignFragment.  A -g build has more MCSymbols and therefore may have
      different assembler output (e.g. an MCRelaxableFragment (jmp) may have 5 bytes
      with -O1 but 2 bytes with -O1 -g).
      
      `.p2align 4, 0x90` is common due to loops. For a larger program with a
      lot of temporary labels, a difference in assembly output is almost
      inevitable. The cost seems to outweigh the benefits, so we default to
      -x86-pad-for-align=false until the heuristic is improved.
      
      Reviewed By: skan
      
      Differential Revision: https://reviews.llvm.org/D94542
      a048ce13
  2. Jan 16, 2021
    • Nikita Popov's avatar
      [InstCombine] Replace one-use select operand based on condition · 5238e7b3
      Nikita Popov authored
      InstCombine already performs a fold where X == Y ? f(X) : Z is
      transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However,
      if f(X) only has one use, then we can always directly replace the
      use inside the instruction. To actually be profitable, limit it to
      the case where Y is a non-expr constant.
      
      This could be further extended to replace uses further up a one-use
      instruction chain, but for now this only looks one level up.
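
      As a source-level illustration of the fold (not the InstCombine code
      itself), take the hypothetical f(X) = X * 2 + 1 and the constant Y = 7.
      Inside the true arm X is known to equal 7, so the use of X can be
      replaced and the arm constant-folds.

      ```cpp
      #include <cassert>

      // Original form: X == 7 ? f(X) : Z, with the hypothetical f(X) = X * 2 + 1.
      int selectOriginal(int X, int Z) { return X == 7 ? X * 2 + 1 : Z; }

      // Same select after substituting the constant 7 for X in the true arm;
      // f(7) = 7 * 2 + 1 then folds to 15.
      int selectReplaced(int X, int Z) { return X == 7 ? 15 : Z; }

      // Checks that both forms agree on a small range of inputs.
      bool formsAgree() {
        for (int X = -16; X <= 16; ++X)
          for (int Z = -2; Z <= 2; ++Z)
            if (selectOriginal(X, Z) != selectReplaced(X, Z))
              return false;
        return true;
      }
      ```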
      
      Among other things, this also subsumes D94860.
      
      Differential Revision: https://reviews.llvm.org/D94862
      5238e7b3
    • Roman Lebedev's avatar
      [SimplifyCFG] markAliveBlocks(): catchswitch: preserve PostDomTree · 32fc3231
      Roman Lebedev authored
      When removing catchpads from a catchswitch, if that removes a successor,
      we need to record that in DomTreeUpdater.
      
      This fixes a PostDomTree preservation failure in an existing test.
      This appears to be the only such issue that I see in my current test coverage.
      32fc3231
    • David Green's avatar
      [ARM] Align blocks that are not fallthrough targets · 14547242
      David Green authored
      If the previous block in a function does not fall through, the nops
      added to align a block will never be executed. This means we can freely
      (except for codesize) align more branches. This happens in
      constantislandspass (as it cannot happen later) and only happens at
      aggressive optimization levels as it does increase codesize.
      
      Differential Revision: https://reviews.llvm.org/D94394
      14547242
    • David Green's avatar
      [ARM] Test for aligned blocks. NFC · 2a5b576e
      David Green authored
      2a5b576e
    • Dávid Bolvanský's avatar
      [NFC] Removed extra text in comments · bfd75bdf
      Dávid Bolvanský authored
      bfd75bdf
    • Aart Bik's avatar
      [mlir][sparse] improved sparse runtime support library · d8fc2730
      Aart Bik authored
      Added the ability to read (an extended version of) the FROSTT
      file format, so that we can now read in sparse tensors of arbitrary
      rank. Generalized the API to deal with more than two dimensions.
      
      Also added the ability to sort the indices of sparse tensors
      lexicographically. This is an important step towards supporting
      auto gen of initialization code, since sparse storage formats
      are easier to initialize if the indices are sorted. Since most
      external formats don't enforce such properties, it is convenient
      to have this ability in our runtime support library.
      
      Lastly, the re-entrant problem of the original implementation
      is fixed by passing an opaque object around (rather than having
      a single static variable, ugh!).
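
      A minimal sketch of the lexicographic index sort, assuming a
      hypothetical in-memory Element type that holds one coordinate tuple and
      its value. It does not mirror the data structures of the actual runtime
      support library.

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstdint>
      #include <vector>

      // Hypothetical representation of one stored entry of a tensor of
      // arbitrary rank: its indices (one per dimension) and its value.
      struct Element {
        std::vector<uint64_t> Indices;
        double Value;
      };

      // Sorts elements so that their index tuples are in lexicographic order.
      void sortLexicographically(std::vector<Element> &Elements) {
        std::sort(Elements.begin(), Elements.end(),
                  [](const Element &A, const Element &B) {
                    // operator< on std::vector compares element by element,
                    // which is exactly a lexicographic comparison.
                    return A.Indices < B.Indices;
                  });
      }
      ```

      After sorting, an entry at (0, 1, 0) precedes one at (0, 1, 2), which
      in turn precedes one at (2, 0, 1).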
      
      Reviewed By: nicolasvasilache
      
      Differential Revision: https://reviews.llvm.org/D94852
      d8fc2730
    • Shilei Tian's avatar
      [OpenMP] Added the support for hidden helper task in RTL · ed939f85
      Shilei Tian authored
      The basic design is to create an outermost parallel team. It is not a regular team because it is created only when the first hidden helper task is encountered, and it is responsible only for executing hidden helper tasks.  We first use `pthread_create` to create a new thread; let's call it the initial thread, which is also the main thread of the hidden helper team. This initial thread then initializes a new root, just as the RTL does during initialization. After that, it directly calls `__kmpc_fork_call`, as if the initial thread had encountered a parallel region. In the wrapped function for this team, the main thread (the initial thread that we create via `pthread_create` on Linux) waits on a condition variable, which can only be signaled when the RTL is being destroyed. The other worker threads do nothing. The main thread needs to wait there because, in the current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team, which is not what we want.
      
      Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also provided to configure the number of threads and to enable or disable this feature. By default, the number of hidden helper threads is 8.
      
      Here are some open issues to be discussed:
      1. The main thread goes to sleep once initialization is finished. As Andrey mentioned, we might need it to be woken from time to time to do some work. What kind of update/check should be put here?
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D77609
      ed939f85
    • Sanjay Patel's avatar
      [SLP] remove opcode field from reduction data class · 49b96cd9
      Sanjay Patel authored
      This is NFC-intended and another step towards supporting
      intrinsics as reduction candidates.
      
      The remaining bits of the OperationData class do not make
      much sense as-is, so I will try to improve that, but I'm
      trying to take minimal steps because it's still not clear
      how this was intended to work.
      49b96cd9
    • Sanjay Patel's avatar
      [SLP] fix typos; NFC · fcfcc3cc
      Sanjay Patel authored
      fcfcc3cc
    • Sanjay Patel's avatar
      [SLP] remove unnecessary use of 'OperationData' · 48dbac5b
      Sanjay Patel authored
      This is another NFC-intended patch to allow matching
      intrinsics (example: maxnum) as candidates for reductions.
      
      It's possible that the loop/if logic can be reduced now,
      but it's still difficult to understand how this all works.
      48dbac5b
    • David Green's avatar
      [ARM] Add low overhead loops terminators to AnalyzeBranch · 372eb2bb
      David Green authored
      This treats low overhead loop branches the same as jump tables and
      indirect branches in analyzeBranch - they cannot be analyzed but the
      direct branches on the end of the block may be removed. This helps
      remove the unnecessary branches earlier, which can help produce better
      codegen (and change block layout in a number of cases).
      
      Differential Revision: https://reviews.llvm.org/D94392
      372eb2bb
    • David Green's avatar
      [ARM] Remove LLC tests from transform/hardware loop tests. · c1ab698d
      David Green authored
      We now have a lot of llc tests for hardware loops in CodeGen, which test
      a larger variety of loops and are easier to maintain. This removes the
      llc runs from mixed llc/opt tests.
      c1ab698d
    • Dávid Bolvanský's avatar
      416854d0
    • Kazu Hirata's avatar
      [llvm] Use *::empty (NFC) · 2082b10d
      Kazu Hirata authored
      2082b10d
    • Kazu Hirata's avatar
      19aacdb7
    • Kazu Hirata's avatar
      [StringExtras] Fix comment typos (NFC) · ba0fc7e1
      Kazu Hirata authored
      ba0fc7e1
    • Florian Hahn's avatar
      [LTO] Remove options to disable inlining, vectorization & GVNLoadPRE. · bca16e2f
      Florian Hahn authored
      This patch removes some ancient options as a clean-up before moving
      code-gen to use LTOBackend in D94487.
      
      I think it would be preferable to remove those ancient options, because
      
        1. There are no corresponding options in LTOBackend based tools,
        2. There are no unit tests for them,
        3. They are not passed through by Clang,
        4. At least for GVNLoadPRE, users could just use GVN's `enable-load-pre`.
      
      Alternatively we could add support for those options to lto::Config &
      co, but I think it would be better to remove them, unless they are
      actually used in practice.
      
      Reviewed By: steven_wu, tejohnson
      
      Differential Revision: https://reviews.llvm.org/D94783
      bca16e2f
    • Hsiangkai Wang's avatar
      [RISCV] Correct alignment settings for vector registers. · 098dbf19
      Hsiangkai Wang authored
      According to "9. Vector Memory Alignment Constraints" in the V
      specification, vector memory accesses must be aligned to the size of
      the element. Our current implementation supports ELEN up to 64, so
      under that assumption we can take the alignment of vector registers
      to be 64.
      
      Differential Revision: https://reviews.llvm.org/D94751
      098dbf19
    • Dávid Bolvanský's avatar
      a4e2a514
    • James Player's avatar
      Fix llvm::Optional build breaks in MSVC using std::is_trivially_copyable · 25c1578a
      James Player authored
      Current code breaks this version of MSVC due to a mismatch between `std::is_trivially_copyable` and `llvm::is_trivially_copyable` for `std::pair` instantiations.  Hence I was attempting to use `std::is_trivially_copyable` to set `llvm::is_trivially_copyable<T>::value`.
      
      I spent some time root causing an `llvm::Optional` build error on MSVC 16.8.3 related to the change described above:
      
      ```
      62>C:\src\ocg_llvm\llvm-project\llvm\include\llvm/ADT/BreadthFirstIterator.h(96,12): error C2280: 'llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>>::operator =(const llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &)': attempting to reference a deleted function (compiling source file C:\src\ocg_llvm\llvm-project\llvm\unittests\ADT\BreadthFirstIteratorTest.cpp)
      ...
      ```
      The "trivial" specialization of `optional_detail::OptionalStorage` assumes that the value type is trivially copy constructible and trivially copy assignable. The specialization is invoked based on a check of `is_trivially_copyable` alone, which does not imply both `is_trivially_copy_assignable` and `is_trivially_copy_constructible` are true.
      
      [[ https://en.cppreference.com/w/cpp/named_req/TriviallyCopyable | According to the spec ]], a deleted assignment operator does not make `is_trivially_copyable` false. So I think all these properties need to be checked explicitly in order to specialize `OptionalStorage` to the "trivial" version:
      ```
      /// Storage for any type.
      template <typename T, bool = std::is_trivially_copy_constructible<T>::value
                                && std::is_trivially_copy_assignable<T>::value>
      class OptionalStorage {
      ```
      The above fixed my build break in MSVC, but I think we need to explicitly check `is_trivially_copy_constructible` too, since the copy constructor might be deleted.  It would also be ideal to move over to `std::is_trivially_copyable` instead of the `llvm` namespace version.
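
      The gap between the traits can be reproduced with a small type. This is
      a sketch under stated assumptions: NoAssign is a hypothetical type made
      up for illustration, and the constants below only mimic the proposed
      selection condition; they are not the `OptionalStorage` code.

      ```cpp
      #include <cassert>
      #include <type_traits>

      // Hypothetical type: trivial copy constructor, deleted copy assignment.
      struct NoAssign {
        int Value;
        NoAssign(const NoAssign &) = default;
        NoAssign &operator=(const NoAssign &) = delete;
      };

      // A deleted assignment operator does not stop the type from being
      // trivially copyable, so this trait alone is too weak a check.
      constexpr bool IsCopyable = std::is_trivially_copyable<NoAssign>::value;

      // The two properties the "trivial" storage actually relies on.
      constexpr bool IsCopyConstructible =
          std::is_trivially_copy_constructible<NoAssign>::value;
      constexpr bool IsCopyAssignable =
          std::is_trivially_copy_assignable<NoAssign>::value;

      // Mirrors the shape of the proposed default template argument.
      constexpr bool UsesTrivialStorage = IsCopyConstructible && IsCopyAssignable;
      ```

      With the combined check, a type like this would fall back to the
      general storage instead of the "trivial" specialization.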
      
      Reviewed By: dblaikie
      
      Differential Revision: https://reviews.llvm.org/D93510
      25c1578a
    • Stephen Kelly's avatar
      b765eaf9
    • Stephen Kelly's avatar
      [ASTMatchers] Add binaryOperation matcher · e810e95e
      Stephen Kelly authored
      This is a simple utility which allows matching on binaryOperator and
      cxxOperatorCallExpr. It can also be extended to support
      cxxRewrittenBinaryOperator.
      
      Add generic support for MapAnyOfMatchers to auto-marshalling functions.
      
      Differential Revision: https://reviews.llvm.org/D94129
      e810e95e
    • Bjorn Pettersson's avatar
      [LegalizeDAG] Handle NeedInvert when expanding BR_CC · 4f155567
      Bjorn Pettersson authored
      This is a follow-up fix to commit 03c8d6a0.
      Seems like we now end up with NeedInvert being set in the result
      from LegalizeSetCCCondCode more often than in the past, so we
      need to handle NeedInvert when expanding BR_CC.
      
      Not sure how to deal with the "Tmp4.getNode()" case properly,
      but current assumption is that that code path isn't impacted
      by the changes in 03c8d6a0 so we can simply move
      the old assert into the if-branch and only handle NeedInvert in the
      else-branch.
      
      I think that the test case added here, for PowerPC, might have
      failed also before commit 03c8d6a0. But we started
      to hit the assert more often downstream when having merged that
      commit.
      
      Reviewed By: craig.topper
      
      Differential Revision: https://reviews.llvm.org/D94762
      4f155567
    • Stephen Kelly's avatar
      [ASTMatchers] Make cxxOperatorCallExpr matchers API-compatible with n-ary operators · dbe056c2
      Stephen Kelly authored
      This makes them composable with mapAnyOf().
      
      Differential Revision: https://reviews.llvm.org/D94128
      dbe056c2
    • Stephen Kelly's avatar
      [ASTMatchers] Add mapAnyOf matcher · a7101450
      Stephen Kelly authored
      Make it possible to compose a matcher for different base nodes.
      
      This accepts one or more node matcher functors and zero or more
      matchers, composing the latter into the former.
      
      This allows composing of matchers where the same inner matcher name is
      used for the same concept, but with a different node functor. Currently,
      there is a limitation that the nodes must be in the same "clade", so
      while
      
        mapAnyOf(ifStmt, forStmt).with(hasBody(stmt()))
      
      can be used, functionDecl cannot be added to the tuple.
      
      It is possible to use this in clang-query, but it will require changes
      to the QueryParser, so is deferred to a future review.
      
      Differential Revision: https://reviews.llvm.org/D94127
      a7101450