  1. Feb 22, 2021
  2. Feb 20, 2021
    • Nikita Popov's avatar
      [ConstantRange] Handle wrapping ranges in min/max (PR48643) · a852234f
      Nikita Popov authored
      When one of the inputs is a wrapping range, intersect with the
      union of the two inputs. The union of the two inputs corresponds
      to the result we would get if we treated the min/max as a simple
      select.
      
      This fixes PR48643.
      a852234f
    • Nikita Popov's avatar
      [ConstantRange] Handle wrapping range in binaryNot() · b6088f74
      Nikita Popov authored
      We don't need any special handling for wrapping ranges (or empty
      ranges for that matter). The sub() call will already compute a
      correct and precise range.
      
      We only need to adjust the test expectation: We're now computing
      an optimal result, rather than an unsigned envelope.
      b6088f74
    • Nikita Popov's avatar
      [ConstantRangeTest] Print detailed information on failure (NFC) · 5ec75c60
      Nikita Popov authored
      When the optimality check fails, print the inputs, the computed
      range and the better range that was found. This makes it much
      simpler to identify the cause of the failure.
      
      Make sure that full ranges (which, unlike all the other cases,
      have multiple ways to construct them that all result in the same
      range) only print one message by handling them separately.
      5ec75c60
    • Nikita Popov's avatar
      [ConstantRangeTest] Make exhaustive testing more principled (NFC) · 2b729548
      Nikita Popov authored
      The current infrastructure for exhaustive ConstantRange testing is
      somewhat confusing in what exactly it tests and currently cannot even
      be used for operations that produce smallest-size results, rather than
      signed/unsigned envelopes.
      
      This patch makes the testing more principled by collecting the exact
      set of results of an operation into a bit set and then comparing it
      against the range approximation by:
      
       * Checking conservative correctness: All elements in the set must be
         in the range.
       * Checking optimality under a given preference function: None of the
         (slack-free) ranges that can be constructed from the set are
         preferred over the computed range.
      
      Implemented preference functions are:
      
       * PreferSmallest: Smallest range regardless of signed/unsigned wrapping
         behavior. Probably what we would call "optimal" without further
         qualification.
       * PreferSmallestUnsigned/Signed: Smallest range that has no
         unsigned/signed wrapping. We use this if our calculation is precise
         only up to signed/unsigned envelope.
       * PreferSmallestNonFullUnsigned/Signed: Smallest range that has no
         unsigned/signed wrapping -- but preferring a smaller wrapping range
         over a (non-wrapping) full range. We use this if we have a fully
         precise calculation but apply a sign preference to the result
         (union/intersection). Even with a sign preference, returning a
         wrapping range is still "strictly better" than returning a full one.
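
      The two checks can be illustrated with a minimal Python sketch over a
      4-bit domain. This is an illustration under stated assumptions, not the
      code in ConstantRangeTest: the names `contains`, `size`,
      `smallest_range` and `check_result` are hypothetical, ranges are
      modeled as half-open wrapping pairs `(lo, hi)`, `lo == hi` is taken to
      mean the full range (empty ranges are not modeled), and only the
      PreferSmallest preference is shown.

```python
BITS = 4
MOD = 1 << BITS  # number of distinct values in the toy domain


def contains(rng, x):
    """Membership test for a half-open, possibly wrapping range (lo, hi)."""
    lo, hi = rng
    if lo == hi:  # assumption: lo == hi denotes the full range
        return True
    if lo < hi:
        return lo <= x < hi
    return x >= lo or x < hi  # wrapping range


def size(rng):
    """Number of elements covered by the range."""
    lo, hi = rng
    return MOD if lo == hi else (hi - lo) % MOD


def smallest_range(values):
    """Smallest slack-free range covering a non-empty set of values.

    Every candidate starts at one element of the set and ends just past
    its cyclic predecessor, so both endpoints touch the set.
    """
    vals = sorted(set(values))
    candidates = [(lo, (vals[i - 1] + 1) % MOD) for i, lo in enumerate(vals)]
    return min(candidates, key=size)


def check_result(exact_results, computed):
    """Return (conservatively correct, optimal under PreferSmallest)."""
    correct = all(contains(computed, v) for v in exact_results)
    optimal = size(computed) <= size(smallest_range(exact_results))
    return correct, optimal
```

      For the exact result set {14, 15, 0, 1}, the wrapping range (14, 2)
      passes both checks, while the full range is correct but not optimal.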
      
      This also addresses PR49273 by replacing the fragile manual range
      construction logic in testBinarySetOperationExhaustive() with generic
      code that isn't specialized to the particular form of ranges that set
      operations can produce.
      
      Differential Revision: https://reviews.llvm.org/D88356
      2b729548
  3. Feb 19, 2021
  4. Feb 18, 2021
    • Petr Hosek's avatar
      [Coverage] Store compilation dir separately in coverage mapping · 5fbd1a33
      Petr Hosek authored
      We currently always store absolute filenames in coverage mapping.  This
      is problematic for several reasons. It poses a problem for distributed
      compilation as source location might vary across machines.  We are also
      duplicating the path prefix potentially wasting space.
      
      This change modifies how we store filenames in coverage mapping. Rather
      than absolute paths, it stores the compilation directory and file paths
      as given to the compiler, either relative or absolute. Later when
      reading the coverage mapping information, we recombine relative paths
      with the working directory. This approach is similar to the handling
      of DW_AT_comp_dir in DWARF.
      
      Finally, we also provide a new option, -fprofile-compilation-dir akin
      to -fdebug-compilation-dir which can be used to manually override the
      compilation directory which is useful in distributed compilation cases.
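
      The recombination step on the reader side can be sketched as follows.
      This is a minimal Python illustration, not the coverage mapping
      reader itself: the function name `resolve_coverage_filename` is
      hypothetical, POSIX-style paths are assumed, and normalizing the
      joined path is an extra assumption made for readability.

```python
import posixpath


def resolve_coverage_filename(compilation_dir, filename):
    """Recombine a stored filename with the stored compilation directory."""
    if posixpath.isabs(filename):
        # Absolute paths were stored as given and need no recombination.
        return filename
    # Relative paths are interpreted relative to the compilation directory.
    # Normalization here is an assumption of this sketch.
    return posixpath.normpath(posixpath.join(compilation_dir, filename))
```

      With a compilation directory of /build/src, the stored path lib/a.c
      resolves to /build/src/lib/a.c, while /usr/include/stdio.h is left
      untouched.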
      
      Differential Revision: https://reviews.llvm.org/D95753
      5fbd1a33
    • Petr Hosek's avatar
      Revert "[Coverage] Store compilation dir separately in coverage mapping" · fbf8b957
      Petr Hosek authored
      This reverts commit 97ec8fa5 since
      the test is failing on some bots.
      fbf8b957
    • Petr Hosek's avatar
      [Coverage] Store compilation dir separately in coverage mapping · 97ec8fa5
      Petr Hosek authored
      We currently always store absolute filenames in coverage mapping.  This
      is problematic for several reasons. It poses a problem for distributed
      compilation as source location might vary across machines.  We are also
      duplicating the path prefix potentially wasting space.
      
      This change modifies how we store filenames in coverage mapping. Rather
      than absolute paths, it stores the compilation directory and file paths
      as given to the compiler, either relative or absolute. Later when
      reading the coverage mapping information, we recombine relative paths
      with the working directory. This approach is similar to the handling
      of DW_AT_comp_dir in DWARF.
      
      Finally, we also provide a new option, -fprofile-compilation-dir akin
      to -fdebug-compilation-dir which can be used to manually override the
      compilation directory which is useful in distributed compilation cases.
      
      Differential Revision: https://reviews.llvm.org/D95753
      97ec8fa5
    • Sam Powell's avatar
      [llvm][TextAPI] add equality operator for InterfaceFile · eb2eeeb7
      Sam Powell authored
      This patch adds the ability to compare `InterfaceFile`s for equality based on attributes specific to linking.
      
      Reviewed By: cishida, steven_wu
      
      Differential Revision: https://reviews.llvm.org/D96629
      eb2eeeb7
    • Djordje Todorovic's avatar
      Revert "[Debugify] Make the debugify aware of the original (-g) Debug Info" · c1e23894
      Djordje Todorovic authored
      This reverts rG8ee7c7e02953.
      One test is failing, I'll reland this as soon as possible.
      c1e23894
    • Djordje Todorovic's avatar
      [Debugify] Make the debugify aware of the original (-g) Debug Info · 8ee7c7e0
      Djordje Todorovic authored
      As discussed on the RFC [0], I am sharing the set of patches that
      enables checking of original Debug Info metadata preservation in
      optimizations. The proof-of-concept/proposal can be found at [1].
      
      The implementation from [1] was full of duplicated code,
      so this set of patches tries to merge this approach into the existing
      debugify utility.
      
      For example, the utility pass in the original-debuginfo-check
      mode could be invoked as follows:
      
        $ opt -verify-debuginfo-preserve -pass-to-test sample.ll
      
      Since this is a very early stage of the implementation,
      there is room for improvement, such as:
        - Add support for the new pass manager
        - Add support for metadata other than DILocations and DISubprograms
      
      [0] https://groups.google.com/forum/#!msg/llvm-dev/QOyF-38YPlE/G213uiuwCAAJ
      [1] https://github.com/djolertrk/llvm-di-checker
      
      Differential Revision: https://reviews.llvm.org/D82545
      8ee7c7e0
  5. Feb 17, 2021
  6. Feb 16, 2021
    • Sameer Sahasrabuddhe's avatar
      [NewPM] Introduce (GPU)DivergenceAnalysis in the new pass manager · 11bf7da6
      Sameer Sahasrabuddhe authored
      The GPUDivergenceAnalysis is now renamed to just "DivergenceAnalysis"
      since there is no conflict with LegacyDivergenceAnalysis. In the
      legacy PM, this analysis can only be used through the legacy DA
      serving as a wrapper. It is now made available as a pass in the new
      PM, and has no relation with the legacy DA.
      
      The new DA currently cannot handle irreducible control flow; its
      presence can cause the analysis to run indefinitely. The analysis is
      now modified to detect this and report all instructions in the
      function as divergent. This is super conservative, but allows the
      analysis to be used without hanging the compiler.
      
      Reviewed By: aeubanks
      
      Differential Revision: https://reviews.llvm.org/D96615
      11bf7da6
  7. Feb 15, 2021
    • Duncan P. N. Exon Smith's avatar
      TransformUtils: Fix metadata handling in CloneModule (and improve CloneFunctionInto) · 22a52dfd
      Duncan P. N. Exon Smith authored
      This commit fixes how metadata is handled in CloneModule to be sound,
      and improves how it's handled in CloneFunctionInto (although the latter
      is still awkward when called within a module).
      
      Ruiling Song pointed out in PR48841 that CloneModule was changed to
      unsoundly use the RF_ReuseAndMutateDistinctMDs flag (renamed in
      fa35c1f8 for clarity). This flag papered
      over a crash caused by other various changes made to CloneFunctionInto
      over the past few years that made it unsound to use cloning between
      different modules.
      
      (This commit partially addresses PR48841, fixing the repro from
      preprocessed source but not textual IR. MDNodeMapper::mapDistinctNode
      became unsound in df763188 and this
      commit does not address that regression.)
      
      RF_ReuseAndMutateDistinctMDs is designed for the IRMover to use,
      avoiding unnecessary clones of all referenced metadata when linking
      between modules (with IRMover, the source module is discarded after
      linking). It never makes sense to use when you're not discarding the
      source. This commit drops its incorrect use in CloneModule.
      
      Sadly, the right thing to do with metadata when cloning a function is
      complicated, and this patch doesn't totally fix it.
      
      The first problem is that there are two different types of referenceable
      metadata and it's not obvious what to do with one of them when remapping.
      
      - `!0 = !{!1}` is metadata's version of a constant. Programmatically it's
        called "uniqued" (probably a better term would be "constant") because,
        like `ConstantArray`, it's stored in uniquing tables. Once it's
        constructed, it's illegal to change its arguments.
      - `!0 = distinct !{!1}` is a bit closer to a global variable. It's legal
        to change the operands after construction.
      
      What should be done with distinct metadata when cloning functions within
      the same module?
      
      - Should new, cloned nodes be created?
      - Should all references point to the same, old nodes?
      
      The answer depends on whether that metadata is effectively owned by a
      function.
      
      And that's the second problem. Referenceable metadata's ownership model
      is not clear or explicit. Technically, it's all stored on an
      LLVMContext. However, any metadata that is `distinct`, that transitively
      references a `distinct` node, or that transitively references a
      GlobalValue is specific to a Module and is effectively owned by it. More
      specifically, some metadata is effectively owned by a specific Function
      within a module.
      
      Effectively function-local metadata was introduced somewhere around
      c10d0e5c, which made it illegal for two
      functions to share a DISubprogram attachment.
      
      When cloning a function within a module, you need to clone the
      function-local debug info and suppress cloning of global debug info (the
      status quo suppresses cloning some global debug info but not all). When
      cloning a function to a new/different module, you need to clone all of
      the debug info.
      
      Here's what I think we should do (eventually? soon? not this patch
      though):
      - Distinguish explicitly (somehow) between pure constant metadata owned
        by the LLVMContext, global metadata owned by the Module, and local
        metadata owned by a GlobalValue (such as a function).
      - Update CloneFunctionInto to trigger cloning of all "local" metadata
        (only), perhaps by adding a bit to RemapFlag. Alternatively, split
        out a separate function CloneFunctionMetadataInto to prime the
        metadata map that callers are updated to call ahead of time as
        appropriate.
      
      Here's the somewhat more isolated fix in this patch:
      - Converted the `ModuleLevelChanges` parameter to `CloneFunctionInto` to
        an enum called `CloneFunctionChangeType` that is one of
        LocalChangesOnly, GlobalChanges, DifferentModule, and ClonedModule.
      - The code maintaining the "functions uniquely own subprograms"
        invariant is now only active in the first two cases, where a function
        is being cloned within a single module. That's necessary because this
        code inhibits cloning of (some) "global" metadata that's effectively
        owned by the module.
      - The code maintaining the "all compile units must be explicitly
        referenced by !llvm.dbg.cu" invariant is now only active in the
        DifferentModule case, where a function is being cloned into a new
        module in isolation.
      - CoroSplit.cpp's call to CloneFunctionInto in CoroCloner::create
        uses LocalChangesOnly, since fa635d73
        only set `ModuleLevelChanges` to trigger cloning of local metadata.
      - CloneModule drops its unsound use of RF_ReuseAndMutateDistinctMDs
        and special handling of !llvm.dbg.cu.
      - Fixed some outdated header docs and left a couple of FIXMEs.
      
      Differential Revision: https://reviews.llvm.org/D96531
      22a52dfd
  8. Feb 14, 2021
  9. Feb 12, 2021
  10. Feb 11, 2021
    • Duncan P. N. Exon Smith's avatar
      ValueMapper: Rename RF_MoveDistinctMDs => RF_ReuseAndMutateDistinctMDs, NFC · fa35c1f8
      Duncan P. N. Exon Smith authored
      Rename the `RF_MoveDistinctMDs` flag passed into `MapValue` and
      `MapMetadata` to `RF_ReuseAndMutateDistinctMDs` in order to more
      precisely describe its effect and clarify the header documentation.
      
      Found this while helping to investigate PR48841, which pointed out an
      unsound use of the flag in `CloneModule()`. For now I've just added a
      FIXME there, but I'm hopeful that the new (more precise) name will
      prevent other similar errors.
      fa35c1f8
  11. Feb 10, 2021
  12. Feb 09, 2021
    • David Tenty's avatar
      [AIX][llvm][support] Implement getHostCPUName · 318ed901
      David Tenty authored
      We implement getHostCPUName() for AIX via systemcfg interfaces since access to the processor version register is a privileged operation. We return a value based on the current processor implementation mode.
      
      This fixes the cpu detection used by clang for `-mcpu=native`.
      
      Reviewed By: hubert.reinterpretcast
      
      Differential Revision: https://reviews.llvm.org/D95966
      318ed901
  13. Feb 05, 2021
  14. Feb 04, 2021
    • Christopher Tetreault's avatar
      Reland "Ensure that InstructionCost actually implements a total ordering" · b8b054aa
      Christopher Tetreault authored
      The operator< in the previous attempt was incorrect. It is unfortunate
      that this was only caught by the expensive checks.
      
      This reverts commit ff1147c3.
      b8b054aa
    • Paul Robinson's avatar
      144ca1e5
    • Joachim Meyer's avatar
      [Support] Indent multi-line descr of enum cli options. · e3f02302
      Joachim Meyer authored
      As noted in https://reviews.llvm.org/D93459, the formatting of
      multi-line descriptions of clEnumValN and the like is unfavorable.
      Thus this patch adds support for correctly indenting these.
      
      Reviewed By: serge-sans-paille
      
      Differential Revision: https://reviews.llvm.org/D93494
      e3f02302
    • wlei's avatar
      [CSSPGO][llvm-profgen] Compress recursive cycles in calling context · ac14bb14
      wlei authored
      This change compresses the context string by removing cycles caused by recursive functions during CS profile generation. Removing recursion cycles normalizes the calling context, which improves sample aggregation and also makes context promotion deterministic.
      Specifically, the implementation recognizes adjacent repeated frames as cycles and deduplicates them over multiple rounds of iteration.
      For example, consider an input context string stack:
      [“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
      The first iteration removes all adjacent repeated frames of size 1:
      [“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
      The second iteration removes all adjacent repeated frames of size 2:
      [“a”, “b”, “c”, “a”, “b”, “c”, “d”]
      The third iteration removes the adjacent repeated frames of size 3, so in the end we get the compressed output:
      [“a”, “b”, “c”, “d”]
      
      Compression is called in two places: once for the sample's context key right after unwinding, and once for the eventual context string id in the ProfileGenerator.
      Added a switch `compress-recursion` to control the maximum size of duplicated frames; the default of -1 means no size limit.
      Added unit tests and regression test for this.
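
      The rounds described above can be sketched in a few lines of Python.
      This is an illustration of the described idea, not the llvm-profgen
      implementation: the names `dedup_pass` and `compress_recursion` are
      hypothetical, and `max_size` only mirrors the described meaning of
      the `compress-recursion` switch (-1 for no size limit).

```python
def dedup_pass(frames, size):
    """Drop every window of `size` frames that repeats the `size` frames kept just before it."""
    out = []
    i = 0
    while i < len(frames):
        if len(out) >= size and frames[i:i + size] == out[-size:]:
            i += size  # adjacent repeat of the previous window: skip it
        else:
            out.append(frames[i])
            i += 1
    return out


def compress_recursion(frames, max_size=-1):
    """Run one pass per window size: 1, 2, 3, ... up to the optional limit."""
    result = list(frames)
    size = 1
    # A repeat of `size` frames needs at least 2 * size frames in total.
    while size <= len(result) // 2 and (max_size < 0 or size <= max_size):
        result = dedup_pass(result, size)
        size += 1
    return result
```

      Calling `compress_recursion(["a", "a", "b", "c", "a", "b", "c", "b", "c", "d"])`
      walks through the same intermediate stacks as the example in the
      message and ends with `["a", "b", "c", "d"]`.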
      
      Differential Revision: https://reviews.llvm.org/D93556
      ac14bb14
    • wlei's avatar
      Revert "[CSSPGO][llvm-profgen] Compress recursive cycles in calling context" · 6bccdcdb
      wlei authored
      This reverts commit 0609f257.
      6bccdcdb
    • wlei's avatar
      [CSSPGO][llvm-profgen] Compress recursive cycles in calling context · 0609f257
      wlei authored
      This change compresses the context string by removing cycles caused by recursive functions during CS profile generation. Removing recursion cycles normalizes the calling context, which improves sample aggregation and also makes context promotion deterministic.
      Specifically, the implementation recognizes adjacent repeated frames as cycles and deduplicates them over multiple rounds of iteration.
      For example, consider an input context string stack:
      [“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
      The first iteration removes all adjacent repeated frames of size 1:
      [“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
      The second iteration removes all adjacent repeated frames of size 2:
      [“a”, “b”, “c”, “a”, “b”, “c”, “d”]
      The third iteration removes the adjacent repeated frames of size 3, so in the end we get the compressed output:
      [“a”, “b”, “c”, “d”]
      
      Compression is called in two places: once for the sample's context key right after unwinding, and once for the eventual context string id in the ProfileGenerator.
      Added a switch `compress-recursion` to control the maximum size of duplicated frames; the default of -1 means no size limit.
      Added unit tests and regression test for this.
      
      Differential Revision: https://reviews.llvm.org/D93556
      0609f257
    • Michael Kruse's avatar
      [OpenMPIRBuilder] Implement collapseLoops. · 26b5be66
      Michael Kruse authored
      The collapseLoops method implements a transformation that facilitates the implementation of the collapse clause. It takes a list of loops from a loop nest and reduces it to a single loop that can be used by other methods that are implemented on just a single loop, such as createStaticWorkshareLoop.
      
      This patch shares some changes with D92974 (such as adding some getters to CanonicalLoopNest), used by both patches.
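
      The general idea of collapsing a loop nest can be sketched in Python.
      This is only an illustration, assuming canonical loops that count
      from 0 up to a known trip count; the commit message does not say how
      OpenMPIRBuilder derives the original induction variables, and the
      names `collapsed_trip_count` and `original_indices` are hypothetical.

```python
from math import prod


def collapsed_trip_count(trip_counts):
    """Trip count of the single loop that replaces the whole nest."""
    return prod(trip_counts)


def original_indices(trip_counts, collapsed_iv):
    """Recover the per-loop indices (outermost first) from the collapsed index."""
    indices = []
    for tc in reversed(trip_counts):  # the innermost loop varies fastest
        indices.append(collapsed_iv % tc)
        collapsed_iv //= tc
    return tuple(reversed(indices))
```

      For trip counts (2, 3), the collapsed loop runs 6 iterations and
      visits (0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), the same
      order as the original nest.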
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D93268
      26b5be66
  15. Feb 03, 2021
  16. Feb 02, 2021
    • Richard Smith's avatar
      Diagnose if a SLEB128 is too large to fit in an int64_t. · 32e98f05
      Richard Smith authored
      Previously we'd hit UB due to an invalid left shift operand.
      
      Also fix the WASM emitter to properly use SLEB128 encoding instead of
      ULEB128 encoding for signed fields so that negative numbers don't
      result in overly-large values that we can't read back any more.
      
      In passing, don't diagnose a non-canonical ULEB128 that fits in a uint64_t but
      has redundant trailing zero bytes.
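
      A minimal Python sketch of SLEB128 decoding with an explicit int64
      range check follows. It is not the LLVM decoder: the function names
      and error messages are hypothetical, and because Python integers are
      unbounded the overflow is detected by a range comparison rather than
      by guarding a shift.

```python
INT64_MIN = -(1 << 63)
INT64_MAX = (1 << 63) - 1


def encode_sleb128(value):
    """Encode a signed integer as SLEB128 bytes."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: keeps the sign
        # Stop once the remaining bits are pure sign extension of this byte.
        done = (value == 0 and not byte & 0x40) or (value == -1 and byte & 0x40)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def decode_sleb128(data):
    """Return (value, bytes_consumed); raise ValueError on bad input."""
    result = 0
    shift = 0
    for i, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # last byte of the encoding
            if byte & 0x40:  # sign bit set: sign-extend
                result -= 1 << shift
            if not INT64_MIN <= result <= INT64_MAX:
                raise ValueError("SLEB128 value does not fit in int64")
            return result, i + 1
    raise ValueError("SLEB128 value extends past end of data")
```

      Round-tripping a negative number such as -123456 yields the three
      bytes C0 BB 78, whereas encoding it as an unsigned quantity would
      produce a much larger value on read-back, which is the problem the
      WASM emitter fix avoids.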
      
      Reviewed By: dblaikie, aardappel
      
      Differential Revision: https://reviews.llvm.org/D95510
      32e98f05
    • Christopher Tetreault's avatar
      Ensure that InstructionCost actually implements a total ordering · b481cd51
      Christopher Tetreault authored
      Previously, operator== would consider the actual equality of the pairs
      (lhs.Value, lhs.State) == (rhs.Value, rhs.State). However, if an invalid
      cost was involved in a call to operator<, only the state would be
      compared. Thus, it was not the case that ({2, Invalid} < {3, Invalid} ||
      {2, Invalid} > {3, Invalid} || {2, Invalid} == {3, Invalid}).
      
      This patch implements a true total ordering, where cost state is
      considered first, then value. While it's not really important that
      {2, Invalid} be considered to be less than {3, Invalid}, it's not a
      problem either. This patch also implements operator== in terms of
      operator<, so the two definitions will be kept in sync.
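
      The ordering described above can be sketched in Python. This is an
      illustration, not the InstructionCost class: the names `Cost` and
      `State` are hypothetical, and sorting valid costs before invalid
      ones is an assumption (the message only says that state is compared
      first, then value).

```python
from dataclasses import dataclass
from enum import IntEnum


class State(IntEnum):
    VALID = 0
    INVALID = 1  # assumption: invalid costs sort after valid ones


@dataclass(eq=False)
class Cost:
    value: int
    state: State = State.VALID

    def __lt__(self, other):
        if not isinstance(other, Cost):
            return NotImplemented
        # Lexicographic comparison: state first, then value.
        return (self.state, self.value) < (other.state, other.value)

    def __eq__(self, other):
        if not isinstance(other, Cost):
            return NotImplemented
        # Defined in terms of "<" so the two can never disagree.
        return not (self < other) and not (other < self)
```

      With this definition, exactly one of `a < b`, `b < a` and `a == b`
      holds for `Cost(2, State.INVALID)` and `Cost(3, State.INVALID)`.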
      
      Reviewed By: sdesmalen
      
      Differential Revision: https://reviews.llvm.org/D95803
      b481cd51
    • Nathan Hawes's avatar
      [VFS] Add support to RedirectingFileSystem for mapping a virtual directory to... · ecb00a77
      Nathan Hawes authored
      [VFS] Add support to RedirectingFileSystem for mapping a virtual directory to one in the external FS.
      
      Previously file entries in the -ivfsoverlay yaml could map to a file in the
      external file system, but directories had to list their contents in the form of
      other file entries or directories. Allowing directory entries to map to a
      directory in the external file system makes it possible to present an external
      directory's contents in a different location and (in combination with the
      'fallthrough' option) overlay one directory's contents on top of another.
      
      rdar://problem/72485443
      Differential Revision: https://reviews.llvm.org/D94844
      ecb00a77
  17. Feb 01, 2021
    • Serge Pavlov's avatar
      [FPEnv] Intrinsic for setting rounding mode · bf416d16
      Serge Pavlov authored
      To set a non-default rounding mode, users usually call the function
      'fesetround' from the standard C library. This approach has some
      disadvantages.
      
      * It creates an unnecessary dependency on libc. On the other hand,
        setting the rounding mode requires only a few instructions and could
        be done by the compiler. Sometimes the standard C library is not even
        available, as in the case of GPU or AI cores that execute small
        kernels.
      * The compiler could generate more efficient code if it knows that a
        particular call just sets the rounding mode.
      
      This change introduces a new IR intrinsic, namely 'llvm.set.rounding',
      which sets the current rounding mode, similar to 'fesetround'. However,
      it differs from the latter because it is a lower-level facility:
      
      * 'llvm.set.rounding' does not return any value, whereas 'fesetround'
        returns a non-zero value in case of failure. In glibc, 'fesetround'
        reports failure if its argument is invalid or unsupported, or if
        floating-point operations are unavailable on the hardware. The
        compiler usually knows what core it generates code for, and it can
        validate arguments in many cases.
      * The rounding mode is specified in 'fesetround' using constants like
        'FE_TONEAREST', which are target-dependent. It is inconvenient to
        work with such constants at the IR level.
      
      The C standard provides a target-independent way to specify the
      rounding mode, which is used in FLT_ROUNDS; however, it does not define
      a standard way to set the rounding mode using this encoding.
      
      This change implements only the IR intrinsic. Lowering it to machine
      code is target-specific and will be implemented later. Mapping of 'fesetround'
      to 'llvm.set.rounding' is also not implemented here.
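
      The difference between the two encodings can be illustrated with a
      small Python sketch. The FLT_ROUNDS values are those of the C
      standard; the FE_* values are the ones used by glibc on x86 and
      differ on other targets. The names `FltRounds` and
      `x86_fe_to_flt_rounds` are hypothetical, and the sketch assumes that
      'llvm.set.rounding' takes its argument in the FLT_ROUNDS encoding,
      which the message suggests but does not state outright.

```python
from enum import IntEnum


class FltRounds(IntEnum):
    """Target-independent rounding mode encoding used by FLT_ROUNDS."""
    TOWARD_ZERO = 0
    TO_NEAREST = 1
    UPWARD = 2    # toward positive infinity
    DOWNWARD = 3  # toward negative infinity


# Target-dependent fesetround arguments as defined by glibc for x86.
X86_FE_TO_FLT_ROUNDS = {
    0x000: FltRounds.TO_NEAREST,   # FE_TONEAREST
    0x400: FltRounds.DOWNWARD,     # FE_DOWNWARD
    0x800: FltRounds.UPWARD,       # FE_UPWARD
    0xC00: FltRounds.TOWARD_ZERO,  # FE_TOWARDZERO
}


def x86_fe_to_flt_rounds(fe_mode):
    """Translate an x86 FE_* constant; None means the argument is invalid."""
    return X86_FE_TO_FLT_ROUNDS.get(fe_mode)
```

      A constant argument can be validated and translated at compile time
      in this way, which is the kind of checking the message expects the
      compiler to perform in many cases.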
      
      Differential Revision: https://reviews.llvm.org/D74729
      bf416d16