  1. Jun 06, 2021
  2. Jun 04, 2021
  3. Jun 02, 2021
  4. May 29, 2021
  5. May 28, 2021
  6. May 26, 2021
    • [LTT] Handle merged llvm.assume when dropping type tests · d35fe04f
      Teresa Johnson authored
      When the lower type test pass is invoked a second time with
      DropTypeTests set to true, it expects that all remaining type tests feed
      assume instructions, which are removed along with the type tests.
      
      In some cases the llvm.assume might have been merged with another one,
      i.e. from a builtin_assume instruction, in which case the type test
      would actually feed a phi that in turn feeds the merged assume
      instruction. In this case we can simply replace that operand of the phi
      with "true" before removing the type test.
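
The phi handling described above can be sketched with a toy IR model. This is not the LLVM API: `Inst`, `Kind`, and `dropTypeTests` are hypothetical names made up for this illustration, and operands are plain indices into a vector. The sketch only shows the two cases: a type test feeding an assume directly, and a type test feeding a phi that feeds a merged assume.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-ins for IR instructions (hypothetical, not LLVM classes).
enum class Kind { True, TypeTest, Other, Phi, Assume };

struct Inst {
  Kind kind;
  std::vector<int> ops;  // indices of the operand instructions
  bool erased;
  Inst(Kind k, std::vector<int> o) : kind(k), ops(std::move(o)), erased(false) {}
};

// Drops every type test in `fn` and returns how many instructions were erased.
// `trueIdx` is the index of the constant "true" value.
int dropTypeTests(std::vector<Inst>& fn, int trueIdx) {
  assert(trueIdx >= 0 && trueIdx < static_cast<int>(fn.size()));
  int erased = 0;
  for (int i = 0; i < static_cast<int>(fn.size()); ++i) {
    if (fn[i].kind != Kind::TypeTest || fn[i].erased) continue;
    for (Inst& user : fn) {
      if (user.erased) continue;
      bool uses = std::find(user.ops.begin(), user.ops.end(), i) != user.ops.end();
      if (!uses) continue;
      if (user.kind == Kind::Assume) {
        // Simple case: the assume is fed directly and goes away with the test.
        user.erased = true;
        ++erased;
      } else if (user.kind == Kind::Phi) {
        // Merged-assume case: keep the phi, replace the operand with "true".
        std::replace(user.ops.begin(), user.ops.end(), i, trueIdx);
      }
    }
    fn[i].erased = true;
    ++erased;
  }
  return erased;
}
```

In this model the merged assume itself survives, because it still carries the condition that came from the other assume.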
      
      Differential Revision: https://reviews.llvm.org/D103073
  7. May 25, 2021
    • [Internalize] Rename instead of removal if a to-be-internalized comdat has more than one member · b426b45d
      Fangrui Song authored
      Besides the `comdat any` deduplication feature, instrumentations use comdat to
      establish dependencies among a group of sections, to prevent section based
      linker garbage collection from discarding some members without discarding all.
      LangRef acknowledges this usage with the following wording:
      
      > All global objects that specify this key will only end up in the final object file if the linker chooses that key over some other key.
      
      On ELF, for PGO instrumentation, a `__llvm_prf_cnts` section and its associated
      `__llvm_prf_data` section are placed in the same GRP_COMDAT group.  A
      `__llvm_prf_data` is usually not referenced and expects the liveness of its
      associated `__llvm_prf_cnts` to retain it.
      
      The `setComdat(nullptr)` code (added by D10679) in InternalizePass can break the
      use case (a `__llvm_prf_data` may be dropped with its associated `__llvm_prf_cnts` retained).
      The main goal of this patch is to fix the dependency relationship.
      
      I think it makes sense for InternalizePass to internalize a comdat and thus
      suppress the deduplication feature, e.g. a relocatable link of a regular LTO can
      create an object file affected by InternalizePass.
      If a non-internal comdat in a.o is prevailed by an internal comdat in b.o, the
      a.o references to the comdat definitions will be non-resolvable (references
      cannot bind to STB_LOCAL definitions in b.o).
      
      On PE-COFF, for a non-external selection symbol, deduplication is naturally
      suppressed with link.exe and lld-link. However, this is fuzzy on ELF and I tend
      to believe the spec creator has not thought about this use case (see D102973).
      
      GNU ld and gold are still using the "signature is name based" interpretation.
      So even if D102973 for ld.lld is accepted, for portability, a better approach is
      to rename the comdat. A comdat with a single member is the common case;
      leaving such a comdat in place can waste (sizeof(Elf64_Shdr)+4*2) bytes, so we
      optimize by deleting the comdat in that case; otherwise we rename the comdat.
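
As a rough sketch of the rule and of the byte count quoted above: the struct below mirrors the ELF64 section header layout with fixed-width types, and the `4*2` term is presumably the group section's flag word plus one member index, each a 32-bit word. The namespace `sketch`, `ComdatAction`, and `actionForInternalizedComdat` are hypothetical names for this illustration, not code from InternalizePass.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {

// ELF64 section header layout, written out with fixed-width types.
struct Elf64_Shdr {
  uint32_t sh_name;
  uint32_t sh_type;
  uint64_t sh_flags;
  uint64_t sh_addr;
  uint64_t sh_offset;
  uint64_t sh_size;
  uint32_t sh_link;
  uint32_t sh_info;
  uint64_t sh_addralign;
  uint64_t sh_entsize;
};
static_assert(sizeof(Elf64_Shdr) == 64, "an ELF64 section header is 64 bytes");

// Cost of keeping a one-member group: its section header plus two
// 32-bit words of content (assumed: flag word and one section index).
constexpr std::size_t singleMemberGroupBytes() {
  return sizeof(Elf64_Shdr) + 4 * 2;
}

enum class ComdatAction { Delete, Rename };

// Decision described in the commit message for a to-be-internalized comdat.
inline ComdatAction actionForInternalizedComdat(std::size_t numMembers) {
  assert(numMembers >= 1);
  return numMembers == 1 ? ComdatAction::Delete : ComdatAction::Rename;
}

}  // namespace sketch
```

With a 64-byte header, the saving for each deleted single-member comdat works out to 72 bytes.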
      
      Reviewed By: tejohnson
      
      Differential Revision: https://reviews.llvm.org/D103043
    • [SanitizeCoverage] Add support for NoSanitizeCoverage function attribute · 28033302
      Marco Elver authored
      We really ought to support no_sanitize("coverage") in line with other
      sanitizers. This came up again in discussions on the Linux-kernel
      mailing lists, because we currently do workarounds using objtool to
      remove coverage instrumentation. Since that support is only on x86, to
      continue supporting coverage instrumentation on other architectures, we
      must support selectively disabling coverage instrumentation via function
      attributes.
      
      Unfortunately, for SanitizeCoverage, it has not been implemented as a
      sanitizer via fsanitize= and associated options in Sanitizers.def, but
      rolls its own option fsanitize-coverage. This meant that we never got
      "automatic" no_sanitize attribute support.
      
      Implement no_sanitize attribute support by special-casing the string
      "coverage" in the NoSanitizeAttr implementation. To keep the feature as
      unintrusive to existing IR generation as possible, define a new negative
      function attribute NoSanitizeCoverage to propagate the information
      through to the instrumentation pass.
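
From the source side, usage would look roughly like the sketch below. The macro name `NO_SANITIZE_COVERAGE` and both functions are hypothetical; the attribute spelling `no_sanitize("coverage")` is the one named above. The guard keeps the snippet compiling on non-Clang compilers, where the attribute is simply omitted.

```cpp
// Hypothetical helper macro: expands to the attribute only under Clang.
#if defined(__clang__)
#define NO_SANITIZE_COVERAGE __attribute__((no_sanitize("coverage")))
#else
#define NO_SANITIZE_COVERAGE
#endif

// Intended to be skipped by coverage instrumentation when the file is
// built with an -fsanitize-coverage=... option.
NO_SANITIZE_COVERAGE
int uninstrumented_add(int a, int b) { return a + b; }

// Instrumented as usual under the same build options.
int instrumented_add(int a, int b) { return a + b; }
```

The attribute changes only whether instrumentation is inserted; the function's behavior is unchanged.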
      
      Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035
      
      Reviewed By: vitalybuka, morehouse
      
      Differential Revision: https://reviews.llvm.org/D102772
  8. May 24, 2021
  9. May 22, 2021
    • [HIP] support ThinLTO · bf612458
      Yaxun (Sam) Liu authored
      Add options -f[no-]offload-lto and -foffload-lto=[thin,full] for controlling
      LTO for offload compilation. Allow LTO for the AMDGPU target.
      
      The AMDGPU target does not support codegen of object files containing
      calls to external functions, therefore the LLVM module passed to the
      AMDGPU backend needs to contain definitions of all the callees.
      An LLVM option is added to allow the function importer to import
      functions with the noinline attribute.
      
      HIP toolchain passes proper LLVM options to lld to make sure
      function importer imports definitions of all the callees.
      
      Reviewed by: Teresa Johnson, Artem Belevich
      
      Differential Revision: https://reviews.llvm.org/D99683
    • Revert "[NewPM] Only invalidate modified functions' analyses in CGSCC passes" · f7788e1b
      Arthur Eubanks authored
      This reverts commit d14d84af.
      
      Causes unacceptable memory regressions.
  10. May 20, 2021
  11. May 19, 2021
    • [Diagnostics] Allow emitting analysis and missed remarks on functions · 2db182ff
      Joseph Huber authored
      Summary:
      Currently, only `OptimizationRemark` can be emitted using a Function.
      Add constructors to allow this for `OptimizationRemarkAnalysis` and
      `OptimizationRemarkMissed` as well.
      
      Reviewed By: jdoerfert, thegameg
      
      Differential Revision: https://reviews.llvm.org/D102784
    • [CSSPGO] Overwrite branch weight annotated in previous pass. · 4ca6e37b
      Hongtao Yu authored
      The sample profile loader can be run in both LTO prelink and postlink. Currently the counts annotation in postlink doesn't fully overwrite what's done in prelink. I'm adding a switch (`-overwrite-existing-weights=1`) to enable a full overwrite, which includes:
      
      1. Clear old metadata for calls when their parent block has a zero count. This could be caused by prelink code duplication.
      
      2. Clear indirect call metadata if somehow all the remaining targets have a sum of zero count.
      
      3. Overwrite branch weight for basic blocks.
      
      With a CS profile, I was seeing #1 and #2 help reduce code size by preventing post-sample ICP and the CGSCC inliner from working on obsolete metadata, which comes from partial global inlining in prelink.  It's not expected to work well for the non-CS case, where post-inline count quality is less accurate.
      
      It's worth calling out that some prelink optimizations can damage counts quality in an irreversible way. One example is the loop rotate optimization. Due to the lack of an exact loop entry count (profiling can only give the loop iteration count and loop exit count), moving one iteration out of the loop body leaves the remaining iteration count unknown. We had to turn off prelink loop rotate to achieve better postlink counts quality. Even better postlink counts quality can be achieved by turning off prelink CGSCC inlining, which is not context-sensitive.
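
The three overwrite steps listed above can be illustrated with a toy model. None of this is the sample profile loader's code: `Call`, `Block`, and `overwriteExistingWeights` are hypothetical names, and "metadata" is reduced to a boolean per call plus one stored weight per block.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Toy call site (hypothetical model, not an LLVM type).
struct Call {
  bool hasProfMetadata;                // metadata left over from prelink
  bool isIndirect;
  std::vector<uint64_t> targetCounts;  // counts of the remaining indirect targets
};

// Toy basic block.
struct Block {
  uint64_t newCount;         // count from the postlink profile
  uint64_t annotatedWeight;  // weight annotated earlier, in prelink
  std::vector<Call> calls;
};

void overwriteExistingWeights(std::vector<Block>& blocks) {
  for (Block& bb : blocks) {
    for (Call& call : bb.calls) {
      // Step 1: the parent block has a zero count, so old call metadata is stale.
      if (bb.newCount == 0) call.hasProfMetadata = false;
      // Step 2: an indirect call whose remaining targets sum to zero.
      if (call.isIndirect) {
        uint64_t sum = std::accumulate(call.targetCounts.begin(),
                                       call.targetCounts.end(), uint64_t{0});
        if (sum == 0) call.hasProfMetadata = false;
      }
    }
    // Step 3: always replace the previously annotated weight.
    bb.annotatedWeight = bb.newCount;
  }
}
```

Without the switch, the assumption in this model is that a block or call that already carries an annotation would be left alone.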
      
      Reviewed By: wenlei, wmi
      
      Differential Revision: https://reviews.llvm.org/D102537
    • [Attributor] Change AAExecutionDomain to only accept intrinsics · 68abc3d2
      Joseph Huber authored
      Summary:
      The OpenMP runtime functions don't always provide unique thread IDs to
      determine if a basic block is truly single-threaded. Change the implementation
      to only check NVPTX intrinsics for now.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D102700
  12. May 17, 2021
  13. May 15, 2021
  14. May 14, 2021
    • [CSSPGO] Fix return value of getProbeWeight · e475d4d6
      wlei authored
      Currently we don't support multiple return types, so as a workaround we use error_code to represent:
      
      1)  A dangling probe.
      2)  A non-probe instruction, whose weight is ignored.
      
      When merging the instructions' weights for the whole BB, the error codes are filtered out. But if all instructions of the BB give error_code, the outside logic will mark it as a BB requiring the inference algorithm to infer its weight. This is different from a zero value, which is treated as a cold block.
      
      Fix one place: if we can't find the FunctionSamples in the profile data, which indicates the BB is cold, we now return zero.
      
      Also refine the comments.
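
The distinction between "no weight" and "zero weight" can be sketched as follows. This is a toy model, not the profile loader's code: `Weight`, `probeWeight`, and `blockWeight` are hypothetical names, a `known` flag stands in for the error_code, and merging by maximum is an assumption made for the illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Toy result type: known == false plays the role of the error_code.
struct Weight {
  bool known;
  uint64_t value;
};

inline Weight unknownWeight() { return Weight{false, 0}; }
inline Weight knownWeight(uint64_t v) { return Weight{true, v}; }

// Weight of one instruction. `samples` is null when no samples exist for
// the function in the profile, which is treated as cold (zero), not unknown.
Weight probeWeight(bool isProbe, bool isDangling, const uint64_t* samples) {
  if (!isProbe || isDangling) return unknownWeight();
  if (samples == nullptr) return knownWeight(0);
  return knownWeight(*samples);
}

// Merge instruction weights for a block, skipping unknown ones. If every
// instruction is unknown, the result is unknown and the block is left to
// the inference algorithm.
Weight blockWeight(const std::vector<Weight>& instWeights) {
  Weight result = unknownWeight();
  for (const Weight& w : instWeights) {
    if (!w.known) continue;
    result = result.known ? knownWeight(std::max(result.value, w.value)) : w;
  }
  return result;
}
```

A block whose only known weight is zero comes out as cold, while a block with no known weights comes out as unknown.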
      
      Reviewed By: hoy, wenlei
      
      Differential Revision: https://reviews.llvm.org/D102007
  15. May 13, 2021
  16. May 11, 2021
    • [GlobalOpt] Remove heap SROA · 129f466e
      Fangrui Song authored
      GlobalOpt implements a heap SROA (SROA for a malloc-allocated struct or array
      of structs) which is largely undertested (heap-sra-[1234].ll are basically the
      same test with very little difference) and does not trigger at all when
      bootstrapping clang (it only supports the case of one single store).
      
      The heap SROA implementation causes PR50027 (GEP is not properly handled; crash or miscompile).
      Just drop the implementation. I have deleted some obviously duplicated tests
      but kept `heap-sra-[12]{,-no-nullopt}.ll`.
      
      Reviewed By: aeubanks
      
      Differential Revision: https://reviews.llvm.org/D102257
    • [ArgumentPromotion] Fix byval alignment handling. · 61cbbba7
      Eli Friedman authored
      Make sure the alignment of the generated operations matches the
      alignment of the byval argument.  Previously, we were just ignoring
      alignment and getting lucky.
      
      While I'm here, also delete the unnecessary "tail" handling.
      Passing a pointer to a byval argument to a "tail" call is UB, so
      rewriting to an alloca doesn't require any special handling.
      
      Differential Revision: https://reviews.llvm.org/D89819
  17. May 10, 2021
  18. May 08, 2021
  19. May 07, 2021
  20. May 06, 2021
  21. May 04, 2021