  1. Jan 17, 2019
    • Wei Mi's avatar
      [SampleFDO] Skip profile reading when flattened profile used in ThinLTO postlink · 3bcccdfe
      Wei Mi authored
      If the sample profile includes no inlining hierarchy information, we say
      the sample profile is flattened. For a flattened profile, in the ThinLTO postlink
      phase, SampleProfileLoader's hot function inlining and profile annotation will
      do nothing, so it is better to skip reading the profile and running
      the sample profile loader pass. This helps reduce compile time when
      the flattened profile is huge.
      
      Differential Revision: https://reviews.llvm.org/D54819
      
      llvm-svn: 351476
      3bcccdfe
  2. Jan 16, 2019
    • Philip Pfaffe's avatar
      [NewPM][TSan] Reiterate the TSan port · 685c76d7
      Philip Pfaffe authored
      Summary:
      Second iteration of D56433 which got reverted in rL350719. The problem
      in the previous version was that we dropped the thunk calling the tsan init
      function. The new version keeps the thunk which should appease dyld, but is not
      actually OK wrt. the current semantics of function passes. Hence, add a
      helper that inserts the functions only the first time. The helper
      allows hooking into the insertion so that they can be appended to the
      global ctors list.
      
      Reviewers: chandlerc, vitalybuka, fedor.sergeev, leonardchan
      
      Subscribers: hiraditya, bollu, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56538
      
      llvm-svn: 351314
      685c76d7
  3. Jan 10, 2019
    • Fedor Sergeev's avatar
      [LoopUnroll] add parsing for unroll parameters in -passes pipeline · b7871405
      Fedor Sergeev authored
      Allow specifying loop unrolling with optional parameters explicitly
      spelled out in the -passes pipeline specification.
      Introduce a somewhat generic way of specifying parameter parsing via
      FUNCTION_PASS_PARAMETRIZED pass registration.
      
      The syntax of the parametrized unroll pass name is as follows:
         'unroll<' parameter-list '>'
      
      Where parameter-list is a ';'-separated list of parameter names and optlevel
         optlevel: 'O[0-3]'
         parameter: { 'partial' | 'peeling' | 'runtime' | 'upperbound' }
         negated:  'no-' parameter
      
      Example:
         -passes=loop(unroll<O3;runtime;no-upperbound>)
      
          this invokes LoopUnrollPass configured with OptLevel=3,
          Runtime, no UpperBound, everything else by default.
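
The grammar above can be sketched as a small parser. This is only an illustrative Python model of the quoted syntax, not the PassBuilder code; `parse_unroll_params` and `KNOWN_PARAMS` are hypothetical names, and rejecting an empty parameter list is an assumption of this sketch.

```python
import re

# Hypothetical helper names; only the grammar comes from the commit message.
KNOWN_PARAMS = {"partial", "peeling", "runtime", "upperbound"}


def parse_unroll_params(pass_name):
    """Parse 'unroll<...>' into a dict of the options that were spelled out.

    'optlevel' maps to an int, every other parameter to True, or to False
    when it was given in its negated 'no-' form.
    """
    match = re.fullmatch(r"unroll<([^<>]*)>", pass_name)
    if match is None:
        raise ValueError(f"not a parametrized unroll pass name: {pass_name!r}")
    options = {}
    for item in match.group(1).split(";"):
        # optlevel: 'O[0-3]'
        if re.fullmatch(r"O[0-3]", item):
            options["optlevel"] = int(item[1])
            continue
        # negated: 'no-' parameter
        negated = item.startswith("no-")
        name = item[3:] if negated else item
        if name not in KNOWN_PARAMS:
            raise ValueError(f"unknown unroll parameter: {item!r}")
        options[name] = not negated
    return options
```

Parameters that are not mentioned are simply absent from the result, which corresponds to "everything else by default" in the example.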
      
      llvm-svn: 350808
      b7871405
  4. Jan 09, 2019
  5. Jan 08, 2019
  6. Jan 04, 2019
    • Teresa Johnson's avatar
      [ThinLTO] Handle chains of aliases · 853b9624
      Teresa Johnson authored
      At -O0, globalopt is not run during the compile step, and we can have a
      chain in which an alias's immediate aliasee is another alias. The
      summaries are constructed assuming aliases are in a canonical form
      (flattened chains), and as a result only the base object but no
      intermediate aliases were preserved.
      
      Fix by adding a pass that canonicalizes aliases, which ensures each
      alias is a direct alias of the base object.
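
The idea of flattening alias chains can be illustrated with a toy model. The following is a minimal Python sketch, not the LLVM pass: aliases are modeled as a plain dictionary from alias name to immediate aliasee, and `canonicalize_aliases` is a hypothetical name.

```python
def canonicalize_aliases(aliasee_of):
    """Map every alias directly to its base object.

    aliasee_of maps each alias name to its immediate aliasee, which may be
    another alias (a chain) or a base object (any name that is not a key).
    """
    canonical = {}
    for alias in aliasee_of:
        target = aliasee_of[alias]
        seen = {alias}
        # Follow the chain until we reach something that is not an alias.
        while target in aliasee_of:
            if target in seen:
                raise ValueError(f"alias cycle through {target!r}")
            seen.add(target)
            target = aliasee_of[target]
        canonical[alias] = target
    return canonical
```

After this step no alias has another alias as its aliasee, which is the canonical form the summaries are said to assume.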
      
      Reviewers: pcc, davidxl
      
      Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D54507
      
      llvm-svn: 350423
      853b9624
  7. Jan 03, 2019
    • Philip Pfaffe's avatar
      [NewPM] Port Msan · b39a97c8
      Philip Pfaffe authored
      Summary:
      Keeping msan a function pass requires replacing the module-level initialization.
      That means we don't define a ctor function which calls __msan_init; instead we just
      declare the init function at the first access, and add that to the global ctors
      list.
      
      Changes:
      - Pull the actual sanitizer and the wrapper pass apart.
      - Add a newpm msan pass. The function pass inserts calls to runtime
        library functions, for which it inserts declarations as necessary.
      - Update tests.
      
      Caveats:
      - There is one test that I dropped, because it specifically tested the
        definition of the ctor.
      
      Reviewers: chandlerc, fedor.sergeev, leonardchan, vitalybuka
      
      Subscribers: sdardis, nemanjai, javed.absar, hiraditya, kbarton, bollu, atanasyan, jsji
      
      Differential Revision: https://reviews.llvm.org/D55647
      
      llvm-svn: 350305
      b39a97c8
  8. Dec 12, 2018
    • Michael Kruse's avatar
      [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes. · 72448525
      Michael Kruse authored
      When multiple loop transformations are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance,
      
          #pragma clang loop unroll_and_jam(enable)
          #pragma clang loop distribute(enable)
      
      is the same as
      
          #pragma clang loop distribute(enable)
          #pragma clang loop unroll_and_jam(enable)
      
      and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after the UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also, the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.
      
      This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,
      
          !0 = !{!0, !1, !2}
          !1 = !{!"llvm.loop.unroll_and_jam.enable"}
          !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
          !3 = !{!"llvm.loop.distribute.enable"}
      
      defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.
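
How a followup attribute hands properties to the resulting loop can be modeled in a few lines. This is a hedged Python sketch of the metadata example above, with loop metadata represented as tuples; `apply_unroll_and_jam` is a hypothetical name, and the function only models attribute propagation, not the transformation itself.

```python
ENABLE = "llvm.loop.unroll_and_jam.enable"
FOLLOWUP_INNER = "llvm.loop.unroll_and_jam.followup_inner"


def apply_unroll_and_jam(loop_attrs):
    """Return the attributes the jammed inner loop receives.

    loop_attrs is a list of tuples (name, operand, ...). Returns None when
    unroll-and-jam was not requested for this loop.
    """
    names = [attr[0] for attr in loop_attrs]
    if ENABLE not in names:
        return None
    inner_attrs = []
    for attr in loop_attrs:
        if attr[0] == FOLLOWUP_INNER:
            # The operands of the followup attribute become the attributes
            # of the loop that results from the transformation.
            inner_attrs.extend(attr[1:])
    return inner_attrs


# Mirrors !1, !2 and !3 from the metadata example.
distribute = ("llvm.loop.distribute.enable",)
loop = [(ENABLE,), (FOLLOWUP_INNER, distribute)]
```

With this input, the jammed inner loop ends up carrying only the distribute attribute, so distribution is requested after unroll-and-jam rather than in pipeline order.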
      
      Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, using Polly to perform these transformations, or adding a loop transformation pass which takes the order issue into account.
      
      For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. The responsible pass cannot emit such a warning itself because the transformation might be 'hidden' in a followup attribute when it is executed, or the pass might not be present in the pipeline at all. For this reason, this patch introduces a WarnMissedTransformations pass to warn about orphaned transformations.
      
      Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.
      
      To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.
      
      With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).
      
      Reviewed By: hfinkel, dmgreen
      
      Differential Revision: https://reviews.llvm.org/D49281
      Differential Revision: https://reviews.llvm.org/D55288
      
      llvm-svn: 348944
      72448525
  9. Dec 07, 2018
    • Max Kazantsev's avatar
      Introduce llvm.experimental.widenable_condition intrinsic · b9e65cbd
      Max Kazantsev authored
      This patch introduces a new intrinsic `@llvm.experimental.widenable.condition`
      that allows explicit representation of guards. It is an alternative to using the
      `@llvm.experimental.guard` intrinsic that does not contain implicit control flow.
      
      We keep finding places where `@llvm.experimental.guard` is not supported or
      treated too conservatively, and there are two reasons for that:
      
      - `@llvm.experimental.guard` has a memory write side effect to model implicit control flow,
        and this sometimes confuses passes and analyses that work with memory;
      - Not all passes and analyses are aware of the semantics of guards. These passes treat them
        as regular throwing calls and have no idea that the condition of a guard may be used to prove
        something. One well-known place which has caused us trouble in the past is explicit loop
        iteration count calculation in SCEV. Another example is the new loop unswitching, which is not
        aware of guards. Whenever a new pass appears, we potentially have this problem there.
      
      Rather than go and fix all these places (and commit to keep track of them and add support
      in future), it seems more reasonable to leverage the existing optimizer's logic as much as possible.
      The only significant difference between guards and regular explicit branches is that guard's condition
      can be widened. It means that a guard contains (explicitly or implicitly) a `deopt` block successor,
      and it is always legal to go there no matter what the guard condition is. The other successor is
      a guarded block, and it is only legal to go there if the condition is true.
      
      This patch introduces a new explicit form of guards alternative to `@llvm.experimental.guard`
      intrinsic. Now a widenable guard can be represented in the CFG explicitly like this:
      
      
          %widenable_condition = call i1 @llvm.experimental.widenable.condition()
          %new_condition = and i1 %cond, %widenable_condition
          br i1 %new_condition, label %guarded, label %deopt
      
        guarded:
          ; Guarded instructions
      
        deopt:
          call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
      
      The new intrinsic `@llvm.experimental.widenable.condition` has the semantics of an
      `undef`, but the intrinsic prevents the optimizer from folding it early. This form
      should exploit all optimization boons provided to the `br` instruction, and it can still be
      widened by replacing the result of `@llvm.experimental.widenable.condition()`
      with its `and` with any arbitrary boolean value (as long as the branch that is taken when
      it is `false` has a deopt and has no side-effects).
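
The widening property can be checked exhaustively in a tiny model. The following Python sketch treats the branch as a pure boolean function and is an illustration of the stated semantics only, not of any LLVM API; all function names are hypothetical.

```python
from itertools import product


def branch_target(cond, widenable_condition):
    """Model of: br i1 (and %cond, %widenable_condition), %guarded, %deopt."""
    new_condition = cond and widenable_condition
    return "guarded" if new_condition else "deopt"


def widened_target(cond, widenable_condition, extra):
    """Widening: the intrinsic's result is replaced by its 'and' with 'extra'."""
    return branch_target(cond, widenable_condition and extra)


def widening_is_safe():
    """Check all boolean inputs for the two properties the text relies on."""
    for cond, wc, extra in product([False, True], repeat=3):
        # The guarded block is only ever entered when the condition holds.
        if widened_target(cond, wc, extra) == "guarded" and not cond:
            return False
        # Widening can redirect execution to deopt, never away from it.
        if (branch_target(cond, wc) == "deopt"
                and widened_target(cond, wc, extra) == "guarded"):
            return False
    return True
```

Going to the deopt block is always legal, so any widening that only adds paths into it preserves correctness in this model.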
      
      For more motivation, please check llvm-dev discussion "[llvm-dev] Giving up using
      implicit control flow in guards".
      
      This patch introduces this new intrinsic with respective LangRef changes and a pass
      that converts old-style guards (expressed as intrinsics) into the new form.
      
      The naming discussion is still ongoing. Merging this to unblock further items. We can
      change the name of this intrinsic later.
      
      Reviewed By: reames, fedor.sergeev, sanjoy
      Differential Revision: https://reviews.llvm.org/D51207
      
      llvm-svn: 348593
      b9e65cbd
    • Markus Lavin's avatar
      [PM] Port LoadStoreVectorizer to the new pass manager. · 4dc4ebd6
      Markus Lavin authored
      Differential Revision: https://reviews.llvm.org/D54848
      
      llvm-svn: 348570
      4dc4ebd6
  10. Nov 26, 2018
  11. Nov 21, 2018
  12. Nov 15, 2018
  13. Nov 12, 2018
    • Philip Pfaffe's avatar
      Add an OptimizerLast EP · 2d4effb2
      Philip Pfaffe authored
      Summary:
      It turns out that we need an OptimizerLast PassBuilder extension point
      after all. I missed the relevance of this EP the first time. By legacy PM magic,
      function passes added at this EP get added to the last _Function_ PM, which is a
      feature we lost when dropping this EP for the new PM.
      
      A key difference between this and the legacy PassManager's OptimizerLast
      callback is that this extension point is not triggered at O0. Extensions
      to the O0 pipeline should append their passes to the end of the overall
      pipeline.
      
      Differential Revision: https://reviews.llvm.org/D54374
      
      llvm-svn: 346645
      2d4effb2
  14. Oct 31, 2018
    • Fedor Sergeev's avatar
      [LoopUnroll] allow customization for new-pass-manager version of LoopUnroll · 412ed347
      Fedor Sergeev authored
      Unlike its legacy counterpart, the new pass manager's LoopUnrollPass does
      not provide any means to select which flavors of unroll to run
      (runtime, peeling, partial), relying on global defaults.
      
      In some cases the ability to run a restricted LoopUnroll that
      does more than LoopFullUnroll is needed.
      
      Introduced LoopUnrollOptions to select optional unroll behaviors.
      Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing.
      
      Reviewers: chandlerc, tejohnson
      Differential Revision: https://reviews.llvm.org/D53440
      
      llvm-svn: 345723
      412ed347
  15. Oct 27, 2018
  16. Oct 24, 2018
    • Teresa Johnson's avatar
      [hot-cold-split] Only perform splitting in ThinLTO backend post-link · d725335b
      Teresa Johnson authored
      Summary:
      Fix the new PM to only perform hot cold splitting once during ThinLTO,
      by skipping it in the pre-link phase.
      
      This was already fixed in the old PM by the move of the hot cold split
      pass later (after the early return when PrepareForThinLTO) by r344869.
      
      Reviewers: vsk, sebpop, hiraditya
      
      Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D53611
      
      llvm-svn: 345096
      d725335b
  17. Oct 21, 2018
  18. Oct 17, 2018
    • Fedor Sergeev's avatar
      [NewPM] teach -passes= to emit meaningful error messages · bd6b2138
      Fedor Sergeev authored
      All the PassBuilder::parse interfaces now return a descriptive StringError
      instead of a plain bool. This makes -passes/aa-pipeline parsing
      errors context-specific and thus less confusing.
      
      TODO: ideally we should also make suggestions for misspelled pass names,
      but that requires some extensions to PassBuilder.
      
      Reviewed By: philip.pfaffe, chandlerc
      Differential Revision: https://reviews.llvm.org/D53246
      
      llvm-svn: 344685
      bd6b2138
  19. Oct 15, 2018
  20. Oct 11, 2018
    • Leonard Chan's avatar
      [PassManager/Sanitizer] Port of AddressSanitizer pass from legacy to new PassManager · 64e21b5c
      Leonard Chan authored
      This patch ports the AddressSanitizer pass from the legacy pass manager to the new
      one to take advantage of the benefits of the new PM. This involved moving a lot of
      the declarations for `AddressSanitizer` to a header so that it can be publicly used via
      PassRegistry.def which I believe contains all the passes managed by the new PM.
      
      This patch essentially decouples the instrumentation from the legacy PM such
      that it can be used by both legacy and new PM infrastructure.
      
      Differential Revision: https://reviews.llvm.org/D52739
      
      llvm-svn: 344274
      64e21b5c
    • Richard Smith's avatar
      Add a flag to remap manglings when reading profile data information. · 6c676628
      Richard Smith authored
      This can be used to preserve profiling information across codebase
      changes that have widespread impact on mangled names, but across which
      most profiling data should still be usable. For example, when switching
      from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI,
      or even from a 32-bit to a 64-bit build.
      
      The user can provide a remapping file specifying parts of mangled names
      that should be treated as equivalent (e.g., std::__1 should be treated as
      equivalent to std::__cxx11), and profile data will be treated as
      applying to a particular function if its name is equivalent to the name
      of a function in the profile data under the provided equivalences. See
      the documentation change for a description of how this is configured.
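
The matching rule described above (names are looked up modulo a set of equivalences) can be sketched as follows. This is a deliberately simplified Python illustration: it uses plain substring replacement on readable names, whereas the actual feature works on mangled names and a remapping file whose format is described in the documentation change the commit mentions. All function names and the sample profile are hypothetical.

```python
def canonical_name(name, equivalences):
    """Rewrite every fragment to its chosen representative.

    equivalences is a list of (fragment, representative) pairs.
    """
    for fragment, representative in equivalences:
        name = name.replace(fragment, representative)
    return name


def find_profile_entry(function_name, profile, equivalences):
    """Return profile data whose name equals function_name under equivalences."""
    wanted = canonical_name(function_name, equivalences)
    for profiled_name, data in profile.items():
        if canonical_name(profiled_name, equivalences) == wanted:
            return data
    return None


# Treat std::__cxx11 and std::__1 as equivalent (illustrative data only).
equivalences = [("std::__cxx11", "std::__1")]
profile = {"std::__cxx11::basic_string::size": 120}
```

A function built against one standard library can then pick up profile data recorded under the other library's name, as long as the two names are equal under the provided equivalences.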
      
      Remapping is supported for both sample-based profiling and instruction
      profiling. We do not support remapping indirect branch target
      information, but all other profile data should be remapped
      appropriately.
      
      Support is only added for the new pass manager. If someone wants to also
      add support for this for the old pass manager, doing so should be
      straightforward.
      
      This is the LLVM side of Clang r344199.
      
      Reviewers: davidxl, tejohnson, dlj, erik.pilkington
      
      Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D51249
      
      llvm-svn: 344200
      6c676628
  21. Oct 03, 2018
  22. Oct 01, 2018
  23. Sep 17, 2018
  24. Sep 12, 2018
    • Alexandros Lamprineas's avatar
      Revert "[GVNHoist] Re-enable GVNHoist by default" · fe0512d5
      Alexandros Lamprineas authored
      This reverts rL341954.
      
      The builder `sanitizer-x86_64-linux-bootstrap-ubsan` has been
      failing with timeouts at stage2 clang/ubsan:
      
      [3065/3073] Linking CXX executable bin/lld
      command timed out: 1200 seconds without output running python
      ../sanitizer_buildbot/sanitizers/buildbot_selector.py,
      attempting to kill
      
      llvm-svn: 342001
      fe0512d5
  25. Sep 11, 2018
  26. Sep 04, 2018
    • Hiroshi Yamauchi's avatar
      [PGO] Control Height Reduction · 9775a620
      Hiroshi Yamauchi authored
      Summary:
      Control height reduction merges conditional blocks of code and reduces the
      number of conditional branches in the hot path based on profiles.
      
      if (hot_cond1) { // Likely true.
        do_stg_hot1();
      }
      if (hot_cond2) { // Likely true.
        do_stg_hot2();
      }
      
      ->
      
      if (hot_cond1 && hot_cond2) { // Hot path.
        do_stg_hot1();
        do_stg_hot2();
      } else { // Cold path.
        if (hot_cond1) {
          do_stg_hot1();
        }
        if (hot_cond2) {
          do_stg_hot2();
        }
      }
      
      This speeds up some internal benchmarks by up to ~30%.
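
That the merged form behaves like the original can be checked by enumerating all condition values. This is a small Python model of the before/after snippets above, assuming the conditions are side-effect free and do not depend on the guarded calls; the function names are hypothetical and the calls are recorded in a trace list.

```python
from itertools import product


def original(hot_cond1, hot_cond2):
    trace = []
    if hot_cond1:
        trace.append("do_stg_hot1")
    if hot_cond2:
        trace.append("do_stg_hot2")
    return trace


def control_height_reduced(hot_cond1, hot_cond2):
    trace = []
    if hot_cond1 and hot_cond2:  # Hot path: a single merged branch.
        trace.append("do_stg_hot1")
        trace.append("do_stg_hot2")
    else:  # Cold path: the original branches.
        if hot_cond1:
            trace.append("do_stg_hot1")
        if hot_cond2:
            trace.append("do_stg_hot2")
    return trace


def same_behavior():
    """Compare both versions on every combination of condition values."""
    return all(
        original(c1, c2) == control_height_reduced(c1, c2)
        for c1, c2 in product([False, True], repeat=2)
    )
```

The benefit in the real pass comes from the hot path taking one branch instead of two; the model only demonstrates that the observable sequence of calls is unchanged.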
      
      Reviewers: davidxl
      
      Reviewed By: davidxl
      
      Subscribers: xbolva00, dmgreen, mehdi_amini, llvm-commits, mgorny
      
      Differential Revision: https://reviews.llvm.org/D50591
      
      llvm-svn: 341386
      9775a620
  27. Aug 29, 2018
  28. Jul 30, 2018
  29. Jul 19, 2018
  30. Jul 16, 2018
  31. Jul 01, 2018
    • David Green's avatar
      [UnrollAndJam] New Unroll and Jam pass · 963401d2
      David Green authored
      This is a simple implementation of the unroll-and-jam classical loop
      optimisation.
      
      The basic idea is that we take an outer loop of the form:
      
        for i..
          ForeBlocks(i)
          for j..
            SubLoopBlocks(i, j)
          AftBlocks(i)
      
      Instead of doing normal inner or outer unrolling, we unroll as follows:
      
        for i... i+=2
          ForeBlocks(i)
          ForeBlocks(i+1)
          for j..
            SubLoopBlocks(i, j)
            SubLoopBlocks(i+1, j)
          AftBlocks(i)
          AftBlocks(i+1)
        Remainder Loop
      
      So we have unrolled the outer loop, then jammed the two inner loops into
      one. This can lead to a simpler inner loop if memory accesses can be shared
      between the now jammed loops.
      
      To do this we have to prove that this is all safe, both for the memory
      accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
      AftBlocks(i) and SubLoopBlocks(i, j).
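
The loop shapes above can be made concrete with a small computation. The following Python sketch applies unroll-and-jam by a factor of 2 by hand to a loop nest that sums each row of a rectangular matrix and doubles the sum; the function names and the choice of computation are illustrative assumptions, and the reordering is safe here because the iterations of the outer loop are independent.

```python
def scaled_row_sums(a):
    """Original loop nest."""
    result = []
    for i in range(len(a)):
        total = 0                     # ForeBlocks(i)
        for j in range(len(a[i])):
            total += a[i][j]          # SubLoopBlocks(i, j)
        result.append(2 * total)      # AftBlocks(i)
    return result


def scaled_row_sums_unroll_and_jam(a):
    """Outer loop unrolled by 2, the two inner loops jammed into one."""
    n = len(a)
    m = len(a[0]) if n else 0         # Assumes all rows have the same length.
    result = []
    main_end = n - n % 2
    for i in range(0, main_end, 2):
        total0 = 0                    # ForeBlocks(i)
        total1 = 0                    # ForeBlocks(i+1)
        for j in range(m):
            total0 += a[i][j]         # SubLoopBlocks(i, j)
            total1 += a[i + 1][j]     # SubLoopBlocks(i+1, j)
        result.append(2 * total0)     # AftBlocks(i)
        result.append(2 * total1)     # AftBlocks(i+1)
    for i in range(main_end, n):      # Remainder loop
        total = 0
        for j in range(m):
            total += a[i][j]
        result.append(2 * total)
    return result
```

Note that ForeBlocks(i+1) now runs before SubLoopBlocks(i, j) and AftBlocks(i), which is exactly the reordering the pass has to prove safe.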
      
      Differential Revision: https://reviews.llvm.org/D41953
      
      llvm-svn: 336062
      963401d2
  32. Jun 30, 2018
  33. Jun 28, 2018