  1. Oct 24, 2020
  2. Oct 23, 2020
      [IR] add fn attr for no_stack_protector; prevent inlining on mismatch · b7926ce6
      Nick Desaulniers authored
      It's currently ambiguous in IR whether the source language explicitly
      did not want a stack protector (in C, via function attribute
      no_stack_protector) or doesn't care for any given function.
      
      Code that manipulates the stack via inline assembly or that has to set
      up its own stack canary (such as the Linux kernel) commonly wants to
      avoid stack protectors in certain functions. In this case, we've
      been bitten by numerous bugs where a callee with a stack protector is
      inlined into an __attribute__((__no_stack_protector__)) caller, which
      generally breaks the caller's assumptions about not having a stack
      protector. LTO exacerbates the issue.
      
      While developers can avoid this by putting all no_stack_protector
      functions in one translation unit together and compiling those with
      -fno-stack-protector, that is less ergonomic than a function
      attribute, and it still doesn't work for LTO. See also:
      https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/
      https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u
      
      Typically, when inlining a callee into a caller, the caller will be
      upgraded in its level of stack protection (see adjustCallerSSPLevel()).
      By adding an explicit attribute in the IR when the function attribute is
      used in the source language, we can now identify such cases and prevent
      inlining. Block inlining when one of the caller and callee has `nossp`
      and the other has `ssp`, `sspstrong`, or `sspreq`.
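
      The rule above can be sketched as a small predicate. This is a toy
      model, not the code from the patch: the enum `StackProtect` and the
      functions `wantsProtector` and `sspCompatibleForInlining` are
      hypothetical names chosen for illustration, and `None` stands for a
      function that carries no stack-protector attribute at all.

      ```cpp
      #include <cassert>

      // Toy model of the stack-protector state a function can carry.
      // None means no attribute at all ("doesn't care").
      enum class StackProtect { None, NoSSP, SSP, SSPStrong, SSPReq };

      // True if the attribute requests some level of stack protection.
      bool wantsProtector(StackProtect a) {
        return a == StackProtect::SSP || a == StackProtect::SSPStrong ||
               a == StackProtect::SSPReq;
      }

      // Inlining is blocked only when one side is explicitly `nossp` and the
      // other side requests a protector; every other pairing stays inlinable.
      bool sspCompatibleForInlining(StackProtect caller, StackProtect callee) {
        if (caller == StackProtect::NoSSP && wantsProtector(callee))
          return false;
        if (callee == StackProtect::NoSSP && wantsProtector(caller))
          return false;
        return true;
      }
      ```

      In this model a `nossp` caller can still absorb a callee with no
      attribute, since that pairing is not a mismatch.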
      
      Fixes pr/47479.
      
      Reviewed By: void
      
      Differential Revision: https://reviews.llvm.org/D87956
      [SVE]Clarify TypeSize comparisons in llvm/lib/Transforms · 24156364
      Caroline Concatto authored
      Use the isKnownXY comparators when one of the operands may be
      scalable, and getFixedSize() in all other cases.
      
      This patch also fixes bugs around getPrimitiveSizeInBits by using
      getFixedSize() near the TypeSize comparisons.
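
      To see why a plain `<=` is not enough once sizes can be scalable, here
      is a minimal sketch under stated assumptions. `ToySize`, `fixedValue`
      and `isKnownLE` are hypothetical stand-ins, not LLVM's TypeSize class;
      the model assumes a scalable size equals `minValue * vscale` for some
      unknown runtime `vscale >= 1`.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Hypothetical stand-in for a size that is either fixed or a multiple of
      // an unknown runtime factor vscale >= 1.
      struct ToySize {
        std::uint64_t minValue; // size when vscale == 1
        bool scalable;          // true: the real size is minValue * vscale

        // Only meaningful for fixed sizes, like the getFixedSize() contract.
        std::uint64_t fixedValue() const {
          assert(!scalable && "fixedValue() called on a scalable size");
          return minValue;
        }
      };

      // True only if lhs <= rhs holds for every vscale >= 1.
      bool isKnownLE(ToySize lhs, ToySize rhs) {
        // fixed/fixed, fixed/scalable and scalable/scalable: comparing the
        // minimum values suffices, because rhs scales at least as much as lhs.
        if (!lhs.scalable || rhs.scalable)
          return lhs.minValue <= rhs.minValue;
        // scalable lhs against fixed rhs: lhs grows with vscale, nothing known.
        return false;
      }
      ```

      A `false` result here means "not known to hold", not "known to be
      greater", which is the distinction the isKnownXY naming makes explicit.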
      
      Differential Revision: https://reviews.llvm.org/D89703
      [Inliner] Run always-inliner in inliner-wrapper · 0291e2c9
      Arthur Eubanks authored
      An alwaysinline function may not get inlined in inliner-wrapper due to
      the inlining order.
      
      Previously for the following, the inliner would first inline @a() into @b(),
      
      ```
      define void @a() {
      entry:
        call void @b()
        ret void
      }
      
      define void @b() alwaysinline {
      entry:
        br label %for.cond
      
      for.cond:
        call void @a()
        br label %for.cond
      }
      ```
      
      making @b() recursive and unable to be inlined into @a(), ending at
      
      ```
      define void @a() {
      entry:
        call void @b()
        ret void
      }
      
      define void @b() alwaysinline {
      entry:
        br label %for.cond
      
      for.cond:
        call void @b()
        br label %for.cond
      }
      ```
      
      Running always-inliner first makes sure that we respect alwaysinline in more cases.
      
      Fixes https://bugs.llvm.org/show_bug.cgi?id=46945.
      
      Reviewed By: davidxl, rnk
      
      Differential Revision: https://reviews.llvm.org/D86988
  3. Oct 22, 2020
  4. Oct 21, 2020
  5. Oct 19, 2020
      Revert "[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting" · 0628bea5
      Hans Wennborg authored
      This broke Chromium's PGO build, apparently because hot-cold-splitting got
      turned on unintentionally. See the comment on the code review for a repro etc.
      
      > This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
      > the splitting pass to be toggled on/off. The current method of passing
      > `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
      > correctly (say, with `-O0` or `-Oz`).
      >
      > To implement the -fsplit-cold-code option, an attribute is applied to
      > functions to indicate that they may be considered for splitting. This
      > removes some complexity from the old/new PM pipeline builders, and
      > behaves as expected when LTO is enabled.
      >
      > Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
      > Differential Revision: https://reviews.llvm.org/D57265
      > Reviewed By: Aditya Kumar, Vedant Kumar
      > Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
      
      This reverts commit 273c299d.
  6. Oct 16, 2020
      [PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting · 273c299d
      Vedant Kumar authored
      This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
      the splitting pass to be toggled on/off. The current method of passing
      `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
      correctly (say, with `-O0` or `-Oz`).
      
      To implement the -fsplit-cold-code option, an attribute is applied to
      functions to indicate that they may be considered for splitting. This
      removes some complexity from the old/new PM pipeline builders, and
      behaves as expected when LTO is enabled.
      
      Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
      Differential Revision: https://reviews.llvm.org/D57265
      Reviewed By: Aditya Kumar, Vedant Kumar
      Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
  7. Oct 14, 2020
  8. Oct 09, 2020
      [OpenMPOpt] Merge parallel regions · 3a6bfcf2
      Giorgis Georgakoudis authored
      There are cases where generated OpenMP code consists of multiple,
      consecutive OpenMP parallel regions, either due to high-level
      programming models, such as RAJA, Kokkos, lowering to OpenMP code, or
      simply because the programmer parallelized code this way.  This
      optimization merges consecutive parallel OpenMP regions to: (1) reduce
      the runtime overhead of re-activating a team of threads; (2) enlarge the
      scope for other OpenMP optimizations, e.g., runtime call deduplication
      and synchronization elimination.
      
      This implementation defensively merges parallel regions, only when they
      are within the same BB and any in-between instructions are safe to
      execute in parallel.
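
      At the source level, the effect can be pictured roughly as follows.
      This is only an illustration: the pass works on the generated runtime
      calls in IR, not on pragmas, and the function names `twoRegions` and
      `mergedRegion` are made up for this sketch. Built without OpenMP
      support, the pragmas are ignored and both functions run serially with
      the same result.

      ```cpp
      #include <cassert>
      #include <vector>

      // Two back-to-back parallel regions: the thread team is activated twice.
      std::vector<int> twoRegions(int n) {
        assert(n >= 0);
        std::vector<int> a(n), b(n);
      #pragma omp parallel for
        for (int i = 0; i < n; ++i)
          a[i] = 2 * i;
      #pragma omp parallel for
        for (int i = 0; i < n; ++i)
          b[i] = a[i] + 1;
        return b;
      }

      // One merged region: the team is activated once and shared by both loops.
      std::vector<int> mergedRegion(int n) {
        assert(n >= 0);
        std::vector<int> a(n), b(n);
      #pragma omp parallel
        {
      #pragma omp for
          for (int i = 0; i < n; ++i)
            a[i] = 2 * i;
          // The implicit barrier at the end of the first `omp for` keeps the
          // second loop from reading a[i] before it has been written.
      #pragma omp for
          for (int i = 0; i < n; ++i)
            b[i] = a[i] + 1;
        }
        return b;
      }
      ```

      The merged form also gives later optimizations a single region to
      reason about, which is the second benefit listed above.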
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D83635
  9. Oct 07, 2020
      [Attributor] Use smarter way to determine alignment of GEPs · 7993d611
      Johannes Doerfert authored
      Use the same logic that exists in other places to deal with base case GEPs.
      
      Add the original Attributor talk example.
      [Attributor] Ignore read accesses to constant memory · c4cfe7a4
      Johannes Doerfert authored
      The old function attribute deduction pass ignores reads of constant
      memory and we need to copy this behavior to replace the pass completely.
      The first step is constant globals. TBAA can also describe constant
      accesses and there are other possibilities. We might want to consider
      asking the alias analyses that are available but for now this is simpler
      and cheaper.
      [Attributor] Give up early on AANoReturn::initialize · 3f540c05
      Johannes Doerfert authored
      If the function is not assumed `noreturn` we should not wait for an
      update to mark the call site as "may-return".
      
      This has two kinds of consequences:
        - We have fewer iterations in many tests.
        - We have fewer deductions based on "known information" (since we ask
          earlier, point 1, and therefore assumed information is not "known"
          yet).
      The latter is an artifact that we might want to tackle properly at some
      point but which is not easily fixable right now.
  10. Oct 06, 2020
      [Attributor][FIX] Move assertion to make it not trivially fail · 4a7a9884
      Johannes Doerfert authored
      The idea of this assertion was to check the simplified value before we
      assign it, not after, which caused this to trivially fail all the time.
      [Attributor][FIX] Dead return values are not `noundef` · 04f69513
      Johannes Doerfert authored
      When we assume a return value is dead we might still visit return
      instructions via `Attributor::checkForAllReturnedValuesAndReturnInsts(..)`.
      When we do so the "returned value" is potentially simplified to `undef`
      as it is the assumed "returned value". This is a problem if there was a
      preexisting `noundef` attribute that will only be removed as we manifest
      the `undef` return value. We should not use this combination to derive
      `unreachable` though. Two test cases fixed.
      [Attributor][NFC] Ignore benign uses in AAMemoryBehaviorFloating · 957094e3
      Johannes Doerfert authored
      In AAMemoryBehaviorFloating we used to track benign uses in a SetVector.
      With this change we look through benign uses eagerly to reduce the
      number of elements (=Uses) we look at during an update.
      
      The test does not actually fail prior to this commit but I already wrote
      it so I kept it.
  11. Oct 05, 2020
      Revert "Outline non returning functions unless a longjmp" · 9afb1c56
      Vedant Kumar authored
      This reverts commit 20797989.
      
      This patch (https://reviews.llvm.org/D69257) cannot complete a stage2
      build due to the change:
      
      ```
      CI->getCalledFunction()->getName().contains("longjmp")
      ```
      
      There are several concrete issues here:
      
        - The callee may not be a function, so `getCalledFunction` can assert.
        - The called value may not have a name, so `getName` can assert.
        - There's no distinction made between "my_longjmp_test_helper" and the
          actual longjmp libcall.
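
      The first and third issues can be reproduced in a toy model. The types
      `ToyFunction` and `ToyCall` and both helper functions are hypothetical
      and are not LLVM classes; a null `callee` stands for a call whose
      callee is not a function. The sketch only shows the failure modes and
      does nothing about the layering concern.

      ```cpp
      #include <cassert>
      #include <string>

      // Hypothetical stand-ins for a function and a call site.
      struct ToyFunction {
        std::string name; // may be empty for an unnamed value
      };
      struct ToyCall {
        const ToyFunction *callee; // nullptr models a non-function callee
      };

      // Substring match in the style of the reverted check, with the null
      // callee handled so the sketch itself cannot crash.
      bool nameContainsLongjmp(const ToyCall &call) {
        return call.callee != nullptr &&
               call.callee->name.find("longjmp") != std::string::npos;
      }

      // Exact match: only a function literally named "longjmp" qualifies.
      bool nameIsLongjmp(const ToyCall &call) {
        return call.callee != nullptr && call.callee->name == "longjmp";
      }
      ```

      With these definitions, a function named "my_longjmp_test_helper"
      satisfies the substring check but not the exact one.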
      
      At a higher level, there's a serious layering problem here. The
      splitting pass makes policy decisions in a general way (e.g. based on
      attributes or profile data). Special-casing certain names breaks the
      layering. It subverts the work of library maintainers (who may now need
      to opt-out of unexpected optimization behavior for any affected
      functions) and can lead to inconsistent optimization behavior (as not
      all llvm passes special-case ".*longjmp.*" in the same way).
      
      The patch may need significant revision to address these issues.
      
      But the immediate issue is that this crashes while compiling llvm's unit
      tests in a stage2 build (due to the `getName` problem).
  12. Oct 04, 2020
  13. Oct 02, 2020
  14. Oct 01, 2020
      [LoopFlatten] Add a loop-flattening pass · d53b4bee
      Sjoerd Meijer authored
      This is a simple pass that flattens nested loops.  The intention is to optimise
      loop nests like this, which together access an array linearly:
      
        for (int i = 0; i < N; ++i)
          for (int j = 0; j < M; ++j)
            f(A[i*M+j]);
      
      into one loop:
      
        for (int i = 0; i < (N*M); ++i)
          f(A[i]);
      
      It can also flatten loops where the induction variables are not used in the
      loop. This can help with codesize and runtime, especially on simple cpus
      without advanced branch prediction.
      
      This is only worth flattening if the induction variables are only used in an
      expression like i*M+j. If they had any other uses, we would have to insert a
      div/mod to reconstruct the original values, so this wouldn't be profitable.
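
      The equivalence, and the div/mod cost mentioned above, can be checked
      with a small self-contained sketch. The function names are invented
      for illustration, and `f` is replaced by a running sum so the result
      can be compared; this is not code from the pass.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <vector>

      // Nested form: i and j are only used in the expression i*M+j.
      long sumNested(const std::vector<int> &A, std::size_t N, std::size_t M) {
        assert(A.size() >= N * M);
        long total = 0;
        for (std::size_t i = 0; i < N; ++i)
          for (std::size_t j = 0; j < M; ++j)
            total += A[i * M + j];
        return total;
      }

      // Flattened form: one loop over N*M elements, no index arithmetic.
      long sumFlat(const std::vector<int> &A, std::size_t N, std::size_t M) {
        assert(A.size() >= N * M);
        long total = 0;
        for (std::size_t k = 0; k < N * M; ++k)
          total += A[k];
        return total;
      }

      // Nested form where i and j have another use besides i*M+j.
      long weightedNested(const std::vector<int> &A, std::size_t N, std::size_t M) {
        assert(A.size() >= N * M);
        long total = 0;
        for (std::size_t i = 0; i < N; ++i)
          for (std::size_t j = 0; j < M; ++j)
            total += static_cast<long>(i + j) * A[i * M + j];
        return total;
      }

      // Flattening that loop forces a div and a mod per iteration to recover
      // i and j, which is why such loops are not considered profitable.
      long weightedFlat(const std::vector<int> &A, std::size_t N, std::size_t M) {
        assert(A.size() >= N * M);
        long total = 0;
        for (std::size_t k = 0; k < N * M; ++k) {
          const std::size_t i = k / M;
          const std::size_t j = k % M;
          total += static_cast<long>(i + j) * A[k];
        }
        return total;
      }
      ```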
      
      This partially fixes PR40581 as this pass triggers on one of the two cases. I
      will follow up on this to teach LoopFlatten a few more (small) tricks. Please
      note that LoopFlatten is not yet enabled by default.
      
      Patch by Oliver Stannard, with minor tweaks from Dave Green and myself.
      
      Differential Revision: https://reviews.llvm.org/D42365
      [WholeProgramDevirt][NewPM] Add NPM testing path to match legacy pass · 460dda07
      Arthur Eubanks authored
      The legacy pass's default constructor sets UseCommandLine = true and
      goes down a separate testing route. Match that in the NPM pass.
      
      This fixes all tests in llvm/test/Transforms/WholeProgramDevirt under NPM.
      
      Reviewed By: ychen
      
      Differential Revision: https://reviews.llvm.org/D88588
  15. Sep 29, 2020
  16. Sep 27, 2020
      [LVI] Require context instruction in external API (NFCI) · 9b959b59
      Nikita Popov authored
      Require CxtI in getConstant() and getConstantRange() APIs.
      Accordingly drop the BB parameter, as it is implied by
      CxtI->getParent().
      
      This makes sure we don't forget to pass the context instruction,
      and makes the API contract clearer (also clean up the comments to
      that effect -- the value holds at the context instruction, not
      the end of the block).
  17. Sep 26, 2020
  18. Sep 25, 2020
      [OpenMP] OpenMPOpt Support for Globalization Remarks · a2281419
      Joseph Huber authored
      Summary:
      This patch adds support for printing analysis messages relating to data
      globalization on the GPU. This occurs when data is shared between the
      threads in a GPU context and must be pushed to global or shared memory.
      
      Reviewers: jdoerfert
      
      Subscribers: guansong hiraditya llvm-commits ormris sstefan1 yaxunl
      
      Tags: #OpenMP #LLVM
      
      Differential Revision: https://reviews.llvm.org/D88243
  19. Sep 22, 2020
  20. Sep 20, 2020
  21. Sep 16, 2020