  1. Jan 19, 2019
    • [CodeExtractor] Emit lifetime markers around reloads of outputs · 17d9f14b
      Vedant Kumar authored
      CodeExtractor permits extracting a region of blocks from a function even
      when values defined within the region are used outside of it.
      
      This is typically done by creating an alloca in the original function
      and reloading the alloca after a call to the extracted function.
      
      Wrap the reload in lifetime start/end markers to promote stack coloring.
      
      Suggested by Sergei Kachkov!
      
      Differential Revision: https://reviews.llvm.org/D56045
      
      llvm-svn: 351621
      17d9f14b
  2. Jan 18, 2019
    • [LCSSA] Skip blocks in sub-loops when scanning for uses. · be7cbe3f
      Florian Hahn authored
      Summary:
      Scanning blocks in sub-loops for uses is unnecessary, as they were
      already handled while dealing with the containing sub-loop.
      
      This speeds up LCSSA for highly nested loops. For the test case in PR37202, it
      halves the time spent in LCSSA. In cases where we won't be able to skip
      any blocks, the additional lookup should be negligible.
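
      A minimal sketch of the skipping idea, not the LCSSA code itself. The toy
      `Loop` type, the integer block ids, and the helpers `mapInnermost` and
      `blocksToScan` are hypothetical. A block is scanned for a loop only when
      that loop is the block's innermost containing loop, which costs one extra
      lookup per block.

      ```cpp
      #include <map>
      #include <vector>

      // Toy loop nest: each loop lists every block it contains (including
      // the blocks of its sub-loops) plus its direct sub-loops.
      struct Loop {
        std::vector<int> Blocks;
        std::vector<const Loop *> SubLoops;
      };

      // Record the innermost loop of each block. A parent is visited before
      // its sub-loops, so deeper loops overwrite their ancestors' entries.
      void mapInnermost(const Loop &L, std::map<int, const Loop *> &Innermost) {
        for (int BB : L.Blocks)
          Innermost[BB] = &L;
        for (const Loop *Sub : L.SubLoops)
          mapInnermost(*Sub, Innermost);
      }

      // Blocks that still need scanning while processing L: only those whose
      // innermost loop is L itself. Sub-loop blocks are skipped.
      std::vector<int> blocksToScan(const Loop &L,
                                    const std::map<int, const Loop *> &Innermost) {
        std::vector<int> Result;
        for (int BB : L.Blocks) {
          auto It = Innermost.find(BB);
          if (It != Innermost.end() && It->second == &L)
            Result.push_back(BB);
        }
        return Result;
      }
      ```

      With an outer loop over blocks {1, 2, 3, 4} and an inner loop over {2, 3},
      only blocks 1 and 4 are scanned for the outer loop.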
      
      Time-passes without this patch for test case from PR37202:
      
        Total Execution Time: 48.5505 seconds (48.5511 wall clock)
      
         ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
        10.0822 ( 21.0%)   0.1406 ( 27.0%)  10.2228 ( 21.1%)  10.2228 ( 21.1%)  Loop-Closed SSA Form Pass
        10.0417 ( 20.9%)   0.1467 ( 28.2%)  10.1884 ( 21.0%)  10.1890 ( 21.0%)  Loop-Closed SSA Form Pass #2
         4.2703 (  8.9%)   0.0040 (  0.8%)   4.2742 (  8.8%)   4.2742 (  8.8%)  Unswitch loops
         2.7376 (  5.7%)   0.0229 (  4.4%)   2.7605 (  5.7%)   2.7611 (  5.7%)  Loop-Closed SSA Form Pass #5
         2.7332 (  5.7%)   0.0214 (  4.1%)   2.7546 (  5.7%)   2.7546 (  5.7%)  Loop-Closed SSA Form Pass #3
         2.7088 (  5.6%)   0.0230 (  4.4%)   2.7319 (  5.6%)   2.7324 (  5.6%)  Loop-Closed SSA Form Pass #4
         2.6855 (  5.6%)   0.0236 (  4.5%)   2.7091 (  5.6%)   2.7090 (  5.6%)  Loop-Closed SSA Form Pass #6
         2.1648 (  4.5%)   0.0018 (  0.4%)   2.1666 (  4.5%)   2.1664 (  4.5%)  Unroll loops
         1.8371 (  3.8%)   0.0009 (  0.2%)   1.8379 (  3.8%)   1.8380 (  3.8%)  Value Propagation
         1.8149 (  3.8%)   0.0021 (  0.4%)   1.8170 (  3.7%)   1.8169 (  3.7%)  Loop Invariant Code Motion
         1.6755 (  3.5%)   0.0226 (  4.3%)   1.6981 (  3.5%)   1.6980 (  3.5%)  Loop-Closed SSA Form Pass #7
      
      Time-passes with this patch
      
        Total Execution Time: 29.9285 seconds (29.9276 wall clock)
      
         ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
         5.2786 ( 17.7%)   0.0021 (  1.2%)   5.2806 ( 17.6%)   5.2808 ( 17.6%)  Unswitch loops
         4.3739 ( 14.7%)   0.0303 ( 18.1%)   4.4042 ( 14.7%)   4.4042 ( 14.7%)  Loop-Closed SSA Form Pass
         4.2658 ( 14.3%)   0.0192 ( 11.5%)   4.2850 ( 14.3%)   4.2851 ( 14.3%)  Loop-Closed SSA Form Pass #2
         2.2307 (  7.5%)   0.0013 (  0.8%)   2.2320 (  7.5%)   2.2318 (  7.5%)  Loop Invariant Code Motion
         2.0888 (  7.0%)   0.0012 (  0.7%)   2.0900 (  7.0%)   2.0897 (  7.0%)  Unroll loops
         1.6761 (  5.6%)   0.0013 (  0.8%)   1.6774 (  5.6%)   1.6774 (  5.6%)  Value Propagation
         1.3686 (  4.6%)   0.0029 (  1.8%)   1.3716 (  4.6%)   1.3714 (  4.6%)  Induction Variable Simplification
         1.1457 (  3.8%)   0.0010 (  0.6%)   1.1468 (  3.8%)   1.1468 (  3.8%)  Loop-Closed SSA Form Pass #4
         1.1384 (  3.8%)   0.0005 (  0.3%)   1.1389 (  3.8%)   1.1389 (  3.8%)  Loop-Closed SSA Form Pass #6
         1.1360 (  3.8%)   0.0027 (  1.6%)   1.1387 (  3.8%)   1.1387 (  3.8%)  Loop-Closed SSA Form Pass #5
         1.1331 (  3.8%)   0.0010 (  0.6%)   1.1341 (  3.8%)   1.1340 (  3.8%)  Loop-Closed SSA Form Pass #3
      
      Reviewers: davide, efriedma, mzolotukhin
      
      Reviewed By: davide, efriedma
      
      Subscribers: hiraditya, dmgreen, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56848
      
      llvm-svn: 351567
      be7cbe3f
  3. Jan 17, 2019
  4. Jan 16, 2019
    • [NewPM][TSan] Reiterate the TSan port · 685c76d7
      Philip Pfaffe authored
      Summary:
      Second iteration of D56433 which got reverted in rL350719. The problem
      in the previous version was that we dropped the thunk calling the tsan init
      function. The new version keeps the thunk which should appease dyld, but is not
      actually OK wrt. the current semantics of function passes. Hence, add a
      helper to insert the functions only on the first time. The helper
      allows hooking into the insertion to be able to append them to the
      global ctors list.
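
      The "insert only the first time, with a hook" pattern can be sketched as
      follows. This is a toy model under stated assumptions: the `Module`
      struct, `getOrCreateFunction`, and the function name used in the usage
      note are hypothetical stand-ins, not the helper added by this patch.

      ```cpp
      #include <functional>
      #include <set>
      #include <string>
      #include <vector>

      // Toy module: a set of function names plus a global ctors list.
      struct Module {
        std::set<std::string> Functions;
        std::vector<std::string> GlobalCtors;
      };

      // Inserts Name into M only if it is not there yet. OnCreate runs only
      // on that first insertion, so side effects such as registering a ctor
      // happen exactly once no matter how many functions the pass visits.
      bool getOrCreateFunction(
          Module &M, const std::string &Name,
          const std::function<void(Module &, const std::string &)> &OnCreate) {
        if (!M.Functions.insert(Name).second)
          return false; // Already present: nothing to do.
        OnCreate(M, Name);
        return true;
      }
      ```

      A caller would pass a hook that appends the new function to
      `GlobalCtors`. Calling the helper twice with the same name leaves a
      single ctor entry.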
      
      Reviewers: chandlerc, vitalybuka, fedor.sergeev, leonardchan
      
      Subscribers: hiraditya, bollu, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56538
      
      llvm-svn: 351314
      685c76d7
  5. Jan 15, 2019
    • treat invoke like call · d129d3e9
      David Callahan authored
      Summary:
      InvokeInst should be treated like CallInst and
      assigned a separate discriminator. This is particularly
      important when an Invoke is converted to a Call
      during compilation and so can invalidate sample profile
      data collected with different link time optimizations.
      
      Reviewers: twoh, Kader, danielcdh, wmi
      
      Reviewed By: wmi
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56491
      
      llvm-svn: 351251
      d129d3e9
    • [NFC] Move some functions to LoopUtils · a78dc4d6
      Max Kazantsev authored
      llvm-svn: 351179
      a78dc4d6
  6. Jan 14, 2019
    • [DebugInfo] Remove un-necessary logic from HoistThenElseCodeToIf · f216da7e
      Jeremy Morse authored
      Following PR39807, the way in which SimplifyCFG hoists common code on
      branch paths was fixed in r347782. However this left extra code hanging
      around HoistThenElseCodeToIf that wasn't necessary and needlessly
      complicated matters -- we no longer need to look up through the 'if'
      basic block to find a location for hoisted 'select' insts, we can instead
      use the location chosen by applyMergedLocation.
      
      This patch deletes that extra logic, and updates a regression test to
      reflect the new logic (selects get the merged location, not a previous
      inst's location).
      
      Differential Revision: https://reviews.llvm.org/D55272
      
      llvm-svn: 351058
      f216da7e
    • [BasicBlockUtils] Generalize DeleteDeadBlock to deal with multiple dead blocks · 1f73310e
      Max Kazantsev authored
      Utility function `DeleteDeadBlock` expects that all predecessors of a block being
      deleted are already deleted, with the exception of a single-block loop. This makes it
      hard to use for deletion of a set of blocks that may contain cyclic dependencies.
      There is no correct order of invocations of this function that does not produce
      dangling pointers to already deleted blocks.
      
      This patch introduces a generalized version of this function `DeleteDeadBlocks`
      that allows us to remove multiple blocks at once, even if there are cycles among
      them. The only requirement is that no block being deleted should have a predecessor
      that is not being deleted. 
      
      The logic of `DeleteDeadBlocks` is as follows:
        for each block
          create relevant DT updates;
          remove all instructions (replace with undef if needed);
          replace terminator with unreachable;
        apply DT updates;
        for each block
          delete block;
      
      Therefore, `DeleteDeadBlock` becomes a particular case of
      the general algorithm called for a single block.
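
      The two-phase structure above can be sketched on a toy CFG. This is a
      minimal model, not the LLVM implementation: the `CFG` alias and the
      free function `deleteDeadBlocks` are hypothetical, and dominator tree
      updates and instruction removal are left out.

      ```cpp
      #include <map>
      #include <set>
      #include <vector>

      // Toy CFG: block id -> successor ids. A stand-in for BasicBlocks and
      // their terminators.
      using CFG = std::map<int, std::vector<int>>;

      // Returns false (and changes nothing) if some dead block still has a
      // live predecessor. Otherwise removes all dead blocks, cycles included.
      bool deleteDeadBlocks(CFG &G, const std::set<int> &Dead) {
        // Precondition: every predecessor of a dead block is itself dead.
        for (const auto &Entry : G) {
          if (Dead.count(Entry.first))
            continue;
          for (int Succ : Entry.second)
            if (Dead.count(Succ))
              return false;
        }
        // Phase 1: detach each dead block, the analogue of replacing its
        // terminator with 'unreachable'. Afterwards no dead block refers to
        // another one, so there is nothing left to dangle.
        for (int BB : Dead) {
          auto It = G.find(BB);
          if (It != G.end())
            It->second.clear();
        }
        // Phase 2: erase the detached blocks in any order.
        for (int BB : Dead)
          G.erase(BB);
        return true;
      }
      ```

      For a CFG where blocks 2 and 3 branch to each other and nothing live
      reaches them, `deleteDeadBlocks(G, {2, 3})` removes both in one call,
      which no ordering of single-block deletions could do safely.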
      
      Differential Revision: https://reviews.llvm.org/D56120
      Reviewed By: skatkov
      
      llvm-svn: 351045
      1f73310e
  7. Jan 10, 2019
  8. Jan 08, 2019
    • [UnrollRuntime] Fix domTree failures in multiexit unrolling · 2dfa412e
      Anna Thomas authored
      Summary:
      This fixes the IDom for exit blocks and all blocks reachable from the exit blocks, when runtime unrolling under multiexit/exiting case.
      We initially had a restrictive check that the IDom is only updated when
      it is the header of the loop.
      However, we also need to update the IDom to the correct one when the
      IDom is any block within the original loop. See added test cases (which
      fail dom tree verification without the patch).
      
      Reviewers: reames, mzolotukhin, mkazantsev, hfinkel
      
      Reviewed by: brzycki, kuhar
      
      Subscribers: zzheng, dmgreen, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56284
      
      llvm-svn: 350640
      2dfa412e
  9. Jan 07, 2019
    • [CallSite removal] Migrate all Alias Analysis APIs to use the newly · 363ac683
      Chandler Carruth authored
      minted `CallBase` class instead of the `CallSite` wrapper.
      
      This moves the largest interwoven collection of APIs that traffic in
      `CallSite`s. While a handful of these could have been migrated with
      a minorly more shallow migration by converting from a `CallSite` to
      a `CallBase`, it hardly seemed worth it. Most of the APIs needed to
      migrate together because of the complex interplay of AA APIs and the
      fact that converting from a `CallBase` to a `CallSite` isn't free in its
      current implementation.
      
      Out of tree users of these APIs can fairly reliably migrate with some
      combination of `.getInstruction()` on the `CallSite` instance and
      casting the resulting pointer. The most generic form will look like `CS`
      -> `cast_or_null<CallBase>(CS.getInstruction())` but in most cases there
      is a more elegant migration. Hopefully, this migrates enough APIs for
      users to fully move from `CallSite` to the base class. All of the
      in-tree users were easily migrated in that fashion.
      
      Thanks for the review from Saleem!
      
      Differential Revision: https://reviews.llvm.org/D55641
      
      llvm-svn: 350503
      363ac683
  10. Jan 04, 2019
    • [ThinLTO] Handle chains of aliases · 853b9624
      Teresa Johnson authored
      At -O0, globalopt is not run during the compile step, and we can have a
      chain of an alias having an immediate aliasee of another alias. The
      summaries are constructed assuming aliases in a canonical form
      (flattened chains), and as a result only the base object but no
      intermediate aliases were preserved.
      
      Fix by adding a pass that canonicalizes aliases, which ensures each
      alias is a direct alias of the base object.
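
      Flattening an alias chain can be illustrated with a toy symbol table.
      This is only a sketch: `AliasMap`, `baseObject`, and
      `canonicalizeAliases` are hypothetical names, and the model assumes
      chains are acyclic.

      ```cpp
      #include <map>
      #include <string>

      // Toy symbol table: alias name -> immediate aliasee name. Names that
      // are missing from the map are treated as base objects.
      using AliasMap = std::map<std::string, std::string>;

      // Follow the chain from Name until a non-alias is reached.
      std::string baseObject(const AliasMap &Aliases, std::string Name) {
        auto It = Aliases.find(Name);
        while (It != Aliases.end()) {
          Name = It->second;
          It = Aliases.find(Name);
        }
        return Name;
      }

      // Rewrite every alias so that it points directly at its base object.
      void canonicalizeAliases(AliasMap &Aliases) {
        AliasMap Flat;
        for (const auto &Entry : Aliases)
          Flat[Entry.first] = baseObject(Aliases, Entry.first);
        Aliases = Flat;
      }
      ```

      Given `a1 -> a2 -> f`, both `a1` and `a2` map to `f` afterwards, so a
      summary that assumes flattened chains sees every alias.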
      
      Reviewers: pcc, davidxl
      
      Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D54507
      
      llvm-svn: 350423
      853b9624
    • [CodeExtractor] Do not extract unsafe lifetime markers · a1778df4
      Vedant Kumar authored
      Lifetime markers which reference inputs to the extraction region are not
      safe to extract. Example ('rhs' will be extracted):
      
      ```
                     entry:
                    +------------+
                    | x = alloca |
                    | y = alloca |
                    +------------+
                   /              \
         lhs:                      rhs:
        +-------------------+     +-------------------+
        | lifetime_start(x) |     | lifetime_start(x) |
        | use(x)            |     | lifetime_start(y) |
        | lifetime_end(x)   |     | use(x, y)         |
        | lifetime_start(y) |     | lifetime_end(y)   |
        | use(y)            |     | lifetime_end(x)   |
        | lifetime_end(y)   |     +-------------------+
        +-------------------+
      ```
      
      Prior to extraction, the stack coloring pass sees that the slots for 'x'
      and 'y' are in-use at the same time. After extraction, the coloring pass
      infers that 'x' and 'y' are *not* in-use concurrently, because markers
      from 'rhs' are no longer available to help decide otherwise.
      
      This leads to a miscompile, because the stack slots actually are in-use
      concurrently in the extracted function.
      
      Fix this by moving lifetime start/end markers for memory regions defined
      in the calling function around the call to the extracted function.
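
      The effect on stack coloring can be reproduced with a toy interference
      check. This is a simplified model, not the StackColoring pass: `Marker`,
      `Path`, and `mayShareSlot` are hypothetical, and each path is just a
      linear list of markers.

      ```cpp
      #include <set>
      #include <string>
      #include <vector>

      // One lifetime marker: Start is true for lifetime_start and false for
      // lifetime_end.
      struct Marker {
        bool Start;
        std::string Slot;
      };
      using Path = std::vector<Marker>;

      // Two slots may share stack space only if no visible path has both of
      // them live at the same time.
      bool mayShareSlot(const std::vector<Path> &Paths, const std::string &A,
                        const std::string &B) {
        for (const Path &P : Paths) {
          std::set<std::string> Live;
          for (const Marker &M : P) {
            if (M.Start)
              Live.insert(M.Slot);
            else
              Live.erase(M.Slot);
            if (Live.count(A) && Live.count(B))
              return false; // Both live here: the slots interfere.
          }
        }
        return true;
      }
      ```

      With both 'lhs' and 'rhs' visible, 'x' and 'y' interfere. With only
      'lhs' visible (markers extracted along with 'rhs'), the check wrongly
      allows sharing. Placing start/end markers for both slots around the
      call restores the interference.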
      
      Fixes llvm.org/PR39671 (rdar://45939472).
      
      Differential Revision: https://reviews.llvm.org/D55967
      
      llvm-svn: 350420
      a1778df4
  11. Jan 03, 2019
    • [UnrollRuntime] Move the DomTree verification under expensive checks · a470aa67
      Anna Thomas authored
      Suggested by Hal as done in r349871.
      
      llvm-svn: 350349
      a470aa67
    • [UnrollRuntime] Add DomTree verification under debug mode · 0785e730
      Anna Thomas authored
      NFC: This adds the dom tree verification under debug mode at a point
      just before we start unrolling the loop. This allows us to verify dom
      tree at a state where it is much smaller and before the unrolling
      actually happens.
      This also implies we do not need to run -verify-dom-info every time to
      see if the DT is in a valid state when we transform the loop for runtime
      unrolling.
      
      llvm-svn: 350334
      0785e730
    • [NewPM] Port Msan · b39a97c8
      Philip Pfaffe authored
      Summary:
      Keeping msan a function pass requires replacing the module level initialization:
      That means, don't define a ctor function which calls __msan_init, instead just
      declare the init function at the first access, and add that to the global ctors
      list.
      
      Changes:
      - Pull the actual sanitizer and the wrapper pass apart.
      - Add a newpm msan pass. The function pass inserts calls to runtime
        library functions, for which it inserts declarations as necessary.
      - Update tests.
      
      Caveats:
      - There is one test that I dropped, because it specifically tested the
        definition of the ctor.
      
      Reviewers: chandlerc, fedor.sergeev, leonardchan, vitalybuka
      
      Subscribers: sdardis, nemanjai, javed.absar, hiraditya, kbarton, bollu, atanasyan, jsji
      
      Differential Revision: https://reviews.llvm.org/D55647
      
      llvm-svn: 350305
      b39a97c8
  12. Dec 28, 2018
  13. Dec 21, 2018
  14. Dec 20, 2018
    • Introduce llvm.loop.parallel_accesses and llvm.access.group metadata. · 978ba615
      Michael Kruse authored
      The current llvm.mem.parallel_loop_access metadata has a problem in that
      it uses LoopIDs. LoopID unfortunately is not a loop identifier. It is
      neither unique (there's even a regression test assigning the same LoopID
      to multiple loops; can otherwise happen if passes such as LoopVersioning
      make copies of entire loops) nor persistent (every time a property is
      removed/added from a LoopID's MDNode, it will also receive a new LoopID;
      this happens e.g. when calling Loop::setLoopAlreadyUnrolled()).
      Since most loop transformation passes change the loop attributes (even
      if it is just to mark that a loop should not be processed again as
      llvm.loop.isvectorized does, for the versioned and unversioned loop),
      the parallel access information is lost for any subsequent pass.
      
      This patch unlinks LoopIDs and parallel accesses.
      llvm.mem.parallel_loop_access metadata on instructions is replaced by
      llvm.access.group metadata. llvm.access.group points to a distinct
      MDNode with no operands (avoiding the need to ever add/remove
      operands), called "access group". Alternatively, it can point to a list
      of access groups. The LoopID then has an attribute
      llvm.loop.parallel_accesses with all the access groups that are parallel
      (no dependencies carried by this loop).
      
      This intentionally avoids any kind of "ID". Loops that are clones/have
      their attributes modified retain the llvm.loop.parallel_accesses
      attribute. Access instructions that are cloned point to the same access
      group. It is not necessary for each access to have its own "ID" MDNode,
      but those memory access instructions with the same behavior can be
      grouped together.
      
      The behavior of llvm.mem.parallel_loop_access is not changed by this
      patch, but should be considered deprecated.
      
      Differential Revision: https://reviews.llvm.org/D52116
      
      llvm-svn: 349725
      978ba615
  15. Dec 15, 2018
    • [Util] Refer to [s|z]exts of args when converting dbg.declares (fix PR35400) · 9d182733
      Vedant Kumar authored
      When converting dbg.declares, if the described value is a [s|z]ext,
      refer to the ext directly instead of referring to its operand.
      
      This fixes a narrowing bug (the debugger got the sign of a variable
      wrong, see llvm.org/PR35400).
      
      The main reason to refer to the ext's operand was that an optimization
      may remove the ext itself, leading to a dropped variable. Now that
      InstCombine has been taught to use replaceAllDbgUsesWith (r336451), this
      is less of a concern. Other passes can/should adopt this API as needed
      to fix dropped variable bugs.
      
      Differential Revision: https://reviews.llvm.org/D51813
      
      llvm-svn: 349214
      9d182733
  16. Dec 14, 2018
  17. Dec 13, 2018
    • [ThinLTO] Compute synthetic function entry count · 5a7056fa
      Easwaran Raman authored
      Summary:
      This patch computes the synthetic function entry count on the whole
      program callgraph (based on module summary) and writes the entry counts
      to the summary. After function importing, this count gets attached to
      the IR as metadata. Since it adds a new field to the summary, this bumps
      up the version.
      
      Reviewers: tejohnson
      
      Subscribers: mehdi_amini, inglorion, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D43521
      
      llvm-svn: 349076
      5a7056fa
    • [LoopUtils] Use i32 instead of `void`. · 9737096b
      Davide Italiano authored
      The actual type of the first argument of the @dbg intrinsic
      doesn't really matter as we're setting it to `undef`, but the
      bitcode reader is picky about `void` types.
      
      llvm-svn: 349069
      9737096b
    • [LoopUtils] Prefer a set over a map. NFCI. · 8ee59ca6
      Davide Italiano authored
      llvm-svn: 348999
      8ee59ca6
    • [LoopDeletion] Update debug values after loop deletion. · 744c3c32
      Davide Italiano authored
      When loops are deleted, we don't keep track of variables modified inside
      the loops, so the DI will contain the wrong value for these.
      
      e.g.

      int b() {
        int i;
        for (i = 0; i < 2; i++)
          ;
        patatino();
        return a;

      -> 6     patatino();
         7     return a;
         8   }
         9   int main() { b(); }
      (lldb) frame var i
      (int) i = 0
      
      We instead mark these values as unavailable by inserting a
      @llvm.dbg.value(undef) to make sure we don't end up printing an incorrect
      value in the debugger. We could consider doing something fancier,
      e.g. for constants, in the future.
      
      PR39868.
      rdar://problem/46418795
      
      Differential Revision: https://reviews.llvm.org/D55299
      
      llvm-svn: 348988
      744c3c32
  18. Dec 12, 2018
    • [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes. · 72448525
      Michael Kruse authored
      When multiple loop transformations are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance,
      
          #pragma clang loop unroll_and_jam(enable)
          #pragma clang loop distribute(enable)
      
      is the same as
      
          #pragma clang loop distribute(enable)
          #pragma clang loop unroll_and_jam(enable)
      
      and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also, the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.
      
      This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,
      
          !0 = !{!0, !1, !2}
          !1 = !{!"llvm.loop.unroll_and_jam.enable"}
          !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
          !3 = !{!"llvm.loop.distribute.enable"}
      
      defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.
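
      How a followup attribute hands its attribute list to the transformed
      loop can be sketched with a toy metadata model. `LoopMD` and
      `applyTransform` are hypothetical, and the sketch covers only the case
      shown above. What a real pass does when no followup is present is not
      modeled (here the result is simply left without attributes, which is an
      assumption).

      ```cpp
      #include <map>
      #include <set>
      #include <string>

      // Toy loop metadata: plain attributes plus followup lists keyed by the
      // followup attribute name.
      struct LoopMD {
        std::set<std::string> Attrs;
        std::map<std::string, std::set<std::string>> Followups;
      };

      // If Enable is set, "run" the transformation and return the metadata
      // of the resulting loop: the attributes listed under Followup.
      // Otherwise the loop is returned unchanged.
      LoopMD applyTransform(const LoopMD &L, const std::string &Enable,
                            const std::string &Followup) {
        if (!L.Attrs.count(Enable))
          return L;
        LoopMD Result;
        auto It = L.Followups.find(Followup);
        if (It != L.Followups.end())
          Result.Attrs = It->second;
        return Result;
      }
      ```

      Applied to the example, the jammed inner loop ends up carrying only
      `llvm.loop.distribute.enable`, so the order "unroll-and-jam, then
      distribute" is encoded in the metadata instead of the pass pipeline.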
      
      Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.
      
      For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patch introduces a WarnMissedTransformations pass, to warn about orphaned transformations.
      
      Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.
      
      To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.
      
      With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).
      
      Reviewed By: hfinkel, dmgreen
      
      Differential Revision: https://reviews.llvm.org/D49281
      Differential Revision: https://reviews.llvm.org/D55288
      
      llvm-svn: 348944
      72448525
  19. Dec 10, 2018
  20. Dec 07, 2018
    • [CodeExtractor] Store outputs at the first valid insertion point · b2a6f8e5
      Vedant Kumar authored
      When CodeExtractor outlines values which are used by the original
      function, it must store those values in some in-out parameter. This
      store instruction must not be inserted in between a PHI and an EH pad
      instruction, as that results in invalid IR.
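
      A minimal sketch of picking a valid insertion point, assuming a block is
      modeled as a flat list of instruction kinds. The `Kind` enum and
      `firstInsertionPt` are hypothetical and only mirror the rule stated
      above: skip the leading PHIs, then skip an EH pad if one follows.

      ```cpp
      #include <cstddef>
      #include <vector>

      enum class Kind { PHI, EHPad, Other };

      // Index of the first position where a store may be inserted: after
      // all leading PHIs and, if present, the EH pad that follows them.
      std::size_t firstInsertionPt(const std::vector<Kind> &Block) {
        std::size_t I = 0;
        while (I < Block.size() && Block[I] == Kind::PHI)
          ++I;
        if (I < Block.size() && Block[I] == Kind::EHPad)
          ++I;
        return I;
      }
      ```

      For a block laid out as PHI, PHI, EH pad, other, the store goes at
      index 3, which keeps the EH pad as the first non-PHI instruction.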
      
      This fixes the following verifier failure seen while outlining within
      ObjC methods with live exit values:
      
        The unwind destination does not have an exception handling instruction!
          %call35 = invoke i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %exn.adjusted, i8* %1)
                  to label %invoke.cont34 unwind label %lpad33, !dbg !4183
        The unwind destination does not have an exception handling instruction!
          invoke void @objc_exception_throw(i8* %call35) #12
                  to label %invoke.cont36 unwind label %lpad33, !dbg !4184
        LandingPadInst not the first non-PHI instruction in the block.
          %3 = landingpad { i8*, i32 }
                  catch i8* null, !dbg !1411
      
      rdar://46540815
      
      llvm-svn: 348562
      b2a6f8e5
  21. Dec 05, 2018
  22. Dec 03, 2018
    • [CodeExtractor] Split PHI nodes with incoming values from outlined region (PR39433) · d129569e
      Vedant Kumar authored
      If a PHI node outside the extracted region has multiple incoming values
      from the region, split it into two PHIs. The first PHI keeps only the
      incoming values from the region and is extracted with it (it is placed in
      a separate basic block that is added to the list of outlined blocks), and
      those incoming values in the original PHI are replaced by the first PHI.
      A similar solution is already used in CodeExtractor for PHIs in the entry
      block (the severSplitPHINodes method). This fixes PR39433.
      
      Patch by Sergei Kachkov!
      
      Differential Revision: https://reviews.llvm.org/D55018
      
      llvm-svn: 348205
      d129569e
  23. Dec 02, 2018
  24. Nov 30, 2018
    • [Mem2Reg] Fix nondeterministic corner case · 27b1e3bd
      Joseph Tremoulet authored
      Summary:
      When mem2reg inserts phi nodes in blocks with unreachable predecessors,
      it adds undef operands for those incoming edges.  When there are
      multiple such predecessors, the order is currently based on the address
      of the BasicBlocks.  This change fixes that by using the BBNumbers in
      the sort/search predicates, as is done elsewhere in mem2reg to ensure
      determinism.
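
      The difference between ordering by address and ordering by block number
      can be sketched as follows. The `Block` struct and `sortByNumber` are
      hypothetical, and the numbering map stands in for mem2reg's BBNumbers.

      ```cpp
      #include <algorithm>
      #include <map>
      #include <vector>

      struct Block {
        const char *Name;
      };

      // Order predecessors by their assigned block number. Addresses vary
      // from run to run, while the numbering depends only on the function's
      // layout, so the resulting order is reproducible.
      void sortByNumber(std::vector<const Block *> &Preds,
                        const std::map<const Block *, unsigned> &BBNumbers) {
        std::sort(Preds.begin(), Preds.end(),
                  [&](const Block *A, const Block *B) {
                    return BBNumbers.at(A) < BBNumbers.at(B);
                  });
      }
      ```

      The comparator never looks at the pointer values themselves, so two
      runs that allocate the blocks at different addresses still produce the
      same operand order.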
      
      Also adds a testcase with a bunch of unreachable preds, which
      (nondeterministically) fails without the fix.
      
      
      Reviewers: majnemer
      
      Reviewed By: majnemer
      
      Subscribers: mgrang, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D55077
      
      llvm-svn: 348024
      27b1e3bd
  25. Nov 28, 2018
    • [DebugInfo] Give inlinable calls DILocs (PR39807) · 9b4cfa55
      Jeremy Morse authored
      In PR39807 we incorrectly handle circumstances where calls are common'd
      from conditional blocks into the parent BB. Calls that can be inlined
      must always have DebugLocs, however we strip them during commoning, which
      the IR verifier asserts on.
      
      Fix this by using applyMergedLocation: it will perform the same DebugLoc
      stripping of conditional Locs, but will also generate an unknown location
      DebugLoc that satisfies the requirement for inlinable calls to always have
      locations.
      
      Some of the prior logic for selecting a DebugLoc is now likely redundant;
      I'll generate a follow-up to remove it (involves editing more regression
      tests).
      
      Differential Revision: https://reviews.llvm.org/D54997
      
      llvm-svn: 347782
      9b4cfa55
    • [ThinLTO] Correct linkonce_any function import linkage. NFC. · 53e52e47
      Xin Tong authored
      Summary:
      This is NFC, as we do not import non-odr vague linkage when computing
      the import list for a module.
      
      Reviewers: tejohnson, pcc
      
      Subscribers: inglorion, dexonsmith, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D54928
      
      llvm-svn: 347763
      53e52e47
  26. Nov 26, 2018
  27. Nov 19, 2018
    • [Transforms] Prefer static and avoid namespaces, NFC · 994a8451
      Reid Kleckner authored
      Put 'static' on three functions in an anonymous namespace as per our
      coding style.
      
      Remove the 'namespace llvm {}' around the .cpp file and explicitly
      declare the free function 'llvm::optimizeGlobalCtorsList' in 'llvm::'.
      I prefer this style for free functions because the compiler will error
      out if the .h and .cpp files don't agree on the function name or
      prototype.
      
      llvm-svn: 347269
      994a8451
    • [IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock · 4de31bba
      Vedant Kumar authored
      Add methods to BasicBlock which make it easier to efficiently check
      whether a block has N (or more) predecessors.
      
      This can be more efficient than using pred_size(), which is a linear
      time operation.
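
      The saving comes from stopping early. A minimal sketch over a generic
      iterator range follows. The helper names `hasNItems` and
      `hasNItemsOrMore` are used here for illustration, and the
      `std::forward_list` in the usage note is an assumed stand-in for a
      block's predecessor list.

      ```cpp
      #include <forward_list>

      // True if [Begin, End) has exactly N elements. Visits at most N + 1
      // elements instead of walking the whole range to count it.
      template <typename It> bool hasNItems(It Begin, It End, unsigned N) {
        for (; N; --N, ++Begin)
          if (Begin == End)
            return false; // Ran out early: too few elements.
        return Begin == End;
      }

      // True if [Begin, End) has at least N elements. Visits at most N.
      template <typename It> bool hasNItemsOrMore(It Begin, It End, unsigned N) {
        for (; N; --N, ++Begin)
          if (Begin == End)
            return false;
        return true;
      }
      ```

      For a `std::forward_list<int>` with three entries, `hasNItems(..., 3)`
      and `hasNItemsOrMore(..., 2)` hold while `hasNItems(..., 2)` does not.
      A query such as "exactly one predecessor?" therefore stops after two
      steps no matter how long the list is.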
      
      We might consider adding similar methods for successors. I haven't done
      so in this patch because succ_size() is already O(1).
      
      With this patch applied, I measured a 0.065% compile-time reduction in
      user time for running `opt -O3` on the sqlite3 amalgamation (30 trials).
      The change in mergeStoreIntoSuccessor alone saves 45 million linked list
      iterations in a stage2 Release build of llc.
      
      See llvm.org/PR39702 for a harder but more general way of achieving
      similar results.
      
      Differential Revision: https://reviews.llvm.org/D54686
      
      llvm-svn: 347256
      4de31bba