  1. Dec 13, 2017
    • [EarlyCSE] recognize commuted and swapped variants of min/max as equivalent (PR35642) · 3c7a35de
      Sanjay Patel authored
      As shown in:
      https://bugs.llvm.org/show_bug.cgi?id=35642
      ...we can have different forms of min/max, so we should recognize those here in EarlyCSE 
      similar to how we already handle binops and compares that can commute.
      
      Differential Revision: https://reviews.llvm.org/D41136
      
      llvm-svn: 320640
      3c7a35de
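The idea of treating commuted and swapped min/max forms as equivalent can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the EarlyCSE code: `Select`, `MinMaxKey`, `canonicalize`, and `sameMinMax` are hypothetical names, values are modeled as integer IDs, and only integer-style comparisons are considered (signedness and floating-point semantics are ignored).

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model: a select fed by a compare of two value IDs.
enum class Pred { LT, LE, GT, GE };
enum class Flavor { None, Min, Max };

struct Select {
  Pred P;       // predicate of the compare
  int CmpLHS;   // left compare operand (value ID)
  int CmpRHS;   // right compare operand (value ID)
  int TrueVal;  // value selected when the compare is true
  int FalseVal; // value selected when the compare is false
};

struct MinMaxKey {
  Flavor F;
  int Lo; // smaller operand ID
  int Hi; // larger operand ID
  bool operator==(const MinMaxKey &O) const {
    return F == O.F && Lo == O.Lo && Hi == O.Hi;
  }
};

// Map every spelling of min(a, b) or max(a, b) to one key.
MinMaxKey canonicalize(const Select &S) {
  const bool Less = (S.P == Pred::LT || S.P == Pred::LE);
  Flavor F = Flavor::None;
  if (S.TrueVal == S.CmpLHS && S.FalseVal == S.CmpRHS)
    F = Less ? Flavor::Min : Flavor::Max; // (a < b) ? a : b  is min
  else if (S.TrueVal == S.CmpRHS && S.FalseVal == S.CmpLHS)
    F = Less ? Flavor::Max : Flavor::Min; // (a < b) ? b : a  is max
  if (F == Flavor::None)
    return MinMaxKey{Flavor::None, 0, 0};
  // Sorting the operand IDs makes the key independent of operand order.
  return MinMaxKey{F, std::min(S.CmpLHS, S.CmpRHS),
                   std::max(S.CmpLHS, S.CmpRHS)};
}

// Two selects are equivalent min/max forms if both match and share a key.
bool sameMinMax(const Select &A, const Select &B) {
  const MinMaxKey KA = canonicalize(A);
  return KA.F != Flavor::None && KA == canonicalize(B);
}
```

In this model, `x < y ? x : y`, `y > x ? x : y`, and `x > y ? y : x` all produce the same key, so a CSE-style table keyed on it would treat them as one expression.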
    • [JumpThreading] Preservation of DT and LVI across the pass · d989af98
      Brian M. Rzycki authored
      Summary:
      See D37528 for a previous (non-deferred) version of this
      patch and its description.
      
      Preserves dominance in a deferred manner using a new class
      DeferredDominance. This reduces the performance impact of
      updating the DominatorTree at every edge insertion and
      deletion. A user may call DDT->flush() within JumpThreading
      for an up-to-date DT. This patch currently has one flush()
      at the end of runImpl() to ensure DT is preserved across
      the pass.
      
      LVI is also preserved to help subsequent passes such as
      CorrelatedValuePropagation. LVI is simpler to maintain and
      is done immediately (not deferred). The code to perform the
      preservation was minimally altered, and LVI is simply marked
      as preserved so that the PassManager is informed.
      
      This extends the analysis available to JumpThreading for
      future enhancements. One example is loop boundary threading.
      
      Reviewers: dberlin, kuhar, sebpop
      
      Reviewed By: kuhar, sebpop
      
      Subscribers: hiraditya, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D40146
      
      llvm-svn: 320612
      d989af98
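The deferred-update pattern described in this commit can be sketched in isolation. The class below is a hypothetical stand-in, not LLVM's DeferredDominance: `DeferredEdgeUpdates` and its members are assumed names, and a plain edge set stands in for the DominatorTree. It only shows that edge insertions and deletions are queued and that the expensive structure is brought up to date once, on `flush()`.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>; // (from block ID, to block ID)

// Hypothetical sketch: queue CFG edge updates and apply them in one batch.
class DeferredEdgeUpdates {
public:
  void insertEdge(int From, int To) {
    Pending.push_back(Update{true, Edge(From, To)});
  }
  void deleteEdge(int From, int To) {
    Pending.push_back(Update{false, Edge(From, To)});
  }
  bool hasPending() const { return !Pending.empty(); }

  // Apply all queued updates in order and return the up-to-date edge set.
  // A real implementation would update the dominator tree here instead.
  const std::set<Edge> &flush() {
    if (!Pending.empty()) {
      for (const Update &U : Pending) {
        if (U.IsInsert)
          Edges.insert(U.E);
        else
          Edges.erase(U.E);
      }
      Pending.clear();
      ++Rebuilds; // one batch update instead of one per edge change
    }
    return Edges;
  }

  std::size_t rebuilds() const { return Rebuilds; }

private:
  struct Update {
    bool IsInsert;
    Edge E;
  };
  std::vector<Update> Pending;
  std::set<Edge> Edges;
  std::size_t Rebuilds = 0;
};
```

A caller that needs current information in the middle of a transformation calls `flush()` at that point; otherwise a single `flush()` at the end keeps the result valid for later passes.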
    • [GVNHoist] Fix: PR35222 gvn-hoist incorrectly erases load · 49c03b11
      Aditya Kumar authored
      w.r.t. the paper
      "A Practical Improvement to the Partial Redundancy Elimination in SSA Form"
      (https://sites.google.com/site/jongsoopark/home/ssapre.pdf)
      
      A proper dominance check was missing here, so having LoopInfo should not be required.
      Committing this diff because it fixes the bug; if there are
      further concerns, I'll be happy to work on them.
      
      Differential Revision: https://reviews.llvm.org/D39781
      
      llvm-svn: 320607
      49c03b11
    • Reintroduce r320049, r320014 and r319894. · e0edb664
      Igor Laevsky authored
      OpenGL issues should be fixed by now.
      
      llvm-svn: 320568
      e0edb664
    • [SLP] Vectorize jumbled memory loads. · dbd30edb
      Mohammad Shahid authored
      Summary:
      This patch tries to vectorize loads of consecutive memory locations that are accessed
      in a non-consecutive or jumbled order. An earlier attempt was made with patch D26905,
      which was reverted due to a basic issue with representing the 'use mask' of
      jumbled accesses.
      
      This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
      
      Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
      
      Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh
      
      Reviewed By: Ayal
      
      Subscribers: mgrang, dcaballe, hans, mzolotukhin
      
      Differential Revision: https://reviews.llvm.org/D36130
      
      llvm-svn: 320548
      dbd30edb
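To make the notion of a 'use mask' concrete, here is a minimal sketch that is independent of the SLP vectorizer's actual data structures. The function name `computeShuffleMask` and the representation of loads as element offsets are assumptions for illustration. If the offsets form a permutation of a consecutive range, the loads can be served by one wide load followed by a shuffle, and the mask records which lane each original load reads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: Offsets[i] is the element offset read by the i-th
// scalar load. Returns true and fills Mask when the offsets are a
// permutation of Base, Base+1, ..., Base+N-1; Mask[i] is then the lane of
// the wide load that the i-th scalar load corresponds to.
bool computeShuffleMask(const std::vector<int> &Offsets,
                        std::vector<int> &Mask) {
  Mask.clear();
  if (Offsets.empty())
    return false;
  const int Base = *std::min_element(Offsets.begin(), Offsets.end());
  std::vector<bool> Seen(Offsets.size(), false);
  for (int Off : Offsets) {
    const std::size_t Lane = static_cast<std::size_t>(Off - Base);
    // A gap or a repeated offset means the loads are not a jumbled
    // version of one consecutive access.
    if (Lane >= Offsets.size() || Seen[Lane]) {
      Mask.clear();
      return false;
    }
    Seen[Lane] = true;
    Mask.push_back(static_cast<int>(Lane));
  }
  return true;
}
```

For loads of `a[1], a[0], a[3], a[2]`, the sketch yields the mask `{1, 0, 3, 2}`, while strided offsets such as `0, 2, 4, 6` are rejected.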
    • [CallSiteSplitting] Refactor creating callsites. · beda7d51
      Florian Hahn authored
      Summary:
      This change makes the call site creation more general if any of the
      arguments is predicated on a condition in the call site's predecessors.
      
      If we find a call site that can potentially be split, we collect the set
      of conditions for the call site's predecessors (currently only 2
      predecessors are allowed). To do that, we traverse each predecessor's
      predecessors as long as it only has single predecessors and record the
      condition, if it is relevant to the call site. For each condition, we
      also check if the condition is taken or not. In case it is not taken,
      we record the inverse predicate.
      
      We use the recorded conditions to create the new call sites and split
      the basic block.
      
      This has 2 benefits: (1) it is slightly easier to see what is going on
      (IMO) and (2) we can easily extend it to handle more complex control
      flow.
      
      Reviewers: davidxl, junbuml
      
      Reviewed By: junbuml
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D40728
      
      llvm-svn: 320547
      beda7d51
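The condition-collection walk described in this commit can be sketched on a toy CFG. Everything here is a hypothetical model rather than the CallSiteSplitting code: `Block`, `Cond`, `Predicate`, and `recordConditions` are assumed names, conditions are reduced to "argument compared against a constant" with only EQ and NE, and relevance is decided by checking the argument name against the call's arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class Predicate { EQ, NE };

struct Cond {
  Predicate P;
  std::string Arg; // name of the value being tested
};

struct Block {
  int SinglePred; // index of the only predecessor, or -1
  bool HasCond;   // true if the block ends in a conditional branch
  Cond C;         // branch condition (meaningful only if HasCond)
  int TrueSucc;   // successor taken when the condition holds
};

Predicate invert(Predicate P) {
  return P == Predicate::EQ ? Predicate::NE : Predicate::EQ;
}

// Walk from Pred upwards through single-predecessor blocks and record every
// condition on a call argument that is known on the path to CallSiteBB.
std::vector<Cond> recordConditions(const std::vector<Block> &CFG, int Pred,
                                   int CallSiteBB,
                                   const std::set<std::string> &CallArgs) {
  std::vector<Cond> Out;
  int From = Pred, To = CallSiteBB;
  // The step bound guards against cycles of single predecessors.
  for (std::size_t Step = 0; From != -1 && Step < CFG.size(); ++Step) {
    const Block &B = CFG[From];
    if (B.HasCond && CallArgs.count(B.C.Arg)) {
      Cond C = B.C;
      if (B.TrueSucc != To) // reached via the false edge: record the inverse
        C.P = invert(C.P);
      Out.push_back(C);
    }
    To = From;
    From = B.SinglePred;
  }
  return Out;
}
```

Running this once per predecessor of the call site gives one condition list per split copy of the call, which is the information needed to specialize each copy.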
  2. Dec 12, 2017
  3. Dec 11, 2017
  4. Dec 10, 2017
  5. Dec 09, 2017
  6. Dec 08, 2017
    • [Debugify] Add a pass to test debug info preservation · 195dfd10
      Vedant Kumar authored
      The Debugify pass synthesizes debug info for IR. It's paired with a
      CheckDebugify pass which determines how much of the original debug info
      is preserved. These passes make it easier to create targeted tests for
      debug info preservation.
      
      Here is the Debugify algorithm:
      
        NextLine = 1
        for (Instruction &I : M)
          attach DebugLoc(NextLine++) to I
      
        NextVar = 1
        for (Instruction &I : M)
          if (canAttachDebugValue(I))
            attach dbg.value(NextVar++) to I
      
      The CheckDebugify pass expects contiguous ranges of DILocations and
      DILocalVariables. If it fails to find all of the expected debug info, it
      prints a specific error to stderr which can be FileChecked.
      
      This was discussed on llvm-dev in the thread:
      "Passes to add/validate synthetic debug info"
      
      Differential Revision: https://reviews.llvm.org/D40512
      
      llvm-svn: 320202
      195dfd10
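The numbering scheme in the Debugify pseudocode, and the kind of check that CheckDebugify performs, can be modeled without any LLVM types. The sketch below is an assumption-laden illustration: `Inst`, `Counts`, `debugify`, and `missingLines` are hypothetical names, an instruction is reduced to a line number, a variable number, and a flag saying whether a debug value can be attached, and 0 means "no debug info".

```cpp
#include <cassert>
#include <vector>

struct Inst {
  bool ProducesValue; // stands in for canAttachDebugValue(I)
  unsigned Line;      // synthetic line number, 0 if absent
  unsigned Var;       // synthetic variable number, 0 if absent
};

struct Counts {
  unsigned Lines;
  unsigned Vars;
};

// Attach consecutive line numbers to every instruction, then consecutive
// variable numbers to every instruction that can carry a debug value.
Counts debugify(std::vector<Inst> &M) {
  unsigned NextLine = 1;
  for (Inst &I : M)
    I.Line = NextLine++;
  unsigned NextVar = 1;
  for (Inst &I : M)
    if (I.ProducesValue)
      I.Var = NextVar++;
  return Counts{NextLine - 1, NextVar - 1};
}

// After some transformation has run, report which of the expected line
// numbers 1..NumLines no longer appear on any instruction.
std::vector<unsigned> missingLines(const std::vector<Inst> &M,
                                   unsigned NumLines) {
  std::vector<bool> Seen(NumLines + 1, false);
  for (const Inst &I : M)
    if (I.Line >= 1 && I.Line <= NumLines)
      Seen[I.Line] = true;
  std::vector<unsigned> Missing;
  for (unsigned L = 1; L <= NumLines; ++L)
    if (!Seen[L])
      Missing.push_back(L);
  return Missing;
}
```

Because the numbers form a contiguous range, any gap found afterwards points directly at debug info that a transformation dropped; the same check applies to variable numbers.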
    • [CodeExtractor] Add debug locations for new call and branch instrs. · e5089e2e
      Florian Hahn authored
      Summary:
      If a partially inlined function has debug info, we have to add debug
      locations to the call instruction calling the outlined function.
      We use the debug location of the first instruction in the outlined
      function, as the introduced call transfers control to this statement and
      there is no other equivalent line in the source code.
      
      We also use the same debug location for the branch instruction added
      to jump from the artificial entry block of the outlined function, which just
      jumps to the first actual basic block of the outlined function.
      
      Reviewers: davide, aprantl, rriddle, dblaikie, danielcdh, wmi
      
      Reviewed By: aprantl, rriddle, danielcdh
      
      Subscribers: eraman, JDevlieghere, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D40413
      
      llvm-svn: 320199
      e5089e2e
    • Revert r320104: infinite loop profiling bug fix · d91057bf
      Xinliang David Li authored
      Causes an unexpected memory issue with the new PM this time.
      The new PM invalidates BPI but not BFI, leaving the
      reference to BPI from BFI invalid.
      
      Abandon this patch. There is a more general solution
      that also handles infinite loops at runtime (but not statically).
      
      llvm-svn: 320180
      d91057bf
    • [InstCombine] PR35354: Convert store(bitcast, load bitcast (select (Cond, &V1,... · ec95c6cc
      Alexey Bataev authored
      [InstCombine] PR35354: Convert store(bitcast, load bitcast (select (Cond, &V1, &V2))  --> store (, load (select(Cond, load &V1, load &V2)))
      
      Summary:
      If we have code like this:
      ```
      float a, b;
      a = std::max(a, b);
      ```
      it is converted into something like this:
      ```
      %call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float* nonnull dereferenceable(4) %a.addr, float* nonnull dereferenceable(4) %b.addr)
      %1 = bitcast float* %call to i32*
      %2 = load i32, i32* %1, align 4
      %3 = bitcast float* %a.addr to i32*
      store i32 %2, i32* %3, align 4
      ```
      After inlining, this code is converted to the following:
      ```
      %1 = load float, float* %a.addr
      %2 = load float, float* %b.addr
      %cmp.i = fcmp fast olt float %1, %2
      %__b.__a.i = select i1 %cmp.i, float* %a.addr, float* %b.addr
      %3 = bitcast float* %__b.__a.i to i32*
      %4 = load i32, i32* %3, align 4
      %5 = bitcast float* %arrayidx to i32*
      store i32 %4, i32* %5, align 4
      
      ```
      This pattern is not recognized as a min/max pattern.
      The patch solves this problem by converting the sequence
      ```
      store (bitcast, (load bitcast (select ((cmp V1, V2), &V1, &V2))))
      ```
      to the sequence
      ```
      store (,load (select((cmp V1, V2), &V1, &V2)))
      ```
      After this, the code is recognized as a min/max pattern.
      
      Reviewers: RKSimon, spatel
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D40304
      
      llvm-svn: 320157
      ec95c6cc
  7. Dec 07, 2017