  1. Aug 17, 2017
    • Simon Dardis's avatar
      [dfsan] Add explicit zero extensions for shadow parameters in function wrappers. · b5205c69
      Simon Dardis authored
      In the case where dfsan provides a custom wrapper for a function,
      shadow parameters are added for each parameter of the function.
      These parameters are i16s. For targets which do not consider this
      a legal type, the lack of sign extension information would cause
      LLVM to generate anyexts around their usage with phi variables
      and calling convention logic.
      
      Address this by introducing zero exts for each shadow parameter.
      
      Reviewers: pcc, slthakur
      
      Differential Revision: https://reviews.llvm.org/D33349
      
      llvm-svn: 311087
      b5205c69
    • Ayal Zaks's avatar
      [LV] Using VPlan to model the vectorized code and drive its transformation · 66278833
      Ayal Zaks authored
      VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This
      patch introduces the VPlan model into LV and uses it to represent the vectorized
      code and drive the generation of vectorized IR.
      
      In this patch VPlan models the vectorized loop body: the vectorized control-flow
      is represented using VPlan's Hierarchical CFG, with predication refactored from
      being a post-vectorization-step into a vectorization planning step modeling
      if-then VPRegionBlocks, and generating code inline with non-predicated code. The
      vectorized code within each VPBasicBlock is represented as a sequence of
      Recipes, each responsible for modelling and generating a sequence of IR
      instructions. To keep the size of this commit manageable the Recipes in this
      patch are coarse-grained and capture large chunks of LV's code-generation logic.
      The constructed VPlans are dumped in dot format under -debug.
      
      This commit retains current vectorizer output, except for minor instruction
      reorderings; see associated modifications to lit tests.
      
      For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst
      and its references.
      
      Authors: Gil Rapaport and Ayal Zaks
      
      Differential Revision: https://reviews.llvm.org/D32871
      
      llvm-svn: 311077
      66278833
    • Jakub Kuderski's avatar
      Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators · fd5c5c91
      Jakub Kuderski authored
      Summary:
      This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.
      
      I didn't notice any performance impact when bootstrapping clang with this patch.
      
      The patch was originally committed in r311039 and reverted in r311049.
      This revision fixes the problem with not adding a dependency on the
      DominatorTreeWrapperPass for the LegacyPassManager.
      
      Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki
      
      Reviewed By: davide
      
      Subscribers: grandinj, zhendongsu, llvm-commits, david2050
      
      Differential Revision: https://reviews.llvm.org/D35869
      
      llvm-svn: 311057
      fd5c5c91
    • Amjad Aboud's avatar
      [InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount) · 86111c66
      Amjad Aboud authored
      
      Differential Revision: https://reviews.llvm.org/D36784
      
      llvm-svn: 311050
      86111c66
    • Jakub Kuderski's avatar
      Revert "[ADCE][Dominators] Teach ADCE to preserve dominators" · cbcffb17
      Jakub Kuderski authored
      This reverts commit r311039. The patch caused
      `test/Bindings/OCaml/Output/scalar_opts.ml` to fail.
      
      llvm-svn: 311049
      cbcffb17
  2. Aug 16, 2017
    • Craig Topper's avatar
      [InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors · 882f2963
      Craig Topper authored
      
      This also uses decomposeBitTestICmp to decode the compare.
      
      Differential Revision: https://reviews.llvm.org/D36781
      
      llvm-svn: 311044
      882f2963
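The select fold named in the commit above can be checked exhaustively at a smaller width. The following is a minimal sketch, assuming an 8-bit integer type (so the sign-splat shift amount is 7 rather than 31); the helper names are illustrative and are not LLVM APIs.

```python
# Brute-force check of (X >s -1) ? C1 : C2 --> ((X >>s 7) & (C2 - C1)) + C1
# at 8 bits. All helpers are hypothetical and exist only for this sketch.
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret the low BITS bits of v as a two's-complement integer."""
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def ashr(v, amount):
    """Arithmetic shift right on a BITS-wide value, returned as unsigned."""
    return (to_signed(v) >> amount) & MASK


def select_form(x, c1, c2):
    # (X >s -1) ? C1 : C2
    return c1 if to_signed(x) > -1 else c2


def folded_form(x, c1, c2):
    # ((X >>s (BITS - 1)) & (C2 - C1)) + C1, with wrapping arithmetic
    splat = ashr(x, BITS - 1)  # all ones if X is negative, else zero
    return ((splat & ((c2 - c1) & MASK)) + c1) & MASK


def fold_holds():
    consts = [0, 1, 2, 7, 127, 128, 200, 255]
    return all(
        select_form(x, c1, c2) == folded_form(x, c1, c2)
        for x in range(1 << BITS)
        for c1 in consts
        for c2 in consts
    )
```

The shifted value is either all zeros or all ones, so the mask either discards or keeps the difference `C2 - C1` before `C1` is added back.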
    • Jakub Kuderski's avatar
      [ADCE][Dominators] Teach ADCE to preserve dominators · 4552e9de
      Jakub Kuderski authored
      Summary:
      This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.
      
      I didn't notice any performance impact when bootstrapping clang with this patch.
      
      Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki
      
      Reviewed By: davide
      
      Subscribers: grandinj, zhendongsu, llvm-commits, david2050
      
      Differential Revision: https://reviews.llvm.org/D35869
      
      llvm-svn: 311039
      4552e9de
    • Geoff Berry's avatar
      [LoopDataPrefetch][AArch64FalkorHWPFFix] Preserve ScalarEvolution · 40549ad1
      Geoff Berry authored
      Summary:
      Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving
      ScalarEvolution since they do not alter loop structure and should not
      alter any SCEV values (though LoopDataPrefetch may introduce new
      instructions that won't have cached SCEV values yet).
      
      This can result in slight code differences, mainly w.r.t. nsw/nuw flags
      on SCEVs, since these are computed somewhat lazily when a zext/sext
      instruction is encountered.  As a result, passes after the modified
      passes may see SCEVs with more nsw/nuw flags present.
      
      Reviewers: sanjoy, anemet
      
      Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D36716
      
      llvm-svn: 311032
      40549ad1
    • Hal Finkel's avatar
      [BDCE] Don't check demanded bits on unsized types · 9e54b709
      Hal Finkel authored
      To clear assumptions that are potentially invalid after trivialization, we need
      to walk the use/def chain. Normally, the only way to reach an instruction with
      an unsized type is via an instruction that has side effects (or otherwise will
      demand its input bits). That would stop the walk. However, if we have a
      readnone function that returns an unsized type (e.g., void), we must avoid
      asking for the demanded bits of the function call's return value. A
      void-returning readnone function is always dead (and so we can stop walking the
      use/def chain here), but the check is necessary to avoid asserting.
      
      Fixes PR34211.
      
      llvm-svn: 311014
      9e54b709
    • Dehao Chen's avatar
      Merge debug info when hoisting then-else code to if. · 84d41203
      Dehao Chen authored
      Summary: When we move then-else code to if, we need to merge its debug info, otherwise the hoisted instruction may have inaccurate debug info attached.
      
      Reviewers: aprantl, probinson, dblaikie, echristo, loladiro
      
      Reviewed By: aprantl
      
      Subscribers: sanjoy, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D36778
      
      llvm-svn: 310985
      84d41203
    • Craig Topper's avatar
      [InstCombine] Teach canEvaluateZExtd and canEvaluateTruncated to handle vector shifts with splat shift amount · 0a1a276d
      Craig Topper authored
      
      We were only allowing ConstantInt before. This patch allows splat of ConstantInt too.
      
      Differential Revision: https://reviews.llvm.org/D36763
      
      llvm-svn: 310970
      0a1a276d
  3. Aug 15, 2017
    • Amjad Aboud's avatar
      [InstCombine] Added support for (X >>s C) << C --> X & (-1 << C) · 0464c5d9
      Amjad Aboud authored
      Differential Revision: https://reviews.llvm.org/D36743
      
      llvm-svn: 310949
      0464c5d9
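The identity behind the fold in the commit above, `(X >>s C) << C --> X & (-1 << C)`, can be confirmed by brute force. This is a sketch under the assumption of an 8-bit type and in-range shift amounts; the function names are hypothetical.

```python
# Exhaustive 8-bit check that an arithmetic shift right followed by a shift
# left by the same amount only clears the low C bits.
BITS = 8
MASK = (1 << BITS) - 1


def ashr(v, amount):
    """Arithmetic shift right on a BITS-wide value, returned as unsigned."""
    signed = v - (1 << BITS) if v & (1 << (BITS - 1)) else v
    return (signed >> amount) & MASK


def shift_pair(x, c):
    # (X >>s C) << C
    return (ashr(x, c) << c) & MASK


def masked(x, c):
    # X & (-1 << C)
    return x & ((-1 << c) & MASK)


def shift_fold_holds():
    return all(
        shift_pair(x, c) == masked(x, c)
        for x in range(1 << BITS)
        for c in range(BITS)
    )
```

The sign bits shifted in from the top are pushed back out by the left shift, which is why the result does not depend on whether the right shift was arithmetic or logical.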
    • Sanjay Patel's avatar
      [InstCombine] sink sext after ashr · f69b7d5c
      Sanjay Patel authored
      Narrow ops are better for bit-tracking, and in the case of vectors,
      may enable better codegen.
      
      As the trunc test shows, this can allow follow-on simplifications.
      
      There's a block of code in visitTrunc that deals with shifted ops
      with FIXME comments. It may be possible to remove some of that now,
      but I want to make sure there are no problems with this step first.
      
      http://rise4fun.com/Alive/Y3a
      
      Name: hoist_ashr_ahead_of_sext_1
        %s = sext i8 %x to i32
        %r = ashr i32 %s, 3  ; shift value is < source bit width
        =>
        %a = ashr i8 %x, 3
        %r = sext i8 %a to i32
        
      Name: hoist_ashr_ahead_of_sext_2
        %s = sext i8 %x to i32
        %r = ashr i32 %s, 8  ; shift value is >= source bit width
        =>
        %a = ashr i8 %x, 7   ; so clamp this shift value
        %r = sext i8 %a to i32
        
      Name: junc_the_trunc  
        %a = sext i16 %v to i32
        %s = ashr i32 %a, 18
        %t = trunc i32 %s to i16
        =>
        %t = ashr i16 %v, 15
      
      llvm-svn: 310942
      f69b7d5c
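The three Alive cases in the commit above can also be replayed exhaustively. The sketch below models `sext`, `ashr` and `trunc` on plain integers; the helper names are assumptions made for illustration and do not correspond to LLVM functions.

```python
# Replays the sext/ashr examples on plain Python integers.
def to_signed(v, bits):
    """Interpret the low `bits` bits of v as a two's-complement integer."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v & (1 << (bits - 1)) else v


def sext(v, src_bits, dst_bits):
    return to_signed(v, src_bits) & ((1 << dst_bits) - 1)


def ashr(v, amount, bits):
    return (to_signed(v, bits) >> amount) & ((1 << bits) - 1)


def trunc(v, bits):
    return v & ((1 << bits) - 1)


def wide_shift(x, c):
    # %s = sext i8 %x to i32 ; %r = ashr i32 %s, c
    return ashr(sext(x, 8, 32), c, 32)


def narrow_shift(x, c):
    # %a = ashr i8 %x, min(c, 7) ; %r = sext i8 %a to i32
    return sext(ashr(x, min(c, 7), 8), 8, 32)


def sink_sext_holds():
    return all(
        wide_shift(x, c) == narrow_shift(x, c)
        for x in range(1 << 8)
        for c in range(32)
    )


def junc_the_trunc_holds():
    return all(
        trunc(ashr(sext(v, 16, 32), 18, 32), 16) == ashr(v, 15, 16)
        for v in range(1 << 16)
    )
```

Clamping the narrow shift amount to 7 works because any wider arithmetic shift of a sign-extended i8 leaves only copies of the sign bit.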
    • Jakub Kuderski's avatar
      [Dominators] Include infinite loops in PostDominatorTree · 638c085d
      Jakub Kuderski authored
      Summary:
      This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change.
      
      What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG.
      
      This patch makes the following assumptions:
      - A sequence of updates should produce the same tree as recalculating it.
      - Any sequence of the same updates should lead to the same tree.
      - Siblings and roots are unordered.
      
      The last two properties are essential to efficiently perform batch updates in the future.
      When it comes to the first one, we can decide later that the consistency between a freshly built tree and an updated one doesn't matter much, as there are many correct ways to pick roots in infinite loops, and relax this assumption. That should enable us to recalculate postdominators less frequently.
      
      This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough.
      That being said, my experiments showed that reverse-unreachable blocks are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call:
      
      ```
      # functions:  52283
      # samples:  337609
      # reverse unreachable BBs:  216022
      # BBs:  247840796
      Percent reverse-unreachable:  0.08716159869015269 %
      Max(PercRevUnreachable) in a function:  87.58620689655172 %
      # > 25 % samples:  471 ( 0.1395104988314885 % samples )
      ... in 145 ( 0.27733680163724345 % functions )
      ```
      
      Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway.
      
      I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :).
      
      Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel
      
      Reviewed By: dberlin
      
      Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050
      
      Differential Revision: https://reviews.llvm.org/D35851
      
      llvm-svn: 310940
      638c085d
    • Rui Ueyama's avatar
      Fix -Wunused-lambda-capture for Release build. · 4a179550
      Rui Ueyama authored
      `I` and `this` are used only in assert or DEBUG, so they are unused
      in Release build.
      
      llvm-svn: 310934
      4a179550
    • Ayal Zaks's avatar
      [LV] Minor savings to Sink casts to unravel first order recurrence · 25e2800e
      Ayal Zaks authored
      Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it
      is not needed.
      
      Differential Revision: https://reviews.llvm.org/D36408
      
      llvm-svn: 310910
      25e2800e
    • Dinar Temirbulatov's avatar
      [SLPVectorizer] Replace VL[0] with VL0 (with an assert), add propagateIRFlags extra parameter VL0, replace E->Scalars[0] with VL0, NFCI. · 9e43d6e7
      Dinar Temirbulatov authored
      
      llvm-svn: 310904
      9e43d6e7
    • Dehao Chen's avatar
      Add missing dependency in ICP. (NFC) · 45847d36
      Dehao Chen authored
      llvm-svn: 310896
      45847d36
    • Reid Kleckner's avatar
      Remove checks for debug info intrinsics in use lists, NFC · 18728822
      Reid Kleckner authored
      These haven't done anything since debug info intrinsics stopped
      appearing in Value use lists in 2014.
      
      llvm-svn: 310892
      18728822
  4. Aug 14, 2017
    • Craig Topper's avatar
      Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" · 0aa3a195
      Craig Topper authored
      
      This recommits r310869, with the moved files and no extra changes.
      
      Original commit message:
      
      This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.
      
      I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.
      
      I also had to make decomposeBitTest support vectors since InstSimplify needs that.
      
      As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.
      
      Differential Revision: https://reviews.llvm.org/D36593
      
      llvm-svn: 310889
      0aa3a195
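The commit above concerns decomposing an integer compare into a bit test of the form `(X & Mask) == 0` or `(X & Mask) != 0`. The sketch below checks four such equivalences at 8 bits. Which predicates LLVM actually handles is an assumption here beyond the mention of ugt and ult in the message, and the code does not reproduce the `decomposeBitTestICmp` interface.

```python
# Exhaustive 8-bit check of compare-to-bit-test equivalences.
# The chosen predicates and masks are assumptions made for illustration.
BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(v):
    return v - (1 << BITS) if v & SIGN else v


def decomposition_holds():
    for x in range(1 << BITS):
        s = to_signed(x)
        # icmp slt X, 0   <=>  (X & SignMask) != 0
        if (s < 0) != ((x & SIGN) != 0):
            return False
        # icmp sgt X, -1  <=>  (X & SignMask) == 0
        if (s > -1) != ((x & SIGN) == 0):
            return False
        for k in range(BITS):
            p = 1 << k
            # icmp ult X, 2^k      <=>  (X & -(2^k)) == 0
            if (x < p) != ((x & (-p & MASK)) == 0):
                return False
            # icmp ugt X, 2^k - 1  <=>  (X & ~(2^k - 1)) != 0
            if (x > p - 1) != ((x & (~(p - 1) & MASK)) != 0):
                return False
    return True
```

Returning only the mask as a plain integer, as the message describes for the APInt version, is enough for a caller to rebuild either form of the test.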
    • Andrew Kaylor's avatar
      Add strictfp attribute to prevent unwanted optimizations of libm calls · 53a5fbb4
      Andrew Kaylor authored
      Differential Revision: https://reviews.llvm.org/D34163
      
      llvm-svn: 310885
      53a5fbb4
    • Craig Topper's avatar
      Revert r310869 "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify" · 69fa8e0d
      Craig Topper authored
      
      Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything.
      
      llvm-svn: 310873
      69fa8e0d
    • Craig Topper's avatar
      [InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify · 2f0b4506
      Craig Topper authored
      
      This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.
      
      I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.
      
      I also had to make decomposeBitTest support vectors since InstSimplify needs that.
      
      As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.
      
      Differential Revision: https://reviews.llvm.org/D36593
      
      llvm-svn: 310869
      2f0b4506
    • Dinar Temirbulatov's avatar
      [SLPVectorizer] Schedule bundle with different opcodes. · 7b78f5e5
      Dinar Temirbulatov authored
      This change lets us schedule a bundle with different opcodes in it, for example: [ load, add, add, add ]
      
      Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab
      
      Subscribers: llvm-commits, rengolin
      
      Differential Revision: https://reviews.llvm.org/D36518
      
      llvm-svn: 310847
      7b78f5e5
    • Sanjay Patel's avatar
      [BDCE] reduce scope of an assert (PR34179) · a1067d9b
      Sanjay Patel authored
      The assert was added with r310779 and is usually correct,
      but as the test shows, not always. The 'volatile' on the
      load is needed to expose the faulty path because without
      it, DemandedBits would return that the load is just dead
      rather than not demanded, and so we wouldn't hit the
      bogus assert.
      
      Also, since the lambda is just a single line now, get rid
      of it and inline the DB.isAllOnesValue() calls.
      
      This should fix (prevent execution of a faulty assert):
      https://bugs.llvm.org/show_bug.cgi?id=34179
      
      llvm-svn: 310842
      a1067d9b
    • Sam Parker's avatar
      [LoopUnroll] Enable option to peel remainder loop · 718c8a6a
      Sam Parker authored
      On some targets, the penalty of executing runtime unrolling checks
      and then not executing the unrolled loop can be significantly detrimental
      to performance. This forces us to be more conservative with
      the unroll count: keeping a trip count of 2 reduces the overhead and
      increases the chance of the unrolled body being executed. But
      being conservative leaves performance gains on the table.
      
      This patch enables the unrolling of the remainder loop introduced by
      runtime unrolling. This can help reduce the overhead of misunrolled
      loops because the cost of non-taken branches is much less than the
      cost of the backedge that would normally be executed in the remainder
      loop. This allows larger unroll factors to be used without suffering
      performance losses with smaller iteration counts.
      
      Differential Revision: https://reviews.llvm.org/D36309
      
      llvm-svn: 310824
      718c8a6a
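The idea in the commit above, replacing the remainder loop with straight-line guarded iterations, can be illustrated at source level. This is only a sketch: the unroll factor of 4, the function names, and placing the remainder after the main loop are all assumptions, and Python does not model the branch costs the message refers to.

```python
# Source-level illustration of unrolling the remainder loop.
UNROLL = 4  # assumed unroll factor, chosen only for this sketch


def sum_with_remainder_loop(values):
    n = len(values)
    main_end = n - n % UNROLL
    total = 0
    i = 0
    while i < main_end:  # unrolled main loop
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
        i += UNROLL
    while i < n:  # remainder loop: one backedge per leftover iteration
        total += values[i]
        i += 1
    return total


def sum_with_unrolled_remainder(values):
    n = len(values)
    main_end = n - n % UNROLL
    total = 0
    i = 0
    while i < main_end:  # same unrolled main loop
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
        i += UNROLL
    # At most UNROLL - 1 iterations remain, so they can be written as
    # straight-line code guarded by forward branches, with no backedge.
    if i < n:
        total += values[i]
        if i + 1 < n:
            total += values[i + 1]
            if i + 2 < n:
                total += values[i + 2]
    return total
```

Both functions compute the same sum for any input length; only the control flow of the leftover iterations differs.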
    • Craig Topper's avatar
      [InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstants · f7200990
      Craig Topper authored
      Summary:
      These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code.
      
      Next step is to use m_APInt instead of ConstantInt.
      
      Reviewers: spatel, efriedma, davide, majnemer
      
      Reviewed By: spatel
      
      Subscribers: zzheng, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D36439
      
      llvm-svn: 310806
      f7200990
  5. Aug 12, 2017
  6. Aug 11, 2017
    • Eli Friedman's avatar
      [OptDiag] Updating Remarks in SampleProfile · 51cf2604
      Eli Friedman authored
      Updating remark API to newer OptimizationDiagnosticInfo API. This
      allows remarks to show up in diagnostic yaml file, and enables use
      of opt-viewer tool.
      
      Remarks (L505 and L751) do not display hotness
      information, most likely because profile information has not been
      propagated yet. Unsure if this is the desired outcome.
      
      Patch by Tarun Rajendran.
      
      Differential Revision: https://reviews.llvm.org/D36127
      
      llvm-svn: 310763
      51cf2604
    • Xinliang David Li's avatar
      Fix typo /NFC · 24524f31
      Xinliang David Li authored
      llvm-svn: 310737
      24524f31
  7. Aug 10, 2017
    • Craig Topper's avatar
      [InstCombine] Make (X|C1)^C2 -> X^(C1^C2) iff X&~C1 == 0 work for splat vectors · 9a6110b2
      Craig Topper authored
      This also corrects the description to match what was actually implemented. The old comment said X^(C1|C2), but it implemented X^((C1|C2)&~(C1&C2)). I believe ((C1|C2)&~(C1&C2)) is equivalent to (C1^C2).
      
      Differential Revision: https://reviews.llvm.org/D36505
      
      llvm-svn: 310658
      9a6110b2
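The equivalence the message above describes as believed, `((C1|C2)&~(C1&C2)) == (C1^C2)`, is easy to confirm exhaustively. A minimal sketch, assuming 8-bit constants; the function name is hypothetical.

```python
# Exhaustive check that (C1 | C2) & ~(C1 & C2) equals C1 ^ C2.
def xor_identity_holds(bits=8):
    mask = (1 << bits) - 1
    return all(
        (((c1 | c2) & ~(c1 & c2)) & mask) == (c1 ^ c2)
        for c1 in range(1 << bits)
        for c2 in range(1 << bits)
    )
```

The left-hand side keeps the bits set in either constant and removes those set in both, which is the definition of exclusive or.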
    • Craig Topper's avatar
      [InstCombine] Fix a crash in getSelectCondition if we happen to have two inverse vectors of i1 constants. · 57b4d864
      Craig Topper authored
      
      We used to try to truncate the constant vector to vXi1, but if it's already i1 this would fail. Instead we now use IRBuilder::getZExtOrTrunc which should check the type and only create a trunc if needed. I believe this should trigger constant folding in the IRBuilder and ultimately do the same thing just with the additional type check.
      
      llvm-svn: 310639
      57b4d864
    • Craig Topper's avatar
      [InstCombine] Add a DEBUG_COUNTER to InstCombine to limit how many instructions are visited for debug · cd13ebca
      Craig Topper authored
      
      Sometimes it would be nice to stop InstCombine midway through its combining to see the current IR. By using a debug counter we can place an upper limit on how many instructions to process.
      
      This will also allow skipping the first X combines, but that has the potential to change later combines since earlier canonicalizations might have been skipped.
      
      Differential Revision: https://reviews.llvm.org/D36553
      
      llvm-svn: 310638
      cd13ebca
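The skip-and-limit behaviour described in the commit above can be pictured with a small counter. The class below is a hypothetical stand-in written for this sketch; it is not LLVM's DebugCounter, and the parameter names `skip` and `count` are assumptions.

```python
# Hypothetical counter that skips the first `skip` visits and then allows
# at most `count` of them; it does not reproduce LLVM's DebugCounter API.
class VisitCounter:
    def __init__(self, skip=0, count=None):
        self.skip = skip
        self.count = count  # None means no upper limit
        self.seen = 0

    def should_execute(self):
        index = self.seen
        self.seen += 1
        if index < self.skip:
            return False
        if self.count is not None and index >= self.skip + self.count:
            return False
        return True


def run_combines(instructions, counter):
    """Return the instructions that would be visited under the counter."""
    return [inst for inst in instructions if counter.should_execute()]
```

For example, `run_combines(list("abcdef"), VisitCounter(skip=1, count=3))` visits only `b`, `c` and `d`. As the message notes, skipping early visits can change what later combines see.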
    • Craig Topper's avatar
      [DebugCounter] Move the semicolon out of the DEBUG_COUNTER macro and require it to be placed at the end of each use. · 9cd976d0
      Craig Topper authored
      
      This makes it consistent with STATISTIC, which it will often appear near.
      
      While there, move one DEBUG_COUNTER instance out of an anonymous namespace. It's already declaring a static variable so the namespace is unnecessary.
      
      llvm-svn: 310637
      9cd976d0
    • Alexander Potapenko's avatar
      [sanitizer-coverage] Change cmp instrumentation to distinguish const operands · 52410815
      Alexander Potapenko authored
      This implementation of SanitizerCoverage instrumentation inserts different
      callbacks depending on constantness of operands:
      
        1. If both operands are non-const, then a usual
           __sanitizer_cov_trace_cmp[1248] call is inserted.
        2. If exactly one operand is const, then a
           __sanitizer_cov_trace_const_cmp[1248] call is inserted. The first
           argument of the call is always the constant one.
        3. If both operands are const, then no callback is inserted.
      
      This separation is useful in fuzzing when tasks like "find one operand
      of the comparison in input arguments and replace it with the other one"
      have to be done. The new instrumentation allows us to not waste time
      searching for the constant operands in the input.
      
      Patch by Victor Chibotaru.
      
      llvm-svn: 310600
      52410815
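The three cases listed in the commit above amount to a small decision table. The sketch below encodes it; the function name, the `"lhs"`/`"rhs"` labels and the tuple return shape are assumptions made for illustration, while the callback name prefixes and the 1, 2, 4, 8 size suffixes come from the message.

```python
# Decision table for choosing a comparison callback by operand constness.
def pick_cmp_callback(lhs_is_const, rhs_is_const, size_bytes):
    """Return (callback_name, argument_order), or None if nothing is inserted."""
    if size_bytes not in (1, 2, 4, 8):
        raise ValueError("unsupported operand size")
    if lhs_is_const and rhs_is_const:
        return None  # both operands constant: no callback
    if lhs_is_const or rhs_is_const:
        # Exactly one constant operand: it is always passed first.
        order = ("lhs", "rhs") if lhs_is_const else ("rhs", "lhs")
        return (f"__sanitizer_cov_trace_const_cmp{size_bytes}", order)
    return (f"__sanitizer_cov_trace_cmp{size_bytes}", ("lhs", "rhs"))
```

Because the constant always arrives as the first argument, a consumer of the callback never has to work out which of the two operands might appear in the input.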
  8. Aug 09, 2017