  1. Sep 27, 2016
    • Output optimization remarks in YAML · a62b7e1a
      Adam Nemet authored
      (Re-committed after moving the template specialization under the yaml
      namespace.  GCC was complaining about this.)
      
      This allows the data to be presented in various ways by an external tool.
      This was first recommended here[1].
      
      As an example, consider this module:
      
        1 int foo();
        2 int bar();
        3
        4 int baz() {
        5   return foo() + bar();
        6 }
      
      The inliner generates these missed-optimization remarks today (the
      hotness information is pulled from PGO):
      
        remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
        remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)
      
      Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:
      
        --- !Missed
        Pass:            inline
        Name:            NotInlined
        DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
        Function:        baz
        Hotness:         30
        Args:
          - Callee: foo
          - String:  will not be inlined into
          - Caller: baz
        ...
        --- !Missed
        Pass:            inline
        Name:            NotInlined
        DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
        Function:        baz
        Hotness:         30
        Args:
          - Callee: bar
          - String:  will not be inlined into
          - Caller: baz
        ...
      
      This is a summary of the high-level decisions:
      
      * There is a new streaming interface to emit optimization remarks.
      E.g. for the inliner remark above:
      
         ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                      DEBUG_TYPE, "NotInlined", &I)
                  << NV("Callee", Callee) << " will not be inlined into "
                  << NV("Caller", CS.getCaller()) << setIsVerbose());
      
      NV stands for named value and allows the YAML client to process a remark
      using its name (NotInlined) and the named arguments (Callee and Caller)
      without parsing the text of the message.
      
      Subsequent patches will update ORE users to use the new streaming API.
      
      * I am using YAML I/O for writing the YAML file.  YAML I/O requires you
      to specify reading and writing at once but reading is highly non-trivial
      for some of the more complex LLVM types.  Since it's not clear that we
      (ever) want to use LLVM to parse this YAML file, the code supports and
      asserts that we're writing only.
      
      On the other hand, I did verify experimentally that the class hierarchy
      starting at DiagnosticInfoOptimizationBase can be mapped back from the
      YAML generated here (see D24479).
      
      * The YAML stream is stored in the LLVM context.
      
      * In the example, we can probably further specify the IR value used,
      i.e. print "Function" rather than "Value".
      
      * As before, hotness is computed in the analysis pass instead of
      DiagnosticInfo.  This avoids the layering problem since BFI is in
      Analysis while DiagnosticInfo is in IR.
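
As a rough illustration of the streaming interface in the first bullet, the following self-contained C++ sketch models a remark that accepts named values and plain strings and renders both the text message and a YAML document. NamedValue, MissedRemark, getMsg and toYAML are hypothetical stand-ins, not the classes added by this commit, and DebugLoc and Hotness are omitted for brevity.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the commit's NV ("named value") helper.
struct NamedValue {
  std::string Key;
  std::string Val;
};

// Hypothetical, simplified model of a missed-optimization remark.
class MissedRemark {
public:
  MissedRemark(std::string Pass, std::string Name, std::string Function)
      : Pass(std::move(Pass)), Name(std::move(Name)),
        Function(std::move(Function)) {
    assert(!this->Name.empty() && "a remark needs a name");
  }

  // Named arguments keep their key so a YAML client can look them up.
  MissedRemark &operator<<(const NamedValue &NV) {
    Args.push_back(NV);
    return *this;
  }

  // Plain text is stored under the generic key "String".
  MissedRemark &operator<<(const std::string &Text) {
    Args.push_back({"String", Text});
    return *this;
  }

  // Human-readable message: the argument values concatenated in order.
  std::string getMsg() const {
    std::string Msg;
    for (const NamedValue &A : Args)
      Msg += A.Val;
    return Msg;
  }

  // Structured form: one YAML document per remark.
  std::string toYAML() const {
    std::string Out = "--- !Missed\n";
    Out += "Pass:            " + Pass + "\n";
    Out += "Name:            " + Name + "\n";
    Out += "Function:        " + Function + "\n";
    Out += "Args:\n";
    for (const NamedValue &A : Args)
      Out += "  - " + A.Key + ": " + A.Val + "\n";
    Out += "...\n";
    return Out;
  }

private:
  std::string Pass, Name, Function;
  std::vector<NamedValue> Args;
};
```

In this model, streaming NamedValue{"Callee", "foo"}, the string " will not be inlined into " and NamedValue{"Caller", "baz"} into a remark produces the message "foo will not be inlined into baz", while a YAML client can read Callee and Caller directly without parsing that text.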
      
      [1] https://reviews.llvm.org/D19678#419445
      
      Differential Revision: https://reviews.llvm.org/D24587
      
      llvm-svn: 282539
      a62b7e1a
    • [DebugInfo] Add comments to phi dbg.value tracking code, NFC · 6481822e
      Reid Kleckner authored
      LLVM developers might be surprised to learn that there are blocks
      without valid insertion points (catchswitch), so it seems worth calling
      that out explicitly.  Also add a FIXME about what we should really be
      doing if we ever need to make optimized Windows EH code debuggable.
      
      While I'm here, make auto usage more consistent with LLVM standards and
      avoid an unnecessary call to insertBefore.
      
      llvm-svn: 282521
      6481822e
    • Revert "Output optimization remarks in YAML" · cc2a3fa8
      Adam Nemet authored
      This reverts commit r282499.
      
      The GCC bots are failing.
      
      llvm-svn: 282503
      cc2a3fa8
    • Output optimization remarks in YAML · 92e928c1
      Adam Nemet authored
      This allows the data to be presented in various ways by an external tool.
      This was first recommended here[1].
      
      As an example, consider this module:
      
        1 int foo();
        2 int bar();
        3
        4 int baz() {
        5   return foo() + bar();
        6 }
      
      The inliner generates these missed-optimization remarks today (the
      hotness information is pulled from PGO):
      
        remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
        remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)
      
      Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:
      
        --- !Missed
        Pass:            inline
        Name:            NotInlined
        DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
        Function:        baz
        Hotness:         30
        Args:
          - Callee: foo
          - String:  will not be inlined into
          - Caller: baz
        ...
        --- !Missed
        Pass:            inline
        Name:            NotInlined
        DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
        Function:        baz
        Hotness:         30
        Args:
          - Callee: bar
          - String:  will not be inlined into
          - Caller: baz
        ...
      
      This is a summary of the high-level decisions:
      
      * There is a new streaming interface to emit optimization remarks.
      E.g. for the inliner remark above:
      
         ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                      DEBUG_TYPE, "NotInlined", &I)
                  << NV("Callee", Callee) << " will not be inlined into "
                  << NV("Caller", CS.getCaller()) << setIsVerbose());
      
      NV stands for named value and allows the YAML client to process a remark
      using its name (NotInlined) and the named arguments (Callee and Caller)
      without parsing the text of the message.
      
      Subsequent patches will update ORE users to use the new streaming API.
      
      * I am using YAML I/O for writing the YAML file.  YAML I/O requires you
      to specify reading and writing at once but reading is highly non-trivial
      for some of the more complex LLVM types.  Since it's not clear that we
      (ever) want to use LLVM to parse this YAML file, the code supports and
      asserts that we're writing only.
      
      On the other hand, I did verify experimentally that the class hierarchy
      starting at DiagnosticInfoOptimizationBase can be mapped back from the
      YAML generated here (see D24479).
      
      * The YAML stream is stored in the LLVM context.
      
      * In the example, we can probably further specify the IR value used,
      i.e. print "Function" rather than "Value".
      
      * As before, hotness is computed in the analysis pass instead of
      DiagnosticInfo.  This avoids the layering problem since BFI is in
      Analysis while DiagnosticInfo is in IR.
      
      [1] https://reviews.llvm.org/D19678#419445
      
      Differential Revision: https://reviews.llvm.org/D24587
      
      llvm-svn: 282499
      92e928c1
    • [sanitizer-coverage] fix a bug in trace-gep · 45c14475
      Kostya Serebryany authored
      llvm-svn: 282467
      45c14475
    • Revert r277556. Add -lowertypetests-bitsets-level to control bitsets generation · 4ff4f21e
      Ivan Krasin authored
      Summary:
      We don't currently need this facility for CFI. Disabling individual hot methods proved
      to be a better strategy in Chrome.
      
      Also, the design of the feature is suboptimal, as pointed out by Peter Collingbourne.
      
      Reviewers: pcc
      
      Subscribers: kcc
      
      Differential Revision: https://reviews.llvm.org/D24948
      
      llvm-svn: 282461
      4ff4f21e
    • LowerTypeTests: Remove unused variable. · 53a852b6
      Peter Collingbourne authored
      llvm-svn: 282456
      53a852b6
    • LowerTypeTests: Create LowerTypeTestsModule class and move implementation... · 6ed92e3f
      Peter Collingbourne authored
      LowerTypeTests: Create LowerTypeTestsModule class and move implementation there. Related simplifications.
      
      llvm-svn: 282455
      6ed92e3f
  2. Sep 26, 2016
    • [thinlto] Basic thinlto fdo heuristic · d9830eb7
      Piotr Padlewski authored
      Summary:
      This patch improves the ThinLTO importer
      by importing 3x larger functions that are called from a hot block.
      
      I compared performance with trunk on SPEC, and there were improvements
      of about 2% on povray and 3.33% on milc. These results seem
      to be consistent and match the results Teresa got with her simple
      heuristic. Some benchmarks got slower, but I think they are just
      noisy (mcf, xalancbmk, omnetpp); I am running the benchmarks again with
      more iterations to confirm. The geomean of all benchmarks, including the noisy ones,
      was about +0.02%.
      
      I see a much better improvement on the Google branch with Easwaran's patch
      for PGO callsite inlining (the inliner actually inlines those big functions).
      Overall I see a +0.5% improvement, and I get +8.65% on povray.
      So I guess we will see a much bigger change when Easwaran's patch lands
      (it depends on the new pass manager), but it is still worth putting this on trunk
      before then.
      
      Implementation details changes:
      - Removed CallsiteCount.
      - ProfileCount got replaced by Hotness
      - hot-import-multiplier is set to 3.0 for now.
      I didn't have time to tune it, but I see that we get most of the interesting
      functions with 3, so there is not much performance difference with higher values, and
      binary size doesn't grow as much as with 10.0.
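
To make the effect of hot-import-multiplier concrete, here is a minimal sketch of how a multiplier of 3.0 could scale an import size limit for callees reached from a hot block. The Hotness enum, the function names and the BaseThreshold parameter are hypothetical; only the 3.0 value comes from the message above, and the real importer may combine it with other factors.

```cpp
#include <cassert>

// Hypothetical hotness classification of a call edge.
enum class Hotness { Unknown, Cold, Hot };

// The value the commit message gives for hot-import-multiplier.
constexpr double HotImportMultiplier = 3.0;

// Largest callee size that may be imported across an edge of hotness H.
// BaseThreshold is a hypothetical size limit for non-hot edges.
inline unsigned importThreshold(unsigned BaseThreshold, Hotness H) {
  assert(BaseThreshold > 0 && "threshold must be positive");
  if (H == Hotness::Hot)
    return static_cast<unsigned>(BaseThreshold * HotImportMultiplier);
  return BaseThreshold;
}

// A callee is imported only if it fits under the (possibly scaled) limit.
inline bool shouldImport(unsigned CalleeSize, unsigned BaseThreshold,
                         Hotness H) {
  return CalleeSize <= importThreshold(BaseThreshold, H);
}
```

With a hypothetical base limit of 100, a callee of size 250 would be imported when called from a hot block but rejected otherwise, which is the "3x larger functions" behaviour described in the summary.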
      
      Reviewers: eraman, mehdi_amini, tejohnson
      
      Subscribers: mehdi_amini, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24638
      
      llvm-svn: 282437
      d9830eb7
    • Remove pruning of phi nodes in MemorySSA - it makes updating harder · 1e98c042
      Daniel Berlin authored
      Reviewers: george.burgess.iv
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24923
      
      llvm-svn: 282419
      1e98c042
    • [LV] Scalarize instructions marked scalar after vectorization · b764aba2
      Matthew Simpson authored
      This patch ensures that we actually scalarize instructions marked scalar after
      vectorization. Previously, such instructions may have been vectorized instead.
      
      Differential Revision: https://reviews.llvm.org/D23889
      
      llvm-svn: 282418
      b764aba2
    • [Coroutines] Part14: Handle coroutines with no suspend points. · bc0ebb38
      Gor Nishanov authored
      Summary:
      If a coroutine has no suspend points, remove the heap allocation and turn the coroutine into a normal function.
      
      Also, if a pattern is detected in which a coroutine resumes or destroys itself prior to the coro.suspend call, turn the suspend point into a simple jump to the resume or cleanup label. This pattern occurs when coroutines are used to propagate errors in functions that return expected<T>.
      
      Reviewers: majnemer
      
      Subscribers: mehdi_amini, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24408
      
      llvm-svn: 282414
      bc0ebb38
    • [InstCombine] Fixed bug introduced in r282237 · 793c946e
      Alexey Bataev authored
      The index of the new insertelement instruction was evaluated
      incorrectly: it was treated as the index of the inserted value instead
      of the index of the position where the value should be inserted.
      
      llvm-svn: 282401
      793c946e
    • [InstCombine] Teach the udiv folding logic how to handle constant expressions. · a82d52d1
      Andrea Di Biagio authored
      This patch fixes PR30366.
      
      Function foldUDivShl() worked under the assumption that one of its
      input values was always an instance of llvm::Instruction.
      However, function visitUDivOperand() (the only user of foldUDivShl) was
      clearly violating that precondition; internally, visitUDivOperand() uses pattern
      matchers to check the operands of a udiv. Pattern matchers for binary operators
      know how to handle both Instruction and ConstantExpr values.
      
      This patch fixes the problem in foldUDivShl(). Now we use pattern matchers
      instead of explicit casts to Instruction. The reduced test case from PR30366
      has been added to test file InstCombine/udiv-simplify.ll.
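
The mismatch described above can be sketched with a toy class hierarchy. Everything here (Value, Operation, Instruction, ConstantExpr, Leaf, matchShl, isShlViaInstructionCast) is a hypothetical model rather than LLVM's real classes, and dynamic_cast stands in for LLVM's casting helpers; the point is only that a check tied to Instruction misses a shift that is a constant expression, while a matcher keyed on the operation accepts both.

```cpp
#include <cassert>

enum class Opcode { Shl, Other };

struct Value {
  virtual ~Value() = default;
};

// Hypothetical common shape of a binary operation.
struct Operation : Value {
  Opcode Op;
  const Value *LHS;
  const Value *RHS;
  Operation(Opcode Op, const Value *L, const Value *R)
      : Op(Op), LHS(L), RHS(R) {
    assert(L && R && "operands must be non-null");
  }
};

struct Instruction : Operation {
  using Operation::Operation;
};

struct ConstantExpr : Operation {
  using Operation::Operation;
};

// A value with no operands, e.g. an argument or a plain constant.
struct Leaf : Value {};

// Cast-based check: only recognizes shifts that are Instructions.
inline bool isShlViaInstructionCast(const Value *V) {
  const auto *I = dynamic_cast<const Instruction *>(V);
  return I && I->Op == Opcode::Shl;
}

// Matcher-style check: recognizes a shift whether it is an Instruction
// or a ConstantExpr, and hands back its operands.
inline bool matchShl(const Value *V, const Value *&L, const Value *&R) {
  const auto *O = dynamic_cast<const Operation *>(V);
  if (!O || O->Op != Opcode::Shl)
    return false;
  L = O->LHS;
  R = O->RHS;
  return true;
}
```

Code that first matches an operand with the second function and then assumes the first one also holds would break on constant-expression shifts, which is the kind of precondition violation the patch removes by using matchers throughout.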
      
      Differential Revision: https://reviews.llvm.org/D24565
      
      llvm-svn: 282398
      a82d52d1
  3. Sep 24, 2016
    • ObjCARC: Don't look at users of ConstantData · 11c06ea5
      Duncan P. N. Exon Smith authored
      Stop looking at users of UndefValue and ConstantPointerNull in the
      Objective-C ARC optimizers.  The other users aren't actually
      interesting, since they're not pointing at a particular object.  I
      imagine these calls could be optimized through -instcombine... maybe
      they already are?
      
      These early returns will be required at some point in the future, with a
      WIP patch that asserts when someone accesses a use-list on ConstantData.
      
      llvm-svn: 282338
      11c06ea5
    • Scalar: Ignore ConstantData in processAssumption · 4fd9b7e1
      Duncan P. N. Exon Smith authored
      Assumptions on UndefValue and ConstantPointerNull aren't relevant to
      other users.  Ignore them entirely to avoid wasting cycles walking
      through their (possibly extremely extensive (cross-module)) use-lists.
      
      It wasn't clear how to add a specific test for this, and it'll be
      covered anyway by an eventual patch that asserts when trying to access
      the use-list of an instance of ConstantData.
      
      llvm-svn: 282334
      4fd9b7e1
    • GlobalStatus: Don't walk use-lists of ConstantData · c82c1142
      Duncan P. N. Exon Smith authored
      Return early from llvm::isSafeToDestroyConstant() whenever the value
      `isa<ConstantData>()`.  These constants are shared across the
      LLVMContext.  We never really want to delete them here, and walking
      their use-lists can be very expensive.
      
      (This is motivated by an eventual goal of removing use-lists entirely
      from ConstantData.)
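
A minimal sketch of the early return, using a hypothetical toy Constant type rather than LLVM's classes: shared constant data is rejected before any user is visited, so its potentially huge use-list is never walked. Returning false for such values is an assumption based on the statement that we never really want to delete them here; the UsersVisited counter exists only to make the saved work observable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified constant. IsConstantData marks values such as
// undef or null that are shared across the whole context.
struct Constant {
  bool IsConstantData = false;
  bool HasNonConstantUser = false;
  std::vector<const Constant *> Users; // only constant users are modelled
};

// Returns true if C and everything that uses it is dead.
// UsersVisited counts how many use-list entries were walked.
inline bool isSafeToDestroy(const Constant &C, unsigned &UsersVisited) {
  // Early return: never walk the use-list of shared constant data.
  if (C.IsConstantData)
    return false;
  if (C.HasNonConstantUser)
    return false;
  for (const Constant *U : C.Users) {
    ++UsersVisited;
    if (!isSafeToDestroy(*U, UsersVisited))
      return false;
  }
  return true;
}
```

The answer for shared constant data is the same however many users it has, so the early return saves the use-list walk without changing the result.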
      
      llvm-svn: 282320
      c82c1142
  4. Sep 23, 2016
    • [InstCombine] Fix for PR29124: reduce insertelements to shufflevector · fee9078d
      Alexey Bataev authored
      If inserting more than one constant into a vector:
      
      define <4 x float> @foo(<4 x float> %x) {
        %ins1 = insertelement <4 x float> %x, float 1.0, i32 1
        %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2
        ret <4 x float> %ins2
      }
      
      InstCombine could reduce that to a shufflevector:
      
      define <4 x float> @goo(<4 x float> %x) {
        %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32> <i32 0, i32 5, i32 6, i32 3>
        ret <4 x float> %shuf
      }
      
      Also, InstCombine tries to convert a shuffle instruction to a single insertelement if one of the vectors is a constant vector and only a single element from this constant is used in the shuffle, i.e.
      
        shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef>
      ->
        insertelement <4 x float> %v, float 1.0, i32 1
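
The relationship between the insertelement chain and the shuffle operands can be sketched as follows. This is a standalone model, not InstCombine code: ConstLane, ShuffleForm and foldInsertChain are hypothetical names, and the sketch only covers inserts of constants at constant positions.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// One lane of the constant vector operand; IsUndef models an undef lane.
struct ConstLane {
  bool IsUndef = true;
  float Val = 0.0f;
};

// Hypothetical result: the two pieces a shufflevector needs besides %x.
struct ShuffleForm {
  std::vector<ConstLane> Constants; // second shuffle operand
  std::vector<unsigned> Mask;       // indices into (%x followed by Constants)
};

// Inserts holds (position, constant) pairs in program order.
inline ShuffleForm
foldInsertChain(unsigned NumElts,
                const std::vector<std::pair<unsigned, float>> &Inserts) {
  ShuffleForm R;
  R.Constants.resize(NumElts);
  R.Mask.resize(NumElts);
  // Start with the identity mask: every lane comes from %x.
  for (unsigned I = 0; I != NumElts; ++I)
    R.Mask[I] = I;
  for (const std::pair<unsigned, float> &Ins : Inserts) {
    assert(Ins.first < NumElts && "insert position out of range");
    // The constant goes to the lane named by the insert position, and the
    // mask picks that lane from the second operand (offset by NumElts).
    R.Constants[Ins.first].IsUndef = false;
    R.Constants[Ins.first].Val = Ins.second;
    R.Mask[Ins.first] = NumElts + Ins.first;
  }
  return R;
}
```

For the @foo example, inserting 1.0 at position 1 and 2.0 at position 2 of a 4-element vector gives the mask 0, 5, 6, 3 and the constant vector undef, 1.0, 2.0, undef, matching the shufflevector shown in @goo.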
      
      Differential Revision: https://reviews.llvm.org/D24182
      
      llvm-svn: 282237
      fee9078d
    • [InstCombine] fold X urem C -> X < C ? X : X - C when C is big (PR28672) · 30ef70b0
      Sanjay Patel authored
      We already have the udiv variant of this transform, so I think this is ok for 
      InstCombine too even though there is an increase in IR instructions. As the 
      tests and TODO comments show, the transform can lead to follow-on combines.
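
The fold in the title can be checked with a small standalone sketch. It assumes 32-bit values and reads "C is big" as "the top bit of C is set", in which case X < 2*C always holds and the quotient can only be 0 or 1. That reading and the names uremReference and uremFolded are assumptions, not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics: plain unsigned remainder.
inline std::uint32_t uremReference(std::uint32_t X, std::uint32_t C) {
  return X % C;
}

// Folded form from the commit title: a compare, a subtract and a select.
inline std::uint32_t uremFolded(std::uint32_t X, std::uint32_t C) {
  // Assumed precondition: the top bit of C is set, so X / C is 0 or 1.
  assert((C & 0x80000000u) != 0 && "fold is only valid for large C");
  return X < C ? X : X - C;
}
```

The folded form uses more operations than a single urem, which is the increase in IR instructions the message mentions, but it avoids the division.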
      
      This should fix: https://llvm.org/bugs/show_bug.cgi?id=28672
      
      Differential Revision: https://reviews.llvm.org/D24527
      
      llvm-svn: 282209
      30ef70b0
  5. Sep 22, 2016
    • Revert r282168 "GVN-hoist: fix store past load dependence analysis (PR30216)" · c7957ef8
      Hans Wennborg authored
      and also the dependent r282175 "GVN-hoist: do not dereference null pointers"
      
      It's causing compiler crashes building Harfbuzz (PR30499).
      
      llvm-svn: 282199
      c7957ef8
    • GVN-hoist: do not dereference null pointers · 1531f30c
      Sebastian Pop authored
      There may be basic blocks without memory accesses, in which case the
      list of accesses is a null pointer.
      
      llvm-svn: 282175
      1531f30c
    • GVN-hoist: fix store past load dependence analysis (PR30216) · 8e6e3318
      Sebastian Pop authored
      To hoist stores past loads, we used to search for potential
      conflicting loads on the hoisting path by following a MemorySSA
      def-def link from the store to be hoisted to the previous
      defining memory access, and from there we followed the def-use
      chains to all the uses that occur on the hoisting path. The
      problem is that the def-def link may point to a store that does
      not alias with the store to be hoisted, and so the loads that are
      walked may not alias with the store to be hoisted, and even as in
      the testcase of PR30216, the loads that may alias with the store
      to be hoisted are not visited.
      
      The current patch visits all loads on the path from the store to
      be hoisted to the hoisting position and uses the alias analysis
      to ask whether the store may alias the load. I was not able to
      use the MemorySSA functionality to ask whether the load and
      store are clobbered: I'm not sure which function to call, so I
      used a call to AA->isNoAlias().
      
      Store past store is still working as before using a MemorySSA
      query: I added an extra test to pr30216.ll to make sure store
      past store does not regress.
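
The new check for stores can be sketched as a conservative loop over every load on the hoisting path. This is a toy model: MemLoc, the range-overlap isNoAlias and canHoistStorePastLoads are hypothetical stand-ins for the real alias analysis query and GVN-hoist code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical memory location: the byte range [Begin, End).
struct MemLoc {
  std::size_t Begin;
  std::size_t End;
};

// Stand-in for an alias query: true only when the two ranges provably do
// not overlap.
inline bool isNoAlias(const MemLoc &A, const MemLoc &B) {
  assert(A.Begin <= A.End && B.Begin <= B.End && "malformed range");
  return A.End <= B.Begin || B.End <= A.Begin;
}

// A store may be hoisted only if it cannot alias any load found on the
// path between the store and the hoisting position.
inline bool canHoistStorePastLoads(const MemLoc &Store,
                                   const std::vector<MemLoc> &LoadsOnPath) {
  for (const MemLoc &Load : LoadsOnPath)
    if (!isNoAlias(Store, Load))
      return false;
  return true;
}
```

Unlike following a single def-def link, this visits every load on the path, so a load that aliases the store cannot be skipped just because an unrelated store sits between them.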
      
      Differential Revision: https://reviews.llvm.org/D24517
      
      llvm-svn: 282168
      8e6e3318
    • GVN-hoist: fix typo · 5d68aa79
      Sebastian Pop authored
      llvm-svn: 282165
      5d68aa79
    • [compiler-rt] fix typo in option description [NFC] · 7f0e3153
      Etienne Bergeron authored
      llvm-svn: 282163
      7f0e3153
    • GVN-hoist: only hoist relevant scalar instructions · 440f15b7
      Sebastian Pop authored
      Without this patch, GVN-hoist would think that a branch instruction is a scalar instruction
      and would try to value number it. The patch filters out all such irrelevant instructions.
      
      A bit frustrating is that there is no easy way to discard all those very infrequent instructions,
      a bit like isa<TerminatorInst> that stands for a large family of instructions. I'm thinking that
      checking for those very infrequent other instructions would cost us more in compilation time
      than just letting those instructions get numbered, so I'm still thinking that a simpler check:
      
        if (isa<TerminatorInst>(I))
          return false;
      
      is better than listing all the other less frequent instructions.
      
      Differential Revision: https://reviews.llvm.org/D23929
      
      llvm-svn: 282160
      440f15b7
    • Reapplying r281895 (and follow-up r281964) after fixing pr30468. · ba159897
      Keith Walker authored
      The additional fix is:
      
      When adding debug information to a lowered phi node in mem2reg,
      check that we have a valid insertion point after the phi for adding
      the debug information.
      
      This change addresses the issue in pr30468 where a lowered phi was
      added before a catchswitch; no debug information should be added
      after the phi in this case.
      
      Differential Revision: https://reviews.llvm.org/D24797
      
      llvm-svn: 282155
      ba159897
    • [RS4GC] Remat in presence of phi and use live value · 82c3717f
      Anna Thomas authored
      llvm-svn: 282150
      82c3717f
    • [EfficiencySanitizer] Using '$' instead of '#' for struct counter name · e74eb4e7
      Sagar Thakur authored
      On MIPS, '#' starts a comment line, so we get assembler errors if '#' is used in the structure names.
      
      Differential: D24334
      Reviewed by: zhaoqin
      
      llvm-svn: 282141
      e74eb4e7
    • Fix revision 281960 · d1247a68
      Dorit Nuzman authored
      llvm-svn: 282139
      d1247a68
  6. Sep 21, 2016