  1. Jun 06, 2017
    • [SelectionDAG] Update the dominator after splitting critical edges. · fb4d5c09
      Davide Italiano authored
      Running `llc -verify-dom-info` on the attached testcase results in a
      crash in the verifier, due to a stale dominator tree.
      
      i.e.
      
        DominatorTree is not up to date!
        Computed:
        =============================--------------------------------
        Inorder Dominator Tree:
          [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7}
            [2] %lor.lhs.false.i61.i.i.i {1,2}
            [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6}
              [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
      
        Actual:
        =============================--------------------------------
        Inorder Dominator Tree:
          [1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9}
            [2] %lor.lhs.false.i61.i.i.i {1,2}
            [2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8}
              [3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
              [3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7}
      
      This is because in `SelectionDAGISel` we split critical edges without
      updating the dominator tree for the function (and we claim
      in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved).
      
      We could either stop preserving the domtree in `getAnalysisUsage`
      or tell `splitCriticalEdge()` to update it.
      As the second option is easy to implement, that's the one I chose.
      
      Differential Revision:  https://reviews.llvm.org/D33800
      
      llvm-svn: 304742
      fb4d5c09
  2. Jun 05, 2017
    • [DAGCombine] Fix unchecked calls to DAGCombiner::*ExtPromoteOperand · 6350de76
      Sanjay Patel authored
      Other calls to DAGCombiner::*PromoteOperand check the result, but here it could cause an assertion in getNode. 
      Falling back to any extend in this case instead of failing outright seems correct to me.
      
      No test case because:
      The failure was triggered by an out-of-tree backend. In order to trigger it, a backend would need to override
      TargetLowering::IsDesirableToPromoteOp to return true for a type for which ISD::SIGN_EXTEND_INREG is marked
      illegal. In tree, only X86 overrides it and sometimes returns true for MVT::i16, yet it marks that operation Legal:
      setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Legal);
      
      Patch by Jacob Young!
      
      Differential Revision: https://reviews.llvm.org/D33633
      
      llvm-svn: 304723
      6350de76
  3. Jun 03, 2017
  4. Jun 02, 2017
  5. Jun 01, 2017
  6. May 31, 2017
    • [ScheduleDAG] Deal with already scheduled loads in ScheduleDAG. · 3424373f
      Nirav Dave authored
      Summary:
      If we attempt to unfold an SUnit in ScheduleDAG that results in
      finding an already scheduled load, we should abort the
      unfold as it will not improve scheduling.
      
      This fixes PR32610.
      
      Reviewers: jmolloy, sunfish, bogner, spatel
      
      Subscribers: llvm-commits, MatzeB
      
      Differential Revision: https://reviews.llvm.org/D32911
      
      llvm-svn: 304321
      3424373f
    • [DAG] Avoid use of stale store. · 7c70fddb
      Nirav Dave authored
      Correct references to the alignment of a store which may have been
      deleted in a previous iteration of the merge. Instead, use the first
      store that would be merged.
      
      Corrects pr33172's use-after-poison caught by ASan.
      
      Reviewers: spatel, hfinkel, RKSimon
      
      Reviewed By: RKSimon
      
      Subscribers: thegameg, javed.absar, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D33686
      
      llvm-svn: 304299
      7c70fddb
  7. May 30, 2017
    • [SelectionDAG] Remove special case for ISD::FPOWI from the strict FP intrinsic handling. · 5fd588be
      Craig Topper authored
      This code was compensating for FPOWI defaulting to Legal and many targets not changing it to Expand. This was fixed in r304215 to default to Expand so this special handling should no longer be necessary.
      
      llvm-svn: 304221
      5fd588be
    • [SelectionDAG] Set ISD::FPOWI to Expand by default · f6d4dc5b
      Craig Topper authored
      Summary:
      Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".
      
      This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.
      
      Reviewers: spatel, RKSimon, efriedma
      
      Reviewed By: RKSimon
      
      Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D33530
      
      llvm-svn: 304215
      f6d4dc5b
  8. May 29, 2017
  9. May 27, 2017
    • [DAGCombiner] use narrow load to avoid vector extract · 33f4a972
      Sanjay Patel authored
      If we have (extract_subvector(load wide vector)) with no other users, 
      that can just be (load narrow vector). This is intentionally conservative.
      Follow-ups may loosen the one-use constraint to account for the extract cost
      or just remove the one-use check.
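The equivalence behind this fold can be sketched outside of SelectionDAG. The snippet below is only an illustration on plain arrays, assuming an 8-element wide vector and a 4-element narrow vector; `loadWide`, `extractSubvector`, and `loadNarrow` are hypothetical helpers, not DAGCombiner functions.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

using Narrow = std::array<float, 4>;
using Wide = std::array<float, 8>;

// Hypothetical model of "load wide vector": read 8 floats starting at P.
Wide loadWide(const float *P) {
  Wide W;
  std::memcpy(W.data(), P, sizeof(W));
  return W;
}

// Hypothetical model of extract_subvector: take 4 elements starting at Idx.
Narrow extractSubvector(const Wide &W, std::size_t Idx) {
  assert(Idx + 4 <= W.size() && "subvector must lie inside the wide vector");
  Narrow N;
  std::copy(W.begin() + Idx, W.begin() + Idx + 4, N.begin());
  return N;
}

// Hypothetical model of the folded form: read only the 4 floats we need.
Narrow loadNarrow(const float *P) {
  Narrow N;
  std::memcpy(N.data(), P, sizeof(N));
  return N;
}
```

For any valid index `Idx`, `extractSubvector(loadWide(P), Idx)` and `loadNarrow(P + Idx)` produce the same elements, which is why the wide load is unnecessary when the extract is its only user.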
      
      The memop chain updating is based on code that already exists multiple times
      in x86 lowering, so that should be pulled into a helper function as a follow-up.
      
      Background: this is a potential improvement noticed via regressions caused by
      making x86's peekThroughBitcasts() not loop on consecutive bitcasts (see 
      comments in D33137).
      
      Differential Revision: https://reviews.llvm.org/D33578
      
      llvm-svn: 304072
      33f4a972
  10. May 26, 2017
    • Make helper functions static. NFC. · debb3c35
      Benjamin Kramer authored
      llvm-svn: 304029
      debb3c35
    • [DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790) · ec13ebf2
      Sanjay Patel authored
      In the best case:
      extract (binop (concat X1, X2), (concat Y1, Y2)), N --> binop XN, YN
      ...we kill all of the extract/concat and just have narrow binops remaining.
      
      If only one of the binop operands is amenable, this transform is still
      worthwhile because we kill some of the extract/concat.
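As a rough illustration of why the best-case fold is sound for bitwise logic, the sketch below models the pattern on plain arrays. All names here (`concat`, `extractHalf`, `andWide`, `andNarrow`, `extractOfWideAnd`, `foldedNarrowAnd`) are hypothetical and only stand in for the DAG nodes; element-wise AND is assumed as the binop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Narrow = std::array<std::uint32_t, 4>;
using Wide = std::array<std::uint32_t, 8>;

// concat X1, X2: X1 fills the low half, X2 the high half.
Wide concat(const Narrow &Lo, const Narrow &Hi) {
  Wide W{};
  for (std::size_t I = 0; I < 4; ++I) {
    W[I] = Lo[I];
    W[I + 4] = Hi[I];
  }
  return W;
}

// extract W, N: return half N (0 = low, 1 = high) of the wide vector.
Narrow extractHalf(const Wide &W, std::size_t N) {
  assert(N < 2 && "only two halves exist");
  Narrow R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = W[N * 4 + I];
  return R;
}

Wide andWide(const Wide &A, const Wide &B) {
  Wide R{};
  for (std::size_t I = 0; I < 8; ++I)
    R[I] = A[I] & B[I];
  return R;
}

Narrow andNarrow(const Narrow &A, const Narrow &B) {
  Narrow R{};
  for (std::size_t I = 0; I < 4; ++I)
    R[I] = A[I] & B[I];
  return R;
}

// Original pattern: extract (and (concat X1, X2), (concat Y1, Y2)), N
Narrow extractOfWideAnd(const Narrow &X1, const Narrow &X2, const Narrow &Y1,
                        const Narrow &Y2, std::size_t N) {
  return extractHalf(andWide(concat(X1, X2), concat(Y1, Y2)), N);
}

// Folded pattern: and XN, YN (no concat or extract remains).
Narrow foldedNarrowAnd(const Narrow &X1, const Narrow &X2, const Narrow &Y1,
                       const Narrow &Y2, std::size_t N) {
  assert(N < 2 && "only two halves exist");
  return N == 0 ? andNarrow(X1, Y1) : andNarrow(X2, Y2);
}
```

Both functions return the same narrow vector for `N == 0` and `N == 1`, because an element-wise op never mixes lanes across the two halves.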
      
      Optional bitcasting makes the code more complicated, but there doesn't
      seem to be a way to avoid that.
      
      The TODO about extending to more than bitwise logic is there because we really
      will regress several x86 tests including madd, psad, and even a plain
      integer-multiply-by-2 or shift-left-by-1. I don't think there's anything
      fundamentally wrong with this patch that would cause those regressions; those
      folds are just missing or brittle.
      
      If we extend to more binops, I found that this patch will fire on at least one
      non-x86 regression test. There's an ARM NEON test in
      test/CodeGen/ARM/coalesce-subregs.ll with a pattern like:
      
                  t5: v2f32 = vector_shuffle<0,3> t2, t4
                t6: v1i64 = bitcast t5
                t8: v1i64 = BUILD_VECTOR Constant:i64<0>
              t9: v2i64 = concat_vectors t6, t8
            t10: v4f32 = bitcast t9
          t12: v4f32 = fmul t11, t10
        t13: v2i64 = bitcast t12
      t16: v1i64 = extract_subvector t13, Constant:i32<0>
      
      There was no functional change in the codegen from this transform from what I
      could see though.
      
      For the x86 test changes:
      
      1. PR32790() is the closest call. We don't reduce the AVX1 instruction count in that case,
         but we improve throughput. Also, on a core like Jaguar that double-pumps 256-bit ops,
         there's an unseen win because two 128-bit ops have the same cost as the wider 256-bit op.
         SSE/AVX2/AVX512 are not affected, which is expected because only AVX1 has the extract/concat
         ops to match the pattern.
      2. do_not_use_256bit_op() is the best case. Everyone wins by avoiding the concat/extract.
         Related bug for IR filed as: https://bugs.llvm.org/show_bug.cgi?id=33026
      3. The SSE diffs in vector-trunc-math.ll are just scheduling/RA, so nothing real AFAICT.
      4. The AVX1 diffs in vector-tzcnt-256.ll are all the same pattern: we reduced the instruction
         count by one in each case by eliminating two insert/extract while adding one narrower logic op.
      
      https://bugs.llvm.org/show_bug.cgi?id=32790
      
      Differential Revision: https://reviews.llvm.org/D33137
      
      llvm-svn: 303997
      ec13ebf2
    • [DAG] Move legal type checks in store merge to be checked only · 689709c9
      Nirav Dave authored
      on non-legal cases. NFC.
      
      llvm-svn: 303994
      689709c9
    • [ARM] Fix lowering of misaligned memcpy/memset · 9009d290
      John Brawn authored
      Currently getOptimalMemOpType returns i32 for large enough sizes without
      checking for alignment, leading to poor code generation when misaligned accesses
      aren't permitted as we generate a word store then later split it up into byte
      stores. This means we inadvertently go over the MaxStoresPerMemcpy limit and for
      memset we splat the memset value into a word then immediately split it up
      again.
      
      Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type
      to use, but also fix a bug there where it wasn't correctly checking if
      misaligned memory accesses are allowed.
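The cost of ignoring alignment can be sketched with a toy model. This is not the logic of getOptimalMemOpType or FindOptimalMemOpLowering; `pickStoreWidth` and `countStores` are hypothetical helpers that assume store widths of 4, 2, and 1 bytes and a destination whose known alignment is `Align` bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: pick the widest store (in bytes) that is legal for the
// remaining size, given the known alignment and whether the target
// permits misaligned accesses.
std::size_t pickStoreWidth(std::size_t Remaining, std::size_t Align,
                           bool AllowsMisaligned) {
  assert(Remaining > 0 && Align > 0);
  if (Remaining >= 4 && (AllowsMisaligned || Align >= 4))
    return 4;
  if (Remaining >= 2 && (AllowsMisaligned || Align >= 2))
    return 2;
  return 1;
}

// Hypothetical: number of stores needed to cover Size bytes.
std::size_t countStores(std::size_t Size, std::size_t Align,
                        bool AllowsMisaligned) {
  std::size_t Count = 0;
  while (Size > 0) {
    Size -= pickStoreWidth(Size, Align, AllowsMisaligned);
    ++Count;
  }
  return Count;
}
```

Under this model a 16-byte copy takes 4 stores when word accesses are usable, but 16 byte stores when the destination is only byte-aligned and misaligned accesses are not permitted. A store-count limit is only meaningful if it is checked against the second number in that situation.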
      
      Differential Revision: https://reviews.llvm.org/D33442
      
      llvm-svn: 303990
      9009d290
  11. May 25, 2017
  12. May 24, 2017
  13. May 23, 2017
  14. May 22, 2017
  15. May 20, 2017
    • SimplifyLibCalls: Optimize wcslen · 50ec0b5d
      Matthias Braun authored
      Refactor the strlen optimization code to work for both strlen and wcslen.
      
      This especially helps with programs in the wild where people pass
      L"string"s to const std::wstring& function parameters and the wstring
      constructor gets inlined.
      
      This also fixes a lingering API problem/bug in getConstantStringInfo()
      where zeroinitializers would always give you an empty string (without a
      length) back regardless of the actual length of the initializer, which
      did not work well in the TrimAtNul==false case, causing the PR mentioned
      below.
      
      Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG
      memcpy lowering and may lead to some out-of-bounds zeroinitializer
      accesses not getting optimized anymore. So some code with UB may now
      produce out-of-bounds memory reads instead of just producing zeros.
      
      The refactoring "accidentally" fixes http://llvm.org/PR32124
      
      Differential Revision: https://reviews.llvm.org/D32839
      
      llvm-svn: 303461
      50ec0b5d
  16. May 19, 2017
  17. May 18, 2017
    • [Statistics] Add a method to atomically update a statistic that contains a maximum · 8a950275
      Craig Topper authored
      Summary:
      There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways:
      
        MaxNumFoo = std::max(MaxNumFoo, NumFoo);
      
      or
      
        MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo;
      
      The first version reads from MaxNumFoo one time and unconditionally writes to it. The second version possibly reads it twice depending on the result of the first compare. But we have no way of knowing if the value was changed by another thread between the reads and the writes.
      
      This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again.
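A minimal sketch of the retry loop described above, written against `std::atomic` rather than the real Statistic class. `MaxCounter` and `updateMax` are hypothetical names chosen for illustration, and the memory ordering is an assumption.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a statistic that tracks a maximum.
class MaxCounter {
  std::atomic<unsigned> Value{0};

public:
  unsigned get() const { return Value.load(std::memory_order_relaxed); }

  // Store V only if it is larger than the current maximum.
  void updateMax(unsigned V) {
    unsigned Prev = Value.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads Prev with the latest value,
    // so every retry re-checks whether V should still become the maximum.
    while (V > Prev &&
           !Value.compare_exchange_weak(Prev, V, std::memory_order_relaxed)) {
    }
  }
};
```

Unlike the two assignment forms shown earlier, the store here happens only if the value read is still the one in memory, so a larger maximum written by another thread in between is never overwritten.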
      
      This spawned from an audit I'm trying to do of all places we use the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ
      
      Reviewers: dberlin, chandlerc, hfinkel, dblaikie
      
      Reviewed By: chandlerc
      
      Subscribers: llvm-commits, sanjoy
      
      Differential Revision: https://reviews.llvm.org/D33301
      
      llvm-svn: 303318
      8a950275
  18. May 16, 2017
    • Elide stores which are overwritten without being observed. · da8f2212
      Nirav Dave authored
      Summary:
      In SelectionDAG, when a store is immediately chained to another store
      to the same address, elide the first store as it has no observable
      effects. This causes small improvements when dealing with intrinsics
      lowered to stores.
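The rule can be sketched on a toy linear chain of memory operations. This is not the SelectionDAG implementation; `MemOp` and `elideOverwrittenStores` are hypothetical, and the model assumes every store writes the same number of bytes, that equal `Addr` values mean the same address, and that volatile stores are never elided.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class OpKind { Load, Store };

// Hypothetical memory operation on a linear chain.
struct MemOp {
  OpKind Kind;
  int Addr;
  int Value; // ignored for loads
  bool IsVolatile;
};

// Drop each non-volatile store whose immediate successor on the chain is a
// store to the same address: nothing can observe the first value.
std::vector<MemOp> elideOverwrittenStores(const std::vector<MemOp> &Ops) {
  std::vector<MemOp> Out;
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    const MemOp &Cur = Ops[I];
    bool Overwritten = Cur.Kind == OpKind::Store && !Cur.IsVolatile &&
                       I + 1 < Ops.size() &&
                       Ops[I + 1].Kind == OpKind::Store &&
                       Ops[I + 1].Addr == Cur.Addr;
    if (!Overwritten)
      Out.push_back(Cur);
  }
  return Out;
}
```

A load between two stores keeps the first store alive, since the load observes its value. Marking the first store volatile keeps it as well, which matches how the affected tests were adjusted.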
      
      Test notes:
      
      * Many testcases overwrite store addresses multiple times and needed
        minor changes, mainly making stores volatile to prevent the
        optimization from optimizing the test away.
      
      * Many X86 test cases optimized out instructions associated with
        va_start.
      
      * Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has
        dependencies to check and can probably be removed and potentially
        replaced with another test.
      
      Reviewers: rnk, john.brawn
      
      Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D33206
      
      llvm-svn: 303198
      da8f2212
    • [DAG] Prune deleted nodes in TokenFactor · cfd357a6
      Nirav Dave authored
      Fix visitTokenFactor to correctly remove deleted nodes. NFC.
      
      llvm-svn: 303181
      cfd357a6
    • IR: Give function GlobalValue::getRealLinkageName() a less misleading name:... · 6f0ecca3
      Peter Collingbourne authored
      IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape().
      
      This function gives the wrong answer on some non-ELF platforms in some
      cases. The function that does the right thing lives in Mangler.h. To try to
      discourage people from using this function, give it a different name.
      
      Differential Revision: https://reviews.llvm.org/D33162
      
      llvm-svn: 303134
      6f0ecca3