  1. Nov 14, 2019
  2. Nov 13, 2019
    • Craig Topper's avatar
      [X86] Don't set the operation action for i16 SINT_TO_FP to Promote just because SSE1 is enabled. · f7e9d81a
      Craig Topper authored
      Instead, do custom promotion in the handler so that we can still
      allow i16 to be used with fp80, and with f64 when SSE2 is not available.
      f7e9d81a
    • Craig Topper's avatar
      [X86] Fix typo in comment. NFC · 787595b2
      Craig Topper authored
      787595b2
    • Craig Topper's avatar
      [X86] Move all the FP_TO_XINT/XINT_TO_FP setOperationActions into the same... · fee90672
      Craig Topper authored
      [X86] Move all the FP_TO_XINT/XINT_TO_FP setOperationActions into the same !useSoftFloat block. Qualify all of the Promote actions for these with !useSoftFloat too. NFCI
      
      The Promote action doesn't apply until LegalizeDAG. By the time
      we get there, we would have already softened all the FP operations
      if useSoftFloat was true. So there wouldn't be any operation left
      to Promote.
      fee90672
    • Hiroshi Yamauchi's avatar
      [PGO][PGSO] Temporarily disable the large working set size behavior. · 3f0969da
      Hiroshi Yamauchi authored
      Summary:
      This temporarily disables the large working set size behavior in profile guided
      size optimization due to internal benchmark regressions.
      
      Reviewers: davidxl
      
      Subscribers: hiraditya, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D70207
      3f0969da
    • Sanjay Patel's avatar
      [SLP] fix miscompile on min/max reductions with extra uses (PR43948) · a3e61946
      Sanjay Patel authored
      The bug manifests as replacing a reduction operand with an undef
      value.
      
      The problem appears to be limited to cases where a min/max reduction
      has extra uses of the compare operand to the select.
      
      In the general case, we are tracking "ExternallyUsedValues" and
      an "IgnoreList" of the reduction operations, but those may not apply
      to the final compare+select in a min/max reduction.
      
      For that, we use replaceAllUsesWith (RAUW) to ensure that the new
      vectorized reduction values are transferred to all subsequent users.
      
      Differential Revision: https://reviews.llvm.org/D70148
      a3e61946
    • Craig Topper's avatar
      [TargetLowering] Increase the storage size of NumRegistersForVT to allow the... · 84e83b54
      Craig Topper authored
      [TargetLowering] Increase the storage size of NumRegistersForVT to allow the type break down for v256i1 and other types to be stored correctly
      
      v256i1 on X86 without avx512 breaks down to 256 i8 values when passed between basic blocks. But the NumRegistersForVT was sized at a byte for each VT. This results in 256 being stored as 0.
      
      This patch enlarges the type to 16 bits and adds an assert to ensure that no information is lost when the entry is stored.
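
      The truncation described above can be reproduced in isolation. The following is a minimal sketch, not the actual TargetLowering code: `storeNarrow` and `storeWide` are hypothetical helpers that stand in for writing one table entry with 8-bit and 16-bit storage respectively.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for an 8-bit table entry: values above 255 wrap.
inline std::uint8_t storeNarrow(unsigned NumRegs) {
  return static_cast<std::uint8_t>(NumRegs);
}

// Hypothetical stand-in for a 16-bit table entry, with a check that
// no information is lost when the value is stored.
inline std::uint16_t storeWide(unsigned NumRegs) {
  assert(NumRegs <= std::numeric_limits<std::uint16_t>::max() &&
         "register count does not fit in 16 bits");
  return static_cast<std::uint16_t>(NumRegs);
}
```

      With these helpers, a count of 256 comes back as 0 from the 8-bit version and as 256 from the 16-bit version.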
      
      Differential Revision: https://reviews.llvm.org/D70138
      84e83b54
    • Simon Atanasyan's avatar
      63bbbcde
    • Simon Atanasyan's avatar
      3216d284
    • Quentin Colombet's avatar
      [LiveInterval] Allow updating subranges with slightly out-dated IR · de94cda8
      Quentin Colombet authored
      During register coalescing, we update the live-intervals on-the-fly.
      To do that we are in this strange mode where the live-intervals can
      be slightly out-of-sync (more precisely they are forward looking)
      compared to what the IR actually represents.
      This happens because the register coalescer only updates the IR when
      it is done with updating the live-intervals and it has to do it this
      way because updating the IR on-the-fly would actually clobber some
      information on what the live-ranges that are being updated look like.
      
      This is problematic for updates that rely on the IR to accurately
      represent the state of the live-ranges. Right now, we have only
      one of those: stripValuesNotDefiningMask.
      To reconcile this need for out-of-sync IR, this patch introduces a
      new argument to LiveInterval::refineSubRanges that allows the code
      doing the live range updates to reason about what the code should
      look like after the coalescer has rewritten the registers.
      Essentially this captures how a subregister index will be offset
      to match its position in a new register class.
      
      E.g., let's say we want to merge:
          V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32>
      
      We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32>
      overlap, i.e., by choosing a class where we can find "offset + 1 == 3".
      Put differently we align V2's sub3 with V1's sub1:
          V2: sub0 sub1 sub2 sub3
          V1: <offset>  sub0 sub1
      
      This offset will look like a composed subregidx in the class:
           V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>
       =>  V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>
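
      The index arithmetic in this example can be written out as a small sketch. The helper names below are hypothetical and only model the "offset + 1 == 3" relation on plain integers; they are not part of the LiveInterval or coalescer API.

```cpp
#include <cassert>

// Hypothetical helper: the offset that aligns sub-index DstIdx of the
// narrower register with sub-index SrcIdx of the wider one,
// i.e. Offset + DstIdx == SrcIdx.
inline int alignmentOffset(int DstIdx, int SrcIdx) {
  assert(SrcIdx >= DstIdx && "wider register index must not be smaller");
  return SrcIdx - DstIdx;
}

// Position of one of V1's sub-indices inside the merged register once the
// offset has been applied.
inline int rewrittenIndex(int DstIdx, int Offset) { return DstIdx + Offset; }
```

      For the COPY above, aligning V1.sub1 with V2.sub3 gives an offset of 2, so V1's sub0 and sub1 land on positions 2 and 3 of the wider register, as in the diagram.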
      
      Now if we didn't rewrite the uses and def of V1, all the checks for V1
      need to account for this offset to match what the live intervals intend
      to capture.
      
      Prior to this patch, we would fail to recognize the uses and def of V1
      and would end up with machine verifier errors: No live segment at def.
      This could lead to a miscompile as we would drop some live-ranges and
      thus, miss some interferences.
      
      For this problem to trigger, we need to reach stripValuesNotDefiningMask
      while having a mismatch between the IR and the live-ranges (i.e.,
      we have to apply a subreg offset to the IR.)
      
      This requires the following three conditions:
      1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1>
      2. An update with Tuple registers with a possibility to coalesce the
         subreg index: e.g., v1.dsub_1 == v2.dsub_3
      3. Subreg liveness enabled.
      
      looking at the IR to decide what is alive and what is not, i.e., calling
      stripValuesNotDefiningMask.
      coalescer maintains for the live-ranges information.
      
      None of the targets that currently use subreg liveness (i.e., the targets
      that fulfill #3, Hexagon, AMDGPU, PowerPC, and SystemZ IIRC) expose #1 and
      #2, so this patch also artificially enables subreg liveness for ARM,
      so that a nice test case can be attached.
      de94cda8
    • Ahmed Bougacha's avatar
      [AArch64][v8.3a] Add missing imp-defs on RETA*. · 7313d7d6
      Ahmed Bougacha authored
      RETA always implicitly uses LR, unlike RET which merely has an
      alias that defaults it to LR.
      Additionally, RETA implicitly uses SP as well, which it uses as
      a discriminator to authenticate LR.
      
      This isn't usually noticeable, because RET_ReallyLR is used in most
      of the backend.  However, the post-RA scheduler, if enabled, will
      cause miscompiles if the imp-uses are missing.
      
      While there, fix a typo in the lone affected testcase.
      7313d7d6
    • Ahmed Bougacha's avatar
      [AArch64][v8.3a] Add LDRA '[xN]!' alias. · 643ac6c0
      Ahmed Bougacha authored
      The instruction definition has been retroactively expanded to
      allow for an alias for '[xN, 0]!' as '[xN]!'.
      That wouldn't make sense on LDR, but does for LDRA.
      643ac6c0
    • David Stenberg's avatar
      Fix typo in DwarfDebug [NFC] · 7417cc14
      David Stenberg authored
      7417cc14
    • Sanjay Patel's avatar
      [InstCombine] propagate fast-math-flags (FMF) to select when inverting fcmp+select · 3d6b5398
      Sanjay Patel authored
      As noted by the FIXME comment, this is not correct based on our current FMF semantics.
      We should be propagating FMF from the final value in a sequence (in this case the
      'select'). So the behavior even without this patch is wrong, but we did not allow FMF
      on 'select' until recently.
      
      But if we do the correct thing right now in this patch, we'll inevitably introduce
      regressions because we have not wired up FMF propagation for 'phi' and 'select' in
      other passes (like SimplifyCFG) or other places in InstCombine. I'm not seeing a
      better incremental way to make progress.
      
      That said, the potential extra damage over the existing wrong behavior from this
      patch is very limited. AFAIK, the only way to have different FMF on IR in the same
      function is if we have LTO inlined IR from 2 modules that were compiled using
      different fast-math settings.
      
      As seen in the tests, we may actually see some improvements with this patch because
      adding the FMF to the 'select' allows matching to min/max intrinsics that were
      previously missed (in the common case, the 'fcmp' and 'select' should have identical
      FMF to begin with).
      
      Next steps in the transition:
      
          Make similar changes in instcombine as needed.
          Enable phi-to-select FMF propagation in SimplifyCFG.
          Remove dependencies on fcmp with FMF.
          Deprecate FMF on fcmp.
      
      Differential Revision: https://reviews.llvm.org/D69720
      3d6b5398
    • Pavel Labath's avatar
      DWARFDebugLoclists: Add an api to get the location lists of a DWARF unit · 1eea3fa0
      Pavel Labath authored
      Summary:
      This avoids the need to duplicate the location lists searching logic in
      various users. The "inline location list dumping" code (which is the
      only user actually updated to handle DWARF v5 location lists)  is
      switched to this method. After adding v4 location list support, I'll
      switch other users too.
      
      Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX
      
      Subscribers: hiraditya, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D70084
      1eea3fa0
    • Simon Pilgrim's avatar
      86f07e82
    • Simon Pilgrim's avatar
      Fix uninitialized variable warning. NFCI. · e1670175
      Simon Pilgrim authored
      e1670175
    • Simon Pilgrim's avatar
      Fix uninitialized variable warning. NFCI. · 29a5a6ee
      Simon Pilgrim authored
      29a5a6ee
    • Simon Pilgrim's avatar
      Fix uninitialized variable warning. NFCI. · 6ebc5089
      Simon Pilgrim authored
      6ebc5089
    • Simon Pilgrim's avatar
      b3be859b
    • Simon Pilgrim's avatar
      PPCReduceCRLogicals - fix static analyzer warnings. NFC · 66f2ed07
      Simon Pilgrim authored
      - Fix uninitialized variable warnings.
      - Fix null dereference warnings.
      66f2ed07
    • Simon Pilgrim's avatar
      SLPVectorizer - make comparison operators + isInSchedulingRegion const · d1bd5e47
      Simon Pilgrim authored
      Fixes cppcheck warnings.
      d1bd5e47
    • Florian Hahn's avatar
      [InstCombine] Avoid moving ops that do restrict undef across shuffles. · f7499011
      Florian Hahn authored
      I think we have to be a bit more careful when it comes to moving
      ops across shuffles, if the op does restrict undef. For example, without
      this patch, we would move 'and %v, <0, 0, -1, -1>' over a
      'shufflevector %a, undef, <undef, undef, 1, 2>'. As a result, the first
      2 lanes of the result are undef after the combine, but they really
      should be 0, unless I am missing something.
      
      For ops that do fold to undef on undef operands, the current behavior
      should be fine. I've added a conservative check OpDoesRestrictUndef, maybe
      there's a better existing utility?
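
      The lane-by-lane effect can be modelled with a toy sketch in which an undef lane is `std::nullopt`. Everything here is an illustrative assumption rather than InstCombine code: the helper names are made up, and the mask applied before the shuffle is picked only so that lanes 1 and 2 of %a pass through.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Toy model: a lane is either a known value or undef (std::nullopt).
using Lane = std::optional<int>;
using Vec4 = std::array<Lane, 4>;
using Index = std::optional<std::size_t>; // std::nullopt is an undef index

// Lane-wise 'and' with a constant mask made of 0 / -1 elements.
// and(undef, 0) is 0, while and(undef, -1) stays undef.
inline Vec4 andMask(const Vec4 &V, const std::array<int, 4> &Mask) {
  Vec4 R;
  for (std::size_t I = 0; I < 4; ++I) {
    assert((Mask[I] == 0 || Mask[I] == -1) && "sketch handles 0 / -1 only");
    if (Mask[I] == 0)
      R[I] = 0;
    else
      R[I] = V[I]; // x & -1 == x, and undef stays undef
  }
  return R;
}

// shufflevector %a, undef, Idx: indices 0..3 pick from A, the rest are undef.
inline Vec4 shuffleWithUndef(const Vec4 &A, const std::array<Index, 4> &Idx) {
  Vec4 R; // all lanes start out undef
  for (std::size_t I = 0; I < 4; ++I)
    if (Idx[I] && *Idx[I] < 4)
      R[I] = A[*Idx[I]];
  return R;
}

// Original order: shuffle first, then 'and' with <0, 0, -1, -1>.
inline Vec4 andAfterShuffle(const Vec4 &A) {
  const std::array<Index, 4> Idx = {std::nullopt, std::nullopt, Index(1), Index(2)};
  const std::array<int, 4> Mask = {0, 0, -1, -1};
  return andMask(shuffleWithUndef(A, Idx), Mask);
}

// After the combine: 'and' first, then shuffle. The pre-shuffle mask is an
// assumption chosen so that lanes 1 and 2 of A pass through unchanged.
inline Vec4 andBeforeShuffle(const Vec4 &A) {
  const std::array<Index, 4> Idx = {std::nullopt, std::nullopt, Index(1), Index(2)};
  const std::array<int, 4> PreMask = {0, -1, -1, 0};
  return shuffleWithUndef(andMask(A, PreMask), Idx);
}
```

      In this model the original order yields 0 in the first two lanes, while the combined order leaves them undef, which is the difference described above.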
      
      Reviewers: spatel, RKSimon, lebedev.ri
      
      Reviewed By: spatel
      
      Differential Revision: https://reviews.llvm.org/D70093
      f7499011
    • Luís Marques's avatar
      Revert "[RISCV] Fix wrong CFI directives" · c5b56caa
      Luís Marques authored
      test/DebugInfo/RISCV/relax-debug-frame.ll wasn't properly updated.
      c5b56caa
    • Sjoerd Meijer's avatar
      [ARM][MVE] canTailPredicateLoop · d90804d2
      Sjoerd Meijer authored
      This implements TTI hook 'preferPredicateOverEpilogue' for MVE.  This is a
      first version and it operates on single block loops only. With this change, the
      vectoriser will now determine if tail-folding scalar remainder loops is
      possible/desired, which is the first step to generate MVE tail-predicated
      vector loops.
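
      As a scalar illustration of what tail-folding means, the sketch below contrasts a vector body plus remainder loop with a loop where a per-lane predicate masks off the tail. It is a plain C++ model under an assumed vector width of 4 lanes, not MVE code and not what the vectoriser emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // assumed vector width in lanes

// Vector body followed by a scalar remainder (epilogue) loop.
inline int sumWithEpilogue(const std::vector<int> &A) {
  int Sum = 0;
  std::size_t I = 0;
  for (; I + VF <= A.size(); I += VF) // full vector iterations
    for (std::size_t L = 0; L < VF; ++L)
      Sum += A[I + L];
  for (; I < A.size(); ++I) // scalar remainder
    Sum += A[I];
  return Sum;
}

// Tail-folded form: every iteration is a vector iteration, and a per-lane
// predicate switches off the lanes that fall past the end of the data.
inline int sumTailFolded(const std::vector<int> &A) {
  int Sum = 0;
  for (std::size_t I = 0; I < A.size(); I += VF)
    for (std::size_t L = 0; L < VF; ++L) {
      const bool Active = I + L < A.size(); // lane predicate
      if (Active)
        Sum += A[I + L];
    }
  return Sum;
}
```

      Both forms compute the same sum; the second one has no remainder loop.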
      
      This is disabled by default for now. I.e., this depends on option
      -disable-mve-tail-predication, which is off by default.
      
      I will follow up on this soon with a patch for the vectoriser to respect loop
      hint 'vectorize.predicate.enable'. I.e., with this loop hint set to Disabled,
      we don't want to tail-fold and we shouldn't query this TTI hook, which is
      done in D70125.
      
      Differential Revision: https://reviews.llvm.org/D69845
      d90804d2
    • Luís Marques's avatar
      [RISCV] Fix wrong CFI directives · a5ce8bd7
      Luís Marques authored
      Summary: Removes CFI CFA directives that could incorrectly propagate
      beyond the basic block they were intended for. Specifically it removes
      the epilogue CFI directives. See the branch_and_tail_call test for an
      example of the issue. Should fix the stack unwinding issues caused by
      the incorrect directives.
      
      Reviewers: asb, lenary, shiva0217
      Reviewed By: lenary
      Tags: #llvm
      Differential Revision: https://reviews.llvm.org/D69723
      a5ce8bd7
    • Simon Pilgrim's avatar
      [X86][AVX] Add plausible schedule classes to MASKPAIR/VP2INTERSECT/VDPBF16PS instructions · 4d0e7b62
      Simon Pilgrim authored
      These are really just placeholders that use approximately the right resources - once we have CPU scheduler models that support these instructions they will need revisiting.
      
      In the meantime this means that all instructions have a class of some kind, meaning models can be more easily flagged as complete.
      4d0e7b62
    • Hans Wennborg's avatar
      Revert 57dd4b03 "[ValueTracking] Allow context-sensitive nullness check for non-pointers" · 6ea47759
      Hans Wennborg authored
      This caused miscompiles of Chromium (https://crbug.com/1023818). The reduced
      repro is small enough to fit here:
      
        $ cat /tmp/a.c
        unsigned char f(unsigned char *p) {
          unsigned char result = 0;
          for (int shift = 0; shift < 1; ++shift)
            result |= p[0] << (shift * 8);
          return result;
        }
        $ bin/clang -O2 -S -o - /tmp/a.c | grep -A4 f:
        f:                                      # @f
                .cfi_startproc
        # %bb.0:                                # %entry
                xorl    %eax, %eax
                retq
      
      That's nicely optimized, but I don't think it's the right result :-)
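
      For reference, the loop body runs exactly once with shift == 0, so the function should simply return p[0]. The sketch below restates the repro as checkable C++; `checkRepro` is a hypothetical helper added only for this illustration.

```cpp
#include <cassert>

// The loop from the reduced repro, unchanged apart from const-correctness.
inline unsigned char f(const unsigned char *p) {
  unsigned char result = 0;
  for (int shift = 0; shift < 1; ++shift)
    result |= p[0] << (shift * 8);
  return result;
}

// The loop body runs once with shift == 0, so f must return p[0].
inline void checkRepro() {
  const unsigned char byte = 0x5a;
  assert(f(&byte) == 0x5a);
}
```

      A build that folds f to a constant zero, as in the assembly above, would fail this check for any non-zero input byte.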
      
      > Same as D60846 but with a fix for the problem encountered there which
      > was a missing context adjustment in the handling of PHI nodes.
      >
      > The test that caused D60846 to be reverted was added in e15ab8f2.
      >
      > Reviewers: nikic, nlopes, mkazantsev,spatel, dlrobertson, uabelho, hakzsam
      >
      > Subscribers: hiraditya, bollu, llvm-commits
      >
      > Tags: #llvm
      >
      > Differential Revision: https://reviews.llvm.org/D69571
      
      This reverts commit 57dd4b03.
      6ea47759
    • Mirko Brkusanin's avatar
      [Mips] Add rematerialization support for ldi.fmt · fed17867
      Mirko Brkusanin authored
      Instruction ldi.fmt can be considered cheap enough to avoid a spill and restore
      of the value that it produces, since that value is loaded from an immediate.
      
      Differential Revision: https://reviews.llvm.org/D69898
      fed17867
    • Simon Atanasyan's avatar
      [mips] Show an error if 64-bit target triple provided with 32-bit CPU · 068db2ed
      Simon Atanasyan authored
      When a 64-bit triple is used emit an error if the CPU only supports
      32-bit code.
      
      Patch by Miloš Stojanović.
      
      Differential Revision: https://reviews.llvm.org/D70018
      068db2ed
    • Daniil Suchkov's avatar
      Temporarily revert "[InstCombine] Fold PHIs with equal incoming pointers" · cba4a277
      Daniil Suchkov authored
      Revert due to sanitizer-windows buildbot failure.
      
      This reverts commit bbb29738.
      cba4a277
    • David Stenberg's avatar
      [DebugInfo] Avoid creating entry values for clobbered registers · 5e646ff5
      David Stenberg authored
      Summary:
      Entry values are considered for parameters that have register-described
      DBG_VALUEs in the entry block (along with other conditions).
      
      If a parameter's value has been propagated from the caller to the
      callee, then the parameter's DBG_VALUE in the entry block may be
      described using a register defined by some instruction, and entry values
      should not be emitted for the parameter, which can currently occur.
      One such case was seen in the attached test case, in which the second
      parameter, which is described by a redefinition of the first parameter's
      register, would incorrectly get an entry value using the first
      parameter's register. This commit intends to solve such cases by keeping
      track of register defines, and ignoring DBG_VALUEs in the entry block
      that are described by such registers.
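
      The idea of tracking register defines can be sketched on a simplified instruction model. The `Instr` struct and `entryValueCandidates` function are hypothetical and are not the DwarfDebug implementation; they only show the filtering rule described above.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, simplified instruction model; not LLVM's MachineInstr.
struct Instr {
  bool IsDbgValue; // true for a DBG_VALUE describing a parameter
  int Reg;         // register defined (normal instruction) or described (DBG_VALUE)
  int Param;       // parameter number for DBG_VALUEs, unused otherwise
};

// Walk the entry block in order, remember every register that has been
// defined so far, and accept only DBG_VALUEs whose register is untouched.
inline std::vector<int> entryValueCandidates(const std::vector<Instr> &EntryBlock) {
  std::set<int> DefinedRegs;
  std::vector<int> Params;
  for (const Instr &I : EntryBlock) {
    assert(I.Reg >= 0 && "register ids are non-negative in this sketch");
    if (!I.IsDbgValue) {
      DefinedRegs.insert(I.Reg);
      continue;
    }
    if (DefinedRegs.count(I.Reg) == 0)
      Params.push_back(I.Param);
  }
  return Params;
}
```

      With a block where the first parameter's register is redefined before the second parameter's DBG_VALUE, only the first parameter remains a candidate.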
      
      In a RelWithDebInfo build of clang-8, the average size of the set was
      27, and in a RelWithDebInfo+ASan build it was 30.
      
      Reviewers: djtodoro, NikolaPrica, aprantl, vsk
      
      Reviewed By: djtodoro, vsk
      
      Subscribers: hiraditya, llvm-commits
      
      Tags: #debug-info, #llvm
      
      Differential Revision: https://reviews.llvm.org/D69889
      5e646ff5
    • David Stenberg's avatar
      [DebugInfo] Add helper for finding entry value candidates [NFC] · 4fec44cd
      David Stenberg authored
      Summary:
      The conditions that are used to determine if entry values should be
      emitted for a parameter are quite many, and will grow slightly
      in a follow-up commit, so move those to a helper function, as was
      suggested in the code review for D69889.
      
      Reviewers: djtodoro, NikolaPrica
      
      Reviewed By: djtodoro
      
      Subscribers: probinson, hiraditya, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D69955
      4fec44cd
    • Sander de Smalen's avatar
      [AArch64] Extend storeRegToStackSlot to spill SVE registers. · 3367686b
      Sander de Smalen authored
      This patch allows the register allocator to spill SVE registers to the stack.
      
      Reviewers: ostannard, efriedma, rengolin, cameron.mcinally
      
      Reviewed By: efriedma
      
      Differential Revision: https://reviews.llvm.org/D70082
      3367686b
    • Daniil Suchkov's avatar
      [InstCombine] Fold PHIs with equal incoming pointers · bbb29738
      Daniil Suchkov authored
      When all incoming values of a PHI are equal pointers, this
      transformation inserts a definition of such a pointer right after
      the definition of the base pointer and replaces both the PHI and all
      its incoming pointers with this value. The primary goal of this
      transformation is canonicalization of this pattern in order to enable
      optimizations that can't handle PHIs. Non-inbounds pointers aren't
      currently supported.
      
      Reviewers: spatel, RKSimon, lebedev.ri, apilipenko
      
      Reviewed By: apilipenko
      
      Tags: #llvm
      
      Subscribers: hiraditya, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D68128
      bbb29738
    • Sander de Smalen's avatar
      [AArch64][SVE] Allocate locals that are scalable vectors. · 9a1c243a
      Sander de Smalen authored
      This patch adds a target interface to set the StackID for a given type,
      which allows scalable vectors (e.g. `<vscale x 16 x i8>`) to be assigned a
      'sve-vec' StackID, so it is allocated in the SVE area of the stack frame.
      
      Reviewers: ostannard, efriedma, rengolin, cameron.mcinally
      
      Reviewed By: efriedma
      
      Differential Revision: https://reviews.llvm.org/D70080
      9a1c243a