  1. Mar 08, 2019
    • Shoaib Meenai's avatar
      [cmake] Remove llvm from LLVM_ALL_PROJECTS · 7a462ab7
      Shoaib Meenai authored
      LLVM is always built; including it in LLVM_ENABLE_PROJECTS has no
      effect, but since it's in LLVM_ALL_PROJECTS, we produce a confusing
      message about it being disabled. Drop it from LLVM_ALL_PROJECTS to avoid
      this. Pointed out by David Greene on the mailing list [1].
      
      [1] http://lists.llvm.org/pipermail/llvm-dev/2019-March/130854.html
      
      llvm-svn: 355735
      7a462ab7
    • Mitch Phillips's avatar
      [GN] Merge 355720. · 13661a9c
      Mitch Phillips authored
      llvm-svn: 355734
      13661a9c
    • Michael Kruse's avatar
      [RegionPass] Fix forgotten "!". · 65c5821e
      Michael Kruse authored
      Commit r355068 "Fix IR/Analysis layering issue with OptBisect" uses the
      template
      
         return Gate.isEnabled() && !Gate.shouldRunPass(this, getDescription(...));
      
      for all pass kinds. For the RegionPass, it left out the not operator,
      causing region passes to be skipped as soon as a pass gate is used.
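
The effect of the missing `!` can be seen in a small truth-table sketch. The function names below are hypothetical and only model the boolean expression quoted above; they are not LLVM's API.

```python
def skip_pass(gate_enabled: bool, should_run: bool) -> bool:
    """Intended template: skip only if a gate is in use and it vetoes the pass."""
    return gate_enabled and not should_run


def skip_pass_buggy(gate_enabled: bool, should_run: bool) -> bool:
    """Variant with the forgotten '!': skips exactly the passes that should run."""
    return gate_enabled and should_run
```

With a gate enabled and a pass the gate wants to run, `skip_pass` returns `False` while `skip_pass_buggy` returns `True`, which matches the described symptom of region passes being skipped whenever a gate is used.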
      
      llvm-svn: 355733
      65c5821e
    • Matt Arsenault's avatar
      AMDGPU: Move d16 load matching to preprocess step · e8c03a25
      Matt Arsenault authored
      When matching half of the build_vector to a load, there could still be
      a hidden dependency on the other half of the build_vector the pattern
      wouldn't detect. If there was an additional chain dependency on the
      other value, a cycle could be introduced.
      
      I don't think a tablegen pattern is capable of matching the necessary
      conditions, so move this into PreprocessISelDAG. Check isPredecessorOf
      for the other value to avoid a cycle. This has a warning that it's
      expensive, so this should probably be moved into an MI pass eventually
      that will have more freedom to reorder instructions to help match
      this. That is currently complicated by the lack of a computeKnownBits
      type mechanism for the selected function.
      
      llvm-svn: 355731
      e8c03a25
    • Matt Arsenault's avatar
      DAG: Don't try to cluster loads with tied inputs · 26e76ef0
      Matt Arsenault authored
      This avoids breaking possible value dependencies when sorting loads by
      offset.
      
      AMDGPU has some load instructions that write into the high or low bits
      of the destination register, and have a tied input for the other input
      bits. These can easily have the same base pointer, but be a swizzle so
      the high address load needs to come first. This was inserting glue
      forcing the opposite ordering, producing a cycle the InstrEmitter
      would assert on. It may be potentially expensive to look for the
      dependency between the other loads, so just skip any where this could
      happen.
      
      Fixes bug 40936 by reverting r351379, which added a hacky attempt to
      fix this by adding chains in this case, which I think was just working
      around broken glue before the InstrEmitter. The core of the patch is
      re-implementing the fix for that problem.
      
      llvm-svn: 355728
      26e76ef0
    • Sanjay Patel's avatar
      [x86] add tests for extracted vector FP cmp; NFC · 43f098e7
      Sanjay Patel authored
      llvm-svn: 355727
      43f098e7
    • Matthew Voss's avatar
      Revert "[runtimes] Move libunwind, libc++abi and libc++ to lib/ and include/" · 1262e52e
      Matthew Voss authored
      This broke the windows bots.
      
      This reverts commit 28302c66.
      
      llvm-svn: 355725
      1262e52e
    • Matt Arsenault's avatar
      AMDGPU: Add more tests for d16 loads · 74c9c305
      Matt Arsenault authored
      Also fix a few cases that weren't testing what they were supposed to.
      
      llvm-svn: 355724
      74c9c305
    • Matt Arsenault's avatar
      AMDGPU: Don't bother checking the chain in areLoadsFromSameBasePtr · f587fd9c
      Matt Arsenault authored
      This is only called in contexts that are verifying the chain itself,
      and the query itself is only asking about the address.
      
      llvm-svn: 355723
      f587fd9c
    • Matt Arsenault's avatar
      AMDGPU: Correct DS implementation of areLoadsFromSameBasePtr · 07f904be
      Matt Arsenault authored
      This was checking the wrong operands for the base register and the
      offsets. The indexes are shifted by the number of output registers
      from the machine instruction definition, and the chain is moved to the
      end.
      
      llvm-svn: 355722
      07f904be
    • Alexey Bataev's avatar
      [DEBUG_INFO][NVPTX]Emit empty .debug_loc section in presence of the debug option. · 78fcb838
      Alexey Bataev authored
      Summary:
      If the LLVM module shows that it has debug info, but the file is
      actually empty and the real debug info is not emitted, the ptxas tool
      emits error 'Debug information not found in presence of .target debug'.
      We need at least one empty debug section to silence this message. Section
      `.debug_loc` is not emitted for PTX and we can emit empty `.debug_loc`
      section if `debug` option was emitted.
      
      Reviewers: tra
      
      Subscribers: jholewinski, aprantl, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D57250
      
      llvm-svn: 355719
      78fcb838
    • Amaury Sechet's avatar
      [DAGCombiner] fold (add (add (xor a, -1), b), 1) -> (sub b, a) · 782ac933
      Amaury Sechet authored
      Summary: This pattern is sometimes created after legalization.
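
The fold relies on the two's complement identity `~a == -a - 1`, so `(~a + b) + 1 == b - a`. A minimal sketch that checks this exhaustively for 8-bit values follows; the helper name `fold_holds` is an assumption made for illustration.

```python
def fold_holds(a: int, b: int, bits: int = 8) -> bool:
    """Check (add (add (xor a, -1), b), 1) == (sub b, a) modulo 2**bits."""
    mask = (1 << bits) - 1
    not_a = a ^ mask                          # xor a, -1
    lhs = (((not_a + b) & mask) + 1) & mask   # add (add ..., b), 1
    rhs = (b - a) & mask                      # sub b, a
    return lhs == rhs


# Exhaustive check over all pairs of 8-bit operands.
ALL_OK = all(fold_holds(a, b) for a in range(256) for b in range(256))
```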
      
      Reviewers: efriedma, spatel, RKSimon, zvi, bkramer
      
      Subscribers: llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58874
      
      llvm-svn: 355716
      782ac933
    • George Burgess IV's avatar
      [CFLAnders] Fix typo in comment; NFC · 4ea679f1
      George Burgess IV authored
      Patch by Enna1!
      
      Differential Revision: https://reviews.llvm.org/D58756
      
      llvm-svn: 355715
      4ea679f1
    • Wei Mi's avatar
      [RegisterCoalescer] Limit the number of joins for large live interval with · 72ec6801
      Wei Mi authored
      many valnos.
      
      Recently we found compile timeouts in several cases when
      SpeculativeLoadHardening was enabled. Most of the compile time was spent
      in the register coalescing pass, where the register coalescer tried to join
      many other live intervals with some very large live intervals with many valnos.
      
      Specifically, every time JoinVals::mapValues is called, computeAssignment is
      called getNumValNums() times for the target live interval. If the large
      live interval has N valnos and N copies associated with it, coalescing
      those copies costs at least N^2.
      
      The patch limits the effort spent joining those very large live
      intervals with others. By default, once a live interval with > 100 valnos
      has been coalesced with other live intervals more than 100 times,
      we stop coalescing it. That caps the compile time of the N^2 algorithm
      and effectively solves the compile time problem we saw.
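
A rough cost model of the cap, as a sketch only: it assumes each join costs one computeAssignment call per valno and that the valno count stays constant. The function and parameter names are hypothetical, and the exact boundary behaviour of the real limits is not modelled.

```python
def coalescing_cost(n_valnos: int, n_copies: int,
                    valno_threshold: int = 100, join_limit: int = 100,
                    capped: bool = True) -> int:
    """Approximate number of computeAssignment calls for one large interval."""
    joins = n_copies
    # Assumption: only intervals above the valno threshold are capped.
    if capped and n_valnos > valno_threshold:
        joins = min(n_copies, join_limit)
    return joins * n_valnos
```

For N = 1000 valnos and 1000 copies, the uncapped model gives N^2 = 1,000,000 calls, while the capped model gives 100 * N = 100,000, so growth becomes linear in N.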
      
      Differential revision: https://reviews.llvm.org/D59143
      
      llvm-svn: 355714
      72ec6801
    • Sanjay Patel's avatar
      [x86] prevent infinite looping from inverse shuffle transforms · b22f438d
      Sanjay Patel authored
      llvm-svn: 355713
      b22f438d
    • Simon Pilgrim's avatar
      [X86] Add test case for PR22473 · 53652fea
      Simon Pilgrim authored
      llvm-svn: 355712
      53652fea
    • Diogo N. Sampaio's avatar
      [ARM][FIX] Fix vfmal.f16 and vfmsl.f16 operand · c20c37ba
      Diogo N. Sampaio authored
      The indexed variants of the vfmal.f16 and vfmsl.f16
      instructions use the upper bits of the indexed
      operand to store the index (1 bit for the double
      variant, 2 bits for the quad).
      
      This limits the usable registers to d0 - d7 or
      s0 - s15. This patch enforces this limitation.
      
      Differential Revision: https://reviews.llvm.org/D59021
      
      llvm-svn: 355707
      c20c37ba
    • Simon Pilgrim's avatar
      Fix typo in constant vector · 00ab0339
      Simon Pilgrim authored
      llvm-svn: 355699
      00ab0339
    • James Henderson's avatar
      [llvm-readelf]Don't lose negative-ness of negative addends for no symbol relocations · b41130be
      James Henderson authored
      llvm-readelf prints relocation addends as:
      
        <symbol value>[+-]<absolute addend>
      
      where [+-] is determined from whether addend is less than zero or not.
      However, it does not print the +/- if there is no symbol, which meant
      that negative addends became their positive value with no indication
      that this had happened. This patch stops the absolute conversion when
      addends are negative and there is no associated symbol.
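
The printing rule can be sketched as follows. This is an illustration of the behaviour described above, not llvm-readelf's code: the helper name, the hexadecimal rendering, and the 64-bit two's complement output for the no-symbol case are all assumptions.

```python
def format_addend(symbol_value: str, addend: int, bits: int = 64) -> str:
    """Render '<symbol value>[+-]<absolute addend>', keeping the sign if no symbol."""
    if symbol_value:
        sign = "-" if addend < 0 else "+"
        return f"{symbol_value}{sign}{abs(addend):x}"
    # No symbol means no sign is printed, so do not take the absolute value.
    # Assumption: show the raw value as unsigned two's complement hex.
    return f"{addend & ((1 << bits) - 1):x}"
```

The point of the no-symbol branch is only that a negative addend must remain distinguishable from its positive counterpart.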
      
      Reviewed by: Higuoxing, mattd, MaskRay
      
      Differential Revision: https://reviews.llvm.org/D59095
      
      llvm-svn: 355696
      b41130be
    • Nico Weber's avatar
      gn build: Merge r355685 · 6bce2f8e
      Nico Weber authored
      llvm-svn: 355695
      6bce2f8e
    • Nico Weber's avatar
      gn build: Unbreak finding a working `gn` on $PATH on Unix after r355645 · c3130a8a
      Nico Weber authored
      From the Python subprocess docs:
      
         If shell is True, it is recommended to pass args as a string rather than as
         a sequence.
      
         [...]
      
         If args is a sequence, the first item specifies the command string, and any
         additional items will be treated as additional arguments to the shell itself.
      
      Prior to this change, the `--version` would be passed to the shell, not to
      a potential gn binary on $PATH, and running `gn` without any arguments makes
      it exit with an exit code != 0, so the script would think that there wasn't
      a working gn binary on $PATH.
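
The behaviour quoted from the subprocess docs can be reproduced with `echo` standing in for `gn`. This is a minimal sketch; `shell_stdout` is a hypothetical helper, and the list-versus-string difference described in the comments applies to POSIX shells.

```python
import subprocess


def shell_stdout(args) -> str:
    """Run args with shell=True and return the captured stdout as text."""
    result = subprocess.run(args, shell=True, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL, universal_newlines=True)
    return result.stdout


# On POSIX, the list form runs `sh -c echo hello`: "hello" becomes an argument
# to the shell itself, so `echo` runs with no arguments.
as_list = shell_stdout(["echo", "hello"])

# The string form runs `sh -c "echo hello"`, so `echo` receives "hello".
as_string = shell_stdout("echo hello")
```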
      
      Fix this by following the documentation's recommendation of using a string
      now that we pass shell=True. I tested this on macOS and Windows, each with
      the three cases of
      
      - no gn on PATH (should run gn downloaded by get.py if present,
        else suggest running get.py)
      - broken gn wrapper on PATH (should behave like the previous item)
      - working gn on PATH (should use gn on PATH)
      
      llvm-svn: 355694
      c3130a8a
    • Nico Weber's avatar
      gn build: Unbreak get.py and gn.py on Windows · 38e6bcc1
      Nico Weber authored
      `os.uname()` doesn't exist on Windows, so use `platform.machine()` which
      returns `os.uname()[4]` on non-Win and (on 64-bit systems) "AMD64" on Windows.
      Also use `sys.platform` instead of `platform` to check for Windows-ness for the
      file extension in gn.py (get.py got this right).
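
A minimal sketch of the portable checks described above. The helper names are hypothetical and the real scripts may be structured differently.

```python
import platform
import sys


def host_arch() -> str:
    """Machine type; works on Windows too, unlike os.uname()."""
    return platform.machine()


def gn_binary_name(sys_platform: str = sys.platform) -> str:
    """Pick the executable name from sys.platform ('win32' on Windows)."""
    return "gn.exe" if sys_platform == "win32" else "gn"
```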
      
      Differential Revision: https://reviews.llvm.org/D59115
      
      llvm-svn: 355693
      38e6bcc1
    • Simon Pilgrim's avatar
      [DAGCombine] Merge visitSMULO+visitUMULO into visitMULO. NFCI. · 04e8439f
      Simon Pilgrim authored
      llvm-svn: 355690
      04e8439f
    • Simon Pilgrim's avatar
      [DAGCombine] Merge visitSADDO+visitUADDO into visitADDO. NFCI. · c71d6d15
      Simon Pilgrim authored
      llvm-svn: 355689
      c71d6d15
    • Simon Pilgrim's avatar
      [DAGCombine] Merge visitSSUBO+visitUSUBO into visitSUBO. NFCI. · 2c2e76a9
      Simon Pilgrim authored
      llvm-svn: 355688
      2c2e76a9
    • Michael Platings's avatar
      [IR][ARM] Add function pointer alignment to datalayout · 308e82ec
      Michael Platings authored
      Use this feature to fix a bug on ARM where 4 byte alignment is
      incorrectly assumed.
      
      Differential Revision: https://reviews.llvm.org/D57335
      
      llvm-svn: 355685
      308e82ec
    • Clement Courbet's avatar
      [SelectionDAG] Allow the user to specify a memeq function. · 8e16d733
      Clement Courbet authored
      Summary:
      Right now, when we encounter a string equality check,
      e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a
      small compile-time constant, and otherwise fall back on calling `memcmp()`.
      
      This is sub-optimal because memcmp has to compute much more than
      equality.
      
      This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms
      that support `bcmp`.
      
      `bcmp` can be made much more efficient than `memcmp` because equality
      compare is trivially parallel while lexicographic ordering has a chain
      dependency.
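
The difference between the two contracts can be sketched in a few lines. Both functions are illustrative models and not the libc implementations: `ordered_compare` must find the first differing byte in order, while `equal_only` can process chunks independently and combine the results in any order. Note that `equal_only` returns a boolean, whereas `bcmp` returns 0 on equality.

```python
def ordered_compare(a: bytes, b: bytes) -> int:
    """memcmp-like: the result depends on the FIRST differing byte."""
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    return 0


def equal_only(a: bytes, b: bytes, chunk: int = 8) -> bool:
    """bcmp-like: XOR each chunk and OR the differences together."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    diff = 0
    for i in range(0, len(a), chunk):
        word_a = int.from_bytes(a[i:i + chunk], "little")
        word_b = int.from_bytes(b[i:i + chunk], "little")
        diff |= word_a ^ word_b  # no dependency on earlier chunks' outcome
    return diff == 0
```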
      
      Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56593
      
      llvm-svn: 355672
      8e16d733
    • Carl Ritson's avatar
      [AMDGPU] V_CVT_F32_UBYTE{0,1,2,3} are full rate instructions · 1a98dc18
      Carl Ritson authored
      Summary: Fix a bug in the scheduling model where V_CVT_F32_UBYTE{0,1,2,3} are incorrectly marked as quarter rate instructions.
      
      Reviewers: arsenm, rampitec
      
      Reviewed By: rampitec
      
      Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D59091
      
      llvm-svn: 355671
      1a98dc18
    • Craig Topper's avatar
      [X86] Improve the type checking in isLegalMaskedLoad and isLegalMaskedGather. · 4505c99e
      Craig Topper authored
      We were just checking pointer size and type primitive size. But this caused unintended things like vectors of half being accepted by masked load/store.
      
      For FP we now explicitly check for only double and float.
      
      For pointers we now let any pointer through, trusting that only 32 and 64 would be used to generate assembly.
      
      We only check bitwidth after checking that the type is an integer.
      
      llvm-svn: 355667
      4505c99e
    • Petr Hosek's avatar
      [runtimes] Move libunwind, libc++abi and libc++ to lib/ and include/ · 28302c66
      Petr Hosek authored
      This change is a consequence of the discussion in "RFC: Place libs in
      Clang-dedicated directories", specifically the suggestion that
      libunwind, libc++abi and libc++ shouldn't be using Clang resource
      directory.  Tools like clangd make this assumption, but this is
      currently not true for the LLVM_ENABLE_PER_TARGET_RUNTIME_DIR build.
      This change addresses that by moving the output of these libraries to
      lib/<target> and include/ directories, leaving resource directory only
      for compiler-rt runtimes and Clang builtin headers.
      
      Differential Revision: https://reviews.llvm.org/D59013
      
      llvm-svn: 355665
      28302c66
    • Steven Wu's avatar
      [Bitcode] Fix bitcode compatibility issue with clang.arc.use intrinsic · ed982292
      Steven Wu authored
      Summary:
      In r349534, the objc arc implementation was switched to use intrinsics and,
      at the same time, clang.arc.use was renamed to llvm.objc.clang.arc.use to
      make the naming more consistent. The side effect is that llvm no
      longer recognizes it as an intrinsic and codegens external references to
      it instead.
      
      Rather than upgrade the old intrinsics name to the new one and wait for
      the arc-contract pass to remove it, simply remove it in the bitcode
      upgrader.
      
      rdar://problem/48607063
      
      Reviewers: pete, ahatanak, erik.pilkington, dexonsmith
      
      Reviewed By: pete, dexonsmith
      
      Subscribers: jkorous, jdoerfert, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D59112
      
      llvm-svn: 355663
      ed982292
    • Sanjay Patel's avatar
      [x86] add extract FP tests for target-specific nodes; NFC · 5ed14ef1
      Sanjay Patel authored
      llvm-svn: 355655
      5ed14ef1
    • Adrian Prantl's avatar
      Temporarily disable debug output in GenericDomTreeConstruction.h · de04a8c1
      Adrian Prantl authored
      to get the modules bots running again.
      
      The LLVM_DEBUG macro only plays well with a modular build of LLVM when
      the header is marked as textual, but doing so causes redefinition
      errors.
      
      llvm-svn: 355653
      de04a8c1
    • Adrian Prantl's avatar
      Make GenericDomTreeConstruction textual instead. · 1d1ff88b
      Adrian Prantl authored
      I think the problem is that it uses the LLVM_DEBUG macro in function bodies.
      
      llvm-svn: 355652
      1d1ff88b
  2. Mar 07, 2019
    • Adrian Prantl's avatar
      d61c80b8
    • Mitch Phillips's avatar
      [GN] Locate prebuilt binaries correctly. · c90886b9
      Mitch Phillips authored
      Use the system shell to see if we can find a 'gn' binary on $PATH. This solves the error wherein subprocess.call fails ungracefully if the binary doesn't exist.
      
      llvm-svn: 355645
      c90886b9
    • Hubert Tong's avatar
      Add secondary libstdc++ 4.8 and 5.1 detection mechanisms · 51dcfdbb
      Hubert Tong authored
      Summary:
      The date-based approach to detecting unsupported versions of libstdc++
      does not handle bug fix releases of older versions. As an example, the
      `__GLIBCXX__` value associated with version 5.1, `20150422`, is less
      than the values associated with versions 4.8.5 and 4.9.3.
      
      This patch adds secondary checks based on certain properties in
      sufficiently new versions of libstdc++.
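
The failure mode of the date-based check can be sketched as below. Only the 5.1 value (`20150422`) comes from the text above; the values for 4.8.5 and 4.9.3 are assumptions based on their release dates, and the helper name is hypothetical.

```python
GLIBCXX_DATES = {
    "5.1": 20150422,    # from the commit message
    "4.8.5": 20150623,  # assumption: derived from the release date
    "4.9.3": 20150626,  # assumption: derived from the release date
}


def new_enough_by_date(glibcxx: int,
                       minimum: int = GLIBCXX_DATES["5.1"]) -> bool:
    """Naive check: treat any __GLIBCXX__ date >= the 5.1 date as supported."""
    return glibcxx >= minimum


# Bug-fix releases of older versions pass the naive check because they
# shipped after 5.1 did.
MISCLASSIFIED = [v for v in ("4.8.5", "4.9.3")
                 if new_enough_by_date(GLIBCXX_DATES[v])]
```

This is why a date threshold alone cannot separate 4.8.x from 5.1, and why a secondary, property-based check is needed.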
      
      Reviewers: jfb, tstellar, rnk, sfertile, nemanjai
      
      Reviewed By: jfb
      
      Subscribers: mgorny, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58682
      
      llvm-svn: 355638
      51dcfdbb
    • Craig Topper's avatar
      [X86] Correct scheduler information for rotate by constant for Haswell, Broadwell, and Skylake. · d0c2dba6
      Craig Topper authored
      Rotate with explicit immediate is a single uop from Haswell on. An immediate of 1 has a dependency on the previous writer of flags, but the other immediate values do not.
      
      The implicit rotate by 1 instruction is 2 uops. But the flags are merged after the rotate uop so the data result does not see the flag dependency. But I don't think we have any way of modeling that.
      
      RORX is 1 uop without the load. 2 uops with the load. We currently model these with WriteShift/WriteShiftLd.
      
      Differential Revision: https://reviews.llvm.org/D59077
      
      llvm-svn: 355636
      d0c2dba6
    • Craig Topper's avatar
      [X86] Model ADC/SBB with immediate 0 more accurately in the Haswell scheduler model · b3af5d3e
      Craig Topper authored
      Haswell and possibly Sandybridge have an optimization for ADC/SBB with immediate 0 to use a single uop flow. This only applies to GR16/GR32/GR64 with an 8-bit immediate. It does not apply to GR8. It also does not apply to the implicit AX/EAX/RAX forms.
      
      Differential Revision: https://reviews.llvm.org/D59058
      
      llvm-svn: 355635
      b3af5d3e
    • Brian Gesiak's avatar
      [CodeGen] Reuse BlockUtils for -unreachableblockelim pass (NFC) · 4e467043
      Brian Gesiak authored
      Summary:
      The logic in the -unreachableblockelim pass does the following:
      
      1. It traverses the function it's given in depth-first order and
         creates a set of basic blocks that are unreachable from the
         function's entry node.
      2. It iterates over each of those unreachable blocks and (1) removes any
         successors' references to the dead block, and (2) replaces any uses of
         instructions from the dead block with null.
      
      The logic in (2) above is identical to what the `llvm::DeleteDeadBlocks`
      function from `BasicBlockUtils.h` does. The only difference is that
      `llvm::DeleteDeadBlocks` replaces uses of instructions from dead blocks
      not with null, but with undef.
      
      Replace the duplicate logic in the -unreachableblockelim pass with a
      call to `llvm::DeleteDeadBlocks`. This results in less code but no
      functional change (NFC).
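
The two steps above can be sketched on a toy control-flow graph. This is a simplified model, not the pass itself: the CFG is a plain successor map in which every block appears as a key, the function names are hypothetical, and the replacement of instruction uses with undef is not modelled.

```python
def unreachable_blocks(successors: dict, entry: str) -> set:
    """Step 1: depth-first walk from the entry; the rest is unreachable."""
    seen = set()
    stack = [entry]
    while stack:
        block = stack.pop()
        if block in seen:
            continue
        seen.add(block)
        stack.extend(successors.get(block, ()))
    return set(successors) - seen


def delete_dead_blocks(successors: dict, entry: str) -> dict:
    """Step 2 (edges only): drop dead blocks and, with them, their edges
    into live successors."""
    dead = unreachable_blocks(successors, entry)
    return {b: list(s) for b, s in successors.items() if b not in dead}


cfg = {
    "entry": ["a"],
    "a": ["exit"],
    "dead1": ["a", "dead2"],  # unreachable, but still points at live block "a"
    "dead2": ["dead1"],
    "exit": [],
}
```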
      
      Reviewers: mkazantsev, wmi, davidxl, silvas, davide
      
      Reviewed By: davide
      
      Subscribers: llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D59064
      
      llvm-svn: 355634
      4e467043