  1. Mar 08, 2019
    • Michael Kruse's avatar
      [RegionPass] Fix forgotten "!". · 65c5821e
      Michael Kruse authored
      Commit r355068 "Fix IR/Analysis layering issue with OptBisect" uses the
      template
      
         return Gate.isEnabled() && !Gate.shouldRunPass(this, getDescription(...));
      
      for all pass kinds. For the RegionPass, it left out the not operator,
      causing region passes to be skipped as soon as a pass gate is used.
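To make the effect of the missing "!" concrete, here is a minimal sketch. `PassGate` is a hypothetical stand-in with two boolean fields, not LLVM's real gate class, and the pass and description arguments from the template are omitted.

```cpp
#include <cassert>

// Hypothetical stand-in for the pass gate; not LLVM's actual class.
struct PassGate {
  bool Enabled;    // is a gate (for example a bisect limit) in effect?
  bool RunAllowed; // would the gate let this particular pass run?
  bool isEnabled() const { return Enabled; }
  bool shouldRunPass() const { return RunAllowed; }
};

// Shape of the template quoted above: skip only when the gate is in use
// AND it declines to run the pass.
bool skipPass(const PassGate &Gate) {
  return Gate.isEnabled() && !Gate.shouldRunPass();
}

// Same expression with the "!" left out: the pass is skipped exactly when
// the gate is in use and would have allowed it.
bool skipPassMissingNot(const PassGate &Gate) {
  return Gate.isEnabled() && Gate.shouldRunPass();
}
```

With an enabled gate that allows the pass, `skipPass` returns false while `skipPassMissingNot` returns true, which matches the symptom described in the commit message.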
      
      llvm-svn: 355733
      65c5821e
    • Matt Arsenault's avatar
      AMDGPU: Move d16 load matching to preprocess step · e8c03a25
      Matt Arsenault authored
      When matching half of the build_vector to a load, there could still be
      a hidden dependency on the other half of the build_vector the pattern
      wouldn't detect. If there was an additional chain dependency on the
      other value, a cycle could be introduced.
      
      I don't think a tablegen pattern is capable of matching the necessary
      conditions, so move this into PreprocessISelDAG. Check isPredecessorOf
      for the other value to avoid a cycle. This has a warning that it's
      expensive, so this should probably be moved into an MI pass eventually
      that will have more freedom to reorder instructions to help match
      this. That is currently complicated by the lack of a computeKnownBits
      type mechanism for the selected function.
      
      llvm-svn: 355731
      e8c03a25
    • Matt Arsenault's avatar
      DAG: Don't try to cluster loads with tied inputs · 26e76ef0
      Matt Arsenault authored
      This avoids breaking possible value dependencies when sorting loads by
      offset.
      
      AMDGPU has some load instructions that write into the high or low bits
      of the destination register, and have a tied input for the other input
      bits. These can easily have the same base pointer, but be a swizzle so
      the high address load needs to come first. This was inserting glue
      forcing the opposite ordering, producing a cycle the InstrEmitter
      would assert on. It may be potentially expensive to look for the
      dependency between the other loads, so just skip any where this could
      happen.
      
      Fixes bug 40936 by reverting r351379, which added a hacky attempt to
      fix this by adding chains in this case, which I think was just working
      around broken glue before the InstrEmitter. The core of the patch is
      re-implementing the fix for that problem.
      
      llvm-svn: 355728
      26e76ef0
    • Matt Arsenault's avatar
      AMDGPU: Don't bother checking the chain in areLoadsFromSameBasePtr · f587fd9c
      Matt Arsenault authored
      This is only called in contexts that are verifying the chain itself,
      and the query itself is only asking about the address.
      
      llvm-svn: 355723
      f587fd9c
    • Matt Arsenault's avatar
      AMDGPU: Correct DS implementation of areLoadsFromSameBasePtr · 07f904be
      Matt Arsenault authored
      This was checking the wrong operands for the base register and the
      offsets. The indexes are shifted by the number of output registers
      from the machine instruction definition, and the chain is moved to the
      end.
      
      llvm-svn: 355722
      07f904be
    • Alexey Bataev's avatar
      [DEBUG_INFO][NVPTX]Emit empty .debug_loc section in presence of the debug option. · 78fcb838
      Alexey Bataev authored
      Summary:
      If the LLVM module shows that it has debug info, but the file is
      actually empty and the real debug info is not emitted, the ptxas tool
      emits error 'Debug information not found in presence of .target debug'.
      We need at least one empty debug section to silence this message. Section
      `.debug_loc` is not emitted for PTX, so we can emit an empty `.debug_loc`
      section if the `debug` option was emitted.
      
      Reviewers: tra
      
      Subscribers: jholewinski, aprantl, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D57250
      
      llvm-svn: 355719
      78fcb838
    • Amaury Sechet's avatar
      [DAGCombiner] fold (add (add (xor a, -1), b), 1) -> (sub b, a) · 782ac933
      Amaury Sechet authored
      Summary: This pattern is sometimes created after legalization.
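The fold relies on the two's complement identity `~a == -a - 1`, so `(~a + b) + 1 == b - a`. The sketch below checks this on wrapping 32-bit integers; the function names are illustrative and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Unfolded form: (add (add (xor a, -1), b), 1), using wrapping arithmetic.
uint32_t unfolded(uint32_t a, uint32_t b) {
  return ((a ^ 0xFFFFFFFFu) + b) + 1u;
}

// Folded form: (sub b, a).
uint32_t folded(uint32_t a, uint32_t b) {
  return b - a;
}
```

Because unsigned arithmetic wraps modulo 2^32, the two functions agree for every pair of inputs, including cases where `b < a`.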
      
      Reviewers: efriedma, spatel, RKSimon, zvi, bkramer
      
      Subscribers: llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58874
      
      llvm-svn: 355716
      782ac933
    • George Burgess IV's avatar
      [CFLAnders] Fix typo in comment; NFC · 4ea679f1
      George Burgess IV authored
      Patch by Enna1!
      
      Differential Revision: https://reviews.llvm.org/D58756
      
      llvm-svn: 355715
      4ea679f1
    • Wei Mi's avatar
      [RegisterCoalescer] Limit the number of joins for large live interval with many valnos. · 72ec6801
      Wei Mi authored
      
      Recently we found compile timeouts in several cases when
      SpeculativeLoadHardening was enabled. Most of the compile time was spent
      in the register coalescing pass, where the register coalescer tried to join
      many other live intervals with a few very large live intervals that had many
      valnos.
      
      Specifically, every time JoinVals::mapValues is called, computeAssignment is
      called getNumValNums() times for the target live interval. If the large
      live interval has N valnos and N copies associated with it, trying to
      coalesce those copies costs at least N^2.
      
      The patch limits the effort spent trying to join those very large live
      intervals with others. By default, once a live interval with more than 100
      valnos has been coalesced with other live intervals more than 100 times,
      we stop coalescing that live interval. This puts a compile-time cap on the
      N^2 algorithm and effectively solves the compile-time problem we saw.
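A rough sketch of the cutoff described above, assuming a simple per-interval counter. The struct, its fields, and the constant names are hypothetical; only the two default thresholds of 100 come from the commit message.

```cpp
#include <cassert>

// Hypothetical bookkeeping for one live interval; names are illustrative.
struct IntervalStats {
  unsigned NumValNos; // number of value numbers in the interval
  unsigned JoinCount; // how many times it has already been coalesced
};

// Default thresholds as stated in the commit message.
constexpr unsigned LargeIntervalValNos = 100;
constexpr unsigned LargeIntervalJoins = 100;

// Returns true when further coalescing of this interval should be skipped:
// both the size threshold and the join-count threshold must be exceeded.
bool shouldStopCoalescing(const IntervalStats &S) {
  return S.NumValNos > LargeIntervalValNos && S.JoinCount > LargeIntervalJoins;
}
```

Small intervals are never cut off regardless of how often they are joined, so the limit only affects the pathological case that triggers the quadratic behaviour.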
      
      Differential revision: https://reviews.llvm.org/D59143
      
      llvm-svn: 355714
      72ec6801
    • Sanjay Patel's avatar
      [x86] prevent infinite looping from inverse shuffle transforms · b22f438d
      Sanjay Patel authored
      llvm-svn: 355713
      b22f438d
    • Diogo N. Sampaio's avatar
      [ARM][FIX] Fix vfmal.f16 and vfmsl.f16 operand · c20c37ba
      Diogo N. Sampaio authored
      The indexed variants of the vfmal.f16 and vfmsl.f16
      instructions use the upper bits of the indexed
      operand to store the index (1 bit for the double
      variant, 2 bits for the quad).
      
      This limits the usable registers to d0 - d7 or
      s0 - s15. This patch enforces this limitation.
      
      Differential Revision: https://reviews.llvm.org/D59021
      
      llvm-svn: 355707
      c20c37ba
    • Simon Pilgrim's avatar
      [DAGCombine] Merge visitSMULO+visitUMULO into visitMULO. NFCI. · 04e8439f
      Simon Pilgrim authored
      llvm-svn: 355690
      04e8439f
    • Simon Pilgrim's avatar
      [DAGCombine] Merge visitSADDO+visitUADDO into visitADDO. NFCI. · c71d6d15
      Simon Pilgrim authored
      llvm-svn: 355689
      c71d6d15
    • Simon Pilgrim's avatar
      [DAGCombine] Merge visitSSUBO+visitUSUBO into visitSUBO. NFCI. · 2c2e76a9
      Simon Pilgrim authored
      llvm-svn: 355688
      2c2e76a9
    • Michael Platings's avatar
      [IR][ARM] Add function pointer alignment to datalayout · 308e82ec
      Michael Platings authored
      Use this feature to fix a bug on ARM where 4 byte alignment is
      incorrectly assumed.
      
      Differential Revision: https://reviews.llvm.org/D57335
      
      llvm-svn: 355685
      308e82ec
    • Clement Courbet's avatar
      [SelectionDAG] Allow the user to specify a memeq function. · 8e16d733
      Clement Courbet authored
      Summary:
      Right now, when we encounter a string equality check,
      e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a
      small compile-time constant, and fall back on calling `memcmp()` otherwise.
      
      This is sub-optimal because memcmp has to compute much more than
      equality.
      
      This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms
      that support `bcmp`.
      
      `bcmp` can be made much more efficient than `memcmp` because equality
      compare is trivially parallel while lexicographic ordering has a chain
      dependency.
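The difference between an equality-only check and an ordering check can be sketched as follows. `bytesEqual` is a hand-written illustration of why equality is easy to parallelise; it is not how any particular libc implements `bcmp`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Equality-only comparison: each byte pair is examined independently and the
// differences are OR-ed together, so no step depends on where the first
// mismatch occurs.
bool bytesEqual(const void *a, const void *b, std::size_t n) {
  const unsigned char *pa = static_cast<const unsigned char *>(a);
  const unsigned char *pb = static_cast<const unsigned char *>(b);
  unsigned char diff = 0;
  for (std::size_t i = 0; i < n; ++i)
    diff |= static_cast<unsigned char>(pa[i] ^ pb[i]);
  return diff == 0;
}

// The ordering-aware form the commit message starts from: memcmp must locate
// the first differing byte to decide the sign, even though only "== 0" is used.
bool memcmpEqual(const void *a, const void *b, std::size_t n) {
  return std::memcmp(a, b, n) == 0;
}
```

Both functions return the same boolean for any inputs; the point is only that the first one never needs to know which byte differed.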
      
      Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56593
      
      llvm-svn: 355672
      8e16d733
    • Carl Ritson's avatar
      [AMDGPU] V_CVT_F32_UBYTE{0,1,2,3} are full rate instructions · 1a98dc18
      Carl Ritson authored
      Summary: Fix a bug in the scheduling model where V_CVT_F32_UBYTE{0,1,2,3} are incorrectly marked as quarter rate instructions.
      
      Reviewers: arsenm, rampitec
      
      Reviewed By: rampitec
      
      Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D59091
      
      llvm-svn: 355671
      1a98dc18
    • Craig Topper's avatar
      [X86] Improve the type checking in isLegalMaskedLoad and isLegalMaskedGather. · 4505c99e
      Craig Topper authored
      We were just checking pointer size and type primitive size. But this caused unintended things like vectors of half being accepted by masked load/store.
      
      For FP we now explicitly check for only double and float.
      
      For pointers we now let any pointer through, trusting that only 32-bit and 64-bit pointers would be used to generate assembly.
      
      We only check bitwidth after checking that the type is an integer.
      
      llvm-svn: 355667
      4505c99e
    • Steven Wu's avatar
      [Bitcode] Fix bitcode compatibility issue with clang.arc.use intrinsic · ed982292
      Steven Wu authored
      Summary:
      In r349534, the ObjC ARC implementation was switched to use intrinsics and,
      at the same time, clang.arc.use was renamed to llvm.objc.clang.arc.use to
      make the naming more consistent. As a side effect, LLVM no longer
      recognizes the old name as an intrinsic and generates external references
      to it instead.
      
      Rather than upgrade the old intrinsic name to the new one and wait for
      the arc-contract pass to remove it, simply remove it in the bitcode
      upgrader.
      
      rdar://problem/48607063
      
      Reviewers: pete, ahatanak, erik.pilkington, dexonsmith
      
      Reviewed By: pete, dexonsmith
      
      Subscribers: jkorous, jdoerfert, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D59112
      
      llvm-svn: 355663
      ed982292
  2. Mar 07, 2019
    • Craig Topper's avatar
      [X86] Correct scheduler information for rotate by constant for Haswell, Broadwell, and Skylake. · d0c2dba6
      Craig Topper authored
      Rotate with explicit immediate is a single uop from Haswell on. An immediate of 1 has a dependency on the previous writer of flags, but the other immediate values do not.
      
      The implicit rotate by 1 instruction is 2 uops. But the flags are merged after the rotate uop so the data result does not see the flag dependency. But I don't think we have any way of modeling that.
      
      RORX is 1 uop without the load. 2 uops with the load. We currently model these with WriteShift/WriteShiftLd.
      
      Differential Revision: https://reviews.llvm.org/D59077
      
      llvm-svn: 355636
      d0c2dba6
    • Craig Topper's avatar
      [X86] Model ADC/SBB with immediate 0 more accurately in the Haswell scheduler model · b3af5d3e
      Craig Topper authored
      Haswell and possibly Sandybridge have an optimization for ADC/SBB with immediate 0 to use a single uop flow. This only applies to GR16/GR32/GR64 with an 8-bit immediate. It does not apply to GR8. It also does not apply to the implicit AX/EAX/RAX forms.
      
      Differential Revision: https://reviews.llvm.org/D59058
      
      llvm-svn: 355635
      b3af5d3e
    • Brian Gesiak's avatar
      [CodeGen] Reuse BlockUtils for -unreachableblockelim pass (NFC) · 4e467043
      Brian Gesiak authored
      Summary:
      The logic in the -unreachableblockelim pass does the following:
      
      1. It traverses the function it's given in depth-first order and
         creates a set of basic blocks that are unreachable from the
         function's entry node.
      2. It iterates over each of those unreachable blocks and (1) removes any
         successors' references to the dead block, and (2) replaces any uses of
         instructions from the dead block with null.
      
      The logic in (2) above is identical to what the `llvm::DeleteDeadBlocks`
      function from `BasicBlockUtils.h` does. The only difference is that
      `llvm::DeleteDeadBlocks` replaces uses of instructions from dead blocks
      not with null, but with undef.
      
      Replace the duplicate logic in the -unreachableblockelim pass with a
      call to `llvm::DeleteDeadBlocks`. This results in less code but no
      functional change (NFC).
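Step 1 of the description, collecting the blocks that a depth-first walk from the entry never reaches, can be sketched on a toy graph. The representation (blocks as indices, block 0 as the entry) and the function name are assumptions made for illustration; this is not the pass's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Blocks are numbered 0..N-1; Succs[i] lists the successors of block i.
// Block 0 is taken to be the entry. This is a toy CFG, not LLVM's BasicBlock.
std::vector<std::size_t>
findUnreachable(const std::vector<std::vector<std::size_t>> &Succs) {
  std::vector<bool> Reached(Succs.size(), false);
  std::vector<std::size_t> Stack;
  if (!Succs.empty()) {
    Reached[0] = true;
    Stack.push_back(0);
  }
  // Iterative depth-first traversal starting at the entry block.
  while (!Stack.empty()) {
    std::size_t B = Stack.back();
    Stack.pop_back();
    for (std::size_t S : Succs[B]) {
      if (!Reached[S]) {
        Reached[S] = true;
        Stack.push_back(S);
      }
    }
  }
  // Every block the traversal never marked is unreachable.
  std::vector<std::size_t> Dead;
  for (std::size_t I = 0; I < Succs.size(); ++I)
    if (!Reached[I])
      Dead.push_back(I);
  return Dead;
}
```

For a graph where 0 -> 1 -> 2 and blocks 3 and 4 only feed into 2 and 3 respectively, the function returns blocks 3 and 4; those are the blocks that step 2 would then hand to the deletion routine.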
      
      Reviewers: mkazantsev, wmi, davidxl, silvas, davide
      
      Reviewed By: davide
      
      Subscribers: llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D59064
      
      llvm-svn: 355634
      4e467043
    • Konstantin Zhuravlyov's avatar
      AMDHSA: Code object v3 updates · 47f0bf8f
      Konstantin Zhuravlyov authored
        - Copy kernel symbol attributes into kernel descriptor attributes
        - Make sure kernel symbol's visibility is not "higher" than protected
      
      Differential Revision: https://reviews.llvm.org/D59057
      
      llvm-svn: 355630
      47f0bf8f
    • Vlad Tsyrklevich's avatar
      Delete x86_64 ShadowCallStack support · 2e1479e2
      Vlad Tsyrklevich authored
      Summary:
      ShadowCallStack on x86_64 suffered from the same racy security issues as
      Return Flow Guard and had performance overhead as high as 13% depending
      on the benchmark. x86_64 ShadowCallStack was always an experimental
      feature and never shipped a runtime required to support it, as such
      there are no expected downstream users.
      
      Reviewers: pcc
      
      Reviewed By: pcc
      
      Subscribers: mgorny, javed.absar, hiraditya, jdoerfert, cfe-commits, #sanitizers, llvm-commits
      
      Tags: #clang, #sanitizers, #llvm
      
      Differential Revision: https://reviews.llvm.org/D59034
      
      llvm-svn: 355624
      2e1479e2
    • Jinsong Ji's avatar
      [PowerPC] Run clang-format to avoid a compiler warning. · de3348ae
      Jinsong Ji authored
      llvm-svn: 355623
      de3348ae
    • Mitch Phillips's avatar
      Rollback of rL355585. · 92dd321a
      Mitch Phillips authored
      Introduces memory leak in FunctionTest.GetPointerAlignment that breaks sanitizer buildbots:
      
      ```
      =================================================================
      ==2453==ERROR: LeakSanitizer: detected memory leaks
      
      Direct leak of 128 byte(s) in 1 object(s) allocated from:
          #0 0x610428 in operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:105
          #1 0x16936bc in llvm::User::operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/User.cpp:151:19
          #2 0x7c3fe9 in Create /b/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/IR/Function.h:144:12
          #3 0x7c3fe9 in (anonymous namespace)::FunctionTest_GetPointerAlignment_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/IR/FunctionTest.cpp:136
          #4 0x1a836a0 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc
          #5 0x1a836a0 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2474
          #6 0x1a85c55 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11
          #7 0x1a870d0 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28
          #8 0x1aa5b84 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43
          #9 0x1aa4d30 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc
          #10 0x1aa4d30 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4257
          #11 0x1a6b656 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46
          #12 0x1a6b656 in main /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50
          #13 0x7f5af37a22e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
      
      Indirect leak of 40 byte(s) in 1 object(s) allocated from:
          #0 0x610428 in operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:105
          #1 0x151be6b in make_unique<llvm::ValueSymbolTable> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/ADT/STLExtras.h:1349:29
          #2 0x151be6b in llvm::Function::Function(llvm::FunctionType*, llvm::GlobalValue::LinkageTypes, unsigned int, llvm::Twine const&, llvm::Module*) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/Function.cpp:241
          #3 0x7c4006 in Create /b/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/IR/Function.h:144:16
          #4 0x7c4006 in (anonymous namespace)::FunctionTest_GetPointerAlignment_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/IR/FunctionTest.cpp:136
          #5 0x1a836a0 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc
          #6 0x1a836a0 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2474
          #7 0x1a85c55 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11
          #8 0x1a870d0 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28
          #9 0x1aa5b84 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43
          #10 0x1aa4d30 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc
          #11 0x1aa4d30 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4257
          #12 0x1a6b656 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46
          #13 0x1a6b656 in main /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50
          #14 0x7f5af37a22e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
      
      SUMMARY: AddressSanitizer: 168 byte(s) leaked in 2 allocation(s).
      ```
      
      See http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/11358/steps/check-llvm%20asan/logs/stdio for more information.
      
      Also introduces use-of-uninitialized-value in ConstantsTest.FoldGlobalVariablePtr:
      ```
      ==7070==WARNING: MemorySanitizer: use-of-uninitialized-value
          #0 0x14e703c in User /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/User.h:79:5
          #1 0x14e703c in Constant /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/Constant.h:44
          #2 0x14e703c in llvm::GlobalValue::GlobalValue(llvm::Type*, llvm::Value::ValueTy, llvm::Use*, unsigned int, llvm::GlobalValue::LinkageTypes, llvm::Twine const&, unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/GlobalValue.h:78
          #3 0x14e5467 in GlobalObject /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/GlobalObject.h:34:9
          #4 0x14e5467 in llvm::GlobalVariable::GlobalVariable(llvm::Type*, bool, llvm::GlobalValue::LinkageTypes, llvm::Constant*, llvm::Twine const&, llvm::GlobalValue::ThreadLocalMode, unsigned int, bool) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/Globals.cpp:314
          #5 0x6938f1 in llvm::(anonymous namespace)::ConstantsTest_FoldGlobalVariablePtr_Test::TestBody() /b/sanitizer-x86_64-linux-fast/build/llvm/unittests/IR/ConstantsTest.cpp:565:18
          #6 0x1a240a1 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc
          #7 0x1a240a1 in testing::Test::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:2474
          #8 0x1a26d26 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11
          #9 0x1a2815f in testing::TestCase::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28
          #10 0x1a43de8 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43
          #11 0x1a42c47 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc
          #12 0x1a42c47 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:4257
          #13 0x1a0dfba in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46
          #14 0x1a0dfba in main /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50
          #15 0x7f2081c412e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
          #16 0x4dff49 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/unittests/IR/IRTests+0x4dff49)
      
      SUMMARY: MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/User.h:79:5 in User
      ```
      
      See http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/30222/steps/check-llvm%20msan/logs/stdio for more information.
      
      llvm-svn: 355616
      92dd321a
    • Petar Jovanovic's avatar
      [DebugInfo] Fix the type of the formatted variable · 95817d36
      Petar Jovanovic authored
      Change the format type of *Personality and *LSDAAddress to PRIx64 since
      they are of type uint64_t.
      The problem was detected on MIPS builds, where it was printing junk values
      and causing test failures.
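As a general illustration of the fix, `PRIx64` from `<cinttypes>` expands to the correct hexadecimal conversion for a `uint64_t` on the current platform, whereas passing a `uint64_t` to a conversion meant for a different type is undefined behaviour. The helper name below is hypothetical and the commit's actual call sites are not shown here.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 64-bit value as lowercase hex. PRIx64 supplies the length
// modifier that matches uint64_t, so the same code is correct on both
// 32-bit and 64-bit targets.
std::string formatAddress(uint64_t Address) {
  char Buffer[32];
  std::snprintf(Buffer, sizeof(Buffer), "%" PRIx64, Address);
  return std::string(Buffer);
}
```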
      
      Patch by Milos Stojanovic.
      
      Differential Revision: https://reviews.llvm.org/D58451
      
      llvm-svn: 355607
      95817d36
    • David Green's avatar
      [LSR] Attempt to increase the accuracy of LSR's setup cost · ffc922ec
      David Green authored
      In some loops, we end up generating loop induction variables that look like:
        {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1}
      As opposed to the simpler:
        {(zext i16 (%i0 * %i1) to i32),+,-1}
      i.e., we count up from -limit to 0, not the simpler counting down from limit to
      0. This is because the scores, as LSR calculates them, are the same and the
      second is filtered in place of the first. We end up with a redundant SUB from 0
      in the code.
      
      This patch tries to make the calculation of the setup cost a little more
      thorough, recursing into the SCEV members to better approximate the setup
      required. The cost function for comparing LSR costs is:
      
      return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                      C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
             std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                      C2.ScaleCost, C2.ImmCost, C2.SetupCost);
      So this will only alter results if none of the other variables turn out to be
      different.
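The tie-breaking behaviour of that comparison can be seen in a reduced form. The `Cost` struct below only mirrors the field names quoted in the commit message and is otherwise an assumption; `std::tie` builds tuples of references, and tuple `operator<` compares them lexicographically.

```cpp
#include <cassert>
#include <tuple>

// Reduced cost record using the field names from the commit message.
struct Cost {
  unsigned NumRegs, AddRecCost, NumIVMuls, NumBaseAdds;
  unsigned ScaleCost, ImmCost, SetupCost;
};

// Lexicographic comparison: a later field is consulted only when every
// earlier field compares equal, so SetupCost acts purely as a tie-breaker.
bool isLess(const Cost &C1, const Cost &C2) {
  return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                  C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
         std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                  C2.ScaleCost, C2.ImmCost, C2.SetupCost);
}
```

Two candidates that agree on the first six fields are ordered by `SetupCost`, while a candidate with fewer registers wins even if its setup cost is much higher.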
      
      Differential Revision: https://reviews.llvm.org/D58770
      
      llvm-svn: 355597
      ffc922ec
    • Petar Avramovic's avatar
      [MIPS GlobalISel] Fix mul operands · 3d3120dc
      Petar Avramovic authored
      Unsigned mul high for MIPS32 is selected into two pseudo-instructions,
      PseudoMULTu and PseudoMFHI, which use the accumulator register class ACC64
      for some of their operands. Registers in this class have the corresponding
      hi and lo registers as subregisters: $lo0 and $hi0 are subregisters of $ac0,
      etc. The mul instruction implicit-defs $lo0 and $hi0 according to
      MipsInstrInfo.td. In functions where both mul and PseudoMULTu are present,
      fastRegisterAllocator will "run out of registers during register allocation"
      because 'calcSpillCost' for $ac0 returns spillImpossible, since subregisters
      $lo0 and $hi0 of $ac0 are reserved by the mul instruction above. A solution
      is to mark the implicit-defs of $lo0 and $hi0 as dead in the mul instruction.
      
      Differential Revision: https://reviews.llvm.org/D58715
      
      llvm-svn: 355594
      3d3120dc
    • George Rimar's avatar
      [yaml2obj] - Allow producing ELFDATANONE ELFs · a5a0a0f0
      George Rimar authored
      I need this to remove a binary from LLD test suite.
      The patch also simplifies the code a bit.
      
      Differential revision: https://reviews.llvm.org/D59082
      
      llvm-svn: 355591
      a5a0a0f0
    • Fangrui Song's avatar
      [IDF] Delete a redundant J-edge test · 9ade843c
      Fangrui Song authored
      In the DJ-graph based computation of iterated dominance frontiers,
      SuccNode->getIDom() == Node is one of the tests to check if (Node,Succ)
      is a J-edge. If it is true, since Node is dominated by Root,
      
        SuccLevel = level(Node)+1 > RootLevel
      
      which means the next test SuccLevel > RootLevel will also be true. The
      check is therefore redundant and can be deleted, as it also involves one
      indirection and provides no speed-up.
      
      llvm-svn: 355589
      9ade843c
    • Kristof Beyls's avatar
      Add newline to interpreter debugging output · 730ecf8f
      Kristof Beyls authored
      When running lli --debug --force-interpreter=true the executed instructions are
      printed but are missing newlines. This commit adds the missing newlines.
      
      Patch by Andrew Brown.
      
      Differential Revision: https://reviews.llvm.org/D57806
      
      llvm-svn: 355587
      730ecf8f
    • Michael Platings's avatar
      [IR][ARM] Add function pointer alignment to datalayout · fd4156ed
      Michael Platings authored
      Use this feature to fix a bug on ARM where 4 byte alignment is
      incorrectly assumed.
      
      Differential Revision: https://reviews.llvm.org/D57335
      
      llvm-svn: 355585
      fd4156ed
    • Fangrui Song's avatar
      [BDCE] Optimize find+insert with early insert · b0f764c7
      Fangrui Song authored
      llvm-svn: 355583
      b0f764c7
    • Craig Topper's avatar
      [X86] Enable combineFMinNumFMaxNum for 512 bit vectors when AVX512 is enabled. · 3acc4236
      Craig Topper authored
      Simplified by just checking if the vector type is legal rather than listing all combinations of types and features.
      
      Fixes PR40984.
      
      llvm-svn: 355582
      3acc4236
    • Aakanksha Patil's avatar
      AMDGPU: Handle "uniform-work-group-size" attribute (fix for RADV) · c56d2afc
      Aakanksha Patil authored
      A previous patch for "uniform-work-group-size" attribute was found to break
      some RADV and possibly radeon SI tests and had to be retracted.
      This patch fixes that.
      
      Differential Revision: http://reviews.llvm.org/D58993
      
      llvm-svn: 355574
      c56d2afc
    • Nick Desaulniers's avatar
      [LoopRotate] fix crash encountered with callbr · 212c8ac2
      Nick Desaulniers authored
      Summary:
      While implementing inlining support for callbr
      (https://bugs.llvm.org/show_bug.cgi?id=40722), I hit a crash in Loop
      Rotation when trying to build the entire x86 Linux kernel
      (drivers/char/random.c). This is a small fix up to r353563.
      
      Test case is drivers/char/random.c (with callbr's inlined), then ran
      through creduce, then `opt -opt-bisect-limit=<limit>`, then bugpoint.
      
      Thanks to Craig Topper for immediately spotting the fix, and teaching me
      how to fish.
      
      Reviewers: craig.topper, jyknight
      
      Reviewed By: craig.topper
      
      Subscribers: hiraditya, llvm-commits, srhines
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58929
      
      llvm-svn: 355564
      212c8ac2
  3. Mar 06, 2019