  1. Sep 06, 2016
    • Tom Stellard's avatar
      AMDGPU/SI: Teach SIInstrInfo::FoldImmediate() to fold immediates into copies · 2add8a11
      Tom Stellard authored
      Summary:
      I put this code here, because I want to re-use it in a few other places.
      This supersedes some of the immediate folding code we have in SIFoldOperands.
      I think the peephole optimizer is probably a better place for folding
      immediates into copies, since it does some register coalescing at the same time.
      
      This will also make it easier to transition SIFoldOperands into a smarter pass,
      where it looks at all uses of an instruction at once to determine the optimal way to
      fold operands.  Right now, the pass just considers one operand at a time.
      
      Reviewers: arsenm
      
      Subscribers: wdng, nhaehnle, arsenm, llvm-commits, kzhuravl
      
      Differential Revision: https://reviews.llvm.org/D23402
      
      llvm-svn: 280744
      2add8a11
    • Wei Ding's avatar
      AMDGPU : Add XNACK feature to GPUs that support it. · 5e832e86
      Wei Ding authored
      Differential Revision: http://reviews.llvm.org/D24276
      
      llvm-svn: 280742
      5e832e86
    • Reid Kleckner's avatar
      Fix ItaniumDemangle.cpp build with MSVC 2013 · b2881f1f
      Reid Kleckner authored
      llvm-svn: 280740
      b2881f1f
    • Ying Yi's avatar
      [llvm-cov] Add the "Go to first unexecuted line" feature. · d36b47c4
      Ying Yi authored
      This patch provides easy navigation to find the zero count lines, especially useful when the source file is very large.
      
      Differential Revision: https://reviews.llvm.org/D23277
      
      llvm-svn: 280739
      d36b47c4
    • Evandro Menezes's avatar
      [AArch64] Adjust the scheduling model for Exynos M1. · 405c90e6
      Evandro Menezes authored
      Further refine the model for branches.
      
      llvm-svn: 280736
      405c90e6
    • Evandro Menezes's avatar
      [AArch64] Adjust the scheduling model for Exynos M1. · 77e6b5d4
      Evandro Menezes authored
      Further refine the model for stores.
      
      llvm-svn: 280735
      77e6b5d4
    • Evandro Menezes's avatar
      [AArch64] Adjust the scheduling model for Exynos M1. · 199cad4f
      Evandro Menezes authored
      Further refine the model for loads.
      
      llvm-svn: 280734
      199cad4f
    • Rafael Espindola's avatar
      Add a C++ Itanium demangler to LLVM. · b940b66c
      Rafael Espindola authored
      This adds a copy of the demangler in libcxxabi.
      
      The code also has no dependencies on anything else in LLVM. To enforce
      that I added it as another library. That way a BUILD_SHARED_LIBS will
      fail if anyone adds a use of StringRef, for example.
      
      The no llvm dependency combined with the fact that this has to build
      on linux, OS X and Windows required a few changes to the code. In
      particular:
      
          No constexpr.
          No alignas
      
      On OS X at least this library has only one global symbol:
      __ZN4llvm16itanium_demangleEPKcPcPmPi
      
      My current plan is:
      
          Commit something like this
          Change lld to use it
          Change lldb to use it as the fallback
      
          Add a few #ifdefs so that exactly the same file can be used in
          libcxxabi to export abi::__cxa_demangle.
      
      Once the fast demangler in lldb can handle any names this
      implementation can be replaced with it and we will have the one true
      demangler.
      
      llvm-svn: 280732
      b940b66c
    • Sanjay Patel's avatar
      fix formatting; NFC · 4e463b4a
      Sanjay Patel authored
      llvm-svn: 280727
      4e463b4a
    • Davide Italiano's avatar
      [MCTargetDesc] Delete dead code. Found by GCC7 -Wunused-function. · 5715012b
      Davide Italiano authored
      Also unbreak newer gcc build with -Werror.
      
      llvm-svn: 280726
      5715012b
    • Victor Leschuk's avatar
      Fix comment formatting for DebugInfoFlags.def · a2cd4131
      Victor Leschuk authored
      llvm-svn: 280722
      a2cd4131
    • Justin Bogner's avatar
      bugpoint: Return Errors instead of passing around strings · 1c039155
      Justin Bogner authored
      This replaces the threading of `std::string &Error` through all of
      these APIs with checked Error returns instead. There are very few
      places here that actually emit any errors right now, but threading the
      APIs through will allow us to replace a bunch of exit(1)'s that are
      scattered through this code with proper error handling.
      
      This is more or less NFC, but does move around where a couple of error
      messages are printed out.
      
      llvm-svn: 280720
      1c039155
    • Krzysztof Parzyszek's avatar
      [RDF] Ignore undef use operands · 7c9b0126
      Krzysztof Parzyszek authored
      llvm-svn: 280717
      7c9b0126
    • Leny Kholodov's avatar
      Formatting with clang-format patch r280700 · 40c6235b
      Leny Kholodov authored
      llvm-svn: 280716
      40c6235b
    • Simon Pilgrim's avatar
      [SelectionDAG] Simplify extract_subvector( insert_subvector ( Vec, In, Idx ), Idx ) -> In · 1b4462b7
      Simon Pilgrim authored
      If we are extracting a subvector that has just been inserted then we should just use the original inserted subvector.
      
      This has come up in several x86 shuffle lowering cases where we are crossing 128-bit lanes.
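
      The identity behind this fold can be sketched outside of SelectionDAG. The following is a minimal model on std::vector, not the DAG combine itself; the names Vec, insertSubvector and extractSubvector are hypothetical stand-ins for the DAG nodes, and the model assumes the extracted subvector has the same length as the inserted one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Model of insert_subvector: overwrite Sub.size() elements of V starting at Idx.
Vec insertSubvector(Vec V, const Vec &Sub, std::size_t Idx) {
  assert(Idx + Sub.size() <= V.size() && "subvector must fit");
  for (std::size_t I = 0; I < Sub.size(); ++I)
    V[Idx + I] = Sub[I];
  return V;
}

// Model of extract_subvector: copy Len elements of V starting at Idx.
Vec extractSubvector(const Vec &V, std::size_t Idx, std::size_t Len) {
  assert(Idx + Len <= V.size() && "subvector must fit");
  return Vec(V.begin() + Idx, V.begin() + Idx + Len);
}
```

      For any Vec, In and Idx that fit, extractSubvector(insertSubvector(Vec, In, Idx), Idx, In.size()) equals In, so the insert/extract pair can be replaced by In.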
      
      Differential Revision: https://reviews.llvm.org/D24254
      
      llvm-svn: 280715
      1b4462b7
    • Adam Nemet's avatar
      [JumpThreading] Only write back branch-weight MDs for blocks that originally had PGO info · c520822d
      Adam Nemet authored
      Currently the pass updates branch weights in the IR if the function has
      any PGO info (entry frequency is set).  However we could still have
      regions of the CFG that do not have branch weights collected (e.g. a
      cold region).  In this case we'd use static estimates.  Since static
      estimates for branches are determined independently, they are
      inconsistent.  Updating them can "randomly" inflate block frequencies.
      
      I've run into this in a completely cold loop of h264ref from
      SPEC.  -Rpass-with-hotness showed the loop to be completely cold during
      inlining (before JT) but completely hot during vectorization (after JT).
      
      The new testcase demonstrates the problem.  We check array elements
      against 1, 2 and 3 in a loop.  The check against 3 is the loop-exiting
      check.  The block names should be self-explanatory.
      
      In this example, jump threading incorrectly updates the weight of the
      loop-exiting branch to 0, drastically inflating the frequency of the
      loop (in the range of billions).
      
      There is no run-time profile info for edges inside the loop, so branch
      probabilities are estimated.  These are the resulting branch and block
      frequencies for the loop body:
      
                      check_1 (16)
                  (8) /  |
                  eq_1   | (8)
                      \  |
                      check_2 (16)
                  (8) /  |
                  eq_2   | (8)
                      \  |
                      check_3 (16)
                  (1) /  |
             (loop exit) | (15)
                         |
                    (back edge)
      
      First we thread eq_1 -> check_2 to check_3.  Frequencies are updated to
      remove the frequency of eq_1 from check_2 and then from the false edge
      leaving check_2.  Changed frequencies are highlighted with * *:
      
                      check_1 (16)
                  (8) /  |
                 eq_1~   | (8)
                 /       |
                /     check_2 (*8*)
               /  (8) /  |
               \  eq_2   | (*0*)
                \     \  |
                 ` --- check_3 (16)
                  (1) /  |
             (loop exit) | (15)
                         |
                    (back edge)
      
      Next we thread eq_1 -> check_3 and eq_2 -> check_3 to check_1 as new
      back edges.  Frequencies are updated to remove the frequency of eq_1 and
      eq_2 from check_3 and then the false edge leaving check_3 (changed
      frequencies are highlighted with * *):
      
                        check_1 (16)
                    (8) /  |
                   eq_1~   | (8)
                   /       |
                  /     check_2 (*8*)
                 /  (8) /  |
                /-- eq_2~  | (*0*)
        (back edge)        |
                        check_3 (*0*)
                  (*0*) /  |
               (loop exit) | (*0*)
                           |
                      (back edge)
      
      As a result, the loop exit edge ends up with 0 frequency, which in turn makes
      the loop header have maximum frequency.
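
      The effect of a zero exit weight can be illustrated with a deliberately simplified model. This is a sketch, not LLVM's BlockFrequencyInfo: the function loopHeaderFrequency and the choice to saturate at the largest 64-bit value are assumptions made for illustration. It treats the header frequency as the entry frequency divided by the probability of taking the exit edge.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Simplified model: a loop entered EntryFreq times, whose exiting branch has
// weight ExitWeight out of TotalWeight, runs its header about
// EntryFreq * TotalWeight / ExitWeight times.
std::uint64_t loopHeaderFrequency(std::uint64_t EntryFreq,
                                  std::uint64_t ExitWeight,
                                  std::uint64_t TotalWeight) {
  assert(ExitWeight <= TotalWeight && "exit weight is part of the total");
  const std::uint64_t Max = std::numeric_limits<std::uint64_t>::max();
  if (ExitWeight == 0)
    return Max; // The loop never exits in this model: saturate.
  if (TotalWeight != 0 && EntryFreq > Max / TotalWeight)
    return Max; // Guard against overflow in the multiplication.
  return EntryFreq * TotalWeight / ExitWeight;
}
```

      With the statically estimated weights from the first diagram (exit 1, back edge 15), one entry gives loopHeaderFrequency(1, 1, 16) == 16, matching the frequency of check_1. Once the exit weight is rewritten to 0, the same call saturates.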
      
      There are a few potential problems here:
      
      1. The profile data seems odd.  There is a single profile sample of the
      loop being entered.  On the other hand, there are no weights inside the
      loop.
      
      2. Based on static estimation we shouldn't set edges to "extreme"
      values, i.e. extremely likely or unlikely.
      
      3. We shouldn't create profile metadata that is calculated from static
      estimation.  I am not sure what the policy is but it seems to make sense to
      treat profile metadata as something that is known to originate from
      profiling.  Estimated probabilities should only be reflected in BPI/BFI.
      
      Any one of these would probably fix the immediate problem.  I went for 3
      because I think it's a good policy to have and added a FIXME about 2.
      
      Differential Revision: https://reviews.llvm.org/D24118
      
      llvm-svn: 280713
      c520822d
    • Leny Kholodov's avatar
      Fix for Bindings/Go/go.test after patch r280700 · dabff7d8
      Leny Kholodov authored
      llvm-svn: 280711
      dabff7d8
    • Chris Dewhurst's avatar
      [Sparc][Leon] Corrected supported atomics size for processors supporting Leon... · 92cac932
      Chris Dewhurst authored
      [Sparc][Leon] Corrected supported atomics size for processors supporting Leon CASA instruction back to 32 bits.
      
      This was erroneously checked in for 64 bits while trying to find out whether there was a way to get 64-bit atomicity in Leon processors. There is not, and this change should not have been checked in. There is no unit test for this, as the existing unit tests test for behaviour to 32 bits, which was the original intention of the code.
      
      llvm-svn: 280710
      92cac932
    • Simon Dardis's avatar
      [mips] Tighten FastISel restrictions · b432a3ed
      Simon Dardis authored
      LLVM PR/29052 highlighted that FastISel for MIPS attempted to lower
      arguments assuming that it was using the paired 32bit registers to
      perform operations for f64. This mode of operation is not supported
      for MIPSR6.
      
      This patch resolves the reported issue by adding additional checks
      for unsupported floating point unit configuration.
      
      Thanks to mike.k for reporting this issue!
      
      Reviewers: seanbruno, vkalintiris
      
      Differential Revision: https://reviews.llvm.org/D23795
      
      llvm-svn: 280706
      b432a3ed
    • Krzysztof Parzyszek's avatar
      [PPC] Claim stack frame before storing into it, if no red zone is present · 020ec299
      Krzysztof Parzyszek authored
      Unlike PPC64, PPC32/SVR4 does not have a red zone. In the absence of it
      there is no guarantee that this part of the stack will not be modified 
      by any interrupt. To avoid this, make sure to claim the stack frame first
      before storing into it.
      
      This fixes https://llvm.org/bugs/show_bug.cgi?id=26519.
      
      Differential Revision: https://reviews.llvm.org/D24093
      
      llvm-svn: 280705
      020ec299
    • Leny Kholodov's avatar
      DebugInfo: use strongly typed enum for debug info flags · 5fcc4185
      Leny Kholodov authored
      Use ADT/BitmaskEnum for DINode::DIFlags for the following purposes:
          * Get rid of unsigned int for flags to avoid problems on platforms with sizeof(int) < 4
          * Flags are now strongly typed
      
      Patch by: Victor Leschuk <vleschuk@gmail.com>
      
      Differential Revision: https://reviews.llvm.org/D23766
      
      llvm-svn: 280700
      5fcc4185
    • Silviu Baranga's avatar
      [RegisterScavenger] Remove aliasing registers of operands from the candidate set · 0b7c4af3
      Silviu Baranga authored
      Summary:
      In addition to not including the register operand of the current
      instruction also don't include any aliasing registers. We can't consider
      these as candidates because using them will clobber the corresponding
      register operand of the current instruction.
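
      The filtering described above can be sketched as a set operation. This is a minimal illustration, not the RegisterScavenger code: Reg, AliasMap and removeOperandAliases are hypothetical names, and the alias table is assumed to list, for each register, every register that overlaps it.

```cpp
#include <map>
#include <set>
#include <vector>

using Reg = unsigned;

// Hypothetical alias table: register -> all registers that overlap it.
using AliasMap = std::map<Reg, std::set<Reg>>;

// Drop every operand register, and every register aliasing an operand,
// from the candidate set, since using one would clobber the operand.
std::set<Reg> removeOperandAliases(std::set<Reg> Candidates,
                                   const std::vector<Reg> &Operands,
                                   const AliasMap &Aliases) {
  for (Reg Op : Operands) {
    Candidates.erase(Op);
    auto It = Aliases.find(Op);
    if (It == Aliases.end())
      continue; // No recorded aliases for this operand.
    for (Reg Alias : It->second)
      Candidates.erase(Alias);
  }
  return Candidates;
}
```

      For instance, if register 0 is a double register overlapping single registers 1 and 2, an instruction using register 0 leaves only the unrelated register 3 as a candidate out of {0, 1, 2, 3}.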
      
      This change doesn't include a test case and it would probably be difficult
      to produce a stable one since the bug depends on the results of register
      allocation.
      
      Reviewers: MatzeB, qcolombet, hfinkel
      
      Subscribers: hfinkel, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24130
      
      llvm-svn: 280698
      0b7c4af3
    • Craig Topper's avatar
      [AVX-512] Fix masked VPERMI2PS isel when the index comes from a bitcast. · 4fa3b50f
      Craig Topper authored
      We need to bitcast the index operand to a floating point type so that it matches the result type. If not then the passthru part of the DAG will be a bitcast from the index's original type to the destination type. This makes it very difficult to match. The other option would be to add 5 sets of patterns for every other possible type.
      
      llvm-svn: 280696
      4fa3b50f
    • Craig Topper's avatar
      [AVX-512] Add a test case to show that we don't select masked vpermi2ps when... · cf9f1b8d
      Craig Topper authored
      [AVX-512] Add a test case to show that we don't select masked vpermi2ps when the index operand comes from a bitcast.
      
      It doesn't work because we're looking for a bitcast from the v4i32 index operand to v4f32 for the passthru part of the DAG. But since the index is bitcasted from v2i64 and bitcasts fold, we actually have a bitcast from v2i64 to v4f32 in the passthru part of the DAG.
      
      Taken from optimized output from clang's test case.
      
      llvm-svn: 280695
      cf9f1b8d
    • Craig Topper's avatar
      [X86] Remove unused encoding from IntrinsicType enum. · 43fbd840
      Craig Topper authored
      llvm-svn: 280694
      43fbd840
    • Craig Topper's avatar
      [X86] Fix indentation. NFC · a0055d31
      Craig Topper authored
      llvm-svn: 280693
      a0055d31
    • Justin Bogner's avatar
      Revert "bugpoint: Stop threading errors through APIs that never fail" · 24dac6af
      Justin Bogner authored
      This isn't the right thing to do - it turns out a number of the APIs
      that "never fail" just exit(1) if something bad happens. We can and
      should thread Error through this instead.
      
      That diff will make more sense with this reverted. Sorry for the
      noise.
      
      This reverts r280690
      
      llvm-svn: 280691
      24dac6af
    • Justin Bogner's avatar
      bugpoint: Stop threading errors through APIs that never fail · 46b1a9a7
      Justin Bogner authored
      This simplifies ListReducer and most of its subclasses by removing the
      std::string &Error that was threaded through all of them but almost
      never used. If we end up needing error handling in more places here we
      can reinstate it using llvm::Error instead of these unwieldy strings.
      
      The 2 cases (out of 12) that actually can hit the error cases are a
      little bit awkward now, but those will clean up as I refactor this API
      further.
      
      llvm-svn: 280690
      46b1a9a7
    • Saleem Abdulrasool's avatar
      ARM: workaround bundled operation predication · bfa25bd1
      Saleem Abdulrasool authored
      This is a Windows ARM specific issue.  If the code path in the if conversion
      ends up using a relocation which will form an IMAGE_REL_ARM_MOV32T, we end up
      with a bundle to ensure that the mov.w/mov.t pair is not split up.  This is
      normally fine, however, if the branch is also predicated, then we end up trying
      to predicate the bundle.
      
      For now, report a bundle as being unpredicatable.  Although this is false, this
      would trigger a failure case previously anyways, so this is no worse.  That is,
      there should not be any code which would previously have been if-converted and
      predicated which would not be now.
      
      Under certain circumstances, it may be possible to "predicate the bundle".  This
      would require scanning all bundle instructions, ensuring that the bundle
      contains only predicatable instructions, and converting the bundle into an IT
      block sequence.  If the bundle is larger than the maximal IT block length (4
      instructions), it would require materializing multiple IT blocks from the single
      bundle.
      
      llvm-svn: 280689
      bfa25bd1
    • Mehdi Amini's avatar
      Revert "DebugInfo: use strongly typed enum for debug info flags" · 3821b53b
      Mehdi Amini authored
      This reverts commit r280686, bots are broken.
      
      llvm-svn: 280688
      3821b53b
    • Mehdi Amini's avatar
      [LTO] Constify (NFC) · 767e1457
      Mehdi Amini authored
      llvm-svn: 280687
      767e1457
    • Mehdi Amini's avatar
      DebugInfo: use strongly typed enum for debug info flags · 356d6b63
      Mehdi Amini authored
      Use ADT/BitmaskEnum for DINode::DIFlags for the following purposes:
          * Get rid of unsigned int for flags to avoid problems on platforms with sizeof(int) < 4
          * Flags are now strongly typed
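
      Both points can be illustrated with a plain scoped enum. This is a sketch that does not use ADT/BitmaskEnum; DemoFlags, its enumerators and hasFlag are hypothetical and are not the real DINode::DIFlags values.

```cpp
#include <cstdint>

// Hypothetical flag set with a fixed 32-bit underlying type, so bit 17 is
// representable even where int is only 16 bits wide.
enum class DemoFlags : std::uint32_t {
  Zero = 0,
  Private = 1u << 0,
  Artificial = 1u << 1,
  Vector = 1u << 17,
};

constexpr DemoFlags operator|(DemoFlags L, DemoFlags R) {
  return static_cast<DemoFlags>(static_cast<std::uint32_t>(L) |
                                static_cast<std::uint32_t>(R));
}

constexpr DemoFlags operator&(DemoFlags L, DemoFlags R) {
  return static_cast<DemoFlags>(static_cast<std::uint32_t>(L) &
                                static_cast<std::uint32_t>(R));
}

// True if any bit of F is set in Set.
constexpr bool hasFlag(DemoFlags Set, DemoFlags F) {
  return (Set & F) != DemoFlags::Zero;
}
```

      Because DemoFlags is a scoped enum, passing a bare unsigned where DemoFlags is expected no longer compiles without an explicit cast, which is the kind of strong typing the change is after.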
      
      Patch by: Victor Leschuk <vleschuk@gmail.com>
      
      Differential Revision: https://reviews.llvm.org/D23766
      
      llvm-svn: 280686
      356d6b63
    • Mehdi Amini's avatar
      Fix DenseSet::insert_as() for MSVC2015 (NFC) · ac00212f
      Mehdi Amini authored
      The latest MSVC update apparently resolves the call from the
      const ref variant to itself, leading to an infinite
      recursion. It is not clear to me why the r-value overload is
      not selected. `ValueT` is a pointer type, and the functional-style
      cast in the call `insert_as(ValueT(V), LookupKey);` should result
      in an r-value ref. A bug in MSVC?
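
      The expected overload resolution can be reproduced in a reduced form. This is a sketch, not the DenseSet code: TinySet is hypothetical, the LookupKey parameter is omitted, and the counters exist only to show which overload runs.

```cpp
template <typename ValueT> struct TinySet {
  int ConstRefCalls = 0;
  int RValueCalls = 0;

  // Const-ref variant: copy the value and forward to the r-value overload.
  void insert_as(const ValueT &V) {
    ++ConstRefCalls;
    // ValueT(V) is a prvalue, so the ValueT&& overload is the better match.
    // Selecting this const-ref overload again would recurse forever.
    insert_as(ValueT(V));
  }

  // R-value variant: the real container would store the value here.
  void insert_as(ValueT &&V) {
    (void)V;
    ++RValueCalls;
  }
};
```

      With a conforming compiler, calling insert_as on a TinySet<int *> with an lvalue pointer runs each overload exactly once.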
      
      Differential Revision: https://reviews.llvm.org/D23956
      
      llvm-svn: 280685
      ac00212f
    • Craig Topper's avatar
      [AVX-512] Fix v8i64 shift by immediate lowering on 32-bit targets. · 62d0a5e7
      Craig Topper authored
      llvm-svn: 280684
      62d0a5e7
    • Saleem Abdulrasool's avatar
      CodeGen: ensure that libcalls are always AAPCS CC · a6519b1d
      Saleem Abdulrasool authored
      All of the builtins are designed to be invoked with ARM AAPCS CC even on ARM
      AAPCS VFP CC hosts.  Tweak the default initialisation to ARM AAPCS CC rather
      than C CC for ARM/thumb targets.
      
      The changes to the tests are necessary to ensure that the calling convention for
      the lowered library calls is honoured.  Furthermore, these adjustments cause
      certain branch invocations to change to branch-and-link since the returned value
      needs to be moved across registers (d0 -> r0, r1).
      
      llvm-svn: 280683
      a6519b1d
    • Craig Topper's avatar
      [AVX-512] Teach fastisel load/store handling to use EVEX encoded instructions... · dfc4fc9f
      Craig Topper authored
      [AVX-512] Teach fastisel load/store handling to use EVEX encoded instructions for 128/256-bit vectors and scalar single/double.
      
      Still need to fix the register classes to allow the extended range of registers.
      
      llvm-svn: 280682
      dfc4fc9f
    • Craig Topper's avatar
      [X86] Update fast-isel store test to have more 256 and 512-bit test cases. Add... · 70e13480
      Craig Topper authored
      [X86] Update fast-isel store test to have more 256 and 512-bit test cases. Add command lines for AVX and AVX512 feature sets.
      
      llvm-svn: 280681
      70e13480
    • Craig Topper's avatar
      [X86] Update fast-isel vector load test to have more 256 and 512-bit test... · f54ebca2
      Craig Topper authored
      [X86] Update fast-isel vector load test to have more 256 and 512-bit test cases. Add a command line for SKX features too.
      
      llvm-svn: 280680
      f54ebca2
    • Sanjay Patel's avatar
      fix FileCheck variables for test added with r280677 · e341c919
      Sanjay Patel authored
      The script (utils/update_test_checks.py) seems to have problems 
      with variable names that start with the same string. 
      
      llvm-svn: 280679
      e341c919
    • Gor Nishanov's avatar
      [Coroutines] Part12: Handle alloca address-taken · ccabaca2
      Gor Nishanov authored
      Summary:
      Move early uses of spilled variables after CoroBegin.
      
      For example, if a parameter had its address taken, we may end up with code
      like:
              define @f(i32 %n) {
                %n.addr = alloca i32
                store %n, %n.addr
                ...
                call @coro.begin
      
      This patch fixes the problem by moving uses of spilled variables after CoroBegin.
      
      Reviewers: majnemer
      
      Subscribers: mehdi_amini, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24234
      
      llvm-svn: 280678
      ccabaca2