  1. Sep 04, 2016
    • Sanjay Patel's avatar
      [InstCombine] recode icmp fold in a vector-friendly way; NFC · 6b490974
      Sanjay Patel authored
      The transform in question:
      icmp (and (trunc W), C2), C1 -> icmp (and W, C2'), C1'
      
      ...is still not enabled for vectors, thus no functional change intended.
      It's not clear to me if this is a good transform for vectors or even
      scalars in general. Changing that behavior may be a follow-on patch.
      
      llvm-svn: 280627
      6b490974
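The fold above can be checked with a small scalar sketch. It assumes an equality predicate, a 32-bit W truncated to 8 bits, and that C2' and C1' are simply the zero-extended constants; the commit message states none of these, and the function names are hypothetical.

```cpp
#include <cstdint>

// Narrow form: icmp eq (and (trunc W), C2), C1, with W truncated to 8 bits.
bool narrowCompare(uint32_t W, uint8_t C2, uint8_t C1) {
  uint8_t Trunc = static_cast<uint8_t>(W);
  return static_cast<uint8_t>(Trunc & C2) == C1;
}

// Wide form: icmp eq (and W, C2'), C1'.
// Assumption: C2' and C1' are the zero-extended versions of C2 and C1.
bool wideCompare(uint32_t W, uint8_t C2, uint8_t C1) {
  uint32_t WideC2 = C2;
  uint32_t WideC1 = C1;
  return (W & WideC2) == WideC1;
}

// Compare both forms over every low byte and a few high-bit patterns.
bool formsAgree(uint8_t C2, uint8_t C1) {
  const uint32_t HighBits[] = {0x0u, 0xFFFFFF00u, 0x12345600u};
  for (uint32_t High : HighBits)
    for (uint32_t Low = 0; Low < 256; ++Low)
      if (narrowCompare(High | Low, C2, C1) != wideCompare(High | Low, C2, C1))
        return false;
  return true;
}
```

Because the zero-extended mask clears every bit above the narrow width, the wide comparison only ever looks at the bits the truncation would have kept.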
    • Hal Finkel's avatar
      [PowerPC] During branch relaxation, recompute padding offsets before each iteration · f0bc9db9
      Hal Finkel authored
      We used to compute the padding contributions to the block sizes during branch
      relaxation only at the start of the transformation. As we perform branch
      relaxation, we change the sizes of the blocks, and so the amount of inter-block
      padding might change. Accordingly, we need to recompute the (alignment-based)
      padding in between every iteration on our way toward the fixed point.
      
      Unfortunately, I don't have a test case (and none was provided in the bug
      report), and while this obviously seems needed, algorithmically, I don't have
      any way of generating a small and/or non-fragile regression test.
      
      llvm-svn: 280626
      f0bc9db9
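A minimal sketch of the idea, not the PowerPC pass itself: block offsets, including alignment padding, are recomputed at the start of every relaxation iteration. The `Block` and `Branch` types, the distance limit, and the growth amount are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Block {
  uint32_t Size;      // bytes of code, excluding padding
  uint32_t Alignment; // required start alignment in bytes (nonzero)
};

struct Branch {
  std::size_t From; // block whose last instruction is the branch
  std::size_t To;   // target block
  bool Relaxed;     // true once rewritten into the long form
};

// Start offset of every block, including alignment padding before it.
std::vector<uint32_t> computeOffsets(const std::vector<Block> &Blocks) {
  std::vector<uint32_t> Offsets(Blocks.size());
  uint32_t Offset = 0;
  for (std::size_t I = 0; I < Blocks.size(); ++I) {
    uint32_t A = Blocks[I].Alignment;
    Offset = (Offset + A - 1) / A * A; // round up to the alignment
    Offsets[I] = Offset;
    Offset += Blocks[I].Size;
  }
  return Offsets;
}

// Relax out-of-range branches until nothing changes; returns the number
// of iterations needed to reach the fixed point.
unsigned relaxBranches(std::vector<Block> &Blocks,
                       std::vector<Branch> &Branches,
                       uint32_t MaxShortDistance, uint32_t GrowthBytes) {
  unsigned Iterations = 0;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    ++Iterations;
    // Recompute offsets (and therefore padding) on every iteration.
    std::vector<uint32_t> Offsets = computeOffsets(Blocks);
    for (Branch &B : Branches) {
      if (B.Relaxed)
        continue;
      uint32_t Src = Offsets[B.From] + Blocks[B.From].Size;
      uint32_t Dst = Offsets[B.To];
      uint32_t Distance = Src > Dst ? Src - Dst : Dst - Src;
      if (Distance > MaxShortDistance) {
        B.Relaxed = true;
        Blocks[B.From].Size += GrowthBytes; // the long form is larger
        Changed = true;
      }
    }
  }
  return Iterations;
}
```

In this sketch a 4-byte growth can push a later 16-byte-aligned block a full 16 bytes further away, which is why offsets computed once up front can miss branches that only go out of range after an earlier relaxation.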
    • Igor Breger's avatar
      revert r279960. · 7e2a0dfa
      Igor Breger authored
      https://llvm.org/bugs/show_bug.cgi?id=30249
      
      llvm-svn: 280625
      7e2a0dfa
    • Simon Pilgrim's avatar
      EOL fixes · 9a36318c
      Simon Pilgrim authored
      llvm-svn: 280624
      9a36318c
    • Simon Pilgrim's avatar
      Strip trailing whitespace · 122b0de1
      Simon Pilgrim authored
      llvm-svn: 280623
      122b0de1
    • Joerg Sonnenberger's avatar
      Test case for r280607 to check presence and sanity of the *_LOCK_FREE · 9bd9d989
      Joerg Sonnenberger authored
      macros.
      
      llvm-svn: 280622
      9bd9d989
    • Kuba Brecka's avatar
      [libcxx] Fix a data race in call_once · 224264ad
      Kuba Brecka authored
      call_once uses a relaxed atomic load to perform double-checked locking, which is a data race. The fast-path load has to be an acquire atomic load.
      
      Differential Revision: https://reviews.llvm.org/D24028
      
      llvm-svn: 280621
      224264ad
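The double-checked locking pattern described above can be sketched as follows. This is a simplified illustration, not libc++'s implementation; `OnceFlag` and `callOnce` are hypothetical names.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical once-flag illustrating double-checked locking.
class OnceFlag {
  std::atomic<bool> Done{false};
  std::mutex Lock;

public:
  template <class Fn> void callOnce(Fn &&F) {
    // Fast path: the acquire load pairs with the release store below, so a
    // thread that observes Done == true also observes the effects of F().
    // A relaxed load here would not give that guarantee.
    if (Done.load(std::memory_order_acquire))
      return;

    std::lock_guard<std::mutex> Guard(Lock);
    // Second check under the lock; the mutex already orders this load.
    if (!Done.load(std::memory_order_relaxed)) {
      F();
      Done.store(true, std::memory_order_release);
    }
  }
};
```

With a relaxed fast-path load, a thread could see the flag set and then read data written by `F()` without any happens-before relationship to those writes.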
    • Chandler Carruth's avatar
      [PM] Revert r280447: Add a unittest for invalidating module analyses with an SCC pass. · ccd44939
      Chandler Carruth authored
      This was mistakenly committed. The world isn't ready for this test, the
      test code has horrible debugging code in it that should never have
      landed in tree, it currently passes because of bugs elsewhere, and it
      needs to be rewritten to not be susceptible to passing for the wrong
      reasons.
      
      I'll re-land this in a better form when the prerequisite patches land.
      
      So sorry that I got this mixed into a series of commits that *were*
      ready to land. I shouldn't have. =[ What's worse is that it stuck around
      for so long and I discovered it while fixing the underlying bug that
      caused it to pass.
      
      llvm-svn: 280620
      ccd44939
    • Chandler Carruth's avatar
      [LCG] Clean up and make NDEBUG verify calls more rigorous with · 11b3f60c
      Chandler Carruth authored
      make_scope_exit now that we have that utility.
      
      This makes the code much more clear and readable by isolating the check.
      It also makes it easy to go through and make sure all the interesting
      update routines have a start and end verify so we don't slowly let the
      graph drift into an invalid state.
      
      llvm-svn: 280619
      11b3f60c
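A minimal sketch of the start-and-end verify pattern, using a hand-written scope guard rather than LLVM's make_scope_exit. `ScopeExit`, `makeScopeExit`, and `SortedInts` are hypothetical stand-ins; the real code verifies the call graph and, per the commit title, ties the checks to NDEBUG.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Runs a callback when the guard goes out of scope.
template <class Fn> class ScopeExit {
  Fn Callback;
  bool Active = true;

public:
  explicit ScopeExit(Fn F) : Callback(std::move(F)) {}
  ScopeExit(ScopeExit &&Other)
      : Callback(std::move(Other.Callback)), Active(Other.Active) {
    Other.Active = false; // the moved-from guard must not fire
  }
  ScopeExit(const ScopeExit &) = delete;
  ScopeExit &operator=(const ScopeExit &) = delete;
  ~ScopeExit() {
    if (Active)
      Callback();
  }
};

template <class Fn> ScopeExit<Fn> makeScopeExit(Fn F) {
  return ScopeExit<Fn>(std::move(F));
}

// Stand-in for a structure with an invariant: Values stays sorted.
struct SortedInts {
  std::vector<int> Values;
  unsigned VerifyCalls = 0;
  bool LastVerifyOk = true;

  void verify() {
    ++VerifyCalls;
    LastVerifyOk = std::is_sorted(Values.begin(), Values.end());
  }

  void insert(int V) {
    verify(); // verify on entry
    auto VerifyOnExit = makeScopeExit([this] { verify(); });
    if (std::binary_search(Values.begin(), Values.end(), V))
      return; // early return still triggers the exit verify
    Values.insert(std::upper_bound(Values.begin(), Values.end(), V), V);
  }
};
```

The guard keeps the exit check in one place, so every return path of an update routine is verified without repeating the call.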
    • Chandler Carruth's avatar
      [LCG] A NFC refactoring to extract the logic for doing · 1f621f0a
      Chandler Carruth authored
      a postorder-sequence based update after edge insertion into a generic
      helper function.
      
      This separates the SCC-specific logic into two fairly simple lambdas and
      extracts the rest into a generic helper template function. I think this
      is a net win on its own merits because it disentangles different pieces
      of the algorithm. Now there is one place that does the two-step
      partition to identify a set of newly connected components and at the
      same time update the postorder sequence.
      
      However, I'm also hoping to re-use this in an upcoming patch to update
      a cached post-order sequence of RefSCCs when doing the analogous update
      to the RefSCC graph, and I don't want to have two copies.
      
      The diff is quite messy but this really is just moving things around and
      making types generic rather than specific.
      
      llvm-svn: 280618
      1f621f0a
    • Dorit Nuzman's avatar
      [InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing · abd15f69
      Dorit Nuzman authored
      memcpy with ld/st.
      
      When InstCombine replaces a memcpy with loads+stores it does not copy over the
      llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes
      that.
      
      Differential Revision: https://reviews.llvm.org/D23499
      
      llvm-svn: 280617
      abd15f69
    • Lang Hames's avatar
      [ExecutionEngine] Move ObjectCache::anchor from MCJIT to ExecutionEngine. · 3301c7ee
      Lang Hames authored
      ObjectCache is an ExecutionEngine utility, so its anchor belongs there. The
      practical impact of this change is that ORC users no longer need to link MCJIT
      to use ObjectCaches.
      
      llvm-svn: 280616
      3301c7ee
    • Dorit Nuzman's avatar
      Test commit. · 7673ba7a
      Dorit Nuzman authored
      llvm-svn: 280615
      7673ba7a
    • Hal Finkel's avatar
      [PowerPC] Zero-extend constants in FastISel · 73390c7a
      Hal Finkel authored
      As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which
      are illegal types promoted to i32 on PowerPC, is a choice constrained by
      assumptions within the infrastructure. Specifically, the logic in
      FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI
      operands will be zero extended, and so, at least when materializing constants
      that are PHI operands, we must do the same.
      
      The rest of our fast-isel implementation does not appear to depend on the fact
      that we were sign-extending i8/i16 constants, and all other targets also appear
      to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we
      had been doing this only for i1 constants, and sign-extending the others).
      
      Fixes PR27721.
      
      llvm-svn: 280614
      73390c7a
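The difference between the two promotions can be illustrated with plain integer arithmetic. This is a sketch with hypothetical helper names, not the FastISel code; `upperBitsAreZero` stands in for a consumer that assumes a promoted i8 constant was zero-extended.

```cpp
#include <cstdint>

// Promote an 8-bit constant to 32 bits by zero-extension.
uint32_t zeroExtend8(uint8_t V) { return static_cast<uint32_t>(V); }

// Promote an 8-bit constant to 32 bits by sign-extension
// (bit 7 is replicated into the upper 24 bits).
uint32_t signExtend8(uint8_t V) {
  return (static_cast<uint32_t>(V) ^ 0x80u) - 0x80u;
}

// A consumer that assumes the upper 24 bits are zero, as they would be
// after zero-extension.
bool upperBitsAreZero(uint32_t Reg) { return (Reg >> 8) == 0; }
```

For any constant with bit 7 set, the two promotions produce different register contents, so a producer that sign-extends breaks a consumer that assumes zero-extension.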
    • Elad Cohen's avatar
      [Modules] Add 'freestanding' to the 'requires-declaration' feature-list. · fb6358d2
      Elad Cohen authored
      This adds support for modules that require (non-)freestanding
      environment, such as the compiler builtin mm_malloc submodule.
      
      Differential Revision: https://reviews.llvm.org/D23871
      
      llvm-svn: 280613
      fb6358d2
    • Eric Fiselier's avatar
      Apply curr_symbol.pass.cpp test fix to missed test case · 8e571b55
      Eric Fiselier authored
      llvm-svn: 280612
      8e571b55
    • Joseph Tremoulet's avatar
      Fix inliner funclet unwind memoization · e92e0a90
      Joseph Tremoulet authored
      Summary:
      The inliner may need to determine where a given funclet unwinds to,
      and this determination may depend on other funclets throughout the
      funclet tree.  The code that performs this walk in getUnwindDestToken
      memoizes results to avoid redundant computations.  In the case that
      a funclet's unwind destination is derived from its ancestor, there's
      code to walk back down the tree from the ancestor updating the memo
      map of its descendants to record the unwind destination.  This change
      fixes that code to account for the case that some descendant has a
      different unwind destination, which can happen if that unwind dest
      is a descendant of the EHPad being queried and thus didn't determine
      its unwind destination.
      
      Also update test inline-funclets.ll, which is supposed to cover such
      scenarios, to include a case that fails an assertion without this fix
      but passes with it.
      
      Fixes PR29151.
      
      Reviewers: majnemer
      
      Subscribers: llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D24117
      
      llvm-svn: 280610
      e92e0a90
    • Joerg Sonnenberger's avatar
      Trailing dot that shouldn't have been committed. · b50b2fac
      Joerg Sonnenberger authored
      llvm-svn: 280609
      b50b2fac
    • Eric Fiselier's avatar
      Fix bad locale test data when using the newest glibc · f49fe8f2
      Eric Fiselier authored
      llvm-svn: 280608
      f49fe8f2
    • Joerg Sonnenberger's avatar
      PR 27200: Fix names of the atomic lock-free macros. · 82216f0f
      Joerg Sonnenberger authored
      llvm-svn: 280607
      82216f0f
    • Todd Fiala's avatar
      XFAIL TestGdbRemoteExitCode failing tests · b1a503bd
      Todd Fiala authored
      Tracked by:
      llvm.org/pr30271
      
      llvm-svn: 280606
      b1a503bd
    • Marshall Clow's avatar
      Mark test as XFAIL for C++03, rather than providing a dummy pass. · 2a837eae
      Marshall Clow authored
      llvm-svn: 280605
      2a837eae
    • Todd Fiala's avatar
      [NFC] Darwin llgs support from Week of Code · e77fce0a
      Todd Fiala authored
      This code represents the Week of Code work I did on bringing up
      lldb-server LLGS support for Darwin.  It does not include the
      Xcode project changes needed, as we don't want to throw that switch
      until more support is implemented (i.e. this change is inert, no
      build systems use it yet.  I've verified on Ubuntu 16.04, macOS
      Xcode and macOS cmake builds).
      
      This change does some minimal refactoring of code that is shared
      with the Linux LLGS portion, moving it from NativeProcessLinux into
      NativeProcessProtocol.  That code is also used by NativeProcessDarwin.
      
      Current state on Darwin:
      * Process launching is implemented.  (Attach is not).
        Launching on devices has not yet been tested (FBS/BKS might
        need a bit of work).
      * Inferior waitpid monitoring and communication of exit status
        via MainLoop callback is implemented.
      * Memory read/write, breakpoints, thread register context, etc.
        are not yet implemented.  This impacts process stop/resume, as
        a process that is launched suspended immediately starts running
        because it doesn't know it is supposed to remain stopped.
      * I implemented the equivalent of MachThreadList as
        NativeThreadListDarwin, in anticipation that we might want to
        factor out common parts into NativeThreadList{Protocol} and share
        some code here.  After writing it, though, the fallout from merging
        Mach Task/Process into a single concept plus some other minor
        changes makes the whole NativeThreadListDarwin concept nothing more
        than dead weight.  I am likely going to get rid of this class and
        just manage it directly in NativeProcessDarwin, much like I did
        for NativeProcessLinux.
      * There is a stub-out call for starting a STDIO thread.  That will
        go away and adopt the MainLoop pselect-based IOObject reading.
      
      I am developing the fully-integrated changes in the following repo,
      which contains the necessary Xcode bits and the glue that enables
      lldb-debugserver on a macOS system:
      
        https://github.com/tfiala/lldb/tree/llgs-darwin
      
      This change also breaks out a few of the lldb-server tests into
      their own directory, and adds some $qHostInfo tests (not sure why
      I didn't write those tests back when I initially implemented that
      on the Linux side).
      
      llvm-svn: 280604
      e77fce0a
    • Craig Topper's avatar
      [X86] Combine some of the strings in autoupgrade code. · a57d2ca4
      Craig Topper authored
      llvm-svn: 280603
      a57d2ca4
    • Xinliang David Li's avatar
      Cleanup : Use metadata preserving API for branch creation · 7a28a7fb
      Xinliang David Li authored
      Use the wrapper API in IRBuilder that copies metadata
      to create the new branch in LoopUnswitch.
      
      llvm-svn: 280602
      7a28a7fb
  2. Sep 03, 2016
    • Tobias Grosser's avatar
      ScopInfo: Do not derive assumptions from all GEP pointer instructions · 8d4cb1a0
      Tobias Grosser authored
      ... but instead rely on the assumptions that we derive for load/store
      instructions.
      
      Before we were able to delinearize arrays, we used GEP pointer instructions
      to derive information about the likely range of induction variables, which
      gave us more freedom during loop scheduling. Today, this is not needed
      any more as we delinearize multi-dimensional memory accesses and as part
      of this process also "assume" that all accesses to these arrays remain
      inbounds. The old derive-assumptions-from-GEP code has consequently become
      mostly redundant. We drop it both to clean up our code and to improve
      compile time. This change reduces the scop construction time for 3mm in
      no-asserts mode on my machine from 48 to 37 ms.
      
      llvm-svn: 280601
      8d4cb1a0
    • Xinliang David Li's avatar
      [Profile] preserve branch metadata lowering select in CGP · 241e6c70
      Xinliang David Li authored
      CGP currently drops select's MD_prof profile data when
      generating a conditional branch, which can lead to bad
      code layout. The patch fixes the issue.
      
      Differential Revision: http://reviews.llvm.org/D24169
      
      llvm-svn: 280600
      241e6c70
    • Mehdi Amini's avatar
      Fix ThinLTO crash with debug info · ebb34348
      Mehdi Amini authored
      Because of the recent change to ODR type uniquing in the context,
      we can reach types defined in another module during IR linking.
      This triggered some assertions in case we IR link without starting
      from an empty module. To alleviate that, we can self-map metadata
      defined in the destination module so that they won't be visited.
      
      Differential Revision: https://reviews.llvm.org/D23841
      
      llvm-svn: 280599
      ebb34348
    • Simon Pilgrim's avatar
      Strip trailing whitespace · 3606d234
      Simon Pilgrim authored
      llvm-svn: 280598
      3606d234
    • Craig Topper's avatar
      f43e4a17
    • Craig Topper's avatar
      0e18976b
    • Matt Arsenault's avatar
      AMDGPU: Set sizes of spill pseudos · ac42ba86
      Matt Arsenault authored
      llvm-svn: 280595
      ac42ba86
    • Matt Arsenault's avatar
      AMDGPU: Fix adding duplicate implicit exec uses · 5ffe3e1d
      Matt Arsenault authored
      I'm not sure if this should be considered a bug in
      copyImplicitOps or not, but implicit operands that are part
      of the static instruction definition should not be copied.
      
      llvm-svn: 280594
      5ffe3e1d
    • Craig Topper's avatar
      [AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test. · 907b580d
      Craig Topper authored
      llvm-svn: 280593
      907b580d
    • Aaron Ballman's avatar
      Fix the attribute documentation build. · bb18e91c
      Aaron Ballman authored
      llvm-svn: 280591
      bb18e91c
    • Nicolai Haehnle's avatar
      AMDGPU: Reduce the duration of whole-quad-mode · 3bba6a84
      Nicolai Haehnle authored
      Summary:
      This contains two changes that reduce the time spent in WQM, with the
      intention of reducing bandwidth required by VMEM loads:
      
      1. Sampling instructions by themselves don't need to run in WQM, only their
         coordinate inputs need it (unless of course there is a dependent sampling
         instruction). The initial scanInstructions step is modified accordingly.
      
      2. When switching back from WQM to Exact, switch back as soon as possible.
         This affects the logic in processBlock.
      
      This should always be a win or at best neutral.
      
      There are also some cleanups (e.g. remove unused ExecExports) and some new
      debugging output.
      
      Reviewers: arsenm, tstellarAMD, mareko
      
      Subscribers: arsenm, llvm-commits, kzhuravl
      
      Differential Revision: http://reviews.llvm.org/D22092
      
      llvm-svn: 280590
      3bba6a84
    • Nicolai Haehnle's avatar
      AMDGPU: Fix an interaction between WQM and polygon stippling · a246dccc
      Nicolai Haehnle authored
      Summary:
      This fixes a rare bug in polygon stippling with non-monolithic pixel shaders.
      
      The underlying problem is as follows: the prolog part contains the polygon
      stippling sequence, i.e. a kill. The main part then enables WQM based on the
      _reduced_ exec mask, effectively undoing most of the polygon stippling.
      
      Since we cannot know whether polygon stippling will be used, the main part
      of a non-monolithic shader must always return to exact mode to fix this
      problem.
      
      Reviewers: arsenm, tstellarAMD, mareko
      
      Subscribers: arsenm, llvm-commits, kzhuravl
      
      Differential Revision: https://reviews.llvm.org/D23131
      
      llvm-svn: 280589
      a246dccc
    • Eric Fiselier's avatar
      Fix PR30202 - notify_all_at_thread_exit seg faults if run from a raw pthread context. · ff94d250
      Eric Fiselier authored
      Summary:
      This patch allows threads not created using `std::thread` to use `std::notify_all_at_thread_exit` by ensuring the TL state has been initialized within `std::notify_all_at_thread_exit`.
      
      Additionally this patch "fixes" a potential oddity in `__thread_local_pointer::reset(pointer)`, which would previously delete the old thread local data. However there should *never* be old thread local data because pthread *should* null it out on thread exit. Unfortunately it's possible that pthread failed to do this according to the spec:
      
      > Upon key creation, the value NULL shall be associated with the new key in all active threads. Upon thread creation, the value NULL shall be associated with all defined keys in the new thread.
      > 
      > An optional destructor function may be associated with each key value. At thread exit, if a key value has a non-NULL destructor pointer, and the thread has a non-NULL value associated with that key, the value of the key is set to NULL, and then the function pointed to is called with the previously associated value as its sole argument. The order of destructor calls is unspecified if more than one destructor exists for a thread when it exits.
      > 
      > If, after all the destructors have been called for all non-NULL values with associated destructors, there are still some non-NULL values with associated destructors, then the process is repeated. If, after at least {PTHREAD_DESTRUCTOR_ITERATIONS} iterations of destructor calls for outstanding non-NULL values, there are still some non-NULL values with associated destructors, implementations may stop calling destructors, or they may continue calling destructors until no non-NULL values with associated destructors exist, even though this might result in an infinite loop.
      
      However if pthread fails to delete the value it is probably incorrect for us to do it. Destroying the value performs all of the "at thread exit" actions registered with it but we are way past "at thread exit".
      
      Reviewers: mclow.lists, bcraig, EricWF
      
      Subscribers: cfe-commits
      
      Differential Revision: https://reviews.llvm.org/D24159
      
      llvm-svn: 280588
      ff94d250