- Sep 04, 2016
-
-
Sanjay Patel authored
The transform in question: icmp (and (trunc W), C2), C1 -> icmp (and W, C2'), C1' ...is still not enabled for vectors, thus no functional change intended. It's not clear to me if this is a good transform for vectors or even scalars in general. Changing that behavior may be a follow-on patch. llvm-svn: 280627
-
Hal Finkel authored
We used to compute the padding contributions to the block sizes during branch relaxation only at the start of the transformation. As we perform branch relaxation, we change the sizes of the blocks, and so the amount of inter-block padding might change. Accordingly, we need to recompute the (alignment-based) padding in between every iteration on our way toward the fixed point. Unfortunately, I don't have a test case (and none was provided in the bug report), and while this obviously seems needed, algorithmically, I don't have any way of generating a small and/or non-fragile regression test. llvm-svn: 280626
-
Igor Breger authored
https://llvm.org/bugs/show_bug.cgi?id=30249 llvm-svn: 280625
-
Simon Pilgrim authored
llvm-svn: 280624
-
Simon Pilgrim authored
llvm-svn: 280623
-
Joerg Sonnenberger authored
macros. llvm-svn: 280622
-
Kuba Brecka authored
call_once is using relaxed atomic load to perform double-checked locking, which contains a data race. The fast-path load has to be an acquire atomic load. Differential Revision: https://reviews.llvm.org/D24028 llvm-svn: 280621
-
Chandler Carruth authored
This was mistakenly committed. The world isn't ready for this test, the test code has horrible debugging code in it that should never have landed in tree, it currently passes because of bugs elsewhere, and it needs to be rewritten to not be susceptible to passing for the wrong reasons. I'll re-land this in a better form when the prerequisite patches land. So sorry that I got this mixed into a series of commits that *were* ready to land. I shouldn't have. =[ What's worse is that it stuck around for so long and I discovered it while fixing the underlying bug that caused it to pass. llvm-svn: 280620
-
Chandler Carruth authored
make_scope_exit now that we have that utility. This makes the code much more clear and readable by isolating the check. It also makes it easy to go through and make sure all the interesting update routines have a start and end verify so we don't slowly let the graph drift into an invalid state. llvm-svn: 280619
-
Chandler Carruth authored
a postorder-sequence based update after edge insertion into a generic helper function. This separates the SCC-specific logic into two fairly simple lambdas and extracts the rest into a generic helper template function. I think this is a net win on its own merits because it disentangles different pieces of the algorithm. Now there is one place that does the two-step partition to identify a set of newly connected components and at the same time update the postorder sequence. However, I'm also hoping to re-use this an upcoming patch to update a cached post-order sequence of RefSCCs when doing the analogous update to the RefSCC graph, and I don't want to have two copies. The diff is quite messy but this really is just moving things around and making types generic rather than specific. llvm-svn: 280618
-
Dorit Nuzman authored
memcpy with ld/st. When InstCombine replaces a memcpy with loads+stores it does not copy over the llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes that. Differential Revision: https://reviews.llvm.org/D23499 llvm-svn: 280617
-
Lang Hames authored
ObjectCache is an ExecutionEngine utility, so its anchor belongs there. The practical impact of this change is that ORC users no longer need to link MCJIT to use ObjectCaches. llvm-svn: 280616
-
Dorit Nuzman authored
llvm-svn: 280615
-
Hal Finkel authored
As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which are illegal types promoted to i32 on PowerPC, is a choice constrained by assumptions within the infrastructure. Specifically, the logic in FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI operands will be zero extended, and so, at least when materializing constants that are PHI operands, we must do the same. The rest of our fast-isel implementation does not appear to depend on the fact that we were sign-extending i8/i16 constants, and all other targets also appear to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we had been doing this only for i1 constants, and sign-extending the others). Fixes PR27721. llvm-svn: 280614
-
Elad Cohen authored
This adds support for modules that require (non-)freestanding environment, such as the compiler builtin mm_malloc submodule. Differential Revision: https://reviews.llvm.org/D23871 llvm-svn: 280613
-
Eric Fiselier authored
llvm-svn: 280612
-
Craig Topper authored
llvm-svn: 280611
-
Joseph Tremoulet authored
Summary: The inliner may need to determine where a given funclet unwinds to, and this determination may depend on other funclets throughout the funclet tree. The code that performs this walk in getUnwindDestToken memoizes results to avoid redundant computations. In the case that a funclet's unwind destination is derived from its ancestor, there's code to walk back down the tree from the ancestor updating the memo map of its descendants to record the unwind destination. This change fixes that code to account for the case that some descendant has a different unwind destination, which can happen if that unwind dest is a descendant of the EHPad being queried and thus didn't determine its unwind destination. Also update test inline-funclets.ll, which is supposed to cover such scenarios, to include a case that fails an assertion without this fix but passes with it. Fixes PR29151. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24117 llvm-svn: 280610
-
Joerg Sonnenberger authored
llvm-svn: 280609
-
Eric Fiselier authored
llvm-svn: 280608
-
Joerg Sonnenberger authored
llvm-svn: 280607
-
Todd Fiala authored
Tracked by: llvm.org/pr30271 llvm-svn: 280606
-
Marshall Clow authored
llvm-svn: 280605
-
Todd Fiala authored
This code represents the Week of Code work I did on bringing up lldb-server LLGS support for Darwin. It does not include the Xcode project changes needed, as we don't want to throw that switch until more support is implemented (i.e. this change is inert, no build systems use it yet. I've verified on Ubuntu 16.04, macOS Xcode and macOS cmake builds). This change does some minimal refactoring of code that is shared with the Linux LLGS portion, moving it from NativeProcessLinux into NativeProcessProtocol. That code is also used by NativeProcessDarwin. Current state on Darwin: * Process launching is implemented. (Attach is not). Launching on devices has not yet been tested (FBS/BKS might need a bit of work). * Inferior waitpid monitoring and communication of exit status via MainLoop callback is implemented. * Memory read/write, breakpoints, thread register context, etc. are not yet implemented. This impacts process stop/resume, as the initial launch suspended immediately starts the process up and running because it doesn't know it is supposed to remain stopped. * I implemented the equivalent of MachThreadList as NativeThreadListDarwin, in anticipation that we might want to factor out common parts into NativeThreadList{Protocol} and share some code here. After writing it, though, the fallout from merging Mach Task/Process into a single concept plus some other minor changes makes the whole NativeThreadListDarwin concept nothing more than dead weight. I am likely going to get rid of this class and just manage it directly in NativeProcessDarwin, much like I did for NativeProcessLinux. * There is a stub-out call for starting a STDIO thread. That will go away and adopt the MainLoop pselect-based IOObject reading. I am developing the fully-integrated changes in the following repo, which contains the necessary Xcode bits and the glue that enables lldb-debugserver on a macOS system: https://github.com/tfiala/lldb/tree/llgs-darwin This change also breaks out a few of the lldb-server tests into their own directory, and adds some $qHostInfo tests (not sure why I didn't write those tests back when I initially implemented that on the Linux side). llvm-svn: 280604
-
Craig Topper authored
llvm-svn: 280603
-
Xinliang David Li authored
Use the wrapper API in IRBuilder that does meta data copy to create new branch in LoopUnswitch. llvm-svn: 280602
-
- Sep 03, 2016
-
-
Tobias Grosser authored
... but instead rely on the assumptions that we derive for load/store instructions. Before we were able to delinearize arrays, we used GEP pointer instructions to derive information about the likely range of induction variables, which gave us more freedom during loop scheduling. Today, this is not needed any more as we delinearize multi-dimensional memory accesses and as part of this process also "assume" that all accesses to these arrays remain inbounds. The old derive-assumptions-from-GEP code has consequently become mostly redundant. We drop it both to clean up our code, but also to improve compile time. This change reduces the scop construction time for 3mm in no-asserts mode on my machine from 48 to 37 ms. llvm-svn: 280601
-
Xinliang David Li authored
CGP currently drops select's MD_prof profile data when generating conditional branch which can lead to bad code layout. The patch fixes the issue. Differential Revision: http://reviews.llvm.org/D24169 llvm-svn: 280600
-
Mehdi Amini authored
Because the recent change about ODR type uniquing in the context, we can reach types defined in another module during IR linking. This triggered some assertions in case we IR link without starting from an empty module. To alleviate that, we can self-map metadata defined in the destination module so that they won't be visited. Differential Revision: https://reviews.llvm.org/D23841 llvm-svn: 280599
-
Simon Pilgrim authored
llvm-svn: 280598
-
Craig Topper authored
llvm-svn: 280597
-
Craig Topper authored
llvm-svn: 280596
-
Matt Arsenault authored
llvm-svn: 280595
-
Matt Arsenault authored
I'm not sure if this should be considered a bug in copyImplicitOps or not, but implicit operands that are part of the static instruction definition should not be copied. llvm-svn: 280594
-
Craig Topper authored
[AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test. llvm-svn: 280593
-
Craig Topper authored
llvm-svn: 280592
-
Aaron Ballman authored
llvm-svn: 280591
-
Nicolai Haehnle authored
Summary: This contains two changes that reduce the time spent in WQM, with the intention of reducing bandwidth required by VMEM loads: 1. Sampling instructions by themselves don't need to run in WQM, only their coordinate inputs need it (unless of course there is a dependent sampling instruction). The initial scanInstructions step is modified accordingly. 2. When switching back from WQM to Exact, switch back as soon as possible. This affects the logic in processBlock. This should always be a win or at best neutral. There are also some cleanups (e.g. remove unused ExecExports) and some new debugging output. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D22092 llvm-svn: 280590
-
Nicolai Haehnle authored
Summary: This fixes a rare bug in polygon stippling with non-monolithic pixel shaders. The underlying problem is as follows: the prolog part contains the polygon stippling sequence, i.e. a kill. The main part then enables WQM based on the _reduced_ exec mask, effectively undoing most of the polygon stippling. Since we cannot know whether polygon stippling will be used, the main part of a non-monolithic shader must always return to exact mode to fix this problem. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23131 llvm-svn: 280589
-
Eric Fiselier authored
Summary: This patch allows threads not created using `std::thread` to use `std::notify_all_at_thread_exit` by ensuring the TL state has been initialized within `std::notify_all_at_thread_exit`. Additionally this patch "fixes" a potential oddity in `__thread_local_pointer::reset(pointer)`, which would previously delete the old thread local data. However there should *never* be old thread local data because pthread *should* null it out on thread exit. Unfortunately it's possible that pthread failed to do this according to the spec: > > Upon key creation, the value NULL shall be associated with the new key in all active threads. Upon thread creation, the value NULL shall be associated with all defined keys in the new thread. > > An optional destructor function may be associated with each key value. At thread exit, if a key value has a non-NULL destructor pointer, and the thread has a non-NULL value associated with that key, the value of the key is set to NULL, and then the function pointed to is called with the previously associated value as its sole argument. The order of destructor calls is unspecified if more than one destructor exists for a thread when it exits. > > If, after all the destructors have been called for all non-NULL values with associated destructors, there are still some non-NULL values with associated destructors, then the process is repeated. If, after at least {PTHREAD_DESTRUCTOR_ITERATIONS} iterations of destructor calls for outstanding non-NULL values, there are still some non-NULL values with associated destructors, implementations may stop calling destructors, or they may continue calling destructors until no non-NULL values with associated destructors exist, even though this might result in an infinite loop. However if pthread fails to delete the value it is probably incorrect for us to do it. Destroying the value performs all of the "at thread exit" actions registered with it but we are way past "at thread exit". Reviewers: mclow.lists, bcraig, EricWF Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24159 llvm-svn: 280588
-