[NativeProcessLinux] Fix race condition during inferior thread creation
The following situation occured if we were stopping a process (due to breakpoint, watchpoint, ... hit) while a new thread was being created. - process has two threads: A and B. - thread A hits a breakpoint: we send a STOP signal to thread B and register a callback with ThreadStateCoordinator to send a stop notification after the thread stops. - thread B stops, but not due to the SIGSTOP, but on a thread creation event (of a new thread C). We are unaware of our desire to stop, so we queue ThreadStopped and RequestResume operations with TSC, so the thread can continue running. - TSC receives the ThreadStopped event, sees that all threads are stopped and fires the delayed stop notification. - immediately after that TSC gets the RequestResume operation, so it resumes the thread. At this point the state is inconsistent because LLDB thinks the process is stopped and will start issuing commands to it, but one of the threads is in fact running. Things eventually break. I address this problem by omitting the two TSC events altogether and Resuming the thread B directly. This way the short stop is invisible to the TSC and the delayed notification will not fire. We will fire the notification when we actually process the SIGSTOP on thread B. When we get the initial SIGSTOP for thread C, we also resume the thread and send a ThreadWasCreated message (is_stopped = false) to the TSC. This way, the TSC can stop the thread on its own and handle the stop event later. This way the state of the new thread is correctly handled as well (thanks Chaoren for the idea). This patch also removes the synchronisation between the thread creation notifications on threads B and C. The need for this synchronisation is unclear (the comments seem to hint that the new thread is "fully created" only after we process both events, but I have noticed no regressions in treating it as "created" even after just processing the initial C event), but it is a source for many kinds of obscure races, since it introduces a new thread state "Launching" and the rest of the code does not handle this state at all (what happens if we get a resume request from LLDB while this thread is launching? what happens if we get a stop request? etc.). This fixes the "spurious $O packet" problem in TestPrintStackTraces.py. However, the test remains disabled on i386 due to the VDSO issue. Test Plan: TestPrintStackTraces works on x86_64. No regressions in the rest of the test suite. Reviewers: vharron, chaoren Subscribers: lldb-commits Differential Revision: http://reviews.llvm.org/D9145 llvm-svn: 235579
Loading
Please sign in to comment