- May 12, 2016
-
-
Jonathan Peyton authored
After hot teams were enabled by default, the library started using the levels kept in the team structure. These levels break when a foreign thread exits and puts its team into the pool, which is then reused by another foreign thread. The observed symptom is that printing the level of each new team gives 1, 2, 1, 2, 1, 2, etc., which makes the library incorrectly believe that every other team is nested. The correct levels are 1, 1, 1, etc. Differential Revision: http://reviews.llvm.org/D19980 llvm-svn: 269363
-
- May 05, 2016
-
-
Jonathan Peyton authored
This change replaces the current timers with ones that partition time properly. The current timers are nested, so that if a new timer, B, starts while the current timer, A, is already timing, A's time will include B's. To eliminate this problem, the partitioned timers stop the current timer (A), let the new timer (B) run, and restart the previously running timer (A) when the new timer finishes. With this partitioning of time, a thread's timers all sum up to the OMP_worker_thread_life time and can now easily show the percentage of time a thread spends in different parts of the runtime or user code. There is also a new state variable associated with each thread which tells where it is executing a task. This corresponds to the OMP_task_* timers; e.g., if time is spent in OMP_task_taskwait, then that thread executed tasks inside a #pragma omp taskwait construct. The changes consist mostly of switching the macros to the new PARTITIONED_* macros, the new partitionedTimers class and its methods, and the new state logic. Differential Revision: http://reviews.llvm.org/D19229 llvm-svn: 268640
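The stop/run/restart behaviour described above can be illustrated with a small stack-based sketch. This is not the runtime's partitionedTimers class: the class name `PartitionedTimers`, its `push`/`pop`/`total` methods, and the use of explicit integer ticks instead of a real clock are all assumptions made to keep the example deterministic.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch of partitioned timing. Time is passed in explicitly
// (as integer ticks) so the behaviour is deterministic.
class PartitionedTimers {
public:
    // Start timing `name`. The timer that was running is paused, so its
    // total does not include time spent in the new timer.
    void push(const std::string &name, long now) {
        if (!stack_.empty())
            totals_[stack_.back()] += now - last_start_;
        stack_.push_back(name);
        last_start_ = now;
    }

    // Stop the running timer and resume the one it interrupted, if any.
    void pop(long now) {
        assert(!stack_.empty());
        totals_[stack_.back()] += now - last_start_;
        stack_.pop_back();
        last_start_ = now;
    }

    // Accumulated ticks attributed to `name` alone.
    long total(const std::string &name) const {
        auto it = totals_.find(name);
        return it == totals_.end() ? 0 : it->second;
    }

private:
    std::vector<std::string> stack_;      // paused timers, innermost last
    std::map<std::string, long> totals_;  // exclusive time per timer
    long last_start_ = 0;                 // tick at which the top timer (re)started
};
```

If timer A runs from tick 0 to 35 and timer B interrupts it from tick 10 to 30, A is charged 15 ticks and B 20, so the two totals add up to the 35 ticks of wall time rather than double-counting B's interval.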
-
- Apr 19, 2016
-
-
Jonathan Peyton authored
llvm-svn: 266760
-
- Apr 18, 2016
-
-
Jonathan Peyton authored
Some codes that use TLS fail intermittently because one thread tries to write TLS values after the TLS key has been destroyed by another thread. This happens when one thread executes library shutdown (and destroys TLS keys), while another thread starts to execute the TLS key destructor routine. Before this change, the kmp_init_runtime flag was checked before calling pthread_* TLS functions, but this flag is set to FALSE later than the destruction of the TLS keys, which leads to failure. The fix is to check kmp_init_gtid instead, as this flag is unset *before* the destruction of TLS keys. Differential Revision: http://reviews.llvm.org/D19022 llvm-svn: 266674
-
- Mar 29, 2016
-
-
Jonathan Peyton authored
llvm-svn: 264776
-
- Mar 02, 2016
-
-
Jonathan Peyton authored
From the standard: A doacross loop nest is a loop nest that has cross-iteration dependence. An iteration is dependent on one or more lexicographically earlier iterations. The ordered clause parameter on a loop directive identifies the loop(s) associated with the doacross loop nest. The init/fini routines allocate/free doacross buffer(s) for each loop for each thread. The wait routine waits for a flag designated by the dependence vector. The post routine sets the flag designated by the current iteration vector. We use a similar technique of shared buffer indices that covers up to 7 nowait loops executed simultaneously by different threads (the number 7 has no real meaning; it is just a heuristic value). Also, the sizes of the structures are kept intact by shrinking dummy arrays. This needs to be put into the OpenMP runtime library so that the compiler team can develop the compiler side of the implementation. Differential Revision: http://reviews.llvm.org/D17399 llvm-svn: 262532
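For orientation, here is what a doacross loop looks like on the user side in OpenMP 4.5 syntax. The function name `prefix_sums` is hypothetical, and how a given compiler maps the `sink` and `source` directives onto the wait and post routines is assumed rather than taken from the commit. Built without OpenMP support, the pragmas are ignored and the loop runs serially with the same result.

```cpp
#include <vector>

// Running sum where iteration i needs the result of iteration i - 1,
// i.e. a cross-iteration dependence on a lexicographically earlier iteration.
std::vector<long> prefix_sums(const std::vector<long> &in) {
    const int n = static_cast<int>(in.size());
    std::vector<long> out(in);

    // ordered(1) marks the single loop below as the doacross loop nest.
    #pragma omp parallel for ordered(1)
    for (int i = 1; i < n; ++i) {
        // Wait until iteration i - 1 has signalled completion.
        #pragma omp ordered depend(sink: i - 1)
        out[i] += out[i - 1];
        // Signal that iteration i is complete.
        #pragma omp ordered depend(source)
    }
    return out;
}
```

Calling `prefix_sums({1, 2, 3, 4})` yields `{1, 3, 6, 10}` regardless of how many threads execute the loop, because each iteration only proceeds once its predecessor has posted.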
-
- Feb 25, 2016
-
-
Jonathan Peyton authored
This change introduces the new OpenMP 4.5 affinity API surrounding OpenMP Places. There are six new entry points.
Typically called in a serial region:
* omp_get_num_places - returns the number of places available to the execution environment in the place list.
* omp_get_place_num_procs - returns the number of processors available to the execution environment in the specified place.
* omp_get_place_proc_ids - returns the numerical identifiers of the processors available to the execution environment in the specified place.
Typically called inside a parallel region:
* omp_get_place_num - returns the place number of the place to which the encountering thread is bound.
* omp_get_partition_num_places - returns the number of places in the place partition of the innermost implicit task.
* omp_get_partition_place_nums - returns the list of place numbers corresponding to the places in the place-var ICV of the innermost implicit task.
Differential Revision: http://reviews.llvm.org/D17417 llvm-svn: 261915
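A minimal sketch of how the three serial-region entry points might be combined to enumerate the place list. The helper name `list_places` is hypothetical, and the `_OPENMP >= 201511` guard is an assumption added so that the example still compiles (and returns an empty list) when OpenMP 4.5 is not available.

```cpp
#include <cstddef>
#include <vector>
#if defined(_OPENMP) && _OPENMP >= 201511
#include <omp.h>
#endif

// Returns one vector of processor ids per place in the place list.
// Without OpenMP 4.5 support the result is empty.
std::vector<std::vector<int>> list_places() {
    std::vector<std::vector<int>> places;
#if defined(_OPENMP) && _OPENMP >= 201511
    const int num_places = omp_get_num_places();
    for (int p = 0; p < num_places; ++p) {
        const int num_procs = omp_get_place_num_procs(p);
        std::vector<int> ids(static_cast<std::size_t>(num_procs > 0 ? num_procs : 0));
        if (!ids.empty())
            omp_get_place_proc_ids(p, ids.data());  // fills ids with processor numbers
        places.push_back(ids);
    }
#endif
    return places;
}
```

The content of the list depends on the place settings in effect, for example those given through OMP_PLACES; with no place list defined, the number of places may be zero.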
-
- Feb 09, 2016
-
-
Jonathan Peyton authored
The problem is that the master's thread state was not saved before entering a parallel region so it does not remember tasks when it returns. llvm-svn: 260306
-
- Jan 28, 2016
-
-
Jonas Hahnfeld authored
When the code behind the barrier is executed, the master thread may have already resumed execution. That is why we cannot safely assume that *pteam has not yet been freed. This was introduced by r258866. llvm-svn: 259037
-
- Jan 27, 2016
-
-
Jonathan Peyton authored
Removing extraneous { } bracket sections. Unindenting blocks of code as a result. Also removing empty #ifdef KMP_STUB llvm-svn: 258986
-
Jonathan Peyton authored
Removing references to non-existent functions, fixing typos. llvm-svn: 258985
-
Jonathan Peyton authored
llvm-svn: 258984
-
- Jan 26, 2016
-
-
Jonathan Peyton authored
For implicit barriers in simple parallel for loops, the order of the OMPT events was wrong. The barrier_{begin,end} events came after the implicit_task_end event for the implicit barrier at the end of the parallel region. This is wrong because the implicit task executes the barrier before ending. This patch fixes the order of the events: they are now triggered just before __kmp_pop_current_task_from_thread() is called. Patch by Tim Cramer Differential Revision: http://reviews.llvm.org/D16347 llvm-svn: 258866
-
- Jan 11, 2016
-
-
Jonathan Peyton authored
Change (__kmp_mic_type != non_mic) to (__kmp_mic_type == mic2) llvm-svn: 257380
-
- Dec 17, 2015
-
-
Jonathan Peyton authored
llvm-svn: 255901
-
- Nov 30, 2015
-
-
Jonathan Peyton authored
Fix for a crash in the teams construct when the user sets OMP_THREAD_LIMIT to a number less than the number of processors. The number of threads is now reduced silently if the user did not specify teams parameters, or with a warning if the user specified teams parameters that conflict with OMP_THREAD_LIMIT. Differential Revision: http://reviews.llvm.org/D14732 llvm-svn: 254322
-
- Nov 16, 2015
-
-
Jonathan Peyton authored
llvm-svn: 253264
-
- Nov 12, 2015
-
-
Jonathan Peyton authored
llvm-svn: 252953
-
- Nov 04, 2015
-
-
Jonathan Peyton authored
In __kmp_free_team(), the team's number of processors can be == 1. llvm-svn: 252086
-
Jonathan Peyton authored
llvm-svn: 252084
-
Jonathan Peyton authored
This is a refactoring of the task_team code that more elegantly handles the two task_team case. Two task_teams per team are kept in use for the lifetime of the team. Thus no reference counting is needed. Differential Revision: http://reviews.llvm.org/D13993 llvm-svn: 252082
-
- Oct 20, 2015
-
-
Jonathan Peyton authored
The th.th_task_state for the master thread at the start of a nested parallel should not be zeroed in __kmp_allocate_team() because it is later put in the stack of states in __kmp_fork_call() for further re-use after exiting the nested region. It is zeroed after being put in the stack. Differential Revision: http://reviews.llvm.org/D13702 llvm-svn: 250847
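The ordering issue can be shown with a toy model. This is only a sketch: `MasterThread`, `fork_nested`, and `join_nested` are hypothetical names, and the real runtime's data structures are more involved. The point is that the state must be pushed onto the stack before it is zeroed, otherwise the value restored at the join is already lost.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the master thread's task-state bookkeeping.
struct MasterThread {
    int task_state = 0;             // state used by the current region
    std::vector<int> saved_states;  // stack of states of enclosing regions
};

// Entering a nested parallel region: save first, then zero.
void fork_nested(MasterThread &m) {
    m.saved_states.push_back(m.task_state);
    m.task_state = 0;
}

// Leaving the nested region: restore the state saved at the fork.
void join_nested(MasterThread &m) {
    assert(!m.saved_states.empty());
    m.task_state = m.saved_states.back();
    m.saved_states.pop_back();
}
```

If the state were zeroed before `fork_nested` ran, the push would record 0 and the enclosing region would resume with the wrong state after the join.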
-
- Oct 19, 2015
-
-
Jonathan Peyton authored
Without this fix, cancellation requests in one parallel region cause cancellation of the second region even though the second one was not intended to be cancelled. llvm-svn: 250727
-
- Oct 08, 2015
-
-
Jonathan Peyton authored
llvm-svn: 249725
-
Jonathan Peyton authored
These changes improve/update the trace messages and debug asserts related to the previous wait/release checkin. llvm-svn: 249717
-
Jonathan Peyton authored
These changes improve the wait/release mechanism for threads that handle tasks while spinning in barriers, by providing feedback to the barriers about any task stealing that occurs. Differential Revision: http://reviews.llvm.org/D13353 llvm-svn: 249711
-
- Sep 21, 2015
-
-
Joerg Sonnenberger authored
llvm-svn: 248204
-
Jonathan Peyton authored
Prior to this change, OMPT had a status flag ompt_status, which could take several values. This was due to an earlier OMPT design that had several levels of enablement (ready, disabled, tracking state, tracking callbacks). The current OMPT design has OMPT support either on or off. This revision replaces ompt_status with a boolean flag ompt_enabled, which simplifies the runtime logic for OMPT. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D12999 llvm-svn: 248189
-
Jonathan Peyton authored
The OMPT specification has changed. This revision brings the LLVM OpenMP implementation up to date. Technical overview of changes: Previously, a public weak symbol ompt_initialize was called after the OpenMP runtime was initialized. The new interface calls a global weak symbol ompt_tool prior to initialization. If a tool is present, ompt_tool returns a pointer to a function that matches the signature for ompt_initialize. After OpenMP is initialized, the function pointer is called to initialize the tool. Knowing that OMPT will be enabled before initialization allows OMPT support to be initialized as part of initialization, instead of back-patching initialization of OMPT support after the fact. Post-initialization OMPT support has been generalized and moved from ompt-specific.c into ompt-general.c, since the OMPT initialization logic is no longer implementation specific. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D12998 llvm-svn: 248187
-
Jonathan Peyton authored
This change adds guards to the code in places where they are missing to enable the OpenMP 3.0 build. Patch by Diego Caballero and Johnny Peyton Mailing List: http://lists.llvm.org/pipermail/openmp-dev/2015-September/000935.html llvm-svn: 248178
-
- Sep 10, 2015
-
-
Jonathan Peyton authored
This only triggered when built in debug mode with OMPT enabled: __kmp_wait_template expected the state of the current thread to be either ompt_state_idle or ompt_state_wait_barrier{,_implicit,_explicit}. Patch by Jonas Hahnfeld Differential Revision: http://reviews.llvm.org/D12754 llvm-svn: 247339
-
Jonathan Peyton authored
Some of this is improvement to code suggested by Hal Finkel. Four changes here:
1. Cleanup of hierarchy code to handle all hierarchy cases whether affinity is available or not
2. Separated this and other classes and common functions out to a header file
3. Added a destructor-like fini function for the hierarchy (and call in __kmp_cleanup)
4. Remove some redundant code that is hopefully no longer needed
Differential Revision: http://reviews.llvm.org/D12449 llvm-svn: 247326
-
Jonathan Peyton authored
The fix is to make the b_arrived flag 64-bit in both structures, kmp_balign_team_t and kmp_balign_t. Otherwise, when the flag in kmp_balign_team_t wraps past UINT_MAX, the library hangs. Differential Revision: http://reviews.llvm.org/D12563 llvm-svn: 247320
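The commit message does not spell out the exact mechanism of the hang, so the following is only a sketch under one assumption: that a 32-bit copy of the flag is compared against a 64-bit value that a waiter expects. The function name `flags_agree_mixed_width` and the bump step `kBump` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical step by which an arrival flag advances at each barrier.
constexpr std::uint64_t kBump = 4;

// Advances a 32-bit and a 64-bit copy of the same flag by `barriers`
// bumps starting from `start`, then reports whether they still agree.
bool flags_agree_mixed_width(std::uint64_t start, std::uint64_t barriers) {
    std::uint32_t narrow = static_cast<std::uint32_t>(start);
    std::uint64_t wide = start;
    for (std::uint64_t i = 0; i < barriers; ++i) {
        narrow += static_cast<std::uint32_t>(kBump);  // wraps modulo 2^32
        wide += kBump;                                // keeps counting
    }
    return static_cast<std::uint64_t>(narrow) == wide;
}
```

Starting just below UINT_MAX, the two copies agree after one bump but diverge after the second, when the narrow copy wraps to 0. A waiter comparing the two would then never see the value it expects, which is consistent with the hang described; giving both flags the same 64-bit width removes the mismatch.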
-
- Sep 02, 2015
-
-
Jonathan Peyton authored
The th.th_team_nproc is assigned in __kmp_allocate_thread() just 3 lines above, so there is no need to assign the same value again. llvm-svn: 246703
-
- Aug 31, 2015
-
-
Jonathan Peyton authored
Conditionally include the fork_context parameter to __kmp_join_call() only if OMPT_SUPPORT=1 Differential Revision: http://reviews.llvm.org/D12495 llvm-svn: 246460
-
- Aug 18, 2015
-
-
Andrey Churbanov authored
llvm-svn: 245286
-
- Aug 17, 2015
-
-
Andrey Churbanov authored
llvm-svn: 245209
-
Andrey Churbanov authored
llvm-svn: 245206
-
- Aug 11, 2015
-
-
Jonathan Peyton authored
This removes some statistics counters and timers which were not used, adds new counters and timers for some language features that were not monitored previously, and separates the counters and timers into those which are of interest for investigating user code and those which are only of interest to the developer of the runtime itself. The runtime developer statistics are now only collected if the additional #define KMP_DEVELOPER_STATS is set. Additional user statistics which are now collected include:
* Count of nested parallelism (omp parallel inside a parallel region)
* Count of omp distribute occurrences
* Count of omp teams occurrences
* Counts of task related statistics (taskyield, task execution, task cancellation, task steal)
* Values passed to omp_set_num_threads
* Time spent in omp single and omp master
None of this affects code compiled without stats gathering enabled, which is the normal library build mode. This also fixes the CMake build by linking to the standard C++ library when building the stats library, as it is a requirement. The normal library does not have this requirement and its link phase is left alone. Differential Revision: http://reviews.llvm.org/D11759 llvm-svn: 244677
-
- Jul 23, 2015
-
-
Jonathan Peyton authored
Compiling a simple test case with g++ and linking it to the LLVM OpenMP runtime compiled in debug mode trips an assertion that produces a fatal error. When the assertion is skipped, the program runs successfully to completion and produces the same answer as the sequential code. Intel will restore the assertion with a patch that fixes the issues that cause it to trip. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D11269 llvm-svn: 243032
-