- Apr 19, 2016
-
-
Jonathan Peyton authored
llvm-svn: 266760
-
- Apr 18, 2016
-
-
Jonathan Peyton authored
Some codes that use TLS fail intermittently because one thread tries to write TLS values after the TLS key has been destroyed by another thread. This happens when one thread executes library shutdown (and destroys TLS keys), while another thread starts to execute the TLS key destructor routine. Before this change, the kmp_init_runtime flag was checked before calling pthread_* TLS functions, but this flag is set to FALSE later than the destruction of the TLS keys, which leads to failure. The fix is to check kmp_init_gtid instead, as this flag is unset *before* the destruction of TLS keys. Differential Revision: http://reviews.llvm.org/D19022 llvm-svn: 266674
-
- Mar 29, 2016
-
-
Jonathan Peyton authored
llvm-svn: 264776
-
- Mar 02, 2016
-
-
Jonathan Peyton authored
From the standard: A doacross loop nest is a loop nest that has cross-iteration dependence. An iteration is dependent on one or more lexicographically earlier iterations. The ordered clause parameter on a loop directive identifies the loop(s) associated with the doacross loop nest. The init/fini routines allocate/free doacross buffer(s) for each loop for each thread. The wait routine waits for a flag designated by the dependence vector. The post routine sets the flag designated by current iteration vector. We use a similar technique of shared buffer indices that covers up to 7 nowait loops executed simultaneously by different threads (number 7 has no real meaning, just heuristic value). Also, the size of structures are kept intact via reducing dummy arrays. This needs to be put into the OpenMP runtime library in order for the compiler team to develop the compiler side of the implementation. Differential Revision: http://reviews.llvm.org/D17399 llvm-svn: 262532
-
- Feb 25, 2016
-
-
Jonathan Peyton authored
This change introduces the new OpenMP 4.5 affinity api surrounding OpenMP Places. There are six new entry points: Typically called in serial region: * omp_get_num_places - returns the number of places available to the execution environment in the place list. * omp_get_place_num_procs - returns the number of processors available to the execution environment in the specified place. * omp_get_place_proc_ids - returns the numerical identifiers of the processors available to the execution environment in the specified place. Typically called inside parallel region: * omp_get_place_num - returns the place number of the place to which the encountering thread is bound. * omp_get_partition_num_places - returns the number of places in the place partition of the innermost implicit task. * omp_get_partition_place_nums - returns the list of place numbers corresponding to the places in the place-var ICV of the innermost implicit task. Differential Revision: http://reviews.llvm.org/D17417 llvm-svn: 261915
-
- Feb 09, 2016
-
-
Jonathan Peyton authored
The problem is that the master's thread state was not saved before entering a parallel region so it does not remember tasks when it returns. llvm-svn: 260306
-
- Jan 28, 2016
-
-
Jonas Hahnfeld authored
When the code behind the barrier is executed, the master thread may have already resumed execution. That's why we cannot safely assume that *pteam is not yet freed. This has been introduced by r258866. llvm-svn: 259037
-
- Jan 27, 2016
-
-
Jonathan Peyton authored
Removing extraneous { } bracket sections. Unindenting blocks of code as a result. Also removing empty #ifdef KMP_STUB llvm-svn: 258986
-
Jonathan Peyton authored
Removing references to non-existent functions, fixing typos. llvm-svn: 258985
-
Jonathan Peyton authored
llvm-svn: 258984
-
- Jan 26, 2016
-
-
Jonathan Peyton authored
For implcit barriers in simple parallel for loops, the order of the OMPT events was wrong. The barrier_{begin,end} events came after the implcit_task_end event for the implcit barrier at the end of the parallel region. This is wrong because the implicit task executes the barrier before ending. This patch fixes the order of the event: It will be triggerd now just before __kmp_pop_current_task_from_thread() is called. Patch by Tim Cramer Differential Revision: http://reviews.llvm.org/D16347 llvm-svn: 258866
-
- Jan 11, 2016
-
-
Jonathan Peyton authored
Change (__kmp_mic_type != non_mic) to (__kmp_mic_type == mic2) llvm-svn: 257380
-
- Dec 17, 2015
-
-
Jonathan Peyton authored
llvm-svn: 255901
-
- Nov 30, 2015
-
-
Jonathan Peyton authored
Fix for crash in the teams construct in case user sets OMP_THREAD_LIMIT to a number less than the number of processors. Now the number of threads will be silently reduced if the user didn't specify teams parameters or with a warning if the user specified teams parameters conflicting with OMP_THREAD_LIMIT. Differential Revision: http://reviews.llvm.org/D14732 llvm-svn: 254322
-
- Nov 16, 2015
-
-
Jonathan Peyton authored
llvm-svn: 253264
-
- Nov 12, 2015
-
-
Jonathan Peyton authored
llvm-svn: 252953
-
- Nov 04, 2015
-
-
Jonathan Peyton authored
in __kmp_free_team(), the team's number of processors can be == 1. llvm-svn: 252086
-
Jonathan Peyton authored
llvm-svn: 252084
-
Jonathan Peyton authored
This is a refactoring of the task_team code that more elegantly handles the two task_team case. Two task_teams per team are kept in use for the lifetime of the team. Thus no reference counting is needed. Differential Revision: http://reviews.llvm.org/D13993 llvm-svn: 252082
-
- Oct 20, 2015
-
-
Jonathan Peyton authored
The th.th_task_state for the master thread at the start of a nested parallel should not be zeroed in __kmp_allocate_team() because it is later put in the stack of states in __kmp_fork_call() for further re-use after exiting the nested region. It is zeroed after being put in the stack. Differential Revision: http://reviews.llvm.org/D13702 llvm-svn: 250847
-
- Oct 19, 2015
-
-
Jonathan Peyton authored
Without this fix, cancellation requests in one parallel region cause cancellation of the second region even though the second one was not intended to be cancelled. llvm-svn: 250727
-
- Oct 08, 2015
-
-
Jonathan Peyton authored
llvm-svn: 249725
-
Jonathan Peyton authored
These changes improve/update the trace messages and debug asserts related to the previous wait/release checkin. llvm-svn: 249717
-
Jonathan Peyton authored
These changes improve the wait/release mechanism for threads spinning in barriers that are handling tasks while spinnin by providing feedback to the barriers about any task stealing that occurs. Differential Revision: http://reviews.llvm.org/D13353 llvm-svn: 249711
-
- Sep 21, 2015
-
-
Joerg Sonnenberger authored
llvm-svn: 248204
-
Jonathan Peyton authored
Prior to this change, OMPT had a status flag ompt_status, which could take several values. This was due to an earlier OMPT design that had several levels of enablement (ready, disabled, tracking state, tracking callbacks). The current OMPT design has OMPT support either on or off. This revision replaces ompt_status with a boolean flag ompt_enabled, which simplifies the runtime logic for OMPT. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D12999 llvm-svn: 248189
-
Jonathan Peyton authored
The OMPT specification has changed. This revision brings the LLVM OpenMP implementation up to date. Technical overview of changes: Previously, a public weak symbol ompt_initialize was called after the OpenMP runtime is initialized. The new interface calls a global weak symbol ompt_tool prior to initialization. If a tool is present, ompt_tool returns a pointer to a function that matches the signature for ompt_initialize. After OpenMP is initialized the function pointer is called to initialize a tool. Knowing that OMPT will be enabled before initialization allows OMPT support to be initialized as part of initialization instead of back patching initialization of OMPT support after the fact. Post OpenMP initialization support has been generalized moves from ompt-specific.c into ompt-general.c, since the OMPT initialization logic is no longer implementation specific. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D12998 llvm-svn: 248187
-
Jonathan Peyton authored
This change adds guards to the code in places where they are missing to enable the OpenMP 3.0 build. Patch by Diego Caballero and Johnny Peyton Mailing List: http://lists.llvm.org/pipermail/openmp-dev/2015-September/000935.html llvm-svn: 248178
-
- Sep 10, 2015
-
-
Jonathan Peyton authored
This only triggered when built in debug mode with OMPT enabled: __kmp_wait_template expected the state of the current thread to be either ompt_state_idle or ompt_state_wait_barrier{,_implicit,_explicit}. Patch by Jonas Hahnfeld Differential Revision: http://reviews.llvm.org/D12754 llvm-svn: 247339
-
Jonathan Peyton authored
Some of this is improvement to code suggested by Hal Finkel. Four changes here: 1.Cleanup of hierarchy code to handle all hierarchy cases whether affinity is available or not 2.Separated this and other classes and common functions out to a header file 3.Added a destructor-like fini function for the hierarchy (and call in __kmp_cleanup) 4.Remove some redundant code that is hopefully no longer needed Differential Revision: http://reviews.llvm.org/D12449 llvm-svn: 247326
-
Jonathan Peyton authored
The fix is to make b_arrived flag 64 bit in both structures - kmp_balign_team_t and kmp_balign_t. Otherwise when flag in kmp_balign_team_t wrapped over UINT_MAX the library hangs. Differential Revision: http://reviews.llvm.org/D12563 llvm-svn: 247320
-
- Sep 02, 2015
-
-
Jonathan Peyton authored
The th.th_team_nproc is assigned in __kmp_allocate_thread() just 3 lines above, so there is no need to assign the same value again. llvm-svn: 246703
-
- Aug 31, 2015
-
-
Jonathan Peyton authored
Conditionally include the fork_context parameter to __kmp_join_call() only if OMPT_SUPPORT=1 Differential Revision: http://reviews.llvm.org/D12495 llvm-svn: 246460
-
- Aug 18, 2015
-
-
Andrey Churbanov authored
llvm-svn: 245286
-
- Aug 17, 2015
-
-
Andrey Churbanov authored
llvm-svn: 245209
-
Andrey Churbanov authored
llvm-svn: 245206
-
- Aug 11, 2015
-
-
Jonathan Peyton authored
This removes some statistics counters and timers which were not used, adds new counters and timers for some language features that were not monitored previously and separates the counters and timers into those which are of interest for investigating user code and those which are only of interest to the developer of the runtime itself. The runtime developer statistics are now ony collected if the additional #define KMP_DEVELOPER_STATS is set. Additional user statistics which are now collected include: * Count of nested parallelism (omp parallel inside a parallel region) * Count of omp distribute occurrences * Count of omp teams occurrences * Counts of task related statistics (taskyield, task execution, task cancellation, task steal) * Values passed to omp_set_numtheads * Time spent in omp single and omp master None of this affects code compiled without stats gathering enabled, which is the normal library build mode. This also fixes the CMake build by linking to the standard c++ library when building the stats library as it is a requirement. The normal library does not have this requirement and its link phase is left alone. Differential Revision: http://reviews.llvm.org/D11759 llvm-svn: 244677
-
- Jul 23, 2015
-
-
Jonathan Peyton authored
Compiling simple testcase with g++ and linking it to the LLVM OpenMP runtime compiled in debug mode trips an assertion that produces a fatal error. When the assertion is skipped, the program runs successfully to completion and produces the same answer as the sequential code. Intel will restore the assertion with a patch that fixes the issues that cause it to trip. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D11269 llvm-svn: 243032
-
- Jul 21, 2015
-
-
Jonathan Peyton authored
This patch makes it possible for a performance tool that uses call stack unwinding to map implementation-level call stacks from master and worker threads into a unified global view. There are several components to this patch. include/*/ompt.h.var Add a new enumeration type that indicates whether the code for a master task for a parallel region is invoked by the user program or the runtime system Change the signature for OMPT parallel begin/end callbacks to indicate whether the master task will be invoked by the program or the runtime system. This enables a performance tool using call stack unwinding to handle these two cases differently. For this case, a profiler that uses call stack unwinding needs to know that the call path prefix for the master task may differ from those available within the begin/end callbacks if the program invokes the master. kmp.h Change the signature for __kmp_join_call to take an additional parameter indicating the fork_context type. This is needed to supply the OMPT parallel end callback with information about whether the compiler or the runtime invoked the master task for a parallel region. kmp_csupport.c Ensure that the OMPT task frame field reenter_runtime_frame is properly set and cleared before and after calls to fork and join threads for a parallel region. Adjust the code for the new signature for __kmp_join_call. Adjust the OMPT parallel begin callback invocations to carry the extra parameter indicating whether the program or the runtime invokes the master task for a parallel region. kmp_gsupport.c Apply all of the analogous changes described for kmp_csupport.c for the GOMP interface Add OMPT support for the GOMP combined parallel region + loop API to maintain the OMPT task frame field reenter_runtime_frame. kmp_runtime.c: Use the new information passed by __kmp_join_call to adjust the OMPT parallel end callback invocations to carry the extra parameter indicating whether the program or the runtime invokes the master task for a parallel region. ompt_internal.h: Use the flavor of the parallel region API (GNU or Intel) to determine who invokes the master task. Differential Revision: http://reviews.llvm.org/D11259 llvm-svn: 242817
-
- Jul 09, 2015
-
-
Jonathan Peyton authored
These changes enable external debuggers to conveniently interface with the LLVM OpenMP Library. Structures are added which describe the important internal structures of the OpenMP Library e.g., teams, threads, etc. This feature is turned on by default (CMake variable LIBOMP_USE_DEBUGGER) and can be turned off with -DLIBOMP_USE_DEBUGGER=off. Differential Revision: http://reviews.llvm.org/D10038 llvm-svn: 241832
-