- Dec 14, 2015
-
-
Jonathan Peyton authored
Visual Studio can't handle the asm extension in the KMP_USE_TSX code sections. llvm-svn: 255514
-
- Dec 11, 2015
-
-
Jonathan Peyton authored
This change set includes all changes to make the code conform to the OMP 4.5 specification:
* Removed hint / hinted_init definitions from include/40 files
* Hint values are powers of 2 to enable composition (4.5 spec)
* Hinted lock initialization functions were renamed (4.5 spec):
  kmp_init_lock_hinted -> omp_init_lock_with_hint
  kmp_init_nest_lock_hinted -> omp_init_nest_lock_with_hint
* __kmpc_critical_section_with_hint was added to support a critical section with a hint (4.5 spec)
* __kmp_map_hint_to_lock was added to convert a hint (possibly a composite) to an internal lock type
* kmpc_init_lock_with_hint and kmpc_init_nest_lock_with_hint were added as internal entries for the hinted lock initializers. The previous internal functions (__kmp_init*) were moved to kmp_csupport.c and reused in multiple places
* Added the two init functions to dllexports
* KMP_USE_DYNAMIC_LOCK is turned on if OMP_41_ENABLED is turned on
Differential Revision: http://reviews.llvm.org/D15205 llvm-svn: 255376
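A minimal sketch of why power-of-two hint values compose, and of what a hint-to-lock mapping might look like. The numeric values follow the OpenMP 4.5 `omp_lock_hint_t` constants; the `sketch_*` names, the lock kinds, and the mapping policy are hypothetical illustrations and not the runtime's actual `__kmp_map_hint_to_lock` logic.

```c
/* Hint values as in OpenMP 4.5: each non-zero hint is a distinct power of
   two, so several hints can be OR-ed together into one composite value. */
enum sketch_hint {
    SKETCH_HINT_NONE           = 0,
    SKETCH_HINT_UNCONTENDED    = 1,
    SKETCH_HINT_CONTENDED      = 2,
    SKETCH_HINT_NONSPECULATIVE = 4,
    SKETCH_HINT_SPECULATIVE    = 8
};

/* Hypothetical internal lock kinds, for illustration only. */
enum sketch_lock_kind {
    SKETCH_LOCK_DEFAULT,
    SKETCH_LOCK_QUEUING,
    SKETCH_LOCK_SPECULATIVE
};

/* Assumed policy: contradictory hints fall back to the default lock,
   a speculative hint wins next, then a contended hint picks a queuing lock. */
enum sketch_lock_kind sketch_map_hint_to_lock(unsigned hint)
{
    if ((hint & SKETCH_HINT_CONTENDED) && (hint & SKETCH_HINT_UNCONTENDED))
        return SKETCH_LOCK_DEFAULT;
    if ((hint & SKETCH_HINT_SPECULATIVE) && (hint & SKETCH_HINT_NONSPECULATIVE))
        return SKETCH_LOCK_DEFAULT;
    if (hint & SKETCH_HINT_SPECULATIVE)
        return SKETCH_LOCK_SPECULATIVE;
    if (hint & SKETCH_HINT_CONTENDED)
        return SKETCH_LOCK_QUEUING;
    return SKETCH_LOCK_DEFAULT;
}
```

Because the bits do not overlap, a caller can pass a composite such as `SKETCH_HINT_CONTENDED | SKETCH_HINT_SPECULATIVE` and the mapping can test each component independently.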
-
Jonathan Peyton authored
* Added a new user TSX lock implementation, RTM. This implementation is a light-weight version of the adaptive lock implementation, omitting the back-off logic for deciding when to speculate (or not). The fall-back lock is still the queuing lock.
* Changed indirect lock table management. The data for indirect lock management was encapsulated in the "kmp_indirect_lock_table_t" type. Also, the lock table dimension was changed to 2D (was linear), and each entry is a kmp_indirect_lock_t object now (was a pointer to an object).
* Some clean up in the critical section code
* Removed the limits of the tuning parameters read from KMP_ADAPTIVE_LOCK_PROPS
* KMP_USE_DYNAMIC_LOCK=1 also turns on these two switches: KMP_USE_TSX, KMP_USE_ADAPTIVE_LOCKS
Differential Revision: http://reviews.llvm.org/D15204 llvm-svn: 255375
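To illustrate the move from a linear table of pointers to a 2D table of objects, here is a rough sketch of a two-level table that stores entries by value. All `sketch_*` names, the row size, and the growth policy are assumptions made for the example; the real `kmp_indirect_lock_table_t` layout may differ.

```c
#include <stdlib.h>

#define SKETCH_CHUNK 4 /* entries per row; hypothetical and small for illustration */

typedef struct {
    int   type; /* placeholder payload */
    void *lock;
} sketch_indirect_lock_t;

typedef struct {
    sketch_indirect_lock_t **rows; /* array of row pointers (first dimension) */
    size_t nrows;                  /* number of rows allocated so far */
    size_t next;                   /* next free linear index */
} sketch_lock_table_t;

/* Map a linear index to its (row, column) slot; NULL if not yet allocated. */
sketch_indirect_lock_t *sketch_get(sketch_lock_table_t *t, size_t idx)
{
    if (idx >= t->next)
        return NULL;
    return &t->rows[idx / SKETCH_CHUNK][idx % SKETCH_CHUNK];
}

/* Hand out the next entry, adding a row when the table is full.
   Only the array of row pointers is reallocated, so entries that were
   handed out earlier never move. Returns (size_t)-1 on allocation failure. */
size_t sketch_alloc(sketch_lock_table_t *t)
{
    if (t->next == t->nrows * SKETCH_CHUNK) {
        sketch_indirect_lock_t **rows =
            realloc(t->rows, (t->nrows + 1) * sizeof *rows);
        if (rows == NULL)
            return (size_t)-1;
        t->rows = rows;
        rows[t->nrows] = calloc(SKETCH_CHUNK, sizeof **rows);
        if (rows[t->nrows] == NULL)
            return (size_t)-1;
        t->nrows++;
    }
    return t->next++;
}
```

One plausible benefit of this shape is that growing the table does not invalidate pointers to existing entries, which a single resizable array of objects could not guarantee.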
-
Jonathan Peyton authored
There are going to be two more patches which bring this feature up to date and in line with OpenMP 4.5.
* Renamed jump tables for the lock functions (and some clean up).
* Renamed some macros to be in KMP_ namespace.
* Return type of unset functions changed from void to int.
* Enabled use of _xbegin() et al. intrinsics for accessing TSX instructions.
Differential Revision: http://reviews.llvm.org/D15199 llvm-svn: 255373
-
- Dec 03, 2015
-
-
Jonathan Peyton authored
llvm-svn: 254637
-
- Nov 30, 2015
-
-
Jonathan Peyton authored
Fix for a crash in the teams construct when the user sets OMP_THREAD_LIMIT to a number less than the number of processors. Now the number of threads is reduced silently if the user didn't specify teams parameters, or with a warning if the user specified teams parameters that conflict with OMP_THREAD_LIMIT. Differential Revision: http://reviews.llvm.org/D14732 llvm-svn: 254322
-
Jonathan Peyton authored
The task_team pointer is dereferenced unconditionally, which causes a SEGFAULT when it is NULL (e.g. for a serialized parallel region, which can happen for the "teams" construct or for "target nowait"). The solution is to skip the second task team setup for a single-thread team. Differential Revision: http://reviews.llvm.org/D14729 llvm-svn: 254321
-
Jonathan Peyton authored
These changes allow libhwloc to be used as the topology discovery/affinity mechanism for libomp. It is supported on Unices. The code additions:
* Canonicalize KMP_CPU_* interface macros so bitmask operations are implementation independent and work with both hwloc bitmaps and libomp bitmaps. So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and the like. These are all in kmp.h and appropriately placed.
* Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc interface to create a libomp address2os object which the rest of libomp knows how to handle already.
* To build, use -DLIBOMP_USE_HWLOC=on and -DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake can't find the library or hwloc.h, then it will tell you and exit.
Differential Revision: http://reviews.llvm.org/D13991 llvm-svn: 254320
-
- Nov 16, 2015
-
-
Jonathan Peyton authored
llvm-svn: 253265
-
Jonathan Peyton authored
llvm-svn: 253264
-
Alexey Bataev authored
llvm-svn: 253200
-
- Nov 12, 2015
-
-
Jonathan Peyton authored
Trace when thread is waiting at join phase for oncore children. llvm-svn: 252954
-
Jonathan Peyton authored
llvm-svn: 252953
-
Jonathan Peyton authored
Fix ittnotify loop metadata reporting for schedule(runtime) and chunked schedule set via OMP_SCHEDULE. The bug was that chunk=1 was always reported. llvm-svn: 252952
-
- Nov 11, 2015
-
-
Jonathan Peyton authored
The patch adds support for ompt_event_task_switch into LLVM/OpenMP. Note that the patch has also updated the signature of ompt_event_task_switch to ompt_task_pair_callback_t (rather than the previous ompt_task_switch_callback_t). Patch by Harald Servat Differential Revision: http://reviews.llvm.org/D14566 llvm-svn: 252761
-
Jonathan Peyton authored
Patch by Harald Servat Differential Revision: http://reviews.llvm.org/D14565 llvm-svn: 252756
-
- Nov 09, 2015
-
-
Jonathan Peyton authored
1) Add get_ptr_type() method to all wait flag types.
2) The flag in sleep_loc may change type by the time the resume is called from __kmp_null_resume_wrapper. We use get_ptr_type to obtain the real type and compare it to the cast object received. If they don't match, we know the flag has changed (already resumed and replaced by another flag). If they match, it doesn't hurt to go ahead and resume it.
Differential Revision: http://reviews.llvm.org/D14458 llvm-svn: 252487
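The type check described above can be sketched in plain C with a tag field standing in for the get_ptr_type() method. The `sketch_*` types and function names are hypothetical; the runtime's flag classes are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical tags standing in for the runtime's wait flag types. */
typedef enum {
    SKETCH_FLAG_32,
    SKETCH_FLAG_64,
    SKETCH_FLAG_ONCORE
} sketch_flag_type;

typedef struct {
    sketch_flag_type type; /* what get_ptr_type() would report */
    int sleeping;          /* 1 while a thread sleeps on this flag */
} sketch_flag;

static sketch_flag_type sketch_get_ptr_type(const sketch_flag *f)
{
    return f->type;
}

/* Resume only if the flag still has the type the caller cast it to.
   A mismatch means the original flag was already resumed and replaced
   by another flag, so the resume is skipped. Returns 1 if resumed. */
int sketch_resume_if_unchanged(sketch_flag *sleep_loc, sketch_flag_type cast_type)
{
    if (sleep_loc == NULL)
        return 0;
    if (sketch_get_ptr_type(sleep_loc) != cast_type)
        return 0;
    sleep_loc->sleeping = 0;
    return 1;
}
```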
-
Jonathan Peyton authored
1) When the number of threads in a team increases, new threads need to have all their barrier struct fields initialized. We were missing the parent_bar and team fields.
2) For non-forkjoin barriers, we now do the __kmp_task_team_setup before the gather. The setup now sets up the task_team that all the threads will switch to after the barrier, but it needs to be done before other threads do the switch.
3) Remove an unneeded assignment of tt_found_tasks in task team free function.
Differential Revision: http://reviews.llvm.org/D14456 llvm-svn: 252486
-
Jonathan Peyton authored
These changes include:
1) Machine hierarchy now uses the base_num_threads field to indicate the maximum number of threads the current hierarchy can handle without a resize.
2) In __kmp_get_hierarchy, we need to get depth after any potential resize is done.
3) Cleanup of hierarchy resize code to support 1 above.
Differential Revision: http://reviews.llvm.org/D14455 llvm-svn: 252475
-
Jonathan Peyton authored
llvm-svn: 252472
-
- Nov 06, 2015
-
-
Jonathan Peyton authored
Setting a dynamic schedule with chunk size 0 via omp_set_schedule(dynamic,0) and then using "schedule (runtime)" causes an infinite loop, because for the chunked dynamic schedule we didn't correct a zero chunk to the default (1). llvm-svn: 252338
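A small sketch of why a zero chunk never terminates and how correcting it to 1 fixes that. The function below is a hypothetical model of chunk dispatch, not the runtime's scheduler; `max_steps` exists only so the uncorrected case can be demonstrated without actually hanging.

```c
/* Count how many chunks are handed out for `n` iterations.
   With chunk == 0 the cursor never advances, so the loop would spin
   forever; max_steps caps the demonstration. */
long sketch_count_chunks(long n, long chunk, int fix_zero, long max_steps)
{
    long next = 0;
    long steps = 0;

    if (fix_zero && chunk < 1)
        chunk = 1; /* correct a zero (or negative) chunk to the default */

    while (next < n && steps < max_steps) {
        next += chunk;
        steps++;
    }
    return steps;
}
```

With the correction, 10 iterations and chunk 0 finish in 10 steps; without it, the loop only stops because it hits the artificial step cap.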
-
- Nov 05, 2015
-
-
Jonathan Peyton authored
Use of #ifdef OMPT_DEBUG was causing messages to be generated under normal operation when the OpenMP library was compiled with KMP_DEBUG enabled. Elsewhere, KMP_DEBUG evaluates assertions, but never produces messages during normal operation. To avoid this inconsistency, set OMPT_DEBUG using a cmake variable LIBOMP_OMPT_DEBUG. While editing the associated ompt-specific.h and ompt-general.c files, I also made the spacing and comments consistent. Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D14355 llvm-svn: 252173
-
- Nov 04, 2015
-
-
Jonathan Peyton authored
In __kmp_free_team(), the team's number of processors can be equal to 1. llvm-svn: 252086
-
Jonathan Peyton authored
llvm-svn: 252084
-
Jonathan Peyton authored
This is a refactoring of the task_team code that more elegantly handles the two task_team case. Two task_teams per team are kept in use for the lifetime of the team. Thus no reference counting is needed. Differential Revision: http://reviews.llvm.org/D13993 llvm-svn: 252082
-
- Nov 02, 2015
-
-
Alexey Bataev authored
Add additional dependency to clang/clang-headers/FileCheck to avoid possible troubles with in-tree build/test of libomp + allow parallel testing of libomp. Also includes bugfixes for tests + improvements to avoid possible race conditions. Differential Revision: http://reviews.llvm.org/D14055 llvm-svn: 251797
-
- Oct 30, 2015
-
-
Jonathan Peyton authored
llvm-svn: 251719
-
- Oct 29, 2015
-
-
Jonathan Peyton authored
The problem is that the ompt_tool() function (which must be implemented by a performance tool) should be defined in the RTL as well, to cover the case when the tool is not present in the address space of the process. This functionality is accomplished with weak symbols in Unices. Unfortunately, Windows does not support weak symbols. The solution in these changes is to grab the list of all modules loaded by the process and then search for the symbol "ompt_tool()" within them. The function ompt_tool_windows() performs the search for the ompt_tool symbol. If ompt_tool is found, then its return value is used to initialize the tool. If ompt_tool is not found, then ompt_tool_windows() returns NULL and OMPT is thus disabled. While doing these changes, the OMPT_SUPPORT detection in CMake was changed to test for the features required for OMPT_SUPPORT, namely: builtin_frame_address() existence, weak attribute existence and psapi.dll existence. For LIBOMP_HAVE_OMPT_SUPPORT to be true, the builtin_frame_address() intrinsic must exist AND either weak attributes or psapi.dll must exist. Also, since the Process Status API is used, I had to add a new dependency, psapi.dll, to the library dependency micro test. Differential Revision: http://reviews.llvm.org/D14027 llvm-svn: 251654
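The weak-symbol approach used on Unices can be sketched as follows, assuming a GCC- or Clang-compatible compiler. The names `sketch_tool` and `sketch_tool_present` and the callback type are hypothetical; this is not the runtime's ompt_tool declaration, and the Windows module search is not shown.

```c
#include <stddef.h>

/* Hypothetical initializer type that a tool would hand back. */
typedef void (*sketch_init_fn)(void);

/* Weak default definition supplied by the runtime side. If a tool linked
   into the process provides a strong definition of the same symbol, the
   linker picks the tool's version instead of this one. */
__attribute__((weak)) sketch_init_fn sketch_tool(void)
{
    return NULL; /* no tool present */
}

/* The runtime can call the symbol unconditionally and inspect the result. */
int sketch_tool_present(void)
{
    return sketch_tool() != NULL;
}
```

When no other object file defines `sketch_tool`, the weak default is used and the check reports that no tool is attached.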
-
- Oct 20, 2015
-
-
Jonathan Peyton authored
The th.th_task_state for the master thread at the start of a nested parallel should not be zeroed in __kmp_allocate_team() because it is later put in the stack of states in __kmp_fork_call() for further re-use after exiting the nested region. It is zeroed after being put in the stack. Differential Revision: http://reviews.llvm.org/D13702 llvm-svn: 250847
-
Jonathan Peyton authored
Moved '@' from delimiters to offset designators for the KMP_PLACE_THREADS environment variable. Only one of postfix "o" or prefix "@" should be used in the value of KMP_PLACE_THREADS. For example, '2s@2,4c@2,1t'. This is also the format of KMP_SETTINGS=1 output now (removed "o" from there). e.g., 2s,2o,4c,2o,1t. Differential Revision: http://reviews.llvm.org/D13701 llvm-svn: 250846
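As an illustration of the '@' offset form, here is a minimal parser sketch for values shaped like '2s@2,4c@2,1t'. It assumes each item is `<count><unit>[@<offset>]` with a single-letter unit of s, c or t, as in the example above; the function and struct names are hypothetical, and the runtime's actual parser accepts more than this.

```c
#include <stdlib.h>

typedef struct {
    long count;  /* number of units requested */
    char unit;   /* 's', 'c' or 't', as in the example value */
    long offset; /* value after '@', or 0 when absent */
} sketch_place_item;

/* Parse comma-separated items of the form <count><unit>[@<offset>].
   Returns the number of items parsed, or -1 on malformed input. */
int sketch_parse_place(const char *s, sketch_place_item *out, int max)
{
    int n = 0;

    while (*s != '\0') {
        char *end;
        long count;

        if (n == max)
            return -1;
        count = strtol(s, &end, 10);
        if (end == s)
            return -1; /* no digits where a count was expected */
        if (*end != 's' && *end != 'c' && *end != 't')
            return -1; /* unknown unit letter */
        out[n].count = count;
        out[n].unit = *end;
        out[n].offset = 0;
        s = end + 1;

        if (*s == '@') { /* '@' introduces an offset, it is not a delimiter */
            long off = strtol(s + 1, &end, 10);
            if (end == s + 1)
                return -1; /* '@' without a number */
            out[n].offset = off;
            s = end;
        }
        n++;

        if (*s == ',') {
            s++;
            if (*s == '\0')
                return -1; /* trailing comma */
        } else if (*s != '\0') {
            return -1; /* unexpected character after an item */
        }
    }
    return n;
}
```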
-
- Oct 19, 2015
-
-
Jonathan Peyton authored
Just moved the *scan++ line up before the recursive call. Otherwise, infinite recursion occurs and leads to a segmentation fault. llvm-svn: 250729
-
Jonathan Peyton authored
Without this fix, cancellation requests in one parallel region cause cancellation of the second region even though the second one was not intended to be cancelled. llvm-svn: 250727
-
Dimitry Andric authored
warnings similar to the following:
runtime/src/kmp_global.c:117:35: warning: implicit conversion from 'unsigned long' to 'int' changes value from 18446744073709551615 to -1 [-Wconstant-conversion]
int __kmp_sys_max_nth = KMP_MAX_NTH;
~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~
runtime/src/kmp.h:849:34: note: expanded from macro 'KMP_MAX_NTH'
# define KMP_MAX_NTH PTHREAD_THREADS_MAX
^~~~~~~~~~~~~~~~~~~
Clamp KMP_MAX_NTH to INT_MAX to avoid these warnings. Also use INT_MAX whenever PTHREAD_THREADS_MAX is not defined at all. Differential Revision: http://reviews.llvm.org/D13827 llvm-svn: 250708
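One way to express such a clamp in the preprocessor is sketched below. `SKETCH_MAX_NTH` and `sketch_sys_max_nth` are hypothetical stand-ins for KMP_MAX_NTH and __kmp_sys_max_nth; the exact conditions used in kmp.h may differ.

```c
#include <limits.h>

/* Use PTHREAD_THREADS_MAX only when it is defined and fits in an int;
   otherwise fall back to INT_MAX so the int initializer never overflows. */
#if defined(PTHREAD_THREADS_MAX) && PTHREAD_THREADS_MAX < INT_MAX
#  define SKETCH_MAX_NTH PTHREAD_THREADS_MAX
#else
#  define SKETCH_MAX_NTH INT_MAX
#endif

int sketch_sys_max_nth = SKETCH_MAX_NTH;
```

The comparison happens at preprocessing time, so platforms that define PTHREAD_THREADS_MAX as a very large unsigned value take the INT_MAX branch and no implicit narrowing conversion remains.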
-
- Oct 16, 2015
-
-
Jonathan Peyton authored
This fix implements the following OMPT events for the API locking routines:
* ompt_event_acquired_lock
* ompt_event_acquired_nest_lock_first
* ompt_event_acquired_nest_lock_next
* ompt_event_init_lock
* ompt_event_init_nest_lock
* ompt_event_destroy_lock
* ompt_event_destroy_nest_lock
For the acquired events the depth of the lock is required, so a return value was added similar to the return values we already have for the release lock routines. Patch by Tim Cramer Differential Revision: http://reviews.llvm.org/D13689 llvm-svn: 250526
-
- Oct 13, 2015
-
-
Jonathan Peyton authored
llvm-svn: 250198
-
- Oct 12, 2015
-
-
Jonathan Peyton authored
Patch by Alexey Bataev Differential Revision: http://reviews.llvm.org/D13661 llvm-svn: 250066
-
- Oct 09, 2015
-
-
Jonathan Peyton authored
* Avoid computing state needed only by OMPT unless the ompt_enabled flag is set.
* Properly handle a corner case in OMPT where team == NULL.
Patch by John Mellor-Crummey Differential Revision: http://reviews.llvm.org/D13502 llvm-svn: 249857
-
Jonathan Peyton authored
Because __kmp_task_init_ompt is called for every initial task in each thread and always generated task IDs, this was a big performance issue on bigger systems even without any tool attached. After changing the initialization interface to ompt_tool, we can now rely on already knowing whether a tool is attached and OMPT is enabled at this point. Patch by Jonas Hahnfeld Differential Revision: http://reviews.llvm.org/D13494 llvm-svn: 249855
-
- Oct 08, 2015
-
-
Jonathan Peyton authored
llvm-svn: 249725
-
Jonathan Peyton authored
These changes improve/update the trace messages and debug asserts related to the previous wait/release checkin. llvm-svn: 249717
-