- Nov 30, 2018
-
-
Jonathan Peyton authored
There is a conflict between libomptarget and libomp concerning some of the standard OpenMP device API which needs further intestigation. llvm-svn: 347932
-
- Nov 29, 2018
-
-
Jonathan Peyton authored
This patch adds __kmpc_omp_reg_task_with_affinity to register affinity information for tasks. For now, the affinity information is not used, and the function always succeeds. This also adds the kmp_task_affinity_info_t structure to store the task affinity information. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D55026 llvm-svn: 347907
-
- Nov 28, 2018
-
-
Jonathan Peyton authored
This change renames ompt_mutex_impl_unknown to ompt_mutex_impl_none, following the name change in the specification. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D54347 llvm-svn: 347802
-
Jonathan Peyton authored
* Fix calculation of string length. * Remove NULL-check of pointer which has been dereferenced. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D54948 llvm-svn: 347801
-
Jonathan Peyton authored
There is low probability that array th_hot_teams can be accessed out of bound (when many nested levels are requested to keep hot teams via KMP_HOT_TEAMS_MAX_LEVEL). The patch adds the check of index that fixes the problem. Patch by Andrey Churbanov Differential Revision: https://reviews.llvm.org/D54950 llvm-svn: 347800
-
Jonathan Peyton authored
Add omp_get_device_num() function for 5.0 which returns the number of the device the current thread is running on. Also, did some cleanup and updating of device API functions to make them into weak functions that should be replaced with libomptarget functions when libomptarget is present. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D54342 llvm-svn: 347799
-
- Nov 27, 2018
-
-
Gheorghe-Teodor Bercea authored
Summary: To enable the compiler to optimize parts of the function that are not needed when runtime can be omitted, a new version of the SPMD deinit kernel function is needed. This function takes the runtime required flag as an argument. Reviewers: ABataev, kkwli0, caomhin Reviewed By: ABataev Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D54969 llvm-svn: 347714
-
Alexey Bataev authored
Summary: Added functions __kmpc_nvptx_teams_reduce_nowait_simple and __kmpc_nvptx_teams_end_reduce_nowait_simple to implement basic support for reductions across the teams. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D54967 llvm-svn: 347710
-
Gheorghe-Teodor Bercea authored
Summary: Refactor the checking for SPMD mode and whether the runtime is initialized or not. This uses constant flags which enables the runtime to optimize out unused sections of code that depend on these flags. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D54960 llvm-svn: 347698
-
- Nov 20, 2018
-
-
Alexey Bataev authored
Summary: Improved support for critical constructs + omp_..._lock... constructs. Reviewers: gtbercea, kkwli0, caomhin Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D54766 llvm-svn: 347342
-
- Nov 14, 2018
-
-
https://bugs.llvm.org/show_bug.cgi?id=39137Andrey Churbanov authored
Do not write to internal structure if it keeps same value. Differential Revision: https://reviews.llvm.org/D54305 llvm-svn: 346862
-
- Nov 08, 2018
-
-
Alexey Bataev authored
Summary: The base pointer for the lambda mapping must point to the lambda capture placement and pointer must point to the captured variable itself. Patch fixes this problem. Reviewers: gtbercea Subscribers: guansong, openmp-commits, kkwli0, caomhin Differential Revision: https://reviews.llvm.org/D54260 llvm-svn: 346407
-
- Nov 07, 2018
-
-
Andrey Churbanov authored
Patch by samuel.thibault@ens-lyon.org Differential Revision: https://reviews.llvm.org/D54079 llvm-svn: 346310
-
Andrey Churbanov authored
Differential Revision: https://reviews.llvm.org/D53380 llvm-svn: 346307
-
- Nov 02, 2018
-
-
Alexey Bataev authored
Summary: The previously used combination `PTR_AND_OBJ | PRIVATE` could be used for mapping of some data in Fortran. Changed it to `PTR_AND_OBJ | LITERAL`. Reviewers: gtbercea Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D54035 llvm-svn: 345981
-
Alexey Bataev authored
Summary: Current globalization scheme works correctly only for SPMD+lightweight runtime mode and does not work for full runtime. Patch improves support for the globalization scheme + reduces global memory consumption in lightweight runtime mode. Patch adds runtime functions to work with the statically allocated global memory. It allows to improve performance and memory consumption. This global memory must be allocated by the compiler. Reviewers: grokos, kkwli0, gtbercea, caomhin Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D53943 llvm-svn: 345976
-
- Nov 01, 2018
-
-
Gheorghe-Teodor Bercea authored
Summary: In the case of coalesced global records, we need to push the exact data size passed in. This patch fixes this by outlining the common functionality of the previous push function and by adding a separate entry point for coalesced pushes. The pop function remains unchanged. Reviewers: ABataev, grokos, caomhin Reviewed By: ABataev, grokos Subscribers: jholewinski, cfe-commits, Hahnfeld, guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D53141 llvm-svn: 345867
-
- Oct 30, 2018
-
-
Alexey Bataev authored
Summary: Added support for correct mapping of variables captured by reference in lambdas. That kind of mapping may appear only in target-executable regions and must follow the original lambda or another lambda capture for the same lambda. The expected data: base address - the address of the lambda, begin pointer - pointer to the address of the lambda capture, size - size of the captured variable. When OMP_TGT_MAPTYPE_PTR_AND_OBJ mapping type is seen in target-executable region, the target address of the last processed item is taken as the address of the original lambda `tgt_lambda_ptr`. Then, the pointer to capture on the device is calculated like `tgt_lambda_ptr + (host_begin_pointer - host_begin_base)` and the target-based address of the original variable (which host address is `*(void**)begin_pointer`) is written to that pointer. Reviewers: kkwli0, gtbercea, grokos Subscribers: openmp-commits Differential Revision: https://reviews.llvm.org/D51107 llvm-svn: 345608
-
- Oct 25, 2018
-
-
Andrey Churbanov authored
Patch by squallatf@gmail.com Differential Revision: https://reviews.llvm.org/D53480 llvm-svn: 345255
-
- Oct 05, 2018
-
-
Jonathan Peyton authored
llvm-svn: 343869
-
Jonathan Peyton authored
The KMP_DYNAMIC_LIB guard was hard set to 1. This patch has the guard depend on CMake variable LIBOMP_ENABLE_SHARED. llvm-svn: 343866
-
- Oct 04, 2018
-
-
Jonathan Peyton authored
Initializing an ompt_data_t object using the pointer union member is potentially unsafe in 32-bit programs. This change fixes the issue by using the constant, ompt_data_none. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D52046 llvm-svn: 343785
-
- Oct 02, 2018
-
-
Jonathan Peyton authored
On Windows, child workers are terminated by the parent during the normal program exit process (ExitProcess()) and they are not able to finish generating their OpenMP events. We can force manual library shut down in __kmpc_end() to fix this at least for the cases where __kmpc_end() is properly inserted. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D52628 llvm-svn: 343619
-
- Oct 01, 2018
-
-
Jonas Hahnfeld authored
If the user requested LIBOMPTARGET_NVPTX_DEBUG, include asserts in the bitcode library. Everything else will have very unpleasent effects because asserts will appear when falling back to the static library libomptarget-nvptx.a. Differential Revision: https://reviews.llvm.org/D52701 llvm-svn: 343477
-
Jonas Hahnfeld authored
Pass in the correct value of isRuntimeUninitialized() which solves parallel reductions as reported on the mailing list. For reference: r333285 did the same for loop scheduling. Differential Revision: https://reviews.llvm.org/D52725 llvm-svn: 343476
-
https://reviews.llvm.org/D51694Andrey Churbanov authored
Patch suggested by Kelvin Li: removed optional "kind=" part of kind-selector for variables with long names and kind names. Differential Revision: https://reviews.llvm.org/D52712 llvm-svn: 343475
-
- Sep 30, 2018
-
-
Jonas Hahnfeld authored
NVPTX requires addresses of pointer locations to be 8-byte aligned or there will be an exception during runtime. This could happen without this patch as shown in the added test: getId() requires 4 byte of stack and putValueInParallel() uses 16 bytes to store the addresses of the captured variables. Differential Revision: https://reviews.llvm.org/D52655 llvm-svn: 343402
-
Jonas Hahnfeld authored
According to OpenMP 4.5, p250:12-14: If the requested nest level is outside the range of 0 and the nest level of the current thread, as returned by the omp_get_level routine, the routine returns -1. The SPMD code path will need a similar fix. Differential Revision: https://reviews.llvm.org/D51787 llvm-svn: 343401
-
- Sep 29, 2018
-
-
Jonas Hahnfeld authored
Clang trunk will serialize nested parallel regions. Check that this is correctly reflected in various API methods. Differential Revision: https://reviews.llvm.org/D51786 llvm-svn: 343382
-
Jonas Hahnfeld authored
There is no support and according to the OpenMP 4.5, p238:7-9: For implementations that do not support dynamic adjustment of the number of threads this routine has no effect: the value of dyn-var remains false. Add a test that cancellation and nested parallelism aren't supported either. Differential Revision: https://reviews.llvm.org/D51785 llvm-svn: 343381
-
Jonas Hahnfeld authored
If there is no num_threads() clause we must consider the nthreads-var ICV. Its value is set by omp_set_num_threads() and can be queried using omp_get_max_num_threads(). The rewritten code now closely resembles the algorithm given in the OpenMP standard. Differential Revision: https://reviews.llvm.org/D51783 llvm-svn: 343380
-
- Sep 28, 2018
-
-
Alexey Bataev authored
infinite loop on removing non-mapped pointer-with-object. Added test to check that libomptarget does not cause infinite loop when trying to unmap the pointer-with-object data that was not previously mapped. llvm-svn: 343344
-
Jonas Hahnfeld authored
This patch also introduces testing for libomptarget-nvptx which has been missing until now. I propose to add tests for all bugs that are fixed in the future. The target check-libomptarget-nvptx is not run by default because - we can't determine if there is a GPU plugged into the system. - it will require the latest Clang compiler. Keeping compatibility with older releases would prevent testing newer code generation developed in trunk. Differential Revision: https://reviews.llvm.org/D51687 llvm-svn: 343324
-
- Sep 26, 2018
-
-
Jonathan Peyton authored
This patch puts the __kmpc_critical_with_hint function in dllexports and also replaces some OMP_45_ENABLED to OMP_50_ENABLED Differential Revision: https://reviews.llvm.org/D52380 llvm-svn: 343143
-
Jonathan Peyton authored
Balanced affinity only updated the thread's affinity with the operating system. This change also has the thread's private mask reflect that change as well so that any API that probes the thread's affinity mask will report the correct mask value. Differential Revision: https://reviews.llvm.org/D52379 llvm-svn: 343142
-
Jonathan Peyton authored
This patch updates the ittnotify sources to the latest corresponding with Intel(R) VTune(TM) Amplifier 2018 Differential Revision: https://reviews.llvm.org/D52378 llvm-svn: 343139
-
Jonathan Peyton authored
This change improves the performance of 376.kdtree by giving the compiler an opportunity to do inlining and other optimizations for the call path, __kmpc_omp_task_complete_if0()->__kmp_task_finish(), which is one of the hot paths in the program; some functions in kmp_taskdeps.cpp were moved to the new header file, kmp_taskdeps.h to achieve this. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D51889 llvm-svn: 343138
-
Jonathan Peyton authored
This change includes miscellaneous improvements as follows: 1) Added ompt_get_proc_id() implementation for Windows 2) Added parser and print tool for omp-tool-var, just in case it needs to be printed (OMP_DISPLAY_ENV) 3) omp_control_tool is exported on Windows Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D50538 llvm-svn: 343137
-
- Sep 25, 2018
-
-
Gheorghe-Teodor Bercea authored
Summary: NFC - just fixing a bug: the empty slot test was before the re-setting of the Stack pointer. Reviewers: ABataev, caomhin, Hahnfeld Reviewed By: ABataev Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D52122 llvm-svn: 343006
-
Gheorghe-Teodor Bercea authored
Summary: There is currently no supported situation where the warp master is not the first thread in the warp. This also avoids the device execution from hanging on Volta GPUs when ballot_sync is called by a number of threads that is less that the size of a warp. Reviewers: ABataev, caomhin, grokos Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D50188 llvm-svn: 342972
-