"clang/LICENSE.TXT" did not exist on "12b1b42beae60b3ea1de0ce36f004c1cc542cd8d"
- Oct 07, 2021
-
-
Jon Chesterfield authored
Follow on to D110006, related to D110957. Where implementations have diverged, this resolves to match the new DeviceRTL:
  - replaces definitions of this struct in DeviceRTL and plugins with an include
  - changes the dynamic_shared_size field from D110006 to 32 bits
  - handles stdint being unavailable in DeviceRTL
  - adds a zero initializer for the field to amdgpu
  - moves the extern declaration for DeviceRTL to target_interface (omptarget.h is more natural, but doesn't work due to include order with debug.h)
  - renames the fields everywhere to match the LLVM format used in DeviceRTL
  - makes debug_level uint32_t everywhere (previously sometimes int32_t)
Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D111069
-
- Oct 04, 2021
-
-
Michał Górny authored
The hand-rolled linking logic in elf_common does not account for the possibility of using LLVM dylib rather than a dozen static libraries. Since it does not seem to be easily convertible to add_llvm_library, just hand-roll support for LLVM_LINK_LLVM_DYLIB. This is necessary to support stand-alone builds against installed LLVM. Differential Revision: https://reviews.llvm.org/D111038
-
- Oct 03, 2021
-
-
Martin Storsjö authored
Differential Revision: https://reviews.llvm.org/D110963
-
- Oct 01, 2021
-
-
Peyton, Jonathan L authored
Store CPUID support flags as bits instead of using entire integers. Differential Revision: https://reviews.llvm.org/D110091
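A minimal sketch of the idea, not the runtime's actual code: the struct and field names below are hypothetical and only illustrate how one-bit bit-fields replace whole integers while call sites keep reading the flags the same way.

```cpp
#include <cstdint>

// Hypothetical "before" layout: every flag occupies a full int.
struct CpuFlagsAsInts {
  int sse2;
  int rtm;
  int hybrid;
};

// Hypothetical "after" layout: every flag occupies a single bit.
struct CpuFlagsAsBits {
  std::uint32_t sse2 : 1;
  std::uint32_t rtm : 1;
  std::uint32_t hybrid : 1;
};

// Holds on common ABIs (4 bytes versus 12 bytes).
static_assert(sizeof(CpuFlagsAsBits) < sizeof(CpuFlagsAsInts),
              "bit-fields pack the flags into a single word");

// Reading a flag looks the same as before; only the storage changes.
inline bool has_rtm(const CpuFlagsAsBits &flags) { return flags.rtm != 0; }
```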
-
Peyton, Jonathan L authored
-
Jon Chesterfield authored
-
- Sep 30, 2021
-
-
Jon Chesterfield authored
Add a FAQ entry about the names of openmp offloading components and how they are searched for. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D109619
-
Jon Chesterfield authored
Fixes 51982. Adds a missing CreatePointerCast and allocates a global in the correct address space. Test case derived from https://github.com/ROCm-Developer-Tools/aomp/blob/aomp-dev/test/smoke/nest_call_par2/nest_call_par2.c by deleting parts while checking the assertion failure still occurred. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110556
-
Jon Chesterfield authored
Use enum for execution mode. This is partly a port from ROCm and partly a port from D110029. Attempted to make the same choices as ROCm as far as comments etc go to reduce the merge conflicts. There is some cleanup warranted here - in particular I like the cuda patch factoring out the comparisons into named variables - but I'd like to leave that for a follow up patch, keeping this one minimal. Reviewed By: carlo.bertolli Differential Revision: https://reviews.llvm.org/D110845
-
- Sep 29, 2021
-
-
Dhruva Chakrabarti authored
[libomptarget] [amdgpu] After a kernel dispatch packet is published, its contents must not be accessed. Fixes: SWDEV-275232 (With contributions from Ammar Elwazir, Laurent Morichetti, and Tony Tye) The current code is racy. After the packet is submitted, the GPU will increment the read index. If this wraps around before the memory is read from, it will refer to a signal from an unrelated packet. This change avoids reading from the packet post-submission. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D110679
-
- Sep 27, 2021
-
-
Jon Chesterfield authored
-
Jon Chesterfield authored
-
Jon Chesterfield authored
This reverts commit 1a761e5b. Failed CI, albeit with a different failure mode to BZ51982
-
Jon Chesterfield authored
Fixes 51982. Minor refactor to remove `return x = y` construct. Test case derived from https://github.com/ROCm-Developer-Tools/aomp/blob/aomp-dev/test/smoke/nest_call_par2/nest_call_par2.c by deleting parts while checking the assertion failure still occurred. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110556
-
@vladaindjic authored
This minor refactoring introduces the TASK_TIED constant inside kmp_gsupport.cpp as a replacement for the literal value 1. The constant is now used in both kmp_tasking.cpp and kmp_gsupport.cpp. Differential Revision: https://reviews.llvm.org/D110441
-
Joseph Huber authored
This patch defines the newly added `__kmpc_distribute_static_init` functions in the device runtime library. These functions are currently exact copies of the current worksharing method but can be tuned later. Depends on D110429 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D110430
-
Pushpinder Singh authored
Keeping all the checks in one place for future simplification. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D110513
-
Michael Kruse authored
Use the in-project clang, llvm-link and opt if available and unless CMake cache variables specify to use a different compiler. This applies D101265 to the new DeviceRTL's CMakeLists.txt which was copied before D101265 was applied. Fixes the openmp-offloading-cuda-runtime builder which was failing since D110006. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D110251
-
Pushpinder Singh authored
-
Jon Chesterfield authored
-
Vignesh Balu authored
This is a continuation of the review: https://reviews.llvm.org/D100182 This patch implements the OMPD API as specified in the standard doc. Reviewed By: @hbae Differential Revision: https://reviews.llvm.org/D100183
-
- Sep 26, 2021
-
-
Jon Chesterfield authored
Store queues in unique_ptr so they are destroyed when the global DeviceInfo is. Currently they leak which raises an assert in debug builds of hsa. Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D109511
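A minimal sketch of the ownership pattern, under stated assumptions: `Queue`, `queue_create` and `queue_destroy` are hypothetical stand-ins for a C-style queue API (they are not the HSA functions), and the `DeviceInfo` shown here is a simplified illustration rather than the plugin's class.

```cpp
#include <memory>
#include <vector>

// Hypothetical C-style queue API with a live-object counter so that
// leaks are observable.
struct Queue { int id; };
inline int &live_queues() { static int count = 0; return count; }
inline Queue *queue_create(int id) { ++live_queues(); return new Queue{id}; }
inline void queue_destroy(Queue *q) { --live_queues(); delete q; }

// Custom deleter: unique_ptr calls the API's destroy function.
struct QueueDeleter {
  void operator()(Queue *q) const { queue_destroy(q); }
};
using QueuePtr = std::unique_ptr<Queue, QueueDeleter>;

// Simplified owner: when it is destroyed, every queue is destroyed too.
struct DeviceInfo {
  std::vector<QueuePtr> queues;
  explicit DeviceInfo(int num_queues) {
    for (int i = 0; i < num_queues; ++i)
      queues.push_back(QueuePtr(queue_create(i)));
  }
};
```

With raw pointers in the vector, destroying the owner would leave every queue alive; with the deleter attached, cleanup follows the owner's lifetime without an explicit teardown loop.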
-
- Sep 23, 2021
-
-
Joseph Huber authored
This patch fixes a data race observed when using the new device runtime library. The internal control variable for the parallel level is read in the `__kmpc_parallel_51` function while it could potentially be written by other threads. This causes data corruption and will cause nondeterministic behaviour in the runtime. This patch fixes this by adding an explicit synchronization before the region starts. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110366
-
- Sep 22, 2021
-
-
Shilei Tian authored
[OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit` This is a follow-up of D110029, which uses a bitset to indicate execution mode. This patch makes the changes in the function call. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110279
-
Joseph Huber authored
Summary: Functions were called the wrong way around, which didn't keep the symbol alive.
-
Joseph Huber authored
This patch adds support for an RAII struct that will print function traces when placed inside of a function declaration. Each successive call will increase the indentation to make it easier to visually inspect. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110202
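A minimal host-side sketch of such an RAII tracer, assuming a hypothetical class name and output format (the device runtime's actual struct is not shown in this log). The counter is a plain static, so the sketch is single-threaded only.

```cpp
#include <cstdio>

// Hypothetical tracer: prints on entry and exit, indenting by call depth.
class FunctionTracer {
  static int &depth() { static int d = 0; return d; }
  const char *name_;

public:
  explicit FunctionTracer(const char *name) : name_(name) {
    std::printf("%*sEnter: %s\n", depth() * 2, "", name_);
    ++depth();
  }
  ~FunctionTracer() {
    --depth();
    std::printf("%*sExit: %s\n", depth() * 2, "", name_);
  }
  FunctionTracer(const FunctionTracer &) = delete;
  FunctionTracer &operator=(const FunctionTracer &) = delete;

  static int current_depth() { return depth(); }
};

// Placing one tracer at the top of a function is enough to trace it.
inline int inner() {
  FunctionTracer trace(__func__);
  return FunctionTracer::current_depth();
}

inline int outer() {
  FunctionTracer trace(__func__);
  return inner();
}
```

Calling `outer()` prints the entry and exit of `outer` at the left margin and those of `inner` indented by two spaces.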
-
Shilei Tian authored
The execution mode of a kernel is stored in a global variable, whose value means:
  - 0 - SPMD mode
  - 1 - generic mode
  - 2 - SPMD mode execution with generic mode semantics
We are going to add support for SIMD execution mode. It will come with another execution mode, such as SIMD-generic mode. As a result, this value-based indicator is not flexible. This patch changes to a bitset-based solution to encode the execution mode. Each position is:
  [0] - generic mode
  [1] - SPMD mode
  [2] - SIMD mode (will be added later)
In this way, `0x1` is generic mode, `0x2` is SPMD mode, and `0x3` is SPMD mode execution with generic mode semantics. In the future, after we add support for SIMD mode, `0b1xx` will be in SIMD mode. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110029
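The bit layout described above can be sketched as follows. The bit positions and the values 0x1, 0x2 and 0x3 come from the commit message; the enumerator and helper names are hypothetical and may differ from those used in LLVM.

```cpp
#include <cstdint>

// Bit positions follow the commit message; names are illustrative only.
enum ExecMode : std::int8_t {
  MODE_GENERIC = 1 << 0,                        // 0x1: generic mode
  MODE_SPMD = 1 << 1,                           // 0x2: SPMD mode
  MODE_GENERIC_SPMD = MODE_GENERIC | MODE_SPMD, // 0x3: SPMD execution,
                                                //      generic semantics
  MODE_SIMD = 1 << 2                            // 0b1xx: reserved for SIMD
};

// Each property is a single bit test, so adding a mode adds a bit
// instead of renumbering existing values.
constexpr bool runs_as_spmd(std::int8_t mode) {
  return (mode & MODE_SPMD) != 0;
}
constexpr bool has_generic_semantics(std::int8_t mode) {
  return (mode & MODE_GENERIC) != 0;
}
```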
-
Joseph Huber authored
Summary: The thread ID function was reintroduced in D110195, but could potentially be removed by the optimizer. Make the function noinline to preserve the call sites and add it to the externalization RAII so its definition is not removed by the attributor.
-
- Sep 21, 2021
-
-
Joseph Huber authored
The new device runtime library currently lacks the `kmpc_get_hardware_thread_id_in_block` function which is currently used when doing the SPMDzation optimization. This call would be introduced through the optimization and then cause a linking error because it was not present. This patch adds support for this runtime call. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D110195
-
Giorgis Georgakoudis authored
This reverts commit 1d66649a. Revert to fix AMD GPU issue.
-
Usman Nadeem authored
Differential Revision: https://reviews.llvm.org/D110120 Change-Id: I9d39dacfab5b7fbab37ee4b4d960d51e0892b24d
-
Giorgis Georgakoudis authored
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert, jhuber6 Differential Revision: https://reviews.llvm.org/D102107
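A minimal sketch of the aggregate-argument idea, with assumptions stated up front: `fake_fork_call`, `Captures` and `outlined_body` are hypothetical names, and the "fork" runs the body serially. It only illustrates how a single struct pointer replaces a variadic list of pointer-cast arguments; it is not the `__kmpc_fork_call` interface.

```cpp
// Hypothetical fixed-arity entry point: one opaque pointer carries all
// captured variables, so no variadic forwarding is needed.
using OutlinedFn = void (*)(int thread_id, void *captures);

inline void fake_fork_call(OutlinedFn fn, void *captures, int num_threads) {
  // Serial stand-in for launching a team of threads.
  for (int tid = 0; tid < num_threads; ++tid)
    fn(tid, captures);
}

// The compiler-generated aggregate: one typed field per captured variable.
struct Captures {
  const int *input;
  int *output;
  int scale;
};

// The outlined parallel region unpacks the struct once.
inline void outlined_body(int tid, void *raw) {
  auto *c = static_cast<Captures *>(raw);
  c->output[tid] = c->input[tid] * c->scale;
}

inline int demo_fork() {
  int in[4] = {1, 2, 3, 4};
  int out[4] = {};
  Captures c{in, out, 10};
  fake_fork_call(outlined_body, &c, 4);
  return out[0] + out[1] + out[2] + out[3];
}
```

Because the fields keep their own types, a scalar such as `scale` does not need to be cast to a pointer, and the number of captures is not bounded by the call interface.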
-
- Sep 20, 2021
-
-
Shilei Tian authored
Reviewed By: jhuber6, grokos Differential Revision: https://reviews.llvm.org/D110104
-
Peyton, Jonathan L authored
The indirect lock table can exhibit a race condition during initializing and setting/unsetting locks. This occurs if the lock table is resized by one thread (during an omp_init_lock) and accessed (during an omp_set|unset_lock) by another thread. The runtime/test/lock/omp_init_lock.c test exposed this issue and will fail if run enough times. This patch restructures the lock table so pointer/iterator validity is always kept. Instead of reallocating a single table to a larger size, the lock table begins preallocated to accommodate 8K locks. Each row of the table is allocated as needed with each row allowing 1K locks. If the 8K limit is reached for the initial table, then another table, capable of holding double the number of locks, is allocated and linked as the next table. The indices stored in the user's locks take this linked structure into account when finding the lock within the table. Differential Revision: https://reviews.llvm.org/D109725
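The table layout described above might look roughly like the sketch below. The row size (1K), initial capacity (8K) and doubling of linked tables come from the commit message; all type and function names are hypothetical, and the sketch omits the synchronization that the real runtime needs when allocating rows or tables.

```cpp
#include <cstddef>
#include <memory>

struct Lock { int state = 0; }; // placeholder for a real lock object

constexpr std::size_t kRowSize = 1024;  // locks per row
constexpr std::size_t kInitialRows = 8; // 8 rows * 1K = 8K locks

struct LockTable {
  std::size_t num_rows;
  std::unique_ptr<std::unique_ptr<Lock[]>[]> rows; // rows allocated lazily
  std::unique_ptr<LockTable> next;                 // holds twice as many rows

  explicit LockTable(std::size_t n)
      : num_rows(n), rows(new std::unique_ptr<Lock[]>[n]) {}
};

// Maps a global index to a lock, creating rows and linked tables on demand.
// Existing rows are never moved, so earlier Lock pointers stay valid.
inline Lock *get_lock(LockTable &first, std::size_t index) {
  LockTable *table = &first;
  while (index >= table->num_rows * kRowSize) {
    index -= table->num_rows * kRowSize;
    if (!table->next)
      table->next.reset(new LockTable(table->num_rows * 2));
    table = table->next.get();
  }
  std::unique_ptr<Lock[]> &row = table->rows[index / kRowSize];
  if (!row)
    row.reset(new Lock[kRowSize]);
  return &row[index % kRowSize];
}
```

The key property is that growth only appends new rows or tables, so a pointer obtained before the table grows still refers to the same lock afterwards.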
-
- Sep 18, 2021
-
-
Joseph Huber authored
This patch adds support for using dynamic shared memory in the new device runtime. The new function `__kmpc_get_dynamic_shared` will return a pointer to the buffer of dynamic shared memory. Currently the amount of memory allocated is set by an environment variable. In the future this amount will be added to the amount used for the smart stack which will be configured in a similar way. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D110006
-
Joseph Huber authored
This patch adds fields for the device number and number of devices into the device environment struct and debugging values. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110004
-
Joseph Huber authored
This patch implements the `__assert_fail` function in the new device runtime. This allows users and developers to use the standard assert function inside of the device. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D109886
-
- Sep 17, 2021
-
-
Shilei Tian authored
-
AndreyChurbanov authored
-
AndreyChurbanov authored
The third-party ittnotify sources were updated from https://github.com/intel/ittapi. Changes applied:
  - llvm license added to all files; initial BSD license saved in LICENSE.txt;
  - clang-formatted;
  - renamed *.c to *.cpp, similar to what we did with all our sources;
  - added #include "kmp_config.h" with definition of INTEL_ITTNOTIFY_PREFIX macro into ittnotify_static.cpp.
Differential Revision: https://reviews.llvm.org/D109333
-