- Dec 06, 2021
-
-
Jon Chesterfield authored
Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D114891
-
- Nov 29, 2021
-
-
Matt Arsenault authored
This was trying to figure out the build path for amdgpu-arch, and making assumptions about where it is which were not working on my system. Whether a standalone build or not, we should have a proper imported target to get the location from.
-
- Nov 23, 2021
-
-
Jon Chesterfield authored
OpenMP (compiler) does not currently request any implicit kernel arguments. OpenMP (runtime) allocates and initialises a reasonable guess at the implicit kernel arguments anyway. This change makes the plugin check the number of explicit arguments, instead of all arguments, and puts the pointer to hostcall buffer in both the current location and at the offset expected when implicit arguments are added to the metadata by D113538. This is intended to keep things running while fixing the oversight in the compiler (in D113538). Once that patch lands, and a following one marks openmp kernels that use printf such that the backend emits an args element with the right type (instead of hidden_node), the over-allocation can be removed and the hardcoded 8*e+3 offset replaced with one read from the .offset of the corresponding metadata element. Reviewed By: estewart08 Differential Revision: https://reviews.llvm.org/D114274
-
- Nov 19, 2021
-
-
Jon Chesterfield authored
Removes a +x/-x pair on the only store/load of a variable and deletes some nearby dead code. Also reduces the size of the implicit struct to reflect the code currently emitted by clang. Differential Revision: https://reviews.llvm.org/D114270
-
Jon Chesterfield authored
-
- Oct 28, 2021
-
-
Jon Chesterfield authored
- more tests failing on CI than failed locally when writing this patch This reverts commit 33427fdb.
-
Jon Chesterfield authored
Passes same tests as the current deviceRTL. Includes cmake change from D111987. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D112227
-
- Oct 23, 2021
-
-
Jon Chesterfield authored
Implemented by patching python config instead of modifying all the tests so that -generic and XFAIL work as usual. Expectation is for this to be reverted once the old runtime is deleted. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D112225
-
- Oct 19, 2021
-
-
Joseph Huber authored
The plugin currently uses a macro to check if this is a debug built before assigning the debug kind variable to the device environment struct. This is being deprecated because the new device runtime does not maintain separate debug builds and should always be availible. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D112083
-
- Oct 09, 2021
-
-
Ron Lieberman authored
-
- Oct 07, 2021
-
-
Jon Chesterfield authored
-
Jon Chesterfield authored
Follow on to D110006, related to D110957 Where implementations have diverged this resolves to match the new DeviceRTL - replaces definitions of this struct in deviceRTL and plugins with include - changes the dynamic_shared_size field from D110006 to 32 bits - handles stdint being unavailable in DeviceRTL - adds a zero initializer for the field to amdgpu - moves the extern declaration for deviceRTL to target_interface (omptarget.h is more natural, but doesn't work due to include order with debug.h) - Renames the fields everywhere to match the LLVM format used in DeviceRTL - Makes debug_level uint32_t everywhere (previously sometimes int32_t) Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D111069
-
- Oct 01, 2021
-
-
Jon Chesterfield authored
-
- Sep 30, 2021
-
-
Jon Chesterfield authored
Use enum for execution mode. This is partly a port from ROCm and partly a port from D110029. Attempted to make the same choices as ROCm as far as comments etc go to reduce the merge conflicts. There is some cleanup warranted here - in particular I like the cuda patch factoring out the comparisons into named variables - but I'd like to leave that for a follow up patch, keeping this one minimal. Reviewed By: carlo.bertolli Differential Revision: https://reviews.llvm.org/D110845
-
- Sep 29, 2021
-
-
Dhruva Chakrabarti authored
[libomptarget] [amdgpu] After a kernel dispatch packet is published, its contents must not be accessed. Fixes: SWDEV-275232 (With contributions from Ammar Elwazir, Laurent Morichetti, and Tony Tye) The current code is racy. After the packet is submitted, the GPU will increment the read index. If this wraps around before the memory is read from it'll refer to a signal from an unrelated packet. Change avoids reading from the packet post-submission. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D110679
-
- Sep 27, 2021
-
-
Jon Chesterfield authored
-
Jon Chesterfield authored
-
Pushpinder Singh authored
Keeping all the checks in one place for future simplification. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D110513
-
Pushpinder Singh authored
-
Jon Chesterfield authored
-
- Sep 26, 2021
-
-
Jon Chesterfield authored
Store queues in unique_ptr so they are destroyed when the global DeviceInfo is. Currently they leak which raises an assert in debug builds of hsa. Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D109511
-
- Sep 09, 2021
-
-
Jon Chesterfield authored
The hsa library must be initialized before any calls into it and destructed after the last call into it. There have been a number of bugs in this area related to member variables which would like to use raii to manage resources acquired from hsa. This patch moves the init/shutdown of hsa into a class, such that when used as the first member variable (could be a base), the lifetime of other member variables are reliably scoped within it. This will allow other classes to use raii reliably when used as member variables within the global. Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D109512
-
Jon Chesterfield authored
-
- Sep 02, 2021
-
-
Jon Chesterfield authored
Use the same debug print as the rest of libomptarget plugins with the same environment control. Also drop the max queue size debugging hook as I don't believe it is still in use, can bring it back near the rest of the env handling in rtl.cpp if someone objects. That makes most of rt.h and all of utils.cpp unused. Clean that up and simplify control flow in a couple of places. Behaviour change is that debug prints that used to use the old environment variable now use the new one and print in slightly different format, and the removal of the max queue size variable. Reviewed By: pdhaliwal Differential Revision: https://reviews.llvm.org/D108784
-
- Sep 01, 2021
-
-
Jon Chesterfield authored
-
- Aug 27, 2021
-
-
Jon Chesterfield authored
Lets wavefront size be 32 for amdgpu openmp, as well as 64. Fixes up as little as possible to pass that through the libraries. This change is end to end, as opposed to updating clang/devicertl/plugin separately. It can be broken up for review/commit if preferred. Posting as-is so that others with a gfx10 can try it out. It works roughly as well as gfx9 for me, but there are probably bugs remaining as well as the todo: for letting grid values vary more. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D108708
-
- Aug 26, 2021
-
-
Jon Chesterfield authored
-
Jon Chesterfield authored
-
- Aug 25, 2021
-
-
Jon Chesterfield authored
-
Jon Chesterfield authored
Move most debug printing in rtl.cpp behind DP() macro Adjust the print output for gpu arch mismatch when the architectures match Convert an assert into graceful failure Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108562
-
Jon Chesterfield authored
-
- Aug 24, 2021
-
-
Pushpinder Singh authored
With uses of g_atl_machine gone, a significant portion of dead code has been removed. This patch depends on D104691 and D104695. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D104696
-
- Aug 19, 2021
-
-
Jon Chesterfield authored
[nfc] Replaces enum indices into an array with a struct. Named the fields to match the enum, leaves memory layout and initialization unchanged. Motivation is to later safely remove dead fields and replace redundant ones with (compile time) computation. It should also be possible to factor some common fields into a base and introduce a gfx10 amdgpu instance with less duplication than the arrays of integers require. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D108339
-
- Aug 08, 2021
-
-
Dimitry Andric authored
On FreeBSD, the `environ` symbol is undefined at link time for shared libraries, but resolved by the dynamic linker at runtime. Therefore, allow the symbol to be undefined when creating a shared library, by using the `--allow-shlib-undefined` linker flag, instead of `-z defs` (a.k.a `--no-undefined`). Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D107698
-
- Aug 06, 2021
-
-
Dimitry Andric authored
On FreeBSD, the system `<libelf.h>` already declares `struct Elf_Note` indirectly (via `<sys/elf_common.h>`). This results in compile errors when building the libomptarget amdgpu plugin. Avoid redeclaring `struct Elf_Note` on FreeBSD to fix the errors. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D107661
-
- Jul 29, 2021
-
-
Jon Chesterfield authored
-
- Jul 26, 2021
-
-
Jon Chesterfield authored
Default to building the amdgpu plugin to use dlopen when hsa is not found instead of disabling it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106600
-
Jon Chesterfield authored
If hsa_init fails, subsequent calls into hsa are not safe. Except for hsa_init, but we don't retry on failure. This patch: - deletes a print that called into hsa to ask why it can't call into hsa - drops a merge conflict block next to that print - reliably initializes number of devices to zero - skips the plugin destructor contents if the constructor failed to init hsa Tested by making hsa_init return error, and by forcing the dynamic library use which was then deleted from disk. Before this patch, both segv. After it, friendly message about offloading being unavailable. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106774
-
- Jul 25, 2021
-
-
Jon Chesterfield authored
Inaccurate error handling around hsa_init This reverts commit e30b3b23.
-
Jon Chesterfield authored
Default to building the amdgpu plugin to use dlopen when hsa is not found instead of disabling it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106600
-