  1. Oct 07, 2021
    • [libomptarget] Move device environment to shared header, remove divergence · 0c554a47
      Jon Chesterfield authored
      Follow on to D110006, related to D110957
      
      Where implementations have diverged, this resolves them to match the new DeviceRTL
      
      - replaces definitions of this struct in deviceRTL and plugins with include
      - changes the dynamic_shared_size field from D110006 to 32 bits
      - handles stdint being unavailable in DeviceRTL
      - adds a zero initializer for the field to amdgpu
      - moves the extern declaration for deviceRTL to target_interface
        (omptarget.h is more natural, but doesn't work due to include order
        with debug.h)
      - renames the fields everywhere to match the LLVM format used in DeviceRTL
      - makes debug_level uint32_t everywhere (previously sometimes int32_t)
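
The list above can be pictured as a single struct definition that both the device runtime and the host plugins include. The following is only a minimal sketch of that idea: the struct name, the field names, and the `makeDeviceEnvironment` helper are illustrative assumptions, not copied from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical contents of a shared header. One definition is included by
// the device runtime and by every plugin. Names are illustrative only.
struct DeviceEnvironmentTy {
  uint32_t DebugKind;      // debug level, unsigned everywhere
  uint32_t NumDevices;
  uint32_t DeviceNum;
  uint32_t DynamicMemSize; // dynamic shared memory size, 32 bits wide
};

// Every field is 32 bits wide, so no padding is expected and the layout
// is the same on the host and on the device.
static_assert(sizeof(DeviceEnvironmentTy) == 4 * sizeof(uint32_t),
              "unexpected padding in DeviceEnvironmentTy");

// A plugin can zero-initialize every field, including the dynamic size,
// before filling in the values it knows about.
inline DeviceEnvironmentTy makeDeviceEnvironment(uint32_t NumDevices,
                                                 uint32_t DeviceNum) {
  DeviceEnvironmentTy Env{}; // value-initialization zeroes all fields
  Env.NumDevices = NumDevices;
  Env.DeviceNum = DeviceNum;
  assert(Env.DynamicMemSize == 0 && "field starts out zeroed");
  return Env;
}
```

Using fixed-width unsigned fields throughout is what lets the `static_assert` pin the layout down in one place.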
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D111069
  2. Oct 04, 2021
    • [openmp] [elf_common] Fix linking against LLVM dylib · 0873b9be
      Michał Górny authored
      The hand-rolled linking logic in elf_common does not account for
      the possibility of using LLVM dylib rather than a dozen static
      libraries.  Since it does not seem to be easily convertible
      to add_llvm_library, just hand-roll support for LLVM_LINK_LLVM_DYLIB.
      This is necessary to support stand-alone builds against installed LLVM.
      
      Differential Revision: https://reviews.llvm.org/D111038
  3. Oct 01, 2021
  4. Sep 30, 2021
  5. Sep 29, 2021
    • [libomptarget] [amdgpu] After a kernel dispatch packet is published, its contents must not be accessed. · 62262702
      Dhruva Chakrabarti authored
      
      Fixes: SWDEV-275232 (With contributions from Ammar Elwazir, Laurent Morichetti, and Tony Tye)
      
      The current code is racy. After the packet is submitted, the GPU will increment the read index. If the index wraps around before the packet memory is read, the read will refer to a signal from an unrelated packet. This change avoids reading from the packet after submission.
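
The general pattern behind this kind of fix is to copy whatever is needed out of the slot before publishing it. The sketch below illustrates that pattern in portable C++ under stated assumptions: `Packet`, `Published`, and `submit` are hypothetical stand-ins and do not use the HSA API.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one slot of a dispatch queue.
struct Packet {
  uint64_t CompletionSignal = 0;      // handle the submitter will wait on
  std::atomic<bool> Published{false}; // set once the consumer owns the slot
};

// Fill the slot, save the signal handle, then publish. The caller waits on
// the returned copy and never reads the slot again.
inline uint64_t submit(Packet &Slot, uint64_t Signal) {
  assert(!Slot.Published.load() && "slot must not already be in flight");
  Slot.CompletionSignal = Signal;

  // Copy the handle while this thread still owns the slot.
  const uint64_t Saved = Slot.CompletionSignal;

  // Publishing hands ownership to the consumer, which may recycle the slot
  // for an unrelated packet at any time after this store.
  Slot.Published.store(true, std::memory_order_release);

  // Only the local copy is used from here on.
  return Saved;
}
```

If the slot is later overwritten by another packet, the value returned by `submit` is unaffected because it was taken before publication.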
      
      Reviewed By: JonChesterfield
      
      Differential Revision: https://reviews.llvm.org/D110679
  6. Sep 27, 2021
  7. Sep 26, 2021
  8. Sep 23, 2021
    • [OpenMP] Fix data-race in new device RTL · d83ca624
      Joseph Huber authored
      This patch fixes a data-race observed when using the new device runtime
      library. The internal control variable for the parallel level is read in
      the `__kmpc_parallel_51` function while it could potentially be written
      by other threads. This causes data corruption and
      nondeterministic behaviour in the runtime. This patch fixes this by adding
      an explicit synchronization before the region starts.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D110366
  9. Sep 22, 2021
    • [OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit` · 423d34f7
      Shilei Tian authored
      
      This is a follow-up of D110029, which uses a bitset to indicate the execution mode. This patch makes the changes in the function call.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D110279
    • [OpenMP] Fix KeepAlive usage · 60a40cf3
      Joseph Huber authored
      Summary:
      Functions were called the wrong way around, so the symbol was not kept
      alive.
    • [OpenMP] Add function tracing debugging to device RTL · 277b681e
      Joseph Huber authored
      This patch adds support for an RAII struct that will print function
      traces when placed inside of a function declaration. Each successive
      call will increase the indentation to make it easier to visually
      inspect.
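
A host-side sketch of such an RAII tracer is shown below. It is an illustration of the technique only: `FunctionTracer`, `inner`, and `outer` are hypothetical names, and the output stream parameter is an assumption made so the trace can be inspected.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical tracer. Constructing it prints an "enter" line and increases
// the indentation; destroying it restores the indentation and prints "exit".
class FunctionTracer {
public:
  FunctionTracer(std::ostream &Out, const char *Fn) : OS(Out), Name(Fn) {
    OS << std::string(2 * depth(), ' ') << "enter " << Name << '\n';
    ++depth();
  }
  ~FunctionTracer() {
    assert(depth() > 0 && "unbalanced tracer scopes");
    --depth();
    OS << std::string(2 * depth(), ' ') << "exit " << Name << '\n';
  }
  FunctionTracer(const FunctionTracer &) = delete;
  FunctionTracer &operator=(const FunctionTracer &) = delete;

private:
  // Per-thread nesting depth shared by all tracer objects on that thread.
  static unsigned &depth() {
    thread_local unsigned Depth = 0;
    return Depth;
  }
  std::ostream &OS;
  const char *Name;
};

// Two example functions, each with a tracer as the first local variable.
inline void inner(std::ostream &OS) { FunctionTracer Trace(OS, "inner"); }

inline void outer(std::ostream &OS) {
  FunctionTracer Trace(OS, "outer");
  inner(OS);
}
```

Calling `outer` on a string stream produces the lines for `inner` indented by two spaces between the `enter outer` and `exit outer` lines.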
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D110202
    • [OpenMP][Offloading] Use bitset to indicate execution mode instead of value · ca999f71
      Shilei Tian authored
      The execution mode of a kernel is stored in a global variable, whose value means:
      - 0 - SPMD mode
      - 1 - generic mode
      - 2 - SPMD mode execution with generic mode semantics
      
      We are going to add support for SIMD execution mode. It will come with another
      execution mode, such as SIMD-generic mode. As a result, this value-based indicator
      is not flexible enough.
      
      This patch switches to a bitset-based solution to encode the execution mode. Each
      bit position means:
      [0] - generic mode
      [1] - SPMD mode
      [2] - SIMD mode (will be added later)
      
      In this way, `0x1` is generic mode, `0x2` is SPMD mode, and `0x3` is SPMD mode
      execution with generic mode semantics. In the future after we add the support for
      SIMD mode, `0b1xx` will be in SIMD mode.
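
The encoding described above can be sketched as a small flag enum. The bit positions follow the commit message; the enumerator and helper names are illustrative assumptions, and the SIMD bit is left out because it is only announced as future work.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative names. Bit 0 is generic mode and bit 1 is SPMD mode.
enum ExecModeFlags : int8_t {
  EXEC_MODE_GENERIC = 1 << 0,
  EXEC_MODE_SPMD = 1 << 1,
  // SPMD execution with generic mode semantics sets both bits.
  EXEC_MODE_GENERIC_SPMD = EXEC_MODE_GENERIC | EXEC_MODE_SPMD,
};

// True when the kernel executes in SPMD mode (bit 1 set).
inline bool isSPMD(int8_t Mode) { return (Mode & EXEC_MODE_SPMD) != 0; }

// True when generic mode semantics apply (bit 0 set).
inline bool hasGenericSemantics(int8_t Mode) {
  return (Mode & EXEC_MODE_GENERIC) != 0;
}
```

Because each property is a separate bit, a later mode can be added as another bit without renumbering the existing values.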
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D110029
    • [OpenMP] Make sure the Thread ID function is not removed · 1cf86df8
      Joseph Huber authored
      Summary:
      The thread ID function was reintroduced in D110195, but could
      potentially be removed by the optimizer. Make the function noinline to
      preserve the call sites and add it to the externalization RAII so its
      definition is not removed by the attributor.
  10. Sep 21, 2021
    • [OpenMP] Add thread ID function into new RTL · e95731cc
      Joseph Huber authored
      The new device runtime library currently lacks the
      `__kmpc_get_hardware_thread_id_in_block` function, which is used
      when doing the SPMDzation optimization. This call would be introduced
      through the optimization and then cause a linking error because it was
      not present. This patch adds support for this runtime call.
      
      Reviewed By: tianshilei1992
      
      Differential Revision: https://reviews.llvm.org/D110195
    • Revert "[OpenMP] Codegen aggregate for outlined function captures" · ac90dfc4
      Giorgis Georgakoudis authored
      This reverts commit 1d66649a.
      
      Revert to fix AMD GPU issue.
    • [OpenMP] Codegen aggregate for outlined function captures · 1d66649a
      Giorgis Georgakoudis authored
      Parallel regions are outlined as functions, with captured variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic, since there is a variable number of arguments to forward to the outlined function; (2) wrapping and unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) on the number of arguments; (3) forwarded arguments must be cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct that is passed to the fork_call.
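
The aggregation idea can be illustrated with a host-only sketch in which the fork call takes a single pointer instead of a variadic argument list. Everything named here (`Captures`, `outlinedRegion`, `forkCall`) is a hypothetical stand-in, and the loop runs the "threads" sequentially rather than using the real OpenMP runtime.

```cpp
#include <cassert>

// Captured variables of one hypothetical parallel region, packed together.
struct Captures {
  int *Sum;  // captured by reference
  int Scale; // captured by value
};

// Outlined region body. The signature is the same no matter how many
// variables the region captures.
inline void outlinedRegion(int ThreadId, void *Aggregate) {
  auto *C = static_cast<Captures *>(Aggregate);
  *C->Sum += C->Scale * ThreadId;
}

using OutlinedFn = void (*)(int, void *);

// Stand-in for a fork call: not variadic, it forwards one pointer.
inline void forkCall(OutlinedFn Fn, int NumThreads, void *Aggregate) {
  assert(Fn && "an outlined function is required");
  for (int T = 0; T < NumThreads; ++T) // sequential for the sketch
    Fn(T, Aggregate);
}
```

With this shape the runtime side never wraps or unwraps individual arguments, so there is no fixed limit on the number of captures.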
      
      Reviewed By: jdoerfert, jhuber6
      
      Differential Revision: https://reviews.llvm.org/D102107
  11. Sep 20, 2021
  12. Sep 18, 2021
  13. Sep 17, 2021
  14. Sep 15, 2021
  15. Sep 10, 2021
  16. Sep 09, 2021
    • [libomptarget][amdgpu] Precisely manage hsa lifetime · 6760234e
      Jon Chesterfield authored
      The hsa library must be initialized before any calls into it and
      destructed after the last call into it. There have been a number of bugs in
      this area related to member variables which would like to use raii to manage
      resources acquired from hsa.
      
      This patch moves the init/shutdown of hsa into a class, such that when used as
      the first member variable (could be a base), the lifetimes of other member
      variables are reliably scoped within it. This will allow other classes to use
      raii reliably when used as member variables within the global.
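
This relies on the C++ rule that non-static members are constructed in declaration order and destroyed in reverse order. The sketch below shows the rule with hypothetical types (`RuntimeLifetime`, `Resource`, `PluginState`) and an event log in place of real hsa calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Event log so the construction and destruction order can be inspected.
inline std::vector<std::string> &events() {
  static std::vector<std::string> Log;
  return Log;
}

// Tracks whether the (simulated) library is currently initialized.
inline bool &runtimeAlive() {
  static bool Alive = false;
  return Alive;
}

// Stand-in for the class that owns library init and shutdown.
struct RuntimeLifetime {
  RuntimeLifetime() { runtimeAlive() = true; events().push_back("init"); }
  ~RuntimeLifetime() { events().push_back("shutdown"); runtimeAlive() = false; }
};

// Stand-in for a member that acquires a resource from the library.
struct Resource {
  Resource() {
    assert(runtimeAlive() && "library must be initialized first");
    events().push_back("acquire");
  }
  ~Resource() {
    assert(runtimeAlive() && "library must still be initialized");
    events().push_back("release");
  }
};

struct PluginState {
  RuntimeLifetime Runtime; // first member: constructed first, destroyed last
  Resource Queue;          // lifetime nested inside Runtime's lifetime
};
```

Swapping the two members of `PluginState` would trip the assertions, which is the class of bug the ordering is meant to rule out.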
      
      Reviewed By: pdhaliwal
      
      Differential Revision: https://reviews.llvm.org/D109512
    • [openmp] No longer use LIBRARY_PATH to find devicertl · 2a581710
      Jon Chesterfield authored
      Given D109057, change test runner to use the libomptarget-x-bc-path
      argument instead of the LIBRARY_PATH environment variable to find the device
      library.
      
      Also drop the use of LIBRARY_PATH environment variable as it is far
      too easy to pull in the device library from an unrelated toolchain by accident
      with the current setup. No loss in flexibility to developers as the clang
      commandline used here is still available.
      
      Reviewed By: jdoerfert, tianshilei1992
      
      Differential Revision: https://reviews.llvm.org/D109061
    • d642156f
  17. Sep 07, 2021
  18. Sep 03, 2021
  19. Sep 02, 2021