  1. Nov 04, 2021
  2. Nov 03, 2021
  3. Oct 30, 2021
    • [OpenMP][DeviceRTL] Fixed an issue that causes hang in SU3 · 025f5492
      Shilei Tian authored
      The synchronization at the end of a parallel region does not guarantee that
      all threads have exited the scope. As a result, the assertions right after it
      might be hit, and the `state::assumeInitialState(IsSPMD)` in
      `__kmpc_target_deinit` may not hold either. We can either add a synchronization
      right after the parallel region or remove the assertions and assumptions. Here
      we choose the former, as those assertions and assumptions can help optimizations.
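
The effect of the extra synchronization can be sketched on the host with ordinary C++ threads. This is only an analogy, not the DeviceRTL code: the `Barrier` class and `runRegionWithTrailingBarrier` are hypothetical names, and the mutex-based barrier merely stands in for the device-side synchronization.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier (assumption: stands in for the device barrier).
class Barrier {
public:
  explicit Barrier(int Count) : Threshold(Count), Waiting(0), Generation(0) {}

  void wait() {
    std::unique_lock<std::mutex> Lock(Mutex);
    const unsigned Gen = Generation;
    if (++Waiting == Threshold) {
      // Last arrival: reset the counter and release everyone.
      Waiting = 0;
      ++Generation;
      CV.notify_all();
    } else {
      CV.wait(Lock, [&] { return Gen != Generation; });
    }
  }

private:
  std::mutex Mutex;
  std::condition_variable CV;
  int Threshold;
  int Waiting;
  unsigned Generation;
};

// Runs a mock "parallel region" and returns how many threads had left the
// region scope when thread 0 evaluated its post-region assertion.
int runRegionWithTrailingBarrier(int NumThreads) {
  Barrier AfterRegion(NumThreads);
  std::atomic<int> Exited{0};
  int Observed = -1;
  std::vector<std::thread> Threads;
  for (int Tid = 0; Tid < NumThreads; ++Tid) {
    Threads.emplace_back([&, Tid] {
      {
        // Mock parallel region body (no work in this sketch).
      }
      Exited.fetch_add(1); // this thread has left the region scope
      AfterRegion.wait();  // the added synchronization after the region
      if (Tid == 0) {
        // Holds only because every thread passed the barrier first.
        assert(Exited.load() == NumThreads);
        Observed = Exited.load();
      }
    });
  }
  for (std::thread &T : Threads)
    T.join();
  return Observed;
}
```

Without the `AfterRegion.wait()` call, thread 0 could reach the assertion while other threads are still inside the region, which is the situation the commit describes.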
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D112861
  4. Oct 29, 2021
  5. Oct 28, 2021
  6. Oct 26, 2021
  7. Oct 21, 2021
    • [libomptarget][DeviceRTL] Generalise and simplify cmakelists · a602c2b5
      Jon Chesterfield authored
      Step towards building the DeviceRTL for amdgpu.
      
      Mostly replaces cuda-specific toolchain finding logic with the
      generic logic currently found in the amdgpu deviceRTL cmake. Also
      deletes dead code and changes the default to build on systems
      without cuda installed, as the library doesn't use cuda and the
      amdgpu-only systems generally won't have cuda installed.
      
      Reviewed By: Meinersbur
      
      Differential Revision: https://reviews.llvm.org/D111983
  8. Oct 19, 2021
  9. Oct 09, 2021
  10. Oct 08, 2021
  11. Oct 07, 2021
    • [libomptarget] Move device environment to shared header, remove divergence · 0c554a47
      Jon Chesterfield authored
      Follow on to D110006, related to D110957
      
      Where implementations have diverged, this resolves them to match the new DeviceRTL:
      
      - replaces definitions of this struct in deviceRTL and plugins with include
      - changes the dynamic_shared_size field from D110006 to 32 bits
      - handles stdint being unavailable in DeviceRTL
      - adds a zero initializer for the field to amdgpu
      - moves the extern declaration for deviceRTL to target_interface
        (omptarget.h is more natural, but doesn't work due to include order
        with debug.h)
      - renames the fields everywhere to match the LLVM format used in DeviceRTL
      - makes debug_level uint32_t everywhere (previously sometimes int32_t)
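
A minimal sketch of what a single shared definition buys, assuming a struct of four 32-bit fields. The struct name, the field names, and the `roundTrip` helper are illustrative assumptions, not copied from the header this commit adds; the point is only that host plugin and device code agree on one fixed layout when both include the same definition.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical shared device-environment struct (names are assumptions).
// Every field is a fixed-width 32-bit unsigned integer, so the layout does
// not depend on which side (plugin or device runtime) compiles it.
struct DeviceEnvironmentSketch {
  uint32_t DebugKind;      // debug level, unsigned on both sides
  uint32_t NumDevices;
  uint32_t DeviceNum;
  uint32_t DynamicMemSize; // dynamic shared memory size, 32 bits
};

// Compile-time checks that a raw byte copy between host and device is sound.
static_assert(std::is_trivially_copyable<DeviceEnvironmentSketch>::value,
              "environment must be copyable as raw bytes");
static_assert(sizeof(DeviceEnvironmentSketch) == 4 * sizeof(uint32_t),
              "no padding expected between the 32-bit fields");

// Simulates a plugin writing the environment into a device symbol as bytes
// and the device runtime reading it back through the same definition.
inline DeviceEnvironmentSketch roundTrip(const DeviceEnvironmentSketch &Host) {
  unsigned char Bytes[sizeof(DeviceEnvironmentSketch)];
  std::memcpy(Bytes, &Host, sizeof(Host));
  DeviceEnvironmentSketch Device{};
  std::memcpy(&Device, Bytes, sizeof(Device));
  return Device;
}
```

If the two sides kept separate definitions and one of them used a wider or signed field, the byte copy would silently misinterpret values, which is the kind of divergence the commit removes.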
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D111069
  12. Sep 27, 2021
  13. Sep 23, 2021
    • [OpenMP] Fix data-race in new device RTL · d83ca624
      Joseph Huber authored
      This patch fixes a data race observed when using the new device runtime
      library. The internal control variable for the parallel level is read in
      the `__kmpc_parallel_51` function while it could potentially be written
      by other threads. This causes data corruption and nondeterministic
      behaviour in the runtime. The patch fixes this by adding
      an explicit synchronization before the region starts.
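
The idea of synchronizing before the region starts can be illustrated on the host as follows. This is a hedged analogy rather than the runtime's code: `runWithStartGate` and the local `ParallelLevel` variable are hypothetical, and a `std::promise`/`std::shared_future` pair stands in for the device-side synchronization.

```cpp
#include <future>
#include <thread>
#include <vector>

// Returns how many workers observed the updated value of a plain
// (non-atomic) variable that mimics the parallel-level control variable.
int runWithStartGate(int NumWorkers, int NewLevel) {
  int ParallelLevel = 0; // hypothetical stand-in for the ICV
  std::promise<void> Gate;
  std::shared_future<void> Ready = Gate.get_future().share();
  std::vector<int> Seen(NumWorkers, -1);
  std::vector<std::thread> Workers;

  for (int I = 0; I < NumWorkers; ++I) {
    Workers.emplace_back([&, I, Ready] {
      Ready.wait();            // synchronize before the region starts
      Seen[I] = ParallelLevel; // safe: the write happened before set_value()
    });
  }

  ParallelLevel = NewLevel; // the write that would otherwise race with reads
  Gate.set_value();         // release the workers only after the write

  for (std::thread &W : Workers)
    W.join();

  int Matches = 0;
  for (int Value : Seen)
    Matches += (Value == NewLevel) ? 1 : 0;
  return Matches;
}
```

Dropping the `Ready.wait()` call would let workers read `ParallelLevel` while the main thread is still writing it, which is a data race with undefined behaviour in C++ and mirrors the nondeterminism described above.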
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D110366
  14. Sep 22, 2021
  15. Sep 21, 2021
    • [OpenMP] Add thread ID function into new RTL · e95731cc
      Joseph Huber authored
      The new device runtime library lacks the
      `kmpc_get_hardware_thread_id_in_block` function, which is used
      by the SPMDzation optimization. The optimization could introduce this
      call, which then caused a linking error because the function was
      not present. This patch adds support for this runtime call.
      
      Reviewed By: tianshilei1992
      
      Differential Revision: https://reviews.llvm.org/D110195
  16. Sep 18, 2021