  1. Sep 02, 2021
    • [libomptarget][amdgpu] Drop env variables · 3153bdd5
      Jon Chesterfield authored
      Use the same debug print as the rest of the libomptarget plugins, with
      the same environment control. Also drop the max queue size debugging hook, as
      I don't believe it is still in use. It can be brought back near the rest of
      the env handling in rtl.cpp if someone objects.
      
      That makes most of rt.h and all of utils.cpp unused. Clean that up and simplify
      control flow in a couple of places.
      
      The behaviour change is that debug prints that used the old environment
      variable now use the new one and print in a slightly different format, and
      that the max queue size variable is removed.
      
      Reviewed By: pdhaliwal
      
      Differential Revision: https://reviews.llvm.org/D108784
  2. Sep 01, 2021
  3. Aug 27, 2021
    • [openmp][amdgpu] Initial gfx10 offloading implementation · 78f92c38
      Jon Chesterfield authored
      Lets wavefront size be 32 for amdgpu openmp, as well as 64.
      
      Fixes up as little as possible to pass that through the libraries. This change
      is end to end, as opposed to updating clang/devicertl/plugin separately. It can
      be broken up for review/commit if preferred. Posting as-is so that others with
      a gfx10 can try it out. It works roughly as well as gfx9 for me, but there are
      probably bugs remaining as well as the todo: for letting grid values vary more.
      
      Reviewed By: ronlieb
      
      Differential Revision: https://reviews.llvm.org/D108708
  4. Aug 26, 2021
  5. Aug 25, 2021
  6. Aug 24, 2021
  7. Aug 19, 2021
    • [openmp][nfc] Replace OMPGridValues array with struct · 77579b99
      Jon Chesterfield authored
      [nfc] Replaces enum indices into an array with a struct. Names the
      fields to match the enum and leaves memory layout and initialization unchanged.
      
      Motivation is to later safely remove dead fields and replace redundant ones
      with (compile time) computation. It should also be possible to factor some
      common fields into a base and introduce a gfx10 amdgpu instance with less
      duplication than the arrays of integers require.
      
      Reviewed By: ronlieb
      
      Differential Revision: https://reviews.llvm.org/D108339
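
The array-to-struct change described in this commit can be pictured with a minimal sketch. The field names and values below are hypothetical placeholders, not the contents of the real OMPGridValues header; the point is only that a struct with fields declared in enum order keeps the same layout and initializer list while making each access a named member.

```cpp
#include <cstddef>

namespace Before {
// Enum constants used as indices into a flat array of integers.
enum GridIndex { GV_Warp_Size, GV_Max_Teams, GV_Max_WG_Size, GV_Count };
// Hypothetical values, chosen only for illustration.
constexpr unsigned Values[GV_Count] = {64, 128, 1024};
} // namespace Before

namespace After {
// Fields are named after the enum constants and declared in the same order,
// so the brace initializer and the memory layout are unchanged.
struct GridValue {
  unsigned GV_Warp_Size;
  unsigned GV_Max_Teams;
  unsigned GV_Max_WG_Size;
};
constexpr GridValue Values = {64, 128, 1024};
} // namespace After

// Same size as the array it replaces.
static_assert(sizeof(After::GridValue) == sizeof(Before::Values),
              "struct should occupy the same storage as the array");
// Each field sits where the matching array element used to be.
static_assert(offsetof(After::GridValue, GV_Max_Teams) ==
                  Before::GV_Max_Teams * sizeof(unsigned),
              "field offset should match the old array index");
```

With named fields, an unused entry can later be deleted without renumbering indices, and a derived quantity can become a constexpr function of another field instead of a stored duplicate.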
  8. Aug 08, 2021
  9. Aug 06, 2021
  10. Jul 29, 2021
  11. Jul 26, 2021
    • [libomptarget] Build amdgpu plugin without hsa · 2a613a77
      Jon Chesterfield authored
      Default to building the amdgpu plugin to use dlopen when hsa is
      not found instead of disabling it.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D106600
    • [libomptarget][amdgpu] More robust handling of failure to init HSA · dd0b463d
      Jon Chesterfield authored
      If hsa_init fails, subsequent calls into hsa are not safe. The exception is
      hsa_init itself, but we don't retry on failure.
      
      This patch:
      - deletes a print that called into hsa to ask why it can't call into hsa
      - drops a merge conflict block next to that print
      - reliably initializes number of devices to zero
      - skips the plugin destructor contents if the constructor failed to init hsa
      
      Tested by making hsa_init return an error, and by forcing use of the dynamic
      library, which was then deleted from disk. Before this patch, both cases
      segfault. After it, both print a friendly message about offloading being
      unavailable.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D106774
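
The pattern in the bullet list above (record whether initialization succeeded, keep the device count at zero on failure, and skip teardown if initialization never happened) can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: `PluginState`, its members, and the injected `Init`/`Shutdown` callbacks are hypothetical stand-ins for the real global object and the HSA calls.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for the plugin's global state object.
struct PluginState {
  bool InitOk = false;     // set only if runtime initialization succeeded
  int NumberOfDevices = 0; // reliably zero unless initialization succeeded
  std::function<void()> Shutdown;

  // Init and ShutdownFn are injected so the sketch runs without a GPU runtime.
  PluginState(const std::function<bool()> &Init,
              std::function<void()> ShutdownFn)
      : Shutdown(std::move(ShutdownFn)) {
    InitOk = Init();
    if (!InitOk)
      return; // make no further runtime calls after a failed init
    NumberOfDevices = 1; // placeholder for a real device query
  }

  ~PluginState() {
    if (!InitOk)
      return; // constructor failed, so there is nothing safe to tear down
    Shutdown();
  }
};
```

Constructing the object with an `Init` callback that returns false leaves `NumberOfDevices` at zero and never invokes `Shutdown`, which mirrors the "offloading unavailable" path described in the commit message.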
  12. Jul 25, 2021
  13. Jul 22, 2021
    • [libomptarget][amdgpu][nfc] Normalise license headers · 9e05c084
      Jon Chesterfield authored
      Reviewed By: gregrodgers, jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D106581
    • [libomptarget][amdgpu][nfc] Replace use of gelf.h with libelf.h · 14e34a83
      Jon Chesterfield authored
      AMDGPU can assume Elf64, so it doesn't need to abstract over Elf32.
      
      Drop a few other unused headers at the same time. Now only llvm elf
      and libelf are used by the plugin.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D106579
    • [libomptarget][amdgpu] Implement dlopen of libhsa · 1a965706
      Jon Chesterfield authored
      AMDGPU plugin equivalent of D95155, build without HSA installed locally
      
      Compiles a new file, plugins/amdgpu/dynamic_hsa/hsa.cpp, to an object file that
      exposes the same symbols that the plugin presently uses from hsa. The object
      file contains dlopen of hsa and cached dlsym calls. Also provides header files
      corresponding to the subset that is used.
      
      This is behind a feature flag, LIBOMPTARGET_FORCE_DLOPEN_LIBHSA, default off.
      That allows developers to build against the dlopen/dlsym implementation, e.g.
      while testing this mode.
      
      Enabling by default will cause this plugin to build on a wider variety of
      machines than it does at present so may break some CI builds. That risk can
      be minimised by reviewing the header dependencies of the library and ensuring
      it doesn't use any libraries that are not already used by libomptarget.
      
      Separating the implementation from enabling by default in case the latter needs
      to be rolled back after wider CI results.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D106559
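
As a rough illustration of "dlopen of hsa and cached dlsym calls", the sketch below resolves a symbol on first use, caches the result (including a failed lookup), and forwards through it. It is not the contents of dynamic_hsa/hsa.cpp: `CachedSymbol`, `callInit`, the status values, and the library and symbol names in the usage note are all hypothetical, and only the POSIX `dlopen`/`dlsym` calls are real APIs.

```cpp
#include <dlfcn.h>

// Hypothetical status codes standing in for the runtime's own status type.
enum SketchStatus { SKETCH_SUCCESS = 0, SKETCH_ERROR = 1 };

// Resolves one symbol from one shared library on first use and caches the
// outcome, so repeated calls do not repeat the dlopen/dlsym work.
struct CachedSymbol {
  const char *Library;
  const char *Symbol;
  bool Resolved = false;   // true once a lookup has been attempted
  void *Address = nullptr; // stays null if the library or symbol is missing

  void *get() {
    if (!Resolved) {
      Resolved = true;
      if (void *Handle = dlopen(Library, RTLD_NOW))
        Address = dlsym(Handle, Symbol);
    }
    return Address;
  }
};

// Forwarding stub for a function of type int(void): returns an error instead
// of crashing when the library is not installed.
inline int callInit(CachedSymbol &Sym) {
  using InitFn = int (*)();
  void *Address = Sym.get();
  if (!Address)
    return SKETCH_ERROR;
  return reinterpret_cast<InitFn>(Address)();
}
```

For example, `CachedSymbol Sym{"libhypothetical_runtime.so", "hypothetical_init"};` followed by `callInit(Sym)` returns `SKETCH_ERROR` on a machine without that library, which is the property that lets the plugin build and load where HSA is absent. On older glibc versions the program may need to link with `-ldl`.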
    • [OpenMP] Fix warnings for uninitialized block counts · a158d366
      Joseph Huber authored
      Summary:
      Fixes some warnings given for uninitialized block counts if the execution mode
      is not recognized. This shouldn't happen in practice because the execution mode
      is checked when it's read from the device.
    • [libomptarget][amdgpu][nfc] Drop dead signal pool setup · dc1f6f8b
      Jon Chesterfield authored
      This class is instantiated once in rtl.cpp before hsa_init is
      called. The hsa_signal_create call therefore fails leaving the pool empty.
      
      This signal pool is a legacy from ATMI where it was constructed after hsa_init.
      Moving the state into the rtl.cpp global class disabled the initial populating
      of the pool without noticeably changing performance. Just rechecked with a fix
      that allocates the signals after hsa_init and that also doesn't noticeably
      change performance.
      
      This patch therefore drops the initialisation. Only change from main is to
      drop a DEBUG_PRINT statement that would say the pool initial size is zero.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D106515
    • [OpenMP] Add new execution mode for SPMD execution with Generic semantics · 7d576392
      Joseph Huber authored
      Qualified kernels can be transformed from generic-mode to SPMD mode using an
      optimization in OpenMPOpt. This patch introduces a new execution mode to
      indicate kernels that have been transformed from generic-mode to SPMD-mode.
      These kernels have SPMD-mode execution, but need generic-mode semantics for
      scheduling the blocks and threads. Without this far too few blocks will be
      scheduled for a generic region as SPMD mode expects the trip count to be
      divided by the number of threads.
      
      Reviewed By: ggeorgakoudis
      
      Differential Revision: https://reviews.llvm.org/D106460
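
The scheduling difference described above can be shown with a small worked example. This is a sketch under assumptions, not the plugin's launch code: the `ExecutionMode` enumerators and the `blocksForLoop` helper are hypothetical names. It assumes SPMD mode spreads loop iterations over every thread in a block, while generic semantics schedule one block per iteration.

```cpp
#include <cstdint>

// Hypothetical names for the three modes discussed in the commit message.
enum class ExecutionMode { Spmd, Generic, SpmdGeneric };

// Number of blocks to launch for a loop with TripCount iterations.
// ThreadsPerBlock must be nonzero.
inline uint64_t blocksForLoop(ExecutionMode Mode, uint64_t TripCount,
                              uint64_t ThreadsPerBlock) {
  switch (Mode) {
  case ExecutionMode::Spmd:
    // Iterations are divided among threads: ceil(TripCount / ThreadsPerBlock).
    return (TripCount + ThreadsPerBlock - 1) / ThreadsPerBlock;
  case ExecutionMode::Generic:
  case ExecutionMode::SpmdGeneric:
    // Generic semantics: one block per iteration, even though a transformed
    // kernel executes in SPMD fashion inside each block.
    return TripCount;
  }
  return 1; // unrecognized mode: still return an initialized value
}
```

With a trip count of 1000 and 128 threads per block, plain SPMD yields 8 blocks, whereas the generic and transformed modes yield 1000. Treating a transformed kernel as plain SPMD would therefore launch far too few blocks.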
  14. Jul 21, 2021
  15. Jul 06, 2021
  16. Jul 01, 2021
  17. Jun 30, 2021
  18. Jun 29, 2021
  19. Jun 28, 2021
    • [AMDGPU][Libomptarget] Collect allocatable memory pools using HSA · 20df2c70
      Pushpinder Singh authored
      The logic is similar to that of system.cpp, with one change: instead of
      adding all the memory pools to a device struct, it keeps only a single
      pool. The existing approach also always allocated memory on the first HSA
      pool found for a GPU.
      
      This depends on D104691. The goal of this series of patches is to remove
      the g_atl_machine global. The next patch will drop g_atl_machine entirely.
      
      Reviewed By: JonChesterfield
      
      Differential Revision: https://reviews.llvm.org/D104695
  20. Jun 25, 2021
  21. Jun 24, 2021
  22. Jun 22, 2021
  23. Jun 21, 2021
  24. Jun 16, 2021
  25. Jun 15, 2021
  26. Jun 10, 2021