  1. Dec 06, 2021
  2. Nov 29, 2021
    • OpenMP: Correctly query location for amdgpu-arch · 935abeaa
      Matt Arsenault authored
      This was trying to figure out the build path for amdgpu-arch, and
      making assumptions about where it is which were not working on my
      system. Whether a standalone build or not, we should have a proper
      imported target to get the location from.
      935abeaa
  3. Nov 23, 2021
    • [openmp][amdgpu] Make plugin robust to presence of explicit implicit arguments · ae5348a3
      Jon Chesterfield authored
      OpenMP (compiler) does not currently request any implicit kernel
      arguments. OpenMP (runtime) allocates and initialises a reasonable guess at
      the implicit kernel arguments anyway.
      
      This change makes the plugin check the number of explicit arguments, instead
      of all arguments, and puts the pointer to hostcall buffer in both the current
      location and at the offset expected when implicit arguments are added to the
      metadata by D113538.
      
      This is intended to keep things running while fixing the oversight in the
      compiler (in D113538). Once that patch lands, and a following one marks
      openmp kernels that use printf such that the backend emits an args element
      with the right type (instead of hidden_node), the over-allocation can be
      removed and the hardcoded 8*e+3 offset replaced with one read from the
      .offset of the corresponding metadata element.
      
      Reviewed By: estewart08
      
      Differential Revision: https://reviews.llvm.org/D114274
      ae5348a3
  4. Nov 19, 2021
  5. Oct 28, 2021
  6. Oct 23, 2021
  7. Oct 19, 2021
  8. Oct 09, 2021
  9. Oct 07, 2021
  10. Oct 01, 2021
  11. Sep 30, 2021
    • [libomptarget] Apply D110029 to amdgpu · b75a7481
      Jon Chesterfield authored
      Use enum for execution mode.
      
      This is partly a port from ROCm and partly a port from D110029. Attempted to
      make the same choices as ROCm as far as comments etc go to reduce the merge
      conflicts.
      
      There is some cleanup warranted here - in particular I like the cuda patch
      factoring out the comparisons into named variables - but I'd like to leave
      that for a follow up patch, keeping this one minimal.
      
      Reviewed By: carlo.bertolli
      
      Differential Revision: https://reviews.llvm.org/D110845
      b75a7481
  12. Sep 29, 2021
    • [libomptarget] [amdgpu] After a kernel dispatch packet is published, its... · 62262702
      Dhruva Chakrabarti authored
      [libomptarget] [amdgpu] After a kernel dispatch packet is published, its contents must not be accessed.
      
      Fixes: SWDEV-275232 (With contributions from Ammar Elwazir, Laurent Morichetti, and Tony Tye)
      
      The current code is racy. After the packet is submitted, the GPU will increment the read index. If the index wraps around before the packet memory is read, the read will refer to a signal from an unrelated packet. This change avoids reading from the packet after submission.
      
      Reviewed By: JonChesterfield
      
      Differential Revision: https://reviews.llvm.org/D110679
      62262702
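The race described in this commit can be illustrated with a minimal sketch. It is not the plugin's code and does not use the HSA API: `Packet`, `Queue`, and `submit` are hypothetical names, and the queue is a plain four-slot ring buffer. The point it shows is that the caller keeps its own copy of the completion signal before publishing, so nothing has to be read back from a slot that may already have been recycled.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical packet: only the field relevant to the race is modelled.
struct Packet {
  uint64_t CompletionSignal;
};

// Hypothetical ring-buffer queue with four slots that get reused.
struct Queue {
  Packet Slots[4] = {};
  std::atomic<uint64_t> WriteIndex{0};
};

// Fills a slot, publishes it, and returns the signal to wait on.
// The returned value comes from a local copy, never from the slot.
uint64_t submit(Queue &Q, uint64_t Signal) {
  const uint64_t Id = Q.WriteIndex.load(std::memory_order_relaxed);
  Packet &Slot = Q.Slots[Id % 4];
  Slot.CompletionSignal = Signal;

  // Keep a private copy before the packet becomes visible to the consumer.
  const uint64_t Saved = Signal;

  // Publish. From here on the slot may be consumed and reused at any time.
  Q.WriteIndex.store(Id + 1, std::memory_order_release);

  // Reading Slot.CompletionSignal here would be the racy pattern.
  return Saved;
}
```

In this sketch, once four further packets have been submitted the first slot holds a different signal, which is the situation the commit message warns about.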
  13. Sep 27, 2021
  14. Sep 26, 2021
  15. Sep 09, 2021
    • [libomptarget][amdgpu] Precisely manage hsa lifetime · 6760234e
      Jon Chesterfield authored
      The hsa library must be initialized before any calls into it and
      destructed after the last call into it. There have been a number of bugs in
      this area related to member variables which would like to use raii to manage
      resources acquired from hsa.
      
      This patch moves the init/shutdown of hsa into a class, such that when used as
      the first member variable (could be a base), the lifetimes of the other member
      variables are reliably scoped within it. This will allow other classes to use
      raii reliably when used as member variables within the global.
      
      Reviewed By: pdhaliwal
      
      Differential Revision: https://reviews.llvm.org/D109512
      6760234e
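The first-member idiom described in this commit relies on a C++ guarantee: non-static members are constructed in declaration order and destroyed in reverse. The sketch below demonstrates that with stand-in types. `RuntimeLifetime`, `Resource`, `PluginState`, and `runLifetimeDemo` are hypothetical names, and a boolean flag stands in for the real library initialization, which is not called here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for "is the runtime initialized"; an assumption of this sketch.
static bool RuntimeAlive = false;

// Records events so the nesting of lifetimes can be inspected.
static std::vector<std::string> &eventLog() {
  static std::vector<std::string> Log;
  return Log;
}

// Hypothetical class that owns init and shutdown of the runtime.
struct RuntimeLifetime {
  RuntimeLifetime() {
    RuntimeAlive = true;
    eventLog().push_back("runtime init");
  }
  ~RuntimeLifetime() {
    eventLog().push_back("runtime shutdown");
    RuntimeAlive = false;
  }
};

// Hypothetical RAII resource that is only valid while the runtime is alive.
struct Resource {
  Resource() {
    assert(RuntimeAlive && "acquired before runtime init");
    eventLog().push_back("resource acquire");
  }
  ~Resource() {
    assert(RuntimeAlive && "released after runtime shutdown");
    eventLog().push_back("resource release");
  }
};

// Members are constructed top to bottom and destroyed bottom to top, so
// the lifetime object brackets every member declared after it.
struct PluginState {
  RuntimeLifetime Lifetime; // must remain the first member
  Resource Queue;
};

// Creates and destroys one PluginState and returns the observed order.
std::vector<std::string> runLifetimeDemo() {
  eventLog().clear();
  { PluginState State; }
  return eventLog();
}
```

Moving `Resource Queue` above `RuntimeLifetime Lifetime` would trip both assertions, which is the class of bug the commit message refers to.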
    • Jon Chesterfield's avatar
      d642156f
  16. Sep 02, 2021
    • [libomptarget][amdgpu] Drop env variables · 3153bdd5
      Jon Chesterfield authored
      Use the same debug print as the rest of the libomptarget plugins, with
      the same environment control. Also drop the max queue size debugging hook, as
      I don't believe it is still in use. It can be brought back near the rest of the
      env handling in rtl.cpp if someone objects.
      
      That makes most of rt.h and all of utils.cpp unused. Clean that up and simplify
      control flow in a couple of places.
      
      The behaviour change is that debug prints that used to use the old environment
      variable now use the new one and print in a slightly different format, and that
      the max queue size variable is removed.
      
      Reviewed By: pdhaliwal
      
      Differential Revision: https://reviews.llvm.org/D108784
      3153bdd5
  17. Sep 01, 2021
  18. Aug 27, 2021
    • [openmp][amdgpu] Initial gfx10 offloading implementation · 78f92c38
      Jon Chesterfield authored
      Lets wavefront size be 32 for amdgpu openmp, as well as 64.
      
      Fixes up as little as possible to pass that through the libraries. This change
      is end to end, as opposed to updating clang/devicertl/plugin separately. It can
      be broken up for review/commit if preferred. Posting as-is so that others with
      a gfx10 can try it out. It works roughly as well as gfx9 for me, but there are
      probably bugs remaining as well as the todo: for letting grid values vary more.
      
      Reviewed By: ronlieb
      
      Differential Revision: https://reviews.llvm.org/D108708
      78f92c38
  19. Aug 26, 2021
  20. Aug 25, 2021
  21. Aug 24, 2021
  22. Aug 19, 2021
    • [openmp][nfc] Replace OMPGridValues array with struct · 77579b99
      Jon Chesterfield authored
      [nfc] Replaces enum indices into an array with a struct. Names the
      fields to match the enum and leaves memory layout and initialization unchanged.
      
      Motivation is to later safely remove dead fields and replace redundant ones
      with (compile time) computation. It should also be possible to factor some
      common fields into a base and introduce a gfx10 amdgpu instance with less
      duplication than the arrays of integers require.
      
      Reviewed By: ronlieb
      
      Differential Revision: https://reviews.llvm.org/D108339
      77579b99
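As a rough illustration of the refactoring this commit describes, the sketch below places an enum-indexed array next to an equivalent struct. The field names (`WarpSize`, `MaxTeams`) and the values are made up for the example and are not taken from the real OMPGridValues definition. The `static_assert` lines check the two properties the message claims: same size and same initializer order.

```cpp
// Before: values are looked up through enum indices into a plain array.
enum GridIndex { WarpSize = 0, MaxTeams = 1, GridIndexCount = 2 };
constexpr unsigned GridArray[GridIndexCount] = {64, 128};

// After: the same values in the same order, as named fields.
struct GridValues {
  unsigned WarpSize;
  unsigned MaxTeams;
};
constexpr GridValues GridStruct = {64, 128};

// The struct occupies the same storage as the array it replaces.
static_assert(sizeof(GridValues) == sizeof(GridArray), "layout changed");

// The same brace initializer yields the same values under both schemes.
static_assert(GridStruct.WarpSize == GridArray[WarpSize], "value mismatch");
static_assert(GridStruct.MaxTeams == GridArray[MaxTeams], "value mismatch");

// Field access needs no index, so an out-of-range lookup cannot be written.
constexpr unsigned threadsPerWarp(const GridValues &GV) { return GV.WarpSize; }
```

With named fields, removing a dead entry becomes a compile error at every use site instead of a silent shift of the remaining indices.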
  23. Aug 08, 2021
  24. Aug 06, 2021
  25. Jul 29, 2021
  26. Jul 26, 2021
    • [libomptarget] Build amdgpu plugin without hsa · 2a613a77
      Jon Chesterfield authored
      Default to building the amdgpu plugin to use dlopen when hsa is
      not found instead of disabling it.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D106600
      2a613a77
    • [libomptarget][amdgpu] More robust handling of failure to init HSA · dd0b463d
      Jon Chesterfield authored
      If hsa_init fails, subsequent calls into hsa are not safe. The exception is
      hsa_init itself, but we don't retry on failure.
      
      This patch:
      - deletes a print that called into hsa to ask why it can't call into hsa
      - drops a merge conflict block next to that print
      - reliably initializes number of devices to zero
      - skips the plugin destructor contents if the constructor failed to init hsa
      
      Tested by making hsa_init return an error, and by forcing use of the dynamic
      library, which was then deleted from disk. Before this patch, both cases segv.
      After it, both give a friendly message about offloading being unavailable.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D106774
      dd0b463d
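The last two bullet points of this commit (device count reliably zero, destructor skipped after a failed init) can be sketched as follows. This is an assumption-laden illustration and not the plugin's code: `Plugin`, `InitFn`, `ShutdownCalls`, and the placeholder device count are hypothetical, and the init function is injected so that failure can be simulated without the real library.

```cpp
// Hypothetical stand-in for the runtime init call; returns true on success.
using InitFn = bool (*)();

class Plugin {
  bool Initialized = false;
  int NumberOfDevices = 0; // stays zero unless init succeeds

public:
  // Counts how often the teardown path actually ran; for the sketch only.
  static int ShutdownCalls;

  explicit Plugin(InitFn Init) {
    Initialized = Init();
    if (!Initialized)
      return; // make no further calls into the runtime
    NumberOfDevices = 1; // placeholder for a real device query
  }

  ~Plugin() {
    if (!Initialized)
      return; // nothing was set up, so nothing is torn down
    ++ShutdownCalls;
  }

  int getNumberOfDevices() const { return NumberOfDevices; }
};

int Plugin::ShutdownCalls = 0;
```

A caller that sees zero devices can then report that offloading is unavailable, without ever calling into a runtime that failed to initialize.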
  27. Jul 25, 2021