  1. Jan 21, 2022
    • [Libomptarget] Change visibility to hidden for device RTL · 26feef08
      Joseph Huber authored
      This patch changes the visibility of all constructs in the new device
      RTL to be hidden by default. This follows D117806, which stopped making
      visibility hidden by default for all device compilations. With this
      patch, everything in the device runtime library is hidden except for
      the internal environment variable. This is done to aid optimization
      and linking of the device library.
      
      Reviewed By: JonChesterfield
      
      Differential Revision: https://reviews.llvm.org/D117807
      26feef08
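The default-hidden scheme described in this commit can be sketched with the GCC/Clang visibility pragma. This is only an illustration under the assumption of a GCC- or Clang-compatible compiler; the names `internalHelper`, `ExampleEnvironment`, and `useHelper` are hypothetical and are not taken from the patch.

```cpp
// Sketch only: everything is hidden by default, with one explicit exception.
// Assumes a GCC- or Clang-compatible compiler; all names are hypothetical.
#pragma GCC visibility push(hidden)

// Hidden by default: the symbol is not exported from the linked image, so the
// optimizer and linker are free to treat it as local to the library.
int internalHelper(int X) { return X * 2; }

// Explicitly exported. This stands in for the single variable that must
// remain visible outside the library.
__attribute__((visibility("default"))) int ExampleEnvironment = 0;

#pragma GCC visibility pop

// A caller inside the same library can still use the hidden symbol.
int useHelper(int X) { return internalHelper(X) + ExampleEnvironment; }
```

The explicit `visibility("default")` attribute overrides the surrounding pragma, which is one way to carve out a single exported symbol.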
  2. Jan 20, 2022
    • [OpenMP] Expand short versions of OpenMP offloading triples · 28d71860
      Joseph Huber authored
      The OpenMP offloading libraries are built with fixed triples and linked
      in at compile time. This caused unhelpful errors if the user passed in
      an expansion of the triple that differs from the one used for the
      bitcode library. Because we only support these triples for OpenMP
      offloading, we can normalize them to the full version used in the
      bitcode library.
      
      Reviewed By: jdoerfert, JonChesterfield
      
      Differential Revision: https://reviews.llvm.org/D117634
      28d71860
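The normalization described in this commit can be sketched as a small lookup. The helper name `expandOffloadTriple` is hypothetical, and the table of short and full triples is an assumption based on the commonly used NVPTX and AMDGCN offloading triples, not a copy of the patch.

```cpp
// Sketch only: expand a short offloading triple to a full form.
// The helper name and the exact table are assumptions for illustration.
#include <string>

std::string expandOffloadTriple(const std::string &Triple) {
  if (Triple == "nvptx64")
    return "nvptx64-nvidia-cuda";
  if (Triple == "nvptx")
    return "nvptx-nvidia-cuda";
  if (Triple == "amdgcn")
    return "amdgcn-amd-amdhsa";
  // Anything else, including an already expanded triple, passes through.
  return Triple;
}
```

With a mapping like this, a short spelling and the full spelling resolve to the same string, so a lookup keyed on the full triple would behave the same for both.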
  3. Jan 19, 2022
    • [Libomptarget] Fix external visibility for internal variables · 4863fed9
      Joseph Huber authored
      After the changes in D117362 made variables declared inside of a target
      declare directive visible outside the plugin, some variables inside the
      runtime were given visibility that conflicted with their address space
      type. This caused problems when shared or local memory was made
      externally visible. This patch fixes the issue by making these
      variables static within the module, which limits them to internal
      visibility.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D117526
      4863fed9
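As a minimal host-side sketch of the fix described here, a `static` variable at namespace scope has internal linkage and is therefore not visible outside its translation unit. The names `TeamCounter` and `bumpCounter` are hypothetical, and the address-space aspect of the real device code is not modeled.

```cpp
// Sketch only; names are hypothetical and address spaces are not modeled.

// Internal linkage: the symbol cannot be referenced from other modules.
static int TeamCounter = 0;

// Code in the same module can still read and update the variable.
int bumpCounter() { return ++TeamCounter; }
```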
  4. Jan 18, 2022
  5. Jan 17, 2022
    • [Libomptarget] Add `cold` to KeepAlive attributes · 4869a22d
      Joseph Huber authored
      This patch adds the `cold` attribute to the keepAlive functions in the
      RTL. This dummy function exists to keep certain RTL calls alive without
      them being optimized out, but it is never called and can be declared
      cold. This also avoids some erroneous remarks emitted for this
      function because it has weak linkage and cannot be made internal.
      
      Reviewed By: tianshilei1992
      
      Differential Revision: https://reviews.llvm.org/D117513
      4869a22d
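The idea of a never-called function that references runtime entry points can be sketched as follows. This assumes a GCC- or Clang-compatible compiler, and the names `rtlEntryA`, `rtlEntryB`, and `keepAliveExample` are hypothetical stand-ins rather than the actual RTL functions.

```cpp
// Sketch only; all names are hypothetical stand-ins.
int rtlEntryA(int X) { return X + 1; }
int rtlEntryB(int X) { return X * 2; }

// Weak linkage means the function cannot be made internal. It is never
// called, so marking it cold tells the optimizer it is off the hot path.
// Referencing the entry points here keeps them reachable.
__attribute__((weak, cold)) void keepAliveExample() {
  volatile int Sink = rtlEntryA(0) + rtlEntryB(0);
  (void)Sink;
}
```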
  6. Jan 13, 2022
  7. Dec 27, 2021
    • [OpenMP][FIX] Change globalization alignment to 16 · 7cdaa5a9
      Joseph Huber authored
      This patch changes the default alignment from 8 to 16, and encodes this
      information in the `__kmpc_alloc_shared` runtime call to communicate it
      to the HeapToStack pass. The previous alignment of 8 was not sufficient
      for the maximum size of primitive types on 64-bit systems, and needed to
      be increased. This reduces the amount of space available in the data
      sharing stack, so this implementation will need to be improved later to
      include the alignment requirements in the allocation call, and use them
      properly in the data sharing stack in the runtime.
      
      Depends on D115888
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D115971
      7cdaa5a9
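The space cost mentioned in this commit can be illustrated with a small bump allocator that rounds every request up to 16 bytes. This is a sketch under stated assumptions: `SharingStack`, its 256-byte capacity, and `alignUp` are hypothetical and do not reflect the runtime's actual data sharing stack.

```cpp
// Sketch only: a fixed-size stack that rounds each allocation up to 16 bytes.
// The type, its capacity, and the helper names are hypothetical.
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t SharedAlignment = 16; // the new default from the commit

// Round Bytes up to the next multiple of Alignment.
constexpr std::size_t alignUp(std::size_t Bytes, std::size_t Alignment) {
  return (Bytes + Alignment - 1) / Alignment * Alignment;
}

struct SharingStack {
  alignas(16) unsigned char Storage[256];
  std::size_t Top = 0;

  // Returns nullptr when the rounded-up request does not fit.
  void *push(std::size_t Bytes) {
    std::size_t Size = alignUp(Bytes, SharedAlignment);
    if (Top + Size > sizeof(Storage))
      return nullptr;
    void *Ptr = Storage + Top;
    Top += Size;
    return Ptr;
  }

  // Releases the most recent allocation of the given size.
  void pop(std::size_t Bytes) {
    std::size_t Size = alignUp(Bytes, SharedAlignment);
    assert(Size <= Top && "pop without matching push");
    Top -= Size;
  }
};
```

In this sketch a 4-byte request consumes 16 bytes of the stack, which is the kind of overhead the commit message says a later implementation should reduce by passing the real alignment requirement.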
  8. Dec 09, 2021
    • [OpenMP][FIX] Pass the num_threads value directly to parallel_51 · bc9c4d72
      Joseph Huber authored
      The problem with the old scheme is that we would need to keep track of
      the "next region" and reset the num_threads value after it. The new RT
      doesn't do it and an assertion is triggered. The old RT doesn't do it
      either; I haven't tested it, but I assume a num_threads clause might
      impact multiple parallel regions "accidentally". Further, in SPMD mode
      num_threads was simply ignored, for some reason beyond me.
      
      In any case, parallel_51 is designed to take the clause value directly,
      so let's do that instead.
      
      Reviewed By: tianshilei1992
      
      Differential Revision: https://reviews.llvm.org/D113623
      bc9c4d72
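The difference between the two schemes can be sketched with a toy model. Every name below is hypothetical, the convention that -1 means "no clause" is an assumption, and the "leak" in the old scheme models the behavior the commit message only suspects rather than something confirmed by the source.

```cpp
// Sketch only: a toy model of the two schemes, not the runtime's API.
#include <cstdint>

// Old scheme: the clause value is stored as state for the "next region".
struct OldRuntimeState {
  std::int32_t PendingNumThreads = -1; // -1 means "no clause" (assumption)
};

void pushNumThreads(OldRuntimeState &State, std::int32_t NumThreads) {
  State.PendingNumThreads = NumThreads;
}

std::int32_t oldParallelWidth(const OldRuntimeState &State,
                              std::int32_t MaxThreads) {
  // Nothing resets PendingNumThreads, so it also affects later regions.
  std::int32_t N = State.PendingNumThreads;
  return (N > 0 && N < MaxThreads) ? N : MaxThreads;
}

// New scheme: the clause value is an argument of the parallel entry point,
// so there is no state to reset afterwards.
std::int32_t newParallelWidth(std::int32_t NumThreadsClause,
                              std::int32_t MaxThreads) {
  return (NumThreadsClause > 0 && NumThreadsClause < MaxThreads)
             ? NumThreadsClause
             : MaxThreads;
}
```

In the toy model, a second region launched under the old scheme still sees the earlier clause value, while the new scheme only sees what the caller passes for that region.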
  9. Nov 30, 2021
  10. Nov 16, 2021
    • [OpenMP] Fix initializer not working on AMDGPU · 374cd0fb
      Joseph Huber authored
      The RAII class used for debugging RTL entry used a shared variable to
      keep track of the current depth. This used a global initializer, which
      isn't supported on AMDGPU. This patch removes the initializer and
      instead sets it to zero when the state is initialized in the runtime.
      
      Reviewed By: jdoerfert, JonChesterfield
      
      Differential Revision: https://reviews.llvm.org/D113963
      374cd0fb
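The pattern described in this commit, dropping the global initializer and zeroing the counter during state initialization, can be sketched on the host as follows. The names `DebugState`, `initDebugState`, `EntryGuard`, and `currentDepth` are hypothetical, and shared memory is not modeled.

```cpp
// Sketch only; names are hypothetical. On the device this counter would live
// in shared memory, where relying on a global initializer is the problem.
#include <cstdint>

struct DebugState {
  std::uint32_t Depth; // deliberately no default member initializer
};

static DebugState State;

// Called once when the runtime state is initialized.
void initDebugState() { State.Depth = 0; }

// RAII guard: entering an RTL function increases the depth, and leaving the
// scope restores it.
struct EntryGuard {
  EntryGuard() { ++State.Depth; }
  ~EntryGuard() { --State.Depth; }
};

std::uint32_t currentDepth() { return State.Depth; }
```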
  11. Nov 12, 2021
  12. Nov 10, 2021
    • [OpenMP] Lower printf to __llvm_omp_vprintf · 27177b82
      Jon Chesterfield authored
      Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf`
      which takes the same const char*, void* arguments as cuda vprintf and also
      passes the size of the void* alloca which will be needed by a non-stub
      implementation of `__llvm_omp_vprintf` for amdgpu.
      
      This removes the amdgpu link error on any printf in a target region in favour
      of silently compiling code that doesn't print anything to stdout.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D112680
      27177b82
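The calling shape described in this commit, a format string, a pointer to a packed argument buffer, and the buffer size, can be sketched on the host. The stub `exampleVprintfStub`, the struct `PrintfArgs`, and the return value of 0 are all assumptions for illustration; only the three-argument shape comes from the commit message.

```cpp
// Sketch only: what a lowered printf("%d %f\n", 42, 1.5) call might look like.
// All names and the return value are hypothetical.
#include <cstdint>

// The variadic arguments are packed into one buffer.
struct PrintfArgs {
  std::int32_t A;
  double B;
};

// Stand-in for a stub implementation: accepts the call and prints nothing.
std::int32_t exampleVprintfStub(const char *Format, void *Args,
                                std::uint32_t Size) {
  (void)Format;
  (void)Args;
  (void)Size;
  return 0;
}

std::int32_t loweredPrintfCall() {
  PrintfArgs Buffer{42, 1.5};
  // The size of the buffer is passed so that a non-stub implementation
  // could copy the arguments elsewhere.
  return exampleVprintfStub("%d %f\n", &Buffer,
                            static_cast<std::uint32_t>(sizeof(Buffer)));
}
```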
  13. Nov 09, 2021
  14. Nov 08, 2021
  15. Nov 04, 2021
  16. Nov 03, 2021
  17. Oct 30, 2021
    • [OpenMP][DeviceRTL] Fixed an issue that causes hang in SU3 · 025f5492
      Shilei Tian authored
      The synchronization at the end of a parallel region cannot guarantee that all
      threads exit the scope. As a result, the assertions right after it might be hit,
      and further the `state::assumeInitialState(IsSPMD)` in `__kmpc_target_deinit` may
      not hold either. We can either add a synchronization right after the parallel
      region, or remove the assertions and assumptions. Here we choose the first option,
      as those assertions and assumptions can help optimizations.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D112861
      025f5492
  18. Oct 29, 2021
  19. Oct 28, 2021