  1. Jan 13, 2022
  2. Jan 10, 2022
  3. Jan 06, 2022
    • [OpenMP][Offloading] Fixed a crash caused by dereferencing nullptr · aab62aab
      Shilei Tian authored
      In function `DeviceTy::getTargetPointer`, `Entry` could be `nullptr` because of
      a zero-length array section. We need to check that it is a valid iterator before
      using it.
      
      Reviewed By: ronlieb
      
      Differential Revision: https://reviews.llvm.org/D116716
    • [OpenMP][Offloading] Fixed data race in libomptarget caused by async data movement · 9584c6fa
      Shilei Tian authored
      Async data movement can cause a data race if the target supports it.
      Details can be found in [1]. This patch tries to fix this problem by attaching
      an event to the entry of the data mapping table. Here are the details.
      
      For each issued data movement, a new event is generated and returned to `libomptarget`
      by calling `createEvent`. The event will be attached to the corresponding mapping table
      entry.
      
      For each data mapping lookup, if there is no need for a data movement, the
      attached event has to be inserted into the queue to guarantee that all following
      operations in the queue can only be executed once the event is fulfilled.
      
      This design avoids synchronization on the host side.
      
      Note that we are using CUDA terminology here. A similar mechanism is assumed to
      be supported by other targets. Even if a target doesn't support it, it can
      be easily implemented in the following fallback way:
      - `Event` can be any kind of flag that has at least two states, 0 and 1.
      - `waitEvent` can directly busy loop if `Event` is still 0.
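
The fallback described in the two bullets above could look roughly like the sketch below. It is only an illustration, not the plugin interface: `FallbackEvent`, `fulfillEvent`, and `isFulfilled` are hypothetical names, and the signature given to `waitEvent` here is an assumption.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical fallback event: a flag with two states,
// 0 (pending) and 1 (fulfilled).
struct FallbackEvent {
  std::atomic<int> State{0};
};

// Mark the event as fulfilled, e.g. once the data movement it guards is done.
// The release store publishes all writes made before this call.
inline void fulfillEvent(FallbackEvent &E) {
  E.State.store(1, std::memory_order_release);
}

// Non-blocking query of the flag.
inline bool isFulfilled(const FallbackEvent &E) {
  return E.State.load(std::memory_order_acquire) == 1;
}

// Busy-loop while the flag is still 0 (assumed signature, for illustration).
// The acquire load pairs with the release store in fulfillEvent.
inline void waitEvent(const FallbackEvent &E) {
  while (!isFulfilled(E))
    std::this_thread::yield();
}
```

In this sketch the thread that performs the data movement would call `fulfillEvent`, and any thread that later looks up the same mapping entry would call `waitEvent` before touching the data.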
      
      My local test shows that `bug49334.cpp` can pass.
      
      Reference:
      [1] https://bugs.llvm.org/show_bug.cgi?id=49940
      
      Reviewed By: grokos, JonChesterfield, ye-luo
      
      Differential Revision: https://reviews.llvm.org/D104418
  4. Jan 03, 2022
    • SIGSEGV in ompt_tsan_dependences with for-ordered · 378b0ac1
      RitanyaB authored
      A segmentation fault occurs in the ompt_tsan_dependences function due to an unchecked NULL pointer dereference. The ThreadSanitizer report is as follows:
      
      ```
      ThreadSanitizer:DEADLYSIGNAL
      	==140865==ERROR: ThreadSanitizer: SEGV on unknown address 0x000000000050 (pc 0x7f217c2d3652 bp 0x7ffe8cfc7e00 sp 0x7ffe8cfc7d90 T140865)
      	==140865==The signal is caused by a READ memory access.
      	==140865==Hint: address points to the zero page.
      	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 1012a
      	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 133b5
      	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 1371a
      	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 13a58
      	#0 ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int) /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 (libarcher.so+0x15652)
      	#1 __kmpc_doacross_post /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:4280 (libomp.so+0x74d98)
      	#2 .omp_outlined. for_ordered_01.c:? (for_ordered_01.exe+0x5186cb)
      	#3 __kmp_invoke_microtask /ptmp/bhararit/llvm-project/openmp/runtime/src/z_Linux_asm.S:1166 (libomp.so+0x14e592)
      	#4 __kmp_invoke_task_func /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:7556 (libomp.so+0x909ad)
      	#5 __kmp_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:2284 (libomp.so+0x8461a)
      	#6 __kmpc_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:308 (libomp.so+0x6db55)
      	#7 main ??:? (for_ordered_01.exe+0x51828f)
      	#8 __libc_start_main ??:? (libc.so.6+0x24349)
      	#9 _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120 (for_ordered_01.exe+0x4214e9)
      
      	ThreadSanitizer can not provide additional info.
      	SUMMARY: ThreadSanitizer: SEGV /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 in ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int)
      	==140865==ABORTING
      ```
      
      To reproduce the error, use the following OpenMP code snippet:
      
      ```
      /* initialise  testMatrixInt Matrix, cols, r and c */
      	  #pragma omp parallel private(r,c) shared(testMatrixInt)
      	    {
      	      #pragma omp for ordered(2)
      	      for (r=1; r < rows; r++) {
      	        for (c=1; c < cols; c++) {
      	          #pragma omp ordered depend(sink:r-1, c+1) depend(sink:r-1,c-1)
      	          testMatrixInt[r][c] = (testMatrixInt[r-1][c] + testMatrixInt[r-1][c-1]) % cols ;
      	          #pragma omp ordered depend (source)
      	        }
      	      }
      	    }
      ```
      
      	Compilation:
      ```
      clang -g -stdlib=libc++ -fsanitize=thread -fopenmp -larcher test_case.c
      ```
      
      It seems that the changes introduced by the commit https://reviews.llvm.org/D114005 cause this particular SEGV while using Archer.
      
      Reviewed By: protze.joachim
      
      Differential Revision: https://reviews.llvm.org/D115328
  5. Dec 30, 2021
  6. Dec 29, 2021
  7. Dec 28, 2021
  8. Dec 27, 2021
    • [OpenMP][FIX] Change globalization alignment to 16 · 7cdaa5a9
      Joseph Huber authored
      This patch changes the default alignment from 8 to 16, and encodes this
      information in the `__kmpc_alloc_shared` runtime call to communicate it
      to the HeapToStack pass. The previous alignment of 8 was not sufficient
      for the maximum size of primitive types on 64-bit systems, and needs to
      be increased. This reduces the amount of space available in the data
      sharing stack, so this implementation will need to be improved later to
      include the alignment requirements in the allocation call, and use them
      properly in the data sharing stack in the runtime.
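
To see why a larger alignment costs stack space, consider the sketch below. It is not the runtime's data sharing stack; `alignUp`, `PaddedStack`, and `countFits` are hypothetical helpers that assume every allocation is simply padded up to the stack's alignment.

```cpp
#include <cassert>
#include <cstddef>

// Round Size up to the next multiple of Align (Align must be a power of two).
constexpr std::size_t alignUp(std::size_t Size, std::size_t Align) {
  return (Size + Align - 1) & ~(Align - 1);
}

// Hypothetical fixed-capacity stack that pads every allocation to Alignment.
class PaddedStack {
  std::size_t Capacity;
  std::size_t Alignment;
  std::size_t Used = 0;

public:
  PaddedStack(std::size_t Cap, std::size_t Align)
      : Capacity(Cap), Alignment(Align) {}

  // Reserve Size bytes (padded); returns false if the stack is full.
  bool push(std::size_t Size) {
    std::size_t Padded = alignUp(Size, Alignment);
    if (Used + Padded > Capacity)
      return false;
    Used += Padded;
    return true;
  }
};

// Count how many allocations of Size bytes fit into the stack.
inline std::size_t countFits(std::size_t Cap, std::size_t Align,
                             std::size_t Size) {
  if (Size == 0)
    return 0; // avoid an endless loop on empty allocations
  PaddedStack Stack(Cap, Align);
  std::size_t Count = 0;
  while (Stack.push(Size))
    ++Count;
  return Count;
}
```

Under these assumptions, a 64-byte stack holds eight 4-byte objects at alignment 8 but only four at alignment 16, which is the kind of space loss the message refers to.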
      
      Depends on D115888
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D115971
    • [OpenMP][Plugin] Introduce generic resource pool · a697a0a4
      Shilei Tian authored
      Currently CUDA streams are managed by `StreamManagerTy`. It works very well. Now
      we need some resources, such as CUDA streams and events, to be
      held by `libomptarget`. It is always good to buffer those resources. More
      importantly, given the way that `libomptarget` and plugins are connected, we cannot
      be sure whether plugins are still alive when `libomptarget` is destroyed. As a
      result, the resources held by `libomptarget` might not be
      released correctly. We therefore need unified management of all the resources
      that can be shared between `libomptarget` and plugins.
      
      `ResourcePoolTy` is designed to manage one type of resource for one device.
      It has to work with an allocator which is supposed to provide `create` and
      `destroy`. In this way, when the plugin is destroyed, we can make sure that
      all resources allocated from the native runtime library will be released correctly,
      regardless of whether `libomptarget` has started its own destruction.
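
The ownership idea can be sketched as follows. This is not the actual `ResourcePoolTy`: `SketchResourcePool`, `CountingAllocator`, `acquire`, and `release` are hypothetical names, and the only thing taken from the description is that the allocator provides `create` and `destroy`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical allocator. The pool only requires create() and destroy().
struct CountingAllocator {
  int Live = 0;   // resources created and not yet destroyed
  int NextId = 0; // stand-in for a native handle
  int create() {
    ++Live;
    return NextId++;
  }
  void destroy(int) { --Live; }
};

// Hypothetical pool for one resource type. It owns every resource it creates
// and destroys all of them when the pool itself is destroyed.
template <typename ResourceT, typename AllocatorT> class SketchResourcePool {
  AllocatorT &Allocator;
  std::vector<ResourceT> All;  // every resource ever created by this pool
  std::vector<ResourceT> Free; // resources currently available for reuse

  ResourceT createOne() {
    ResourceT R = Allocator.create();
    All.push_back(R);
    return R;
  }

public:
  explicit SketchResourcePool(AllocatorT &A, std::size_t Initial = 0)
      : Allocator(A) {
    for (std::size_t I = 0; I < Initial; ++I)
      Free.push_back(createOne());
  }
  SketchResourcePool(const SketchResourcePool &) = delete;
  SketchResourcePool &operator=(const SketchResourcePool &) = delete;

  // Destroys all resources, including ones a user has not handed back.
  ~SketchResourcePool() {
    for (ResourceT &R : All)
      Allocator.destroy(R);
  }

  // Hand out a buffered resource, creating a new one only if none is free.
  ResourceT acquire() {
    if (Free.empty())
      return createOne();
    ResourceT R = Free.back();
    Free.pop_back();
    return R;
  }

  // Return a resource to the pool for later reuse.
  void release(ResourceT R) { Free.push_back(R); }
};
```

Because the pool tracks everything it created, destroying the pool releases resources even if a user never called `release`, which mirrors the guarantee described above for plugin teardown.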
      
      Reviewed By: ye-luo
      
      Differential Revision: https://reviews.llvm.org/D111954
  9. Dec 20, 2021
    • [OpenMP][libomp] Add use-all syntax to KMP_HW_SUBSET · 6a556eca
      Jonathan Peyton authored
      This patch allows the user to request all resources of a particular
      layer (or core-attribute). The syntax of KMP_HW_SUBSET is modified
      so the number of units requested is optional or can be replaced with an
      '*' character.
      
      e.g., KMP_HW_SUBSET=c:intel_atom@3 will use all the cores after offset 3
      e.g., KMP_HW_SUBSET=*c:intel_core will use all the big cores
      e.g., KMP_HW_SUBSET=*s,*c,1t will use all the sockets, all cores per
            socket, and 1 thread per core.
      
      Differential Revision: https://reviews.llvm.org/D115826
  10. Dec 17, 2021
  11. Dec 15, 2021
  12. Dec 14, 2021
  13. Dec 13, 2021
  14. Dec 12, 2021
  15. Dec 11, 2021
  16. Dec 10, 2021
  17. Dec 09, 2021
    • [OpenMP][FIX] Pass the num_threads value directly to parallel_51 · bc9c4d72
      Joseph Huber authored
      The problem with the old scheme is that we would need to keep track of
      the "next region" and reset the num_threads value after it. The new RT
      doesn't do it and an assertion is triggered. The old RT doesn't do it
      either; I haven't tested it, but I assume a num_threads clause might
      impact multiple parallel regions "accidentally". Further, in SPMD mode
      num_threads was simply ignored, for some reason beyond me.
      
      In any case, parallel_51 is designed to take the clause value directly,
      so let's do that instead.
      
      Reviewed By: tianshilei1992
      
      Differential Revision: https://reviews.llvm.org/D113623
    • [OpenMP][AMDGPU] Switch host-device memory copy to asynchronous version · cc8dc5e2
      Carlo Bertolli authored
      Prepare the amdgpu plugin for an asynchronous implementation. This patch switches to using the HSA API for asynchronous memory copy.
      Moving away from hsa_memory_copy means that the plugin is responsible for locking/unlocking host memory pointers.
      
      Reviewed By: JonChesterfield
      
      Differential Revision: https://reviews.llvm.org/D115279
  18. Dec 08, 2021
  19. Dec 07, 2021
  20. Dec 06, 2021