  1. Oct 26, 2021
  2. Oct 25, 2021
    • Georgios Rokos's avatar
      [libomptarget][NFC] Add comment explaining why we pass argument bases and · 2feafa2e
      Georgios Rokos authored
      offsets as two separate entities to the plugins.
      2feafa2e
    • Shilei Tian's avatar
      [OpenMP][Offloading] Only get trip count if team construct · 2a30c03c
      Shilei Tian authored
      Reviewed By: grokos
      
      Differential Revision: https://reviews.llvm.org/D112475
      2a30c03c
    • AndreyChurbanov's avatar
      [OpenMP] libomp: disable definitions of 5.1 atomics for non-x86 arch. · e38a1deb
      AndreyChurbanov authored
      Declarations of the 5.1 atomic entries were added under
      "#if KMP_ARCH_X86 || KMP_ARCH_X86_64" in kmp_atomic.h,
      but the definitions of the functions in kmp_atomic.cpp lacked the architecture guard.
      As a result, mangled symbols were available on non-x86 architectures.
      The patch eliminates these unexpected symbols from the library.
      
      Differential Revision: https://reviews.llvm.org/D112261
      e38a1deb
    • Vladimir Inđić's avatar
      [OpenMP][OMPT] thread_num determination during execution of nested serialized parallel regions · f41d0854
      Vladimir Inđić authored
      __ompt_get_task_info_internal function is adapted to support thread_num
      determination during the execution of multiple nested serialized
      parallel regions enclosed by a regular parallel region.
      
      Consider the following program that contains a parallel region R1 executed
      by two threads. Let the worker thread T of region R1 execute a serialized
      parallel region R2 that encloses another serialized parallel region R3.
      Note that the thread T is the master thread of both regions R2 and R3.
      
      Assume that the __ompt_get_task_info_internal function is called with the
      argument "ancestor_level == 1" during the execution of region R3.
      The function should determine the "thread_num" of the thread T inside
      the team of region R2, whose implicit task is at level 1 inside the
      hierarchy of active tasks. Since the thread T is the master thread of
      region R2, one should expect that "thread_num" takes the value 0.
      After the while loop finishes, the following holds: "lwt != NULL",
      "prev_lwt == NULL", and "prev_team" represents the team information about
      the innermost serialized parallel region R3. This results in executing
      the assignment "thread_num = prev_team->t.t_master_tid". Note that
      "prev_team->t.t_master_tid" was initialized at the moment of
      R2’s creation and represents the "thread_num" of the thread T inside
      the region R1 which encloses R2. Since the thread T is the worker thread
      of the region R1, the "thread_num" takes the value 1, which is a contradiction.
      
      This patch proposes to use "lwt" instead of "prev_lwt" when determining
      the "thread_num". If "lwt" exists, the task at the requested level belongs
      to the serialized parallel region. Since the serialized parallel region
      is executed by one thread only, the "thread_num" takes value 0.
      
      Similarly, assume that the __ompt_get_task_info_internal function is called
      with the argument "ancestor_level == 2" during the execution of region R3.
      The function should determine the "thread_num" of the thread T inside the
      team of region R1. Since the thread is a worker inside the region R1,
      one should expect that "thread_num" takes the value 1. After the loop finishes,
      the following holds: "lwt == NULL", "prev_lwt != NULL", and "prev_team" represents
      the team information about the innermost serialized parallel region R3.
      This leads to the execution of the assignment "thread_num = 0", which causes
      a contradiction.
      
      Ignoring the "prev_lwt" leads to executing the assignment
      "thread_num = prev_team->t.t_master_tid" instead. From the previous explanation,
      it is obvious that "thread_num" takes value 1.
      
      Note that the "prev_lwt" variable is marked as unnecessary and thus removed.
      
      This patch introduces the test case which represents the OpenMP program
      described earlier in the summary.
      
      Differential Revision: https://reviews.llvm.org/D110699
      f41d0854
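The selection rule described in the summary above can be modeled in a few lines of C. This is only a toy sketch, not the runtime's code: `toy_team_t`, `toy_thread_num`, and the `own_tid` parameter are hypothetical names, and the order of the checks is an assumption pieced together from the commit summaries (a lightweight task team implies a serialized region, otherwise use the previous team's master tid, otherwise fall back to the thread's own id).

```c
#include <stddef.h>

/* Hypothetical stand-in for the team structure; only the field named in
 * the commit message (t.t_master_tid) is modeled. */
typedef struct {
  struct {
    int t_master_tid;
  } t;
} toy_team_t;

/* Toy model of the decision made after the ancestor walk has finished.
 *   lwt       - non-NULL if the requested task belongs to a serialized region
 *   prev_team - team visited just before the requested level, or NULL
 *   own_tid   - the calling thread's id in its current team (assumed input) */
static int toy_thread_num(const void *lwt, const toy_team_t *prev_team,
                          int own_tid) {
  if (lwt != NULL)
    return 0; /* a serialized region is executed by exactly one thread */
  if (prev_team != NULL)
    return prev_team->t.t_master_tid;
  return own_tid; /* requested task is in the innermost region */
}
```

With `t_master_tid == 1`, the "ancestor_level == 1" scenario (non-NULL `lwt`) yields 0 and the "ancestor_level == 2" scenario (NULL `lwt`) yields 1, matching the values the summary expects.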
    • Vladimir Inđić's avatar
      [OpenMP][OMPT][clang] task frame support fixed in __kmpc_fork_call · f2410bfb
      Vladimir Inđić authored
      __kmp_fork_call sets the enter_frame of the active task (th_current_task)
      before a new parallel region begins. After the region is finished, the
      enter_frame is cleared.
      
      The old implementation of __kmpc_fork_call didn’t clear the enter_frame of
      the active task.
      
      Also, the way of initializing the enter_frame of the active task was wrong.
      Consider the following two OpenMP programs.
      
      The first program: Let R1 be a serialized parallel region that encloses
      another serialized parallel region R2. Assume that the thread that executes R2 is
      going to create a new serialized parallel region R3 by executing
      __kmpc_fork_call. This thread is responsible for setting the enter_frame of R2's
      implicit task. Note that the information about R2's implicit task is present
      inside master_th->th.th_current_task at this moment, while lwt represents the
      information about R1's implicit task. The old implementation uses lwt and
      resets the enter_frame of R1's implicit task instead of R2's implicit task. The
      new implementation uses master_th->th.th_current_task instead.
      
      The second program: Consider an OpenMP program that contains a parallel region
      R1 which encloses an explicit task T. Assume that a thread creates another
      parallel region R2 during the execution of T. The __kmpc_fork_call is
      responsible for creating R2 and setting the enter_frame of T, whose information is
      present inside master_th->th.th_current_task.
      The old implementation tries to set the frame of
      parent_team->t.t_implicit_task_taskdata[tid], which corresponds to the implicit
      task of R1, instead of T.
      
      Differential Revision: https://reviews.llvm.org/D112419
      f2410bfb
    • Joachim Protze's avatar
      [OpenMP][Tests] Test omp_get_wtime for invariants · 73682279
      Joachim Protze authored
      As discussed in D108488, testing for invariants of omp_get_wtime would be more
      reliable than testing for duration of sleep, as return from sleep might be
      delayed due to system load.
      
      Alternatively, or in addition, we could compare the time measured by omp_get_wtime
      to the time measured with C++11 chrono (for portability?).
      
      Differential Revision: https://reviews.llvm.org/D112458
      73682279
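As a rough illustration of testing a timer for invariants rather than for the duration of a sleep, the sketch below checks that successive readings never decrease and encodes the outcome in a return value. The commit message does not list which invariants the actual test checks, so the non-decreasing property is an assumption, and `clock_fn`, `check_non_decreasing`, and the two fake clocks are hypothetical names used only here.

```c
/* Hypothetical clock signature; omp_get_wtime has this shape
 * (no arguments, returns double). */
typedef double (*clock_fn)(void);

/* Returns 0 if `samples` consecutive readings never decrease, 1 otherwise.
 * The result can be used directly as a test's exit code. */
static int check_non_decreasing(clock_fn now, int samples) {
  int i;
  double prev = now();
  for (i = 1; i < samples; i++) {
    double cur = now();
    if (cur < prev)
      return 1; /* invariant violated */
    prev = cur;
  }
  return 0;
}

/* Fake clocks for demonstration; a real test would pass omp_get_wtime. */
static double fake_now = 0.0;
static double steady_clock(void) { return fake_now += 0.5; }
static double backwards_clock(void) { return fake_now -= 0.5; }
```

Unlike a sleep-based check, a delayed return under system load cannot make this kind of check fail.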
    • Joachim Protze's avatar
      [OpenMP][Tests][NFC] Actually check for test outcome · 3f229f42
      Joachim Protze authored
      The CHECK: line in the test had no effect, because the test does not
      pipe to FileCheck. Since the test only checks for a single value,
      encode the result in the return value of the test.
      3f229f42
    • Joachim Protze's avatar
      [OpenMP][Tests][NFC] Mark tests trying to link COI as unsupported · 047890bc
      Joachim Protze authored
      For some tests with target-related functionality, icc 18/19 tries to link
      libioffload_target.so.5, which fails because of missing COI symbols.
      047890bc
    • Joachim Protze's avatar
      [OpenMP][Tests][NFC] Replace atomic increment by reduction · d7fdd236
      Joachim Protze authored
      Also mark the test as unsupported by intel-21, because the test does
      not terminate.
      d7fdd236
    • Joachim Protze's avatar
      [OpenMP][Tools][NFC] Fix C99-style declaration of iteration variables · 38f78dd2
      Joachim Protze authored
      Where possible, declare the variable before the loop.
      Where not possible, specifically request -std=c99 (could be limited to
      specific compilers like icc).
      38f78dd2
    • Joachim Protze's avatar
      d29a7d23
  3. Oct 23, 2021
  4. Oct 22, 2021
    • Vladimir Inđić's avatar
      [OpenMP][OMPT][GOMP] task frame support in KMP_API_NAME_GOMP_PARALLEL_SECTIONS · ba02586f
      Vladimir Inđić authored
      The KMP_API_NAME_GOMP_PARALLEL_SECTIONS function was missing task frame support.
      This patch introduces a fix that properly sets the exit_frame of
      the innermost implicit task that corresponds to the parallel sections construct,
      as well as the enter_frame of the task that encloses that implicit task.
      
      This patch also introduces a simple test case, sections_serialized.c, that contains
      a serialized parallel sections construct and validates whether the mentioned
      task frames are set correctly.
      
      Differential Revision: https://reviews.llvm.org/D112205
      ba02586f
  5. Oct 21, 2021
  6. Oct 20, 2021
  7. Oct 19, 2021
  8. Oct 18, 2021
    • AndreyChurbanov's avatar
      [OpenMP] libomp: add check of task function pointer for NULL. · 63f8099e
      AndreyChurbanov authored
      This patch simplifies the compiler implementation of the "taskwait nowait"
      construct. The "taskwait nowait" is semantically equivalent to an empty task.
      Instead of creating an empty routine as a task entry, the compiler can just pass
      a NULL pointer to the runtime. The runtime then does all the work with
      dependences and returns because the task routine is absent.
      
      Differential Revision: https://reviews.llvm.org/D112015
      63f8099e
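The NULL-routine idea from the commit above can be sketched with a toy task structure. None of these names come from libomp: `toy_task_t`, `toy_run_task`, and `sample_routine` are hypothetical, and the dependence handling is reduced to a single flag for illustration.

```c
#include <stddef.h>

/* Hypothetical task entry signature and task record. */
typedef int (*toy_routine_t)(int gtid, void *task);

typedef struct {
  toy_routine_t routine; /* NULL models the "taskwait nowait" empty task */
  int deps_resolved;     /* stands in for the runtime's dependence work */
} toy_task_t;

/* Handles dependences first, then invokes the routine only if one exists. */
static int toy_run_task(toy_task_t *task, int gtid) {
  task->deps_resolved = 1; /* dependence processing happens in both cases */
  if (task->routine == NULL)
    return 0; /* nothing to execute: return right after the dependences */
  return task->routine(gtid, task);
}

/* Example routine so the non-NULL path can be exercised. */
static int sample_routine(int gtid, void *task) {
  (void)task;
  return gtid + 1;
}
```

A task created with a NULL routine still has its dependences processed, which is what lets the compiler skip emitting an empty entry function.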
    • Jon Chesterfield's avatar
      [libomptarget] Pass OMP_TARGET_OFFLOAD env variable through to tests · 251b1e7c
      Jon Chesterfield authored
      Useful for OMP_TARGET_OFFLOAD=MANDATORY when testing
      
      Reviewed By: Meinersbur
      
      Differential Revision: https://reviews.llvm.org/D111995
      251b1e7c
    • @vladaindjic's avatar
      [OpenMP][OMPT] thread_num determination for programs with explicit tasks · 59a994e8
      @vladaindjic authored
      __ompt_get_task_info_internal is now able to determine the right value of the
      “thread_num” argument during the execution of an explicit task.
      
      During the execution of a while loop that iterates over the ancestor tasks
      hierarchy, the “prev_team” variable was always set to “team” variable at the
      beginning of each loop iteration.
      
      Assume that the program contains a parallel region which encloses an explicit
      task executed by the worker thread of the region. Also assume that the tool
      inquires the “thread_num” of a worker thread for the implicit task that
      corresponds to the region (task at “ancestor_level == 1”) and expects to
      receive the value of “thread_num > 0”.
      After the loop finishes, both “team” and “prev_team” variables are equal and
      point to the team information of the parallel region.
      The “thread_num” is set to “prev_team->t.t_master_tid”, that is equal to
      “team->t.t_master_tid”. In this case, “team->t.t_master_tid” is 0, since
      the master thread of the region is the initial master thread of the program.
      This leads to a contradiction.
      
      To prevent this, “prev_team” variable is set to “team” variable only at the
      time when the loop that has already encountered the implicit task (“taskdata”
      variable contains the information about an implicit task) continues iterating
      over the implicit task’s ancestors, if any.
      
      After the mentioned loop finishes, the “prev_team” variable might be equal to
      NULL. This means that the task at requested “ancestor_level” belongs to the
      innermost parallel region, so the “thread_num” will be determined by calling
      the “__kmp_get_tid”.
      
      To prove that this patch works, the test case “explicit_task_thread_num.c” is
      provided.
      It contains the example of the program explained earlier in the summary.
      
      Differential Revision: https://reviews.llvm.org/D110473
      59a994e8
    • Joachim Protze's avatar
      [OpenMP][Tests][NFC] Work around ICC bug · c93fb143
      Joachim Protze authored
      Older Intel compilers miss the privatization of nested loop variables for
      doacross loops. Declaring the variable in the loop makes the test more
      robust.
      c93fb143
    • Joachim Protze's avatar
      [OpenMP][Tests][NFC] Flagging OMPT tests as XFAIL for Intel compilers · 59186882
      Joachim Protze authored
      With the Intel 19 compiler, the teams tests fail to link while trying to link
      liboffload.
      59186882
  9. Oct 16, 2021
    • Shilei Tian's avatar
      [OpenMP][deviceRTLs] Fix wrong return value of `__kmpc_is_spmd_exec_mode` · 2c941fa2
      Shilei Tian authored
      D110279 introduced a bug to the device runtime. In `__kmpc_parallel_51`, we detect
      whether we are already in a parallel region by `__kmpc_parallel_level() > __kmpc_is_spmd_exec_mode()`.
      It is based on the assumption that:
      - In SPMD mode, parallel level is initialized to 1.
      - In generic mode, parallel level is initialized to 0.
      - `__kmpc_is_spmd_exec_mode` returns `1` for SPMD mode, 0 otherwise.
      
      Because the return value type of `__kmpc_is_spmd_exec_mode` is `int8_t`, there
      was an implicit cast from `bool` to `int8_t`. We can make sure it is either 0 or
      1 since C++14. In D110279, the return value is the result of an `and` operation,
      which is 2 in SPMD mode. This breaks the assumption in `__kmpc_parallel_51`.
      
      Reviewed By: carlo.bertolli, dpalermo
      
      Differential Revision: https://reviews.llvm.org/D111905
      2c941fa2
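The effect of returning the raw result of a bitwise AND can be reproduced in isolation. The sketch below is not the device runtime's code: `MODE_SPMD_BIT`, its value of 2, and the helper names are assumptions chosen to match the "2 in SPMD mode" described above, and the nested-region scenario assumes the parallel level reaches 2 inside a nested parallel region in SPMD mode.

```c
#include <stdint.h>

/* Hypothetical flag; the value 2 mirrors the result quoted in the commit. */
enum { MODE_SPMD_BIT = 0x2 };

/* Buggy shape: returns the raw AND result, which is 2 when the bit is set. */
static int8_t is_spmd_raw(int mode) {
  return (int8_t)(mode & MODE_SPMD_BIT);
}

/* Fixed shape: normalizes the result to 0 or 1. */
static int8_t is_spmd_normalized(int mode) {
  return (int8_t)((mode & MODE_SPMD_BIT) != 0);
}

/* The comparison quoted from __kmpc_parallel_51: level > is_spmd. */
static int in_nested_parallel(int parallel_level, int8_t is_spmd) {
  return parallel_level > is_spmd;
}
```

With a parallel level of 2 in SPMD mode, the raw variant compares 2 > 2 and misses the nested region, while the normalized variant compares 2 > 1 and detects it.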
  10. Oct 15, 2021
  11. Oct 14, 2021
  12. Oct 13, 2021
  13. Oct 11, 2021
  14. Oct 09, 2021
  15. Oct 08, 2021