  1. Nov 28, 2016
  2. Nov 14, 2016
    • Update stats-gathering code · 5375fe82
      Jonathan Peyton authored
      Make the developer timers use the partitioning scheme, which also required
      removing some redundant developer timers in favor of the existing normal
      timers. Move per-thread stats initialization to just after global thread id
      assignment, which is as early as possible. Also put all global stats
      initialization code in __kmp_stats_init() and all global stats destruction code
      in __kmp_stats_fini().
      
      Differential Revision: https://reviews.llvm.org/D26361
      
      llvm-svn: 286892
    • Introduce dynamic affinity dispatch capabilities · 1cdd87ad
      Jonathan Peyton authored
      This set of changes enables the affinity interface (either the preexisting
      native operating system interface or HWLOC) to be selected dynamically at
      runtime initialization. The motivation is that we were seeing performance
      degradations when using HWLOC. This change lets the user fall back to the old
      affinity mechanisms, which on large machines (>64 cores) makes a large
      difference in initialization time.
      
      These changes mostly move affinity code under a small class hierarchy:
      
      KMPAffinity
        class Mask {}
      KMPNativeAffinity : public KMPAffinity
        class Mask : public KMPAffinity::Mask
      KMPHwlocAffinity : public KMPAffinity
        class Mask : public KMPAffinity::Mask
      
      Since all interface functions (for both affinity and the mask implementation)
      are virtual, the implementation can be chosen at runtime initialization.
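
The runtime selection described above can be sketched as follows. This is a minimal illustration, not the libomp code: the class names AffinityBase, NativeAffinity and HwlocAffinity, the helpers pick_affinity() and backend_name(), and the use_hwloc flag are all hypothetical stand-ins for the hierarchy listed in the message.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the hierarchy in the commit message; the real
// classes expose a much larger interface (masks, binding, topology queries).
class AffinityBase {
public:
  virtual ~AffinityBase() = default;
  virtual std::string name() const = 0;  // every interface function is virtual
};

class NativeAffinity : public AffinityBase {
public:
  std::string name() const override { return "native"; }
};

class HwlocAffinity : public AffinityBase {
public:
  std::string name() const override { return "hwloc"; }
};

// Choose the backend once, at initialization. `use_hwloc` is an assumed
// boolean that would come from some runtime setting.
std::unique_ptr<AffinityBase> pick_affinity(bool use_hwloc) {
  if (use_hwloc)
    return std::make_unique<HwlocAffinity>();
  return std::make_unique<NativeAffinity>();
}

// Callers only ever see the base interface, so they do not care which
// backend was selected.
std::string backend_name(bool use_hwloc) {
  std::unique_ptr<AffinityBase> api = pick_affinity(use_hwloc);
  assert(api != nullptr);
  return api->name();
}
```

Because callers hold only a base-class pointer, switching backends costs one virtual call per operation and needs no rebuild.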
      
      Differential Revision: https://reviews.llvm.org/D26356
      
      llvm-svn: 286890
  3. Nov 07, 2016
    • [OpenMP] Enable ThreadSanitizer to check OpenMP programs · 50fed047
      Jonas Hahnfeld authored
      This patch allows ThreadSanitizer (Tsan) to verify OpenMP programs.
      It means that Tsan will report no false positives when
      verifying an OpenMP program.
      This patch introduces annotations within the OpenMP runtime module to
      provide information about thread synchronization to the Tsan runtime.
      
      To enable Tsan support when building the runtime, you must
      enable the TSAN_SUPPORT option by passing the following option to CMake:
      
      -DLIBOMP_TSAN_SUPPORT=TRUE
      
      The annotations will be enabled in the main shared library
      (the same mechanism as OMPT).
      
      Patch by Simone Atzeni and Joachim Protze!
      
      Differential Revision: https://reviews.llvm.org/D13072
      
      llvm-svn: 286115
  4. Sep 27, 2016
    • Disable monitor thread creation by default. · b66d1aab
      Jonathan Peyton authored
      This change set disables creation of the monitor thread by default. The global
      counter maintained by the monitor thread was replaced by logic that uses system
      time directly, and cyclic yielding on Linux targets was also removed since
      there was no clear benefit to using it. Setting the KMP_USE_MONITOR variable
      (=1) enables creation of the monitor thread again if it is really needed.
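
As a rough illustration of using system time directly instead of a counter ticked by a separate thread, the sketch below computes a deadline once and compares it against the clock. The SpinDeadline type and its "spin budget" parameter are hypothetical, and std::chrono::steady_clock is an assumed time source; the runtime's actual logic is not shown in this message.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: a waiting thread records its own deadline instead of
// reading a global counter that a monitor thread would have to keep ticking.
struct SpinDeadline {
  Clock::time_point deadline;

  // `budget` is how long the thread may keep spinning; `start` defaults to
  // the current time and is a parameter only so the sketch can be tested.
  explicit SpinDeadline(std::chrono::milliseconds budget,
                        Clock::time_point start = Clock::now())
      : deadline(start + budget) {
    assert(budget.count() >= 0);
  }

  // True once the spin budget is used up and the thread should go to sleep.
  bool expired(Clock::time_point now = Clock::now()) const {
    return now >= deadline;
  }
};
```

Each waiter pays for one clock read per check, and no extra thread has to exist just to advance a counter.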
      
      Differential Revision: https://reviews.llvm.org/D24739
      
      llvm-svn: 282507
  5. Sep 02, 2016
  6. Jun 22, 2016
  7. Jun 13, 2016
    • Fix crash when libomp loaded/unloaded multiple times · 93495de2
      Jonathan Peyton authored
      The problem scenario is the following:
      A dynamic library, libfoo.so, depends on libomp.so (it creates a parallel region
      and calls some omp functions).  An application has a loop where it dynamically
      loads libfoo.so, calls a function from it, and unloads libfoo.so.  After several
      loop iterations the application crashes with a message about lack of resources:
      OMP: Error #34: System unable to allocate necessary resources for OMP thread:
      
      The problem is that pthread_kill() was not followed by pthread_join() when
      the thread had already terminated. This patch fixes this problem for both
      worker and monitor threads.
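
The pattern behind the fix can be sketched like this: whatever the probe with pthread_kill() reports, the thread is still joined so that its resources are released. The functions worker(), run_and_reap() and load_unload_loop() are hypothetical names for this illustration and do not come from the runtime.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Trivial thread body; it returns immediately, so the thread has usually
// terminated by the time the creator looks at it.
static void *worker(void *) { return nullptr; }

// Start a worker and reap it. Returns 0 on success, an error number otherwise.
int run_and_reap() {
  pthread_t handle;
  int rc = pthread_create(&handle, nullptr, worker, nullptr);
  if (rc != 0)
    return rc;
  // Signal 0 delivers nothing and only probes the thread. The result is
  // deliberately ignored: alive or not, the thread must still be joined.
  (void)pthread_kill(handle, 0);
  // Joining is what releases the terminated thread's resources.
  return pthread_join(handle, nullptr);
}

// Mimics the load/call/unload loop from the scenario above and counts how
// many iterations failed to create or join a thread.
int load_unload_loop(int iterations) {
  assert(iterations >= 0);
  int failures = 0;
  for (int i = 0; i < iterations; ++i)
    if (run_and_reap() != 0)
      ++failures;
  return failures;
}
```

If the join were skipped, each iteration would leave an unreaped thread behind, and a long enough loop could exhaust what the system allows.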
      
      Differential Revision: http://reviews.llvm.org/D21200
      
      llvm-svn: 272567
  8. May 26, 2016
    • Add an assembly __kmp_invoke_microtask for ppc64[le] · 91e19a3d
      Hal Finkel authored
      Clang no longer restricts itself to generating microtasks with a small number
      of arguments, and so an assembly implementation is required to prevent hitting
      the parameter limit present in the C implementation. This adds an
      implementation for ppc64[le].
      
      llvm-svn: 270821
  9. May 20, 2016
  10. May 16, 2016
    • Clean all the mess around KMP_USE_FUTEX and kmp_lock.h · fb043fdf
      Paul Osmialowski authored
      The KMP_USE_FUTEX preprocessor definition from kmp_lock.h is used
      inconsistently throughout the LLVM libomp code:
      
      * some .c files that use this define do not include kmp_lock.h,
        so the guarded parts of the code are never compiled
      * some places use architecture-dependent preprocessor
        expressions that effectively disable the use of futex on the
        AArch64 architecture; all these places should use
        '#if KMP_USE_FUTEX' instead to avoid further confusion
      * some places use KMP_HAS_FUTEX, which is not defined anywhere;
        KMP_USE_FUTEX should be used instead
      
      Differential Revision: http://reviews.llvm.org/D19629
      
      llvm-svn: 269642
  11. May 13, 2016
  12. May 05, 2016
    • [STATS] Use partitioned timer scheme · 11dc82fa
      Jonathan Peyton authored
      This change replaces the current timers with ones that partition time properly.
      The current timers are nested, so that if a new timer, B, starts when the
      current timer, A, is already timing, A's time will include B's. To eliminate
      this problem, the partitioned timers are designed to stop the current timer (A),
      let the new timer run (B), and when the new timer is finished, restart the
      previously running timer (A). With this partitioning of time, a thread's timers
      all sum up to the OMP_worker_thread_life time and can now easily show the
      percentage of time a thread is spending in different parts of the runtime or
      user code.
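
The stop/run/restart behavior can be sketched with a small timer stack. This is an illustration under stated assumptions, not the runtime's partitionedTimers class: the PartitionedTimerSketch type, its push/pop/total methods, and the explicit integer timestamps passed in by the caller are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: at any instant exactly one timer (the top of the
// stack) accumulates time, so the totals never overlap.
class PartitionedTimerSketch {
public:
  // Start timer `name` at time `now`, pausing whichever timer was running.
  void push(const std::string &name, std::uint64_t now) {
    if (!stack_.empty())
      totals_[stack_.back()] += now - started_;  // charge the paused timer
    stack_.push_back(name);
    started_ = now;
  }

  // Stop the running timer at time `now` and resume the one beneath it.
  void pop(std::uint64_t now) {
    assert(!stack_.empty());
    totals_[stack_.back()] += now - started_;
    stack_.pop_back();
    started_ = now;  // the resumed timer counts from here
  }

  // Accumulated time for `name`, or 0 if that timer never ran.
  std::uint64_t total(const std::string &name) const {
    auto it = totals_.find(name);
    return it == totals_.end() ? 0 : it->second;
  }

private:
  std::vector<std::string> stack_;
  std::map<std::string, std::uint64_t> totals_;
  std::uint64_t started_ = 0;
};
```

With A started at 0, B started at 10 and stopped at 25, and A stopped at 30, A accumulates 10 + 5 and B accumulates 15, so the two totals add up to the full 30 units instead of A's time swallowing B's.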
      
      There is also a new state variable associated with each thread which tells where
      it is executing a task. This corresponds with the timers: OMP_task_*, e.g., if
      time is spent in OMP_task_taskwait, then that thread executed tasks inside a
      #pragma omp taskwait construct.
      
      The changes mostly consist of switching the macros to the new PARTITIONED_*
      macros, the new partitionedTimers class and its methods, and new state logic.
      
      Differential Revision: http://reviews.llvm.org/D19229
      
      llvm-svn: 268640
  13. Apr 18, 2016
    • Fix for pthread_setspecific (TLS and shutdown) problem · f252010f
      Jonathan Peyton authored
      Some codes that use TLS fail intermittently because one thread tries to write
      TLS values after the TLS key has been destroyed by another thread. This happens
      when one thread executes library shutdown (and destroys TLS keys), while another
      thread starts to execute the TLS key destructor routine. Before this change, the
      kmp_init_runtime flag was checked before calling pthread_* TLS functions, but
      this flag is set to FALSE later than the destruction of the TLS keys, which
      leads to failure. The fix is to check kmp_init_gtid instead, as this flag is
      unset *before* the destruction of TLS keys.
      
      Differential Revision: http://reviews.llvm.org/D19022
      
      llvm-svn: 266674
  14. Apr 14, 2016
    • Exponential back off logic for test-and-set lock · 377aa40d
      Jonathan Peyton authored
      This change adds back off logic in the test and set lock for better contended
      lock performance. It uses a simple truncated binary exponential back off
      function. The default back off parameters are tuned for x86.
      
      The main back off logic has a two loop structure where each is controlled by a
      user-level parameter:
      max_backoff - limits the outer loop number of iterations.
          This parameter should be a power of 2.
      min_ticks - the inner spin wait loop number of "ticks" which is system
          dependent and should be tuned for your system if you so choose.
          The "ticks" on x86 correspond to the time stamp counter,
          but on other architectures ticks is a timestamp derived
          from gettimeofday().
      
      The user can modify these via the environment variable:
      KMP_SPIN_BACKOFF_PARAMS=max_backoff[,min_ticks]
      Currently, since the default user lock is a queuing lock,
      one would have to also specify KMP_LOCK_KIND=tas to use the test-and-set locks.
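
A two-loop truncated binary exponential back off can be sketched as below. This is one plausible reading of the description, not the libomp lock: the BackoffTasLock class, the default values, and the contended_count() helper are hypothetical. The sketch also counts loop iterations where the message talks about time stamp "ticks", and it uses max_backoff as a cap on the doubling multiplier.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical test-and-set lock with truncated binary exponential back off.
class BackoffTasLock {
public:
  // Parameter names follow the message; the defaults are arbitrary.
  explicit BackoffTasLock(std::uint32_t max_backoff = 1024,
                          std::uint32_t min_ticks = 64)
      : max_backoff_(max_backoff), min_ticks_(min_ticks) {
    assert(max_backoff >= 1 && max_backoff <= (1u << 30));
  }

  void lock() {
    std::uint32_t backoff = 1;
    // Outer loop: one test-and-set attempt per pass.
    while (locked_.exchange(true, std::memory_order_acquire)) {
      // Inner loop: spin for backoff * min_ticks iterations before retrying.
      const std::uint64_t spins = std::uint64_t(backoff) * min_ticks_;
      for (std::uint64_t i = 0; i < spins; ++i)
        (void)locked_.load(std::memory_order_relaxed);
      // Double the wait, truncated at max_backoff.
      backoff = std::min<std::uint32_t>(backoff * 2, max_backoff_);
    }
  }

  void unlock() { locked_.store(false, std::memory_order_release); }

private:
  std::atomic<bool> locked_{false};
  std::uint32_t max_backoff_;
  std::uint32_t min_ticks_;
};

// Usage: several threads increment a plain counter under the lock.
long contended_count(int num_threads, int iters) {
  BackoffTasLock lock;
  long counter = 0;
  std::vector<std::thread> pool;
  for (int t = 0; t < num_threads; ++t) {
    pool.emplace_back([&] {
      for (int i = 0; i < iters; ++i) {
        lock.lock();
        ++counter;
        lock.unlock();
      }
    });
  }
  for (auto &th : pool)
    th.join();
  return counter;
}
```

Backing off between attempts keeps waiting threads from hammering the lock's cache line, which is where the contended-case gain is expected to come from.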
      
      Differential Revision: http://reviews.llvm.org/D19020
      
      llvm-svn: 266329
  15. Jan 27, 2016
  16. Dec 19, 2015
  17. Nov 30, 2015
    • Adding Hwloc library option for affinity mechanism · 01dcf36b
      Jonathan Peyton authored
      These changes allow libhwloc to be used as the topology discovery/affinity
      mechanism for libomp.  It is supported on Unices. The code additions:
      * Canonicalize KMP_CPU_* interface macros so bitmask operations are
        implementation independent and work with both hwloc bitmaps and libomp
        bitmaps.  So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and
        the like. These are all in kmp.h and appropriately placed.
      * Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc
        interface to create a libomp address2os object which the rest of libomp knows
        how to handle already.
      * To build, use -DLIBOMP_USE_HWLOC=on and
        -DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake
        can't find the library or hwloc.h, then it will tell you and exit.
      
      Differential Revision: http://reviews.llvm.org/D13991
      
      llvm-svn: 254320
  18. Nov 09, 2015
    • Fixes to wait-loop code · 3f5dfc25
      Jonathan Peyton authored
      1) Add get_ptr_type() method to all wait flag types.
      2) The flag in sleep_loc may change type by the time resume is called from
         __kmp_null_resume_wrapper. We use get_ptr_type to obtain the real type
         and compare it to the type of the cast object received. If they don't
         match, we know the flag has changed (already resumed and replaced by
         another flag). If they match, it doesn't hurt to go ahead and resume it.
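
The type check in item 2 can be sketched as follows. Only the get_ptr_type() name comes from the message; the FlagType enumeration, the FlagBase, Flag32 and Flag64 structs, and resume_if_type_matches() are hypothetical simplifications of the wait flag types.

```cpp
#include <cassert>

// Hypothetical type tags for the different wait flag kinds.
enum class FlagType { flag32, flag64, oncore };

// Every flag records its own kind so that a stale pointer can be detected.
struct FlagBase {
  explicit FlagBase(FlagType t) : type(t) {}
  FlagType get_ptr_type() const { return type; }
  FlagType type;
  bool resumed = false;
};

struct Flag32 : FlagBase {
  Flag32() : FlagBase(FlagType::flag32) {}
};

struct Flag64 : FlagBase {
  Flag64() : FlagBase(FlagType::flag64) {}
};

// Resume only if the flag currently stored in the sleep location still has
// the type the caller assumed when it cast the pointer.
bool resume_if_type_matches(FlagBase *sleep_loc, FlagType expected) {
  assert(sleep_loc != nullptr);
  if (sleep_loc->get_ptr_type() != expected)
    return false;  // already resumed and replaced by a different flag
  sleep_loc->resumed = true;  // a match means resuming is harmless
  return true;
}
```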
      
      Differential Revision: http://reviews.llvm.org/D14458
      
      llvm-svn: 252487
  19. Nov 04, 2015
  20. Oct 08, 2015
  21. Sep 21, 2015
  22. Aug 26, 2015
  23. Aug 11, 2015
    • Tidy statistics collection · 45be4500
      Jonathan Peyton authored
      This removes some statistics counters and timers which were not used,
      adds new counters and timers for some language features that were not
      monitored previously and separates the counters and timers into those
      which are of interest for investigating user code and those which are
      only of interest to the developer of the runtime itself.
      The runtime developer statistics are now only collected if the
      additional #define KMP_DEVELOPER_STATS is set.
      
      Additional user statistics which are now collected include:
      * Count of nested parallelism (omp parallel inside a parallel region)
      * Count of omp distribute occurrences
      * Count of omp teams occurrences
      * Counts of task related statistics (taskyield, task execution, task
        cancellation, task steal)
      * Values passed to omp_set_num_threads
      * Time spent in omp single and omp master
      
      None of this affects code compiled without stats gathering enabled,
      which is the normal library build mode.
      
      This also fixes the CMake build by linking to the standard c++ library
      when building the stats library as it is a requirement.  The normal library
      does not have this requirement and its link phase is left alone.
      
      Differential Revision: http://reviews.llvm.org/D11759
      
      llvm-svn: 244677
  24. Aug 05, 2015
  25. Jul 13, 2015
    • Fix some bugs in OMPT support · 122dd76f
      Jonathan Peyton authored
      1.) In kmp_csupport.c, move computation of parameters only needed for OMPT tracing
      inside a conditional to reduce overhead if not receiving ompt_event_master_begin
      callbacks.
      2.) In kmp_gsupport.c, remove a spurious reset of OMPT reenter_runtime_frame (which
      is set in its caller, GOMP_parallel_start); correct the placement of #if OMP_TRACE so
      that state is maintained even if tracing support is not included.
      3.) In z_Linux_util.c, add architecture-independent support for OMPT by setting
      and resetting OMPT's exit_frame_ptr before and after invoking a microtask.
      4.) On the Intel MIC, the loader refuses to retain static symbols in the
      libomp.so shared library, even though tools need them. The loader could not be
      bullied into doing so. To accommodate this, I changed the visibility of OMPT
      placeholder functions to public. This required additions in exports.so.txt and
      adding extern "C" scoping in ompt-general.c so that the public placeholder
      symbols won't be mangled.
      
      Patch by John Mellor-Crummey
      
      Differential Revision: http://reviews.llvm.org/D11062
      
      llvm-svn: 242052
  26. Jun 08, 2015
  27. Jun 04, 2015
    • Fix some sign compare warnings. · 1e7a1ddc
      Jonathan Peyton authored
      This change makes kmp_bstate.old_tid a signed integer instead of an unsigned integer.
      It also defines two new macros, KMP_NSEC_PER_SEC and KMP_USEC_PER_SEC, which let us take
      control of the sign (we want them to be longs).  Also, in kmp_wait_release.h, the byteref()
      function's return type is changed from char to unsigned char.
      
      llvm-svn: 239057
  28. Apr 02, 2015
  29. Mar 10, 2015
  30. Feb 20, 2015