  1. Nov 30, 2015
    • Adding Hwloc library option for affinity mechanism · 01dcf36b
      Jonathan Peyton authored
      These changes allow libhwloc to be used as the topology discovery/affinity
      mechanism for libomp.  It is supported on Unices. The code additions:
      * Canonicalize KMP_CPU_* interface macros so bitmask operations are
        implementation independent and work with both hwloc bitmaps and libomp
        bitmaps.  So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and
        the like. These are all in kmp.h and appropriately placed.
      * Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc
        interface to create a libomp address2os object which the rest of libomp knows
        how to handle already.
      * To build, use -DLIBOMP_USE_HWLOC=on and
        -DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake
        can't find the library or hwloc.h, then it will tell you and exit.
      
      Differential Revision: http://reviews.llvm.org/D13991
      
      llvm-svn: 254320
      01dcf36b
  2. Nov 16, 2015
  3. Nov 12, 2015
  4. Nov 11, 2015
  5. Nov 09, 2015
    • Fixes to wait-loop code · 3f5dfc25
      Jonathan Peyton authored
      1) Add get_ptr_type() method to all wait flag types.
      2) Flag in sleep_loc may change type by the time the resume is called from
         __kmp_null_resume_wrapper. We use get_ptr_type to obtain the real type
         and compare it to the casted object received. If they don't match, we know
         the flag has changed (already resumed and replaced by another flag). If they
         match, it doesn't hurt to go ahead and resume it.
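The type check in item 2 can be illustrated with a minimal C sketch. The names `flag_kind`, `wait_flag`, and `resume_if_same_kind` are hypothetical and are not the libomp flag classes. The sketch only assumes that each flag records its own type and that the resumer knows which type it expected to find.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag kinds; the real runtime has its own set of flag types. */
enum flag_kind { FLAG_32, FLAG_64, FLAG_ONCORE };

/* Hypothetical flag object that remembers its own kind. */
struct wait_flag {
    enum flag_kind kind;
    int resumed; /* set to 1 once the waiter has been resumed */
};

/* Returns the kind stored in the flag itself (the "real" type). */
static enum flag_kind get_ptr_type(const struct wait_flag *flag) {
    assert(flag != NULL);
    return flag->kind;
}

/* Resumes the waiter only if the stored flag still has the expected kind.
 * A mismatch means the flag was already resumed and replaced by another one.
 * Returns 1 if a resume was performed, 0 otherwise. */
static int resume_if_same_kind(struct wait_flag *flag, enum flag_kind expected) {
    if (flag == NULL || get_ptr_type(flag) != expected)
        return 0;
    flag->resumed = 1;
    return 1;
}
```

Calling `resume_if_same_kind(&flag, FLAG_32)` on a flag whose kind is `FLAG_64` returns 0 and leaves the flag untouched.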
      
      Differential Revision: http://reviews.llvm.org/D14458
      
      llvm-svn: 252487
      3f5dfc25
    • Fixes and improvements to tasking in barriers · b0b83c8b
      Jonathan Peyton authored
      1) When the number of threads in a team increases, new threads need to have all
         their barrier struct fields initialized. We were missing the parent_bar and
         team fields.
      2) For non-forkjoin barriers, we now do the __kmp_task_team_setup before the
         gather. The setup now sets up the task_team that all the threads will switch
         to after the barrier, but it needs to be done before other threads do the
         switch.
      3) Remove an unneeded assignment of tt_found_tasks in task team free function.
      
      Differential Revision: http://reviews.llvm.org/D14456
      
      llvm-svn: 252486
      b0b83c8b
    • Improvements to machine_hierarchy code for re-sizing · 7dee82e7
      Jonathan Peyton authored
      These changes include:
       1) Machine hierarchy now uses the base_num_threads field to indicate the 
          maximum number of threads the current hierarchy can handle without a resize.
       2) In __kmp_get_hierarchy, we need to get depth after any potential resize
          is done.
       3) Cleanup of hierarchy resize code to support 1 above.
      
      Differential Revision: http://reviews.llvm.org/D14455
      
      llvm-svn: 252475
      7dee82e7
    • [OMPT] Add OMPT events for the OpenMP taskwait construct. · 960ea2f6
      Jonathan Peyton authored
      llvm-svn: 252472
      960ea2f6
  6. Nov 06, 2015
    • Fix for zero chunk size · 70bda912
      Jonathan Peyton authored
      Setting a dynamic schedule with chunk size 0 via omp_set_schedule(dynamic,0)
      and then using "schedule (runtime)" caused an infinite loop because, for the
      chunked dynamic schedule, we didn't correct a zero chunk to the default (1).
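A minimal C sketch of why a zero chunk never makes progress and how clamping it to 1 avoids that. The helpers `normalize_chunk` and `count_chunks` are hypothetical illustrations, not libomp functions.

```c
#include <assert.h>

/* Hypothetical helper: any chunk size below 1 is corrected to the default of 1. */
static int normalize_chunk(int chunk) {
    return chunk < 1 ? 1 : chunk;
}

/* Counts how many chunks a simple dynamic schedule would hand out for
 * `iterations` loop iterations. Without the normalization, a chunk of 0
 * would never advance `next`, so the loop would not terminate. */
static int count_chunks(int iterations, int chunk) {
    assert(iterations >= 0);
    chunk = normalize_chunk(chunk);
    int chunks = 0;
    for (int next = 0; next < iterations; next += chunk)
        chunks++;
    return chunks;
}
```

With the correction in place, `count_chunks(10, 0)` behaves like a chunk size of 1 and terminates.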
      
      llvm-svn: 252338
      70bda912
  7. Nov 05, 2015
    • Improve OMPT initialization code · 95246e7d
      Jonathan Peyton authored
      Use of #ifdef OMPT_DEBUG was causing messages to be generated under normal
      operation when the OpenMP library was compiled with KMP_DEBUG enabled.
      Elsewhere, KMP_DEBUG evaluates assertions, but never produces messages during
      normal operation. To avoid this inconsistency, set OMPT_DEBUG using a cmake
      variable LIBOMP_OMPT_DEBUG.
      
      While editing the associated ompt-specific.h and ompt-general.c files,
      I also made the spacing and comments consistent.
      
      Patch by John Mellor-Crummey
      
      Differential Revision: http://reviews.llvm.org/D14355
      
      llvm-svn: 252173
      95246e7d
  8. Nov 04, 2015
  9. Nov 02, 2015
  10. Oct 30, 2015
  11. Oct 29, 2015
    • [OMPT] Windows Support for OMPT · 69e596a5
      Jonathan Peyton authored
      The problem is that the ompt_tool() function (which must be implemented by a
      performance tool) should be defined in the RTL as well to cover the case when
      the tool is not present in the address space of the process. This functionality
      is accomplished with weak symbols in Unices. Unfortunately, Windows does not
      support weak symbols.
      
      The solution in these changes is to grab the list of all modules loaded by the
      process and then search for symbol "ompt_tool()" within them. The function
      ompt_tool_windows() performs the search of the ompt_tool symbol. If ompt_tool is
      found, then its return value is used to initialize the tool. If ompt_tool is not
      found, then ompt_tool_windows() returns NULL and OMPT is thus disabled.
      
      While doing these changes, the OMPT_SUPPORT detection in CMake was changed to
      test for the features required for OMPT_SUPPORT, namely: builtin_frame_address()
      existence, weak attribute existence and psapi.dll existence. For
      LIBOMP_HAVE_OMPT_SUPPORT to be true, the builtin_frame_address() intrinsic
      must exist AND at least one of the following: weak attributes or psapi.dll.
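The detection rule above reduces to a single boolean expression. The following C sketch only restates that rule. The function and parameter names are hypothetical and do not correspond to the CMake variables used by the build.

```c
#include <stdbool.h>

/* Hypothetical restatement of the detection rule:
 * the frame-address intrinsic is mandatory, and at least one mechanism
 * for locating ompt_tool (weak attributes or psapi.dll) must be present. */
static bool have_ompt_support(bool has_frame_address,
                              bool has_weak_attribute,
                              bool has_psapi) {
    return has_frame_address && (has_weak_attribute || has_psapi);
}
```

For example, a platform with the intrinsic and psapi.dll but no weak attributes still qualifies, while a platform without the intrinsic never does.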
      
      Also, since the Process Status API is used, I had to add a new dependency,
      psapi.dll, to the library dependency micro test.
      
      Differential Revision: http://reviews.llvm.org/D14027
      
      llvm-svn: 251654
      69e596a5
  12. Oct 20, 2015
  13. Oct 19, 2015
    • Fix OMP_PLACES negation operator parsing (!place) · 6778c732
      Jonathan Peyton authored
      Just moved the *scan++ line up before the recursive call.  Otherwise,
      infinite recursion occurs and leads to a segmentation fault.
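The ordering issue can be shown with a toy recursive parser in C. This is a hypothetical sketch, not the libomp place parser: it assumes a grammar of the form `place := '!' place | digits` and only demonstrates that the '!' must be consumed before the recursive call.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy parser for: place := '!' place | digits
 * Stores the parsed number in *value and returns 1 if the place is
 * negated an odd number of times, 0 otherwise. */
static int parse_place(const char **scan, int *value) {
    assert(scan != NULL && *scan != NULL && value != NULL);
    if (**scan == '!') {
        (*scan)++; /* consume '!' BEFORE recursing; otherwise the recursive
                      call sees the same '!' again and never terminates */
        return !parse_place(scan, value);
    }
    *value = 0;
    while (isdigit((unsigned char)**scan)) {
        *value = *value * 10 + (**scan - '0');
        (*scan)++;
    }
    return 0;
}
```

If the increment were placed after the recursive call, every call would re-enter on the same character until the stack overflowed.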
      
      llvm-svn: 250729
      6778c732
    • Clean-up cancellation state flag between parallel regions · 45ca5dad
      Jonathan Peyton authored
      Without this fix, cancellation requests in one parallel region cause
      cancellation of the second region even though the second one was
      not intended to be cancelled.
      
      llvm-svn: 250727
      45ca5dad
    • On FreeBSD, PTHREAD_THREADS_MAX does not fit into an int, leading to · 9b8c353c
      Dimitry Andric authored
      warnings similar to the following:
      
          runtime/src/kmp_global.c:117:35: warning: implicit conversion from
          'unsigned long' to 'int' changes value from 18446744073709551615 to -1
          [-Wconstant-conversion]
          int           __kmp_sys_max_nth = KMP_MAX_NTH;
                        ~~~~~~~~~~~~~~~~~   ^~~~~~~~~~~
          runtime/src/kmp.h:849:34: note: expanded from macro 'KMP_MAX_NTH'
          #    define KMP_MAX_NTH          PTHREAD_THREADS_MAX
                                           ^~~~~~~~~~~~~~~~~~~
      
      Clamp KMP_MAX_NTH to INT_MAX to avoid these warnings.  Also use INT_MAX
      whenever PTHREAD_THREADS_MAX is not defined at all.
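A minimal sketch of the clamping idea using the preprocessor. The macro `EXAMPLE_MAX_NTH` and the function `example_sys_max_nth` are hypothetical stand-ins. The sketch assumes that `PTHREAD_THREADS_MAX`, when defined, expands to something the preprocessor can compare with `INT_MAX`.

```c
#include <limits.h>

/* Use PTHREAD_THREADS_MAX only when it is defined and fits in an int;
 * otherwise fall back to INT_MAX so the value is always a valid int. */
#if defined(PTHREAD_THREADS_MAX) && (PTHREAD_THREADS_MAX < INT_MAX)
#  define EXAMPLE_MAX_NTH PTHREAD_THREADS_MAX
#else
#  define EXAMPLE_MAX_NTH INT_MAX
#endif

/* Returns the clamped limit as an int without an implicit narrowing. */
static int example_sys_max_nth(void) {
    return (int)EXAMPLE_MAX_NTH;
}
```

On a platform where the limit is the maximum `unsigned long`, the `#else` branch is taken and the result is `INT_MAX` rather than -1.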
      
      Differential Revision: http://reviews.llvm.org/D13827
      
      llvm-svn: 250708
      9b8c353c
  14. Oct 16, 2015
    • [OMPT] Add OMPT events for API locking · 0e6d4577
      Jonathan Peyton authored
      This fix implements the following OMPT events for the API locking routines:
      * ompt_event_acquired_lock
      * ompt_event_acquired_nest_lock_first
      * ompt_event_acquired_nest_lock_next
      * ompt_event_init_lock
      * ompt_event_init_nest_lock
      * ompt_event_destroy_lock
      * ompt_event_destroy_nest_lock
      
      For the acquired events the depth of the lock is required, so a return value
      was added, similar to the return values we already have for the release lock
      routines.
      
      Patch by Tim Cramer
      
      Differential Revision: http://reviews.llvm.org/D13689
      
      llvm-svn: 250526
      0e6d4577
  15. Oct 13, 2015
  16. Oct 12, 2015
  17. Oct 09, 2015
    • [OMPT] Reduce overhead of OMPT · f0344bb0
      Jonathan Peyton authored
      * Avoid computing state needed only by OMPT unless the ompt_enabled flag is set.
      * Properly handle a corner case in OMPT where team == NULL.
      
      Patch by John Mellor-Crummey
      
      Differential Revision: http://reviews.llvm.org/D13502
      
      llvm-svn: 249857
      f0344bb0
    • [OMPT] Initialize task fields only if needed · b401db6d
      Jonathan Peyton authored
      Because __kmp_task_init_ompt is called for every initial task in each thread
      and always generated task ids, it was a significant performance issue on larger
      systems even without any tool attached.  After changing the initialization
      interface to ompt_tool, we can now rely on already knowing whether a tool is
      attached and OMPT is enabled at this point.
      
      Patch by Jonas Hahnfeld
      
      Differential Revision: http://reviews.llvm.org/D13494
      
      llvm-svn: 249855
      b401db6d
  18. Oct 08, 2015
    • Debug trace and assert statement changes for wait/release improvements. · e03b62f3
      Jonathan Peyton authored
      These changes improve/update the trace messages and debug asserts related to
      the previous wait/release checkin.
      
      llvm-svn: 249717
      e03b62f3
    • OpenMP Wait/release improvements. · a0e159f7
      Jonathan Peyton authored
      These changes improve the wait/release mechanism for threads that handle
      tasks while spinning in barriers, by providing feedback to the barriers
      about any task stealing that occurs.
      
      Differential Revision: http://reviews.llvm.org/D13353
      
      llvm-svn: 249711
      a0e159f7
    • Added sockets to the syntax of KMP_PLACE_THREADS environment variable. · dd4aa9b6
      Jonathan Peyton authored
      Added (optional) sockets to the syntax of the KMP_PLACE_THREADS environment variable.
      Some limitations:
      * The number of sockets and then optional offset should be specified first (before other parameters).
      * The letter designation is mandatory for sockets and then for other parameters.
      * If number of cores is specified first, then the number of sockets is defaulted to all sockets on the machine; also, the old syntax is partially supported if sockets are skipped.
      * If number of threads per core is specified first, then the number of sockets and cores per socket are defaulted to all sockets and all cores per socket respectively.
      * The number of cores per socket cannot be specified before sockets or after threads per core.
      * The number of threads per core can be specified before or after core-offset (old syntax required it to be before core-offset).
      * The parameter delimiter can be: empty, a comma, or a lower-case x.
      * Spaces are allowed around numbers, letters, and the delimiter.
      Approximate shorthand specification:
      KMP_PLACE_THREADS="[num_sockets(S|s)[[delim]offset(O|o)][delim]][num_cores_per_socket(C|c)[[delim]offset(O|o)][delim]][num_threads_per_core(T|t)]"
      
      Differential Revision: http://reviews.llvm.org/D13175
      
      llvm-svn: 249708
      dd4aa9b6
  19. Sep 25, 2015
    • Fix memory corruption in Windows debug library · 7edeef1b
      Jonathan Peyton authored
      This patch adjusts the buffer size when reducing the buffer used for printing.
      This solves the memory corruption in Windows debug library, and potential
      memory corruption in other builds.
      
      llvm-svn: 248588
      7edeef1b
  20. Sep 24, 2015
  21. Sep 23, 2015
    • Update Reference.pdf files. · 1acc2dbf
      Jonathan Peyton authored
      This updates the Reference.pdf files to say LLVM OpenMP Runtime Library and
      also updates the build documentation to show how to build with CMake.
      
      llvm-svn: 248407
      1acc2dbf