  1. Nov 14, 2016
    • Introduce dynamic affinity dispatch capabilities · 1cdd87ad
      Jonathan Peyton authored
      This set of changes enables the affinity interface (either the preexisting
      native operating system interface or HWLOC) to be selected dynamically at
      runtime initialization. The motivation is that we were seeing performance
      degradations when using HWLOC. This allows the user to fall back to the old
      affinity mechanisms, which on large machines (>64 cores) makes a large
      difference in initialization time.
      
      These changes mostly move affinity code under a small class hierarchy:
      
      KMPAffinity
        class Mask {}
      KMPNativeAffinity : public KMPAffinity
        class Mask : public KMPAffinity::Mask
      KMPHwlocAffinity : public KMPAffinity
        class Mask : public KMPAffinity::Mask
      
      Since all interface functions (for both affinity and the mask implementation)
      are virtual, the implementation can be chosen at runtime initialization.
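A minimal sketch of the dispatch idea described above, not the actual libomp code. The class names follow the commit message; the member functions (`set`, `is_set`, `allocate_mask`, `name`), the `pick_affinity` helper, and the use of `std::bitset` and `std::set` as mask storage are assumptions made for illustration only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>

// Simplified stand-ins for the hierarchy named in the commit message.
// All member names here are hypothetical.
class KMPAffinity {
public:
  class Mask {
  public:
    virtual ~Mask() = default;
    virtual void set(int cpu) = 0;
    virtual bool is_set(int cpu) const = 0;
  };
  virtual ~KMPAffinity() = default;
  virtual std::unique_ptr<Mask> allocate_mask() const = 0;
  virtual std::string name() const = 0;
};

// Fixed-size bitmask standing in for the native OS mask.
class KMPNativeAffinity : public KMPAffinity {
public:
  class Mask : public KMPAffinity::Mask {
    std::bitset<1024> bits_;

  public:
    void set(int cpu) override { bits_.set(static_cast<std::size_t>(cpu)); }
    bool is_set(int cpu) const override {
      return bits_.test(static_cast<std::size_t>(cpu));
    }
  };
  std::unique_ptr<KMPAffinity::Mask> allocate_mask() const override {
    return std::make_unique<Mask>();
  }
  std::string name() const override { return "native"; }
};

// std::set stands in for an hwloc bitmap; no real hwloc call is made.
class KMPHwlocAffinity : public KMPAffinity {
public:
  class Mask : public KMPAffinity::Mask {
    std::set<int> cpus_;

  public:
    void set(int cpu) override { cpus_.insert(cpu); }
    bool is_set(int cpu) const override { return cpus_.count(cpu) != 0; }
  };
  std::unique_ptr<KMPAffinity::Mask> allocate_mask() const override {
    return std::make_unique<Mask>();
  }
  std::string name() const override { return "hwloc"; }
};

// The implementation is chosen once, at runtime initialization.
std::unique_ptr<KMPAffinity> pick_affinity(bool use_hwloc) {
  if (use_hwloc)
    return std::make_unique<KMPHwlocAffinity>();
  return std::make_unique<KMPNativeAffinity>();
}
```

Callers only ever hold a `KMPAffinity` pointer, so the rest of the code does not need to know which backend was picked.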
      
      Differential Revision: https://reviews.llvm.org/D26356
      
      llvm-svn: 286890
  2. Sep 12, 2016
    • Fix bitmask upper bounds check · 7c465a5f
      Jonathan Peyton authored
      Rather than checking KMP_CPU_SETSIZE, which doesn't exist when using Hwloc, we
      use the get_max_proc() function which can vary based on the operating system.
      For example on Windows with multiple processor groups, it might be the case that
      the highest bit possible in the bitmask is not equal to the number of hardware
      threads on the machine but something higher than that.
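As a rough illustration of the bounds check described above, assuming a mask whose capacity is only known at run time. `DynamicMask` and `try_set` are hypothetical names; only `get_max_proc` is taken from the commit message, and treating it as the highest valid bit index is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mask whose size is decided at run time rather than by a
// compile-time constant such as KMP_CPU_SETSIZE.
class DynamicMask {
  std::vector<bool> bits_;

public:
  // max_proc is assumed to be the highest bit index the mask can hold.
  explicit DynamicMask(int max_proc)
      : bits_(static_cast<std::size_t>(max_proc) + 1, false) {}

  int get_max_proc() const { return static_cast<int>(bits_.size()) - 1; }

  // Returns false instead of writing out of range.
  bool try_set(int proc) {
    if (proc < 0 || proc > get_max_proc())
      return false;
    bits_[static_cast<std::size_t>(proc)] = true;
    return true;
  }

  bool is_set(int proc) const {
    return proc >= 0 && proc <= get_max_proc() &&
           bits_[static_cast<std::size_t>(proc)];
  }
};
```

For instance, a machine with two 64-bit processor groups could expose bit 100 even when it has fewer than 100 hardware threads, so the check is made against the mask's own limit, not the thread count.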
      
      Differential Revision: https://reviews.llvm.org/D24206
      
      llvm-svn: 281245
  3. Sep 02, 2016
  4. Aug 05, 2016
  5. Jul 29, 2016
  6. Jul 08, 2016
  7. Jun 21, 2016
  8. Jun 16, 2016
  9. Jun 13, 2016
    • Affinity mask processing improvements · c5304aa3
      Jonathan Peyton authored
      Remove static specifier from var fullMask and remove kmp_get_fullMask() routine.
      When iterating through procs in a mask, always check if proc is in fullMask
      (this check was missing in a few places).
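The membership check described above can be sketched as follows. This is an illustrative loop, not the libomp macros: `Mask`, `kMaxProcs`, and `usable_procs` are hypothetical names, and `std::bitset` stands in for the runtime's mask type.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxProcs = 64; // illustrative size only
using Mask = std::bitset<kMaxProcs>;

// Collects the procs set in `mask`, skipping any proc that is not also
// present in `full_mask` (the set of procs the process may use).
std::vector<int> usable_procs(const Mask &mask, const Mask &full_mask) {
  std::vector<int> procs;
  for (std::size_t p = 0; p < kMaxProcs; ++p) {
    if (!mask.test(p))
      continue;
    if (!full_mask.test(p))
      continue; // the check that was missing in a few places
    procs.push_back(static_cast<int>(p));
  }
  return procs;
}
```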
      
      Patch by Brian Bliss.
      
      Differential Revision: http://reviews.llvm.org/D21300
      
      llvm-svn: 272589
    • Hwloc refactoring patch · 202a24dd
      Jonathan Peyton authored
      These changes remove the hwloc_topology_ignore_type function which doesn't exist
      in the hwloc 2.0 API. In the existing code, the topology extracted from hwloc
      has the cache levels stripped out and then assumes the final stripped topology
      follows the typical three-level topology: packages -> cores -> HW threads.
      But the code is doing unclean manipulations to determine at what level those
      resources are located and also assumes too much about what hwloc is detecting
      (there could be intermediate levels in between socket and core for instance).
      This new way of extracting the topology doesn't strip out any hardware objects
      that hwloc detects. It does not assume the three-level topology, and instead
      searches for the relevant three levels within the topology for each bit of
      information using hwloc interface functions. That is, the three-level topology
      subset that our affinity code is interested in is extracted from the hwloc
      topology tree directly.
      
      For example, the new __kmp_hwloc_get_nobjs_under_obj function gives the user the
      number of cores under a socket reliably without worrying if there are unexpected
      objects between the socket object and core object in the hwloc topology
      structure. Also, now that all topology information is kept, there are also
      possibilities of using the caches/numa nodes to determine more sophisticated
      affinity settings in the future.
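To illustrate counting objects under an ancestor regardless of intermediate levels, here is a small sketch over a hypothetical tree type. It does not use the hwloc API and is not the body of __kmp_hwloc_get_nobjs_under_obj; `TopoObj`, `ObjType`, `make_core`, and `count_under` are invented for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical topology node; a real hwloc tree has many more object types.
enum class ObjType { Package, Cache, Core, PU };

struct TopoObj {
  ObjType type;
  std::vector<TopoObj> children;
};

// Builds a core with the given number of hardware threads (PUs).
TopoObj make_core(int num_threads) {
  TopoObj core{ObjType::Core, {}};
  for (int i = 0; i < num_threads; ++i)
    core.children.push_back(TopoObj{ObjType::PU, {}});
  return core;
}

// Counts descendants of `obj` with type `wanted`, descending through any
// intermediate objects (caches, groups, ...) that sit in between.
int count_under(const TopoObj &obj, ObjType wanted) {
  int count = 0;
  for (const TopoObj &child : obj.children) {
    if (child.type == wanted)
      ++count;
    else
      count += count_under(child, wanted);
  }
  return count;
}
```

Because the search recurses until it finds the wanted type, a cache level between the package and its cores does not change the result.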
      
      There is also some cleanup code added for the destruction of the
      __kmp_hwloc_topology object.
      
      Differential Revision: http://reviews.llvm.org/D21195
      
      llvm-svn: 272565
  10. Apr 25, 2016
  11. Jan 12, 2016
    • New API for restoring current thread's affinity to init affinity of application · 3076fa4c
      Jonathan Peyton authored
      This new API, int kmp_set_thread_affinity_mask_initial(), is available for use
      by other parallel runtime libraries inside a possibly OpenMP-registered thread.
      This entry point restores the current thread's affinity mask to the affinity
      mask of the application when it first began. If -1 is returned it can be assumed
      that either the thread hasn't called affinity initialization or that the thread
      isn't registered with the OpenMP library. If 0 is returned, then the call
      was successful. Any return value greater than zero indicates an error occurred
      when setting affinity.
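The return-value convention described above could be handled along these lines. The sketch only classifies an integer result; `RestoreResult` and `classify_restore_result` are hypothetical names, and negative values other than -1 are treated as undocumented because the commit message does not cover them.

```cpp
#include <cassert>

// Hypothetical classification of the value returned by
// kmp_set_thread_affinity_mask_initial(), per the commit message.
enum class RestoreResult {
  Restored,          // 0: the initial affinity mask was restored
  NotReady,          // -1: affinity not initialized or thread unregistered
  SetAffinityFailed, // > 0: an error occurred when setting affinity
  Undocumented       // any other negative value (not described)
};

RestoreResult classify_restore_result(int rc) {
  if (rc == 0)
    return RestoreResult::Restored;
  if (rc == -1)
    return RestoreResult::NotReady;
  if (rc > 0)
    return RestoreResult::SetAffinityFailed;
  return RestoreResult::Undocumented;
}
```

A caller linked against libomp would pass the result of `kmp_set_thread_affinity_mask_initial()` to this helper; the call itself is left out so the sketch builds without the runtime.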
      
      Differential Revision: http://reviews.llvm.org/D15867
      
      llvm-svn: 257489
  12. Nov 30, 2015
    • Adding Hwloc library option for affinity mechanism · 01dcf36b
      Jonathan Peyton authored
      These changes allow libhwloc to be used as the topology discovery/affinity
      mechanism for libomp.  It is supported on Unices. The code additions:
      * Canonicalize KMP_CPU_* interface macros so bitmask operations are
        implementation independent and work with both hwloc bitmaps and libomp
        bitmaps.  So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and
        the like. These are all in kmp.h and appropriately placed.
      * Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc
        interface to create a libomp address2os object which the rest of libomp knows
        how to handle already.
      * To build, use -DLIBOMP_USE_HWLOC=on and
        -DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake
        can't find the library or hwloc.h, then it will tell you and exit.
      
      Differential Revision: http://reviews.llvm.org/D13991
      
      llvm-svn: 254320
  13. Nov 09, 2015
    • Improvements to machine_hierarchy code for re-sizing · 7dee82e7
      Jonathan Peyton authored
      These changes include:
       1) Machine hierarchy now uses the base_num_threads field to indicate the 
          maximum number of threads the current hierarchy can handle without a resize.
       2) In __kmp_get_hierarchy, we need to get depth after any potential resize
          is done.
       3) Cleanup of hierarchy resize code to support 1 above.
      
      Differential Revision: http://reviews.llvm.org/D14455
      
      llvm-svn: 252475
  14. Oct 19, 2015
  15. Oct 08, 2015
    • Added sockets to the syntax of KMP_PLACE_THREADS environment variable. · dd4aa9b6
      Jonathan Peyton authored
      Added (optional) sockets to the syntax of the KMP_PLACE_THREADS environment variable.
      Some limitations:
      * The number of sockets and then optional offset should be specified first (before other parameters).
      * The letter designation is mandatory for sockets and then for other parameters.
      * If number of cores is specified first, then the number of sockets is defaulted to all sockets on the machine; also, the old syntax is partially supported if sockets are skipped.
      * If number of threads per core is specified first, then the number of sockets and cores per socket are defaulted to all sockets and all cores per socket respectively.
      * The number of cores per socket cannot be specified before sockets or after threads per core.
      * The number of threads per core can be specified before or after core-offset (old syntax required it to be before core-offset).
      * The parameter delimiter can be empty, a comma, or a lower-case x.
      * Spaces are allowed around numbers, letters, and delimiters.
      Approximate shorthand specification:
      KMP_PLACE_THREADS="[num_sockets(S|s)[[delim]offset(O|o)][delim]][num_cores_per_socket(C|c)[[delim]offset(O|o)][delim]][num_threads_per_core(T|t)]"
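A simplified parser for the shorthand above, written as an illustration rather than as the libomp implementation. `PlaceThreads` and `parse_place_threads` are hypothetical names. The sketch assumes every number must carry a letter, so it does not cover the partially supported old syntax, and it rejects an offset that follows the thread count unless a core count precedes it.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Parsed fields; 0 means "not given" (the runtime would apply its default).
struct PlaceThreads {
  int sockets = 0;
  int socket_offset = 0;
  int cores = 0;
  int core_offset = 0;
  int threads = 0;
};

std::optional<PlaceThreads> parse_place_threads(const std::string &text) {
  PlaceThreads out;
  bool seen_s = false, seen_c = false, seen_t = false;
  char offset_target = 0; // 's' or 'c' while an offset may still follow
  std::size_t i = 0;
  auto is_space = [&](std::size_t k) {
    return std::isspace(static_cast<unsigned char>(text[k])) != 0;
  };
  auto is_digit = [&](std::size_t k) {
    return std::isdigit(static_cast<unsigned char>(text[k])) != 0;
  };
  for (;;) {
    // Skip spaces and the optional delimiters (comma or lower-case x).
    while (i < text.size() && (is_space(i) || text[i] == ',' || text[i] == 'x'))
      ++i;
    if (i == text.size())
      break;
    if (!is_digit(i))
      return std::nullopt;
    int value = 0;
    while (i < text.size() && is_digit(i)) {
      value = value * 10 + (text[i] - '0');
      if (value > 1000000)
        return std::nullopt; // guard against overflow
      ++i;
    }
    while (i < text.size() && is_space(i))
      ++i;
    if (i == text.size())
      return std::nullopt; // the letter designation is mandatory
    const char unit = static_cast<char>(
        std::tolower(static_cast<unsigned char>(text[i++])));
    switch (unit) {
    case 's': // sockets must come first
      if (seen_s || seen_c || seen_t)
        return std::nullopt;
      out.sockets = value;
      seen_s = true;
      offset_target = 's';
      break;
    case 'c': // cores may not follow threads
      if (seen_c || seen_t)
        return std::nullopt;
      out.cores = value;
      seen_c = true;
      offset_target = 'c';
      break;
    case 't':
      if (seen_t)
        return std::nullopt;
      out.threads = value;
      seen_t = true;
      if (offset_target == 's')
        offset_target = 0; // an offset after threads may only apply to cores
      break;
    case 'o':
      if (offset_target == 's')
        out.socket_offset = value;
      else if (offset_target == 'c')
        out.core_offset = value;
      else
        return std::nullopt;
      offset_target = 0;
      break;
    default:
      return std::nullopt;
    }
  }
  return out;
}
```

With this sketch, "2s,4c,2t" yields two sockets, four cores per socket, and two threads per core, while "2t4c" is rejected because cores appear after threads.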
      
      Differential Revision: http://reviews.llvm.org/D13175
      
      llvm-svn: 249708
  16. Sep 25, 2015
    • Fix memory corruption in Windows debug library · 7edeef1b
      Jonathan Peyton authored
      This patch adjusts the buffer size when reducing the buffer used for printing.
      This solves the memory corruption in Windows debug library, and potential
      memory corruption in other builds.
      
      llvm-svn: 248588
  17. Sep 10, 2015
    • Fix depth field bug and resize() function in hierarchical barrier · df4d3dd6
      Jonathan Peyton authored
      This is a follow up to the hierarchy cleanup patch.
      Added some clarifying comments to hierarchy_info.
      Fixed a bug with the depth field not being updated cleanly during a resize.
      Fixed resize to first check capacity as determined by maxLevels before actually doing the full resize.
      
      Differential Revision: http://reviews.llvm.org/D12562
      
      llvm-svn: 247333
    • Cleanup of affinity hierarchy code. · 1707836b
      Jonathan Peyton authored
      Some of this is improvement to code suggested by Hal Finkel. Four changes here:
      1. Cleanup of hierarchy code to handle all hierarchy cases whether affinity is available or not
      2. Separated this and other classes and common functions out to a header file
      3. Added a destructor-like fini function for the hierarchy (and call in __kmp_cleanup)
      4. Remove some redundant code that is hopefully no longer needed
      
      Differential Revision: http://reviews.llvm.org/D12449
      
      llvm-svn: 247326
  18. Aug 25, 2015
    • Fix machine topology pruning. · 62f3840c
      Jonathan Peyton authored
      This patch fixes a bug when eliminating layers in the machine topology (namely
      cores and threads). Before this patch, if a user specifies using only one
      thread per socket, then affinity is not set properly due to bad topology
      pruning.
      
      Differential Revision: http://reviews.llvm.org/D11158
      
      llvm-svn: 245966
  19. Jun 22, 2015
    • Allow machine hierarchy expansion · 7f09a98a
      Jonathan Peyton authored
      This fix allows the machine hierarchy to be expanded in case it needs to handle 
      more threads. It adds a resize function to accomplish this.
      
      Differential Revision: http://reviews.llvm.org/D9900
      
      llvm-svn: 240292
    • Re-enable Visual Studio Builds. · 7be07533
      Jonathan Peyton authored
      I tried to compile with Visual Studio using CMake and found these two sections of code 
      causing problems for Visual Studio.  The first one removes the use of variable length 
      arrays by instead using KMP_ALLOCA().  The second part eliminates a redundant cpuid 
      assembly call by using the already existing __kmp_x86_cpuid() call instead.
      
      llvm-svn: 240290
  20. Jun 01, 2015
    • Apply name change to src/* files. · 66338295
      Jonathan Peyton authored
      These changes are mostly in comments, but there are a few
      that aren't.  Change libiomp5 => libomp everywhere.  One internal
      function name is changed in kmp_gsupport.c, and in kmp_i18n.c, the
      static char[] variable 'name' is changed to "libomp".
      
      llvm-svn: 238712
  21. May 28, 2015
  22. Apr 13, 2015
    • The generation of the hierarchy used by hierarchical barrier improved in how... · aa1f2b63
      Andrey Churbanov authored
      Improved how the generation of the hierarchy used by the hierarchical barrier reacts to affinity set to none, affinity disabled, no affinity available, or oversubscription. Some cleanup based on review comments is still to follow: meaningful names (e.g., enumerators) should replace numeric constants.
      
      llvm-svn: 234775
  23. Apr 02, 2015
  24. Mar 10, 2015
  25. Mar 05, 2015
  26. Feb 10, 2015
  27. Jan 29, 2015
  28. Jan 27, 2015