  1. Nov 04, 2015
  2. Oct 19, 2015
      On FreeBSD, PTHREAD_THREADS_MAX does not fit into an int · 9b8c353c
      Dimitry Andric authored
      This leads to warnings similar to the following:
      
          runtime/src/kmp_global.c:117:35: warning: implicit conversion from
          'unsigned long' to 'int' changes value from 18446744073709551615 to -1
          [-Wconstant-conversion]
          int           __kmp_sys_max_nth = KMP_MAX_NTH;
                        ~~~~~~~~~~~~~~~~~   ^~~~~~~~~~~
          runtime/src/kmp.h:849:34: note: expanded from macro 'KMP_MAX_NTH'
          #    define KMP_MAX_NTH          PTHREAD_THREADS_MAX
                                           ^~~~~~~~~~~~~~~~~~~
      
      Clamp KMP_MAX_NTH to INT_MAX to avoid these warnings.  Also use INT_MAX
      whenever PTHREAD_THREADS_MAX is not defined at all.
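
A minimal sketch of the clamp described above, not the text of the actual patch. The names EXAMPLE_MAX_NTH and example_max_threads are hypothetical stand-ins, and the sketch assumes that PTHREAD_THREADS_MAX, where it is defined, is a plain integer constant that the preprocessor can evaluate in an #if:

```c
#include <limits.h>   /* INT_MAX; POSIX also places PTHREAD_THREADS_MAX here */

/* EXAMPLE_MAX_NTH is a hypothetical stand-in for KMP_MAX_NTH.
 * Use the system limit only when it exists and fits into an int;
 * otherwise fall back to INT_MAX. */
#if defined(PTHREAD_THREADS_MAX) && (PTHREAD_THREADS_MAX < INT_MAX)
#  define EXAMPLE_MAX_NTH PTHREAD_THREADS_MAX
#else
#  define EXAMPLE_MAX_NTH INT_MAX
#endif

/* Hypothetical helper: the value is now always representable as an int,
 * so no constant-conversion warning is expected here. */
int example_max_threads(void)
{
    int max_nth = EXAMPLE_MAX_NTH;
    return max_nth;
}
```

The comparison happens in the preprocessor, where the values are evaluated in the widest integer type, so a limit such as 18446744073709551615 is rejected before it can be narrowed to an int.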
      
      Differential Revision: http://reviews.llvm.org/D13827
      
      llvm-svn: 250708
      9b8c353c
  3. Oct 09, 2015
      [OMPT] Initialize task fields only if needed · b401db6d
      Jonathan Peyton authored
      Because __kmp_task_init_ompt is called for every initial task in each thread
      and always generated task IDs, it was a significant performance issue on
      larger systems, even without any tool attached.  Now that the initialization
      interface has changed to ompt_tool, we can rely on already knowing at this
      point whether a tool is attached and OMPT is enabled.
      
      Patch by Jonas Hahnfeld
      
      Differential Revision: http://reviews.llvm.org/D13494
      
      llvm-svn: 249855
      b401db6d
  4. Oct 08, 2015
      Added sockets to the syntax of KMP_PLACE_THREADS environment variable. · dd4aa9b6
      Jonathan Peyton authored
      Added (optional) sockets to the syntax of the KMP_PLACE_THREADS environment variable.
      Some limitations:
      * The number of sockets and then optional offset should be specified first (before other parameters).
      * The letter designation is mandatory for sockets and then for other parameters.
      * If number of cores is specified first, then the number of sockets is defaulted to all sockets on the machine; also, the old syntax is partially supported if sockets are skipped.
      * If number of threads per core is specified first, then the number of sockets and cores per socket are defaulted to all sockets and all cores per socket respectively.
      * The number of cores per socket cannot be specified before sockets or after threads per core.
      * The number of threads per core can be specified before or after core-offset (the old syntax required it to be before core-offset).
      * The parameter delimiter can be empty, a comma, or a lower-case x.
      * Spaces are allowed around numbers, letters, and delimiters.
      Approximate shorthand specification:
      KMP_PLACE_THREADS="[num_sockets(S|s)[[delim]offset(O|o)][delim]][num_cores_per_socket(C|c)[[delim]offset(O|o)][delim]][num_threads_per_core(T|t)]"
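
The shorthand above can be illustrated with a small parser. This is only a sketch under stated assumptions, not the runtime's parsing code: the type place_spec and the function parse_place_threads are hypothetical, 0 is used to mean "not specified, use the default", and the sketch is more lenient than the rules listed above (for example, it accepts repeated delimiters).

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result type; 0 means "not specified, use the default". */
typedef struct {
    int sockets, socket_offset;
    int cores, core_offset;
    int threads;
} place_spec;

/* Hypothetical parser for strings such as "2s,1o,4c,2t".
 * Returns 0 on success and -1 on a syntax error. */
int parse_place_threads(const char *s, place_spec *out)
{
    place_spec p = {0, 0, 0, 0, 0};
    int stage = 0;       /* 1: sockets seen, 2: cores seen, 3: threads seen */
    int *offset = NULL;  /* field that the next 'o' value belongs to */

    for (;;) {
        /* Delimiters: nothing, spaces, a comma, or a lower-case x. */
        while (isspace((unsigned char)*s) || *s == ',' || *s == 'x')
            s++;
        if (*s == '\0')
            break;
        if (!isdigit((unsigned char)*s))
            return -1;

        char *end;
        long n = strtol(s, &end, 10);
        if (n > INT_MAX)
            return -1;
        s = end;
        while (isspace((unsigned char)*s))
            s++;

        /* The letter after the number says what the number means. */
        switch (tolower((unsigned char)*s)) {
        case 's':
            if (stage >= 1) return -1;   /* sockets must come first */
            p.sockets = (int)n; stage = 1; offset = &p.socket_offset;
            break;
        case 'c':
            if (stage >= 2) return -1;   /* cores cannot follow threads */
            p.cores = (int)n; stage = 2; offset = &p.core_offset;
            break;
        case 't':
            if (stage >= 3) return -1;
            p.threads = (int)n; stage = 3;
            /* A core offset may still follow; a socket offset may not. */
            if (offset == &p.socket_offset) offset = NULL;
            break;
        case 'o':
            if (offset == NULL) return -1;
            *offset = (int)n; offset = NULL;
            break;
        default:
            return -1;                   /* the letter is mandatory */
        }
        s++;
    }
    *out = p;
    return 0;
}
```

With this sketch, "2s,1o,4c,2t" yields 2 sockets with socket offset 1, 4 cores per socket, and 2 threads per core, while "2t,4c" is rejected because cores appear after threads per core.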
      
      Differential Revision: http://reviews.llvm.org/D13175
      
      llvm-svn: 249708
      dd4aa9b6
  5. Sep 21, 2015
  6. Sep 10, 2015
  7. Aug 31, 2015
  8. Aug 28, 2015
      [OpenMP] [CMake] Removing expand-vars.pl in favor of CMake's configure_file() · c0225ca2
      Jonathan Peyton authored
      Currently, the libomp CMake build system uses a Perl script to configure files
      (tools/expand-vars.pl). This patch replaces the use of the Perl script by using
      CMake's configure_file() function. The major changes include:
      1. *.var has every $KMP_* variable changed to @LIBOMP_*@
      2. kmp_config.h.cmake is a new file which contains all the feature macros and
         #cmakedefine lines
      3. Most of the -D lines have been moved out of LibompDefinitions.cmake, but
         some OS-specific macros (e.g., _GNU_SOURCE) remain.
      4. All expand-vars.pl related logic is removed from the CMake files.
      
      One important note about this change is that it breaks the old Perl+Makefile
      build system because it can't create kmp_config.h properly.
      
      Differential Revision: http://reviews.llvm.org/D12211
      
      llvm-svn: 246314
      c0225ca2
  9. Aug 13, 2015
      Remove unused KMP_SETVERSION macro · 221104be
      Jonathan Peyton authored
      This macro and the small amount of code along with it are unused and
      can be removed.  The macro is never defined in any build script or source file.
      
      llvm-svn: 244899
      221104be
  10. Jul 21, 2015
      Fix OMPT support for task frames, parallel regions, and parallel regions + loops · 3fdf3294
      Jonathan Peyton authored
      This patch makes it possible for a performance tool that uses call stack
      unwinding to map implementation-level call stacks from master and worker
      threads into a unified global view. There are several components to this patch.
      
      include/*/ompt.h.var
        Add a new enumeration type that indicates whether the code for a master task
          for a parallel region is invoked by the user program or the runtime system
        Change the signature for OMPT parallel begin/end callbacks to indicate whether
          the master task will be invoked by the program or the runtime system. This
          enables a performance tool using call stack unwinding to handle these two
          cases differently. For this case, a profiler that uses call stack unwinding
          needs to know that the call path prefix for the master task may differ from
          those available within the begin/end callbacks if the program invokes the
          master.
      
      kmp.h
        Change the signature for __kmp_join_call to take an additional parameter
        indicating the fork_context type. This is needed to supply the OMPT parallel
        end callback with information about whether the compiler or the runtime
        invoked the master task for a parallel region.
      
      kmp_csupport.c
        Ensure that the OMPT task frame field reenter_runtime_frame is properly set
          and cleared before and after calls to fork and join threads for a parallel
          region.
        Adjust the code for the new signature for __kmp_join_call.
        Adjust the OMPT parallel begin callback invocations to carry the extra
          parameter indicating whether the program or the runtime invokes the master
          task for a parallel region.
      
      kmp_gsupport.c
        Apply all of the analogous changes described for kmp_csupport.c for the GOMP
          interface
        Add OMPT support for the GOMP combined parallel region + loop API to
          maintain the OMPT task frame field reenter_runtime_frame.
      
      kmp_runtime.c:
        Use the new information passed by __kmp_join_call to adjust the OMPT
          parallel end callback invocations to carry the extra parameter indicating
          whether the program or the runtime invokes the master task for a parallel
          region.
      
      ompt_internal.h:
        Use the flavor of the parallel region API (GNU or Intel) to determine who
          invokes the master task.
      
      Differential Revision: http://reviews.llvm.org/D11259
      
      llvm-svn: 242817
      3fdf3294
  11. Jul 09, 2015
      Enable debugger support · 8fbb49ab
      Jonathan Peyton authored
      These changes enable external debuggers to conveniently interface with
      the LLVM OpenMP Library.  Structures are added which describe the important
      internal structures of the OpenMP Library (e.g., teams and threads).
      This feature is turned on by default (CMake variable LIBOMP_USE_DEBUGGER)
      and can be turned off with -DLIBOMP_USE_DEBUGGER=off.
      
      Differential Revision: http://reviews.llvm.org/D10038
      
      llvm-svn: 241832
      8fbb49ab
  12. Jun 04, 2015
      Fix some sign compare warnings. · 1e7a1ddc
      Jonathan Peyton authored
      This change makes kmp_bstate.old_tid a signed integer instead of an unsigned integer.
      It also defines two new macros, KMP_NSEC_PER_SEC and KMP_USEC_PER_SEC, which let us take
      control of the sign (we want them to be longs).  Also, in kmp_wait_release.h, the byteref()
      function's return type is changed from char to unsigned char.
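
A small sketch of why long-typed conversion constants help with sign-compare warnings. The macro and function names below are hypothetical and only mirror the idea of KMP_NSEC_PER_SEC and KMP_USEC_PER_SEC; the actual definitions in the runtime are not shown in this message.

```c
#include <time.h>

/* Hypothetical counterparts of the macros named above. The L suffix makes
 * them longs, the same type as tv_nsec in struct timespec, so comparisons
 * and arithmetic do not mix signed and unsigned operands. */
#define EXAMPLE_NSEC_PER_SEC 1000000000L
#define EXAMPLE_USEC_PER_SEC 1000000L

/* Carry whole seconds out of tv_nsec so that tv_nsec < 1 second. */
void example_normalize(struct timespec *ts)
{
    while (ts->tv_nsec >= EXAMPLE_NSEC_PER_SEC) {
        ts->tv_nsec -= EXAMPLE_NSEC_PER_SEC;
        ts->tv_sec += 1;
    }
}

/* Convert a normalized timespec to microseconds (assumes the result fits
 * into a long). */
long example_to_usec(const struct timespec *ts)
{
    return (long)ts->tv_sec * EXAMPLE_USEC_PER_SEC
         + ts->tv_nsec / (EXAMPLE_NSEC_PER_SEC / EXAMPLE_USEC_PER_SEC);
}
```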
      
      llvm-svn: 239057
      1e7a1ddc
  13. Jun 03, 2015
  14. Jun 02, 2015
  15. May 07, 2015
  16. Apr 29, 2015
  17. Mar 10, 2015
  18. Feb 20, 2015
  19. Feb 10, 2015
  20. Jan 29, 2015
  21. Jan 27, 2015
  22. Jan 13, 2015
  23. Oct 07, 2014
      I apologise in advance for the size of this check-in. · 4cc4bb4c
      Jim Cownie authored
      At Intel we do understand that this is not friendly, and are working to change our
      internal code-development to make it easier to make development
      features available more frequently and in finer (more functional)
      chunks. Unfortunately we haven't got that in place yet, and unpicking
      this into multiple separate check-ins would be non-trivial, so please
      bear with me on this one. We should be better in the future.
      
      Apologies over, what do we have here?
      
      GCC 4.9 compatibility
      ---------------------
      * We have implemented the new entrypoints used by code compiled by GCC
      4.9 to implement the same functionality in gcc 4.8. Therefore code
      compiled with gcc 4.9 that used to work will continue to do so.
      However, there are some other new entrypoints (associated with task
      cancellation) which are not implemented. Therefore user code compiled
      by gcc 4.9 that uses these new features will not link against the LLVM
      runtime. (It remains unclear how to handle those entrypoints, since
      the GCC interface has potentially unpleasant performance implications
      for join barriers even when cancellation is not used)
      
      --- new parallel entry points ---
      These new entry points are not related to OpenMP 4.0 and are fully
      implemented:
            GOMP_parallel_loop_dynamic()
            GOMP_parallel_loop_guided()
            GOMP_parallel_loop_runtime()
            GOMP_parallel_loop_static()
            GOMP_parallel_sections()
            GOMP_parallel()
      
      --- cancellation entry points ---
      Currently, these only give a runtime error if OMP_CANCELLATION is true,
      because our plain barriers don't check for cancellation while waiting.
              GOMP_barrier_cancel()
              GOMP_cancel()
              GOMP_cancellation_point()
              GOMP_loop_end_cancel()
              GOMP_sections_end_cancel()
      
      --- taskgroup entry points ---
      These are implemented fully.
            GOMP_taskgroup_start()
            GOMP_taskgroup_end()
      
      --- target entry points ---
      These are empty (as they are in libgomp)
           GOMP_target()
           GOMP_target_data()
           GOMP_target_end_data()
           GOMP_target_update()
           GOMP_teams()
      
      Improvements in Barriers and Fork/Join
      --------------------------------------
      * Barrier and fork/join code is now in its own file (which makes it
      easier to understand and modify).
      * Wait/release code is now templated and in its own file; suspend/resume code is also templated
      * There's a new hierarchical barrier, which exploits the
      cache-hierarchy of the Intel(r) Xeon Phi(tm) coprocessor to improve
      fork/join and barrier performance.
      
      ***BEWARE*** the new source files have *not* been added to the legacy
      CMake build system. If you want to use that, fixes will be required.
      
      Statistics Collection Code
      --------------------------
      * New code has been added to collect application statistics (if this
      is enabled at library compile time; by default it is not). The
      statistics code itself is generally useful, but the lightweight timing
      code uses the x86 rdtsc instruction, so it will require changes for
      other architectures.
      The intent of this code is not for users to tune their codes but
      rather:
      1) For timing code-paths inside the runtime.
      2) For gathering general properties of OpenMP codes to focus attention
      on which OpenMP features are most used.
      
      Nested Hot Teams
      ----------------
      * The runtime now maintains more state to reduce the overhead of
      creating and destroying inner parallel teams. This improves the
      performance of code that repeatedly uses nested parallelism with the
      same resource allocation. Set the new KMP_HOT_TEAMS_MAX_LEVEL
      environment variable to a depth to enable this (and, of course,
      OMP_NESTED=true to enable nested parallelism at all).
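
The kind of code this targets can be sketched as follows: a nested parallel region that is entered repeatedly with the same thread counts at each level. The function nested_sum is a hypothetical example, not part of the runtime. Built without OpenMP support, the pragmas are ignored and the code runs serially with the same result.

```c
/* Hypothetical example: repeatedly enter a two-level nested parallel region
 * with the same team sizes, which is the pattern that benefits from keeping
 * inner teams alive between uses. Returns repeats * outer * inner. */
long nested_sum(int repeats, int outer, int inner)
{
    long total = 0;
    for (int r = 0; r < repeats; ++r) {
        /* Outer team: one iteration per requested outer thread. */
        #pragma omp parallel for num_threads(outer) reduction(+:total)
        for (int i = 0; i < outer; ++i) {
            long local = 0;
            /* Inner team, created again on every outer iteration. */
            #pragma omp parallel for num_threads(inner) reduction(+:local)
            for (int j = 0; j < inner; ++j)
                local += 1;
            total += local;
        }
    }
    return total;
}
```

To try it with nested hot teams, one would build with an OpenMP-enabled compiler and run with OMP_NESTED=true and KMP_HOT_TEAMS_MAX_LEVEL set to a depth. A value of 2 for this two-level nest is an assumption about how the depth is counted, not something stated above.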
      
      Improved Intel(r) VTune(TM) Amplifier support
      ---------------------------------------------
      * The runtime provides additional information to VTune via the
      itt_notify interface to allow it to display better OpenMP specific
      analyses of load-imbalance.
      
      Support for OpenMP Composite Statements
      ---------------------------------------
      * Implement new entrypoints required by some of the OpenMP 4.1
      composite statements.
      
      Improved ifdefs
      ---------------
      * More separation of concepts ("Does this platform do X?") from
      platforms ("Are we compiling for platform Y?"), which should simplify
      future porting.
      
      
      ScaleMP* contribution
      ---------------------
      Stack padding to improve the performance in their environment where
      cross-node coherency is managed at the page level.
      
      Redesign of wait and release code
      ---------------------------------
      The code is simplified and performance improved.
      
      Bug Fixes
      ---------
      * Fixes for Windows multiple processor groups.
      * Fix Fortran module build on Linux: offload attribute added.
      * Fix entry names for distribute-parallel-loop construct to be consistent with the compiler codegen.
      * Fix an inconsistent error message for KMP_PLACE_THREADS environment variable.
      
      llvm-svn: 219214
      4cc4bb4c
  24. Aug 07, 2014
  25. Mar 02, 2014
  26. Feb 28, 2014
      Add support for FreeBSD · 763b9396
      Alp Toker authored
      Port the OpenMP runtime to FreeBSD along with associated build system changes.
      
      Also begin to generalize affinity capabilities so they aren't tied explicitly
      to Windows and Linux.
      
      The port builds with stock clang and gmake and has no additional runtime
      dependencies.
      
      All but a handful of the validation suite tests are now passing on FreeBSD 10
      x86_64.
      
      llvm-svn: 202478
      763b9396
  27. Feb 24, 2014
  28. Dec 23, 2013
      For your Christmas hacking pleasure. · 181b4bb3
      Jim Cownie authored
      This release aligns with Intel(r) Composer XE 2013 SP1 Product Update 2.
      
      New features
      * The library can now be built with clang (though with some
        limitations since clang does not support 128-bit floats)
      * Support for VTune analysis of load imbalance
      * Code contribution from Steven Noonan to build the runtime for ARM*
        architecture processors 
      * First implementation of runtime API for OpenMP cancellation
      
      Bug Fixes
      * Fixed hang on Windows (only) when using KMP_BLOCKTIME=0
      
      llvm-svn: 197914
      181b4bb3