  1. Jun 14, 2016
    • Jonathan Peyton's avatar
      Renaming change: 41 -> 45 and 4.1 -> 4.5 · df6818be
      Jonathan Peyton authored
      OpenMP 4.1 is now OpenMP 4.5.  Any mention of 41 or 4.1 is replaced with
      45 or 4.5.  Also, if the CMake option LIBOMP_OMP_VERSION is 41, CMake warns that
      41 is deprecated and that 45 should be used instead.
      
      llvm-svn: 272687
      df6818be
  2. Jun 13, 2016
    • Jonathan Peyton's avatar
      Bug fix for Bugzilla bug 26602: Remove function bodies with KMP_ASSERT(0) · e1890e12
      Jonathan Peyton authored
      Fix for bugzilla https://llvm.org/bugs/show_bug.cgi?id=26602.  Removed functions
      whose bodies consisted solely of a KMP_ASSERT(0) statement.  A possible runtime
      crash thus becomes a compile-time error, which is preferable because the error
      is detected earlier.
      
      TODO: consider a C++11 static_assert as an alternative, which could
      make the diagnostics better.
      
      Patch by Andrey Churbanov
      
      Differential Revision: http://reviews.llvm.org/D21304
      
      llvm-svn: 272590
      e1890e12
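The static_assert alternative mentioned in the TODO above could look roughly like the sketch below. The function name `checked_load` and the size rule are hypothetical and are not taken from the libomp sources; the point is only that an unsupported instantiation is rejected when the code is built instead of asserting at run time.

```cpp
#include <cstdint>

// Hypothetical helper: only 4- and 8-byte operands are supported.
// Instantiating it with any other size stops the build with the message below,
// where the older style would have reached an assert(0) at run time.
template <typename T>
T checked_load(const T *address) {
  static_assert(sizeof(T) == 4 || sizeof(T) == 8,
                "checked_load: only 4- and 8-byte operands are supported");
  return *address;
}
```

A call such as `checked_load(&some_char)` would fail to compile, while 32- and 64-bit operands work as usual.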
    • Jonathan Peyton's avatar
      Affinity mask processing improvements · c5304aa3
      Jonathan Peyton authored
      Remove the static specifier from the variable fullMask and remove the
      kmp_get_fullMask() routine. When iterating through procs in a mask, always
      check whether the proc is in fullMask (this check was missing in a few places).
      
      Patch by Brian Bliss.
      
      Differential Revision: http://reviews.llvm.org/D21300
      
      llvm-svn: 272589
      c5304aa3
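The "always check fullMask while iterating" rule can be illustrated with a small sketch. It uses `std::bitset` as a stand-in for the runtime's affinity mask type, and the names `Mask`, `kMaxProcs` and `procs_in_mask` are assumptions made for this example only.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxProcs = 1024;  // assumed mask width
using Mask = std::bitset<kMaxProcs>;

// Collect the processor ids set in `mask`, skipping any id that is not
// present in `full_mask` (the set of processors that exist on the machine).
std::vector<std::size_t> procs_in_mask(const Mask &mask, const Mask &full_mask) {
  std::vector<std::size_t> procs;
  for (std::size_t id = 0; id < kMaxProcs; ++id) {
    if (!full_mask.test(id)) continue;  // not a real processor: ignore it
    if (mask.test(id)) procs.push_back(id);
  }
  return procs;
}
```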
    • Jonathan Peyton's avatar
      Exclude untied tasks from task stealing constraint · 8cb45c83
      Jonathan Peyton authored
      If either current_task or new_task is untied then skip task scheduling
      constraint checks, because untied tasks are not affected by the task
      scheduling constraints.
      
      Differential Revision: http://reviews.llvm.org/D21196
      
      llvm-svn: 272570
      8cb45c83
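A simplified model of the check described above is sketched here. The `Task` record and both function names are hypothetical, and the constraint is reduced to "a tied task must descend from the suspended tied task"; the real runtime's data structures and checks differ.

```cpp
// Hypothetical, simplified task record.
struct Task {
  bool tied;
  const Task *parent;  // nullptr for the root task
};

// True if `task` is `ancestor` itself or has `ancestor` in its parent chain.
bool descends_from(const Task *task, const Task *ancestor) {
  for (; task != nullptr; task = task->parent)
    if (task == ancestor) return true;
  return false;
}

// Decide whether a thread suspended in `current` may start `candidate`.
bool may_schedule(const Task &current, const Task &candidate) {
  // Untied tasks are not subject to the task scheduling constraint,
  // so the check is skipped if either task is untied.
  if (!current.tied || !candidate.tied) return true;
  // Simplified constraint for two tied tasks.
  return descends_from(&candidate, &current);
}
```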
    • Jonathan Peyton's avatar
      Fix crash when libomp loaded/unloaded multiple times · 93495de2
      Jonathan Peyton authored
      The problem scenario is the following:
      A dynamic library, libfoo.so, depends on libomp.so (it creates a parallel region
      and calls some OpenMP functions).  An application has a loop in which it
      dynamically loads libfoo.so, calls a function from it, and unloads libfoo.so.
      After several loop iterations the application crashes with a message about a
      lack of resources:
      OMP: Error #34: System unable to allocate necessary resources for OMP thread:
      
      The problem is that pthread_kill() was not followed by pthread_join() when the
      thread had already terminated. This patch fixes the problem for both worker
      and monitor threads.
      
      Differential Revision: http://reviews.llvm.org/D21200
      
      llvm-svn: 272567
      93495de2
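The resource issue behind this fix can be shown with plain POSIX threads: a joinable thread keeps its resources until it is joined, even after it has finished. The sketch below is not the libomp code and does not call pthread_kill(); `worker` and `spawn_and_join` are hypothetical names, and it assumes a platform with POSIX threads.

```cpp
#include <pthread.h>

// Trivial thread body standing in for a worker or monitor thread.
static void *worker(void *) { return nullptr; }

// Start and reap `count` threads one after another.
// Returns how many pthread_create() calls succeeded.
int spawn_and_join(int count) {
  int created = 0;
  for (int i = 0; i < count; ++i) {
    pthread_t thread;
    if (pthread_create(&thread, nullptr, worker, nullptr) != 0)
      break;  // creation fails once thread resources are exhausted
    ++created;
    // Joining releases the thread's resources, even if it already finished.
    pthread_join(thread, nullptr);
  }
  return created;
}
```

Dropping the pthread_join() call would leave every finished thread unreaped, which is the kind of leak that eventually makes thread creation fail in a load/unload loop.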
    • Jonathan Peyton's avatar
      Hwloc refactoring patch · 202a24dd
      Jonathan Peyton authored
      These changes remove the hwloc_topology_ignore_type function which doesn't exist
      in the hwloc 2.0 API. In the existing code, the topology extracted from hwloc
      has the cache levels stripped out and then assumes the final stripped topology
      follows the typical three-level topology: packages -> cores -> HW threads.
      But the code is doing unclean manipulations to determine at what level those
      resources are located and also assumes too much about what hwloc is detecting
      (there could be intermediate levels between socket and core, for instance).
      This new way of extracting the topology doesn't strip out any hardware objects
      that hwloc detects. It does not assume the three-level topology, and instead
      searches for the relevant three levels within the topology for each bit of
      information using hwloc interface functions. That is, the three-level topology
      subset that our affinity code is interested in is extracted from the hwloc
      topology tree directly.
      
      For example, the new __kmp_hwloc_get_nobjs_under_obj function reliably gives the
      user the number of cores under a socket, regardless of whether there are
      unexpected objects between the socket object and core object in the hwloc
      topology structure. Also, now that all topology information is kept, it becomes
      possible to use the caches/NUMA nodes to determine more sophisticated
      affinity settings in the future.
      
      There is also some cleanup code added for the destruction of the
      __kmp_hwloc_topology object.
      
      Differential Revision: http://reviews.llvm.org/D21195
      
      llvm-svn: 272565
      202a24dd
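The idea of counting objects under another object without assuming a fixed number of levels can be sketched on a toy tree. This does not use the hwloc API; `Kind`, `Node` and `count_under` are hypothetical names chosen for this illustration.

```cpp
#include <vector>

// Toy topology object kinds; a real topology has more.
enum class Kind { Machine, Package, NumaNode, Cache, Core, HwThread };

struct Node {
  Kind kind;
  std::vector<Node> children;
};

// Count the descendants of `root` that have the requested kind, at any depth.
// Intermediate levels (caches, NUMA nodes, ...) are simply walked through.
int count_under(const Node &root, Kind wanted) {
  int total = 0;
  for (const Node &child : root.children) {
    if (child.kind == wanted) ++total;
    total += count_under(child, wanted);
  }
  return total;
}
```

Because the walk is recursive, the result for "cores under a package" is the same whether or not cache or NUMA objects sit between the two levels.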
    • Jonathan Peyton's avatar
      Fix bitmask complement operation · 34c72c47
      Jonathan Peyton authored
      The bitmask complement operation doesn't consider the max proc id, which means
      something like !{0} will be translated to {1,2,3,4,...,600,601,...,1023} on a
      Linux system even though there aren't 600 processors on said system. With this
      change the complemented bitmask is ANDed with the full mask so that it only
      contains valid processors.
      
      Differential Revision: http://reviews.llvm.org/D21245
      
      llvm-svn: 272561
      34c72c47
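A minimal sketch of the corrected complement, using `std::bitset` in place of the runtime's mask type. The names `Mask`, `kMaxProcs` and `complement_within` are assumptions for this example.

```cpp
#include <bitset>
#include <cstddef>

constexpr std::size_t kMaxProcs = 1024;  // assumed mask width
using Mask = std::bitset<kMaxProcs>;

// Complement `mask`, then AND with `full_mask` so that only processors
// that actually exist can appear in the result.
Mask complement_within(const Mask &mask, const Mask &full_mask) {
  return ~mask & full_mask;
}
```

On a machine whose full mask is {0,...,7}, the complement of {0} computed this way is {1,...,7}, not {1,...,1023}.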
    • Jonathan Peyton's avatar
      [STATS] Add stats gathering for taskloop construct · 5a299da5
      Jonathan Peyton authored
      llvm-svn: 272560
      5a299da5
  3. Jun 09, 2016
    • Jonathan Peyton's avatar
      Fix spelling in comment · b6f0f521
      Jonathan Peyton authored
      llvm-svn: 272291
      b6f0f521
    • Jonathan Peyton's avatar
      Revert accidental commit to lit.cfg · 61fdddfd
      Jonathan Peyton authored
      llvm-svn: 272287
      61fdddfd
    • Jonathan Peyton's avatar
      Refactor __kmp_execute_tasks_template function · c4c722ac
      Jonathan Peyton authored
      Refactored __kmp_execute_tasks_template to shorten and remove code redundancy.
      The original code for __kmp_execute_tasks_template was very redundant with
      large sections of repeated code that needed to be kept consistent, and goto
      statements that made the control flow difficult to discern. This refactoring
      removes all gotos and redundancy.
      
      Patch by Terry Wilmarth
      
      Differential Revision: http://reviews.llvm.org/D20879
      
      llvm-svn: 272286
      c4c722ac
    • Hans Wennborg's avatar
      kmp_lock.h: Fix VS2013 build after r271324 · 5b89fbc8
      Hans Wennborg authored
      MSVC doesn't allow std::atomic<>s in a union since they don't have a trivial
      copy constructor. Replacing them with e.g. std::atomic_int works, but that
      breaks the GCC build on Linux, because calls to e.g. std::atomic_load_explicit
      then fail, as they expect a real std::atomic<> pointer.
      
      Fixing this with an #ifdef to unbreak the build for now.
      
      llvm-svn: 272271
      5b89fbc8
  4. Jun 01, 2016
  5. May 31, 2016
    • Paul Osmialowski's avatar
      Use C++11 atomics for ticket locks implementation · f7cc6aff
      Paul Osmialowski authored
      This patch replaces the use of compiler builtin atomics with
      C++11 atomics in the ticket lock implementation. Ticket locks
      are used in critical places of the runtime, e.g. in the tasking
      mechanism.
      
      The main reason for this change is a problem with the work
      stealing function on the ARM architecture, which suffered
      from a nasty race condition. The root cause turned out to be
      the way ticket locks were implemented. Replacing the compiler
      builtins with C++11 atomics solves the problem.
      
      Two assertions were added to kmp_tasking.c. They are useful
      for detecting early symptoms of work stealing going wrong,
      which were among the possible outcomes of the race condition.
      
      Differential Revision: http://reviews.llvm.org/D19878
      
      llvm-svn: 271324
      f7cc6aff
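For readers unfamiliar with the technique, a generic ticket lock written with C++11 atomics is sketched below. It is a textbook version, not the libomp implementation; the class and member names are chosen for this example.

```cpp
#include <atomic>

// Generic ticket lock: threads are served in the order they took a ticket.
class TicketLock {
 public:
  void lock() {
    // Take the next ticket. Relaxed ordering suffices here because the
    // acquire load below orders the critical section after the previous
    // owner's release.
    const unsigned my_ticket =
        next_ticket_.fetch_add(1, std::memory_order_relaxed);
    while (now_serving_.load(std::memory_order_acquire) != my_ticket) {
      // Spin. A production lock would yield or back off here.
    }
  }

  void unlock() {
    // Only the current owner writes now_serving_, so a plain load followed
    // by a release store is enough to hand the lock to the next ticket.
    const unsigned next = now_serving_.load(std::memory_order_relaxed) + 1;
    now_serving_.store(next, std::memory_order_release);
  }

 private:
  std::atomic<unsigned> next_ticket_{0};
  std::atomic<unsigned> now_serving_{0};
};
```

The explicit acquire/release pair is what makes the hand-off visible on weakly ordered architectures such as ARM. Since the class provides `lock()` and `unlock()`, it can also be used with `std::lock_guard`.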
    • Jonathan Peyton's avatar
      Addition of OpenMP 4.5 feature: schedule(simd:static) · ef734799
      Jonathan Peyton authored
      This patch implements the new kmp_sch_static_balanced_chunked schedule kind that
      the compiler will generate when it encounters schedule(simd: static). It just
      adds the new constant and the new switch case in __kmp_for_static_init.
      
      Patch by Alex Duran.
      
      Differential Revision: http://reviews.llvm.org/D20699
      
      llvm-svn: 271320
      ef734799
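The commit message does not spell out how the new schedule kind divides iterations. One plausible reading, sketched here as an assumption, is that each thread receives a single contiguous block whose length is its balanced share rounded up to a multiple of the SIMD width. `Range` and `balanced_chunked` are hypothetical names, and the sketch assumes a zero-based loop with unit stride and a power-of-two width.

```cpp
struct Range {
  long lower;   // first iteration assigned to the thread
  long upper;   // last iteration assigned to the thread (inclusive)
  bool empty;   // true if the thread gets no iterations
};

// Assumed behavior: balanced share per thread, rounded up to a multiple
// of simd_width (which must be a power of two for the mask trick below).
Range balanced_chunked(long trip_count, int num_threads, int tid,
                       long simd_width) {
  const long share = (trip_count + num_threads - 1) / num_threads;  // ceiling
  const long block = (share + simd_width - 1) & ~(simd_width - 1);  // round up
  const long lower = block * tid;
  long upper = lower + block - 1;
  if (upper > trip_count - 1) upper = trip_count - 1;  // clip the last block
  return Range{lower, upper, lower > upper};
}
```

With 100 iterations, 4 threads and a width of 8, the share of 25 is rounded up to 32, so threads 0 to 2 get 32 iterations each and thread 3 gets the remaining 4.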
    • Jonathan Peyton's avatar
      Avoid deadlock with COI · f4f96956
      Jonathan Peyton authored
      When an asynchronous offload task is completed, COI calls the runtime to queue
      a "destructor task".  When the task deques are full, a deadlock arises: the
      OpenMP threads are inside but cannot progress because the COI thread is stuck
      inside the runtime trying to find a slot in a deque.
      
      This patch implements a solution in which the task deques are doubled in size
      when a task is being queued from a COI thread.
      
      Differential Revision: http://reviews.llvm.org/D20733
      
      llvm-svn: 271319
      f4f96956
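The growth strategy can be pictured with a small ring-buffer deque that doubles its capacity only when the caller is allowed to grow it. This is a single-threaded sketch without any locking; `TaskDeque` and its members are hypothetical, and tasks are reduced to plain integers.

```cpp
#include <cstddef>
#include <vector>

// Ring-buffer task queue; `capacity` must be greater than zero.
class TaskDeque {
 public:
  explicit TaskDeque(std::size_t capacity) : slots_(capacity) {}

  // Append a task. When the deque is full, either double it (may_grow)
  // or report failure so the caller can run the task itself.
  bool push(int task, bool may_grow) {
    if (count_ == slots_.size()) {
      if (!may_grow) return false;
      grow();
    }
    slots_[(head_ + count_) % slots_.size()] = task;
    ++count_;
    return true;
  }

  // Remove the oldest task; returns false when the deque is empty.
  bool pop_front(int &task) {
    if (count_ == 0) return false;
    task = slots_[head_];
    head_ = (head_ + 1) % slots_.size();
    --count_;
    return true;
  }

  std::size_t capacity() const { return slots_.size(); }

 private:
  // Double the storage, copying the tasks so that the oldest lands at index 0.
  void grow() {
    std::vector<int> bigger(slots_.size() * 2);
    for (std::size_t i = 0; i < count_; ++i)
      bigger[i] = slots_[(head_ + i) % slots_.size()];
    slots_.swap(bigger);
    head_ = 0;
  }

  std::vector<int> slots_;
  std::size_t head_ = 0;
  std::size_t count_ = 0;
};
```

In this model an ordinary producer would call `push(task, false)` and fall back to executing the task when the deque is full, while a thread that must not block would call `push(task, true)`.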
    • Jonathan Peyton's avatar
      Offer API for setting number of loop dispatch buffers · 067325f9
      Jonathan Peyton authored
      The problem is a lack of dispatch buffers when thousands of loops with nowait,
      about 10 iterations each, are executed by hundreds of threads. Only 7 dispatch
      buffers are built in, but dozens or hundreds of buffers are needed.
      
      The problem can be fixed by setting KMP_MAX_DISP_BUF to a bigger value. To give
      users the same possibility, I changed the build-time control into a run-time
      one, adding an API just in case.
      
      This change adds an environment variable KMP_DISP_NUM_BUFFERS and a new API
      function kmp_set_disp_num_buffers(int num_buffers).
      
      The KMP_DISP_NUM_BUFFERS environment variable works only before serial
      initialization, because during serial initialization we already allocate
      buffers for the hot team, so it is too late to change the number of buffers
      afterwards (or we would need to reallocate buffers for all teams, which sounds
      too complicated). The kmp_set_defaults() routine does not work for this
      environment variable, because it calls serial initialization before reading the
      parameter string. So a new routine, kmp_set_disp_num_buffers(), is created so
      that it can set our internal global variable before the library
      initialization. If both the environment variable and the API are used, the
      environment variable wins.
      
      Differential Revision: http://reviews.llvm.org/D20697
      
      llvm-svn: 271318
      067325f9
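The precedence rules described above (default of 7, API call honored only before initialization, environment variable wins) can be captured in a toy model. This is not the libomp implementation: `DispatchConfig` and its methods are hypothetical, and treating non-positive values as "not set" is an assumption of this sketch.

```cpp
#include <cstdlib>

// Toy model of how the number of dispatch buffers is resolved.
class DispatchConfig {
 public:
  // Models kmp_set_disp_num_buffers(): ignored once initialization has run.
  void set_from_api(int num_buffers) {
    if (!initialized_ && num_buffers > 0) api_value_ = num_buffers;
  }

  // Models serial initialization. `env_value` is the text of
  // KMP_DISP_NUM_BUFFERS, or nullptr when the variable is unset.
  void initialize(const char *env_value) {
    if (initialized_) return;
    initialized_ = true;
    const int from_env = env_value ? std::atoi(env_value) : 0;
    if (from_env > 0)
      num_buffers_ = from_env;    // the environment variable wins
    else if (api_value_ > 0)
      num_buffers_ = api_value_;  // otherwise use the API value, if any
  }

  int num_buffers() const { return num_buffers_; }

 private:
  int num_buffers_ = 7;  // built-in default mentioned in the commit message
  int api_value_ = 0;
  bool initialized_ = false;
};
```

In a real program the corresponding step would be to call kmp_set_disp_num_buffers() before the first use of the OpenMP runtime, or to set KMP_DISP_NUM_BUFFERS in the environment before launching the application.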
  6. May 27, 2016
  7. May 26, 2016
    • Jonathan Peyton's avatar
      Fix for OMP_PROC_BIND=spread strategy · 7ba9baef
      Jonathan Peyton authored
      The OMP_PROC_BIND=spread strategy fails to assign the master thread the
      correct place partition after the first parallel region. Other threads in the
      hot team will remember their place_partition, but the master's place partition
      is restored to what it was before entering the parallel region. So when the hot
      team is used for subsequent parallel regions, the master has lost this info.
      This fix calls __kmp_partition_places to update only the master thread's place
      partition in the spread case when there are no other changes to the hot team.
      
      Patch by Terry Wilmarth
      
      Differential Revision: http://reviews.llvm.org/D20539
      
      llvm-svn: 270890
      7ba9baef
    • Jonathan Peyton's avatar
      Make LIBOMP_USE_ITT_NOTIFY a setting that can be enabled or disabled · 7abf9d59
      Jonathan Peyton authored
      On Blue Gene/Q, having LIBOMP_USE_ITT_NOTIFY support compiled into a
      statically-linked binary causes a failure at runtime because dlopen fails.
      This patch changes LIBOMP_USE_ITT_NOTIFY to a cacheable configuration setting
      that can be disabled.
      
      Patch by John Mellor-Crummey
      
      Differential Revision: http://reviews.llvm.org/D20517
      
      llvm-svn: 270884
      7abf9d59
    • Hal Finkel's avatar
      Add a test case for microtask dispatch with many arguments · 0a665a83
      Hal Finkel authored
      This is a cleaned-up version of the test case posted in the D19879 review.
      
      llvm-svn: 270867
      0a665a83
    • Hal Finkel's avatar
      Add an assembly __kmp_invoke_microtask for ppc64[le] · 91e19a3d
      Hal Finkel authored
      Clang no longer restricts itself to generating microtasks with a small number
      of arguments, and so an assembly implementation is required to prevent hitting
      the parameter limit present in the C implementation. This adds an
      implementation for ppc64[le].
      
      llvm-svn: 270821
      91e19a3d
  8. May 25, 2016
  9. May 23, 2016
  10. May 20, 2016
  11. May 18, 2016
  12. May 17, 2016
  13. May 16, 2016
    • Paul Osmialowski's avatar
      Clean all the mess around KMP_USE_FUTEX and kmp_lock.h · fb043fdf
      Paul Osmialowski authored
      The KMP_USE_FUTEX preprocessor definition from kmp_lock.h is used
      inconsistently throughout the LLVM libomp code.
      
      * some .c files that use this define do not include kmp_lock.h, so
        the guarded parts of the code are never compiled
      * some places in the code use architecture-dependent preprocessor
        expressions which effectively disable the use of futex on the
        AArch64 architecture; all these places should use
        '#if KMP_USE_FUTEX' instead to avoid any further confusion
      * some places use KMP_HAS_FUTEX, which is not defined anywhere;
        KMP_USE_FUTEX should be used instead
      
      Differential Revision: http://reviews.llvm.org/D19629
      
      llvm-svn: 269642
      fb043fdf
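The failure mode in the list above comes from a preprocessor rule: an identifier that is not defined evaluates to 0 inside `#if`, without any diagnostic by default. The sketch below reproduces this with made-up macro names (`DEMO_USE_FUTEX`, `DEMO_HAS_FUTEX`) in place of the real ones.

```cpp
// In real code this definition would live in one shared header that every
// user of the macro includes.
#define DEMO_USE_FUTEX 1

#if DEMO_USE_FUTEX
constexpr bool kGuardedWithDefinedMacro = true;   // this branch is compiled
#else
constexpr bool kGuardedWithDefinedMacro = false;
#endif

// DEMO_HAS_FUTEX is never defined, so the preprocessor treats it as 0 and
// the guarded branch silently disappears from the build.
#if DEMO_HAS_FUTEX
constexpr bool kGuardedWithUndefinedMacro = true;
#else
constexpr bool kGuardedWithUndefinedMacro = false;  // this branch is compiled
#endif
```

The same thing happens when the macro name is spelled correctly but the header that defines it is not included, which is why using one macro from one header everywhere matters.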
  14. May 13, 2016
  15. May 12, 2016
    • Jonathan Peyton's avatar
      Fix team reuse with foreign threads · 2b749b33
      Jonathan Peyton authored
      After hot teams were enabled by default, the library started using levels kept
      in the team structure. The levels are broken when a foreign thread exits and
      puts its team into the pool, which is then re-used by another foreign thread.
      The broken behavior observed is that printing the levels for each new team
      gives 1, 2, 1, 2, 1, 2, etc. This makes the library believe that every other
      team is nested, which is incorrect. What is wanted is for the levels to be
      1, 1, 1, etc.
      
      Differential Revision: http://reviews.llvm.org/D19980
      
      llvm-svn: 269363
      2b749b33