- Oct 07, 2014
-
-
Jim Cownie authored
understand that this is not friendly, and are working to change our internal code-development to make it easier to make development features available more frequently and in finer (more functional) chunks. Unfortunately we haven't got that in place yet, and unpicking this into multiple separate check-ins would be non-trivial, so please bear with me on this one. We should be better in the future. Apologies over, what do we have here? GGC 4.9 compatibility -------------------- * We have implemented the new entrypoints used by code compiled by GCC 4.9 to implement the same functionality in gcc 4.8. Therefore code compiled with gcc 4.9 that used to work will continue to do so. However, there are some other new entrypoints (associated with task cancellation) which are not implemented. Therefore user code compiled by gcc 4.9 that uses these new features will not link against the LLVM runtime. (It remains unclear how to handle those entrypoints, since the GCC interface has potentially unpleasant performance implications for join barriers even when cancellation is not used) --- new parallel entry points --- new entry points that aren't OpenMP 4.0 related These are implemented fully :- GOMP_parallel_loop_dynamic() GOMP_parallel_loop_guided() GOMP_parallel_loop_runtime() GOMP_parallel_loop_static() GOMP_parallel_sections() GOMP_parallel() --- cancellation entry points --- Currently, these only give a runtime error if OMP_CANCELLATION is true because our plain barriers don't check for cancellation while waiting GOMP_barrier_cancel() GOMP_cancel() GOMP_cancellation_point() GOMP_loop_end_cancel() GOMP_sections_end_cancel() --- taskgroup entry points --- These are implemented fully. GOMP_taskgroup_start() GOMP_taskgroup_end() --- target entry points --- These are empty (as they are in libgomp) GOMP_target() GOMP_target_data() GOMP_target_end_data() GOMP_target_update() GOMP_teams() Improvements in Barriers and Fork/Join -------------------------------------- * Barrier and fork/join code is now in its own file (which makes it easier to understand and modify). * Wait/release code is now templated and in its own file; suspend/resume code is also templated * There's a new, hierarchical, barrier, which exploits the cache-hierarchy of the Intel(r) Xeon Phi(tm) coprocessor to improve fork/join and barrier performance. ***BEWARE*** the new source files have *not* been added to the legacy Cmake build system. If you want to use that fixes wil be required. Statistics Collection Code -------------------------- * New code has been added to collect application statistics (if this is enabled at library compile time; by default it is not). The statistics code itself is generally useful, the lightweight timing code uses the X86 rdtsc instruction, so will require changes for other architectures. The intent of this code is not for users to tune their codes but rather 1) For timing code-paths inside the runtime 2) For gathering general properties of OpenMP codes to focus attention on which OpenMP features are most used. Nested Hot Teams ---------------- * The runtime now maintains more state to reduce the overhead of creating and destroying inner parallel teams. This improves the performance of code that repeatedly uses nested parallelism with the same resource allocation. Set the new KMP_HOT_TEAMS_MAX_LEVEL envirable to a depth to enable this (and, of course, OMP_NESTED=true to enable nested parallelism at all). Improved Intel(r) VTune(Tm) Amplifier support --------------------------------------------- * The runtime provides additional information to Vtune via the itt_notify interface to allow it to display better OpenMP specific analyses of load-imbalance. Support for OpenMP Composite Statements --------------------------------------- * Implement new entrypoints required by some of the OpenMP 4.1 composite statements. Improved ifdefs --------------- * More separation of concepts ("Does this platform do X?") from platforms ("Are we compiling for platform Y?"), which should simplify future porting. ScaleMP* contribution --------------------- Stack padding to improve the performance in their environment where cross-node coherency is managed at the page level. Redesign of wait and release code --------------------------------- The code is simplified and performance improved. Bug Fixes --------- *Fixes for Windows multiple processor groups. *Fix Fortran module build on Linux: offload attribute added. *Fix entry names for distribute-parallel-loop construct to be consistent with the compiler codegen. *Fix an inconsistent error message for KMP_PLACE_THREADS environment variable. llvm-svn: 219214
-
Todd Fiala authored
See http://reviews.llvm.org/D5632 for details. Change by Shawn Best. llvm-svn: 219213
-
Manuel Klimek authored
A precondition of that was to run both the preprocessor checks and AST checks from the same FrontendAction, otherwise we'd have needed to duplicate all involved objects in order to not have any references to a deleted source manager. llvm-svn: 219212
-
Jon Roelofs authored
Patch by: Charlie Turner <charlie.turner@arm.com> llvm-svn: 219211
-
David Blaikie authored
DebugInfo+DeadArgElimination: Ensure llvm::Function*s from debug info are updated even when DAE removes both varargs and non-varargs arguments on the same function. After some stellar (& inspired) help from Reid Kleckner providing a test case for some rather unstable undefined behavior showing up as assertions produced by r214761, I was able to fix this issue in DAE involving the application of both varargs removal, followed by normal argument removal. Indeed I introduced this same bug into ArgumentPromotion (r212128) by copying the code from DAE, and when I fixed the bug in ArgPromo (r213805) and commented in that patch that I didn't need to address the same issue in DAE because it was a single pass. Turns out it's two pass, one for the varargs and one for the normal arguments, so the same fix is needed (at least during varargs removal). So here it is. (the observable/net effect of this bug, even when it didn't result in assertion failure, is that debug info would describe the DAE'd function in the abstract, but wouldn't provide high/low_pc, variable locations, line table, etc (it would appear as though the function had been entirely optimized away), see the original PR14016 for details of the general problem) I'm not recommitting the assertion just yet, as there's been another regression of it since I last tried. It might just be a few test cases weren't adequately updated after Adrian or Duncan's recent schema changes. llvm-svn: 219210
-
Daniel Jasper authored
Before: SomeFunction(a, a, // comment b + x); After: SomeFunction(a, a, // comment b + x); llvm-svn: 219209
-
Johannes Doerfert authored
In case the pieceweise affine function used to create an isl_ast_expr had empty cases (e.g., with contradicting constraints on the parameters), it was possible that the condition of the isl_ast_expr select was not a comparison but a constant (thus of type i64). This patch does two thing: 1) Handle the case the condition of a select is not a i1 type like C. 2) Try to simplify the pieceweise affine functions for the min/max access when we generate runtime alias checks. That step can often remove empty or redundant cases as well as redundant constrains. This fixes bug: http://llvm.org/PR21167 Differential Revision: http://reviews.llvm.org/D5627 llvm-svn: 219208
-
Johannes Doerfert authored
-Wcomment complained about a "multi-line comment" caused by the ascii art used in ScopHelper to describe the CFG. Differential Revision: http://reviews.llvm.org/D5618 llvm-svn: 219207
-
Rafael Espindola authored
We used to avoid these, but it looks like we did so just because we were not handling dllexport alias correctly. Dario Domizioli fixed that, so allow these aliases. Based on a patch by Dario Domizioli! llvm-svn: 219206
-
Daniel Jasper authored
Patch by Marek Kurdej, thanks! llvm-svn: 219204
-
Suyog Sarda authored
Differential Revision: http://reviews.llvm.org/D5644 llvm-svn: 219203
-
Suyog Sarda authored
NFC. Differential Revision: http://reviews.llvm.org/D5645 llvm-svn: 219202
-
Suyog Sarda authored
llvm-svn: 219201
-
Yuri Gorshenin authored
Summary: Fixed asan-asm-stacktrace-test.cc. Now it's supported on x86_64 and added test run when no debug info is generated. Differential Revision: http://reviews.llvm.org/D5547 llvm-svn: 219200
-
Yuri Gorshenin authored
Summary: CFI directives are generated for .S files. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5520 llvm-svn: 219199
-
Tilmann Scheller authored
Patch by Sonam Kumari! Differential Revision: http://reviews.llvm.org/D5643 llvm-svn: 219198
-
Alexey Bataev authored
Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive. llvm-svn: 219197
-
Daniel Sanders authored
Summary: According to the ABI documentation, f128 and {f128} should both be returned in $f0 and $f2. However, this doesn't match GCC's behaviour which is to return f128 in $f0 and $f2, but {f128} in $f0 and $f1. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5578 llvm-svn: 219196
-
Alexander Musman authored
No functional changes intended. Renamed EmitOMPSimdLoop to EmitOMPInnerLoop, I plan to re-use it to emit inner loop in the future patches for CodeGen of the worksharing loop directives (omp for, omp for simd). llvm-svn: 219195
-
Craig Topper authored
[X86] Fix a bug where the disassembler was ignoring the VEX.W bit in 32-bit mode for certain instructions it shouldn't. Unfortunately, this isn't easy to fix since there's no simple way to figure out from the disassembler tables whether the W-bit is being used to select a 64-bit GPR or if its a required part of the opcode. The fix implemented here just looks for "64" in the instruction name and ignores the W-bit in 32-bit mode if its present. Fixes PR21169. llvm-svn: 219194
-
Craig Topper authored
llvm-svn: 219193
-
Craig Topper authored
llvm-svn: 219192
-
David Majnemer authored
If we require a single member of a comdat, require all of the other members as well. This fixes PR20981. llvm-svn: 219191
-
David Majnemer authored
llvm-svn: 219190
-
David Majnemer authored
Most Unix-like operating systems guarantee that the file descriptor is closed after a call to close(2), even if close comes back with EINTR. For these systems, calling close _again_ will either do nothing or close some other file descriptor open(2)'d by another thread. (Linux) However, some operating systems do not have this behavior. They require at least another call to close(2) before guaranteeing that the descriptor is closed. (HP-UX) And some operating systems have an unpredictable blend of the two behaviors! (xnu) Avoid this disaster by blocking all signals before we call close(2). This ensures that a signal will not be delivered to the thread and close(2) will not give us back EINTR. We restore the signal mask once the operation is done. N.B. This isn't a problem on Windows, it doesn't have a notion of EINTR because signals always get delivered to dedicated signal handling threads. llvm-svn: 219189
-
Rafael Espindola authored
The plugin API doesn't have the notion of linkonce, only weak. It is up to the plugin to figure out if a symbol used only for the symbol table can be dropped. In particular, it has to avoid dropping a linkonce_odr selected by gold if there is also a weak_odr. llvm-svn: 219188
-
Juergen Ributzka authored
The code already folds sign-/zero-extends, but only if they are arguments to mul and shift instructions. This extends the code to also fold them when they are direct inputs. llvm-svn: 219187
-
Juergen Ributzka authored
Tiny enhancement to the address computation code to also fold sub instructions if the rhs is constant and can be folded into the offset. llvm-svn: 219186
-
Juergen Ributzka authored
This commit fixes an issue with sign-/zero-extending loads that was discovered by Richard Barton. We use now the correct load instructions for sign-extending loads to 64bit. Also updated and added more unit tests. llvm-svn: 219185
-
Rafael Espindola authored
The call to copyAttributesFrom will copy the visibility, which might assert if it were to produce something invalid like "internal hidden". We avoid it by first creating the replacement with the original linkage and then setting it to internal affter the call to copyAttributesFrom. llvm-svn: 219184
-
Saleem Abdulrasool authored
The macro rework was missing a trailing SEPARATOR for the .thumb_func, resulting in assembly failures. llvm-svn: 219183
-
Saleem Abdulrasool authored
This is simply to help clarity of the code. The functions are built as thumb only if Thumb2 is available (__ARM_ARCH_ISA_THUMB == 2). Sink the selection into the location of the definition and make DEFINE_COMPILERRT_THUMB_FUNCTION always define a thumb function while DEFINE_COMPILERRT_FUNCTION always selects the default. Since the .thumb_func directive is always available (at least on Linux, Windows, and BSD), sinking the macro right into the macro works just as well. No functional change intended. llvm-svn: 219182
-
Ed Maste authored
llvm-svn: 219181
-
Rui Ueyama authored
If /machine option is omitted, the linker needs to infer that from input object files. This patch implements that. llvm-svn: 219180
-
Saleem Abdulrasool authored
Use x86 and x64 which is the canonical Microsoft vernacular for the targets. Addresses post-commit review comments from Rui. llvm-svn: 219179
-
Saleem Abdulrasool authored
Teach the reader about ARM NT relocation types. Although the writer cannot yet perform the actual application of these relocations, the reader can at least now identify the relocation types. llvm-svn: 219178
-
Rafael Espindola authored
When creating an internal function replacement for use in an alias we were not remapping the argument uses in the instructions to point to the new arguments. llvm-svn: 219177
-
Rui Ueyama authored
These lines can be reachable if we give a broken or unsupported input object file. llvm-svn: 219176
-
Gerolf Hoflehner authored
Takes care of the assert that caused build fails. Rather than asserting the code checks now that the definition and use are in the same block, and does not attempt to optimize when that is not the case. llvm-svn: 219175
-
David Majnemer authored
Utilize Process::FixupStandardFileDescriptors, introduced in r219170, to guard against files from being treated as one of the standard file descriptors. llvm-svn: 219174
-