- Oct 25, 2021
-
-
Craig Topper authored
I don't think these are needed with the way builtin_alias is implemented.
-
Philip Reames authored
The recently added logic to canonicalize exit conditions to unsigned relies on facts which hold about the use (i.e. the exit test). Applying this blindly to the icmp is not legal, as there may be another use which never reaches the exit. Restrict ourselves to the case where we have a single use.
-
Vladimir Inđić authored
The __ompt_get_task_info_internal function is adapted to support thread_num determination during the execution of multiple nested serialized parallel regions enclosed by a regular parallel region.

Consider a program that contains a parallel region R1 executed by two threads. Let the worker thread T of region R1 execute a serialized parallel region R2 that encloses another serialized parallel region R3. Note that the thread T is the master thread of both R2 and R3.

Assume that __ompt_get_task_info_internal is called with the argument "ancestor_level == 1" during the execution of region R3. The function should determine the "thread_num" of the thread T inside the team of region R2, whose implicit task is at level 1 inside the hierarchy of active tasks. Since the thread T is the master thread of region R2, one should expect that "thread_num" takes the value 0. After the while loop finishes, the following holds: "lwt != NULL", "prev_lwt == NULL", and "prev_team" represents the team information about the innermost serialized parallel region R3. This results in executing the assignment "thread_num = prev_team->t.t_master_tid". Note that "prev_team->t.t_master_tid" was initialized at the moment of R2's creation and represents the "thread_num" of the thread T inside the region R1 which encloses R2. Since the thread T is a worker thread of the region R1, "thread_num" takes the value 1, which is a contradiction.

This patch proposes to use "lwt" instead of "prev_lwt" when determining the "thread_num". If "lwt" exists, the task at the requested level belongs to a serialized parallel region. Since a serialized parallel region is executed by one thread only, "thread_num" takes the value 0.

Similarly, assume that __ompt_get_task_info_internal is called with the argument "ancestor_level == 2" during the execution of region R3. The function should determine the "thread_num" of the thread T inside the team of region R1. Since the thread is a worker inside the region R1, one should expect that "thread_num" takes the value 1. After the loop finishes, the following holds: "lwt == NULL", "prev_lwt != NULL", and "prev_team" represents the team information about the innermost serialized parallel region R3. This leads to execution of the assignment "thread_num = 0", which is a contradiction. Ignoring "prev_lwt" leads to executing the assignment "thread_num = prev_team->t.t_master_tid" instead. From the previous explanation, it follows that "thread_num" takes the value 1.

Note that the "prev_lwt" variable is now unnecessary and is therefore removed. This patch introduces a test case which represents the OpenMP program described earlier in the summary.

Differential Revision: https://reviews.llvm.org/D110699
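The selection rule described in this message can be sketched in isolation. This is a simplified model and not the runtime's code: `TeamInfo`, `LwtInfo`, and `select_thread_num` are hypothetical stand-ins for the kmp structures, the walk over the task hierarchy is omitted, and `prev_team` is assumed to be non-null.

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime's team and lightweight-team records.
struct TeamInfo {
  int master_tid; // models t.t_master_tid
};
struct LwtInfo {};

// State after the walk to the requested ancestor level:
//   lwt       - non-null if the task at that level belongs to a serialized region
//   prev_team - team record consulted when the level is a regular region
int select_thread_num(const LwtInfo *lwt, const TeamInfo *prev_team) {
  if (lwt != nullptr)
    return 0; // a serialized region is executed by exactly one thread
  assert(prev_team != nullptr && "sketch assumes a team record is available");
  return prev_team->master_tid;
}
```

With `master_tid == 1`, the two scenarios from the message give 0 for "ancestor_level == 1" (where `lwt` is set) and 1 for "ancestor_level == 2" (where `lwt` is null).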
-
Vladimir Inđić authored
__kmp_fork_call sets the enter_frame of the active task (th_current_task) before a new parallel region begins. After the region is finished, the enter_frame is cleared. The old implementation of __kmpc_fork_call didn't clear the enter_frame of the active task. Also, the way the enter_frame of the active task was initialized was wrong. Consider the following two OpenMP programs.

The first program: Let R1 be a serialized parallel region that encloses another serialized parallel region R2. Assume that the thread that executes R2 is going to create a new serialized parallel region R3 by executing __kmpc_fork_call. This thread is responsible for setting the enter_frame of R2's implicit task. Note that the information about R2's implicit task is present inside master_th->th.th_current_task at this moment, while lwt represents the information about R1's implicit task. The old implementation uses lwt and resets the enter_frame of R1's implicit task instead of R2's implicit task. The new implementation uses master_th->th.th_current_task instead.

The second program: Consider an OpenMP program that contains a parallel region R1 which encloses an explicit task T. Assume that the thread should create another parallel region R2 during the execution of T. __kmpc_fork_call is responsible for creating R2 and setting the enter_frame of T, whose information is present inside master_th->th.th_current_task. The old implementation tries to set the frame of parent_team->t.t_implicit_task_taskdata[tid], which corresponds to the implicit task of R1, instead of T.

Differential Revision: https://reviews.llvm.org/D112419
-
Joachim Protze authored
As discussed in D108488, testing for invariants of omp_get_wtime would be more reliable than testing for duration of sleep, as return from sleep might be delayed due to system load. Alternatively/in addition, we could compare the time measured by omp_get_wtime to time measured with C++11 chrono (for portability?). Differential Revision: https://reviews.llvm.org/D112458
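One invariant of this kind is that successive readings never go backwards. The sketch below checks that property for any clock passed in as a callable. It is an illustration and not the test from the patch: `never_decreases` and `steady_seconds` are hypothetical names, and `std::chrono::steady_clock` stands in for `omp_get_wtime` so that the block builds without OpenMP.

```cpp
#include <chrono>
#include <functional>

// Returns true if `samples` successive readings of `now` never decrease.
bool never_decreases(const std::function<double()> &now, int samples) {
  double prev = now();
  for (int i = 1; i < samples; ++i) {
    double cur = now();
    if (cur < prev)
      return false;
    prev = cur;
  }
  return true;
}

// Stand-in clock: seconds since the steady_clock epoch as a double.
double steady_seconds() {
  using namespace std::chrono;
  return duration<double>(steady_clock::now().time_since_epoch()).count();
}
```

In an OpenMP build, `omp_get_wtime` could be passed to `never_decreases` directly, since it also takes no arguments and returns a `double`.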
-
Joachim Protze authored
The CHECK: line in the test had no effect, because the test does not pipe to FileCheck. Since the test only checks for a single value, encode the result in the return value of the test.
-
Joachim Protze authored
For some tests with target-related functionality, icc 18/19 tries to link libioffload_target.so.5, which fails because of missing COI symbols.
-
Joachim Protze authored
Also mark the test as unsupported by intel-21, because the test does not terminate.
-
Joachim Protze authored
Where possible, declare the variable before the loop. Where that is not possible, explicitly request -std=c99 (this could be limited to specific compilers like icc).
-
Joachim Protze authored
-
Raphael Isemann authored
* clang-format test source.
* Removed the dead setup code.
* Using expect_expr etc. instead of raw expect.
* Slightly expanded with tests for vtable pointers (which mostly just crash atm.)
* Removed some other minor test guideline problems.
-
Kazu Hirata authored
-
Craig Topper authored
All but 2 of the vector builtins are only used by clang_builtin_alias. When using clang_builtin_alias, the type string of the builtin is never checked. Only the types in the function definition used for the alias are checked. This patch takes advantage of this to share a single builtin for many different types. We already used type overloads on the IR intrinsic, so the codegen for the builtins that are being merged was already the same. This extends the type overloading to the builtins.

I had to make a few tweaks to make this work.
- Floating point vector-vector vmerge now uses the vmerge intrinsic instead of the vfmerge intrinsic. New isel patterns and tests are added to support this.
- The SemaChecking for the immediate of vset_v/vget_v has been removed. Determining the valid range is harder now. I've added masking to ManualCodegen to ensure valid IR for invalid input.

This reduces the number of builtins from ~25000 to ~1100.

Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D112102
-
Nikita Popov authored
-
MaheshRavishankar authored
This reverts commit c86f218f. Revert because it causes build failure.
-
Craig Topper authored
Reviewed By: frasercrmck, kito-cheng Differential Revision: https://reviews.llvm.org/D112342
-
Danila Malyutin authored
Differential Revision: https://reviews.llvm.org/D107582
-
MaheshRavishankar authored
Using callbacks for allocation/deallocation allows users to override the default. Also add an option to the comprehensive bufferization pass to use `alloca` instead of `alloc`s. Note that this option is just for testing. The option to use `alloca` does not work well with the option to allow for returning memrefs. Differential Revision: https://reviews.llvm.org/D112166
-
Jeremy Morse authored
-
Pavel Labath authored
This has been in there since forever, but only started to matter once 40e4ac3e changed how we print the string.
-
Joe Loser authored
Several parts in the `chrono` synopsis for C++20 are not yet implemented. The current recommendation is that things are added to the synopsis when implemented -- not beforehand. As such, remove the not-yet-implemented parts to avoid confusion. Reviewed By: ldionne, Quuxplusone, #libc Differential Revision: https://reviews.llvm.org/D111922
-
Konstantin Varlamov authored
Also fix a few places in the `shared_ptr` implementation where `element_type` was passed to the `__is_compatible` helper. This could result in `remove_extent` being applied twice to the pointer's template type (first by the definition of `element_type` and then by the helper), potentially leading to somewhat less readable error messages for some incorrect code. Differential Revision: https://reviews.llvm.org/D112092
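The effect of stripping an extent twice can be shown with `std::remove_extent` alone. The aliases below are hypothetical models and not libc++'s internals: `element_type_of` mirrors how `shared_ptr<T>::element_type` is defined (as `remove_extent_t<T>`), and `helper_view_of` models a helper that applies `remove_extent` to whatever it is given.

```cpp
#include <type_traits>

// Model of shared_ptr<T>::element_type.
template <class T>
using element_type_of = std::remove_extent_t<T>;

// Hypothetical helper that strips one extent from its argument itself.
template <class T>
using helper_view_of = std::remove_extent_t<T>;

// Passing T: one extent is stripped, so the helper sees int[3].
static_assert(std::is_same<helper_view_of<int[][3]>, int[3]>::value,
              "single strip");
// Passing element_type: two extents are stripped, so the helper sees int.
static_assert(
    std::is_same<helper_view_of<element_type_of<int[][3]>>, int>::value,
    "double strip");

// True when passing element_type instead of T changes what the helper sees.
template <class T>
constexpr bool double_strip_changes_type() {
  return !std::is_same<helper_view_of<T>,
                       helper_view_of<element_type_of<T>>>::value;
}
```

In this model the two spellings only diverge for multidimensional array types; for `int` and `int[]` the helper sees `int` either way.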
-
Yaxun (Sam) Liu authored
Fix assertion in UsedDeclVisitor where clang is trying to look up a destructor for a forward declared class. Fixes: https://bugs.llvm.org/show_bug.cgi?id=52250 Reviewed by: Artem Belevich, John McCall Differential Revision: https://reviews.llvm.org/D112235
-
Louis Dionne authored
Several of our C++20 and C++2b papers were missing the actual revision number that was voted in to the Standard. The revision number is quite important because in a few cases, a paper has a revision *after* the one that is voted into the Standard, which isn't voted into the Standard. Hence, if we simply followed the wg21.link blindly and implemented that, we'd end up implementing the latest revision of the paper, which might not have been voted. As a fly-by fix, I found out that P1664 had been withdrawn from the straw polls and had never been voted into the Standard. This commit removes that entry from our list. Differential Revision: https://reviews.llvm.org/D112339
-
Michał Górny authored
Disable the non-blocking mode that is enabled only for the file:// and serial:// protocols. All read operations should be going through the select(2) in ConnectionFileDescriptor::BytesAvailable, which effectively erases (non-)blocking mode differences in reading. We do want to perform writes in blocking mode. Differential Revision: https://reviews.llvm.org/D112442
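The reason select(2) hides the blocking mode on the read side is that the caller only reads once the descriptor is reported ready. A minimal POSIX sketch of that readiness check follows; `readable_within` is a hypothetical helper and not LLDB code, and it assumes `fd` is below `FD_SETSIZE`.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Returns true if `fd` has data to read within `timeout_ms` milliseconds.
// Errors (including EINTR) are reported as "not readable" in this sketch.
bool readable_within(int fd, int timeout_ms) {
  fd_set read_fds;
  FD_ZERO(&read_fds);
  FD_SET(fd, &read_fds);

  timeval tv;
  tv.tv_sec = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;

  return select(fd + 1, &read_fds, nullptr, nullptr, &tv) > 0;
}
```

A read issued only after this returns true does not wait for data to arrive, whether or not `O_NONBLOCK` is set on the descriptor.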
-
Jeremy Morse authored
During register allocation, some instructions can have stack spills fused into them. It means that when vregs are allocated on the stack we can convert:

    SETCCr %0
    DBG_VALUE %0

to

    SETCCm %stack.0
    DBG_VALUE %stack.0

Unfortunately instruction referencing finds this harder: a store to the stack doesn't have a specific operand number, therefore we don't substitute the old operand for a new operand, and the location is dropped. This patch implements a solution: just recognise the memory operand attached to an instruction with a Special Number (TM), and record a substitution between the old value and the new one.

This patch adds substitution code to InlineSpiller to record such fused spills, and tracking in InstrRefBasedLDV to recognise such values, and produce the value numbers for them. Everything to do with the movement of stack-defined values is already handled in InstrRefBasedLDV.

Differential Revision: https://reviews.llvm.org/D111317
-
Danila Malyutin authored
Previously, the code would crash with "unhandled opcode in isAArch64FrameOffsetLegal" when there was a spill from extractelement. Fixes PR52249. Differential Revision: https://reviews.llvm.org/D112311
-
Pavel Labath authored
-
Vy Nguyen authored
`%t/basics` already exists - it would be nice to be able to examine it afterward Differential Revision: https://reviews.llvm.org/D112392
-
Nikita Popov authored
D109746 made BasicAA use range information to determine the minimum/maximum GEP offset. However, it was limited to the case of a single variable index. This patch extends support to multiple indices by adding all the ranges together. Differential Revision: https://reviews.llvm.org/D112378
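The idea of adding the ranges together can be sketched with plain interval arithmetic. This is an illustration under assumptions and not BasicAA's code: `Range`, `ScaledIndex`, and `offset_range` are hypothetical names, bounds are inclusive, and overflow is ignored.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Range {
  int64_t lo, hi; // inclusive bounds
};

struct ScaledIndex {
  int64_t scale; // multiplier applied to the index, may be negative
  Range range;   // known range of the index value
};

// Bounds of base + sum(scale_i * index_i) over all variable indices.
Range offset_range(int64_t base, const std::vector<ScaledIndex> &indices) {
  Range total{base, base};
  for (const ScaledIndex &idx : indices) {
    int64_t a = idx.scale * idx.range.lo;
    int64_t b = idx.scale * idx.range.hi;
    total.lo += std::min(a, b); // a negative scale swaps the endpoints
    total.hi += std::max(a, b);
  }
  return total;
}
```

For a base of 8, an index in [0, 3] scaled by 4, and an index in [1, 5] scaled by -2, the offset lies in [-2, 18].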
-
Alexey Bataev authored
Change the order of the reduction/binop-args pair vectorization attempts: try to find the reduction first and postpone vectorization of the binop args. This may help to find more reduction patterns and vectorize them. Part of D111574. Differential Revision: https://reviews.llvm.org/D112224
-
Chris Bieneman authored
This patch adds a documentation note about the LLVM_USE_SPLIT_DWARF CMake option which is useful to reduce linker memory usage.
-
Jeremy Morse authored
This patch swaps two lines -- the CurSucc reference can be invalidated by the call to DFS.push_back, therefore that should happen last. The usual hat-tip to asan for catching this. This patch also swaps an earlier call to ToAdd.insert and DFS.push_back, where a stable iterator (from successors()) is being used. This isn't strictly necessary, but is good for consistency and avoiding readers asking themselves why the two code portions have a different order.
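The hazard is the general one for `std::vector`: `push_back` may reallocate, which invalidates references to existing elements. The sketch below shows the safe ordering in a small iterative depth-first search. It is not the patched LLVM code, and `Graph` and `dfs_order` are hypothetical names.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<int>>; // adjacency list

// Iterative pre-order DFS. Each stack entry is (node, next successor index).
std::vector<int> dfs_order(const Graph &g, int root) {
  std::vector<int> order{root};
  std::set<int> seen{root};
  std::vector<std::pair<int, std::size_t>> stack;
  stack.emplace_back(root, 0);

  while (!stack.empty()) {
    auto &top = stack.back(); // reference into the vector's storage
    if (top.second == g[top.first].size()) {
      stack.pop_back();
      continue;
    }
    int succ = g[top.first][top.second];
    ++top.second; // finish every use of `top` first
    if (seen.insert(succ).second) {
      order.push_back(succ);
      stack.emplace_back(succ, 0); // may reallocate, so it happens last
    }
  }
  return order;
}
```

Moving `++top.second` below the `emplace_back` call would write through a reference that may no longer point into the vector.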
-
Kadir Cetinkaya authored
-
Sanjay Patel authored
(i8 X ^ 128) & (i8 X s>> 7) --> usubsat X, 128

As suggested in D112085, we can substitute 'xor' with 'add' in this pattern, and it is logically equivalent: https://alive2.llvm.org/ce/z/eJtWWC

We canonicalize to 'xor' in IR, but SDAG does not do that (and it probably should not - https://llvm.org/PR52267 ), so it is possible to see either pattern in codegen. Note that 'sub' is another potential pattern, but that is canonicalized to 'add' in DAGCombiner, so we don't need to worry about that variation.

Differential Revision: https://reviews.llvm.org/D112377
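For 8-bit values the equivalence is small enough to check exhaustively. The sketch below models the pattern on `uint8_t`, with the arithmetic shift `X s>> 7` written as a sign-bit splat so the code does not depend on how signed right shifts behave. All function names are hypothetical.

```cpp
#include <cstdint>

// Unsigned saturating subtraction on 8 bits.
uint8_t usubsat8(uint8_t x, uint8_t y) {
  return x > y ? static_cast<uint8_t>(x - y) : static_cast<uint8_t>(0);
}

// All-ones if the sign bit of x is set, else zero: the value of `i8 X s>> 7`.
uint8_t sign_splat(uint8_t x) {
  return (x & 0x80) ? static_cast<uint8_t>(0xFF) : static_cast<uint8_t>(0x00);
}

// (X ^ 128) & (X s>> 7)
uint8_t xor_form(uint8_t x) {
  return static_cast<uint8_t>((x ^ 0x80) & sign_splat(x));
}

// (X + 128) & (X s>> 7), with the addition wrapping modulo 256.
uint8_t add_form(uint8_t x) {
  return static_cast<uint8_t>(static_cast<uint8_t>(x + 0x80) & sign_splat(x));
}

// Checks both forms against usubsat X, 128 for every 8-bit input.
bool forms_agree_for_all_i8() {
  for (int v = 0; v < 256; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    uint8_t want = usubsat8(x, 128);
    if (xor_form(x) != want || add_form(x) != want)
      return false;
  }
  return true;
}
```

Both forms agree because the mask is zero whenever the sign bit is clear, and when it is set, flipping the top bit and adding 128 modulo 256 both produce X - 128.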
-
Tim Northover authored
Unfortunately ToT has changed enough from the revision where this actually caused problems that the test no longer triggers an assertion failure.
-
Dmitry Vyukov authored
Trapping on CHECK failure makes it more convenient to use with gdb (no need to set a breakpoint each time). Without a debugger attached, the trap should terminate the program as well. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D112440
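The general technique can be sketched with the `__builtin_trap()` builtin available in GCC and Clang. `SKETCH_CHECK` and `checked_div` are hypothetical names, and this is not the sanitizer runtime's actual CHECK implementation.

```cpp
// Hypothetical CHECK-style macro: on failure, execute a trap instruction at
// the failing site. Under a debugger, execution stops there; without one,
// the program is terminated abnormally.
#define SKETCH_CHECK(cond)                                                     \
  do {                                                                         \
    if (!(cond))                                                               \
      __builtin_trap();                                                        \
  } while (0)

// Example use: guard a precondition before dividing.
int checked_div(int num, int den) {
  SKETCH_CHECK(den != 0);
  return num / den;
}
```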
-
Dmitry Vyukov authored
PPC64 bot failed with the following error. The buildbot output is not particularly useful, but looking at other similar tests, it seems that there is something broken in free stacks on PPC64. Use the same hack as other tests use to expect an additional stray frame.

/home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:28:11: error: CHECK: expected string not found in input
// CHECK: Previous write of size 4 at {{.*}} by thread T1{{.*}}:
          ^
<stdin>:13:9: note: scanning from here
    #1 main /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:17:3 (free_race3.c.tmp+0x1012fab8)
        ^
<stdin>:17:2: note: possible intended match here
ThreadSanitizer: reported 1 warnings
 ^

Input file: <stdin>
Check file: /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
            8: Previous write of size 4 at 0x7ffff4d01ab0 by thread T1:
            9: #0 Thread /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:8:10 (free_race3.c.tmp+0x1012f9dc)
           10:
           11: Thread T1 (tid=3222898, finished) created by main thread at:
           12: #0 pthread_create /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1001:3 (free_race3.c.tmp+0x100b9040)
           13: #1 main /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:17:3 (free_race3.c.tmp+0x1012fab8)
check:28'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
           14:
check:28'0     ~
           15: SUMMARY: ThreadSanitizer: data race /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:19:3 in main
check:28'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           16: ==================
check:28'0     ~~~~~~~~~~~~~~~~~~~
           17: ThreadSanitizer: reported 1 warnings
check:28'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:28'1     ?                                     possible intended match
>>>>>>

Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112444
-
Thomas Symalla authored
-
Nicolas Vasilache authored
This revision also moves some code around to improve overall structure. Differential Revision: https://reviews.llvm.org/D112437
-