- Mar 07, 2022
-
-
Jan Svoboda authored
Since D106876, PCM files don't report module maps as input files unless they contributed to the compilation. Reporting only module maps of (transitively) imported modules is not enough, though. For modules marked with `[no_undeclared_includes]`, other module maps affect the compilation by introducing anti-dependencies. This patch makes sure such module maps are being reported as input files. Depends on D120463. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D120464
-
Jan Svoboda authored
This patch simplifies a test that checks only used module map files are reported as input files in PCM files. Instead of using opaque `diff`, this patch uses `clang -module-file-info` and `FileCheck` to verify this. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D120463
-
Qiu Chaofan authored
Currently in Clang, we have two types of builtins for fnmsub operation: one for float/double vector, they'll be transformed into IR operations; one for float/double scalar, they'll generate corresponding intrinsics. But for the vector version of builtin, the 3 op chain may be recognized as expensive by some passes (like early cse). We need some way to keep the fnmsub form until code generation. This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub intrinsics. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D116015
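For readers unfamiliar with the operation, fnmsub computes -(a * b - c) with a single rounding. The following is only a minimal scalar sketch in portable C++; the helper name `fnmsub_ref` is hypothetical, and it is neither the PowerPC builtin nor the new intrinsic. It spells out the three-operation chain (negate, fused multiply-add, negate) that the message says some passes may treat as expensive.

```cpp
#include <cmath>

// Reference semantics of fnmsub: -(a * b - c), rounded once.
// Hypothetical helper for illustration only; not a Clang builtin.
double fnmsub_ref(double a, double b, double c) {
  double neg_c = -c;                     // op 1: negate the addend
  double fused = std::fma(a, b, neg_c);  // op 2: a * b - c with one rounding
  return -fused;                         // op 3: negate the result
}
```

Keeping the operation as a single intrinsic until code generation avoids exposing these three separate steps to the optimizer.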
-
William S. Moses authored
The OpenMPIRBuilder has a bug. Specifically, suppose you have two nested OpenMP parallel regions (written in MLIR for ease):

```
omp.parallel {
  %a = ...
  omp.parallel {
    use(%a)
  }
}
```

As OpenMP only permits pointer-like inputs, the builder will wrap all of the inputs into a stack allocation, and then pass this allocation to the inner parallel. For example, we would want to get something like the following:

```
omp.parallel {
  %a = ...
  %tmp = alloc
  store %tmp[] = %a
  kmpc_fork(outlined, %tmp)
}
```

However, in practice, this is not what currently occurs in the context of nested parallel regions. Specifically to the OpenMPIRBuilder, the entirety of the function (at the LLVM level) is currently inlined, with blocks marking the corresponding start and end of each region.

```
entry:
  ...
parallel1:
  %a = ...
  ...
parallel2:
  use(%a)
  ...
endparallel2:
  ...
endparallel1:
  ...
```

When the allocation is inserted, it is presently inserted into the parent of the entire function (e.g. entry) rather than the parent allocation scope of the function being outlined. If we were outlining parallel2, the corresponding alloca location would be parallel1. This causes a variety of bugs, including https://github.com/llvm/llvm-project/issues/54165 as one example. This PR allows the stack allocation to be created at the correct allocation block, and thus remedies such issues.

Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D121061
-
- Mar 05, 2022
-
-
Shao-Ce SUN authored
Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D112774
-
Thomas Lively authored
-
Thomas Lively authored
We previously had logic to disable pthreads, set the ThreadModel to Single, and disable thread-safe statics when the atomics target feature is disabled, since that means that the resulting program will not be used in a threaded context. Similarly, check for the presence of the bulk-memory feature, since that is also necessary to produce multithreaded programs. Differential Revision: https://reviews.llvm.org/D121014
-
- Mar 04, 2022
-
-
Yaxun (Sam) Liu authored
Update active offload kind of actions for OpenMP programs. The change is expected as of e5eb3650.
-
Yaxun (Sam) Liu authored
When both CUDA or HIP programs and C++ programs are passed to clang driver without -c, C++ programs are treated as CUDA or HIP program, which is incorrect. This is because action builder sets the offloading kind of input job actions to the linking action to be the union of offloading kind of the input job actions, i.e. if there is one HIP or CUDA input to the linker, then all the input to the linker is marked as HIP or CUDA. To fix this issue, the offload action builder tracks the originating input argument of each host action, which allows it to determine the active offload kind of each host action. Then the offload kind of each input action to the linker can be determined individually. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D120911
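The difference between the old union-based marking and the per-input tracking can be sketched with a toy model. Everything below (the `Kind*` flags, `HostAction`, and both functions) is hypothetical and does not reproduce Clang's driver types; it only illustrates why taking the union mislabels a plain C++ input.

```cpp
#include <string>
#include <vector>

// Toy bit flags; hypothetical, not Clang's real offload-kind enum.
enum OffloadKind : unsigned { KindNone = 0, KindCuda = 1, KindHip = 2 };

struct HostAction {
  std::string input;  // originating input argument
  unsigned kind;      // active offload kind of that input
};

// Old behaviour: every linker input is marked with the union of all kinds.
std::vector<unsigned> kindsByUnion(const std::vector<HostAction> &actions) {
  unsigned all = KindNone;
  for (const HostAction &a : actions) all |= a.kind;
  return std::vector<unsigned>(actions.size(), all);
}

// Fixed behaviour: each linker input keeps the kind of its own origin.
std::vector<unsigned> kindsPerInput(const std::vector<HostAction> &actions) {
  std::vector<unsigned> out;
  for (const HostAction &a : actions) out.push_back(a.kind);
  return out;
}
```

With one HIP input and one C++ input, `kindsByUnion` marks both as HIP, while `kindsPerInput` leaves the C++ input with no offload kind.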
-
Yaxun (Sam) Liu authored
When both HIP and C++ programs are input files to clang with -c, clang treats C++ programs as HIP programs, which is incorrect. This is because the action builder does not set the correct offloading kind for the job actions of C++ programs. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D120910
-
Arthur O'Dwyer authored
Fixes #49188. Differential Revision: https://reviews.llvm.org/D119184
-
4vtomat authored
This commit divides the large test files (over 30k lines) under clang/test/CodeGen/RISCV, including:
rvv-intrinsics/vloxseg.c
rvv-intrinsics/vluxseg.c
rvv-intrinsics-overloaded/vloxseg.c
rvv-intrinsics-overloaded/vluxseg.c
into "non-masked" and "masked" versions, which can reduce the test cases in a single file by 50%. Differential Revision: https://reviews.llvm.org/D120967
-
Florian Hahn authored
This test file has grown to the point where it takes a huge amount of time to run. At the moment, this test seems to consistently time out when running in the pre-commit checks in Phabricator with a 10 minute timeout. For example see https://reviews.llvm.org/harbormaster/unit/view/2832724/ While splitting up the test file is not ideal, it is even more undesirable to have huge test files that time out in common settings. This patch splits up the test file roughly in the middle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D120876
-
Florian Hahn authored
This test file has grown to the point where it takes a huge amount of time to run. At the moment, this test seems to consistently time out when running in the pre-commit checks in Phabricator with a 10 minute timeout. For example see https://reviews.llvm.org/harbormaster/unit/view/2832723/ While splitting up the test file is not ideal, it is even more undesirable to have huge test files that time out in common settings. This patch splits up the test file roughly in the middle. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D120875
-
Tim Northover authored
The baremetal-sysroot test fails when the toolchain is configured with DEFAULT_SYSROOT. So, to emulate not having passed one at all, let's pass an empty sysroot instead. https://reviews.llvm.org/D119144 Patch by Carlo Cabrera <carlo.antonio.cabrera@gmail.com>
-
- Mar 03, 2022
-
-
Shivam authored
A few weeks back I was experimenting with reading uninitialized values from src, which is actually a bug, but the CSA seems to give up at that point. I was curious about that and pinged @steakhal on Discord; according to him this is a genuine issue that needs to be fixed. So I went ahead with fixing this bug, with thanks to @steakhal, who helped me create this patch. This change seems to break some tests, but the problem was genuine, and the broken tests also need to be fixed accordingly. I added a test, but we need more tests; I'll try to add more. Thanks. Reviewed By: steakhal, NoQ Differential Revision: https://reviews.llvm.org/D120489
-
Corentin Jabot authored
Fixes https://github.com/llvm/llvm-project/issues/54151 Reviewed By: erichkeane, aaron.ballman Differential Revision: https://reviews.llvm.org/D120881
-
Roy Jacobson authored
The standard requires[0] member function constraints to be checked when explicitly instantiating classes. This patch adds this constraints check. This issue is tracked as #46029 [1].

Note that there's a related open CWG issue (2421[2]) about what to do when multiple candidates have satisfied constraints. This is particularly an issue because mangling doesn't contain function constraints, and so the following code still ICEs with "definition with same mangled name '_ZN1BIiE1fEv' as another definition":

    template<class T>
    struct B {
      int f() requires std::same_as<T, int> { return 0; }
      int f() requires (std::same_as<T, int> && !std::same_as<T, char>) { return 1; }
    };
    template struct B<int>;

Also note that the constraints checking while instantiating *functions* is still not implemented. I started looking at it, but it's a bit more complicated. I believe in such a case we have to consider the partial constraints order and potentially choose the best candidate out of the set of multiple valid ones.

[0]: https://eel.is/c++draft/temp.explicit#10
[1]: https://github.com/llvm/llvm-project/issues/46029
[2]: https://cplusplus.github.io/CWG/issues/2421.html

Differential Revision: https://reviews.llvm.org/D120255
-
Kristóf Umann authored
The problem with leak bug reports is that the most interesting event in the code is likely the one that did not happen -- lack of ownership change and lack of deallocation, which is often present within the same function that the analyzer inlined anyway, but not on the path of execution on which the bug occurred. We struggle to understand that a function was responsible for freeing the memory, but failed. D105819 added a new visitor to improve memory leak bug reports. In addition to inspecting the ExplodedNodes of the bug path, the visitor tries to guess whether the function was supposed to free memory, but failed to. Initially (in D108753), this was done by checking whether a CXXDeleteExpr is present in the function. If so, we assume that the function was at least partly responsible, and prevent the analyzer from pruning bug report notes in it. This patch improves this heuristic by recognizing all deallocator functions that MallocChecker itself recognizes, by reusing MallocChecker::isFreeingCall. Differential Revision: https://reviews.llvm.org/D118880
-
Haojian Wu authored
Previously, we didn't build a DeclRefExpr that refers to an invalid declaration. In this patch, we handle this case by building an empty RecoveryExpr, which preserves more of the broken code (AST parent nodes that contain the RecoveryExpr are preserved in the AST). Differential Revision: https://reviews.llvm.org/D120812
-
Aakanksha authored
Differential Revision: https://reviews.llvm.org/D120846
-
- Mar 02, 2022
-
-
Stanislav Mekhanoshin authored
This is target definition only. Differential Revision: https://reviews.llvm.org/D120688
-
Tong Zhang authored
Clang is crashing on the following statement:

    char var[9];
    __asm__ ("" : "=r" (var) : "0" (var));

This is similar to the existing test crbug_999160_regtest. The issue happens when EmitAsmStmt is trying to convert the input to match the output type length. However, that is not guaranteed to be successful all the time, and if the statement itself is invalid, like having an array type in the example, we should give a regular error message here instead of using assert(). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D120596
-
Saiyedul Islam authored
The intermediate file of one of the tests was getting overwritten due to a name clash.
-
Haojian Wu authored
The linear scan should not escape the TargetedStates range. Differential Revision: https://reviews.llvm.org/D120723
-
Saiyedul Islam authored
The `hip-openmp-compatible` flag treats the hip and hipv4 offload kinds as compatible with the openmp offload kind while extracting code objects from a heterogeneous archive library. The reverse is also considered compatible if the hip code was compiled with -fgpu-rdc. This flag only relaxes the compatibility criteria on `OffloadKind`; the rest of the components, such as `Triple` and `GPUArch`, still need to be compatible. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D120697
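A rough sketch of the relaxed check might look like the following. The struct, field, and function names are hypothetical, the triple and architecture comparison is simplified to exact string equality, and the -fgpu-rdc condition mentioned above is not modeled; the point is only that the flag loosens the offload-kind comparison and nothing else.

```cpp
#include <string>

// Hypothetical stand-in for a code object's target description.
struct TargetInfo {
  std::string kind;    // e.g. "hip", "hipv4", "openmp"
  std::string triple;
  std::string arch;
};

static bool isHipKind(const std::string &k) {
  return k == "hip" || k == "hipv4";
}

// Only the offload-kind comparison is relaxed by the flag.
bool kindsCompatible(const std::string &a, const std::string &b,
                     bool hipOpenmpCompatible) {
  if (a == b) return true;
  if (!hipOpenmpCompatible) return false;
  return (isHipKind(a) && b == "openmp") || (a == "openmp" && isHipKind(b));
}

// Triple and arch must still match (simplified here to exact equality).
bool isCompatible(const TargetInfo &object, const TargetInfo &requested,
                  bool hipOpenmpCompatible) {
  return kindsCompatible(object.kind, requested.kind, hipOpenmpCompatible) &&
         object.triple == requested.triple && object.arch == requested.arch;
}
```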
-
Zhihao Yuan authored
C++20 non-type template parameter prints `MyType<{{116, 104, 105, 115}}>` when the code is as simple as `MyType<"this">`. This patch prints `MyType<{"this"}>`, with one layer of braces preserved for the intermediate structural type to trigger CTAD. `StringLiteral` handles this case, but `StringLiteral` inside `APValue` code looks like a circular dependency. The proposed patch implements a cheap strategy to emit string literals in diagnostic messages only when they are readable and fall back to integer sequences. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D115031
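The "readable string, else integer sequence" fallback can be illustrated with a small standalone function. This is a sketch under assumptions, not the patch's code: `renderChars` is a hypothetical name, and "readable" is assumed here to mean printable ASCII without quotes or backslashes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: render a char array as a quoted string when every
// element is plain printable ASCII, otherwise as a list of integers.
std::string renderChars(const std::vector<unsigned char> &chars) {
  bool readable = !chars.empty();
  for (unsigned char c : chars)
    if (c < 0x20 || c > 0x7e || c == '"' || c == '\\') readable = false;

  std::string out;
  if (readable) {
    out = "\"";
    out.append(chars.begin(), chars.end());
    out += "\"";
    return out;
  }
  for (std::size_t i = 0; i < chars.size(); ++i) {
    if (i != 0) out += ", ";
    out += std::to_string(static_cast<int>(chars[i]));
  }
  return out;
}
```

For the characters 116, 104, 105, 115 this yields `"this"`; an array containing a non-printable byte falls back to the comma-separated integers.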
-
- Mar 01, 2022
-
-
Florian Mayer authored
Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D120437
-
Nicolas Miller authored
NOTE: this is a follow-up commit with the missing clang-side changes. This patch adds builtins and intrinsics for the f16 and f16x2 variants of the ex2 instruction. These two variants were added in PTX7.0, and are supported by sm_75 and above. Note that this isn't wired with the exp2 llvm intrinsic because the ex2 instruction is only available in its approx variant. Running ptxas on the assembly generated by the test f16-ex2.ll works as expected. Differential Revision: https://reviews.llvm.org/D119157
-
Jakub Chlanda authored
NOTE: follow-up commit with the missing clang-side changes.
This patch adds builtins/intrinsics for the following variants of FMA:
- f16, f16x2
  - rn
  - rn_ftz
  - rn_sat
  - rn_ftz_sat
  - rn_relu
  - rn_ftz_relu
- bf16, bf16x2
  - rn
  - rn_relu
ptxas (Cuda compilation tools, release 11.0, V11.0.194) is happy with the generated assembly. Differential Revision: https://reviews.llvm.org/D118977
-
Jakub Chlanda authored
Adds support for the following builtins:
abs, neg:
- .bf16
- .bf16x2
min, max:
- {.ftz}{.NaN}{.xorsign.abs}.f16
- {.ftz}{.NaN}{.xorsign.abs}.f16x2
- {.NaN}{.xorsign.abs}.bf16
- {.NaN}{.xorsign.abs}.bf16x2
- {.ftz}{.NaN}{.xorsign.abs}.f32
Differential Revision: https://reviews.llvm.org/D117887
-
Tong Zhang authored
Currently, adding the attribute no_sanitize("bounds") does not disable -fsanitize=local-bounds (which is also enabled by -fsanitize=bounds). The Clang frontend handles -fsanitize=array-bounds, which can already be disabled by no_sanitize("bounds"). However, instrumentation added by the BoundsChecking pass in the middle-end cannot be disabled by the attribute. The fix is very similar to D102772, which added the ability to selectively disable a sanitizer pass on certain functions. In this patch, if no_sanitize("bounds") is provided, an additional function attribute (NoSanitizeBounds) is attached to the IR to let the BoundsChecking pass know we want to disable local-bounds checking. In order to support this feature, the IR is extended (similar to D102772) so that Clang can preserve the information and let the BoundsChecking pass know bounds checking is disabled for certain functions. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D119816
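As a usage illustration, the attribute is applied per function. In the sketch below the function names are hypothetical and all accesses are in bounds; the attribute only affects whether bounds instrumentation is emitted when building with -fsanitize=bounds, not what the code computes.

```cpp
// Opted out of bounds instrumentation; with this patch that is meant to
// cover the local-bounds checks added by the BoundsChecking pass as well.
__attribute__((no_sanitize("bounds")))
int readUninstrumented(const int *data, int index) {
  return data[index];
}

// No attribute: instrumented as usual under -fsanitize=bounds.
int readInstrumented(const int *data, int index) {
  return data[index];
}
```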
-
Egor Zhdan authored
This new flag enables `__has_feature(cxx_unstable)`, which would replace libc++ macros for individual unstable/experimental features, e.g. `_LIBCPP_HAS_NO_INCOMPLETE_RANGES` or `_LIBCPP_HAS_NO_INCOMPLETE_FORMAT`. This would make it easier and more convenient to opt in to all libc++ unstable features at once. Differential Revision: https://reviews.llvm.org/D120160
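Client code could test for the feature along these lines. This is only a sketch: `kUnstableEnabled` and `unstableFeaturesEnabled` are hypothetical names, and the fallback macro is there so the snippet also compiles on compilers that lack `__has_feature`.

```cpp
// Compilers without __has_feature get a fallback that always answers "no".
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(cxx_unstable)
constexpr bool kUnstableEnabled = true;   // unstable features were requested
#else
constexpr bool kUnstableEnabled = false;  // default: stable features only
#endif

bool unstableFeaturesEnabled() { return kUnstableEnabled; }
```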
-
Kristina Bessonova authored
The NVVM IR specification defines them with an i32 return type:

    declare i32 @llvm.nvvm.match.any.sync.i64(i32 %membermask, i64 %value)
    declare {i32, i1} @llvm.nvvm.match.all.sync.i64(i32 %membermask, i64 %value)
    ...
    The i32 return value is a 32-bit mask where bit position in mask corresponds to thread’s laneid.

as does the PTX ISA:

    9.7.12.8. Parallel Synchronization and Communication Instructions: match.sync

    match.any.sync.type d, a, membermask;
    match.all.sync.type d[|p], a, membermask;
    ...
    Destination d is a 32-bit mask where bit position in mask corresponds to thread’s laneid.

Additionally, ptxas doesn't accept the instructions produced by the NVPTX backend. After this patch, it compiles with no issues. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D120499
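To make the result type concrete, here is a host-side simulation of the match.any semantics quoted above. It is a sketch, not device code: `matchAny` is a hypothetical function and lanes are modeled as vector indices. Note that the compared values are 64-bit while each lane's result is a 32-bit lane mask.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// For each participating lane, build a 32-bit mask whose bit i is set when
// participating lane i holds the same 64-bit value.
std::vector<std::uint32_t> matchAny(const std::vector<std::uint64_t> &values,
                                    std::uint32_t memberMask) {
  std::vector<std::uint32_t> result(values.size(), 0);
  for (std::size_t lane = 0; lane < values.size() && lane < 32; ++lane) {
    if (((memberMask >> lane) & 1u) == 0) continue;  // lane not in the mask
    for (std::size_t other = 0; other < values.size() && other < 32; ++other) {
      if (((memberMask >> other) & 1u) != 0 && values[other] == values[lane])
        result[lane] |= (std::uint32_t{1} << other);
    }
  }
  return result;
}
```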
-
Iain Sandoe authored
Implementation partitions bring two extra cases where we have visibility of module-private data:
1) When we import a module implementation partition.
2) When a partition implementation imports the primary module interface.
We maintain a record of direct imports into the current module, since partition decls from direct imports (but not transitive ones) are visible. The rules on decl-reachability are much more relaxed (with the standard giving permission for an implementation to load dependent modules and for the decls there to be reachable, but not visible). Differential Revision: https://reviews.llvm.org/D118599
-
Balázs Kéri authored
Add a checker to maintain the system-defined value 'errno'. The value is supposed to be set in the future by existing or new checkers that evaluate errno-modifying function calls. Reviewed By: NoQ, steakhal Differential Revision: https://reviews.llvm.org/D120310
-
Zhihao Yuan authored
Given a dependent `T` (maybe an undeduced `auto`),

Before:
    new T(z)  -->  new T((z))  # changes meaning with more args
    new T{z}  -->  new T{z}
    T(z)      -->  T(z)
    T{z}      -->  T({z})      # forbidden if T is auto

After:
    new T(z)  -->  new T(z)
    new T{z}  -->  new T{z}
    T(z)      -->  T(z)
    T{z}      -->  T{z}

Depends on D113393. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D120608
-
Zhihao Yuan authored
https://wg21.link/p0849 Reviewed By: aaron.ballman, erichkeane Differential Revision: https://reviews.llvm.org/D113393
-
Michael Kruse authored
Add the applyStaticChunkedWorkshareLoop method, implementing the static schedule when a chunk-size is specified. Unlike a static schedule without chunk-size (where the chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads. This patch includes the following related changes:
* Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder.
* Remove the chunk parameter from applyStaticWorkshareLoop, as it is ignored by the runtime. Change the value passed to the init function to 0, as without OpenMPIRBuilder.
* Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar, as they are used by both applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop.
* Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause.
Differential Revision: https://reviews.llvm.org/D114413
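The two nested loops can be sketched in plain C++ as follows. This is a model of the iteration space only, not the code the builder emits: `iterationsForThread` is a hypothetical function, chunks are handed out round-robin by thread number as OpenMP specifies for `schedule(static, chunk)`, and `chunk` and `numThreads` are assumed to be positive.

```cpp
#include <vector>

// Iterations executed by `thread` under schedule(static, chunk).
std::vector<long> iterationsForThread(long tripCount, long chunk,
                                      long numThreads, long thread) {
  std::vector<long> out;
  // Outer loop: the chunks assigned to this thread (round-robin).
  for (long start = thread * chunk; start < tripCount;
       start += numThreads * chunk) {
    // The last chunk may be shorter than `chunk`.
    long end = start + chunk < tripCount ? start + chunk : tripCount;
    // Inner loop: the iterations of one chunk.
    for (long i = start; i < end; ++i) out.push_back(i);
  }
  return out;
}
```

With 10 iterations, a chunk size of 2, and 3 threads, thread 0 runs iterations 0, 1, 6, 7; thread 1 runs 2, 3, 8, 9; and thread 2 runs 4, 5.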
-
- Feb 28, 2022
-
-
Yaxun (Sam) Liu authored
It should be oclc_abi_version* instead of abi_version*. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D120557
-