- Jul 22, 2021
-
Jon Chesterfield authored
AMDGPU plugin equivalent of D95155, build without HSA installed locally Compiles a new file, plugins/amdgpu/dynamic_hsa/hsa.cpp, to an object file that exposes the same symbols that the plugin presently uses from hsa. The object file contains dlopen of hsa and cached dlsym calls. Also provides header files corresponding to the subset that is used. This is behind a feature flag, LIBOMPTARGET_FORCE_DLOPEN_LIBHSA, default off. That allows developers to build against the dlopen/dlsym implementation, e.g. while testing this mode. Enabling by default will cause this plugin to build on a wider variety of machines than it does at present so may break some CI builds. That risk can be minimised by reviewing the header dependencies of the library and ensuring it doesn't use any libraries that are not already used by libomptarget. Separating the implementation from enabling by default in case the latter needs to be rolled back after wider CI results. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106559
-
Victor Huang authored
This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtin and intrinsic for "__stbcx". Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D106484
-
Anastasia Stulova authored
Fixed test to use the predefined version macro instead of passing an extra macro on the command line. Patch by Topotuna (Justas Janickas)! Differential Revision: https://reviews.llvm.org/D106254
-
Alexey Bataev authored
Added missed arguments in __tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime functions calls. Differential Revision: https://reviews.llvm.org/D106542
-
Nico Weber authored
treatUndefinedSymbol() was previously called before gatherInputSections() and markLive() for these special symbols, but after them for normal undefineds. For PR50760, treatUndefinedSymbol() will potentially have to create sections, so it's good to move treatUndefinedSymbol() for special undefineds later, so that it can always assume that gatherInputSections() and markLive() have already been called. No intended behavior change, but part of PR50760 (and covered in tests in the patch for the full feature). Differential Revision: https://reviews.llvm.org/D106552
-
Alexey Bataev authored
Revert "[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments." This reverts commit b455f7f2 to fix buildbots.
-
Raphael Isemann authored
[lldb] Remove a wrong assert in TestStructTypes that checks that empty structs in C always have size 0. D105471 fixes the way we assign sizes to empty structs in C mode. Instead of just giving them a size of 0, we now use the size we get from DWARF if possible. After landing D105471 the TestStructTypes test started failing on Windows. The test checked that the size of an empty C struct is 0, while the size LLDB now reports is 4 bytes. It turns out that 4 bytes is the actual size Clang uses for empty C structs with the MicrosoftRecordLayoutBuilder. The commit that introduced that behaviour is 00a061dc. This patch removes that specific check from TestStructTypes. Note that D105471 added a series of tests that already cover this case (and the added checks automatically adjust to whatever size the target compiler chooses for empty structs).
-
Alexey Bataev authored
Added missed arguments in __tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime functions calls. Differential Revision: https://reviews.llvm.org/D106542
-
Aaron En Ye Shi authored
Allow standard header versions of malloc and free to be defined before introducing the device versions. Fixes: SWDEV-295901 Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D106463
-
Jon Chesterfield authored
Revision of D102858. Raise dlwrap arity argument to template argument so the correct value is given in the error message. E.g. '2 == 1' instead of '2 == trait<>::nargs'.

Arity higher than it should be:

Before diff
```
$/plugins/cuda/dynamic_cuda/cuda.cpp:23:1: error: static_assert failed due to requirement '2 == trait<cudaError_enum (*)(unsigned int)>::nargs' "Arity Error"
DLWRAP_INTERNAL(cuInit, 2);
^~~~~~~~~~~~~~~~~~~~~~~~~~
...
$/include/dlwrap.h:166:3: note: expanded from macro 'DLWRAP_COMMON'
static_assert(ARITY == trait<decltype(&SYMBOL)>::nargs, "Arity Error"); \
```

After diff
```
In file included from $/plugins/cuda/dynamic_cuda/cuda.cpp:16:
$/include/dlwrap.h:131:3: error: static_assert failed due to requirement '2UL == 1UL' "Arity Error"
static_assert(Requested == Required, "Arity Error");
^ ~~~~~~~~~~~~~~~~~~~~~
$/plugins/cuda/dynamic_cuda/cuda.cpp:23:1: note: in instantiation of function template specialization 'dlwrap::verboseAssert<2UL, 1UL>' requested here
DLWRAP_INTERNAL(cuInit, 2);
```

Arity lower than it should be:

Before diff
```
$/plugins/cuda/dynamic_cuda/cuda.cpp:131:10: error: no matching function for call to 'dlwrap_cuInit'
return dlwrap_cuInit(X);
^~~~~~~~~~~~~
$/plugins/cuda/dynamic_cuda/cuda.cpp:23:1: note: candidate function not viable: requires 0 arguments, but 1 was provided
DLWRAP_INTERNAL(cuInit, 0);
```

After diff
```
In file included from $/plugins/cuda/dynamic_cuda/cuda.cpp:16:
$/include/dlwrap.h:131:3: error: static_assert failed due to requirement '0UL == 1UL' "Arity Error"
static_assert(Requested == Required, "Arity Error");
^ ~~~~~~~~~~~~~~~~~~~~~
$/plugins/cuda/dynamic_cuda/cuda.cpp:23:1: note: in instantiation of function template specialization 'dlwrap::verboseAssert<0UL, 1UL>' requested here
DLWRAP_INTERNAL(cuInit, 0);
```

Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106543
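The effect of raising the arity to a template argument can be sketched in isolation. This is a minimal illustration and not the contents of dlwrap.h: the `sketch` namespace, the `trait` definition, and `fakeInit` are assumptions made for the example; only the `verboseAssert<Requested, Required>` shape is taken from the diagnostics quoted above.

```cpp
#include <cstddef>

namespace sketch {
// Hypothetical trait: counts the parameters of a function pointer type.
template <typename T> struct trait;
template <typename R, typename... Ts> struct trait<R (*)(Ts...)> {
  static constexpr std::size_t nargs = sizeof...(Ts);
};

// Both values arrive as template arguments, so a failing static_assert can
// print them as literals instead of an unevaluated 'trait<...>::nargs'.
template <std::size_t Requested, std::size_t Required>
constexpr bool verboseAssert() {
  static_assert(Requested == Required, "Arity Error");
  return true;
}

// Stand-in for a wrapped library entry point (not a real CUDA declaration).
int fakeInit(unsigned int);

// Compiles: the requested arity (1) matches the declaration of fakeInit.
constexpr bool ok = verboseAssert<1, trait<decltype(&fakeInit)>::nargs>();
} // namespace sketch
```

Changing the `1` to `2` or `0` in the last line should make the build fail inside `verboseAssert`, with both numbers visible in the instantiation note; the exact wording of the diagnostic depends on the compiler.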
-
Cullen Rhodes authored
Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D106540
-
Melanie Blower authored
This reverts commit b9b696bb. Buildbot failures see https://lab.llvm.org/buildbot#builders/118/builds/4138 and https://lab.llvm.org/buildbot#builders/110/builds/5112
-
Raphael Isemann authored
The test I added in commit 07800348 was using SIGINT for testing the tab completion. The idea is to have a signal that only has one possible completion, and I ended up picking SIGIN -> SIGINT for the test. However, on non-Linux systems there is SIGINFO, which is also a valid completion for `SIGIN`, so the test fails there. This replaces SIGIN -> SIGINT with the SIGPIP -> SIGPIPE completion, which according to LLDB's signal list in Host.cpp is the only valid completion.
-
Kazu Hirata authored
The last use was removed on Jan 16, 2019 in commit 81101de5.
-
Joseph Huber authored
Summary: Fixes some warnings given for uninitialized block counts if the execution mode is not recognized. This shouldn't happen in practice because the execution mode is checked when it's read from the device.
-
Nico Weber authored
-
Aaron Ballman authored
Clang implemented the _ExtInt datatype as a bit-precise integer type, which was then proposed to WG14. WG14 has accepted the proposal (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2709.pdf), but Clang requires some additional work as a result. In the original Clang implementation, we elected to disallow implicit conversions involving these types until after WG14 finalized the rules. This patch implements the rules decided by WG14: no integer promotion for bit-precise types, conversions prefer the larger of the two types and in the event of a tie (say _ExtInt(32) and a 32-bit int), the standard type wins. There are more changes still needed to conform to N2709, but those will be handled in follow-up patches.
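The conversion rule described above can be modelled in a few lines. This is only a sketch of the rule as stated in this message, not Clang's implementation: `IntTypeModel` and `commonType` are hypothetical names, and signedness is deliberately left out.

```cpp
// Hypothetical model of an integer type: its bit width and whether it is a
// standard type (int, long, ...) or a bit-precise one such as _ExtInt(N).
struct IntTypeModel {
  unsigned width;
  bool isStandard;
};

// The wider operand type wins; on a tie the standard type wins.
// There is no promotion step: two narrow bit-precise operands stay narrow.
// Signedness is ignored in this sketch.
constexpr IntTypeModel commonType(IntTypeModel a, IntTypeModel b) {
  return a.width != b.width ? (a.width > b.width ? a : b)
                            : (a.isStandard ? a : b);
}
```

Under this model, `_ExtInt(32)` combined with a 32-bit `int` yields the standard type, while `_ExtInt(8)` combined with `_ExtInt(8)` remains 8 bits wide.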
-
Raphael Isemann authored
This patch adds the ability to get a DWARFDIE's children as an LLVM range. This way we can use for range loops to iterate over them and we can use LLVM's algorithms like `llvm::all_of` to query all children. The implementation has to do some small shenanigans as the iterator needs to store a DWARFDIE, but a DWARFDIE container is also a DWARFDIE so it can't return the iterator by value. I just made the `children` getter a templated function to avoid the cyclic dependency. Reviewed By: #lldb, werat, JDevlieghere Differential Revision: https://reviews.llvm.org/D103172
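The idea of exposing children as a range can be sketched with a plain sibling-linked tree. Everything here is a hypothetical stand-in (`Node`, `ChildIterator`, `ChildRange`, `children`); it does not reproduce DWARFDIE or the templated getter mentioned above, and it uses `std::all_of` where LLDB would use `llvm::all_of`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical stand-in for a DIE: children are linked through siblings.
struct Node {
  int tag;
  const Node *firstChild = nullptr;
  const Node *sibling = nullptr;
};

// Forward iterator that walks the sibling chain of a node's children.
class ChildIterator {
  const Node *cur;

public:
  using iterator_category = std::forward_iterator_tag;
  using value_type = Node;
  using difference_type = std::ptrdiff_t;
  using pointer = const Node *;
  using reference = const Node &;

  explicit ChildIterator(const Node *n = nullptr) : cur(n) {}
  reference operator*() const { return *cur; }
  pointer operator->() const { return cur; }
  ChildIterator &operator++() {
    cur = cur->sibling;
    return *this;
  }
  ChildIterator operator++(int) {
    ChildIterator old = *this;
    ++*this;
    return old;
  }
  bool operator==(const ChildIterator &o) const { return cur == o.cur; }
  bool operator!=(const ChildIterator &o) const { return cur != o.cur; }
};

// Minimal range type so range-based for loops and algorithms both work.
struct ChildRange {
  ChildIterator first, last;
  ChildIterator begin() const { return first; }
  ChildIterator end() const { return last; }
};

inline ChildRange children(const Node &n) {
  return {ChildIterator(n.firstChild), ChildIterator()};
}

// Example query: do all children carry a positive tag?
inline bool allChildrenPositive(const Node &n) {
  ChildRange r = children(n);
  return std::all_of(r.begin(), r.end(),
                     [](const Node &c) { return c.tag > 0; });
}
```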
-
Med Ismail Bennani authored
This patch introduces Scripted Processes to lldb. The goal here is to be able to attach in the debugger to fake processes that are backed by script files (in Python, Lua, Swift, etc.) and inspect them statically. Scripted Processes can be used in cooperative multithreading environments like the XNU Kernel or other real-time operating systems, but they can also help us improve the debugger testing infrastructure by writing synthetic tests that simulate hard-to-reproduce process/thread states. Although ScriptedProcess is not feature-complete at the moment, it has basic execution capabilities and will improve in the following patches. rdar://65508855 Differential Revision: https://reviews.llvm.org/D100384 Signed-off-by:
Med Ismail Bennani <medismail.bennani@gmail.com>
-
Melanie Blower authored
Change -ffp-model=precise to enable -ffp-contract=on (previously -ffp-model=precise enabled -ffp-contract=fast). This is a follow-up to Andy Kaylor's comments in the llvm-dev discussion "Floating Point semantic modes". From the same email thread, I put Andy's distillation of floating point options and floating point modes into UsersManual.rst. Also fixes bugs.llvm.org/show_bug.cgi?id=50222 Reviewed By: rjmccall, andrew.kaylor Differential Revision: https://reviews.llvm.org/D74436
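As a rough illustration of what the two contraction settings distinguish: with `-ffp-contract=on` the compiler may fuse a multiply and an add only when they appear within one expression, while `-ffp-contract=fast` also permits fusing across statements. The function names below are made up for the example, and whether a fused instruction is actually emitted depends on the target.

```cpp
#include <cassert>

// Multiply and add in one expression: eligible for fusion into a single
// fused multiply-add under both -ffp-contract=on and -ffp-contract=fast.
inline double singleExpression(double a, double b, double c) {
  return a * b + c;
}

// Multiply and add in separate statements: the product is rounded first
// under -ffp-contract=on; only -ffp-contract=fast may still fuse the two.
inline double separateStatements(double a, double b, double c) {
  double product = a * b;
  return product + c;
}
```

For inputs whose product is exactly representable both functions return the same value under any setting; the results can differ in the last bit only when the intermediate product needs rounding.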
-
Raphael Isemann authored
`CompletionRequest::AddCompletion` adds the given string as completion of the current command token. `CompletionRequest::TryCompleteCurrentArg` only adds it if the current token is a prefix of the given string. We're using `AddCompletion` for the `process signal` handler which means that `process signal SIGIN` doesn't get uniquely completed to `process signal SIGINT` as we unconditionally add all other signals (such as `SIGABRT`) as possible completions. By using `TryCompleteCurrentArg` we actually do the proper filtering which will only add `SIGINT` (as that's the only signal with the prefix 'SIGIN' in the example above). Reviewed By: mib Differential Revision: https://reviews.llvm.org/D105028
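The difference between the two calls comes down to a prefix check. The sketch below uses a hypothetical `CompletionSketch` type with simplified signatures; it is not LLDB's `CompletionRequest`, only an illustration of unconditional versus filtered insertion.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the completion request; not LLDB's real class.
struct CompletionSketch {
  std::string currentToken;
  std::vector<std::string> results;

  // Unconditional: every candidate becomes a completion.
  void addCompletion(const std::string &candidate) {
    results.push_back(candidate);
  }

  // Filtered: a candidate is kept only if the current token is its prefix.
  void tryCompleteCurrentArg(const std::string &candidate) {
    if (candidate.compare(0, currentToken.size(), currentToken) == 0)
      results.push_back(candidate);
  }
};
```

With the token `SIGIN` and the candidates `SIGABRT`, `SIGINT` and `SIGPIPE`, the filtered variant keeps only `SIGINT`, so the completion is unique; the unconditional variant keeps all three.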
-
Caroline Concatto authored
This patch avoids computing discounts for predicated instructions when the VF is scalable. There is no support for vectorization of loops with division because the vectorizer cannot guarantee that divisions by zero will not happen. The following loop is therefore no longer vectorized with a scalable VF:
```
for (long long i = 0; i < n; i++)
  if (cond[i])
    a[i] /= b[i];
```
Differential Revision: https://reviews.llvm.org/D101916
-
Paulo Matos authored
Opaque values (of zero size) can be stored in memory with the implementation of reference types in the WebAssembly backend. Since MachineMemOperand uses LLTs we need to be able to support zero-sized scalar types in LLTs. Differential Revision: https://reviews.llvm.org/D105423
-
Raphael Isemann authored
LLDB's DWARF parser has some heuristics for guessing and fixing up the accessibility of C++ class/struct members after they were already created in the internal Clang AST. The heuristic is that if a struct/class has a base class, then it's actually a class and its members are private unless otherwise specified. From what I can see this heuristic isn't sound and is also unnecessary. The idea that inheritance implies that the `class` keyword was used and the default visibility is `private` is incorrect. Also, both GCC and Clang use `DW_TAG_structure_type` and `DW_TAG_class_type` for `struct` and `class` types respectively, so the default visibility we infer from that information is always correct and there is no need to fix it up. And finally, the access specifiers we set in the Clang AST are unused within LLDB anyway. The expression parser explicitly ignores them to give users access to private members and there is no SB API functionality that exposes this information. This patch removes all the heuristic code for the reasons above and instead just relies on the access values we infer from the tag kind and explicit annotations in DWARF. This patch is NFCI. Reviewed By: werat Differential Revision: https://reviews.llvm.org/D105463
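The language rule behind this argument is easy to check in plain C++: default access follows the keyword used to declare the type, not the presence of a base class. The type names below are made up for the illustration.

```cpp
#include <cassert>
#include <type_traits>

struct Base {
  int x = 0;
};

// Declared with `struct`: members and bases are public by default,
// even though the type has a base class.
struct DerivedStruct : Base {
  int y = 0;
};

// Declared with `class`: members and bases are private by default.
class DerivedClass : Base {};

// Compiles only because x and y are publicly accessible.
inline int sum(const DerivedStruct &d) { return d.x + d.y; }

// The base of DerivedStruct is public, the base of DerivedClass is not.
static_assert(std::is_convertible<DerivedStruct *, Base *>::value,
              "struct inherits publicly by default");
static_assert(!std::is_convertible<DerivedClass *, Base *>::value,
              "class inherits privately by default");
```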
-
Raphael Isemann authored
C doesn't allow empty structs but Clang/GCC support them and give them a size of 0. LLDB implements this by checking the tag kind and if it's `DW_TAG_structure_type` then we give it a size of 0 via an empty external RecordLayout. This is done because our internal TypeSystem is always in C++ mode (which means we would give them a size of 1). The check for when we have this special case is currently too lax, as types with `DW_TAG_structure_type` can also occur in C++ with types defined using the `struct` keyword. This means that in a C++ program with `struct Empty{};`, LLDB would return `0` for `sizeof(Empty)` even though the correct size is 1. This patch removes this special case and replaces it with a generic approach that just assigns empty structs the byte_size as specified in DWARF. The GCC/Clang special case is handled as they both emit an explicit `DW_AT_byte_size` of 0. And if another compiler decides to use a different byte size for this case then this should also be handled by the same code as long as that information is provided via `DW_AT_byte_size`. Reviewed By: werat, shafik Differential Revision: https://reviews.llvm.org/D105471
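The C++ side of this can be checked directly. In C++ every complete object type has a nonzero size so that distinct objects get distinct addresses; the value 1 for an empty struct is what GCC, Clang and MSVC use in practice rather than something the standard spells out. `TwoEmpties` and `distinctAddresses` are names invented for this sketch.

```cpp
#include <cassert>

struct Empty {};

// Guaranteed by the language: sizeof never yields 0 for a complete type.
static_assert(sizeof(Empty) >= 1, "empty C++ structs still occupy storage");

// Two empty members must still be distinguishable by address.
struct TwoEmpties {
  Empty first;
  Empty second;
};

inline bool distinctAddresses(const TwoEmpties &t) {
  return &t.first != &t.second;
}
```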
-
Florian Mayer authored
This reverts commit bde9415f.
-
Dawid Jurczak authored
The purpose of this patch is to teach the loop idiom recognition pass how to recognize simple memmove patterns, in a similar way to GCC: https://godbolt.org/z/fh95e83od LoopIdiomRecognize already has machinery for memset and memcpy recognition; this patch tries to extend the existing capabilities with minimal effort. Differential Revision: https://reviews.llvm.org/D104464
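A loop of the kind such a transformation targets might look like the sketch below. This is an assumed example, not necessarily the code behind the godbolt link: source and destination overlap in the same buffer, so `memcpy` would be invalid and `memmove` is the matching library call. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Shifts n elements one slot to the left. The caller must provide at least
// n + 1 elements, because the loop reads p[n].
inline void shiftLeftLoop(int *p, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    p[i] = p[i + 1];
}

// Same effect expressed directly as an overlapping copy.
inline void shiftLeftMemmove(int *p, std::size_t n) {
  std::memmove(p, p + 1, n * sizeof(int));
}
```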
-
Florian Mayer authored
This avoids unnecessary instrumentation. Reviewed By: eugenis, vitalybuka Differential Revision: https://reviews.llvm.org/D105703
-
Balázs Kéri authored
BindingDecl was added recently but the related DecompositionDecl is needed to make C++17 structured bindings importable. Import of BindingDecl was changed to avoid infinite import loop. Reviewed By: martong Differential Revision: https://reviews.llvm.org/D105354
-
Jan Svoboda authored
-
Simon Pilgrim authored
As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450
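In C++ pointer arithmetic the two folds correspond to the rewrite below. This is only an analogy for the IR transformation: `selectOfGeps` and `mergedGep` are hypothetical names, and `base + a` plays the role of the inner gep.

```cpp
#include <cassert>
#include <cstddef>

// Before: a select between a pointer and the same pointer advanced by idx,
// where the pointer is itself an offset (base + a) from some base.
inline const int *selectOfGeps(const int *base, std::ptrdiff_t a,
                               std::ptrdiff_t idx, bool c) {
  const int *ptr = base + a;
  return c ? ptr + idx : ptr;
}

// After both folds: the select moves onto the index, and the two address
// computations merge into one that uses the sum of the indices.
inline const int *mergedGep(const int *base, std::ptrdiff_t a,
                            std::ptrdiff_t idx, bool c) {
  return base + (a + (c ? idx : 0));
}
```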
-
Jon Chesterfield authored
This class is instantiated once in rtl.cpp before hsa_init is called. The hsa_signal_create call therefore fails leaving the pool empty. This signal pool is a legacy from ATMI where it was constructed after hsa_init. Moving the state into the rtl.cpp global class disabled the initial populating of the pool without noticeably changing performance. Just rechecked with a fix that allocates the signals after hsa_init and that also doesn't noticeably change performance. This patch therefore drops the initialisation. Only change from main is to drop a DEBUG_PRINT statement that would say the pool initial size is zero. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106515
-
Simon Tatham authored
This is part of a patch series working towards the ability to make SourceLocation into a 64-bit type to handle larger translation units. !srcloc is generated in clang codegen, and pulled back out by llvm functions like AsmPrinter::emitInlineAsm that need to report errors in the inline asm. From there it goes to LLVMContext::emitError, is stored in DiagnosticInfoInlineAsm, and ends up back in clang, at BackendConsumer::InlineAsmDiagHandler(), which reconstitutes a true clang::SourceLocation from the integer cookie. Throughout this code path, it's now 64-bit rather than 32, which means that if SourceLocation is expanded to a 64-bit type, this error report won't lose half of the data. The compiler will tolerate both of i32 and i64 !srcloc metadata in input IR without faulting. Test added in llvm/MC. (The semantic accuracy of the metadata is another matter, but I don't know of any situation where that matters: if you're reading an IR file written by a previous run of clang, you don't have the SourceManager that can relate those source locations back to the original source files.) Original version of the patch by Mikhail Maltsev. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D105491
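The data loss that the wider cookie avoids can be shown with a pair of one-line helpers. These are purely illustrative names and do not correspond to functions in clang or LLVM.

```cpp
#include <cstdint>

// A 64-bit location squeezed through a 32-bit cookie keeps only its low half.
constexpr std::uint64_t roundTrip32(std::uint64_t loc) {
  return static_cast<std::uint32_t>(loc);
}

// A 64-bit cookie carries the location through unchanged.
constexpr std::uint64_t roundTrip64(std::uint64_t loc) { return loc; }
```

For a hypothetical location value of `0x100000002`, the 32-bit path yields `0x2` while the 64-bit path preserves the full value.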
-
David Green authored
-
Dmitry Vyukov authored
The current (default) line length is 80 columns. That's based on old hardware and historical conventions. There is no remaining reason to keep the line length that small, especially given that our coding style uses quite lengthy identifiers. The Linux kernel recently switched to 100; let's start with 100 as well. This change intentionally does not re-format code. Re-formatting is intended to happen incrementally, or separately on a dir-by-dir basis. Reviewed By: vitalybuka, melver, MaskRay Differential Revision: https://reviews.llvm.org/D106436
-
Fraser Cormack authored
Lowering certain float vectors without legal vector types could cause a crash due to a bad interaction between passing floats via GPRs and argument splitting. Split vector floats appear just like scalar floats. Under certain situations we choose to pass these float arguments via GPRs and use an XLenVT location and set the 'BCvt' info to track how they must be converted back to floating-point values. However, later logic for handling split arguments may take over, in which case we lose the previous information and set the 'Indirect' info, thus incorrectly lowering to integer types. I don't believe that we would have come across the notion of split floating-point arguments before. This patch addresses the issue by updating the lowering so that split arguments are only passed indirectly when they are scalar integer types. This has some change to how we lower some larger illegal float vectors, as can be seen in 'fastcc-float.ll' where the vector is now passed partly in registers and partly on the stack. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D102852
-
Fraser Cormack authored
This relands a6ca88e9 which was originally reverted due to overflow bugs in e3fa2b1e. This patch teaches the compiler to identify a wider variety of `BUILD_VECTOR`s which form integer arithmetic sequences, and to lower them to `vid.v` with modifications for non-unit steps and non-zero addends. The sequences handled by this optimization must either be monotonically increasing or decreasing. Consecutive elements holding the same value indicate a fractional step which, while simple mathematically, becomes more complex to handle both in the realm of lossy integer division and in the presence of `undef`s. For example, a common "interleaving" shuffle index will be lowered by LLVM to both `<0,u,1,u,2,...>` and `<u,0,u,1,u,...>` `BUILD_VECTOR` nodes. Either of these would ideally be lowered to `vid.v` shifted right by 1. Detection of this sequence in presence of general `undef` values is more complicated, however: `<0,u,u,1,>` could match either `<0,0,0,1,>` or `<0,0,1,1,>` depending on later values in the sequence. Both are possible, so backtracking or multiple passes is inevitable. Sticking to monotonic sequences keeps the logic simpler as it can be done in one pass. Fractional steps will likely be a separate optimization in a future patch. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104921
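A one-pass check of the kind described can be sketched as follows. It is not the RISC-V lowering code: `matchSequence`, `Sequence` and `Elts` are hypothetical, `std::nullopt` stands in for `undef`, and overflow (the reason the original commit was reverted) is not handled here. The sketch requires C++17.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Element i of a matched sequence equals addend + i * step.
struct Sequence {
  std::int64_t step;
  std::int64_t addend;
};

// std::nullopt models an undef element, which matches any value.
using Elts = std::vector<std::optional<std::int64_t>>;

// Single pass: compare each defined element with the previous defined one.
// Rejects zero steps and fractional steps, so only strictly increasing or
// strictly decreasing integer sequences match.
inline std::optional<Sequence> matchSequence(const Elts &elts) {
  std::optional<std::int64_t> step;
  std::optional<std::size_t> prevIdx;
  std::int64_t prevVal = 0;
  for (std::size_t i = 0; i < elts.size(); ++i) {
    if (!elts[i])
      continue; // undef: no constraint
    if (prevIdx) {
      std::int64_t valDiff = *elts[i] - prevVal;
      std::int64_t idxDiff = static_cast<std::int64_t>(i - *prevIdx);
      if (valDiff == 0 || valDiff % idxDiff != 0)
        return std::nullopt; // repeated value or fractional step
      std::int64_t candidate = valDiff / idxDiff;
      if (step && *step != candidate)
        return std::nullopt; // inconsistent step
      step = candidate;
    }
    prevIdx = i;
    prevVal = *elts[i];
  }
  if (!step)
    return std::nullopt; // fewer than two defined elements
  return Sequence{*step, prevVal - *step * static_cast<std::int64_t>(*prevIdx)};
}
```

Because a fractional step is rejected as soon as it is seen, a vector such as `<0,u,1,u>` does not match, which mirrors the restriction to monotonic sequences stated above.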
-
Whisperity authored
@vabridgers identified a way to crash the check by running on code that involves `AttributedType`s. This patch fixes the check to first and foremost not crash, but also improves the logic handling qualifiers. If the types contain any additional (not just CVR) qualifiers that are not the same, they will not be deemed mixable. The logic for CVR-Mixing and the `QualifiersMix` check option remain unchanged. Reviewed By: aaron.ballman, vabridgers Differential Revision: http://reviews.llvm.org/D106361
-
Jason Molenda authored
This patch adds code to process save-core for Mach-O files which embeds an "addrable bits" LC_NOTE when the process is using a code address mask (e.g. AArch64 v8.3 with ptrauth aka arm64e). Add code to ObjectFileMachO to read that LC_NOTE from corefiles, and ProcessMachCore to set the process masks based on it when reading a corefile back in. Also have "process status --verbose" print the current address masks that lldb is using internally to strip ptrauth bits off of addresses. Differential Revision: https://reviews.llvm.org/D106348 rdar://68630113
-
Timm Bäder authored
Differential Revision: https://reviews.llvm.org/D106430
-