- Sep 01, 2021
-
-
Yonghong Song authored
A new LLVM specific TAG DW_TAG_LLVM_annotation is added. The name is suggested by Paul Robinson ([1]). Currently, this tag is used to output __attribute__((btf_tag("string"))) annotations in dwarf. The following is an example for a global variable with two btf_tag attributes: 0x0000002a: DW_TAG_variable DW_AT_name ("g1") DW_AT_type (0x00000052 "int") DW_AT_external (true) DW_AT_decl_file ("/tmp/home/yhs/work/tests/llvm/btf_tag/t.c") DW_AT_decl_line (8) DW_AT_location (DW_OP_addr 0x0) 0x0000003f: DW_TAG_LLVM_annotation DW_AT_name ("btf_tag") DW_AT_const_value ("tag1") 0x00000048: DW_TAG_LLVM_annotation DW_AT_name ("btf_tag") DW_AT_const_value ("tag2") 0x00000051: NULL In the future, DW_TAG_LLVM_annotation may encode other type of non-string const value. [1] https://lists.llvm.org/pipermail/llvm-dev/2021-June/151250.html Differential Revision: https://reviews.llvm.org/D106621
-
Joel E. Denny authored
-
Michael Kruse authored
Change pass-by-const-ref to pass-by-value as objects are recreated due to custom up-/down-casting anwyway.
-
Michael Kruse authored
-
Wang, Pengfei authored
Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105799
-
Petr Hosek authored
When a nodeduplicate COMDAT group contains a weak symbol, choose a non-weak symbol (or one of the weak ones) rather than reporting an error. This should address issue PR51394. With the current IR representation, a generic comdat nodeduplicate semantics is not representable for LTO. In the linker, sections and symbols are separate concepts. A dropped weak symbol does not force the defining input section to be dropped as well (though it can be collected by GC). In the IR, when a weak linkage symbol is dropped, its associate section content is dropped as well. For InstrProfiling, which is where ran into this issue in PR51394, the deduplication semantic is a sufficient workaround. Differential Revision: https://reviews.llvm.org/D108689
-
Sanjay Patel authored
-
Alexandre Ganea authored
Differential Revision: https://reviews.llvm.org/D109030
-
Joseph Huber authored
Performance on GPU targets can be highly variable, sometimes inlining everything hurts performance and sometimes it greatly improves it. Add an option to toggle this behaviour to better investigate it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D109014
-
Alex Langford authored
It is easy to accidentally introduce a deadlock by having the callback passed to Language::ForEach also attempt to acquire the same lock. It is easy enough to disallow the callback from calling anything in Language directly, but it may happen through a series of other function/method calls. The solution I am proposing is to tighten the lock in Language::ForEach so that it is only held as we gather the currently loaded language plugins. We store them in a vector and then iterate through them with the callback so that the callback can't introduce a deadlock. Differential Revision: https://reviews.llvm.org/D109013
-
wren romano authored
Trying to reduce confusion by having the name of the public method match that of the private method for handling the recursion. Also adding some comments to SparseTensorStorage::fromCOO to help clarify what the recursive calls are doing in the dense case. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D108954
-
Andrew Browne authored
-
- Aug 31, 2021
-
-
Vy Nguyen authored
The current help msg isn't super clear on whether t prints the content of the files or just the list of files. (I'd certainly thought it'd print the list of files, and accidentally had a bunch of "gargabe" printed to my terminal). Similarly, t sounded like it'd do what p actually did. Differential Revision: https://reviews.llvm.org/D109018
-
Kevin Athey authored
This reverts commit 71d7fed3. Reason: broke sanitizer bots more info: https://reviews.llvm.org/D108770
-
Louis Dionne authored
This must have been meant to be friend-declaring operator!=, but it turns out that it's not even necessary to make it a friend since it does not access any private state. rdar://82568613
-
Stanislav Mekhanoshin authored
When RA eliminated a dead def it can either immediately delete the instruction itself or replace it with KILL to defer the actual removal. If this instruction has a virtual register use killing the register it will shrink the LI of the use. However, if the LI covers the instruction and extends beyond it the shrink will not happen. In fact that is impossible to shrink such use because of the KILL still using it. If later the LI of the use will be split at the KILL and the KILL itself is eliminated after that point the new live segment ends up at an invalid slot index. This extremely rare condition was hit after D106408 which has enabled rematerialization of such instructions. The replacement with KILL is only done for rematerialized defs which became dead and such rematerialization did not generally happen before. The patch deletes an instruction immediately if it is a result of rematerialization and has such use. An alternative would be to prohibit a split at a KILL instruction, but it looks like it is better to split a live range rather then keeping a killed instruction just in case it can be rematerialized further. Fixes PR51655. Differential Revision: https://reviews.llvm.org/D108951
-
Nikita Popov authored
SLPVectorizer currently uses AA::isNoAlias() to determine whether two locations alias. This does not work if one of the instructions is a call. Instead, we should check getModRefInfo(), which determines whether an arbitrary instruction modifies or references a given location. Among other things, this prevents @llvm.experimental.noalias.scope.decl() and other inaccessiblmemonly intrinsics from interfering with SLP vectorization. Differential Revision: https://reviews.llvm.org/D109012
-
wlei authored
This change aims at supporting LBR only sample perf script which is used for regular(Non-CS) profile generation. A LBR perf script includes a batch of LBR sample which starts with a frame pointer and a group of 32 LBR entries is followed. The FROM/TO LBR pair and the range between two consecutive entries (the former entry's TO and the latter entry's FROM) will be used to infer function profile info. An example of LBR perf script(created by `perf script -F ip,brstack -i perf.data`) ``` 40062f 0x40062f/0x4005b0/P/-/-/9 0x400645/0x4005ff/P/-/-/1 0x400637/0x400645/P/-/-/1 ... 4005d7 0x4005d7/0x4005e5/P/-/-/8 0x40062f/0x4005b0/P/-/-/6 0x400645/0x4005ff/P/-/-/1 ... ... ``` For implementation: - Extended a new child class `LBRPerfReader` for the sample parsing, reused all the functionalities in `extractLBRStack` except for an extension to parsing leading instruction pointer. - `HybridSample` is reused(just leave the call stack empty) and the parsed samples is still aggregated in `AggregatedSamples`. After that, range samples, branch sample, address samples are computed and recorded. - Reused `ContextSampleCounterMap` to store the raw profile, since it's no need to aggregation by context, here it just registered one sample counter with a fake context key. - Unified to use `show-raw-profile` instead of `show-unwinder-output` to dump the intermediate raw profile, see the comments of the format of the raw profile. For CS profile, it remains to output the unwinder output. Profile generation part will come soon. Differential Revision: https://reviews.llvm.org/D108153
-
peter klausler authored
Implement constant folding for the transformational intrinsic functions UNPACK and TRANSPOSE. Differential Revision: https://reviews.llvm.org/D109010
-
Joel E. Denny authored
This patch implements OpenMP runtime support for an original OpenMP extension we have developed to support OpenACC: the `ompx_hold` map type modifier. The previous patch in this series, D106509, implements Clang support and documents the new functionality in detail. Reviewed By: grokos Differential Revision: https://reviews.llvm.org/D106510
-
Joel E. Denny authored
This patch implements Clang support for an original OpenMP extension we have developed to support OpenACC: the `ompx_hold` map type modifier. The next patch in this series, D106510, implements OpenMP runtime support. Consider the following example: ``` #pragma omp target data map(ompx_hold, tofrom: x) // holds onto mapping of x { foo(); // might have map(delete: x) #pragma omp target map(present, alloc: x) // x is guaranteed to be present printf("%d\n", x); } ``` The `ompx_hold` map type modifier above specifies that the `target data` directive holds onto the mapping for `x` throughout the associated region regardless of any `target exit data` directives executed during the call to `foo`. Thus, the presence assertion for `x` at the enclosed `target` construct cannot fail. (As usual, the standard OpenMP reference count for `x` must also reach zero before the data is unmapped.) Justification for inclusion in Clang and LLVM's OpenMP runtime: * The `ompx_hold` modifier supports OpenACC functionality (structured reference count) that cannot be achieved in standard OpenMP, as of 5.1. * The runtime implementation for `ompx_hold` (next patch) will thus be used by Flang's OpenACC support. * The Clang implementation for `ompx_hold` (this patch) as well as the runtime implementation are required for the Clang OpenACC support being developed as part of the ECP Clacc project, which translates OpenACC to OpenMP at the directive AST level. These patches are the first step in upstreaming OpenACC functionality from Clacc. * The Clang implementation for `ompx_hold` is also used by the tests in the runtime implementation. That syntactic support makes the tests more readable than low-level runtime calls can. Moreover, upstream Flang and Clang do not yet support OpenACC syntax sufficiently for writing the tests. * More generally, the Clang implementation enables a clean separation of concerns between OpenACC and OpenMP development in LLVM. That is, LLVM's OpenMP developers can discuss, modify, and debug LLVM's extended OpenMP implementation and test suite without directly considering OpenACC's language and execution model, which can be handled by LLVM's OpenACC developers. * OpenMP users might find the `ompx_hold` modifier useful, as in the above example. See new documentation introduced by this patch in `openmp/docs` for more detail on the functionality of this extension and its relationship with OpenACC. For example, it explains how the runtime must support two reference counts, as specified by OpenACC. Clang recognizes `ompx_hold` unless `-fno-openmp-extensions`, a new command-line option introduced by this patch, is specified. Reviewed By: ABataev, jdoerfert, protze.joachim, grokos Differential Revision: https://reviews.llvm.org/D106509
-
Nikita Popov authored
-
Fangrui Song authored
Found by redfast00
-
Louis Dionne authored
Differential Revision: https://reviews.llvm.org/D108940
-
Louis Dionne authored
All supported versions of GCC now do the right thing. Differential Revision: https://reviews.llvm.org/D108997
-
Sylvestre Ledru authored
-
Michał Górny authored
-
Mehdi Amini authored
-
Chih-Ping Chen authored
-
Arthur O'Dwyer authored
-
peter klausler authored
It may not be great practice to pass a procedure (or procedure pointer) with an implicit interface as an actual argument to correspond with a dummy procedure (pointer), but it's not an error. Change to a warning, and modify tests accordingly. Differential Revision: https://reviews.llvm.org/D108932
-
Nick Desaulniers authored
Similar to D108842, D108844, D108926, D108928, and D108936. __has_builtin(builtin_mul_overflow) returns true for 32b RISCV targets, but Clang is deferring to compiler RT when encountering long long types. If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall. Link: https://bugs.llvm.org/show_bug.cgi?id=28629 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D108939
-
Nikita Popov authored
-
Fangrui Song authored
Available and good in Clang 3.5/GCC 5.
-
MaheshRavishankar authored
The output tensor was added for tiling purposes. With use of `TilingInterface` for tiling pad operations, there is no need for an explicit operand for the shape of result of `linalg.pad_tensor` op. The interface allows the tiling pattern to query the value that can be used for the "init" needed for tiling dynamically. Differential Revision: https://reviews.llvm.org/D108613
-
Nick Desaulniers authored
Similar to D108842, D108844, and D108926. __has_builtin(builtin_mul_overflow) returns true for 32b PPC targets, but Clang is deferring to compiler RT when encountering long long types. This breaks ppc44x_defconfig + CONFIG_BLK_DEV_NBD=y builds of the Linux kernel that are using builtin_mul_overflow with these types for these targets. If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall. This will still need to be worked around in the Linux kernel in order to continue to support these builds of the Linux kernel for this target with older releases of clang. Link: https://bugs.llvm.org/show_bug.cgi?id=28629 Link: https://github.com/ClangBuiltLinux/linux/issues/1438 Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D108936
-
Philip Reames authored
Reduced from the ArchiveCommandLine.ll case seen in D108848.
-
Fangrui Song authored
For Clang, 3.5 is the minimum requirement which has fixed the bug. GCC 5 is good as well.
-
Joe Loser authored
Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D108967
-
Mehdi Amini authored
Currently the builtin dialect is the default namespace used for parsing and printing. As such module and func don't need to be prefixed. In the case of some dialects that defines new regions for their own purpose (like SpirV modules for example), it can be beneficial to change the default dialect in order to improve readability. Differential Revision: https://reviews.llvm.org/D107236
-