- Mar 31, 2021
-
-
Muhammad Omair Javaid authored
This patch adds support for dynamic register sets for AArch64 dynamic features in LLDB. AArch64 has optional features like SVE, Pointer Authentication and MTE, which means LLDB needs to decide at run time which registers it needs to pull in for the current executable, based on the underlying support for a given feature. This patch makes the necessary adjustments to make way for dynamic register infos and dynamic register sets. Reviewed By: labath Differential Revision: https://reviews.llvm.org/D96458
-
Heejin Ahn authored
The number of events and the type index should be encoded in ULEB128, but they were incorrectly encoded in signed LEB128. The smallest number for which the signed LEB128 and ULEB128 encodings differ is 64. There's no way we can generate 64 events in the C++ toolchain implementation, so we can't test that, but the attached test tests when the type index is 64. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D99627
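For reference, a self-contained sketch of the two encodings (equivalent helpers live in llvm/Support/LEB128.h), showing why 64 is the smallest nonnegative value whose encodings diverge:

```cpp
#include <cstdint>
#include <vector>

// Unsigned LEB128: emit 7 bits at a time; the high bit marks continuation.
std::vector<uint8_t> encodeULEB128(uint64_t V) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7;
    if (V != 0)
      Byte |= 0x80; // more groups follow
    Out.push_back(Byte);
  } while (V != 0);
  return Out;
}

// Signed LEB128: must keep emitting while bit 6 of the last 7-bit group
// disagrees with the value's sign.
std::vector<uint8_t> encodeSLEB128(int64_t V) {
  std::vector<uint8_t> Out;
  bool More;
  do {
    uint8_t Byte = V & 0x7f;
    V >>= 7; // arithmetic shift
    More = !((V == 0 && !(Byte & 0x40)) || (V == -1 && (Byte & 0x40)));
    if (More)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (More);
  return Out;
}

// encodeULEB128(64) == {0x40}        -- one byte
// encodeSLEB128(64) == {0xc0, 0x00}  -- bit 6 is set, so a second byte is
// needed to show the value is positive; 0..63 encode identically.
```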
-
Richard Smith authored
The documentation build rule will generate an up-to-date version of this if it's not checked in.
-
Richard Smith authored
-
Richard Smith authored
directory.
-
Zequan Wu authored
This allows safe-icf mode to work when linking with LTO. Differential Revision: https://reviews.llvm.org/D99613
-
Richard Smith authored
Clang 13 features as yellow, not green.
-
Mehdi Amini authored
This allows for the conversion to match `A(B()) -> C()` with a pattern matching `A` and marking `B` for deletion. Also add better assertions when an operation is erased while still having uses. Differential Revision: https://reviews.llvm.org/D99442
-
- Mar 30, 2021
-
-
Greg McGary authored
Within `lld/macho/`, only `InputFiles.cpp` and `Symbols.h` require the `macho::` namespace qualifier to disambiguate references to `class Symbol`. Add braces to outer `for` of a 5-level single-line `if`/`for` nest. Differential Revision: https://reviews.llvm.org/D99555
-
Wei Mi authored
another one for distributed mode. Currently during module importing, ThinLTO opens all the source modules, collects functions to be imported and appends them to the destination module, then leaves all the modules open throughout the LTO backend pipeline. This patch refactors it so that one source module is closed before another source module is opened. All the source modules will be closed after the importing phase is done. It will save some amount of memory when there are many source modules to be imported. Note that this patch only changes the distributed ThinLTO mode. For in-process ThinLTO mode, one source module is shared across different ThinLTO backend threads, so it is not changed in this patch. Differential Revision: https://reviews.llvm.org/D99554
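A minimal sketch of the open/import/close discipline, with hypothetical `loadModule`/`importFunctionsFrom` helpers standing in for the real loader and FunctionImporter machinery:

```cpp
#include "llvm/IR/Module.h"
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the summary-driven loader and importer; the
// actual logic lives in FunctionImporter, not in these helpers.
std::unique_ptr<llvm::Module> loadModule(const std::string &Path);
void importFunctionsFrom(llvm::Module &Dest, llvm::Module &Src);

void importForDistributedThinLTO(llvm::Module &Dest,
                                 const std::vector<std::string> &SrcPaths) {
  for (const std::string &Path : SrcPaths) {
    // Open one source module at a time...
    std::unique_ptr<llvm::Module> Src = loadModule(Path);
    // ...append the functions selected for importing to the destination...
    importFunctionsFrom(Dest, *Src);
    // ...and destroy it before opening the next one, instead of keeping
    // every source module alive through the whole backend pipeline.
  }
}
```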
-
Jon Roelofs authored
-
Mike Rice authored
Added basic parsing/sema/serialization support for dispatch directive. Differential Revision: https://reviews.llvm.org/D99537
-
Louis Dionne authored
Prior to this patch, we would generate a fancy <__config> header by concatenating <__config_site> and <__config>. This complicates the build system and also increases the difference between what's tested and what's actually installed. This patch removes that complexity and instead simply installs <__config_site> alongside the libc++ headers. <__config_site> is then included by <__config>, which is much simpler. Doing this also opens the door to having different <__config_site> headers depending on the target, which was impossible before. It does change the workflow for testing header-only changes to libc++. Previously, we would run `lit` against the headers in libcxx/include. After this patch, we run it against a fake installation root of the headers (containing a proper <__config_site> header). This brings us closer to testing what we actually install, which is good, but it does mean that we have to update that root before testing header changes. Thus, we now need to run `ninja check-cxx-deps` before running `lit` by hand. Differential Revision: https://reviews.llvm.org/D97572
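Roughly, the new arrangement looks like this (illustrative contents only, not the actual headers):

```cpp
// __config_site: generated per target by CMake and now installed
// alongside the other libc++ headers.
#define _LIBCPP_ABI_VERSION 1
/* #undef _LIBCPP_HAS_NO_THREADS */

// __config: picks the configuration up by inclusion rather than by
// build-time concatenation.
#include <__config_site>
// ...the ordinary, static contents of <__config> follow...
```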
-
David Green authored
Mark v6m/v8m-baseline cores as having no branch predictors. This should not change much on its own, but it is more correct, as these cores do not have branch predictors, and it can help in the future.
-
Alexey Bataev authored
Need to cast the argument for the debug wrapper function call to the corresponding parameter type to avoid crash. Differential Revision: https://reviews.llvm.org/D99617
-
Luís Marques authored
On 64-bit systems with small VMAs (e.g. 39-bit) we can't use SizeClassAllocator64 parameterized with size class maps containing a large number of classes, as that will make the allocator region size too small (< 2^32). Several tests were already disabled for Android because of this. This patch provides the correct allocator configuration for RISC-V (riscv64), generalizes the gating condition for tests that can't be enabled for small VMA systems, and tweaks the tests that can be made compatible with those systems to enable them. I think the previous gating on Android should instead be AArch64+Android, so the patch reflects that. Differential Revision: https://reviews.llvm.org/D97234
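The constraint can be sketched in a couple of lines of arithmetic; the class count of 256 below is illustrative, not the actual size class map's:

```cpp
#include <cstdint>

// SizeClassAllocator64 splits its reserved address space into one region
// per size class (roughly kRegionSize = kSpaceSize / kNumClasses).
constexpr uint64_t kSpaceSize = 1ULL << 39; // small-VMA system, e.g. riscv64
constexpr uint64_t kNumClasses = 256;       // illustrative class count
static_assert(kSpaceSize / kNumClasses < (1ULL << 32),
              "per-class regions fall below 2^32, so small-VMA systems "
              "need a size class map with fewer classes");
```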
-
David Blaikie authored
-
Matheus Izvekov authored
See PR45088. Compound requirement type constraints were using decltype(E) instead of decltype((E)), which `[expr.prim.req]p1.3.3` requires. Since neither instantiation nor type dependence should matter for the constraints, this uses an approach where a `decltype` type is not built, and just the canonical type of the expression after template instantiation is used on the requirement. Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D98160
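A small standard-C++ illustration of why the parenthesized form matters; member access is one of the places where decltype(E) and decltype((E)) diverge:

```cpp
#include <concepts>

// For a member `n` of type int, decltype(t.n) is `int` (the declared
// type), but decltype((t.n)) is `int&` (the type of the expression,
// which is an lvalue). Per [expr.prim.req], the compound requirement
// below constrains the parenthesized form.
template <typename T>
concept MemberIsIntLvalue = requires(T t) {
  { t.n } -> std::same_as<int&>; // checks decltype((t.n)), i.e. int&
};

struct S { int n; };

// Satisfied: decltype((t.n)) is int&. Under the buggy decltype(t.n)
// behavior this would have checked `int` and failed.
static_assert(MemberIsIntLvalue<S>);
```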
-
Fangrui Song authored
-
Sanjay Patel authored
This is one problem shown in https://llvm.org/PR49763. Alive2: https://alive2.llvm.org/ce/z/cV6-4K and https://alive2.llvm.org/ce/z/9_3g-L
-
Sanjay Patel authored
-
Huihui Zhang authored
Use SetVector instead of SmallPtrSet for external definitions created for VPlan. Doing this helps avoid non-determinism caused by iterating over unordered containers. This bug was found with reverse iteration turned on (--extra-llvm-cmake-variables="-DLLVM_REVERSE_ITERATION=ON"), which caused the LLVM unit test VPRecipeTest.dump to fail. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D99544
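An illustrative contrast between the two containers (not the VPlan code itself):

```cpp
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/IR/Value.h"

// SmallPtrSet iterates in a pointer-hash order that can vary from run to
// run (and is deliberately flipped under LLVM_REVERSE_ITERATION), while
// SetVector remembers insertion order, so code that produces output
// while iterating gets a stable result.
void example(llvm::Value *A, llvm::Value *B) {
  llvm::SmallPtrSet<llvm::Value *, 4> Unordered;
  Unordered.insert(A);
  Unordered.insert(B); // iteration order here is unspecified

  llvm::SetVector<llvm::Value *> Ordered;
  Ordered.insert(A);
  Ordered.insert(B);   // iterates A then B, deterministically
}
```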
-
Amara Emerson authored
This patch adds 3 methods: one for power-of-2 vectors, which uses tree reductions with vector ops before a final reduction op. For non-pow-2 types it generates multiple narrow reductions and combines the values with scalar ops. Differential Revision: https://reviews.llvm.org/D97163
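A scalar model of the power-of-2 strategy, just to show the shape of the tree (the real lowering emits vector ops):

```cpp
#include <cstddef>
#include <vector>

// Split the vector in half and combine the halves element-wise until one
// lane remains, then the final reduction op runs on that lane.
int treeReduceAdd(std::vector<int> V) { // V.size() must be a power of 2
  for (size_t N = V.size(); N > 1; N /= 2)
    for (size_t I = 0; I != N / 2; ++I)
      V[I] = V[I] + V[I + N / 2]; // one "vector op" per tree level
  return V[0];
}
```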
-
Amara Emerson authored
For imported pattern purposes, we have a custom rule that promotes the rotate amount to 64b as well. Differential Revision: https://reviews.llvm.org/D99463
-
Amara Emerson authored
Differential Revision: https://reviews.llvm.org/D99388
-
Sourabh Singh Tomar authored
Negative numbers are represented using DW_OP_consts along with signed representation of the number as the argument. Test case IR is generated using Fortran front-end. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99273
-
Eugene Zhulenev authored
The interchange option was missing from the pass flags. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D99397
-
spupyrev authored
Currently prof metadata with branch counts is added only for BranchInst and SwitchInst, but not for IndirectBrInst. As a result, BPI/BFI make incorrect inferences for indirect branches, which can be very hot. This diff adds metadata for IndirectBrInst, in addition to BranchInst and SwitchInst. Reviewed By: wmi, wenlei Differential Revision: https://reviews.llvm.org/D99550
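For illustration, attaching weights to an IndirectBrInst uses the same metadata machinery as the other terminators; the helper below is a hypothetical wrapper, not code from the patch:

```cpp
#include "llvm/ADT/ArrayRef.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/MDBuilder.h"

// Attach !prof branch weights to an indirectbr, one weight per
// successor, so BPI/BFI can see which destinations are hot.
void annotate(llvm::IndirectBrInst *IBI,
              llvm::ArrayRef<uint32_t> Weights) {
  llvm::MDBuilder MDB(IBI->getContext());
  IBI->setMetadata(llvm::LLVMContext::MD_prof,
                   MDB.createBranchWeights(Weights));
}
```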
-
Hongtao Yu authored
Use profiled call edges to augment the top-down order. There are cases where the top-down order computed from the static call graph doesn't reflect the real execution order. For example:
1. Incomplete static call graph due to unknown indirect call targets. Adjusting the order by considering indirect call edges from the profile can enable the inlining of indirect call targets, by allowing the callers to be processed before them.
2. Mutual call edges in an SCC. The static processing order computed for an SCC may not reflect the call contexts in the context-sensitive profile, and thus may cause potential inlining to be overlooked. The function order within an SCC is adjusted to a top-down order based on the profile, to favor more inlining.
3. Transitive indirect call edges due to inlining. When a callee function is inlined into a caller function in LTO prelink, every call edge originating from the callee is transferred to the caller. If any of the transferred edges is indirect, the original profiled indirect edge, even if considered, would not enforce a top-down order from the caller to the potential indirect call target in LTO postlink, since the inlined callee is gone from the static call graph.
4. #3 can happen even for direct call targets, due to functions defined in header files. Header functions, when included into source files, are defined multiple times, but only one definition survives due to the ODR. Therefore, the LTO prelink inlining done on those dropped definitions can be useless from a local file scope. More importantly, the inlinee, once fully inlined into a to-be-dropped inliner, will have no profile to consume when its outlined version is compiled. This can lead to a profile-less prelink compilation for the outlined version of the inlinee function, which may be called from external modules. While this isn't easy to fix, we rely on the postlink AutoFDO pipeline to optimize the inlinee. Since the surviving copy of the inliner (defined in headers) can be inlined in its local scope in prelink, it may not exist in the merged IR in postlink, and we'll need the profiled call edges to enforce a top-down order for the rest of the functions.
Considering those cases, a profiled call graph completely independent of the static call graph is constructed based on profile data, where function objects are not even needed, which handles cases #3 and #4. I'm seeing an average 0.4% perf win on SPEC2017. For certain benchmarks such as Xalancbmk and GCC, the win is bigger, above 2%. The change is an enhancement to https://reviews.llvm.org/D95988. Reviewed By: wmi, wenlei Differential Revision: https://reviews.llvm.org/D99351
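A much-simplified model of the idea, a sketch rather than the patch's actual data structures: the graph is keyed purely by profiled caller/callee names, so no function objects are needed:

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Build the call graph purely from profiled (caller, callee) edges --
// names, not function objects -- so indirect targets and inlined-away
// definitions still contribute edges; a top-down traversal order is then
// derived from this graph instead of the static one.
using ProfiledCallGraph = std::map<std::string, std::set<std::string>>;

ProfiledCallGraph buildFromProfile(
    const std::vector<std::pair<std::string, std::string>> &Edges) {
  ProfiledCallGraph G;
  for (const auto &[Caller, Callee] : Edges)
    G[Caller].insert(Callee); // direct and indirect edges alike
  return G;
}
```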
-
John Brawn authored
The test Frontend/plugin-delayed-template.cpp is failing when asserts are enabled because it hits an assertion in denormalizeStringImpl when trying to round-trip OPT_plugin_arg. Fix this by adjusting how the option is handled, as the first part is joined to -plugin-arg and the second is separate. Differential Revision: https://reviews.llvm.org/D99606
-
Jessica Paquette authored
Basically a port of isBitfieldExtractOpFromSExtInReg in AArch64ISelDAGToDAG. This is only done post-legalization for now. Once the legalizer knows how to decompose these back into shifts, this requirement can probably be removed. Differential Revision: https://reviews.llvm.org/D99230
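A scalar C++ model of the equivalence the combine relies on (illustrative only; assumes 1 <= Width and Lsb + Width <= 32):

```cpp
#include <cstdint>

// sext_inreg(X >> Lsb, Width) is the same as a signed bitfield extract
// (AArch64 SBFX): take Width bits starting at Lsb and sign-extend them.
int32_t signedBitfieldExtract(uint32_t X, unsigned Lsb, unsigned Width) {
  uint32_t Shifted = X >> Lsb;
  uint32_t Mask = (Width < 32) ? ((1u << Width) - 1) : ~0u;
  uint32_t Field = Shifted & Mask;
  // Sign-extend the low Width bits using the (x ^ s) - s trick.
  uint32_t SignBit = 1u << (Width - 1);
  return (int32_t)((Field ^ SignBit) - SignBit);
}
```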
-
Jonas Devlieghere authored
-
Mehdi Amini authored
Add a "register_runtime" method to the mlir.execution_engine and show calling back from MLIR into Python This exposes the ability to register Python functions with the JIT and exposes them to the MLIR jitted code. The provided test case illustrates the mechanism. Differential Revision: https://reviews.llvm.org/D99562
-
Nick Lewycky authored
This flag allows the developer to see the result of linking even if it fails the verifier, as a step in debugging cases where the linked module fails the verifier. Differential Revision: https://reviews.llvm.org/D99382
-
Michał Górny authored
-
Craig Topper authored
[RISCV] Pass 'half' in the lower 16 bits of an f32 value when the F extension is enabled, but Zfh is not. Without Zfh the half type isn't legal, but it could still be used as an argument/return in IR. Clang will not generate this today. Previously we promoted the half value to float for arguments and returns if the F extension was enabled but Zfh wasn't. Then, depending on which ABI was enabled, we would pass it in either an FPR or a GPR in float format. If the F extension isn't enabled, it gets passed in the lower 16 bits of a GPR in half format. With this patch the value will always be in half format and will be in the lower bits of a GPR or FPR. This should be consistent with where the bits are located when Zfh is enabled. I've based this implementation on how this is done on ARM. I've manually NaN-boxed the value to 32 bits using integer ops. It looks like flw, fsw, fmv.s, fmv.w.x, and fmv.x.w won't canonicalize NaNs, so they should leave the value alone. I think those are the instructions that could get used on this value. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D98670
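The boxing itself is a one-liner in scalar form (a sketch of the idea, not the patch's code):

```cpp
#include <cstdint>

// NaN-boxing: the half's 16 bits sit in the low half of the 32-bit
// register and the upper 16 bits are all ones, so the pattern reads as a
// NaN when interpreted as a float -- matching where the bits live when
// Zfh is available.
uint32_t nanBoxHalf(uint16_t HalfBits) {
  return 0xFFFF0000u | HalfBits; // done with integer ops, as in the patch
}
```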
-
Amara Emerson authored
This is a straightforward port. Differential Revision: https://reviews.llvm.org/D99449
-
Tomas Matheson authored
Currently needsStackRealignment returns false if canRealignStack returns false. This means the behavior of needsStackRealignment does not correspond to its name and description; a function might need stack realignment, but if that is not possible then this function returns false. Furthermore, needsStackRealignment is not virtual, and therefore some backends have made use of canRealignStack to indicate whether a function needs stack realignment. This patch attempts to clarify the situation by separating them and introducing new names:
- shouldRealignStack - true if there is any reason the stack should be realigned
- canRealignStack - true if we are still able to realign the stack (e.g. we can still reserve/have reserved a frame pointer)
- hasStackRealignment = shouldRealignStack && canRealignStack (not target customisable)
Targets can now override shouldRealignStack to indicate that stack realignment is required. This change will make it easier, in a future change, to handle the case where we need to realign the stack but can't do so (for example when the register allocator creates an aligned spill after the frame pointer has been eliminated). Differential Revision: https://reviews.llvm.org/D98716 Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87
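Sketched against an abbreviated interface (the real methods live on TargetRegisterInfo), the relationship is:

```cpp
class MachineFunction; // stand-in for llvm::MachineFunction

class RegisterInfoSketch {
public:
  // Overridable: is there any reason this function's stack should be
  // realigned (e.g. overaligned stack objects)?
  virtual bool shouldRealignStack(const MachineFunction &MF) const = 0;
  // Overridable: are we still able to realign it (e.g. can we still
  // reserve, or have we reserved, a frame pointer)?
  virtual bool canRealignStack(const MachineFunction &MF) const = 0;
  // Not target-customisable: realignment happens only when it is both
  // wanted and possible.
  bool hasStackRealignment(const MachineFunction &MF) const {
    return shouldRealignStack(MF) && canRealignStack(MF);
  }
  virtual ~RegisterInfoSketch() = default;
};
```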
-
Nick Lewycky authored
-
Craig Topper authored
The shift amount should always be a vector or an XLen scalar. The SplatOperand flag is used to indicate that we need to legalize non-XLen scalars, including special handling for i64 on RV32. This will prevent us from silently adjusting these operands if the intrinsics are misused. I'll probably adjust the name of the SplatOperand flag slightly in a follow-up patch. Reviewed By: khchen, frasercrmck Differential Revision: https://reviews.llvm.org/D99545
-