- May 03, 2020
-
-
Florian Hahn authored
-
Johannes Doerfert authored
This reduces memory consumption for IRPositions by eliminating the vtable pointer and the `KindOrArgNo` integer. Since each abstract attribute has an associated IRPosition, the 12-16 bytes we save add up quickly. No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM) for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 469545 (260135/s)
temporary memory allocations: 77137 (42735/s)
peak heap memory consumption: 30.50MB
peak RSS (including heaptrack overhead): 119.50MB
total memory leaked: 269.07KB
```

After:
```
calls to allocation functions: 468999 (274108/s)
temporary memory allocations: 77002 (45004/s)
peak heap memory consumption: 28.83MB
peak RSS (including heaptrack overhead): 118.05MB
total memory leaked: 269.07KB
```

Difference:
```
calls to allocation functions: -546 (5808/s)
temporary memory allocations: -135 (1436/s)
peak heap memory consumption: -1.67MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

CTMark 15 runs

Metric: compile_time

    Program                                          lhs     rhs     diff
    test-suite...:: CTMark/sqlite3/sqlite3.test      25.07   24.09   -3.9%
    test-suite...Mark/mafft/pairlocalalign.test      14.58   14.14   -3.0%
    test-suite...-typeset/consumer-typeset.test      21.78   21.58   -0.9%
    test-suite :: CTMark/SPASS/SPASS.test            21.95   22.03    0.4%
    test-suite :: CTMark/lencod/lencod.test          25.43   25.50    0.3%
    test-suite...ark/tramp3d-v4/tramp3d-v4.test      23.88   23.83   -0.2%
    test-suite...TMark/7zip/7zip-benchmark.test      60.24   60.11   -0.2%
    test-suite :: CTMark/kimwitu++/kc.test           15.69   15.69   -0.0%
    test-suite...:: CTMark/ClamAV/clamscan.test      25.43   25.42   -0.0%
    test-suite :: CTMark/Bullet/bullet.test          37.63   37.62   -0.0%
    Geomean difference                                               -0.8%

---

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D78722
-
Johannes Doerfert authored
Since every AbstractAttribute so far, and for the foreseeable future, corresponds to a single IRPosition we can simplify the class structure. We already did this for IRAttribute but there is no reason to stop there.
-
Nico Weber authored
This reverts commit 53913a65. Breaks VFSFromYAMLTest.DirectoryIterationSameDirMultipleEntries in SupportTests on non-Windows.
-
Simon Pilgrim authored
-
Mircea Trofin authored
Discussion is in https://reviews.llvm.org/D79215
-
Reid Kleckner authored
LLD calls this on every source file string in every object file when writing PDBs, so it is somewhat hot. Avoid rewriting paths that do not contain path traversal components (./..). Use find_first_not_of(separators) directly instead of using the path iterators. The path component iterators appear to be slow, and directly searching for slashes makes it easier to find double separators that need to be canonicalized. I discovered that the VFS relies on remove_dots to not canonicalize early slashes (/foo or C:/foo) on Windows, so I had to leave that behavior in place, with unit tests for it. This is undesirable, but I claim that my change is NFC.
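The fast-path idea described above can be sketched as a scan that decides whether a path needs rewriting at all. This is a minimal illustration, not LLVM's `remove_dots` implementation: the helper name `needsCanonicalization` and its exact rules (leading and trailing separators are ignored) are assumptions made for the example, and it requires C++17 for `std::string_view`.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not an LLVM API): returns true if the path has a "."
// or ".." component, or a doubled separator between two components.
// Leading and trailing separators are deliberately ignored in this sketch.
bool needsCanonicalization(std::string_view path,
                           std::string_view seps = "/\\") {
  constexpr std::size_t npos = std::string_view::npos;
  // Skip any leading separators, then walk component by component.
  std::size_t pos = path.find_first_not_of(seps);
  while (pos != npos) {
    std::size_t end = path.find_first_of(seps, pos);
    std::string_view comp =
        path.substr(pos, end == npos ? npos : end - pos);
    if (comp == "." || comp == "..")
      return true;
    if (end == npos)
      break; // last component, nothing follows
    std::size_t next = path.find_first_not_of(seps, end);
    if (next == npos)
      break; // only trailing separators remain
    if (next - end > 1)
      return true; // more than one separator between components
    pos = next;
  }
  return false;
}
```

A caller would return the input unchanged when the helper reports `false` and only fall back to the slower rewriting path otherwise.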
-
Sanjay Patel authored
Cond ? (X & ~C) : (X | C) --> (X & ~C) | (Cond ? 0 : C)
Cond ? (X | C) : (X & ~C) --> (X & ~C) | (Cond ? C : 0)

The select-of-constants form results in better codegen. There's an existing test diff that shows a transform that results in an extra IR instruction, but that's an existing problem.

This is motivated by code seen in LLVM itself - see PR37581: https://bugs.llvm.org/show_bug.cgi?id=37581

    define i8 @src(i8 %x, i8 %C, i1 %b) {
      %notC = xor i8 %C, -1
      %and = and i8 %x, %notC
      %or = or i8 %x, %C
      %cond = select i1 %b, i8 %or, i8 %and
      ret i8 %cond
    }

    define i8 @tgt(i8 %x, i8 %C, i1 %b) {
      %notC = xor i8 %C, -1
      %and = and i8 %x, %notC
      %mul = select i1 %b, i8 %C, i8 0
      %or = or i8 %mul, %and
      ret i8 %or
    }

http://volta.cs.utah.edu:8080/z/Vt2WVm

Differential Revision: https://reviews.llvm.org/D78880
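The equivalence behind this fold is small enough to check exhaustively for 8-bit values. The following is only an illustrative sketch in C++, not part of the patch; the function names are made up for the example. It compares the select-of-operations form against the folded form for every `x`, `c`, and condition.

```cpp
#include <cassert>
#include <cstdint>

// Source form: Cond ? (X | C) : (X & ~C)
uint8_t selectOfOps(bool cond, uint8_t x, uint8_t c) {
  return cond ? static_cast<uint8_t>(x | c) : static_cast<uint8_t>(x & ~c);
}

// Folded form: (X & ~C) | (Cond ? C : 0)
uint8_t opOfSelect(bool cond, uint8_t x, uint8_t c) {
  return static_cast<uint8_t>((x & ~c) | (cond ? c : 0));
}

// Returns true if both forms agree for all 8-bit inputs and both conditions.
// The commuted pattern from the commit message is the same check with the
// condition inverted, so it is covered by iterating over both values of cond.
bool foldHoldsForAllI8() {
  for (int x = 0; x < 256; ++x)
    for (int c = 0; c < 256; ++c)
      for (int cond = 0; cond < 2; ++cond)
        if (selectOfOps(cond != 0, static_cast<uint8_t>(x),
                        static_cast<uint8_t>(c)) !=
            opOfSelect(cond != 0, static_cast<uint8_t>(x),
                       static_cast<uint8_t>(c)))
          return false;
  return true;
}
```

The check works because `(X & ~C) | C` equals `X | C`, and `(X & ~C) | 0` equals `X & ~C`.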
-
Benjamin Kramer authored
This is a somewhat annoying API, but not without precedent in this low level API.
-
Simon Pilgrim authored
[X86] Use splitVector helper in truncateVectorWithPACK/splitVectorStore/combineHorizontalMinMaxResult/combineReductionToHorizontal. NFC. All these locations were performing the same type splitting/extractSubVector calls as the splitVector helper.
-
LLVM GN Syncbot authored
-
Nico Weber authored
-
Simon Pilgrim authored
It can handle EVT just as well (and so can the extractSubVector calls).
-
Alexey Lapshin authored
Summary: The current implementation of DWARFDie::getName(DINameKind Kind) could lead to a double call to DWARFDie::find(DW_AT_name) in the following scenario: getName(LinkageName); getName(ShortName); getName(LinkageName) calls find(DW_AT_name) if the linkage name is not found. Then, it is called again in getName(ShortName). This patch allows requesting LinkageName and ShortName separately to avoid the extra call to find(DW_AT_name). It helps D74169 to parse clang debuginfo faster (~1%). Reviewers: clayborg, dblaikie Differential Revision: https://reviews.llvm.org/D79173
-
Nikita Popov authored
Duplicate some tests in preparation for D79294.
-
Simon Pilgrim authored
The splitVector helper uses extractSubVector which splits build vectors like we do here, so avoid reimplementing it. splitVector could easily be extended to peek through bitcasts as well but I'd prefer to keep this commit NFC.
-
Simon Pilgrim authored
Use the ISD::matchUnaryPredicate helper to check for inrange constants.
-
Nikita Popov authored
Test this directly, rather than going through InstSimplify.
-
Ten Tzen authored
This is a Test commit.
-
Reid Kleckner authored
The number of public symbols is very large, and each deserialization does a few heap allocations. The public symbols are serialized by the linker, so we can assume they have the expected layout and use it directly. Saves O(#publics) temporary heap allocations and shrinks some data structures.
-
Craig Topper authored
Don't use $noreg for instructions that take register inputs. Only allow $noreg for parts of memory operands. Don't use index register with $rip base. Use RETQ instead of the RET pseudo. This pass is after the ExpandPseudo pass that converts RET to RETQ.
-
Craig Topper authored
[PDB] Remove a couple asserts that are no longer valid now that C13Builders does not use unique_ptr. These asserts used to check that unique_ptr was not null. This fixes failures from 7af4bb16
-
Reid Kleckner authored
This accounts for a large portion of the memory allocations in LLD. This DebugSubsectionRecordBuilder object can be stored directly in C13Builders, it mostly wraps other subsections. Remove the container kind field from the object. It is always the same for all elements in the vector, and we can pass it in during writing.
-
Thomas Preud'homme authored
Summary: FileCheck documentation contains an example of a numeric variable defined and used on the same line. This is not currently supported by FileCheck so this commit fixes the example to use CHECK-SAME for the variable use. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D79253
-
LemonBoy authored
The two code paths have the same goal, legalizing a load of a non-byte-sized vector by loading the "flattened" representation in memory, slicing off each single element and then building a vector out of those pieces. The technique employed by `ExpandLoad` is slightly more convoluted and produces slightly better codegen on ARM, AMDGPU and x86 but suffers from some bugs (D78480) and is wrong for BE machines. Differential Revision: https://reviews.llvm.org/D79096
-
- May 02, 2020
-
-
Simon Pilgrim authored
Causing buildbot failures
-
Simon Pilgrim authored
rL368553 added SimplifyMultipleUseDemandedBits handling for ISD::TRUNCATE to SimplifyDemandedBits so we don't need to duplicate this (and it gets rid of another GetDemandedBits call which is slowly being replaced with SimplifyMultipleUseDemandedBits anyhow).
-
Benjamin Kramer authored
std::pair has a trivial copy ctor, std::tuple doesn't.
-
Nico Weber authored
The failures only happened in fully clean builds. Also put all current dependencies of LibraryDependencies.inc in the build graph, so that this type of thing will cause a failure in incremental builds next time as well.
-
Reid Kleckner authored
This generalizes the main Windows command line tokenizer to be able to produce StringRef substrings as well as freshly copied C strings. The implementation is still shared with the normal tokenizer, which is important, because we have unit tests for that.

.drectve sections can be very long. They can potentially list up to every symbol in the object file by name. It is worth avoiding these string copies.

This saves a lot of memory when linking chrome.dll with PGO instrumentation:

                    BEFORE      AFTER       % IMP
    peak memory:    6657.76MB   4983.54MB   -25%
    real:           4m30.875s   2m26.250s   -46%

The time improvement may not be real, my machine was noisy while running this, but the peak memory usage improvement should be real.

This change may also help apps that heavily use dllexport annotations, because those also use linker directives in object files. Apps that do not use many directives are unlikely to be affected.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D79262
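To illustrate why returning substrings avoids the copies, here is a deliberately simplified sketch. It is not the LLVM tokenizer: it splits on spaces and tabs only and ignores the Windows quoting and backslash rules, the name `tokenizeNoCopy` is hypothetical, and `std::string_view` (C++17) stands in for `StringRef`.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical, simplified tokenizer: every token is a view into `src`, so
// no per-token string is allocated. The caller must keep `src` alive for as
// long as the returned views are used.
std::vector<std::string_view> tokenizeNoCopy(std::string_view src) {
  constexpr std::size_t npos = std::string_view::npos;
  constexpr std::string_view ws = " \t";
  std::vector<std::string_view> tokens;
  std::size_t pos = src.find_first_not_of(ws);
  while (pos != npos) {
    std::size_t end = src.find_first_of(ws, pos);
    tokens.push_back(src.substr(pos, end == npos ? npos : end - pos));
    if (end == npos)
      break; // the last token runs to the end of the input
    pos = src.find_first_not_of(ws, end);
  }
  return tokens;
}
```

Each returned view points into the original directive buffer, which is the property that saves memory when a section lists many symbols.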
-
Benjamin Kramer authored
We don't require the type to be trivially assignable. While the standard says that only is_trivially_copyable types may be memcpy'd, this seems overly strict. We never assign the type, so there's no way for the type to observe that the copy/move construction got elided. This is important for std::pair<POD, POD>, which is not trivially assignable and probably never will be because changing that would break ABI. As a side-effect this no longer allows types with deleted copy/move constructors in SmallVector. That's an unintended side-effect of is_trivially_copyable anyways. Shrinks Release+Asserts clang by 20k.
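The distinction drawn here can be sketched with standard type traits. This is an assumption-laden illustration rather than the trait LLVM actually uses: `CanMemcpyElements` and `NoCopy` are names invented for the example, and the sketch only requires trivial copy construction, move construction, and destruction, without requiring trivial assignment.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when elements can be relocated with memcpy under
// the reasoning in the commit message. Trivial assignment is intentionally
// not part of the condition.
template <typename T>
struct CanMemcpyElements
    : std::integral_constant<bool,
                             std::is_trivially_copy_constructible<T>::value &&
                                 std::is_trivially_move_constructible<T>::value &&
                                 std::is_trivially_destructible<T>::value> {};

// Example type with a deleted copy constructor; the trait is false for it.
struct NoCopy {
  NoCopy() = default;
  NoCopy(const NoCopy &) = delete;
};
```

With this definition, `std::pair<int, int>` qualifies even on standard libraries where its user-provided assignment operator keeps it from being `std::is_trivially_copyable`.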
-
Benjamin Kramer authored
This seems to be working by accident.
-
Benjamin Kramer authored
This lets it use sized deallocation and make more efficient alignment decisions. Also adjust BumpPtrAllocator to always allocate at alignof(std::max_align_t).
-
Sam Elliott authored
Summary: The current lowering of `select` on RISC-V uses a branch instruction to load a register with one or other value. This is inefficient, especially in the case of small constants that can be computed easily. By implementing the TargetLowering::convertSelectOfConstantsToMath hook, some of the simpler cases are covered that let us avoid introducing a branch in these cases. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D79260
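As a rough illustration of what "select of constants as math" can mean, the sketch below picks between two values without a branch by building an all-ones or all-zeros mask from the condition. This is just one possible branch-free form written in C++ for clarity; it is not taken from the patch, and the function name is hypothetical. The sequences the backend actually emits after enabling the hook may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: computes `cond ? t : f` using only arithmetic and
// bitwise operations. `mask` is 0xFFFFFFFF when cond is true, 0 otherwise.
int32_t selectWithMath(bool cond, int32_t t, int32_t f) {
  uint32_t mask = 0u - static_cast<uint32_t>(cond);
  uint32_t bits = (static_cast<uint32_t>(t) & mask) |
                  (static_cast<uint32_t>(f) & ~mask);
  return static_cast<int32_t>(bits);
}
```

For special constant pairs the math gets simpler still, for example `cond ? 1 : 0` is just the zero-extended condition.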
-
Sam Elliott authored
Summary: This just adds some simple cases for testing select of constants. There will be a follow-up patch that improves code generation in some of these cases. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D79259
-
Sam Elliott authored
Summary: This patch addresses some weird assembly sequences we were seeing when comparing floats. In particular, comparing a float to itself tells you whether it is NaN or not, which we were doing correctly, but with an extra unneeded `and` instruction. This patch specialises the existing patterns to remove the `and` instructions when both their operands are the same. Reviewed By: luismarques, asb Differential Revision: https://reviews.llvm.org/D78908
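The self-comparison idiom and the redundant `and` can be shown at the source level. This is only a C++ sketch of the idea, with invented function names; it does not reproduce the SelectionDAG patterns from the patch, and it assumes IEEE semantics (no fast-math flags).

```cpp
#include <cassert>
#include <limits>

// General "both operands are ordered" check: two self-comparisons combined
// with an and.
bool bothNotNaN(float a, float b) { return (a == a) & (b == b); }

// When both operands are the same value, the two self-comparisons are
// identical, so a single comparison gives the same answer without the and.
bool isNotNaN(float a) { return a == a; }
```

For any `x`, `bothNotNaN(x, x)` and `isNotNaN(x)` agree, which is the redundancy the specialised patterns remove.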
-
Sam Elliott authored
Summary: I worked on adding some SelectionDAG patterns to address code generated by these examples, which came out of some differential testing against GCC. The pattern additions will be in a follow-up patch. Reviewers: luismarques, asb Reviewed By: luismarques, asb Differential Revision: https://reviews.llvm.org/D78907
-
Sam McCall authored
I've left out some cases where I wasn't totally sure this was right or whether the include was ok (compiler-rt) or idiomatic (flang).
-
Sam McCall authored
Use this in clangd, will follow up with replacements for isspace where locale-dependent behavior is clearly not intended.
-
LLVM GN Syncbot authored
-