- Oct 22, 2021
-
-
Craig Topper authored
Instead of returning a bool to indicate success and a separate SDValue, return the SDValue and have the callers check if it is null. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112331
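A minimal sketch of the calling convention this describes (the helper name is hypothetical, not the function touched by D112331): a null `SDValue` signals failure, replacing the bool-plus-output-parameter pair.
```cpp
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Hypothetical helper illustrating the pattern: return a null SDValue on
// failure instead of returning bool with an SDValue out-parameter.
static SDValue tryFoldFirstOperand(SDNode *N) {
  if (N->getNumOperands() == 0)
    return SDValue(); // default-constructed SDValue is null: failure
  return N->getOperand(0);
}

// Caller: SDValue's explicit operator bool tests for null.
static SDValue useIt(SDNode *N) {
  if (SDValue V = tryFoldFirstOperand(N))
    return V;
  return SDValue();
}
```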
-
Quinn Pham authored
[NFC] This patch fixes a URL in a testcase due to the renaming of the branch.
-
Nikita Popov authored
This follows up on D111023 by exporting the generic "load value from constant at given offset as given type" helper and using it in the store-to-load forwarding code. We now need to make sure that the load size is smaller than the store size; previously this was implicitly ensured by ConstantFoldLoadThroughBitcast(). Differential Revision: https://reviews.llvm.org/D112260
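A sketch of what the forwarding check might look like, assuming the exported helper is `ConstantFoldLoadFromConst`; the surrounding function is illustrative, not the actual forwarding code.
```cpp
#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/IR/DataLayout.h"
using namespace llvm;

// Illustrative: forward a load of LoadTy from a stored constant at Offset,
// but only when the load does not read past the end of the store.
static Constant *forwardStoredConstant(Constant *StoredVal, Type *LoadTy,
                                       const APInt &Offset,
                                       const DataLayout &DL) {
  // The load must be no larger than the store; otherwise part of the
  // loaded value is not covered by the stored constant.
  if (DL.getTypeStoreSize(LoadTy).getFixedSize() >
      DL.getTypeStoreSize(StoredVal->getType()).getFixedSize())
    return nullptr;
  return ConstantFoldLoadFromConst(StoredVal, LoadTy, Offset, DL);
}
```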
-
Nikita Popov authored
Make use of the getGEPIndicesForOffset() helper for creating GEPs. This handles arrays as well, uses correct GEP index types and reduces code duplication. Differential Revision: https://reviews.llvm.org/D112263
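A sketch of GEP construction with the helper mentioned here; the function and variable names are illustrative.
```cpp
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Illustrative: build an inbounds GEP addressing byte Offset inside Base,
// letting the DataLayout helper pick struct/array indices with the correct
// index types instead of hand-rolling the walk.
static Value *emitGEPAtOffset(IRBuilder<> &B, Value *Base, Type *SrcElemTy,
                              APInt Offset, const DataLayout &DL) {
  Type *ElemTy = SrcElemTy; // updated by the helper to the type reached
  SmallVector<APInt> Indices = DL.getGEPIndicesForOffset(ElemTy, Offset);
  SmallVector<Value *> IdxValues;
  for (const APInt &Idx : Indices)
    IdxValues.push_back(B.getInt(Idx));
  // Offset now holds any remainder the indices could not cover.
  return B.CreateInBoundsGEP(SrcElemTy, Base, IdxValues);
}
```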
-
Craig Topper authored
[LegalizeTypes][RISCV][PowerPC] Expand CTLZ/CTTZ/CTPOP instead of promoting if they'll be expanded later. Expanding these requires multiple constants. If we promote during type legalization when they'll end up getting expanded in LegalizeDAG, we'll use larger constants. These constants may be harder to materialize. For example, 64-bit constants on 64-bit RISCV are very expensive. This is similar to what has already been done to BSWAP and BITREVERSE. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112268
-
Steven Wan authored
On AIX, the plugins are linked with `-Wl,-G`, which produces shared objects enabled for use with the run-time linker. This patch sets the run-time linker at the main executable link step to allow symbols from the plugins' shared objects to be properly bound. Reviewed By: daltenty Differential Revision: https://reviews.llvm.org/D112275
-
Craig Topper authored
These tests have nearly identical content; the only difference is that the rv64 test has a signext attribute on some parameters. That attribute should be harmless on rv32. Merge them into a single test file with 2 RUN lines. Differential Revision: https://reviews.llvm.org/D112242
-
Kazu Hirata authored
-
Jonas Paulsson authored
This pseudo is expanded very late (AsmPrinter) and therefore has to have a correct size value, or the branch relaxation pass may make a wrong decision. Review: Ulrich Weigand
-
Piotr Sobczak authored
-
Bradley Smith authored
This will allow us to reuse existing interleaved load logic in lowerInterleavedLoad that exists for neon types, but for SVE fixed types. The goal eventually will be to replace the existing ld<n> intrinsics with these, once a migration path has been sorted out. Differential Revision: https://reviews.llvm.org/D112078
-
Zarko Todorovski authored
-
Roman Lebedev authored
[X86] `X86TTIImpl::getInterleavedMemoryOpCost()`: scale interleaving cost by the fraction of live members

By definition, an interleaved load of stride N means: load N*VF elements and shuffle them into N VF-sized vectors, with the 0'th vector containing elements `[0, VF)*stride + 0` and the 1'th vector containing elements `[0, VF)*stride + 1`. Example: https://godbolt.org/z/df561Me5E (i64 stride 4 vf 2 => cost 6)

A not-fully-interleaved load is one where not all of these vectors are demanded. At worst, we could just pretend that everything is demanded and discard the non-demanded vectors. So the cost for a not-fully-interleaved group should be no greater than the cost for the same fully-interleaved group, and perhaps somewhat less. Examples:
https://godbolt.org/z/a78dK5Geq (i64 stride 4 (indices 012u) vf 2 => cost 4)
https://godbolt.org/z/G91ceo8dM (i64 stride 4 (indices 01uu) vf 2 => cost 2)
https://godbolt.org/z/5joYob9rx (i64 stride 4 (indices 0uuu) vf 2 => cost 1)

Right now, for such not-fully-interleaved loads we just use the costs for fully-interleaved loads. But at least in general, that is obviously overly pessimistic, because in general not all the shuffles needed to perform the full interleaving will end up being live. So what this does is naively scale the interleaving cost by the fraction of live members. I believe this should still result in the right ballpark cost estimate, although it may be an over- or under-estimate. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112307
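A scalar sketch of the scaling rule (names are illustrative; the actual change lives in `X86TTIImpl::getInterleavedMemoryOpCost()`). Applied to the godbolt examples above this gives only a ballpark figure, as the message itself notes.
```cpp
#include <cassert>

// Illustrative: scale the fully-interleaved group cost by the fraction of
// demanded (live) members, multiplying first and rounding the division up
// so integer arithmetic does not truncate the estimate toward zero.
unsigned scaledInterleavedCost(unsigned FullGroupCost, unsigned NumLiveMembers,
                               unsigned Stride) {
  assert(NumLiveMembers <= Stride && "more live members than the stride");
  return (FullGroupCost * NumLiveMembers + Stride - 1) / Stride;
}
```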
-
Sylvestre Ledru authored
and fix some typos Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D112299
-
Simon Pilgrim authored
Pre-commit for D111530
-
Jay Foad authored
This doesn't have any effect on codegen now, but it might do in the future if we shrink instructions before post-RA scheduling, which is sensitive to live vs dead defs. Differential Revision: https://reviews.llvm.org/D112305
-
Florian Hahn authored
This patch adds more complex test cases with redundant stores of an existing memset, with other stores in between. It also makes a few of the existing tests more robust.
-
Roman Lebedev authored
This test is quite fragile WRT improvements to the interleaved load cost modelling. Let's bump the stride way up so that this is no longer a concern.
-
Roman Lebedev authored
This reverts commit 8ae83a1b.
-
Simon Pilgrim authored
-
Roman Lebedev authored
The math here is:

    cost of 1 load = cost of n loads / n
    cost of live loads = num live loads * cost of 1 load
                       = num live loads * (cost of n loads / n)
                       = cost of n loads * (num live loads / n)

But all the variables here are integers, and integer division rounds down, while this calculation clearly expects float semantics. Instead, multiply upfront and then perform round-up division. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112302
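A concrete illustration of the truncation problem described above; the numbers are made up for the demonstration.
```cpp
#include <cstdio>

int main() {
  unsigned CostOfNLoads = 6, N = 4, NumLive = 3;
  // Dividing first truncates: 6 / 4 = 1, then 1 * 3 = 3.
  unsigned DivideFirst = (CostOfNLoads / N) * NumLive;
  // Multiplying upfront, then rounding the division up: ceil(18 / 4) = 5.
  unsigned MultiplyFirst = (CostOfNLoads * NumLive + N - 1) / N;
  std::printf("divide-first = %u, multiply-first = %u\n", DivideFirst,
              MultiplyFirst);
  return 0;
}
```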
-
Roman Lebedev authored
-
Simon Pilgrim authored
parseFunctionName allowed a defaulted null pointer, despite the pointer being dereferenced immediately for use as a reference, and despite all callers passing the address of an existing reference. Fixes a static analyzer warning about potential null dereferences.
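A sketch of the shape of the fix; the signatures are illustrative, not the actual parseFunctionName.
```cpp
#include <string>

// Before (sketch): the defaulted null pointer could never actually be
// used, because the body dereferenced it unconditionally.
//   void parseFunctionName(std::string *Name = nullptr);
//
// After (sketch): take a reference, which cannot be null and matches how
// every caller already passed the address of an existing object.
static void parseFunctionName(std::string &Name) {
  Name = "example"; // placeholder for the real parsing logic
}
```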
-
Michał Górny authored
Optimize the iterator comparison logic to compare Current.data() pointers. Use std::tie for assignments from std::pair. Replace the custom class with a function returning iterator_range. Differential Revision: https://reviews.llvm.org/D110535
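Sketches of the two idioms mentioned, under illustrative names:
```cpp
#include "llvm/ADT/iterator_range.h"
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// std::tie unpacks a std::pair into existing variables in one statement,
// replacing two separate member assignments.
static std::pair<std::string, unsigned> nextChunk() { return {"data", 4}; }

static void consume() {
  std::string Data;
  unsigned Size;
  std::tie(Data, Size) = nextChunk();
}

// A function returning iterator_range replaces a dedicated range class
// that only existed to expose begin()/end().
static llvm::iterator_range<std::vector<int>::iterator>
values(std::vector<int> &V) {
  return llvm::make_range(V.begin(), V.end());
}
```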
-
Florian Hahn authored
IRBuilder has been updated to support preserving metadata in a more general manner. This patch adds `LLVMAddMetadataToInst` and deprecates `LLVMSetInstDebugLocation` in favor of the more general function. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D93454
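A sketch of migrating a caller, assuming the new entry point has the `(LLVMBuilderRef, LLVMValueRef)` signature added alongside this patch:
```cpp
#include "llvm-c/Core.h"

// Attach the builder's pending metadata (including the current debug
// location) to a newly created instruction.
static void attachBuilderMetadata(LLVMBuilderRef Builder, LLVMValueRef Inst) {
  // Previously: LLVMSetInstDebugLocation(Builder, Inst);
  LLVMAddMetadataToInst(Builder, Inst);
}
```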
-
Fraser Cormack authored
This patch fixes a codegen bug, the test for which was introduced in D112223. When merging VSETVLIInfo across blocks, if the 'exit' VSETVLIInfo produced by a block is found to be compatible with the VSETVLIInfo computed as the intersection of the 'exit' VSETVLIInfo produced by the block's predecessors, that block's 'exit' info is discarded and the intersected value is taken in its place. However, we have one authority on what constitutes VSETVLIInfo compatibility, and we are using it in two different contexts. Compatibility is used in one context to elide VSETVLIs between straight-line vector instructions. But compatibility, when evaluated between two blocks' exit infos, ignores any info produced *inside* each respective block before the exit points. As such, it does not guarantee that a block will not produce a VSETVLI which is incompatible with the 'previous' block. We must therefore ensure that any merging of VSETVLIInfo is performed using some notion of "strict" compatibility. I've defined this as a full vtype match, but this is perhaps too pessimistic. Given that test coverage in this regard is lacking -- the only change is in the failing test -- I think this is a good starting point. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D112228
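A much-simplified sketch of the merge decision; the real VSETVLIInfo tracks more state, and this only models the "strict compatibility = full vtype match" rule the message proposes.
```cpp
// Simplified stand-in for RISCV's VSETVLIInfo; the real class tracks VL as
// well. "Strict" compatibility is modeled here as a full vtype match.
struct ExitInfo {
  unsigned VType = 0;
  bool Known = false;
};

static bool strictlyCompatible(const ExitInfo &A, const ExitInfo &B) {
  return A.Known && B.Known && A.VType == B.VType;
}

// Only replace a block's own exit info with the predecessors' intersection
// when the two strictly match; otherwise keep the block's own exit info, so
// VSETVLIs emitted inside the block cannot contradict the merged state.
static ExitInfo mergeExitInfo(const ExitInfo &BlockExit,
                              const ExitInfo &PredIntersection) {
  return strictlyCompatible(BlockExit, PredIntersection) ? PredIntersection
                                                         : BlockExit;
}
```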
-
Chen Zheng authored
-
Chen Zheng authored
This is to improve compile time. Differential Revision: https://reviews.llvm.org/D112196 Reviewed By: jsji
-
Chuanqi Xu authored
While playing with Coroutines, I found that it is possible to generate the following IR:
```
%struct = alloca ...
%sub.element = getelementptr %struct, i64 0, i64 index ; index is not %zero
lifetime.marker.start(%sub.element)
% use of %sub.element
lifetime.marker.end(%sub.element)
store %struct to xxx ; %struct is escaping!
<suspend points>
```
The AllocaUseVisitor would then collect the lifetime markers for %sub.element and treat them as the lifetime markers of the alloca! So, judging by the lifetime markers alone, it concludes that the alloca could be put on the stack instead of the frame. The root cause of the bug is that AllocaUseVisitor collects the wrong lifetime markers; this patch fixes that. Reviewed By: lxfind Differential Revision: https://reviews.llvm.org/D112216
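A sketch of the kind of check that avoids collecting a sub-element's markers for the whole alloca; the helper name is illustrative.
```cpp
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"
using namespace llvm;

// Illustrative: accept a lifetime marker as covering the alloca only when
// its pointer operand reaches the alloca itself at offset zero, not a GEP
// to some inner sub-element.
static bool isWholeAllocaLifetimeMarker(const IntrinsicInst &II,
                                        const AllocaInst &AI,
                                        const DataLayout &DL) {
  if (II.getIntrinsicID() != Intrinsic::lifetime_start &&
      II.getIntrinsicID() != Intrinsic::lifetime_end)
    return false;
  const Value *Ptr = II.getArgOperand(1); // operand 0 is the size
  APInt Offset(DL.getIndexTypeSizeInBits(Ptr->getType()), 0);
  const Value *Base = Ptr->stripAndAccumulateConstantOffsets(
      DL, Offset, /*AllowNonInbounds=*/true);
  return Base == &AI && Offset == 0;
}
```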
-
LLVM GN Syncbot authored
-
Vitaly Buka authored
Transformations may strip the attribute from an argument, e.g. for unused arguments, which will result in a shadow offset mismatch between caller and callee. Stripping noundef for used arguments can be a problem, as TLS is not going to be set by the caller. However, this is not the goal of this patch, and I am not aware whether that is even possible. Differential Revision: https://reviews.llvm.org/D112197
-
Stanislav Mekhanoshin authored
In a kernel which does not have calls or AGPR usage we can allocate the whole vector register budget for VGPRs and have no AGPRs as long as VGPRs stay addressable (i.e. below 256). Differential Revision: https://reviews.llvm.org/D111764
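A toy model of the budget decision described; the numbers and especially the fallback split are illustrative, not the actual AMDGPU heuristic.
```cpp
#include <algorithm>

// If a kernel makes no calls and uses no AGPRs, give the entire vector
// register budget to VGPRs, capped at 256 so VGPRs stay addressable.
static unsigned allocatableVGPRs(unsigned VectorRegBudget, bool HasCalls,
                                 bool UsesAGPRs) {
  if (!HasCalls && !UsesAGPRs)
    return std::min(VectorRegBudget, 256u);
  // Otherwise the budget must be shared with AGPRs; an even split is a
  // placeholder here, not the real target logic.
  return VectorRegBudget / 2;
}
```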
-
Nico Weber authored
That way, the headers in llvm/utils/gn/secondary/compiler-rt/include are copied when running `ninja compiler-rt`. (Previously, they were only copied when running `check-hwasan` or when building the compiler-rt/include target.) (Since they should be copied only once, depend on the target in the host toolchain. I think default_toolchain should work just as well, it just needs to be a single fixed toolchain; check-hwasan depends through host_toolchain, so let's use that here too.) Prevents errors like `testing/fuzzed_data_provider.h:8:10: fatal error: 'fuzzer/FuzzedDataProvider.h' file not found` when building with locally-built clang. (For now, you still have to explicitly build the 'compiler-rt' target. Maybe we should make the clang target depend on that in the GN build?) Differential Revision: https://reviews.llvm.org/D112238
-
Luís Ferreira authored
This patch is a refactor to allow implementing prepend afterwards. Since this changes a lot of files, and to conform with guidelines, I will separate it from the implementation of prepend. Related to the discussion in https://reviews.llvm.org/D111414, so please read it for more context. Reviewed By: #libc_abi, dblaikie, ldionne Differential Revision: https://reviews.llvm.org/D111947
-
Jack Anderson authored
Some dwarf loaders in LLVM are hard-coded to only accept 4-byte and 8-byte address sizes. This patch generalizes acceptance into `DWARFContext::isAddressSizeSupported` and provides a common way to generate rejection errors. The MSP430 target has been given new tests to cover dwarf loading cases that previously failed due to 2-byte addresses. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D111953
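A sketch of the centralized predicate; the exact supported set is whatever `DWARFContext::isAddressSizeSupported` encodes, with 2-byte being the case MSP430 needs.
```cpp
#include <cstdint>

// Illustrative version of a shared address-size check: loaders consult one
// predicate instead of hard-coding 4 and 8 at each call site.
static bool isAddressSizeSupported(uint64_t AddressSize) {
  return AddressSize == 2 || AddressSize == 4 || AddressSize == 8;
}
```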
-
Tom Stellard authored
Does anyone still use these? I want to make some changes to the sphinx html generation and I don't want to have to implement the changes in two places. Reviewed By: sylvestre.ledru, #libc, ldionne Differential Revision: https://reviews.llvm.org/D112030
-
Craig Topper authored
There is no need to return a bool and have an SDValue output parameter. Just return the SDValue and let the caller check if it is null. I have another patch to add more callers of these so I thought I'd clean up the interface first. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112267
-
Craig Topper authored
By expanding early, it allows the shifts to be custom lowered in LegalizeVectorOps. Then a DAG combine is able to run on them before LegalizeDAG handles the BUILD_VECTORs for the masks used. v16i8 shift lowering on X86 requires a mask to be applied to a v8i16 shift. The BITREVERSE expansion applied an AND mask before SHL ops and after SRL ops. This was done to share the same mask constant for both shifts. It looks like this patch allows DAG combine to remove the AND mask added after the v16i8 SHL by X86 lowering. This maintains the mask sharing that BITREVERSE was trying to achieve. Prior to this patch, it looks like we kept the mask after the SHL instead, which required an extra constant pool entry or a PANDN to invert it. This is dependent on D112248 because RISCV will end up scalarizing the BSWAP portion of the BITREVERSE expansion if we don't disable BSWAP scalarization in LegalizeVectorOps first. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112254
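The mask sharing being preserved is easiest to see in the scalar form of the BITREVERSE expansion, sketched here for one byte: each step applies the same constant as an AND before the SHL and as an AND after the SRL.
```cpp
#include <cstdint>

// Scalar sketch of the BITREVERSE expansion. Each step reuses one mask
// constant both before the left shift and after the right shift, which is
// the sharing the vector expansion wants to keep.
static uint8_t bitreverse8(uint8_t V) {
  V = uint8_t(((V & 0x55) << 1) | ((V >> 1) & 0x55)); // swap adjacent bits
  V = uint8_t(((V & 0x33) << 2) | ((V >> 2) & 0x33)); // swap bit pairs
  V = uint8_t(((V & 0x0F) << 4) | ((V >> 4) & 0x0F)); // swap nibbles
  return V;
}
```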
-
Stanislav Mekhanoshin authored
-
- Oct 21, 2021
-
-
David Blaikie authored
Differential Revision: https://reviews.llvm.org/D112265
-