- Feb 21, 2021
-
-
Kristina Bessonova authored
Currently, if one module contains a strong definition of a global variable and another module has both a weak definition of the same global and a reference to it, linking with ThinLTO may fail with an undefined symbol error. This happens because:
* the strong definition becomes internal because it is read-only and can be imported;
* the weak definition is replaced by a declaration because it is non-prevailing;
* the strong definition fails to be imported because the destination module already contains another definition of the global, even though that definition is non-prevailing.

The patch adds a check to computeImportForReferencedGlobals() that allows a global variable to be considered for import even if the module contains a definition of it, provided that definition has an interposable linkage type.

Note that the check is currently based only on the linkage type (which seems to be enough at the moment), but it might be worth also taking into account whether the definition is prevailing.

Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D95943
-
Simon Pilgrim authored
This patch handles usubsat patterns hidden through zext/trunc and uses the getTruncatedUSUBSAT helper to determine if the USUBSAT can be correctly performed in the truncated form:

    zext(x) >= y ? x - trunc(y) : 0 --> usubsat(x,trunc(umin(y,SatLimit)))
    zext(x) > y ? x - trunc(y) : 0 --> usubsat(x,trunc(umin(y,SatLimit)))

Based on original examples:

    void foo(unsigned short *p, int max, int n) {
      int i;
      unsigned m;
      for (i = 0; i < n; i++) {
        m = *--p;
        *p = (unsigned short)(m >= max ? m-max : 0);
      }
    }

Differential Revision: https://reviews.llvm.org/D25987
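A scalar sketch of why the fold clamps `y` before truncating, assuming a 16-bit narrow type and a 32-bit wide type. The function names are hypothetical and this is not the DAG combine itself; it only models the arithmetic of the `>=` pattern.

```cpp
#include <algorithm>
#include <cstdint>

// Reference unsigned saturating subtract on 16-bit values.
inline std::uint16_t usubsat16(std::uint16_t a, std::uint16_t b) {
    return a > b ? static_cast<std::uint16_t>(a - b) : std::uint16_t{0};
}

// Source pattern: compare in the wide type, subtract in the narrow type.
inline std::uint16_t select_form(std::uint16_t x, std::uint32_t y) {
    return static_cast<std::uint32_t>(x) >= y
               ? static_cast<std::uint16_t>(x - static_cast<std::uint16_t>(y))
               : std::uint16_t{0};
}

// Folded form: clamp y to SatLimit (the largest narrow value) before
// truncating, so that a large y cannot wrap around to a small one.
inline std::uint16_t folded_form(std::uint16_t x, std::uint32_t y) {
    const std::uint32_t sat_limit = 0xFFFFu;
    return usubsat16(x, static_cast<std::uint16_t>(std::min(y, sat_limit)));
}
```

Without the `umin`, a value such as `y = 0x10001` would truncate to 1 and the saturating subtract would return `x - 1` instead of 0.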
-
Simon Pilgrim authored
Fixes regression exposed by removing bitcasts across logic-ops in D96206. Differential Revision: https://reviews.llvm.org/D96206
-
Simon Pilgrim authored
Extend the existing combine that handles bitcasting for fp-logic ops to also help remove logic ops across bitcasts to/from the same integer types. This helps improve AVX512 predicate handling for D/Q logic ops and also allows DAGCombine's scalarizeExtractedBinop to remove some annoying gpr->simd->gpr transfers. The concat_vectors regression in pr40891.ll will be addressed in a followup commit on this patch. Differential Revision: https://reviews.llvm.org/D96206
-
Craig Topper authored
Largely copied from AArch64/arm64-xaluo.ll
-
Petr Hosek authored
The special root semantics for identifier-named sections are meant specifically for the metadata sections. In the context of group semantics, where group members are always retained or discarded as a unit, it is natural not to apply these semantics to a section in a group; otherwise we would never discard the group, defeating the purpose of using the group in the first place. This change modifies the GC behavior so that __start_/__stop_ references don't retain C identifier named sections in section groups, which allows these groups to be collected. This matches the behavior of BFD ld. The only kind of existing case that might break is interdependent metadata sections that are all in a group together, where that group contains no other section referenced by anything except implicit inclusion in a `__start_`- and/or `__stop_`-referenced identifier-named section; such cases should be unlikely. Differential Revision: https://reviews.llvm.org/D96753
-
Kazu Hirata authored
-
Kazu Hirata authored
-
Jianzhou Zhao authored
-
Brad Smith authored
-
Dave Lee authored
Adjust `ShouldAutoContinue` to be available to any thread plan previous to the plan that explains a stop, not only to the parent of the plan that explains the stop.

Before this change, `Thread::ShouldStop` did the following:
1. find the plan that explains the stop
2. if it's not a master plan, continue processing previous (aka parent) plans
3. first, call `ShouldAutoContinue` on the immediate parent of the explaining plan
4. then loop over previous plans, calling `ShouldStop` and `MischiefManaged`

Of note, the iteration in step 4 does not call `ShouldAutoContinue`, so only the plan just prior to the explaining plan is given the opportunity to override whether to continue or stop.

This commit changes the loop to call `ShouldAutoContinue`, giving each plan the opportunity to override `ShouldStop` of previous plans.

Why? This allows a plan to do the following:
1. mark itself done and be popped off the stack
2. allow parent plans to finish their work, and to also be popped off the stack
3. and finally, have the thread continue, not stop

This is useful for stepping into async functions. A plan would step far enough to set a breakpoint on the async target, then use `ShouldAutoContinue` to unwind the necessary stepping, and then have the calling thread continue.

Differential Revision: https://reviews.llvm.org/D97076
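The difference between the old and new loop can be modelled roughly as follows. This is a deliberately simplified sketch, not the LLDB source: `PlanModel`, `model_should_stop` and `example_stack` are hypothetical names, master-plan handling is left out, and each field simply records the answer the corresponding method is assumed to give for the current stop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a thread plan.
struct PlanModel {
    bool should_stop;           // answer of ShouldStop
    bool mischief_managed;      // answer of MischiefManaged (plan is done)
    bool should_auto_continue;  // answer of ShouldAutoContinue
};

// plans[0] is the immediate parent of the plan that explained the stop;
// later entries are progressively older plans. Returns true if the
// thread should stop.
inline bool model_should_stop(const std::vector<PlanModel> &plans,
                              bool ask_every_plan) {
    bool should_stop = true;
    bool override_stop = false;
    for (std::size_t i = 0; i < plans.size(); ++i) {
        const PlanModel &plan = plans[i];
        // Old behaviour: only the immediate parent (i == 0) is asked.
        if ((ask_every_plan || i == 0) && plan.should_auto_continue)
            override_stop = true;
        should_stop = plan.should_stop;
        if (!plan.mischief_managed)
            break;  // this plan still has work to do, so stop walking
    }
    return override_stop ? false : should_stop;
}

// Two finished plans; only the older one asks to auto-continue.
inline std::vector<PlanModel> example_stack() {
    return {{true, true, false}, {true, true, true}};
}
```

In this model the older plan's request is ignored when only the immediate parent is consulted, so the thread stops; when every visited plan is consulted, the request overrides the stop and the thread continues.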
-
Jacques Pienaar authored
-
Jacques Pienaar authored
Move over to ODS & use pass options.
-
Nathan James authored
-
- Feb 20, 2021
-
-
Martin Storsjö authored
This makes the symlinks work properly on Windows. A similar round of cleanup was done in c41bda7f, but these tests were added after that. Differential Revision: https://reviews.llvm.org/D97089
-
Martin Storsjö authored
The spec doesn't declare it as an enum class, and being declared as an enum class breaks referring to the values as e.g. path::auto_format. Differential Revision: https://reviews.llvm.org/D97084
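A small illustration of the naming difference, using a hypothetical `demo_path` class rather than the real library type. With an unscoped enum the enumerators are members of the enclosing class, so `demo_path::auto_format` is valid; with `enum class format` the same value would have to be spelled `demo_path::format::auto_format`.

```cpp
// Hypothetical miniature of a path-like class; not the libc++ source.
struct demo_path {
    // Unscoped enum, matching how the spec declares the format enumeration.
    enum format { native_format, generic_format, auto_format };

    explicit demo_path(format f = auto_format) : fmt(f) {}
    format fmt;
};

// Refers to the enumerator through the class name, as user code written
// against the spec is entitled to do.
inline bool uses_auto_format() {
    demo_path p(demo_path::auto_format);
    return p.fmt == demo_path::auto_format;
}
```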
-
Petr Hosek authored
This can reduce the binary size because counters will no longer occupy space in the binary; instead, they will be allocated by the dynamic linker. Differential Revision: https://reviews.llvm.org/D97110
-
Stephen Kelly authored
Extend test to verify that it does not match in template instantiations. Differential Revision: https://reviews.llvm.org/D96132
-
Stephen Kelly authored
Update test to note use of lambda instead of the invisible operator(). Differential Revision: https://reviews.llvm.org/D96131
-
Craig Topper authored
[RISCV] Add another test case showing failure to use remw when the RHS has been zero extended from less than i32. NFC
-
Stephen Kelly authored
Diagnose the problem in templates in the context of the template declaration instead of in the context of all of the (possibly very many) template instantiations. Differential Revision: https://reviews.llvm.org/D96224
-
Nikita Popov authored
When one of the inputs is a wrapping range, intersect with the union of the two inputs. The union of the two inputs corresponds to the result we would get if we treated the min/max as a simple select. This fixes PR48643.
-
Sanjay Patel authored
Follow-up to: D96648 / b40fde06 ...for the special-case base calls. From the earlier commit: This is unusual in the general (non-reciprocal) case because we need an extra instruction, but that should be better for general FP reassociation and codegen. We conservatively check for "arcp" FMF here as we do with existing fdiv folds, but it is not strictly necessary to have that.
-
Sanjay Patel authored
-
Nikita Popov authored
We don't need any special handling for wrapping ranges (or empty ranges for that matter). The sub() call will already compute a correct and precise range. We only need to adjust the test expectation: We're now computing an optimal result, rather than an unsigned envelope.
-
AndreyChurbanov authored
Close mutexattr and condattr local objects to eliminate resource leaks. Differential Revision: https://reviews.llvm.org/D96892
-
Craig Topper authored
This adds the IR for this C code:

    int32_t foo(uint16_t x, int16_t y) {
      x %= y;
      return x;
    }

Note the dividend is unsigned and the divisor is signed. C type promotion rules will extend them and use a 32-bit srem, and the function returns a 32-bit result. We fail to use remw for this case. The zero-extended input has enough sign bits, but we won't consider (i64 AssertZext X, i16) in the sexti32 isel pattern. We also end up with extra shifts to zero the upper bits of the result. computeKnownBits knew the result was positive before type legalization and allowed the SIGN_EXTEND to become ZERO_EXTEND. But after promoting to i64 we no longer know that bit 31 (and all bits above it) should be 0.
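For reference, the implicit conversions in `foo` can be written out explicitly. This sketch assumes a 32-bit `int`; `foo_explicit` is a hypothetical name and the divisor must be nonzero.

```cpp
#include <cstdint>

// Same computation as foo(), with each promotion and conversion spelled out.
inline std::int32_t foo_explicit(std::uint16_t x, std::int16_t y) {
    std::int32_t dividend = static_cast<std::int32_t>(x);  // zero-extended, never negative
    std::int32_t divisor = static_cast<std::int32_t>(y);   // sign-extended
    std::int32_t rem = dividend % divisor;                  // 32-bit signed remainder
    std::uint16_t narrowed = static_cast<std::uint16_t>(rem);  // assignment back to x
    return static_cast<std::int32_t>(narrowed);             // zero-extended return value
}
```

Because the dividend is never negative, the signed remainder is never negative either, which is the fact computeKnownBits can use before type legalization.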
-
Shilei Tian authored
`sm_35` is the minimum requirement for OpenMP offloading on NVPTX devices. The current driver test case uses `sm_20`. D97003 is going to switch the minimum CUDA version to 9.2, which only supports `sm_30+`. This patch is a step toward that change. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D97120
-
Stephen Kelly authored
The normalization of matchers means that this now works in all language modes. Differential Revision: https://reviews.llvm.org/D96135
-
Daan De Meyer authored
When compiling UEFI applications, the main function is named efi_main() instead of main(). Let's exclude efi_main() from -Wmissing-prototypes as well to avoid warnings when working on UEFI applications. Differential Revision: https://reviews.llvm.org/D95746
-
Nikita Popov authored
When the optimality check fails, print the inputs, the computed range and the better range that was found. This makes it much simpler to identify the cause of the failure. Make sure that full ranges (which, unlike all the other cases, have multiple ways to construct them that all result in the same range) only print one message by handling them separately.
-
Nico Weber authored
See discussion on https://reviews.llvm.org/D93263

-flat_namespace isn't implemented yet, and neither is -undefined dynamic, so this makes -undefined pretty pointless in lld/MachO for now. But once we implement -flat_namespace (which we need to do anyways to get check-llvm to pass with lld as host linker), the code's already there.

Follow-up to https://reviews.llvm.org/D93263#2491865

Differential Revision: https://reviews.llvm.org/D96963
-
Stephen Kelly authored
Differential Revision: https://reviews.llvm.org/D97095
-
Teresa Johnson authored
Refines the fix in 3c4c2050 to only put globals whose defs were cloned into the split regular LTO module on the cloned llvm*.used globals. This avoids an issue where one of the attached values was a local that was promoted in the original module after the module was cloned. We only need to have the values defined in the new module on those globals. Fixes PR49251. Differential Revision: https://reviews.llvm.org/D97013
-
Shilei Tian authored
Same script as D95318. Test files are excluded. Reviewed By: AndreyChurbanov Differential Revision: https://reviews.llvm.org/D97088
-
Stephen Kelly authored
This reverts commit 9148302a (2019-08-22) which broke the pre-existing unit test for the matcher. Also revert commit 518b2266 (Fix the nullPointerConstant() test to get bots back to green., 2019-08-22) which incorrectly changed the test to expect the broken behavior. Differential Revision: https://reviews.llvm.org/D96665
-
Fraser Cormack authored
This patch extends the support for RVV EXTRACT_SUBVECTOR to cover those which don't align to a vector register boundary. It accomplishes this by extracting the nearest register-sized subvector (a subregister operation), then sliding the vector down with VSLIDEDOWN and extracting the subvector from the first position (a COPY operation). Since this procedure involves the use of VSCALE and multiplication, the handling of such operations is done during lowering to simplify the implementation and make use of DAG combining. This necessitated moving some helper functions from RISCVISelDAGToDAG to RISCVTargetLowering. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D96959
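The two-step decomposition can be sketched as plain index arithmetic. This is an assumption-laden model, not the lowering code: `ExtractPlan`, `plan_extract` and all parameter names are hypothetical, indices are taken to be in units of the minimum (vscale = 1) element count, and the run-time slide amount is assumed to be the leftover index scaled by VSCALE.

```cpp
#include <cstddef>

struct ExtractPlan {
    std::size_t subreg_index;  // which register-sized chunk to extract first
    std::size_t slide_amount;  // elements to slide down within that chunk
};

// first_elt:        index of the first element of the wanted subvector
// min_elts_per_reg: elements held by one vector register when vscale is 1
// vscale:           run-time scaling factor of the scalable vector
inline ExtractPlan plan_extract(std::size_t first_elt,
                                std::size_t min_elts_per_reg,
                                std::size_t vscale) {
    // The register-aligned part becomes a subregister extract; the
    // remainder becomes the slide-down distance.
    return {first_elt / min_elts_per_reg,
            (first_elt % min_elts_per_reg) * vscale};
}
```

For example, under these assumptions an extract starting at element 6 with 4 elements per register and a vscale of 2 would take register chunk 1 and slide down by 4 elements before copying out the subvector.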
-
Fraser Cormack authored
With vector mask registers only allocatable to V0 (VMV0Regs) it is relatively simple to generate code which uses multiple masks and naively requires spilling. This patch aims to improve codegen in such cases by telling LLVM it can use VRRegs to hold masks. This will prevent spilling in many cases by having LLVM copy to an available VR register. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D97055
-
David Zarzycki authored
-
Simon Pilgrim authored
recognizeBSwapOrBitReverseIdiom + collectBitParts have pattern matching to bail out early if a bswap/bitreverse pattern isn't possible - we should be able to rely on this instead without any notable change in compile time. This is part of a cleanup towards letting matchBSwapOrBitReverse/recognizeBSwapOrBitReverseIdiom use 'root' instructions that aren't ORs (FSHL/FSHRs in particular, which can be prematurely created). Differential Revision: https://reviews.llvm.org/D97056
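For context, this is the kind of OR-rooted source pattern that such idiom recognition is aimed at: a 32-bit byte swap written with shifts, masks and ORs. The function name is hypothetical and the snippet only illustrates the idiom, not the matcher.

```cpp
#include <cstdint>

// Byte swap built from shifts and masks, combined with ORs at the root.
inline std::uint32_t bswap32_idiom(std::uint32_t x) {
    return (x >> 24) |
           ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) |
           (x << 24);
}
```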
-