- May 12, 2020
-
-
David Blaikie authored
Based on @djtodoro's 2552dc53
-
zoecarver authored
Implements P0414R2: * Adds support for array types in std::shared_ptr. * Adds reinterpret_pointer_cast for shared_ptr. Re-committing now that the leaking tests are fixed. Differential Revision: https://reviews.llvm.org/D62259
-
Kamau Bridgeman authored
This patch folds redundant load immediates into a zero for instructions which recognise this as the value zero and not the register. If the load immediate is no longer in use it is then deleted. This is already done in earlier passes but the ppc-mi-peephole allows for a more general implementation. Differential Revision: https://reviews.llvm.org/D69168
-
Jonas Devlieghere authored
While debugging why TestProcessList.py failed during passive replay, I remembered that we don't serialize the arguments for ProcessInfo. This is necessary to make the test pass and to make platform process list -v behave the same during capture and replay. Differential revision: https://reviews.llvm.org/D79646
-
Jan Korous authored
Differential Revision: https://reviews.llvm.org/D78961
-
Juneyoung Lee authored
Summary: This patch makes propagatesPoison be more accurate by returning true on more bin ops/unary ops/casts/etc. The changed test in ScalarEvolution/nsw.ll was introduced by https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 . IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has no-overflow flags even if the loop isn't in the wanted form. It becomes more accurate with this patch, so think this is okay. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy Reviewed By: spatel, nikic Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D78615
-
Craig Topper authored
We have a couple main strategies for legalizing MULH. -If the vXi16 type is legal, extend to do the full i16 multiply and then shift and truncate the results. -Use unpcks to split each 128 bit lane into high and low halves.a For signed we have an extra case to split a v32i8 to v16i8 and then use the extending to v16i16 strategy. This patch proposes to use the unpck strategy instead. Which is what we already do for unsigned. This seems to be 1 instruction shorter when the RHS is constant like the idiv case. It's 1 instruction longer for the smulo case. But we're trading cross lane shuffles for inlane shuffles and a shift. Differential Revision: https://reviews.llvm.org/D79652
-
Dimitry Andric authored
Summary: In 2e24219d, a number of ARM pcrel fixups were resolved at assembly time, to solve PR44929. This only covered little-endian ARM however, so add similar fixups for big-endian ARM. Also extend the test case to cover big-endian ARM. Reviewers: hans, psmith, MaskRay Reviewed By: psmith, MaskRay Subscribers: kristof.beyls, hiraditya, danielkiss, emaste, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79774
-
Austin Kerbow authored
Differential Revision: https://reviews.llvm.org/D79761
-
Craig Topper authored
-
Sanjay Patel authored
-
Thomas Lively authored
Summary: As proposed in https://github.com/WebAssembly/simd/pull/122. Since these instructions are not yet merged to the SIMD spec proposal, this patch makes them entirely opt-in by surfacing them only through LLVM intrinsics and clang builtins. If these instructions are made official, these intrinsics and builtins should be replaced with simple instruction patterns. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D79742
-
Fangrui Song authored
-
Fangrui Song authored
gcov 4.8 (r189778) moved the exit block from the last to the second. The .gcda format is compatible with 4.7 but * decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings, and print wrong `" returned %s` for branch statistics (-b). * decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues. Also, rename "return block" to "exit block" because the latter is the appropriate term.
-
Whitney Tsang authored
Summary: As commented in the code, ProfileSummaryAnalysis is required for inliner pass to query, so this patch moved RequireAnalysisPass<ProfileSummaryAnalysis> in the recently created buildInlinerPipeline. Reviewer: mtrofin, davidxl, tejohnson, dblaikie, jdoerfert, sstefan1 Reviewed By: mtrofin, davidxl, jdoerfert Subscribers: hiraditya, steven_wu, dexonsmith, wuzish, llvm-commits, jsji Tag: LLVM Differential Revision: https://reviews.llvm.org/D79696
-
Jay Foad authored
Summary: ConstantExprs involving operations on <1 x Ty> could translate into MIR that failed to verify with: *** Bad machine code: Reading virtual register without a def *** The problem was that translate(const Constant &C, Register Reg) had recursive calls that passed the same Reg in for the translation of a subexpression, but without updating VMap for the subexpression first as translate(const Constant &C, Register Reg) expects. Fix this by using the same translateCopy helper function that we use for translating Instructions. In some cases this causes extra G_COPY MIR instructions to be generated. Fixes https://bugs.llvm.org/show_bug.cgi?id=45576 Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78378
-
Jay Foad authored
Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar Subscribers: wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78377
-
Florian Hahn authored
We should check non-dependent element types before creating a DependentSizedMatrixType. Otherwise we do not generate an error message for dependent-sized matrix types with invalid non-dependent element types, if the template is never instantiated. See the make5 struct in the tests. It also moves the SEMA template tests to clang/test/SemaTemplate/matrix-type.cpp and introduces a few more test cases.
-
Michael Kruse authored
Changed the language in LLVM_USE_LINKER to more strongly recommend LLD and to specify that the GNU gold linker is only useful if LLD is unavailable in binary form and it is the first build of LLVM. Added that LLD will help when used on ELF-based platforms. Corrected information in CMAKE_BUILD_TYPE regarding the Release build type and enabling assertions. Added option LLVM_ENABLE_ASSERTIONS and mentioned enabling this option with a Release build as an alternative to using a Debug build. Specified that the LLVM_OPTIMIZED_TABLEGEN option is only for Debug builds, that the LLVM_USE_SPLIT_DWARF option is only available on ELF host platforms, and that setting CLANG_ENABLE_STATIC_ANALYZER to OFF only slightly improves build time. These changes address comments made in D75425. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D77346
-
Jez Ng authored
This unblocks the linking of real programs, since many core system functions are only available as sub-libraries of libSystem. Differential Revision: https://reviews.llvm.org/D79228
-
James Y Knight authored
-
Haojian Wu authored
Reviewers: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79701
-
Carl Ritson authored
Summary: Modify export clustering DAG mutation to move position exports before other exports types. Reviewers: foad, arsenm, rampitec, nhaehnle Reviewed By: foad Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79670
-
Matt Arsenault authored
Merge with the new --rocm-path handling used for OpenCL. This looks for a usable set of device libraries upfront, rather than giving a generic "no such file or directory error". If any of the required bitcode libraries are missing, this will now produce a "cannot find ROCm installation." error. This differs from the existing hip specific flags by pointing to a rocm root install instead of a single directory with bitcode files. This tries to maintain compatibility with the existing the --hip-device-lib and --hip-device-lib-path flags, as well as the HIP_DEVICE_LIB_PATH environment variable, or at least the range of uses with testcases. The existing range of uses and behavior doesn't entirely make sense to me, so some of the untested edge cases change behavior. Currently the two path forms seem to have the double purpose of a search path for an arbitrary --hip-device-lib, and for finding the stock set of libraries. Since the stock set of libraries This also changes the behavior when multiple paths are specified, and only takes the last one (and the environment variable only handles a single path). If --hip-device-lib is used, it now only treats --hip-device-lib-path as the search path for it, and does not attempt to find the rocm installation. If not, --hip-device-lib-path and the environment variable are used as the directory to search instead of the rocm root based path. This should also automatically fix handling of the options to use wave64.
-
Matt Arsenault authored
The current install situation is a mess, but I'm working on fixing it. Search for the target layout instead of one of the N options that exist today.
-
Reid Kleckner authored
The variable renaming change did not handle this variable well.
-
Benjamin Kramer authored
This avoids unused variable warnings in Release builds.
-
Kristof Beyls authored
Differential Revision: https://reviews.llvm.org/D79623
-
Sam McCall authored
This reverts commit 80d133b2. Per Stephan Herhut: The canonicalizer pattern that was added creates forms of the subview op that cannot be lowered. This is shown by failing Tensorflow XLA tests such as: tensorflow/compiler/xla/service/mlir_gpu/tests:abs.hlo.test Will provide more details offline, they rely on logs from private CI.
-
Melanie Blower authored
Summary: Erroneous error diagnostic observed in VS2017 <numeric> header Also correction to propagate usesFPIntrin from template func to instantiation. Reviewers: rjmccall, erichkeane (no feedback received) Differential Revision: https://reviews.llvm.org/D79631
-
Simon Pilgrim authored
narrowShuffleMaskElts already has the fast-path for scale == 1, no need to reimplement it here.
-
Yaxun (Sam) Liu authored
recommit c77a4078 with fix https://reviews.llvm.org/D77954 caused regressions due to diagnostics in implicit host device functions. For now, it seems the most feasible workaround is to treat implicit host device function and explicit host device function differently. Basically in device compilation for implicit host device functions, keep the old behavior, i.e. give host device candidates and wrong-sided candidates equal preference. For explicit host device functions, favor host device candidates against wrong-sided candidates. The rationale is that explicit host device functions are blessed by the user to be valid host device functions, that is, they should not cause diagnostics in both host and device compilation. If diagnostics occur, user is able to fix them. However, there is no guarantee that implicit host device function can be compiled in device compilation, therefore we need to preserve its overloading resolution in device compilation. Differential Revision: https://reviews.llvm.org/D79526
-
Sam Parker authored
Don't use truncs are users because sometimes they're free too.
-
Simon Pilgrim authored
Last part of PR22984 - avoid the zero-register dependency if optimizing for size
-
Simon Pilgrim authored
-
Simon Pilgrim authored
Added explicit StringRef.h include as we need the full definition for several inline functions in DebugCounter.h.
-
Pierre-vh authored
getARMVPTBlockMask was an outdated function that only handled basic block masks: T, TT, TTT and TTTT. This worked fine before the MVE VPT Block Insertion Pass improvements as it was the only kind of masks that it could generate, but now it can generate more complex masks that uses E predicates, so it's dangerous to use that function to calculate VPT/VPST block masks. I replaced it with 2 different functions: - expandPredBlockMask, in ARMBaseInfo. This adds an "E" or "T" at the end of an existing PredBlockMask. - recomputeVPTBlockMask, in Thumb2InstrInfo. This takes an iterator to a VPT/VPST instruction and recomputes its block mask by looking at the predicated instructions that follows it. This should be used to recompute a block mask after removing/adding a predicated instruction to the block. The expandPredBlockMask function is pretty much imported from the MVE VPT Blocks pass. I had to change the ARMLowOverheadLoops and MVEVPTBlocks passes as well so they could use these new functions. Differential Revision: https://reviews.llvm.org/D78201
-
Pierre-vh authored
Differential Revision: https://reviews.llvm.org/D76847
-
David Zarzycki authored
Operating systems are best effort by default, so we cannot assume that sleep-like APIs return as soon as we'd like. Even if a sleep-like API returns when we want it to, the potential for preemption means that attempts to measure time are subject to delays.
-