- May 13, 2020
-
-
David Green authored
Under MVE a vdup will always take a gpr register, not a floating point value. During DAG combine we convert the types to a bitcast to an integer in an attempt to fold the bitcast into other instructions. This is OK, but only works inside the same basic block. To do the same trick across a basic block boundary we need to convert the type in codegenprepare, before the splat is sunk into the loop. This adds a convertSplatType function to codegenprepare to do that, putting bitcasts around the splat to force the type to an integer. There is then some adjustment to the code in shouldSinkOperands to handle the extra bitcasts. Differential Revision: https://reviews.llvm.org/D78728
-
Carl Ritson authored
Summary: When removing barrier edges on exports then dependencies need to be propagated. Reviewers: foad Reviewed By: foad Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79855
-
David Green authored
Similar to fmul/fadd, we can sink a splat into a loop containing a fma in order to use more register instruction variants. For that there are also adjustments to the sinking code to handle more than 2 arguments. Differential Revision: https://reviews.llvm.org/D78386
-
Pierre-vh authored
This patch adds a new TTI hook to allow targets to tell LSR that a chain including some instruction is already profitable and should not be optimized. This patch also adds an implementation of this TTI hook for ARM so LSR doesn't optimize chains that include the VCTP intrinsic. Differential Revision: https://reviews.llvm.org/D79418
-
Simon Wallis authored
Summary: In the assembler or inline assembler, attempting to use an invalid fixup type gives a crash with a segmentation fault. __attribute__((naked)) void foo(void) { __asm__("mov r9, :lower16:bar(prel31)"); } This should give a proper error message when building for ARM or Thumb. This brings it in line with AARCH64. This fixes all 8 instances of llvm_unreachable("Unsupported Modifier"); in ARM/MCTargetDesc/ARMELFObjectWriter.cpp. A test is provided for each instance. Reviewers: llvm-commits, MarkMurrayARM Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, danielkiss Tags: #llvm Differential Revision: https://reviews.llvm.org/D79782 Change-Id: I6971ba37f129cc453568fe71514ccb2ac9d16831
-
Sjoerd Meijer authored
This was reverted because of a miscompilation. At closer inspection, the problem was actually visible in a changed llvm regression test too. This one-line follow up fix/recommit will splat the IV, which is what we are trying to avoid if unnecessary in general, if tail-folding is requested even if all users are scalar instructions after vectorisation. Because with tail-folding, the splat IV will be used by the predicate of the masked loads/stores instructions. The previous version omitted this, which caused the miscompilation. The original commit message was: If tail-folding of the scalar remainder loop is applied, the primary induction variable is splat to a vector and used by the masked load/store vector instructions, thus the IV does not remain scalar. Because we now mark that the IV does not remain scalar for these cases, we don't emit the vector IV if it is not used. Thus, the vectoriser produces less dead code. Thanks to Ayal Zaks for the direction how to fix this.
-
Ehud Katz authored
This is a reimplementation of the `orderNodes` function, as the old implementation didn't take into account all cases. Fix PR41509 Differential Revision: https://reviews.llvm.org/D79037
-
Dmitry Preobrazhensky authored
See bug 45830: https://bugs.llvm.org/show_bug.cgi?id=45830 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D79585
-
Sourabh Singh Tomar authored
This patch extends DIModule Debug metadata in LLVM to support Fortran modules. DIModule is extended to contain File and Line fields, these fields will be used by Flang FE to create debug information necessary for representing Fortran modules at IR level. Furthermore DW_TAG_module is also extended to contain these fields. If these fields are missing, debuggers like GDB won't be able to show Fortran modules information correctly. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D79484
-
Yevgeny Rouban authored
Hide the method that allows setting probability for particular edge and introduce a public method that sets probabilities for all outgoing edges at once. Setting individual edge probability is error prone. More over it is difficult to check that the total probability is 1.0 because there is no easy way to know when the user finished setting all the probabilities. Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D79396
-
Qiu Chaofan authored
xsnegdp, xsabsdp and xsnabsdp can be used to operate on f32 operand. This patch adds the missing patterns since we prefer VSX instructions when available. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D75344
-
Qiu Chaofan authored
Legalizer should respect both command-line options or SDNode-level fast-math flags. Also, this patch propagates other flags during custom simplifying. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D79074
-
Kang Zhang authored
Summary: The ppc-early-ret pass use the addReg() to add operand to the new instruction, it can't reserve the flag of old operand. This has caused machine verfications failed. This patch use add() to instead of addReg(). Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D77997
-
KAWASHIMA Takahiro authored
Fixes PR41696 The loop-reroll pass generates an invalid IR (or its assertion fails in debug build) if values of the base instruction and other root instructions (terms used in the loop-reroll pass) are used outside the loop block. See IRs written in PR41696 as examples. The current implementation of the loop-reroll pass can reroll only loops that don't have values that are used outside the loop, except reduced values (the last values of reduction chains). This is described in the comment of the `LoopReroll::reroll` function. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1600 This is checked in the `LoopReroll::DAGRootTracker::validate` function. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1393 However, the base instruction and other root instructions skip this check in the validation loop. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1229 Moving the check in front of the skip is the logically simplest fix. However, inserting the check in an earlier stage is better in terms of compilation time of unrerollable loops. This fix inserts the check for the base instruction into the function to validate possible base/root instructions. Check for other root instructions is unnecessary because they don't match any base instructions if they have uses outside the loop. Differential Revision: https://reviews.llvm.org/D79549
-
Johannes Doerfert authored
For AAReturnedValues we treated new and existing information differently in the updateImpl. Only the latter was properly analyzed and categorized. The former was thought to be analyzed in the subsequent update. Since the Attributor does not support "self-updates" we need to make sure the state is "stable" after each updateImpl invocation. That is, if the surrounding information does not change, the state is valid. Now we make sure all return values have been handled and properly categorized each iteration. We might not update again if we have not requested a non-fix attribute so we cannot "wait" for the next update to analyze a new return value. Bug reported by @sdmitriev.
-
Juneyoung Lee authored
Summary: This fixes PR45885 by fixing isGuaranteedNotToBeUndefOrPoison so it does not look into dominating branch conditions of V when V is an instruction in an unreachable block. Reviewers: spatel, nikic, lebedev.ri Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79790
-
Zequan Wu authored
We want to add a way to avoid merging identical calls so as to keep the separate debug-information for those calls. There is also an asan usecase where having this attribute would be beneficial to avoid alternative work-arounds. Here is the link to the feature request: https://bugs.llvm.org/show_bug.cgi?id=42783. `nomerge` is different from `noline`. `noinline` prevents function from inlining at callsites, but `nomerge` prevents multiple identical calls from being merged into one. This patch adds `nomerge` to disable the optimization in IR level. A followup patch will be needed to let backend understands `nomerge` and avoid tail merge at backend. Reviewed By: asbirlea, rnk Differential Revision: https://reviews.llvm.org/D78659
-
Stanislav Mekhanoshin authored
We can produce such vectors in the Promote Alloca pass, but we are unable to use movrel to operate it and lower via scratch. Making it legal makes SI_INDIRECT patterns work. There is more work to do in subsequent changes: 1. We initialize m0 twice to access each dword. It shall be possible to only do it once and increment base register number instead. 2. We also need v16i64/v16f64 but these first need to be added to tablegen. Differential Revision: https://reviews.llvm.org/D79808
-
Jan Korous authored
Differential Revision: https://reviews.llvm.org/D79809
-
Sanjay Patel authored
SDAG suffers when it can't see that a funnel operand is a splat value (due to single-basic-block visibility), so invert the normal loop hoisting rules to move a splat op closer to its use. This would be part 1 of an enhancement similar to D63233. This is needed to re-fix PR37426: https://bugs.llvm.org/show_bug.cgi?id=37426 ...because we got better at canonicalizing IR to funnel shift intrinsics. The existing CGP code for shift opcodes is likely overstepping what it was intended to do, so that will be fixed in a follow-up. Differential Revision: https://reviews.llvm.org/D79718
-
Alexey Lapshin authored
-
Justin Hibbits authored
Summary: The SPE doesn't have a 'fma' instruction, so the intrinsic becomes a libcall. It really should become an expansion to two instructions, but for some reason the compiler doesn't think that's as optimal as a branch. Since this lowering is done after CTR is allocated for loops, tell the optimizer that CTR may be used in this case. This prevents a "Invalid PPC CTR loop!" assertion in the case that a fma() function call is used in a C/C++ file, and clang converts it into an intrinsic. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D78668
-
Alexey Lapshin authored
-
- May 12, 2020
-
-
Alexey Lapshin authored
Summary: This patch refactors handling of VarArgs in X86TargetLowering::LowerFormalArguments. That refactoring was requested while reviewing D69372. Code related to varargs handling is removed from X86TargetLowering::LowerFormalArguments and is divided into smaller routines. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D74794
-
Fangrui Song authored
[TargetLoweringObjectFileImpl] Produce .text.hot. instead of .text.hot for -fno-unique-section-names GNU ld's internal linker script uses (https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=add44f8d5c5c05e08b11e033127a744d61c26aee) .text : { *(.text.unlikely .text.*_unlikely .text.unlikely.*) *(.text.exit .text.exit.*) *(.text.startup .text.startup.*) *(.text.hot .text.hot.*) *(SORT(.text.sorted.*)) *(.text .stub .text.* .gnu.linkonce.t.*) /* .gnu.warning sections are handled specially by elf.em. */ *(.gnu.warning) } Because `*(.text.exit .text.exit.*)` is ordered before `*(.text .text.*)`, in a -ffunction-sections build, the C library function `exit` will be placed before other functions. gold's `-z keep-text-section-prefix` has the same problem. In lld, `-z keep-text-section-prefix` recognizes `.text.{exit,hot,startup,unlikely,unknown}.*`, but not `.text.{exit,hot,startup,unlikely,unknown}`, to avoid the strange placement problem. In -fno-function-sections or -fno-unique-section-names mode, a function whose `function_section_prefix` is set to `.exit"` will go to the output section `.text` instead of `.text.exit` when linked by lld. To address the problem, append a dot to become `.text.exit.` Reviewed By: grimar Differential Revision: https://reviews.llvm.org/D79600
-
Sergey Dmitriev authored
Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: jdoerfert Subscribers: hiraditya, uenoku, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79801
-
David Blaikie authored
Based on @djtodoro's 2552dc53
-
Kamau Bridgeman authored
This patch folds redundant load immediates into a zero for instructions which recognise this as the value zero and not the register. If the load immediate is no longer in use it is then deleted. This is already done in earlier passes but the ppc-mi-peephole allows for a more general implementation. Differential Revision: https://reviews.llvm.org/D69168
-
Juneyoung Lee authored
Summary: This patch makes propagatesPoison be more accurate by returning true on more bin ops/unary ops/casts/etc. The changed test in ScalarEvolution/nsw.ll was introduced by https://github.com/llvm/llvm-project/commit/a19edc4d15b0dae0210b90615775edd76f021008 . IIUC, the goal of the tests is to show that iv.inc's SCEV expression still has no-overflow flags even if the loop isn't in the wanted form. It becomes more accurate with this patch, so think this is okay. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, sanjoy Reviewed By: spatel, nikic Subscribers: regehr, nlopes, efriedma, fhahn, javed.absar, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D78615
-
Craig Topper authored
We have a couple main strategies for legalizing MULH. -If the vXi16 type is legal, extend to do the full i16 multiply and then shift and truncate the results. -Use unpcks to split each 128 bit lane into high and low halves.a For signed we have an extra case to split a v32i8 to v16i8 and then use the extending to v16i16 strategy. This patch proposes to use the unpck strategy instead. Which is what we already do for unsigned. This seems to be 1 instruction shorter when the RHS is constant like the idiv case. It's 1 instruction longer for the smulo case. But we're trading cross lane shuffles for inlane shuffles and a shift. Differential Revision: https://reviews.llvm.org/D79652
-
Dimitry Andric authored
Summary: In 2e24219d, a number of ARM pcrel fixups were resolved at assembly time, to solve PR44929. This only covered little-endian ARM however, so add similar fixups for big-endian ARM. Also extend the test case to cover big-endian ARM. Reviewers: hans, psmith, MaskRay Reviewed By: psmith, MaskRay Subscribers: kristof.beyls, hiraditya, danielkiss, emaste, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79774
-
Austin Kerbow authored
Differential Revision: https://reviews.llvm.org/D79761
-
Craig Topper authored
-
Thomas Lively authored
Summary: As proposed in https://github.com/WebAssembly/simd/pull/122. Since these instructions are not yet merged to the SIMD spec proposal, this patch makes them entirely opt-in by surfacing them only through LLVM intrinsics and clang builtins. If these instructions are made official, these intrinsics and builtins should be replaced with simple instruction patterns. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D79742
-
Fangrui Song authored
gcov 4.8 (r189778) moved the exit block from the last to the second. The .gcda format is compatible with 4.7 but * decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings, and print wrong `" returned %s` for branch statistics (-b). * decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues. Also, rename "return block" to "exit block" because the latter is the appropriate term.
-
Whitney Tsang authored
Summary: As commented in the code, ProfileSummaryAnalysis is required for inliner pass to query, so this patch moved RequireAnalysisPass<ProfileSummaryAnalysis> in the recently created buildInlinerPipeline. Reviewer: mtrofin, davidxl, tejohnson, dblaikie, jdoerfert, sstefan1 Reviewed By: mtrofin, davidxl, jdoerfert Subscribers: hiraditya, steven_wu, dexonsmith, wuzish, llvm-commits, jsji Tag: LLVM Differential Revision: https://reviews.llvm.org/D79696
-
Jay Foad authored
Summary: ConstantExprs involving operations on <1 x Ty> could translate into MIR that failed to verify with: *** Bad machine code: Reading virtual register without a def *** The problem was that translate(const Constant &C, Register Reg) had recursive calls that passed the same Reg in for the translation of a subexpression, but without updating VMap for the subexpression first as translate(const Constant &C, Register Reg) expects. Fix this by using the same translateCopy helper function that we use for translating Instructions. In some cases this causes extra G_COPY MIR instructions to be generated. Fixes https://bugs.llvm.org/show_bug.cgi?id=45576 Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78378
-
Jay Foad authored
Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar Subscribers: wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78377
-
James Y Knight authored
-
Carl Ritson authored
Summary: Modify export clustering DAG mutation to move position exports before other exports types. Reviewers: foad, arsenm, rampitec, nhaehnle Reviewed By: foad Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79670
-