- Mar 18, 2020
-
Florian Hahn authored
The latest improvements to VPValue printing make this mapping clear when printing the operand. Printing the mapping separately is not required any longer. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76375
-
Florian Hahn authored
Now that printing VPValues uses the underlying IR value name, if available, recording the underlying value here improves printing. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76374
-
Sanjay Patel authored
-
Sanjay Patel authored
This is copied from the suggested text by @regehr in: https://bugs.llvm.org/show_bug.cgi?id=20895 The way forward was not clear for several years, but now that we have 'freeze' and Alive2, the behavior should be documented. Also see comments in D76332.
-
Simon Pilgrim authored
-
Eli Friedman authored
The existence of the class is more confusing than helpful, I think; the commonality is mostly just "GEP is legal", which can be queried using APIs on GetElementPtrInst. Differential Revision: https://reviews.llvm.org/D75660
-
Sanjay Patel authored
-
Craig Topper authored
[SelectionDAGBuilder][FPEnv] Take into account SelectionDAG continuous CSE when setting the nofpexcept flag for constrained intrinsics
SelectionDAG CSEs nodes based on their result type and operands, but not their flags. The flags are expected to be intersected when nodes are CSEd. In SelectionDAGBuilder, for FP nodes we manage both the fast math flags and the nofpexcept flag after the nodes have already been CSEd when they were created with getNode. Managing the fast math flags before the constrained nodes prevents the nofpexcept management from working correctly. This commit moves the FMF handling for constrained intrinsics into their visitor and disables the common FMF handling for these nodes. Differential Revision: https://reviews.llvm.org/D75224
-
Simon Pilgrim authored
-
Simon Pilgrim authored
-
Simon Pilgrim authored
-
lewis-revill authored
This patch generates TableGen descriptions for the specified register banks, which contain a list of register sizes corresponding to the available HwModes. The appropriate size is used during codegen according to the current HwMode. As the HwMode is not known at TableGen time, it is set upon construction of the RegisterBankInfo class. Targets simply need to provide the HwMode argument to the <target>GenRegisterBankInfo constructor. The RISC-V RegisterBankInfo constructor has been updated accordingly (plus an unused argument removed). Differential Revision: https://reviews.llvm.org/D76007
-
lewis-revill authored
This patch rewrites the RegisterBankEmitter class to derive RegisterClassHierarchy from CodeGenTarget::getRegBank() rather than constructing our own copy. All are now accessed through a const reference. Differential Revision: https://reviews.llvm.org/D76006
-
Eli Friedman authored
This is fixing up various places that use the implicit TypeSize->uint64_t conversion. The new overloads in MemoryLocation.h are already used in various places that construct a MemoryLocation from a TypeSize, including MemorySSA. (They were using the implicit conversion before.) Differential Revision: https://reviews.llvm.org/D76249
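A simplified illustration of why the implicit conversion is being removed (the toy type and method here are illustrative only, not LLVM's real TypeSize API): once scalable vectors exist, converting a size to uint64_t silently discards the fact that the value is only a compile-time minimum.

    #include <cassert>
    #include <cstdint>

    struct ToySize {
      uint64_t MinBytes;   // size known at compile time
      bool Scalable;       // true for scalable vectors: real size = MinBytes * vscale
      // The kind of implicit conversion being phased out: it is only
      // meaningful when the size is fixed.
      operator uint64_t() const {
        assert(!Scalable && "runtime-scaled size treated as a fixed size");
        return MinBytes;
      }
    };

    uint64_t fixedSizeOnly(ToySize TS) {
      return TS;   // compiles silently, but asserts for scalable sizes
    }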
-
Simon Pilgrim authored
[ValueTracking] Add computeKnownBits DemandedElts support to EXTRACTELEMENT/OR/BSWAP/BITREVERSE instructions (PR36319)
These are all covered by the bswap/bitreverse vector tests.
-
Nemanja Ivanovic authored
As pointed out in https://bugs.llvm.org/show_bug.cgi?id=45232, this code can end up shifting a 64-bit unsigned value left by 64 bits. Although this works as expected on some platforms, it is definitely UB. This patch removes the UB and adds the associated test case. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45232
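A minimal sketch of the kind of fix involved (the helper name is made up, not taken from the patch): in C++, shifting a 64-bit value left by 64 or more bits is undefined behaviour, so that case has to be handled separately.

    #include <cstdint>

    // Hypothetical helper: NumBits may legitimately reach 64 here.
    uint64_t lowBitsSet(unsigned NumBits) {
      if (NumBits >= 64)
        return ~UINT64_C(0);                 // avoid 'x << 64', which is UB
      return (UINT64_C(1) << NumBits) - 1;   // safe: NumBits is 0..63
    }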
-
Simon Pilgrim authored
Fixes deprecation warning in EXPENSIVE_CHECKS builds.
-
Simon Pilgrim authored
Shows missing DemandedElts support (PR36319)
-
Jessica Paquette authored
This ports some combines from DAGCombiner.cpp which perform some trivial transformations on instructions with undef operands. Not having these can make it extremely annoying to find out where we differ from SelectionDAG by looking at existing lit tests. Without them, we tend to produce pretty bad code generation when we run into instructions which use undef operands. Also remove the nonpow2_store_narrowing testcase from arm64-fallback.ll, since we no longer fall back on the add. Differential Revision: https://reviews.llvm.org/D76339
-
Jakub Kuderski authored
Reviewers: asbirlea, brzycki, NutshellySima, grosser Reviewed By: asbirlea, NutshellySima Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76340
-
Jin Lin authored
Summary: This change allows machine outlining to be applied N times, where N is specified by a compiler option; by default N is 1. The motivation is that repeated machine outlining can further reduce code size. Please refer to the presentation "Improving Swift Binary Size via Link Time Optimization" from the 2019 LLVM Developers' Meeting. Reviewers: aschwaighofer, tellenbach, paquette Reviewed By: paquette Subscribers: tellenbach, hiraditya, llvm-commits, jinlin Tags: #llvm Differential Revision: https://reviews.llvm.org/D71027
-
Florian Hahn authored
When an underlying value is available, we can use its name for printing, as discussed in D73078. Reviewers: rengolin, hsaito, Ayal, gilr Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D76200
-
Simon Tatham authored
Summary: This is another set of instructions too complicated to be sensibly expressed in IR by anything short of a target-specific intrinsic. Given input vectors a,b, the instruction generates intermediate values 2*(a[0]*b[0]+a[1]*b[1]), 2*(a[2]*b[2]+a[3]*b[3]), etc; takes the high half of each double-width value, and overwrites half the lanes in the output vector c, which you therefore have to provide the input value of. Optionally you can swap the elements of b so that the intermediates are things like a[0]*b[1]+a[1]*b[0]; optionally you can round to nearest when taking the high half; and optionally you can take the difference rather than the sum of the two products. Finally, saturation is applied when converting back to a single-width vector lane. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: miyuki Subscribers: kristof.beyls, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76359
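A rough scalar sketch of the arithmetic described above for one output lane of 16-bit elements, ignoring the optional exchange, rounding, and subtract variants (the function and variable names are illustrative, not the intrinsic itself):

    #include <algorithm>
    #include <cstdint>

    int16_t qdmladhLane(int16_t a0, int16_t a1, int16_t b0, int16_t b1) {
      // Double-width intermediate: 2 * (a0*b0 + a1*b1).
      int64_t Wide = 2 * (int64_t(a0) * b0 + int64_t(a1) * b1);
      // Saturate to the double-width range, then keep only the high half,
      // which is what lands in the single-width output lane.
      Wide = std::min<int64_t>(std::max<int64_t>(Wide, INT32_MIN), INT32_MAX);
      return static_cast<int16_t>(Wide >> 16);
    }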
-
Nico Weber authored
-
Sam Parker authored
Run the update script on one of the loop unroll tests.
-
Matt Arsenault authored
This isn't really usable, and requires using the -amdgpu-fixed-function-abi flag to work. Assumes a uniform call target, and will hit a verifier error if the call target ends up in a VGPR. Also doesn't attempt to do anything sensible for the reported register/stack usage.
-
Matt Arsenault authored
This reverts commit 9bca8fc4. Rearrange handling to avoid changing the instruction in the case where it's going to be erased and replaced with undef.
-
Piotr Sobczak authored
Summary: For the case where "done" bits on existing exports are removed by unifyReturnBlockSet(), unify all return blocks - even the uniformly reached ones. We do not want to end up with a non-unified, uniformly reached block containing a normal export with the "done" bit cleared. That case is believed to be rare - possible with infinite loops in pixel shaders. This is a fix for D71192. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76364
-
Nico Weber authored
-
Simon Pilgrim authored
-
Nico Weber authored
This reverts commit 4060016f and re-merges c5b81466.
-
Sander de Smalen authored
Avoid transforming:
    %0 = bitcast i8* %base to <vscale x 16 x i8>*
    %1 = getelementptr <vscale x 16 x i8>, <vscale x 16 x i8>* %0, i64 1
into:
    %0 = getelementptr i8, i8* %base, i64 16
    %1 = bitcast i8* %0 to <vscale x 16 x i8>*
Reviewers: efriedma, ctetreau Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D76236
-
Chris Bowler authored
This is the first of a series of patches that add caller support for by-value arguments. This patch adds support for arguments that are passed in a single GPR. There are 3 cases not yet handled:
- The by-value argument is larger than a single register.
- There are no remaining GPRs, even though the by-value argument would otherwise fit in a single GPR.
- The by-value argument requires alignment greater than the register width.
Future patches will add support for these cases, as well as for the corresponding callee handling (in LowerFormalArguments_AIX). Differential Revision: https://reviews.llvm.org/D75863
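A minimal source-level sketch of the case this patch handles (the struct and function names are made up): an aggregate small enough to fit in one GPR, passed by value at a call site.

    // Fits in a single 32-bit GPR.
    struct Small { char Bytes[4]; };

    // Callee side (LowerFormalArguments_AIX) is left to future patches.
    int useSmall(Small S) { return S.Bytes[0]; }

    // The caller side lowered by this patch: S is passed by value in one GPR.
    int caller(Small S) { return useSmall(S); }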
-
Simon Pilgrim authored
[InstCombine][X86] Add additional demandedelts style test for in-range variable per-element shift amounts (PR40391)
If we've shuffled the shift amount, some of the (undemanded) elements may have become undef - this should be handled by the missing support in PR36319.
-
Mehdi Amini authored
-
Mehdi Amini authored
The constructor of Expected<T> expects a T&&, but gcc-7.5 apparently does not infer an rvalue in this context.
-
Roman Lebedev authored
Summary: As noted in [[ https://bugs.llvm.org/show_bug.cgi?id=45201 | PR45201 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=10090 | PR10090 ]], SCEV doesn't always avoid recursive algorithms, and that causes issues with large expression depths and/or smaller stack sizes. In the `SCEVExpander::isHighCostExpansion*()` case, the refactoring to avoid recursion is rather idiomatic. We simply need to place the root expr into a vector and iterate over the vector's elements, accounting for the cost of each one and adding new exprs at the end of the vector, thus achieving recursion-less traversal. The order in which we visit exprs doesn't matter here, so we are fine with the most basic approach of using SmallVector and inserting/extracting from the back, which incidentally is the same depth-first traversal that we were doing recursively before. Reviewers: mkazantsev, reames, wmi, ekatz Reviewed By: mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76273
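A simplified sketch of the worklist pattern described above (not the actual SCEVExpander code; the toy Expr type is made up): seed the root into a vector, then keep popping from the back while pushing operands, which gives an iterative depth-first walk with no recursion.

    #include <vector>

    struct Expr {
      unsigned Cost = 1;
      std::vector<const Expr *> Operands;
    };

    unsigned totalCost(const Expr *Root) {
      unsigned Cost = 0;
      std::vector<const Expr *> Worklist{Root};   // SmallVector in LLVM proper
      while (!Worklist.empty()) {
        const Expr *E = Worklist.back();
        Worklist.pop_back();
        Cost += E->Cost;   // account for this expression
        Worklist.insert(Worklist.end(), E->Operands.begin(), E->Operands.end());
      }
      return Cost;
    }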
-
Oliver Stannard authored
When optimising for code size at the expense of performance, it is often worth saving and restoring some of r0-r3, if IPRA will be able to take advantage of them. This doesn't cost any extra code size if we already have a PUSH/POP pair, and increases the number of available registers across any calls to the function. We already have an optimisation which tries to fold the subtract/add of the SP into the PUSH/POP by using extra registers, which somewhat conflicts with this. I've made the new optimisation less aggressive in cases where the existing one is likely to trigger, which gives better results than either of these optimisations by themselves. Differential revision: https://reviews.llvm.org/D69936
-
Guillaume Chatelet authored
Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76348
-
Kang Zhang authored
-