- Feb 03, 2018
-
David Green authored
In InstCombine, this allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities. Differential Revision: https://reviews.llvm.org/D42424 llvm-svn: 324174
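As a rough illustration (not taken from the patch's own tests), this is the kind of narrowing that can now happen even when the data layout declares i32 as a legal integer width but says nothing about i16:

  target datalayout = "e-n32:64"

  define i16 @narrow_add(i16 %a, i16 %b) {
    %za = zext i16 %a to i32
    %zb = zext i16 %b to i32
    %add = add i32 %za, %zb
    %tr = trunc i32 %add to i16
    ret i16 %tr
  }

  ; Expected to shrink to a single i16 operation:
  ;   %add = add i16 %a, %b
  ;   ret i16 %add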
-
- Feb 02, 2018
-
Sanjay Patel authored
Without extra instructions and uses, swapMayExposeCSEOpportunities() would change the icmp (as seen in the check lines), so we were not actually testing patterns that should be handled by D41480. llvm-svn: 324143
-
Sanjay Patel authored
llvm-svn: 324109
-
- Feb 01, 2018
-
Sanjay Patel authored
This is the enhancement suggested in D42536 to fix a shortcoming in regular InstCombine's canEvaluate* functionality. When we have multiple uses of a value, but they're all in one instruction, we can allow that expression to be narrowed or widened for the same cost as a single-use value. AFAICT, this can only matter for multiply: sub/and/or/xor/select would be simplified away if the operands are the same value; add becomes shl; shifts with a variable shift amount aren't handled. Differential Revision: https://reviews.llvm.org/D42739 llvm-svn: 324014
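A sketch of the multiply case described above (function and value names are made up): both uses of the zext live in one instruction, so the expression can be narrowed as if it were single-use:

  define i8 @narrow_square(i8 %x) {
    %z = zext i8 %x to i32
    %sq = mul i32 %z, %z          ; two uses of %z, but both in this one mul
    %t = trunc i32 %sq to i8
    ret i8 %t
  }

  ; Expected narrowed form:
  ;   %sq = mul i8 %x, %x
  ;   ret i8 %sq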
-
David Green authored
Looks like it's causing timeouts on at least the ppc64le buildbots. llvm-svn: 323959
-
David Green authored
In InstCombine, this allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities. Differential Revision: https://reviews.llvm.org/D42424 llvm-svn: 323951
-
- Jan 31, 2018
-
Marek Olsak authored
Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D41663 llvm-svn: 323908
-
Sanjay Patel authored
llvm-svn: 323882
-
Sanjay Patel authored
llvm-svn: 323881
-
- Jan 26, 2018
-
Vedant Kumar authored
A cast from A to B is eliminable if its result is cast to C, and if the pair of casts could just be expressed as a single cast. E.g. here, %c1 is eliminable:
  %c1 = zext i16 %A to i32
  %c2 = sext i32 %c1 to i64
InstCombine optimizes away eliminable casts. This patch teaches it to insert a dbg.value intrinsic pointing to the final result, so that local variables pointing to the eliminable result are preserved. Differential Revision: https://reviews.llvm.org/D42566 llvm-svn: 323570
-
- Jan 25, 2018
-
Sanjay Patel authored
This is guarded by shouldChangeType(), so the tests show that we don't do the fold if the narrower type is not legal. Note that there is a proposal (D42424) that would change the results for the specific cases shown in these tests. That difference is also discussed in PR35792: https://bugs.llvm.org/show_bug.cgi?id=35792
Alive proofs for the cases handled here as well as the bitwise logic binops that we should already do better on:
  https://rise4fun.com/Alive/c97
  https://rise4fun.com/Alive/Lc5E
  https://rise4fun.com/Alive/kdf
llvm-svn: 323437
-
Sanjay Patel authored
llvm-svn: 323436
-
- Jan 24, 2018
-
Sanjay Patel authored
The only part of the datalayout that should matter for these tests is the part that specifies the legal int widths ('n*'). But there was a bug - that part of the string was not correctly separated with the expected '-' character, so we were testing as if there were no legal int widths at all. Removed the leading cruft so we have some legal ints to test with. I noticed this while testing a potential change to the way we transform shifts and sexts in D42424. llvm-svn: 323377
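For context, a sketch of the datalayout component in question (this is not the exact string from the affected tests): legal integer widths are declared with an 'n' component, and each component must be separated from the previous one with '-':

  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

  ; If the '-' in front of "n8:16:32:64" goes missing, the 'n' spec is no
  ; longer its own component, and (as described above) the tests behaved as
  ; if no legal integer widths had been declared at all.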
-
- Jan 21, 2018
-
Sanjay Patel authored
...when the shift is known to not overflow with the matching signed-ness of the division. This closes an optimization gap caused by canonicalizing mul by power-of-2 to shl as shown in PR35709: https://bugs.llvm.org/show_bug.cgi?id=35709 Patch by Anton Bikineev! Differential Revision: https://reviews.llvm.org/D42032 llvm-svn: 323068
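A small illustration of the PR35709-style gap (assumed shape, not copied from the patch's tests): a multiply by 4 is canonicalized to a shift, and the nsw flag is what proves the following signed division can be removed:

  define i32 @mul_then_div(i32 %x) {
    %shl = shl nsw i32 %x, 2     ; canonical form of mul nsw i32 %x, 4
    %div = sdiv i32 %shl, 4
    ret i32 %div
  }

  ; Because the shift cannot overflow in the signed sense (nsw), the sdiv by 4
  ; undoes it exactly, so the function is expected to fold to ret i32 %x.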
-
- Jan 20, 2018
-
Sanjay Patel authored
This fold is proposed in D42032. llvm-svn: 323043
-
- Jan 19, 2018
-
Daniel Neilson authored
Summary:
This is a resurrection of work first proposed and discussed in Aug 2015:
  http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two.
This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we:
  1) Remove the alignment argument.
  2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal.
For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns, so some manual checking and updating will be required:
  s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
  s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
  s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
  s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
  s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
  s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
  s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
  s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
  s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
  s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
  s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
  s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g
The remaining changes in the series will:
  Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments.
  Step 3) Update Clang to use the new IRBuilder API.
  Step 4) Update Polly to use the new IRBuilder API.
  Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead.
  Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.
Reviewers: pete, hfinkel, lhames, reames, bollu
Reviewed By: reames
Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits
Differential Revision: https://reviews.llvm.org/D41675
llvm-svn: 322965
-
John Brawn authored
Three (or more) operand getelementptrs could plausibly also be handled, but handling only two-operand fits in easily with the existing BinaryOperator handling. Differential Revision: https://reviews.llvm.org/D39958 llvm-svn: 322930
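One plausible shape of the two-operand getelementptr case (hypothetical example; the mirrored form with a common index and differing pointers may be handled the same way):

  define i32* @select_of_geps(i1 %c, i32* %p, i64 %a, i64 %b) {
    %g1 = getelementptr i32, i32* %p, i64 %a
    %g2 = getelementptr i32, i32* %p, i64 %b
    %sel = select i1 %c, i32* %g1, i32* %g2
    ret i32* %sel
  }

  ; With the common pointer operand factored out, the select sinks to the index:
  ;   %idx = select i1 %c, i64 %a, i64 %b
  ;   %sel = getelementptr i32, i32* %p, i64 %idx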
-
- Jan 17, 2018
-
Sanjay Patel authored
llvm-svn: 322733
-
Sanjay Patel authored
I was comparing the demanded-bits implementations between InstCombine and TargetLowering as part of investigating questions in D42088 and noticed that this was wrong in IR. We were losing all of the prior known bits when we got back to the 'zext'. llvm-svn: 322662
-
Sanjay Patel authored
llvm-svn: 322660
-
- Jan 11, 2018
-
Benjamin Kramer authored
llvm-svn: 322285
-
Benjamin Kramer authored
parent function. Ideally we should merge the attributes from the functions somehow, but this is obviously an improvement over taking random attributes from the caller, which will trip up the verifier if they're nonsensical for a unary intrinsic call. llvm-svn: 322284
-
Sanjay Patel authored
This was originally planned as the fix for: https://bugs.llvm.org/show_bug.cgi?id=35834 ...but simpler transforms handled that case, so I implemented a lesser solution. It turns out we need to handle the case with 'not' ops too because the real code example that we are trying to solve: https://bugs.llvm.org/show_bug.cgi?id=35875 ...has extra uses of the intermediate values, so we can't rely on smaller canonicalizations to get us to the goal.
As with rL321672, I've tried to show every possibility in the codegen tests because that's the simplest way to prove we're doing the right thing in the wide variety of permutations of this pattern.
We can also show an InstCombine win because we added a fold for this case in: rL321998 / D41603
An Alive proof for one variant of the pattern to show that the InstCombine and codegen results are correct: https://rise4fun.com/Alive/vd1

Name: min3_nots
  %nx = xor i8 %x, -1
  %ny = xor i8 %y, -1
  %nz = xor i8 %z, -1
  %cmpxz = icmp slt i8 %nx, %nz
  %minxz = select i1 %cmpxz, i8 %nx, i8 %nz
  %cmpyz = icmp slt i8 %ny, %nz
  %minyz = select i1 %cmpyz, i8 %ny, i8 %nz
  %cmpyx = icmp slt i8 %y, %x
  %r = select i1 %cmpyx, i8 %minxz, i8 %minyz
=>
  %cmpxyz = icmp slt i8 %minxz, %ny
  %r = select i1 %cmpxyz, i8 %minxz, i8 %ny

Name: min3_nots_alt
  %nx = xor i8 %x, -1
  %ny = xor i8 %y, -1
  %nz = xor i8 %z, -1
  %cmpxz = icmp slt i8 %nx, %nz
  %minxz = select i1 %cmpxz, i8 %nx, i8 %nz
  %cmpyz = icmp slt i8 %ny, %nz
  %minyz = select i1 %cmpyz, i8 %ny, i8 %nz
  %cmpyx = icmp slt i8 %y, %x
  %r = select i1 %cmpyx, i8 %minxz, i8 %minyz
=>
  %xz = icmp sgt i8 %x, %z
  %maxxz = select i1 %xz, i8 %x, i8 %z
  %xyz = icmp sgt i8 %maxxz, %y
  %maxxyz = select i1 %xyz, i8 %maxxz, i8 %y
  %r = xor i8 %maxxyz, -1

llvm-svn: 322283
-
Sanjay Patel authored
llvm-svn: 322281
-
Dmitry Venikov authored
Summary: This patch enables folding sin(x) / cos(x) -> tan(x) and cos(x) / sin(x) -> 1 / tan(x) under the -ffast-math flag. Reviewers: hfinkel, spatel Reviewed By: spatel Subscribers: andrew.w.kaylor, efriedma, scanon, llvm-commits Differential Revision: https://reviews.llvm.org/D41286 llvm-svn: 322255
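A rough sketch of the first fold (illustrative only; whether the match is on libm calls, the llvm.sin/llvm.cos intrinsics, or both is not spelled out in the log):

  declare double @sin(double)
  declare double @cos(double)

  define double @sin_div_cos(double %x) {
    %s = call fast double @sin(double %x)
    %c = call fast double @cos(double %x)
    %div = fdiv fast double %s, %c
    ret double %div
  }

  ; Under fast-math this is expected to become a single call to tan(%x).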
-
- Jan 10, 2018
-
Sanjay Patel authored
D41353 / D41233 are proposing to alter the shl/and canonicalization, but I think that would just move an existing pattern-matching hole to a different place. llvm-svn: 322206
-
- Jan 09, 2018
-
Sanjay Patel authored
Because of potential UB (known bits conflicts with an llvm.assume), we have to check rather than assert here because InstSimplify doesn't kill the compare: https://bugs.llvm.org/show_bug.cgi?id=35846 llvm-svn: 322104
-
Simon Pilgrim authored
Reduced from oss-fuzz #5032 test case llvm-svn: 322078
-
Simon Pilgrim authored
llvm-svn: 322072
-
- Jan 08, 2018
-
Sanjay Patel authored
The test is derived from a failing fuzz test: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5008 Credit to @rksimon for pointing out the problem. llvm-svn: 322016
-
Sanjay Patel authored
There is precedent for factorization transforms in instcombine for FP ops with fast-math. We also have similar logic in foldSPFofSPF(). It would take more work to add this to reassociate because that's specialized for binops, and min/max are not binops (or even single instructions). Also, I don't have evidence that larger min/max trees than this exist in real code, but if we find that's true, we might want to reorganize where/how we do this optimization. In the motivating example from https://bugs.llvm.org/show_bug.cgi?id=35717 , we have:
  int test(int xc, int xm, int xy) {
    int xk;
    if (xc < xm)
      xk = xc < xy ? xc : xy;
    else
      xk = xm < xy ? xm : xy;
    return xk;
  }
This patch solves that problem because we recognize more min/max patterns after rL321672
  https://rise4fun.com/Alive/Qjne
  https://rise4fun.com/Alive/3yg
Differential Revision: https://reviews.llvm.org/D41603 llvm-svn: 321998
-
- Jan 06, 2018
-
Sanjay Patel authored
In the minimal case, this won't remove instructions, but it still improves uses of existing values. In the motivating example from PR35834, it does remove instructions, and sets that case up to be optimized by something like D41603: https://reviews.llvm.org/D41603 llvm-svn: 321936
-
Sanjay Patel authored
llvm-svn: 321935
-
- Jan 05, 2018
-
Sanjay Patel authored
Besides the bug of omitting the inverse transform of max(~a, ~b) --> ~min(a, b), the use checking and operand creation were off. We were potentially creating repeated identical instructions of existing values. This led to infinite looping after I added the extra folds. By using the simpler m_Not matcher and not creating new 'not' ops for a and b, we avoid that problem. It's possible that not using IsFreeToInvert() here is more limiting than the simpler matcher, but there are no tests for anything more exotic. It's also possible that we should relax the use checking further to handle a case like PR35834: https://bugs.llvm.org/show_bug.cgi?id=35834 ...but we can make that a follow-up if it is needed. llvm-svn: 321882
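For reference, the inverse transform mentioned above looks roughly like this (sketch, not lifted from the regression tests):

  define i8 @max_of_nots(i8 %a, i8 %b) {
    %na = xor i8 %a, -1
    %nb = xor i8 %b, -1
    %cmp = icmp sgt i8 %na, %nb
    %max = select i1 %cmp, i8 %na, i8 %nb    ; smax(~a, ~b)
    ret i8 %max
  }

  ; Expected form after the fold:
  ;   %cmp2 = icmp slt i8 %a, %b
  ;   %min  = select i1 %cmp2, i8 %a, i8 %b  ; smin(a, b)
  ;   %not  = xor i8 %min, -1                ; ~smin(a, b)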
-
- Jan 04, 2018
-
Sanjay Patel authored
llvm-svn: 321801
-
- Jan 03, 2018
-
Simon Pilgrim authored
Reduced from oss-fuzz #4871 test case llvm-svn: 321748
-
Florian Hahn authored
llvm-svn: 321706
-
- Jan 02, 2018
-
Dmitry Venikov authored
Summary: This patch enables folding sqrt(a) * sqrt(b) -> sqrt(a*b) under the -ffast-math flag. Reviewers: hfinkel, spatel, davide Reviewed By: spatel, davide Subscribers: davide, llvm-commits Differential Revision: https://reviews.llvm.org/D41322 llvm-svn: 321637
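A minimal sketch of the fold, assuming the llvm.sqrt intrinsic form (the same idea applies if the match is on sqrt() libcalls):

  declare double @llvm.sqrt.f64(double)

  define double @sqrt_times_sqrt(double %a, double %b) {
    %sa = call fast double @llvm.sqrt.f64(double %a)
    %sb = call fast double @llvm.sqrt.f64(double %b)
    %mul = fmul fast double %sa, %sb
    ret double %mul
  }

  ; Expected result under fast-math:
  ;   %ab = fmul fast double %a, %b
  ;   %r  = call fast double @llvm.sqrt.f64(double %ab)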
-
- Jan 01, 2018
-
Simon Pilgrim authored
Reduced (as best I could...) from oss-fuzz #4857 test case llvm-svn: 321634
-
Simon Pilgrim authored
llvm-svn: 321633
-