  1. Jan 24, 2018
    • [InstCombine] fix datalayout in test file · 60c13c77
      Sanjay Patel authored
      The only part of the datalayout that should matter for these tests
      is the part that specifies the legal int widths ('n*'). But there
      was a bug: that part of the string was not correctly separated with
      the expected '-' character, so we were testing as if there were no
      legal int widths at all. Removed the leading cruft so we have some 
      legal ints to test with.
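
      For illustration, the difference is just the missing separator (a
      hypothetical datalayout string, not the exact one from these tests):

        target datalayout = "e-p:32:32n8:16:32"   ; broken: the 'n' spec is fused into the previous component
        target datalayout = "e-p:32:32-n8:16:32"  ; fixed: i8/i16/i32 are recorded as legal int widths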
      
      I noticed this while testing a potential change to the way we 
      transform shifts and sexts in D42424.
      
      llvm-svn: 323377
  2. Jan 19, 2018
    • Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) · 1e68724d
      Daniel Neilson authored
      Summary:
       This is a resurrection of work first proposed and discussed in Aug 2015:
         http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
      and initially landed (but then backed out) in Nov 2015:
         http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
      
       The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
      which is required to be a constant integer. It represents the alignment of
      both the dest and the source, and so must be the minimum of their actual
      alignments.
      
       This change is the first in a series that allows source and dest to each
      have their own alignments by using the alignment attribute on their arguments.
      
       In this change we:
      1) Remove the alignment argument.
      2) Add alignment attributes to the source & dest arguments. We, temporarily,
         require that the alignments for source & dest be equal.
      
       For example, code which used to read:
        call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
      will now read
        call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
      
       Downstream users may have to update their lit tests that check for
      @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
      may help with updating the majority of your tests, but it does not catch all possible
      patterns, so some manual checking and updating will be required.
      
      s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
      s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
      s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
      s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
      s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
      s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
      s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
      s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g
      
       The remaining changes in the series will:
      Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
         source and dest alignments.
      Step 3) Update Clang to use the new IRBuilder API.
      Step 4) Update Polly to use the new IRBuilder API.
      Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
              and those that use MemIntrinsicInst::[get|set]Alignment() to use
              getDestAlignment() and getSourceAlignment() instead.
      Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
              MemIntrinsicInst::[get|set]Alignment() methods.
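
      As an illustration of where the series is headed (a hypothetical
      example; this form only becomes expressible once Step 2 lands), source
      and dest will be able to carry different alignments:

        call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 4 %src, i32 100, i1 false)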
      
      Reviewers: pete, hfinkel, lhames, reames, bollu
      
      Reviewed By: reames
      
      Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D41675
      
      llvm-svn: 322965
    • [InstCombine] Make foldSelectOpOp able to handle two-operand getelementptr · 2867bd72
      John Brawn authored
      Three-or-more-operand getelementptrs could plausibly also be handled, but
      handling only the two-operand form fits in easily with the existing
      BinaryOperator handling.
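
      A sketch of the pattern this enables (hypothetical IR, not taken from
      the patch): a select of two two-operand getelementptrs with a common
      pointer operand becomes a getelementptr of a select, mirroring the
      BinaryOperator case:

        %gep1 = getelementptr i32, i32* %p, i64 %i
        %gep2 = getelementptr i32, i32* %p, i64 %j
        %sel = select i1 %cond, i32* %gep1, i32* %gep2
      =>
        %idx = select i1 %cond, i64 %i, i64 %j
        %sel = getelementptr i32, i32* %p, i64 %idx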
      
      Differential Revision: https://reviews.llvm.org/D39958
      
      llvm-svn: 322930
  3. Jan 11, 2018
    • [InstCombine] Apply the fix from r322284 for sin / cos -> tan too · 738e6e7c
      Benjamin Kramer authored
      llvm-svn: 322285
    • [InstCombine] For cos/sin -> tan copy attributes from cos instead of the parent function · 44993ede
      Benjamin Kramer authored
      
      Ideally we should merge the attributes from the functions somehow, but
      this is obviously an improvement over taking random attributes from the
      caller, which will trip up the verifier if they're nonsensical for a
      unary intrinsic call.
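
      Roughly, in IR terms (a hedged sketch; '#0' stands in for whatever
      attributes the cos call carries):

        %sin = call fast double @sin(double %x)
        %cos = call fast double @cos(double %x) #0
        %r = fdiv fast double %sin, %cos
      =>
        %r = call fast double @tan(double %x) #0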
      
      llvm-svn: 322284
    • [ValueTracking] recognize min/max-of-min/max with notted ops (PR35875) · e63d8dda
      Sanjay Patel authored
      This was originally planned as the fix for:
      https://bugs.llvm.org/show_bug.cgi?id=35834
      ...but simpler transforms handled that case, so I implemented a 
      lesser solution. It turns out we need to handle the case with 'not'
      ops too because the real code example that we are trying to solve:
      https://bugs.llvm.org/show_bug.cgi?id=35875
      ...has extra uses of the intermediate values, so we can't rely on 
      smaller canonicalizations to get us to the goal.
      
      As with rL321672, I've tried to show every possibility in the
      codegen tests because that's the simplest way to prove we're doing
      the right thing in the wide variety of permutations of this pattern.
      
      We can also show an InstCombine win because we added a fold for
      this case in:
      rL321998 / D41603
      
      An Alive proof for one variant of the pattern to show that the 
      InstCombine and codegen results are correct:
      https://rise4fun.com/Alive/vd1
      
      Name: min3_nots
        %nx = xor i8 %x, -1
        %ny = xor i8 %y, -1
        %nz = xor i8 %z, -1
        %cmpxz = icmp slt i8 %nx, %nz
        %minxz = select i1 %cmpxz, i8 %nx, i8 %nz
        %cmpyz = icmp slt i8 %ny, %nz
        %minyz = select i1 %cmpyz, i8 %ny, i8 %nz
        %cmpyx = icmp slt i8 %y, %x
        %r = select i1 %cmpyx, i8 %minxz, i8 %minyz
      =>
        %cmpxyz = icmp slt i8 %minxz, %ny
        %r = select i1 %cmpxyz, i8 %minxz, i8 %ny
      
      Name: min3_nots_alt
        %nx = xor i8 %x, -1
        %ny = xor i8 %y, -1
        %nz = xor i8 %z, -1
        %cmpxz = icmp slt i8 %nx, %nz
        %minxz = select i1 %cmpxz, i8 %nx, i8 %nz
        %cmpyz = icmp slt i8 %ny, %nz
        %minyz = select i1 %cmpyz, i8 %ny, i8 %nz
        %cmpyx = icmp slt i8 %y, %x
        %r = select i1 %cmpyx, i8 %minxz, i8 %minyz
      =>
        %xz = icmp sgt i8 %x, %z
        %maxxz = select i1 %xz, i8 %x, i8 %z
        %xyz = icmp sgt i8 %maxxz, %y
        %maxxyz = select i1 %xyz, i8 %maxxz, i8 %y
        %r = xor i8 %maxxyz, -1
      
      llvm-svn: 322283
    • [InstCombine] add min3-with-nots test (PR35875); NFC · e0df4650
      Sanjay Patel authored
      llvm-svn: 322281
    • [InstCombine] Missed optimization in math expression: sin(x) / cos(x) => tan(x) · e5fbf591
      Dmitry Venikov authored
      Summary: This patch enables folding sin(x) / cos(x) -> tan(x) and cos(x) / sin(x) -> 1 / tan(x) under the -ffast-math flag
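
      In IR terms, the first fold looks roughly like this (a minimal sketch,
      assuming fast-math flags on the calls):

        %sin = call fast double @sin(double %x)
        %cos = call fast double @cos(double %x)
        %div = fdiv fast double %sin, %cos
      =>
        %div = call fast double @tan(double %x)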
      
      Reviewers: hfinkel, spatel
      
      Reviewed By: spatel
      
      Subscribers: andrew.w.kaylor, efriedma, scanon, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D41286
      
      llvm-svn: 322255
  4. Jan 05, 2018
    • [InstCombine] add folds for min(~a, ~b) --> ~max(a, b) · 5b6aacf2
      Sanjay Patel authored
      Besides the bug of omitting the inverse transform of max(~a, ~b) --> ~min(a, b),
      the use checking and operand creation were off. We were potentially creating
      new instructions identical to existing values, which led to infinite
      looping after I added the extra folds.
      
      By using the simpler m_Not matcher and not creating new 'not' ops for a and b,
      we avoid that problem. It's possible that not using IsFreeToInvert() here is
      more limiting than the simpler matcher, but there are no tests for anything
      more exotic. It's also possible that we should relax the use checking further
      to handle a case like PR35834:
      https://bugs.llvm.org/show_bug.cgi?id=35834
      ...but we can make that a follow-up if it is needed. 
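
      A minimal IR sketch of one added fold (smin shown; the other min/max
      flavors are analogous):

        %na = xor i8 %a, -1
        %nb = xor i8 %b, -1
        %cmp = icmp slt i8 %na, %nb
        %min = select i1 %cmp, i8 %na, i8 %nb
      =>
        %cmp2 = icmp sgt i8 %a, %b
        %max = select i1 %cmp2, i8 %a, i8 %b
        %min = xor i8 %max, -1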
      
      llvm-svn: 321882
  5. Dec 30, 2017
    • [instsimplify] consistently handle undef and out of bound indices for insertelement and extractelement · e499bc30
      Philip Reames authored
      
      In one case, we were handling out-of-bounds indices but not undef indices. In the other, we were handling undef (with the comment drawing the analogy to out of bounds) but not out of bounds. Be consistent and treat both undef and constant out-of-bounds indices as producing undefined results.
      
      As a side effect, this also protects instcombine from having to handle large constant indices as we always simplify first.
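
      For example (hypothetical IR), all of the following now simplify to
      undef:

        %v1 = insertelement <4 x i32> %v, i32 %x, i32 7      ; constant out-of-bounds index
        %v2 = insertelement <4 x i32> %v, i32 %x, i32 undef  ; undef index
        %e1 = extractelement <4 x i32> %v, i32 9             ; constant out-of-bounds index
        %e2 = extractelement <4 x i32> %v, i32 undef         ; undef index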
      
      llvm-svn: 321575
    • Add another test case for r321489 · 8e1abe4a
      Philip Reames authored
      Went to reduce another fuzzer failure, only to find it had already been fixed, but the test case is slightly different, so it's worth adding anyway.
      
      Reduced from oss-fuzz #4768 test case
      
      llvm-svn: 321573
    • Move tests associated with transforms moved in r321467 · 3e9c6719
      Philip Reames authored
      llvm-svn: 321572