  1. Mar 23, 2015
  2. Mar 21, 2015
    • Benjamin Kramer's avatar
      [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check. · 7857d723
      Benjamin Kramer authored
      strchr("123!", C) != nullptr is a common pattern to check if C is one
      of 1, 2, 3 or !. If the largest element of the string is smaller than
      the target's register size we can easily create a bitfield and just
      do a simple test for set membership.
      
      int foo(char C) { return strchr("123!", C) != nullptr; } now becomes
      
      	cmpl	$64, %edi ## range check
      	sbbb	%al, %al
      	movabsq	$0xE000200000001, %rcx
      	btq	%rdi, %rcx ## bit test
      	sbbb	%cl, %cl
      	andb	%al, %cl ## and the two conditions
      	andb	$1, %cl
      	movzbl	%cl, %eax ## returning an int
      	ret
      
      (imho the backend should expand this into a series of branches, but
      that's a different story)
      
      The code is currently limited to bit fields that fit in a register, so
      usually 64 or 32 bits. Sadly, this misses anything using alpha chars
      or {}. This could be fixed by just emitting an i128 bit field, but that
      can generate really ugly code, so we have to find a better way. To some
      degree this is also recreating switch lowering logic, but we can't
      simply emit a switch instruction and thus change the CFG within
      instcombine.
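
      For illustration, a minimal C++ sketch of the same set-membership trick
      (a hypothetical helper, not the actual SimplifyLibCalls code). Note that
      bit 0 is set because strchr also matches the terminating NUL, which is
      why the mask equals the 0xE000200000001 constant in the assembly above:

        #include <cassert>
        #include <cstdint>

        static bool inCharSet(unsigned char C) {
          // One bit per character of "123!", plus bit 0 for the NUL.
          const uint64_t Mask = (1ULL << '1') | (1ULL << '2') | (1ULL << '3') |
                                (1ULL << '!') | (1ULL << 0); // 0xE000200000001
          return C < 64 && ((Mask >> C) & 1); // range check + bit test
        }

        int main() {
          assert(inCharSet('2') && inCharSet('!') && !inCharSet('z'));
        }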
      
      llvm-svn: 232902
      7857d723
    • Benjamin Kramer's avatar
      SimplifyLibCalls: Add basic optimization of memchr calls. · 691363e7
      Benjamin Kramer authored
      This is just memchr(x, y, 0) -> nullptr and constant folding.
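
      For illustration, a small C++ check of the two folds (a sketch only;
      the actual transform operates on LLVM IR):

        #include <cassert>
        #include <cstring>

        int main() {
          const char *S = "hello";
          assert(std::memchr(S, 'x', 0) == nullptr); // zero length -> null
          assert(std::memchr(S, 'l', 5) == S + 2);   // fully-constant call folds
        }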
      
      llvm-svn: 232896
      691363e7
    • Kostya Serebryany's avatar
      [sanitizer] experimental tracing for cmp instructions · f4e35cc4
      Kostya Serebryany authored
      llvm-svn: 232873
      f4e35cc4
  3. Mar 20, 2015
  4. Mar 19, 2015
    • Duncan P. N. Exon Smith's avatar
      Verifier: Remove the separate -verify-di pass · ab58a568
      Duncan P. N. Exon Smith authored
      Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass.
      Instead, call into the `DebugInfoVerifier` from inside
      `VerifierLegacyPass::finalizeModule()`.  This better matches the logic
      in `verifyModule()` (used by the new PassManager), avoids requiring two
      separate passes to verify the IR, and makes the API for "add a pass to
      verify the IR" simple.
      
      Note: the `-verify-debug-info` flag still works (for now, at least;
      eventually it might make sense to just remove it).
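
      For reference, a hedged sketch of the consolidated entry point (assumed
      usage of verifyModule() from llvm/IR/Verifier.h, not code from this
      commit):

        #include "llvm/IR/Module.h"
        #include "llvm/IR/Verifier.h"
        #include "llvm/Support/raw_ostream.h"

        // Returns true if M is well-formed. With this change the debug info
        // checks run as part of the same verification rather than as a
        // separate pass.
        bool isModuleValid(const llvm::Module &M) {
          return !llvm::verifyModule(M, &llvm::errs());
        }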
      
      llvm-svn: 232772
      ab58a568
    • Peter Collingbourne's avatar
      LowerBitSets: Avoid reusing byte set addresses. · 994ba3d2
      Peter Collingbourne authored
      Each use of the byte array uses a different alias. This makes the
      backend less likely to reuse previously computed byte array addresses,
      improving the security of the CFI mechanism based on this pass.
      
      Differential Revision: http://reviews.llvm.org/D8455
      
      llvm-svn: 232770
      994ba3d2
    • Peter Collingbourne's avatar
      libLTO, llvm-lto, gold: Introduce flag for controlling optimization level. · 070843d6
      Peter Collingbourne authored
      This change also introduces a link-time optimization level of 1. This
      level runs only the globaldce pass, plus the cleanup passes that run at
      -O0, specifically simplifycfg, which cleans up after lowerbitsets.
      
      http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266951.html
      
      llvm-svn: 232769
      070843d6
    • Duncan P. N. Exon Smith's avatar
      PassManagerBuilder: Remove effectively dead 'StripDebug' option · 0a93e2db
      Duncan P. N. Exon Smith authored
      `StripDebug` was only used by tools/opt/opt.cpp in
      `AddStandardLinkPasses()`, but opt.cpp adds the same pass based on its
      command-line flag before it calls `AddStandardLinkPasses()`.  Stripping
      debug info twice isn't very useful.
      
      llvm-svn: 232765
      0a93e2db
    • Peter Collingbourne's avatar
      GlobalDCE: Improve performance for large modules containing comdats. · 0dbc7088
      Peter Collingbourne authored
      When we encounter a global with a comdat, rather than iterating over
      every global in the module to find globals in the same comdat, store the
      members in a multimap. This effectively lowers the complexity to O(N log N),
      improving performance significantly for large modules such as might be
      encountered during LTO.
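
      The grouping idea, as a self-contained C++ sketch (hypothetical types,
      not the actual GlobalDCE code):

        #include <iostream>
        #include <map>
        #include <string>
        #include <vector>

        struct Global { std::string Name, Comdat; };

        int main() {
          std::vector<Global> Module = {{"a", "c1"}, {"b", "c1"}, {"c", "c2"}};

          // Built once up front: comdat -> members, O(N log N) total.
          std::multimap<std::string, const Global *> ComdatMembers;
          for (const Global &G : Module)
            if (!G.Comdat.empty())
              ComdatMembers.insert({G.Comdat, &G});

          // Marking one member of "c1" alive finds its siblings in O(log N)
          // instead of rescanning every global in the module.
          auto R = ComdatMembers.equal_range("c1");
          for (auto I = R.first; I != R.second; ++I)
            std::cout << I->second->Name << " kept alive via comdat c1\n";
        }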
      
      It looks like we used to do something like this until r219191.
      
      No functional change.
      
      Differential Revision: http://reviews.llvm.org/D8431
      
      llvm-svn: 232743
      0dbc7088
    • Daniel Jasper's avatar
      [InstCombine] Don't fold a GEP into itself through a PHI node · 5add63f2
      Daniel Jasper authored
      This can only occur (I think) through the back-edge of the loop.
      
      However, folding a GEP into itself means that the value of the previous
      iteration needs to be stored in the meantime, requiring an additional
      register to stay live without actually achieving anything (the GEP
      still needs to be executed once per loop iteration).
      
      The attached test case is derived from:
        typedef unsigned uint32;
        typedef unsigned char uint8;
        inline uint8 *f(uint32 value, uint8 *target) {
          while (value >= 0x80) {
            value >>= 7;
            ++target;
          }
          ++target;
          return target;
        }
        uint8 *g(uint32 b, uint8 *target) {
          target = f(b, f(42, target));
          return target;
        }
      
      What happens is that the GEP stored in incptr2 is folded into itself
      through the loop's back-edge and the phi-node stored in loopptr,
      effectively incrementing the ptr by "2" in each iteration instead of "1".
      
      In this case, it is actually increasing the number of GEPs required as
      the GEP before the loop can't be folded away anymore. For comparison:
      
      With this patch:
        define i8* @test4(i32 %value, i8* %buffer) {
        entry:
          %cmp = icmp ugt i32 %value, 127
          br i1 %cmp, label %loop.header, label %exit
      
        loop.header:                                      ; preds = %entry
          br label %loop.body
      
        loop.body:                                        ; preds = %loop.body, %loop.header
          %buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
          %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
          %loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1
          %shr = lshr i32 %newval, 7
          %cmp2 = icmp ugt i32 %newval, 16383
          br i1 %cmp2, label %loop.body, label %loop.exit
      
        loop.exit:                                        ; preds = %loop.body
          br label %exit
      
        exit:                                             ; preds = %loop.exit, %entry
          %0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ]
          %incptr3 = getelementptr inbounds i8, i8* %0, i64 2
          ret i8* %incptr3
        }
      
      Without this patch:
        define i8* @test4(i32 %value, i8* %buffer) {
        entry:
          %incptr = getelementptr inbounds i8, i8* %buffer, i64 1
          %cmp = icmp ugt i32 %value, 127
          br i1 %cmp, label %loop.header, label %exit
      
        loop.header:                                      ; preds = %entry
          br label %loop.body
      
        loop.body:                                        ; preds = %loop.body, %loop.header
          %0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
          %loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ]
          %newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
          %shr = lshr i32 %newval, 7
          %incptr2 = getelementptr inbounds i8, i8* %0, i64 2
          %cmp2 = icmp ugt i32 %newval, 16383
          br i1 %cmp2, label %loop.body, label %loop.exit
      
        loop.exit:                                        ; preds = %loop.body
          br label %exit
      
        exit:                                             ; preds = %loop.exit, %entry
          %ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ]
          %incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1
          ret i8* %incptr3
        }
      
      Review: http://reviews.llvm.org/D8245
      llvm-svn: 232718
      5add63f2
  5. Mar 18, 2015
    • Sanjoy Das's avatar
      [ConstantRange] Split makeICmpRegion in two. · 7182d36f
      Sanjoy Das authored
      Summary:
      This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and
      `makeSatisfyingICmpRegion` with slightly different contracts.  The first
      one is useful for determining what values some expression //may// take,
      given that a certain `icmp` evaluates to true.  The second one is useful
      for determining what values are guaranteed to //satisfy// a given
      `icmp`.
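
      To make the two contracts concrete, a brute-force illustration in plain
      C++ (not the LLVM API) for Pred = ULT and Other = [5, 10) over 4-bit
      values: the allowed region is [0, 9), since those x are less than some
      y in the range, while the satisfying region is [0, 5), since those x
      are less than every y in the range.

        #include <iostream>

        int main() {
          for (unsigned X = 0; X < 16; ++X) {
            bool Allowed = false, Satisfying = true;
            for (unsigned Y = 5; Y < 10; ++Y) {
              if (X < Y)
                Allowed = true;     // some y in [5,10) satisfies X u< y
              else
                Satisfying = false; // not every y in [5,10) satisfies X u< y
            }
            std::cout << X << ": allowed=" << Allowed
                      << " satisfying=" << Satisfying << "\n";
          }
        }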
      
      Reviewers: nlewycky
      
      Reviewed By: nlewycky
      
      Subscribers: llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D8345
      
      llvm-svn: 232575
      7182d36f
  6. Mar 17, 2015
  7. Mar 16, 2015
    • Gabor Horvath's avatar
      [llvm] Replacing asserts with static_asserts where appropriate · fee04343
      Gabor Horvath authored
      Summary:
      This patch applies the suggestions of the clang-tidy
      misc-static-assert check.
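
      The flavor of the change, as a tiny hypothetical example (not taken
      from the patch): a runtime assert over a compile-time constant becomes
      a static_assert, moving the check to compile time.

        #include <cassert>
        #include <cstdint>

        void before() {
          assert(sizeof(void *) <= sizeof(uint64_t)); // checked at run time
        }
        static_assert(sizeof(void *) <= sizeof(uint64_t),
                      "pointer wider than 64 bits"); // checked at compile time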
      
      
      Reviewers: alexfh
      
      Reviewed By: alexfh
      
      Subscribers: xazax.hun, llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D8343
      
      llvm-svn: 232366
      fee04343
    • Dmitry Vyukov's avatar
      asan: fix overflows in isSafeAccess · ee842385
      Dmitry Vyukov authored
      As pointed out in http://reviews.llvm.org/D7583, the current checks can
      cause overflows when the object size or access offset crosses a
      quintillion bytes.
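
      The overflow class being fixed, sketched in C++ (a hypothetical check,
      not the actual isSafeAccess code): near the top of the integer range,
      Offset + AccessSize wraps and an out-of-bounds access can look
      in-bounds.

        #include <cstdint>
        #include <cstdio>

        // Buggy: Offset + Size can wrap around and compare as "safe".
        bool overflowingCheck(uint64_t Offset, uint64_t Size, uint64_t ObjSize) {
          return Offset + Size <= ObjSize;
        }
        // Rearranged so no intermediate result can wrap.
        bool safeCheck(uint64_t Offset, uint64_t Size, uint64_t ObjSize) {
          return Size <= ObjSize && Offset <= ObjSize - Size;
        }

        int main() {
          uint64_t Huge = UINT64_MAX - 1;
          std::printf("%d %d\n", overflowingCheck(Huge, 4, 100),
                      safeCheck(Huge, 4, 100)); // prints "1 0"
        }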
      
      http://reviews.llvm.org/D8193
      
      llvm-svn: 232358
      ee842385
    • Michael Gottesman's avatar
      One more try with unused. · d63436fb
      Michael Gottesman authored
      llvm-svn: 232357
      d63436fb
    • Michael Gottesman's avatar
      a0d2d337
    • Michael Gottesman's avatar
      c219dd1d
    • Michael Gottesman's avatar
      [objc-arc] Make the ARC optimizer more conservative by forcing it to be... · dd60f9bb
      Michael Gottesman authored
      [objc-arc] Make the ARC optimizer more conservative by forcing it to be non-safe in both directions, but mitigate the problem by noting that we just care if there was a further use.
      
      The problem here is the infamous one-direction known safe. I was
      hesitant to turn it off before because of the potential for regressions
      without an actual bug from users hitting the problem. This is that
      bug ;).
      
      The main performance impact of having known safe in both directions is
      that it is often very difficult to find two releases without a use in
      between them, since we are so conservative in determining potential
      uses. The one-direction known safe gets around that problem by taking
      advantage of the many situations where we have two retains in a row.
      That being said, the one-direction known safe is unsafe. Consider the
      following situation:
      
      retain(x)
      retain(x)
      call(x)
      call(x)
      release(x)
      
      Then we know the following about the reference count of x:
      
      // rc(x) == N (for some N).
      retain(x)
      // rc(x) == N+1
      retain(x)
      // rc(x) == N+2
      call A(x)
      call B(x)
      // rc(x) >= 1 (since we cannot release a deallocated pointer).
      release(x)
      // rc(x) >= 0
      
      That is all the information we can know statically. It means that A(x)
      and B(x) together can release x at most N+1 times. Let's say we remove
      the inner retain/release pair.
      
      // rc(x) == N (for some N).
      retain(x)
      // rc(x) == N+1
      call A(x)
      call B(x)
      // rc(x) >= 1
      release(x)
      // rc(x) >= 0
      
      We knew before that A(x), B(x) could release x up to N+1 times, meaning
      that rc(x) may be zero at the release(x). That is not safe. On the
      other hand, consider the situation where we have a must use after the
      release(x), i.e. a use that x must be kept alive for. Then we know
      that:
      
      // rc(x) == N (for some N).
      retain(x)
      // rc(x) == N+1
      retain(x)
      // rc(x) == N+2
      call A(x)
      call B(x)
      // rc(x) >= 2 (since we know we are going to release x and that release cannot be the last use of x).
      release(x)
      // rc(x) >= 1 (since we cannot deallocate the pointer: there is a use of x after the release(x)).
      …
      // rc(x) >= 1
      use(x)
      
      Thus we know statically that the calls to A(x) and B(x) can together
      release x at most N times. So if we remove the inner retain/release
      pair:
      
      // rc(x) == N (for some N).
      retain(x)
      // rc(x) == N+1
      call A(x)
      call B(x)
      // rc(x) >= 1
      …
      // rc(x) >= 1
      use(x)
      
      We are still safe unless the final … contains unbalanced retains and
      releases, which would have caused the program to blow up anyway, even
      before optimization occurred. The simplest form of must use is an
      additional release that has not been paired up with any retain (if we
      had paired the release with a retain and removed it, we would not have
      the additional use). This fits nicely into the ARC framework: given any
      nested releases, regardless of what is in between them, the inner
      release is known safe. This lets us get back the lost performance.
      
      <rdar://problem/19023795>
      
      llvm-svn: 232351
      dd60f9bb