  3. Mar 08, 2019
    • Clement Courbet's avatar
      [SelectionDAG] Allow the user to specify a memeq function. · 8e16d733
      Clement Courbet authored
      Summary:
      Right now, when we encounter a string equality check,
      e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a
      small compile-time constant, and otherwise fall back on calling `memcmp()`.
      
      This is sub-optimal because memcmp has to compute much more than
      equality.
      
      This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms
      that support `bcmp`.
      
      `bcmp` can be made much more efficient than `memcmp` because equality
      compare is trivially parallel while lexicographic ordering has a chain
      dependency.
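As an illustration of that parallelism claim: an equality-only compare can accumulate byte differences in any order, while memcmp must locate the first differing byte to decide ordering. The sketch below is illustrative only, not the libc bcmp implementation:

```cpp
#include <cstddef>

// Equality-only compare: it never has to decide which buffer orders
// first, so byte differences can be accumulated in any order with OR,
// with no chain dependency between iterations.
static bool equal_bytes(const void *a, const void *b, std::size_t n) {
  const unsigned char *pa = static_cast<const unsigned char *>(a);
  const unsigned char *pb = static_cast<const unsigned char *>(b);
  unsigned char diff = 0;
  for (std::size_t i = 0; i < n; ++i)
    diff |= pa[i] ^ pb[i];   // order-independent accumulation
  return diff == 0;
}
```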
      
      Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56593
      
      llvm-svn: 355672
      8e16d733
  4. Mar 07, 2019
    • David Green's avatar
      [LSR] Attempt to increase the accuracy of LSR's setup cost · ffc922ec
      David Green authored
      In some loops, we end up generating loop induction variables that look like:
        {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1}
      As opposed to the simpler:
        {(zext i16 (%i0 * %i1) to i32),+,-1}
      i.e. we count up from -limit to 0, not the simpler counting down from limit to
      0. This is because the scores, as LSR calculates them, are the same and the
      second is filtered in place of the first. We end up with a redundant SUB from 0
      in the code.
      
      This patch makes the calculation of the setup cost a little more
      thorough, recursing into the SCEV members to better approximate the setup
      required. The cost function for comparing LSR costs is:
      
      return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                      C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
             std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                      C2.ScaleCost, C2.ImmCost, C2.SetupCost);
      So this will only alter results if none of the other variables turn out to be
      different.
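Since the std::tie comparison is lexicographic, SetupCost is consulted only when every earlier field compares equal. A minimal standalone illustration (the Cost struct here is hypothetical, with field names borrowed from the snippet above):

```cpp
#include <tuple>

struct Cost {
  unsigned NumRegs, AddRecCost, NumIVMuls, NumBaseAdds,
           ScaleCost, ImmCost, SetupCost;
};

// Lexicographic comparison: SetupCost, listed last, only breaks ties
// among candidates whose other six metrics are identical.
static bool isBetter(const Cost &C1, const Cost &C2) {
  return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                  C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
         std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                  C2.ScaleCost, C2.ImmCost, C2.SetupCost);
}
```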
      
      Differential Revision: https://reviews.llvm.org/D58770
      
      llvm-svn: 355597
      ffc922ec
    • Fangrui Song's avatar
      [BDCE] Optimize find+insert with early insert · b0f764c7
      Fangrui Song authored
      llvm-svn: 355583
      b0f764c7
    • Nick Desaulniers's avatar
      [LoopRotate] fix crash encountered with callbr · 212c8ac2
      Nick Desaulniers authored
      Summary:
      While implementing inlining support for callbr
      (https://bugs.llvm.org/show_bug.cgi?id=40722), I hit a crash in Loop
      Rotation when trying to build the entire x86 Linux kernel
      (drivers/char/random.c). This is a small fix up to r353563.
      
      Test case is drivers/char/random.c (with callbr's inlined), then ran
      through creduce, then `opt -opt-bisect-limit=<limit>`, then bugpoint.
      
      Thanks to Craig Topper for immediately spotting the fix, and teaching me
      how to fish.
      
      Reviewers: craig.topper, jyknight
      
      Reviewed By: craig.topper
      
      Subscribers: hiraditya, llvm-commits, srhines
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58929
      
      llvm-svn: 355564
      212c8ac2
  7. Mar 02, 2019
    • Sanjay Patel's avatar
      [InstCombine] move add after smin/smax · 1f65903d
      Sanjay Patel authored
      Follow-up to rL355221.
      This isn't specifically called for within PR14613,
      but we'll get there eventually if it's not already
      requested in some other bug report.
      
      https://rise4fun.com/Alive/5b0
      
        Name: smax
        Pre: WillNotOverflowSignedSub(C1,C0)
        %a = add nsw i8 %x, C0
        %cond = icmp sgt i8 %a, C1
        %r = select i1 %cond, i8 %a, i8 C1
        =>
        %c2 = icmp sgt i8 %x, C1-C0
        %u2 = select i1 %c2, i8 %x, i8 C1-C0
        %r = add nsw i8 %u2, C0
      
        Name: smin
        Pre: WillNotOverflowSignedSub(C1,C0)
        %a = add nsw i32 %x, C0
        %cond = icmp slt i32 %a, C1
        %r = select i1 %cond, i32 %a, i32 C1
        =>
        %c2 = icmp slt i32 %x, C1-C0
        %u2 = select i1 %c2, i32 %x, i32 C1-C0
        %r = add nsw i32 %u2, C0
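The smax rewrite above can be spot-checked exhaustively over i8 in ordinary C++. The range filter below stands in for the nsw / WillNotOverflowSignedSub preconditions; the helper and constants are illustrative:

```cpp
#include <algorithm>

// Exhaustive i8 check of the smax rewrite from the Alive proof:
//   max(x + C0, C1)  ==  max(x, C1 - C0) + C0
// Values of x for which x + C0 would leave the i8 range are skipped,
// mirroring the nsw precondition on the original add.
static bool smaxRewriteHolds(int C0, int C1) {
  for (int x = -128; x <= 127; ++x) {
    if (x + C0 < -128 || x + C0 > 127)   // outside the nsw precondition
      continue;
    int lhs = std::max(x + C0, C1);
    int rhs = std::max(x, C1 - C0) + C0;
    if (lhs != rhs)
      return false;
  }
  return true;
}
```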
      
      llvm-svn: 355272
      1f65903d
  8. Mar 01, 2019
    • Philip Reames's avatar
      [InstCombine] Extend saturating idempotent atomicrmw transform to FP · cf0a978e
      Philip Reames authored
      I'm assuming that the NaN propagation logic in InstructionSimplify's handling of fadd and fsub is correct, and applying the same to atomicrmw.
      
      Differential Revision: https://reviews.llvm.org/D58836
      
      llvm-svn: 355222
      cf0a978e
    • Sanjay Patel's avatar
      [InstCombine] move add after umin/umax · 6e1e7e1c
      Sanjay Patel authored
      In the motivating cases from PR14613:
      https://bugs.llvm.org/show_bug.cgi?id=14613
      ...moving the add enables us to narrow the
      min/max, which eliminates the zext/trunc, which
      enables significantly better vectorization.
      But that bug is still not completely fixed.
      
      https://rise4fun.com/Alive/5KQ
      
        Name: umax
        Pre: C1 u>= C0
        %a = add nuw i8 %x, C0
        %cond = icmp ugt i8 %a, C1
        %r = select i1 %cond, i8 %a, i8 C1
        =>
        %c2 = icmp ugt i8 %x, C1-C0
        %u2 = select i1 %c2, i8 %x, i8 C1-C0
        %r = add nuw i8 %u2, C0
      
        Name: umin
        Pre: C1 u>= C0
        %a = add nuw i32 %x, C0
        %cond = icmp ult i32 %a, C1
        %r = select i1 %cond, i32 %a, i32 C1
        =>
        %c2 = icmp ult i32 %x, C1-C0
        %u2 = select i1 %c2, i32 %x, i32 C1-C0
        %r = add nuw i32 %u2, C0
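As with the signed case, the umin rewrite can be checked exhaustively, here modeled over an 8-bit unsigned range. The filter mirrors the nuw precondition, and the early return mirrors `C1 u>= C0` (helper and constants are illustrative):

```cpp
#include <algorithm>

// Exhaustive check of the umin rewrite over an 8-bit unsigned range:
//   min(x + C0, C1)  ==  min(x, C1 - C0) + C0
// valid when the add does not wrap (nuw) and C1 u>= C0, so that
// C1 - C0 does not wrap either.
static bool uminRewriteHolds(unsigned C0, unsigned C1) {
  if (C1 < C0)                 // precondition from the proof: C1 u>= C0
    return false;
  for (unsigned x = 0; x <= 255; ++x) {
    if (x + C0 > 255)          // outside the nuw precondition
      continue;
    unsigned lhs = std::min(x + C0, C1);
    unsigned rhs = std::min(x, C1 - C0) + C0;
    if (lhs != rhs)
      return false;
  }
  return true;
}
```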
      
      llvm-svn: 355221
      6e1e7e1c
    • Philip Reames's avatar
      [LICM] Infer proper alignment from loads during scalar promotion · 2226e9a7
      Philip Reames authored
      This patch fixes an issue where we would compute an unnecessarily small alignment during scalar promotion when no store is guaranteed to execute, but we've proven load speculation safety. Since speculating a load requires proving the existing alignment is valid at the new location (see Loads.cpp), we can use the alignment fact from the load.
      
      For non-atomics, this is a performance problem. For atomics, this is a correctness issue, though an *incredibly* rare one to see in practice. For atomics, we might not be able to lower an improperly aligned load or store (i.e. i32 align 1). If such an instruction makes it all the way to codegen, we *may* fail to codegen the operation, or we may simply generate a slow call to a library function. The part that makes this super hard to see in practice is that the memory location actually *is* well aligned, and instcombine knows that. So, to see a failure, you have to have a) hit the bug in LICM, b) somehow hit a depth limit in InstCombine/ValueTracking to avoid fixing the alignment, and c) then have generated an instruction which fails codegen rather than simply emitting a slow libcall. All around, pretty hard to hit.
      
      Differential Revision: https://reviews.llvm.org/D58809
      
      llvm-svn: 355217
      2226e9a7
    • Philip Reames's avatar
      [InstCombine] Extend "idempotent" atomicrmw optimizations to floating point · 77982868
      Philip Reames authored
      An idempotent atomicrmw is one that does not change memory in the process of execution.  We have already added handling for the various integer operations; this patch extends the same handling to floating point operations which were recently added to IR.
      
      Note: At the moment, we canonicalize idempotent fsub to fadd when ordering requirements prevent us from using a load.  As discussed in the review, I will be replacing this with canonicalizing both floating point ops to integer ops in the near future.
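A quick numeric spot-check of the underlying floating-point identity (this is the arithmetic fact, not the InstSimplify code itself): adding negative zero returns every operand unchanged, including both signed zeros, whereas adding +0.0 would turn -0.0 into +0.0.

```cpp
#include <cmath>

// Checks that x + (-0.0) reproduces x exactly: same value, same sign
// bit for the signed zeros, and NaN stays NaN. Helper is illustrative.
static bool preservedByAddNegZero(double x) {
  double r = x + (-0.0);
  if (std::isnan(x))
    return std::isnan(r);                          // NaN propagates
  return r == x && std::signbit(r) == std::signbit(x);
}
```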
      
      Differential Revision: https://reviews.llvm.org/D58251
      
      llvm-svn: 355210
      77982868
    • Jonas Hahnfeld's avatar
      Hide two unused debugging methods, NFCI. · e071cd86
      Jonas Hahnfeld authored
      GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and
      StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used
      in Release builds. Hide them behind 'ifndef NDEBUG'.
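The fix follows a common pattern; a hypothetical sketch (the class and method names are stand-ins, not the actual LLVM code):

```cpp
#include <cstdio>

// Debug-only helpers are compiled out of Release (NDEBUG) builds, so
// GCC's defined-but-not-used warning never sees them there.
class PlainCFGBuilderSketch {
public:
  int build() { return 42; }    // stand-in for the real work

#ifndef NDEBUG
  // Exists only in asserts-enabled builds, mirroring the commit.
  void dumpState() const { std::puts("debug dump"); }
#endif
};
```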
      
      llvm-svn: 355205
      e071cd86
    • Manman Ren's avatar
      Try to fix NetBSD buildbot breakage introduced in D57463. · 576124a3
      Manman Ren authored
      By including the header file in the source.
      
      llvm-svn: 355202
      576124a3
    • Fangrui Song's avatar
      [ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid... · f4b25f70
      Fangrui Song authored
      [ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid dangling elements in ConstIntInfoVec for new PM
      
      Summary:
      ConstIntInfoVec contains elements extracted from the previous function.
      In new PM, releaseMemory() is not called and the dangling elements can
      cause segfault in findConstantInsertionPoint.
      
      Rename releaseMemory() to cleanup() to convey that it is
      mandatory, and call cleanup() in ConstantHoistingPass::runImpl to fix
      this.
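A minimal model of the hazard and the fix, with hypothetical names standing in for the ConstantHoisting internals:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Per-function state kept in a pass object outlives the function under
// the new PM (no releaseMemory() callback), so runImpl must clear it
// itself before returning.
struct HoistingPassSketch {
  std::vector<std::string> ConstInfo;   // would dangle across functions

  void cleanup() { ConstInfo.clear(); }

  std::size_t runImpl(const std::vector<std::string> &Consts) {
    ConstInfo = Consts;                 // gather per-function data
    std::size_t N = ConstInfo.size();   // ... use it ...
    cleanup();                          // mandatory: drop stale entries
    return N;
  }
};
```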
      
      Reviewers: ormris, zzheng, dmgreen, wmi
      
      Reviewed By: ormris, wmi
      
      Subscribers: llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58589
      
      llvm-svn: 355174
      f4b25f70
  9. Feb 28, 2019
    • Reid Kleckner's avatar
      [sancov] Instrument reachable blocks that end in unreachable · 701593f1
      Reid Kleckner authored
      Summary:
      These sorts of blocks often contain calls to noreturn functions, like
      longjmp, throw, or trap. If they don't end the program, they are
      "interesting" from the perspective of sanitizer coverage, so we should
      instrument them. This was discussed in https://reviews.llvm.org/D57982.
      
      Reviewers: kcc, vitalybuka
      
      Subscribers: llvm-commits, craig.topper, efriedma, morehouse, hiraditya
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58740
      
      llvm-svn: 355152
      701593f1
    • Manman Ren's avatar
      Add a module pass for order file instrumentation · 1829512d
      Manman Ren authored
      The basic idea of the pass is to use a circular buffer to log the execution ordering of the functions. We only log a function when it is first executed. We use an 8-byte hash to log the function symbol name.
      
      In this pass, we add three global variables:
      (1) an order file buffer: a circular buffer in its own LLVM section.
      (2) a bitmap for each module: one byte for each function to say if the function is already executed.
      (3) a global index to the order file buffer.
      
      At the function prologue, if the function has not been executed (by checking the bitmap), log the function hash, then atomically increase the index.
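The three globals and the prologue logic can be modeled in a few lines of C++. Names and sizes are illustrative; the real pass emits IR and handles concurrency details this sketch ignores:

```cpp
#include <array>
#include <atomic>
#include <cstdint>

constexpr std::size_t kBufSize = 8;
std::array<std::uint64_t, kBufSize> OrderBuf{};  // (1) order file buffer
std::array<bool, 64> Executed{};                 // (2) per-function bitmap
std::atomic<std::uint64_t> BufIdx{0};            // (3) global index

// What the inserted prologue would do for function number FnId.
void logFirstExecution(unsigned FnId, std::uint64_t Hash) {
  if (Executed[FnId])
    return;                               // only log the first execution
  Executed[FnId] = true;
  std::uint64_t Slot = BufIdx.fetch_add(1);   // atomically bump index
  OrderBuf[Slot % kBufSize] = Hash;           // circular-buffer write
}
```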
      
      Differential Revision:  https://reviews.llvm.org/D57463
      
      llvm-svn: 355133
      1829512d
    • Rong Xu's avatar
      [PGO] Context sensitive PGO (part 2) · a6ff69f6
      Rong Xu authored
      Part 2 of CSPGO changes (mostly related to ProfileSummary).
      Note that I use a default parameter in setProfileSummary() and getSummary().
      This is to break the dependency in clang. I will make the parameter explicit
      after changing clang in a separate patch.
      
      Differential Revision: https://reviews.llvm.org/D54175
      
      llvm-svn: 355131
      a6ff69f6
    • Sanjay Patel's avatar
      [InstCombine] fold adds of constants separated by sext/zext · 4a47f5f5
      Sanjay Patel authored
      This is part of a transform that may be done in the backend:
      D13757
      ...but it should always be beneficial to fold this sooner in IR
      for all targets.
      
      https://rise4fun.com/Alive/vaiW
      
        Name: sext add nsw
        %add = add nsw i8 %i, C0
        %ext = sext i8 %add to i32
        %r = add i32 %ext, C1
        =>
        %s = sext i8 %i to i32
        %r = add i32 %s, sext(C0)+C1
      
        Name: zext add nuw
        %add = add nuw i8 %i, C0
        %ext = zext i8 %add to i16
        %r = add i16 %ext, C1
        =>
        %s = zext i8 %i to i16
        %r = add i16 %s, zext(C0)+C1
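The "sext add nsw" case above can be verified exhaustively over i8 in ordinary C++; the range filter models the nsw precondition, and the helper name is illustrative:

```cpp
#include <cstdint>

// Exhaustive i8 check of the fold:
//   sext(x + C0) + C1  ==  sext(x) + sext(C0) + C1
// Skips x where x + C0 leaves the i8 range, mirroring nsw.
static bool sextAddFoldHolds(int C0, int C1) {
  for (int x = -128; x <= 127; ++x) {
    if (x + C0 < -128 || x + C0 > 127)    // nsw precondition
      continue;
    std::int32_t lhs = static_cast<std::int32_t>(
        static_cast<std::int8_t>(x + C0)) + C1;
    std::int32_t rhs = x + C0 + C1;       // C0 is already sign-extended here
    if (lhs != rhs)
      return false;
  }
  return true;
}
```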
      
      llvm-svn: 355118
      4a47f5f5
    • Chijun Sima's avatar
      Make MergeBlockIntoPredecessor conformant to the precondition of calling DTU.applyUpdates · 58618763
      Chijun Sima authored
      Summary:
      The DTU documentation states that "It is illegal to submit any update that has already been submitted, i.e., you are supposed not to insert an existent edge or delete a nonexistent edge." It is dangerous to violate this rule because DomTree and PostDomTree occasionally crash in this scenario.
      
      This patch fixes `MergeBlockIntoPredecessor`, making it conformant to this precondition.
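One way to conform to that precondition is to filter edge insertions against the current CFG before submitting them. A toy sketch with ints standing in for BasicBlocks (this is not the actual DomTreeUpdater API):

```cpp
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

// Drop any insertion whose edge already exists in the CFG model, so no
// "insert an existent edge" update is ever handed to the updater.
std::vector<Edge> pruneExistingInserts(const std::set<Edge> &Cfg,
                                       const std::vector<Edge> &Inserts) {
  std::vector<Edge> Valid;
  for (const Edge &E : Inserts)
    if (!Cfg.count(E))        // only submit edges that do not exist yet
      Valid.push_back(E);
  return Valid;
}
```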
      
      Reviewers: kuhar, brzycki, chandlerc
      
      Reviewed By: brzycki
      
      Subscribers: hiraditya, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58444
      
      llvm-svn: 355105
      58618763
    • Bjorn Pettersson's avatar
      Add support for computing "zext of value" in KnownBits. NFCI · d30f308a
      Bjorn Pettersson authored
      Summary:
      The description of KnownBits::zext() and
      KnownBits::zextOrTrunc() has confusingly claimed
      that the operation is equivalent to zero-extending the
      value we're tracking. That has not been true; instead,
      the user has been forced to explicitly set the extended
      bits as known zero afterwards.
      
      This patch adds a second argument to KnownBits::zext()
      and KnownBits::zextOrTrunc() to control if the extended
      bits should be considered as known zero or as unknown.
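A toy model of the new interface, using plain 64-bit masks instead of llvm::APInt (names and widths are illustrative, and widths are assumed below 64):

```cpp
#include <cstdint>

// Bits known 0 live in Zero, bits known 1 in One. The new flag decides
// whether the bits introduced by the extension become known zero.
struct KnownBitsSketch {
  std::uint64_t Zero = 0, One = 0;

  KnownBitsSketch zext(unsigned FromWidth, unsigned ToWidth,
                       bool ExtendedBitsAreKnownZero) const {
    KnownBitsSketch R = *this;
    if (ExtendedBitsAreKnownZero) {
      // Mark every bit in [FromWidth, ToWidth) as known zero.
      std::uint64_t HighMask = ((~std::uint64_t{0}) >> (64 - ToWidth)) &
                               ~((std::uint64_t{1} << FromWidth) - 1);
      R.Zero |= HighMask;
    }
    return R;
  }
};
```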
      
      Reviewers: craig.topper, RKSimon
      
      Reviewed By: RKSimon
      
      Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58650
      
      llvm-svn: 355099
      d30f308a
    • Eric Christopher's avatar
      Temporarily revert "ArgumentPromotion should copy all metadata to new... · 07944353
      Eric Christopher authored
      Temporarily revert "ArgumentPromotion should copy all metadata to new Function" and the dependent patch "Refine ArgPromotion metadata handling" as they're causing segfaults in argument promotion.
      
      This reverts commits r354032 and r353537.
      
      llvm-svn: 355060
      07944353