  1. Mar 12, 2019
    • Craig Topper's avatar
      [SanitizerCoverage] Avoid splitting critical edges when destination is a basic... · 03e93f51
      Craig Topper authored
      [SanitizerCoverage] Avoid splitting critical edges when destination is a basic block containing unreachable
      
      This patch adds a new option to SplitAllCriticalEdges and uses it to avoid splitting critical edges when the destination basic block ends with unreachable. Otherwise, if we split the critical edge, sanitizer coverage will instrument the new block that gets inserted for the split. But since that block itself shouldn't be reachable, instrumenting it is pointless. These basic blocks stick around and generate assembly, but they don't end in sane control flow and might get placed at the end of the function. This makes it look like one function has code that flows into the next function.
      
      This showed up while compiling the linux kernel with clang. The kernel has a tool called objtool that detected the code that appeared to flow from one function to the next. https://github.com/ClangBuiltLinux/linux/issues/351#issuecomment-461698884
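      As an illustration (plain C-style code with invented names, not from the patch), a shared noreturn block reached from several conditional branches is exactly the shape involved: each edge into the shared block is critical, and splitting it would manufacture blocks whose only purpose is to jump into unreachable code.

      ```cpp
      #include <cassert>
      #include <cstdlib>

      // Hypothetical example: both `if` branches can jump to the shared `fail`
      // block, so each edge into it is critical (multi-successor source,
      // multi-predecessor destination). `fail` ends in a call to a noreturn
      // function, i.e. in `unreachable` at the IR level; splitting those edges
      // would only create dead blocks for sanitizer coverage to instrument.
      int checked_add(int a, int b) {
        if (a < 0)
          goto fail;
        if (b < 0)
          goto fail;
        return a + b;
      fail:
        abort(); // block ends in unreachable
      }
      ```
      
      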
      
      Differential Revision: https://reviews.llvm.org/D57982
      
      llvm-svn: 355947
      03e93f51
    • Liang Zou's avatar
      [format] \t => ' ' · 4a8afeb9
      Liang Zou authored
      Summary:
      1. \t => '  '
      2. test commit access
      
      Reviewers: Higuoxing, liangdzou
      
      Reviewed By: Higuoxing, liangdzou
      
      Subscribers: kristina, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D59243
      
      llvm-svn: 355924
      4a8afeb9
    • Fangrui Song's avatar
      [SimplifyLibCalls] Simplify optimizePuts · b1dfbebe
      Fangrui Song authored
      The code might intend to replace puts("") with putchar('\n') even if the
      return value is used. It failed because use_empty() was used to guard
      the whole block. While returning '\n' (putchar('\n')) is technically
      correct (puts is only required to return a nonnegative number on
      success), doing this looks weird, and there is really little benefit to
      optimizing a puts whose return value is used. So don't do that.
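      A small sketch of the rewrite being discussed (illustrative only, with an invented wrapper function):

      ```cpp
      #include <cassert>
      #include <cstdio>

      // puts("") writes a single newline; putchar('\n') does the same with
      // less work. On success puts returns some unspecified nonnegative
      // value, while putchar returns the character written, so substituting
      // putchar's result for puts's is technically valid but odd-looking
      // when the return value is actually consumed.
      int emit_newline() {
        // return puts("");   // before: return value only guaranteed >= 0
        return putchar('\n'); // after: returns '\n' on success
      }
      ```
      
      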
      
      llvm-svn: 355921
      b1dfbebe
    • Simon Pilgrim's avatar
      Revert rL355906: [SLP] Remove redundancy of performing operand reordering... · d3a8fd8b
      Simon Pilgrim authored
      Revert rL355906: [SLP] Remove redundancy of performing operand reordering twice: once in buildTree() and later in vectorizeTree().
      
      This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree().
      To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order.
      
      This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo
      
      Patch by: @vporpo (Vasileios Porpodas)
      
      Differential Revision: https://reviews.llvm.org/D59059
      ........
      
      Reverted due to buildbot failures that I don't have time to track down.
      
      llvm-svn: 355913
      d3a8fd8b
    • Simon Pilgrim's avatar
      [SLP] Remove redundancy of performing operand reordering twice: once in... · 2086a889
      Simon Pilgrim authored
      [SLP] Remove redundancy of performing operand reordering twice: once in buildTree() and later in vectorizeTree().
      
      This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree().
      To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order.
      
      This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo
      
      Patch by: @vporpo (Vasileios Porpodas)
      
      Differential Revision: https://reviews.llvm.org/D59059
      
      llvm-svn: 355906
      2086a889
    • Fangrui Song's avatar
      f2609670
    • Kristina Brooks's avatar
      Very minor typo. NFC · 5b1e1c05
      Kristina Brooks authored
      Typo `we we're` => `we were` in the pass EarlyCSE
      
      Patch by liangdzou (Liang ZOU)
      
      Differential Revision: https://reviews.llvm.org/D59241
      
      llvm-svn: 355895
      5b1e1c05
    • Sanjoy Das's avatar
      Reland "Relax constraints for reduction vectorization" · 3f5ce186
      Sanjoy Das authored
      Change from original commit: move test (that uses an X86 triple) into the X86
      subdirectory.
      
      Original description:
      Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
      `reassoc` should be sufficient.
      
      Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal
      
      Reviewed By: sdesmalen
      
      Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D57728
      
      llvm-svn: 355889
      3f5ce186
  2. Mar 11, 2019
  3. Mar 08, 2019
    • Clement Courbet's avatar
      [SelectionDAG] Allow the user to specify a memeq function. · 8e16d733
      Clement Courbet authored
      Summary:
      Right now, when we encounter a string equality check,
      e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a
      small compile-time constant, and fall back on calling `memcmp()` else.
      
      This is sub-optimal because memcmp has to compute much more than
      equality.
      
      This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms
      that support `bcmp`.
      
      `bcmp` can be made much more efficient than `memcmp` because equality
      compare is trivially parallel while lexicographic ordering has a chain
      dependency.
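      A sketch of why an equality-only compare can be made cheaper (an illustrative bcmp-style routine, not the libc implementation): every byte can be combined independently, so there is no serial chain deciding which byte differed first.

      ```cpp
      #include <cassert>
      #include <cstddef>

      // Equality-only compare: XOR-accumulate all byte differences. Each
      // iteration is independent, so the loop vectorizes and reorders
      // freely. memcmp, by contrast, must find the *first* differing byte
      // to produce a lexicographic ordering, which is a chain dependency.
      int bcmp_sketch(const void *a, const void *b, size_t n) {
        const unsigned char *p = static_cast<const unsigned char *>(a);
        const unsigned char *q = static_cast<const unsigned char *>(b);
        unsigned char diff = 0;
        for (size_t i = 0; i < n; ++i)
          diff |= static_cast<unsigned char>(p[i] ^ q[i]);
        return diff; // zero iff the buffers are equal
      }
      ```
      
      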
      
      Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits
      
      Differential Revision: https://reviews.llvm.org/D56593
      
      llvm-svn: 355672
      8e16d733
  4. Mar 07, 2019
    • David Green's avatar
      [LSR] Attempt to increase the accuracy of LSR's setup cost · ffc922ec
      David Green authored
      In some loops, we end up generating loop induction variables that look like:
        {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1}
      As opposed to the simpler:
        {(zext i16 (%i0 * %i1) to i32),+,-1}
      i.e. we count up from -limit to 0, not the simpler counting down from limit to
      0. This is because the scores, as LSR calculates them, are the same, and the
      second is filtered in place of the first. We end up with a redundant SUB from 0
      in the code.
      
      This patch tries to make the calculation of the setup cost a little more
      thorough, recursing into the SCEV members to better approximate the setup
      required. The cost function for comparing LSR costs is:
      
      return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                      C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
             std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                      C2.ScaleCost, C2.ImmCost, C2.SetupCost);
      So this will only alter results if none of the other variables turn out to be
      different.
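      The quoted comparator makes SetupCost the lowest-priority tiebreaker. A minimal standalone model of the same std::tie idiom (struct and function names invented for illustration):

      ```cpp
      #include <cassert>
      #include <tuple>

      // Lexicographic comparison over the cost components: SetupCost only
      // decides the outcome when every earlier field is equal.
      struct LSRCost {
        unsigned NumRegs, AddRecCost, NumIVMuls, NumBaseAdds,
                 ScaleCost, ImmCost, SetupCost;
      };

      bool isBetter(const LSRCost &C1, const LSRCost &C2) {
        return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                        C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
               std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                        C2.ScaleCost, C2.ImmCost, C2.SetupCost);
      }
      ```
      
      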
      
      Differential Revision: https://reviews.llvm.org/D58770
      
      llvm-svn: 355597
      ffc922ec
    • Fangrui Song's avatar
      [BDCE] Optimize find+insert with early insert · b0f764c7
      Fangrui Song authored
      llvm-svn: 355583
      b0f764c7
    • Nick Desaulniers's avatar
      [LoopRotate] fix crash encountered with callbr · 212c8ac2
      Nick Desaulniers authored
      Summary:
      While implementing inlining support for callbr
      (https://bugs.llvm.org/show_bug.cgi?id=40722), I hit a crash in Loop
      Rotation when trying to build the entire x86 Linux kernel
      (drivers/char/random.c). This is a small fix up to r353563.
      
      Test case is drivers/char/random.c (with callbr's inlined), then ran
      through creduce, then `opt -opt-bisect-limit=<limit>`, then bugpoint.
      
      Thanks to Craig Topper for immediately spotting the fix, and teaching me
      how to fish.
      
      Reviewers: craig.topper, jyknight
      
      Reviewed By: craig.topper
      
      Subscribers: hiraditya, llvm-commits, srhines
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58929
      
      llvm-svn: 355564
      212c8ac2
  5. Mar 06, 2019
  6. Mar 04, 2019
  7. Mar 02, 2019
    • Sanjay Patel's avatar
      [InstCombine] move add after smin/smax · 1f65903d
      Sanjay Patel authored
      Follow-up to rL355221.
      This isn't specifically called for within PR14613,
      but we'll get there eventually if it's not already
      requested in some other bug report.
      
      https://rise4fun.com/Alive/5b0
      
        Name: smax
        Pre: WillNotOverflowSignedSub(C1,C0)
        %a = add nsw i8 %x, C0
        %cond = icmp sgt i8 %a, C1
        %r = select i1 %cond, i8 %a, i8 C1
        =>
        %c2 = icmp sgt i8 %x, C1-C0
        %u2 = select i1 %c2, i8 %x, i8 C1-C0
        %r = add nsw i8 %u2, C0
      
        Name: smin
        Pre: WillNotOverflowSignedSub(C1,C0)
        %a = add nsw i32 %x, C0
        %cond = icmp slt i32 %a, C1
        %r = select i1 %cond, i32 %a, i32 C1
        =>
        %c2 = icmp slt i32 %x, C1-C0
        %u2 = select i1 %c2, i32 %x, i32 C1-C0
        %r = add nsw i32 %u2, C0
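      In C terms, the smax rule above says that, absent signed overflow, max(x + C0, C1) == max(x, C1 - C0) + C0. A sketch with illustrative concrete constants C0 = 10, C1 = 20:

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstdint>

      // Valid as long as x + 10 has no signed overflow (the `nsw` /
      // WillNotOverflowSignedSub preconditions of the Alive proof).
      int8_t smax_before(int8_t x) {
        int8_t a = static_cast<int8_t>(x + 10);  // add nsw
        return std::max<int8_t>(a, 20);          // smax
      }
      int8_t smax_after(int8_t x) {
        // smax first, then the add moved after it.
        return static_cast<int8_t>(std::max<int8_t>(x, 20 - 10) + 10);
      }
      ```
      
      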
      
      llvm-svn: 355272
      1f65903d
  8. Mar 01, 2019
    • Philip Reames's avatar
      [InstCombine] Extend saturating idempotent atomicrmw transform to FP · cf0a978e
      Philip Reames authored
      I'm assuming that the NaN propagation logic for InstructionSimplify's handling of fadd and fsub is correct, and applying the same to atomicrmw.
      
      Differential Revision: https://reviews.llvm.org/D58836
      
      llvm-svn: 355222
      cf0a978e
    • Sanjay Patel's avatar
      [InstCombine] move add after umin/umax · 6e1e7e1c
      Sanjay Patel authored
      In the motivating cases from PR14613:
      https://bugs.llvm.org/show_bug.cgi?id=14613
      ...moving the add enables us to narrow the
      min/max, which eliminates zext/trunc, which
      enables significantly better vectorization.
      But that bug is still not completely fixed.
      
      https://rise4fun.com/Alive/5KQ
      
        Name: umax
        Pre: C1 u>= C0
        %a = add nuw i8 %x, C0
        %cond = icmp ugt i8 %a, C1
        %r = select i1 %cond, i8 %a, i8 C1
        =>
        %c2 = icmp ugt i8 %x, C1-C0
        %u2 = select i1 %c2, i8 %x, i8 C1-C0
        %r = add nuw i8 %u2, C0
      
        Name: umin
        Pre: C1 u>= C0
        %a = add nuw i32 %x, C0
        %cond = icmp ult i32 %a, C1
        %r = select i1 %cond, i32 %a, i32 C1
        =>
        %c2 = icmp ult i32 %x, C1-C0
        %u2 = select i1 %c2, i32 %x, i32 C1-C0
        %r = add nuw i32 %u2, C0
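      The umin rule above is the unsigned version of the same identity: min(x + C0, C1) == min(x, C1 - C0) + C0, given C1 u>= C0 and no unsigned wrap. A sketch with illustrative constants C0 = 7, C1 = 50:

      ```cpp
      #include <algorithm>
      #include <cassert>
      #include <cstdint>

      // Requires x + 7 not to wrap (the `nuw` precondition) and 50 u>= 7.
      // Once the add sits after the min, the min operates directly on the
      // original narrow value, which is what later enables the zext/trunc
      // narrowing described in the motivation.
      uint8_t umin_before(uint8_t x) {
        uint8_t a = static_cast<uint8_t>(x + 7); // add nuw
        return std::min<uint8_t>(a, 50);         // umin
      }
      uint8_t umin_after(uint8_t x) {
        return static_cast<uint8_t>(std::min<uint8_t>(x, 50 - 7) + 7);
      }
      ```
      
      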
      
      llvm-svn: 355221
      6e1e7e1c
    • Philip Reames's avatar
      [LICM] Infer proper alignment from loads during scalar promotion · 2226e9a7
      Philip Reames authored
      This patch fixes an issue where we would compute an unnecessarily small alignment during scalar promotion when no store is guaranteed to execute, but we've proven load speculation safety. Since speculating a load requires proving that the existing alignment is valid at the new location (see Loads.cpp), we can use the alignment fact from the load.
      
      For non-atomics, this is a performance problem. For atomics, this is a correctness issue, though an *incredibly* rare one to see in practice. For atomics, we might not be able to lower an improperly aligned load or store (i.e. i32 align 1). If such an instruction makes it all the way to codegen, we *may* fail to codegen the operation, or we may simply generate a slow call to a library function. The part that makes this super hard to see in practice is that the memory location actually *is* well aligned, and instcombine knows that. So, to see a failure, you have to have a) hit the bug in LICM, b) somehow hit a depth limit in InstCombine/ValueTracking to avoid fixing the alignment, and c) then have generated an instruction which fails codegen rather than simply emitting a slow libcall. All around, pretty hard to hit.
      
      Differential Revision: https://reviews.llvm.org/D58809
      
      llvm-svn: 355217
      2226e9a7
    • Philip Reames's avatar
      [InstCombine] Extend "idempotent" atomicrmw optimizations to floating point · 77982868
      Philip Reames authored
      An idempotent atomicrmw is one that does not change memory in the process of execution.  We have already added handling for the various integer operations; this patch extends the same handling to floating point operations which were recently added to IR.
      
      Note: At the moment, we canonicalize idempotent fsub to fadd when ordering requirements prevent us from using a load.  As discussed in the review, I will be replacing this with canonicalizing both floating point ops to integer ops in the near future.
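      The floating-point identities involved can be sketched directly (the sign-of-zero cases are why the identity operand must be chosen carefully):

      ```cpp
      #include <cassert>
      #include <cmath>

      // An atomicrmw fadd with operand -0.0 leaves every value unchanged,
      // including -0.0 itself (-0.0 + -0.0 == -0.0). Note that +0.0 would
      // NOT be idempotent for fadd: -0.0 + 0.0 == +0.0, flipping the sign
      // of zero. Conversely, fsub with +0.0 is idempotent: x - 0.0 == x
      // for every x, including -0.0.
      double fadd_identity(double x) { return x + (-0.0); }
      double fsub_identity(double x) { return x - 0.0; }
      ```
      
      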
      
      Differential Revision: https://reviews.llvm.org/D58251
      
      llvm-svn: 355210
      77982868
    • Jonas Hahnfeld's avatar
      Hide two unused debugging methods, NFCI. · e071cd86
      Jonas Hahnfeld authored
      GCC correctly moans that PlainCFGBuilder::isExternalDef(llvm::Value*) and
      StackSafetyDataFlowAnalysis::verifyFixedPoint() are defined but not used
      in Release builds. Hide them behind 'ifndef NDEBUG'.
      
      llvm-svn: 355205
      e071cd86
    • Manman Ren's avatar
      Try to fix NetBSD buildbot breakage introduced in D57463. · 576124a3
      Manman Ren authored
      By including the header file in the source.
      
      llvm-svn: 355202
      576124a3
    • Fangrui Song's avatar
      [ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid... · f4b25f70
      Fangrui Song authored
      [ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid dangling elements in ConstIntInfoVec for new PM
      
      Summary:
      ConstIntInfoVec contains elements extracted from the previous function.
      In new PM, releaseMemory() is not called and the dangling elements can
      cause segfault in findConstantInsertionPoint.
      
      Rename releaseMemory() to cleanup() to deliver the idea that it is
      mandatory and call cleanup() in ConstantHoistingPass::runImpl to fix
      this.
      
      Reviewers: ormris, zzheng, dmgreen, wmi
      
      Reviewed By: ormris, wmi
      
      Subscribers: llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58589
      
      llvm-svn: 355174
      f4b25f70
  9. Feb 28, 2019
    • Reid Kleckner's avatar
      [sancov] Instrument reachable blocks that end in unreachable · 701593f1
      Reid Kleckner authored
      Summary:
      These sorts of blocks often contain calls to noreturn functions, like
      longjmp, throw, or trap. If they don't end the program, they are
      "interesting" from the perspective of sanitizer coverage, so we should
      instrument them. This was discussed in https://reviews.llvm.org/D57982.
      
      Reviewers: kcc, vitalybuka
      
      Subscribers: llvm-commits, craig.topper, efriedma, morehouse, hiraditya
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58740
      
      llvm-svn: 355152
      701593f1
    • Manman Ren's avatar
      Add a module pass for order file instrumentation · 1829512d
      Manman Ren authored
      The basic idea of the pass is to use a circular buffer to log the execution ordering of the functions. We only log the function when it is first executed. We use an 8-byte hash to log the function symbol name.
      
      In this pass, we add three global variables:
      (1) an order file buffer: a circular buffer in its own llvm section.
      (2) a bitmap for each module: one byte for each function to say if the function is already executed.
      (3) a global index to the order file buffer.
      
      At the function prologue, if the function has not been executed (by checking the bitmap), log the function hash, then atomically increment the index.
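      A hypothetical sketch of the runtime logic the instrumented prologue performs (all names and sizes invented; the real pass emits this as IR against its own global variables):

      ```cpp
      #include <atomic>
      #include <cassert>
      #include <cstdint>

      constexpr unsigned kNumFuncs = 4;  // bitmap: one byte per function
      constexpr unsigned kBufSize = 8;   // circular order-file buffer size

      uint8_t Bitmap[kNumFuncs];         // "has this function run yet?"
      uint64_t OrderBuf[kBufSize];       // logged 8-byte function hashes
      std::atomic<uint32_t> OrderIdx{0}; // global index into the buffer

      // Emitted at each function's prologue: first execution logs the hash.
      void logPrologue(uint32_t FuncId, uint64_t FuncHash) {
        if (Bitmap[FuncId])
          return;                        // already logged, nothing to do
        Bitmap[FuncId] = 1;
        uint32_t Slot = OrderIdx.fetch_add(1, std::memory_order_relaxed);
        OrderBuf[Slot % kBufSize] = FuncHash; // wrap around: circular buffer
      }
      ```
      
      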
      
      Differential Revision:  https://reviews.llvm.org/D57463
      
      llvm-svn: 355133
      1829512d
    • Rong Xu's avatar
      [PGO] Context sensitive PGO (part 2) · a6ff69f6
      Rong Xu authored
      Part 2 of CSPGO changes (mostly related to ProfileSummary).
      Note that I use a default parameter in setProfileSummary() and getSummary().
      This is to break the dependency in clang. I will make the parameter explicit
      after changing clang in a separate patch.
      
      Differential Revision: https://reviews.llvm.org/D54175
      
      llvm-svn: 355131
      a6ff69f6
    • Sanjay Patel's avatar
      [InstCombine] fold adds of constants separated by sext/zext · 4a47f5f5
      Sanjay Patel authored
      This is part of a transform that may be done in the backend:
      D13757
      ...but it should always be beneficial to fold this sooner in IR
      for all targets.
      
      https://rise4fun.com/Alive/vaiW
      
        Name: sext add nsw
        %add = add nsw i8 %i, C0
        %ext = sext i8 %add to i32
        %r = add i32 %ext, C1
        =>
        %s = sext i8 %i to i32
        %r = add i32 %s, sext(C0)+C1
      
        Name: zext add nuw
        %add = add nuw i8 %i, C0
        %ext = zext i8 %add to i16
        %r = add i16 %ext, C1
        =>
        %s = zext i8 %i to i16
        %r = add i16 %s, zext(C0)+C1
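      In C terms, with illustrative constants C0 = 5 and C1 = 100, the sext rule folds the two adds across the extension into a single wide add:

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Valid as long as the narrow add has no signed overflow (nsw):
      // sext(x + 5) + 100 == sext(x) + (sext(5) + 100) == sext(x) + 105.
      int32_t fold_before(int8_t x) {
        int8_t narrow = static_cast<int8_t>(x + 5); // add nsw i8
        return static_cast<int32_t>(narrow) + 100;  // add i32 after sext
      }
      int32_t fold_after(int8_t x) {
        return static_cast<int32_t>(x) + 105;       // single add, folded constant
      }
      ```
      
      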
      
      llvm-svn: 355118
      4a47f5f5
    • Chijun Sima's avatar
      Make MergeBlockIntoPredecessor conformant to the precondition of calling DTU.applyUpdates · 58618763
      Chijun Sima authored
      Summary:
      It is mentioned in the document of DTU that "It is illegal to submit any update that has already been submitted, i.e., you are supposed not to insert an existent edge or delete a nonexistent edge." It is dangerous to violate this rule because DomTree and PostDomTree occasionally crash in this scenario.
      
      This patch fixes `MergeBlockIntoPredecessor`, making it conformant to this precondition.
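      One ingredient of honoring that contract (a generic sketch, not the LLVM API, and only part of the real fix, which also checks whether each edge actually exists) is deduplicating the pending edge updates before submitting them:

      ```cpp
      #include <cassert>
      #include <set>
      #include <tuple>
      #include <vector>

      // Hypothetical model of CFG updates: (kind, from-block id, to-block id).
      // The DTU contract forbids submitting the same update twice, so reduce
      // the batch to unique updates before calling applyUpdates.
      enum class UpdateKind { Insert, Delete };
      using Update = std::tuple<UpdateKind, int, int>;

      std::vector<Update> uniqueUpdates(const std::vector<Update> &Pending) {
        std::set<Update> Seen;
        std::vector<Update> Out;
        for (const Update &U : Pending)
          if (Seen.insert(U).second) // keep only the first occurrence
            Out.push_back(U);
        return Out;
      }
      ```
      
      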
      
      Reviewers: kuhar, brzycki, chandlerc
      
      Reviewed By: brzycki
      
      Subscribers: hiraditya, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58444
      
      llvm-svn: 355105
      58618763
    • Bjorn Pettersson's avatar
      Add support for computing "zext of value" in KnownBits. NFCI · d30f308a
      Bjorn Pettersson authored
      Summary:
      The description of KnownBits::zext() and
      KnownBits::zextOrTrunc() has confusingly claimed
      that the operation is equivalent to zero-extending the
      value we're tracking. That has not been true; instead,
      the user has been forced to explicitly set the extended
      bits as known zero afterwards.
      
      This patch adds a second argument to KnownBits::zext()
      and KnownBits::zextOrTrunc() to control if the extended
      bits should be considered as known zero or as unknown.
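      A tiny standalone model (not LLVM's actual KnownBits class; names invented) of what the new boolean argument controls:

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Known bits tracked as two masks: bits known to be 0 and bits known
      // to be 1. zext() widens the value; the boolean decides whether the
      // new high bits are marked known-zero or left unknown.
      struct KnownBitsModel {
        uint64_t Zero, One;
        unsigned BitWidth;
      };

      KnownBitsModel zextModel(KnownBitsModel K, unsigned NewWidth,
                               bool ExtendedBitsAreKnown) {
        KnownBitsModel R{K.Zero, K.One, NewWidth};
        uint64_t LowMask =
            (K.BitWidth == 64) ? ~0ULL : ((1ULL << K.BitWidth) - 1);
        uint64_t HighMask =
            (((NewWidth == 64) ? ~0ULL : ((1ULL << NewWidth) - 1))) & ~LowMask;
        if (ExtendedBitsAreKnown)
          R.Zero |= HighMask; // zero extension: high bits become known zero
        return R;
      }
      ```
      
      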
      
      Reviewers: craig.topper, RKSimon
      
      Reviewed By: RKSimon
      
      Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits
      
      Tags: #llvm
      
      Differential Revision: https://reviews.llvm.org/D58650
      
      llvm-svn: 355099
      d30f308a