  1. Jan 25, 2021
    • Nikita Popov's avatar
      [LSR] Drop potentially invalid nowrap flags when switching to post-inc IV (PR46943) · 835104a1
      Nikita Popov authored
      When LSR converts a branch on the pre-inc IV into a branch on the
      post-inc IV, the nowrap flags on the addition may no longer be valid.
      Previously, a poison result of the addition might have been ignored,
      in which case the program was well defined. After branching on the
      post-inc IV, we might be branching on poison, which is undefined behavior.
      
      Fix this by discarding nowrap flags which are not present on the SCEV
      expression. Nowrap flags on the SCEV expression are proven by SCEV
      to always hold, independently of how the expression will be used.
      This is essentially the same fix we applied to IndVars LFTR, which
      also performs this kind of pre-inc to post-inc conversion.
      
      I believe a similar problem can also exist for getelementptr inbounds,
      but I was not able to come up with a problematic test case. The
      inbounds case would have to be addressed differently anyway
      (as SCEV does not track this property).
      
      Fixes https://bugs.llvm.org/show_bug.cgi?id=46943.
      
      Differential Revision: https://reviews.llvm.org/D95286
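
      The flag-dropping rule described above can be sketched as a bitmask
      intersection. This is a minimal model, not LSR's code: the NoWrapFlags
      enum and keepProvenFlags() are hypothetical names, and real nowrap
      flags live on LLVM instructions and SCEV expressions rather than in
      plain integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for nowrap flags; these are not LLVM's types.
enum NoWrapFlags : std::uint8_t {
  FlagNone = 0,
  FlagNUW = 1 << 0, // no unsigned wrap
  FlagNSW = 1 << 1, // no signed wrap
};

// Keep a flag on the increment only if it is also proven for the
// expression independently of how the result is used. Flags that are
// only present on the instruction are dropped.
constexpr std::uint8_t keepProvenFlags(std::uint8_t instFlags,
                                       std::uint8_t provenFlags) {
  return instFlags & provenFlags;
}
```

      For example, an increment carrying both flags whose expression is
      only proven nuw would keep nuw and lose nsw.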
      835104a1
    • Richard Smith's avatar
      Revert "[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV" · 925ae8c7
      Richard Smith authored
      This reverts commit 53176c16, which
      introduced a layering violation. LLVM's IR library can't include
      headers from Analysis.
      925ae8c7
    • Akira Hatanaka's avatar
      [ObjC][ARC] Annotate calls with attributes instead of emitting retainRV · 53176c16
      Akira Hatanaka authored
      or claimRV calls in the IR
      
      Background:
      
      This patch makes changes to the front-end and middle-end that are
      needed to fix a longstanding problem where LLVM breaks ARC's autorelease
      optimization (see the link below) by separating calls from the marker
      instructions or retainRV/claimRV calls. The backend changes are in
      https://reviews.llvm.org/D92569.
      
      https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue
      
      What this patch does to fix the problem:
      
      - The front-end annotates calls with attribute "clang.arc.rv"="retain"
        or "clang.arc.rv"="claim", which indicates the call is implicitly
        followed by a marker instruction and a retainRV/claimRV call that
        consumes the call result. This is currently done only when the target
        is arm64 and the optimization level is higher than -O0.
      
      - ARC optimizer temporarily emits retainRV/claimRV calls after the
        annotated calls in the IR and removes the inserted calls after
        processing the function.
      
      - ARC contract pass emits retainRV/claimRV calls after the annotated
        calls. It doesn't remove the attribute on the call since the backend
        needs it to emit the marker instruction. The retainRV/claimRV calls
        are emitted late in the pipeline to prevent optimization passes from
        transforming the IR in a way that makes it harder for the ARC
        middle-end passes to figure out the def-use relationship between the
        call and the retainRV/claimRV calls (which is the cause of PR31925).
      
      - The function inliner removes the autoreleaseRV call in the callee that
        returns the result if nothing in the callee prevents it from being
        paired up with the calls annotated with "clang.arc.rv"="retain/claim"
        in the caller. If the call is annotated with "claim", a release call
        is inserted since autoreleaseRV+claimRV is equivalent to a release. If
        it cannot find an autoreleaseRV call, it tries to transfer the
        attributes to a function call in the callee. This is important since
        ARC optimizer can remove the autoreleaseRV call returning the callee
        result, which makes it impossible to pair it up with the retainRV or
        claimRV call in the caller. If that fails, it simply emits a retain
        call in the IR if the call is annotated with "retain" and does nothing
        if it's annotated with "claim".
      
      - This patch teaches dead argument elimination pass not to change the
        return type of a function if any of the calls to the function are
        annotated with attribute "clang.arc.rv". This is necessary since the
        pass can incorrectly determine nothing in the IR uses the function
        return, which can happen since the front-end no longer explicitly
        emits retainRV/claimRV calls in the IR, and change its return type to
        'void'.
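
      The inliner's fallback order from the list above can be summarized as
      a small decision function. This is only a sketch of the described
      order, not the inliner's implementation: RVKind, InlineAction,
      chooseAction() and both boolean inputs are hypothetical names that
      stand in for analyses the real pass would perform on the IR.

```cpp
#include <cassert>

// Which "clang.arc.rv" value the call in the caller is annotated with.
enum class RVKind { Retain, Claim };

// Hypothetical labels for the outcomes described in the commit message.
enum class InlineAction {
  RemoveAutoreleaseRV,           // pair found, annotation was "retain"
  RemoveAutoreleaseRVAndRelease, // pair found, annotation was "claim"
  TransferAttribute,             // move the annotation to a call in the callee
  EmitRetain,                    // nothing to pair with, "retain"
  Nothing,                       // nothing to pair with, "claim"
};

InlineAction chooseAction(RVKind kind, bool foundPairableAutoreleaseRV,
                          bool canTransferToCalleeCall) {
  if (foundPairableAutoreleaseRV) {
    // autoreleaseRV + claimRV is equivalent to a release.
    return kind == RVKind::Claim
               ? InlineAction::RemoveAutoreleaseRVAndRelease
               : InlineAction::RemoveAutoreleaseRV;
  }
  if (canTransferToCalleeCall)
    return InlineAction::TransferAttribute;
  return kind == RVKind::Retain ? InlineAction::EmitRetain
                                : InlineAction::Nothing;
}
```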
      
      Future work:
      
      - Use the attribute on x86-64.
      
      - Fix the auto upgrader to convert call+retainRV/claimRV pairs into
        calls annotated with the attributes.
      
      rdar://71443534
      
      Differential Revision: https://reviews.llvm.org/D92808
      53176c16
    • Florian Hahn's avatar
      [VPlan] Replace uses with new value in VPInstructionsToVPRecipe (NFC). · 76afbf60
      Florian Hahn authored
      Now that VPRecipeBase inherits from VPDef, we can always use the new
      VPValue for replacement, if the recipe defines one. Given the recipes
      that are supported at the moment, all new recipes must have either 0 or
      1 defined values.
      76afbf60
    • Nick Desaulniers's avatar
      [GVN] do not repeat PRE on failure to split critical edge · d3681289
      Nick Desaulniers authored
      Fixes an infinite loop encountered in GVN.
      
      GVN will delay PRE if it encounters critical edges, attempt to split
      them later via calls to SplitCriticalEdge(), then restart.
      
      The caller of GVN::splitCriticalEdges() assumed a return value of true
      meant that critical edges were split, that the IR had changed, and that
      PRE should be re-attempted, upon which we loop infinitely.
      
      This was exposed after D88438, by compiling the Linux kernel for s390,
      but the test case is reproducible on x86.
      
      Fixes: https://github.com/ClangBuiltLinux/linux/issues/1261
      
      Reviewed By: void
      
      Differential Revision: https://reviews.llvm.org/D94996
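
      The retry logic can be illustrated with a toy model. This is not
      GVN's code: Edge, the splitCriticalEdges() stand-in and
      countPreRounds() are hypothetical, and the sketch assumes that the
      caller should retry only when a split actually happened. If the
      helper reported "changed" for an edge it failed to split, the loop
      below would never stop on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A critical edge that either can or cannot be split.
struct Edge {
  bool splittable;
};

// Splits every edge that can be split and removes it from the list.
// Returns true only if at least one edge was really split.
bool splitCriticalEdges(std::vector<Edge> &critical) {
  const std::size_t before = critical.size();
  critical.erase(std::remove_if(critical.begin(), critical.end(),
                                [](const Edge &e) { return e.splittable; }),
                 critical.end());
  return critical.size() != before;
}

// Counts PRE attempts until the splitter reports no progress.
// 'cap' only guards the toy loop against running forever.
int countPreRounds(int splittable, int unsplittable, int cap) {
  std::vector<Edge> critical;
  for (int i = 0; i < splittable; ++i)
    critical.push_back(Edge{true});
  for (int i = 0; i < unsplittable; ++i)
    critical.push_back(Edge{false});

  int rounds = 0;
  bool retry = true;
  while (retry && rounds < cap) {
    ++rounds;                             // one PRE attempt
    retry = splitCriticalEdges(critical); // retry only on real progress
  }
  return rounds;
}
```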
      d3681289
    • Wei Mi's avatar
      [SampleFDO] Report error when reading a bad/incompatible profile instead of · c9cd9a00
      Wei Mi authored
      turning off SampleFDO silently.
      
      Currently the sample loader pass silently turns off the SampleFDO
      optimization when it hits an error while reading the profile. This
      behavior defeats tests that could have caught bad/incompatible profile
      problems. This patch changes the behavior to report an error instead.
      
      Differential Revision: https://reviews.llvm.org/D95269
      c9cd9a00
    • Xun Li's avatar
      17c3538a
    • Florian Hahn's avatar
      [VPlan] Handle scalarized values in VPTransformState. · 3201274d
      Florian Hahn authored
      This patch adds plumbing to handle scalarized values directly in
      VPTransformState.
      
      Reviewed By: gilr
      
      Differential Revision: https://reviews.llvm.org/D92282
      3201274d
    • Sanjay Patel's avatar
      [InstCombine] narrow min/max intrinsics with extended inputs · 09a136bc
      Sanjay Patel authored
      We can sink extends after min/max if they match and would
      not change the sign-interpreted compare. The only combo
      that doesn't work is zext+smin/smax because the zexts
      could change a negative number into positive:
      https://alive2.llvm.org/ce/z/D6sz6J
      
      Sext+umax/umin works:
      
        define i32 @src(i8 %x, i8 %y) {
        %0:
          %sx = sext i8 %x to i32
          %sy = sext i8 %y to i32
          %m = umax i32 %sx, %sy
          ret i32 %m
        }
        =>
        define i32 @tgt(i8 %x, i8 %y) {
        %0:
          %m = umax i8 %x, %y
          %r = sext i8 %m to i32
          ret i32 %r
        }
        Transformation seems to be correct!
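
      The same two claims can be checked by brute force over all i8 pairs.
      This is an independent sanity check in plain C++, not part of the
      patch; sext8(), zext8(), smin8() and the predicate names are helpers
      made up for this sketch, and i8 values are carried as raw 8-bit
      patterns in std::uint8_t.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Sign-extend and zero-extend an 8-bit pattern to 32 bits.
std::int32_t sext8(std::uint8_t v) {
  return static_cast<std::int32_t>(v ^ 0x80u) - 0x80;
}
std::uint32_t zext8(std::uint8_t v) { return v; }

// Signed minimum of two 8-bit patterns.
std::uint8_t smin8(std::uint8_t x, std::uint8_t y) {
  return sext8(x) < sext8(y) ? x : y;
}

// umax(sext x, sext y) == sext(umax(x, y)) ?
bool sextUmaxMatches(std::uint8_t x, std::uint8_t y) {
  std::uint32_t wide = std::max(static_cast<std::uint32_t>(sext8(x)),
                                static_cast<std::uint32_t>(sext8(y)));
  std::uint32_t narrow = static_cast<std::uint32_t>(sext8(std::max(x, y)));
  return wide == narrow;
}

// smin(zext x, zext y) == zext(smin(x, y)) ?
bool zextSminMatches(std::uint8_t x, std::uint8_t y) {
  std::int32_t wide = std::min(static_cast<std::int32_t>(zext8(x)),
                               static_cast<std::int32_t>(zext8(y)));
  std::int32_t narrow = static_cast<std::int32_t>(zext8(smin8(x, y)));
  return wide == narrow;
}

// Runs a predicate over every pair of 8-bit inputs.
bool holdsForAllI8(bool (*pred)(std::uint8_t, std::uint8_t)) {
  for (int x = 0; x < 256; ++x)
    for (int y = 0; y < 256; ++y)
      if (!pred(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(y)))
        return false;
  return true;
}
```

      The sext+umax predicate holds for every pair, while zext+smin fails
      for inputs such as x = -1 (0xFF) and y = 1: the wide form yields 1,
      the narrow form yields 255.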
      09a136bc
    • Sander de Smalen's avatar
      [SLPVectorizer] NFC: Migrate getVectorCallCosts to use InstructionCost. · 171d1248
      Sander de Smalen authored
      This change also changes getReductionCost to return InstructionCost,
      and it simplifies two expressions by removing a redundant 'isValid' check.
      171d1248
  2. Jan 24, 2021
  3. Jan 23, 2021
    • Roman Lebedev's avatar
      [NFC][SimplifyCFG] Extract... · 6f275327
      Roman Lebedev authored
      [NFC][SimplifyCFG] Extract CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses() out of PerformBranchToCommonDestFolding()
      
      To be used in PerformValueComparisonIntoPredecessorFolding()
      6f275327
    • Roman Lebedev's avatar
      [NFC][SimplifyCFG] Extract PerformValueComparisonIntoPredecessorFolding() out... · a4e6c2e6
      Roman Lebedev authored
      [NFC][SimplifyCFG] Extract PerformValueComparisonIntoPredecessorFolding() out of FoldValueComparisonIntoPredecessors()
      
      Less nested code is much easier to follow and modify.
      a4e6c2e6
    • Nikita Popov's avatar
      [IR] Add NoAliasScopeDeclInst (NFC) · c83cff45
      Nikita Popov authored
      Add an intrinsic type class to represent the
      llvm.experimental.noalias.scope.decl intrinsic, to make code
      working with it a bit nicer by hiding the metadata extraction
      from view.
      c83cff45
    • Kazu Hirata's avatar
      [llvm] Use pop_back_val (NFC) · 1238378f
      Kazu Hirata authored
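
      For readers unfamiliar with the idiom: pop_back_val() on LLVM's
      vector-like containers returns the last element and removes it in
      one call. A rough equivalent for std::vector might look like the
      sketch below; popBackVal() and sumByDraining() are hypothetical
      helpers written for this illustration, not LLVM functions.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns the last element by value and removes it from the vector,
// replacing the two-step "x = v.back(); v.pop_back();" pattern.
template <typename T>
T popBackVal(std::vector<T> &v) {
  assert(!v.empty() && "popBackVal on an empty vector");
  T result = std::move(v.back());
  v.pop_back();
  return result;
}

// Typical worklist use: drain the vector from the back.
int sumByDraining(std::vector<int> worklist) {
  int sum = 0;
  while (!worklist.empty())
    sum += popBackVal(worklist);
  return sum;
}
```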
      1238378f
    • Florian Hahn's avatar
      [InstCombine] Set MadeIRChange in replaceInstUsesWith. · d60b74c2
      Florian Hahn authored
      Some utilities used by InstCombine, like SimplifyLibCalls, may add new
      instructions and replace the uses of a call, but return nullptr because
      the inserted call produces multiple results.
      
      Previously, the replaced library calls would get removed by
      InstCombine's deleter, but after
      29207707 this may not happen, if the
      willreturn attribute is missing.
      
      As a work-around, update replaceInstUsesWith to set MadeIRChange, if it
      replaces any uses. This catches the cases where it is used as replacer
      by utilities used by InstCombine and seems useful in general; updating
      uses will modify the IR.
      
      This fixes an expensive-check failure when replacing
      @__sinpif/@__cospifi with @__sincospif_sret.
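
      The idea of recording a change whenever uses are rewritten can be
      shown with a toy model. Nothing here is InstCombine's code: values
      are plain integers, and Combiner, replaceUses() and changeReported()
      are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <vector>

struct Combiner {
  bool MadeIRChange = false;

  // Replaces every use of 'from' with 'to' and records that the IR
  // changed as soon as at least one use was rewritten.
  void replaceUses(std::vector<int> &uses, int from, int to) {
    for (int &u : uses) {
      if (u == from) {
        u = to;
        MadeIRChange = true;
      }
    }
  }
};

// Reports whether a replacement over 'uses' was flagged as a change.
bool changeReported(std::vector<int> uses, int from, int to) {
  Combiner c;
  c.replaceUses(uses, from, to);
  return c.MadeIRChange;
}
```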
      d60b74c2
    • Sanjay Patel's avatar
      [SLP] fix fast-math-flag propagation on FP reductions · a6f02212
      Sanjay Patel authored
      As shown in the test diffs, we could miscompile by
      propagating flags that did not exist in the original
      code.
      
      The flags required for fmin/fmax reductions will be
      fixed in a follow-up patch.
      a6f02212
    • Florian Hahn's avatar
      [Local] Treat calls that may not return as being alive. · 29207707
      Florian Hahn authored
      With the addition of the `willreturn` attribute, functions that may
      not return (e.g. due to an infinite loop) are well defined, if they are
      not marked as `willreturn`.
      
      This patch updates `wouldInstructionBeTriviallyDead` to not consider
      calls that may not return as dead.
      
      This patch still provides an escape hatch for intrinsics, which are
      still unconditionally assumed to be willreturn. The escape hatch will
      be removed once all intrinsic definitions have been reviewed and updated.
      
      Reviewed By: jdoerfert
      
      Differential Revision: https://reviews.llvm.org/D94106
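
      A simplified model of the updated check is sketched below. It is not
      the real `wouldInstructionBeTriviallyDead`: CallInfo and
      wouldCallBeTriviallyDead() are hypothetical, and the real function
      considers many more properties than these four booleans.

```cpp
#include <cassert>

// Hypothetical summary of what is known about a call.
struct CallInfo {
  bool hasUses;            // the result is still used
  bool mayHaveSideEffects; // e.g. may write memory
  bool willReturn;         // marked `willreturn`
  bool isIntrinsic;        // covered by the temporary escape hatch
};

bool wouldCallBeTriviallyDead(const CallInfo &c) {
  if (c.hasUses || c.mayHaveSideEffects)
    return false;
  // A call that may not return (e.g. an infinite loop) is observable,
  // so it is only dead if it is known to return. Intrinsics are still
  // assumed to return for now.
  return c.willReturn || c.isIntrinsic;
}
```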
      29207707
    • Roman Lebedev's avatar
      [SimplifyCFG] Change 'LoopHeaders' to be ArrayRef<WeakVH>, not a naked set,... · 022da61f
      Roman Lebedev authored
      [SimplifyCFG] Change 'LoopHeaders' to be ArrayRef<WeakVH>, not a naked set, thus avoiding dangling pointers
      
      If i change it to AssertingVH instead, a number of existing tests fail,
      which means we don't consistently remove from the set when deleting blocks,
      which means newly-created blocks may happen to appear in that set
      if they happen to occupy the same memory chunk as did some block
      that was in the set originally.
      
      There are many places where we delete blocks,
      and while we could probably consistently delete from LoopHeaders
      when deleting a block in transforms located in SimplifyCFG.cpp itself,
      transforms located elsewhere (Local.cpp/BasicBlockUtils.cpp) also may
      delete blocks, and it doesn't seem good to teach them to deal with it.
      
      Since we at most only ever delete from LoopHeaders,
      let's just delegate to WeakVH to do that automatically.
      
      But to be honest, personally, i'm not sure that the idea
      behind LoopHeaders is sound.
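
      The dangling-pointer hazard and the handle-based fix can be
      illustrated with std::weak_ptr as a loose analogy. This is only an
      analogy: WeakVH is an LLVM value handle that is nulled out when its
      Value is deleted and does not involve shared ownership. Block,
      countLive() and liveHeadersAfterDeletingOne() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Block {
  int id;
};

// Counts handles that still refer to a live block.
std::size_t countLive(const std::vector<std::weak_ptr<Block>> &headers) {
  std::size_t live = 0;
  for (const auto &h : headers)
    if (!h.expired())
      ++live;
  return live;
}

std::size_t liveHeadersAfterDeletingOne() {
  auto a = std::make_shared<Block>(Block{0});
  auto b = std::make_shared<Block>(Block{1});
  std::vector<std::weak_ptr<Block>> headers{a, b};

  a.reset(); // "delete" block a; its handle expires automatically

  // A new block may be placed where 'a' used to live. A set of raw
  // pointers could mistake it for a loop header; the handles cannot.
  auto c = std::make_shared<Block>(Block{2});
  (void)c;

  return countLive(headers);
}
```

      With raw pointers the stale entry for the deleted block would still
      compare equal to any new block that happened to reuse its address,
      which is the failure mode the commit message describes.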
      022da61f
    • Jeroen Dobbelaere's avatar
      [InlineFunction] Use llvm.experimental.noalias.scope.decl for noalias arguments. · 2b9a834c
      Jeroen Dobbelaere authored
      Insert a llvm.experimental.noalias.scope.decl intrinsic that identifies where a noalias argument was inlined.
      
      This patch includes some refactorings from D90104.
      
      Reviewed By: nikic
      
      Differential Revision: https://reviews.llvm.org/D93040
      2b9a834c
    • Zequan Wu's avatar
      [InstCombine] remove incompatible attribute when simplifying some lib calls · 867bdfef
      Zequan Wu authored
      Like D95088, remove incompatible attribute in more lib calls.
      
      Differential Revision: https://reviews.llvm.org/D95278
      867bdfef
    • Philip Reames's avatar
      [LoopDeletion] Handle inner loops w/untaken backedges · ef51eed3
      Philip Reames authored
      This builds on the restricted form of D93906 that landed after the initial revert, and adds back support for breaking backedges of inner loops. It turns out the original invalidation logic wasn't quite right, specifically around the handling of LCSSA.
      
      When breaking the backedge of an inner loop, we can cause blocks which were in the outer loop only because they were also included in a sub-loop to be removed from both loops. This results in the exit block set for our original parent loop changing, and thus a need for new LCSSA phi nodes.
      
      This case happens when the inner loop has an exit block which is also an exit block of the parent, and there's a block in the child which reaches an exit to said block without also reaching an exit to the parent loop.
      
      (I'm describing this in terms of the immediate parent, but the problem is general for any transitive parent in the nest.)
      
      The approach implemented here involves a potentially expensive LCSSA rebuild.  Perf testing during review didn't show anything concerning, but we may end up needing to revert this if anyone encounters a practical compile time issue.
      
      Differential Revision: https://reviews.llvm.org/D94378
      ef51eed3
  4. Jan 22, 2021