  1. Jul 22, 2015
    • [PM/AA] Remove the last of the legacy update API from AliasAnalysis as · a1032a0f
      Chandler Carruth authored
      part of simplifying its interface and usage in preparation for porting
      to work with the new pass manager.
      
      Note that this will likely expose that we have dead arguments, members,
      and maybe even pass requirements for AA. I'll be cleaning those up in
      separate patches. This just zaps the actual update API.
      
      Differential Revision: http://reviews.llvm.org/D11325
      
      llvm-svn: 242881
    • [PM/AA] Switch to an early-exit. NFC. This was split out of another · d86a4f5e
      Chandler Carruth authored
      change because the diff is *useless*. I assure you, I just switched to
      early-return in this function.
      
      Cleanup in preparation for my next commit, as requested in code review!
      
      llvm-svn: 242880
    • [LoopUnswitch] Code refactoring to separate trivial loop unswitch and... · c0f3a158
      Chen Li authored
      [LoopUnswitch] Code refactoring to separate trivial loop unswitch and non-trivial loop unswitch in processCurrentLoop()
      
      Summary: The current code in LoopUnswitch::processCurrentLoop() mixes trivial and non-trivial loop unswitching together. It goes over all basic blocks in the loop and checks whether a condition is a trivial or non-trivial unswitch condition. However, a trivial unswitch condition can only occur in the loop header basic block (where it controls whether or not the loop does anything at all). This refactoring separates trivial from non-trivial loop unswitching. Before going over all basic blocks in the loop, it checks whether the loop header contains a trivial unswitch condition; if so, it unswitches on it. Otherwise, it goes over all blocks as before, but no longer checks for trivial conditions, since they cannot occur in the other blocks. This change has no functional effect.
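
      For reference, the trivial case looks roughly like the following schematic IR fragment (hypothetical, not taken from the patch): a loop-invariant condition in the header where one successor leaves the loop immediately, so unswitching it needs no loop cloning.
      
       loop.header:
         %cond = icmp eq i32 %invariant, 0           ; loop-invariant condition
         br i1 %cond, label %loop.exit, label %loop.body
       loop.body:
         ; the loop only does real work when the header condition allows it
         br label %loop.header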
      
      Reviewers: meheff, reames, broune
      
      Subscribers: llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D11276
      
      llvm-svn: 242873
    • [SROA] Fix a nasty pile of bugs to do with big-endian, different alloca · ccffdaf7
      Chandler Carruth authored
      types and loads, loads or stores widened past the size of an alloca,
      etc.
      
      This started off with a bug report about big-endian behavior with
      bitfields and loads and stores to a { i32, i24 } struct. An initial
      attempt to fix this was sent for review in D10357, but that didn't
      really get to the root of the problem.
      
      The core issue was that canConvertValue and convertValue in SROA were
      handling different bitwidth integers by doing a zext of the integer. It
      wouldn't do a trunc though, only a zext! This would in turn lead SROA to
      form an i24 load from an i24 alloca, zext it to i32, and then use it.
      This would at least produce the wrong value for big-endian systems.
      
      One of my many false starts here was to correct the computation for
      big-endian systems by shifting. But this doesn't actually work because
      the original code has a 64-bit store to the entire 8 bytes, and a 32-bit
      load of the last 4 bytes, and because the alloc size is 8 bytes, we
      can't lose that last (least significant if big-endian) byte! The real
      problem here is that we're forming an i24 load in SROA which is actually
      not sufficiently wide to load all of the necessary bits here. The source
      has an i32 load, and SROA needs to form that as well.
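
      A hypothetical sketch of that shape of IR (the names, types, and exact access pattern are illustrative only, not taken from the original report):
      
       %a = alloca { i32, i24 }                      ; alloc size is 8 bytes
       %w = bitcast { i32, i24 }* %a to i64*
       store i64 %val, i64* %w                       ; 64-bit store to all 8 bytes
       %f = getelementptr inbounds { i32, i24 }, { i32, i24 }* %a, i32 0, i32 1
       %g = bitcast i24* %f to i32*
       %x = load i32, i32* %g                        ; 32-bit load of the last 4 bytes
      
      Rewriting that load as an i24 load plus a zext covers only three of the four bytes, which is exactly the value that goes wrong on a big-endian target.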
      
      The straightforward way to do this is to disable the zext logic in
      canConvertValue and convertValue, forcing us to actually load all
      32-bits. This seems like a really good change, but it in turn breaks
      several other parts of SROA.
      
      First in the chain of knock-on failures, we had places where we were
      doing integer-widening promotion even though some of the integer loads
      or stores extended *past the end* of the alloca's memory! There was even
      a comment about preventing this, but it only prevented the case where
      the type had a different bit size from its store size. So I added checks
      to handle the cases where we actually have a widened load or store and
      to avoid attempting the special integer widening promotion in those cases.
      
      Second, we actually rely on the ability to promote in the face of loads
      past the end of an alloca! This is important so that we can (for
      example) speculate loads around PHI nodes to do more promotion. The bits
      loaded are garbage, but as long as they aren't used and the alignment is
      suitably high (which it wasn't in the test case!), this is "fine". And we
      can't stop promoting here; lots of things stop working well if we do. So
      we need to add specific logic to handle the extension (and truncation)
      case, but *only* where that extension or truncation are over bytes that
      *are outside the alloca's allocated storage* and thus totally bogus to
      load or store.
      
      And of course, once we add back this correct handling of extension or
      truncation, we need to correctly handle big-endian systems to avoid
      re-introducing the exact bug that started us off on this chain of misery
      in the first place, but this time even more subtle as it only happens
      along speculated loads atop a PHI node.
      
      I've ported an existing test for PHI speculation to the big-endian test
      file and checked that we get that part correct, and I've added several
      more interesting big-endian test cases that should help check that we're
      getting this correct.
      
      Fun times.
      
      llvm-svn: 242869
  2. Jul 21, 2015
  3. Jul 19, 2015
  4. Jul 18, 2015
    • Rangify for loops in GlobalDCE, NFC. · 3d49f6df
      Yaron Keren authored
      llvm-svn: 242619
    • [PM/AA] Remove the addEscapingUse update API that won't be easy to · 9f2bf1af
      Chandler Carruth authored
      directly model in the new PM.
      
      This also was an incredibly brittle and expensive update API that was
      never fully utilized by all the passes that claimed to preserve AA, nor
      could it reasonably have been extended to all of them. Any number of
      places add uses of values. If we ever wanted to reliably instrument
      this, we would want a callback hook much like we have with ValueHandles,
      but doing this for every use addition seems *extremely* expensive in
      terms of compile time.
      
      The only user of this update mechanism is GlobalsModRef. The idea of
      using this to keep it up to date doesn't really work anyway, as its
      analysis requires a symmetric analysis of two different memory
      locations. It would be very hard to make updates be sufficiently
      rigorous to *guarantee* symmetric analysis in this way, and it pretty
      certainly isn't true today.
      
      However, folks have been using GMR with this update for a long time and
      seem to not be hitting the issues. The reported issue that the update
      hook fixes isn't even a problem any more as other changes to
      GetUnderlyingObject worked around it, and that issue stemmed from *many*
      years ago. As a consequence, a prior patch provided a flag to control
      the unsafe behavior of GMR, and this patch removes the update mechanism
      that has questionable compile-time tradeoffs and is causing problems
      with moving to the new pass manager. Note the lack of test updates --
      not one test in tree actually requires this update, even for a contrived
      case.
      
      All of this was extensively discussed on the dev list; this patch just
      enacts what that discussion decided on. I'm sending it for review in
      part to show what I'm planning, and in part to show the *amazing* amount
      of work this avoids. Every call to the AA here is something like three
      to six indirect function calls, which in the non-LTO pipeline never do
      any work! =[
      
      Differential Revision: http://reviews.llvm.org/D11214
      
      llvm-svn: 242605
    • [asan] Fix shadow mapping on Android/AArch64. · 9cb08f82
      Evgeniy Stepanov authored
      Instrumentation and the runtime library were in disagreement about
      ASan shadow offset on Android/AArch64.
      
      This fixes a large number of existing tests on Android/AArch64.
      
      llvm-svn: 242595
  5. Jul 17, 2015
  6. Jul 16, 2015
    • Internalize: internalize comdat members as a group, and drop comdat on such members. · 9b0fe610
      Peter Collingbourne authored
      Internalizing an individual comdat group member without also internalizing
      the other members of the comdat can break comdat semantics. For example,
      if a module contains a reference to an internalized comdat member, and the
      linker chooses a comdat group from a different object file, this will break
      the reference to the internalized member.
      
      This change causes the internalizer to only internalize comdat members if all
      other members of the comdat are not externally visible. Once a comdat group
      has been fully internalized, there is no need to apply comdat rules to its
      members; later optimization passes (e.g. globaldce) can legally drop individual
      members of the comdat. So we drop the comdat attribute from all comdat members.
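
      A minimal sketch of the rule, using a hypothetical comdat group (the symbols are made up for illustration):
      
       $group = comdat any
       define linkonce_odr void @a() comdat($group) { ret void }
       define linkonce_odr void @b() comdat($group) { ret void }
      
      If either @a or @b must remain externally visible, neither is internalized; if both can be internalized, both become internal and the comdat($group) attribute is dropped from them, after which passes such as globaldce may remove them individually.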
      
      Differential Revision: http://reviews.llvm.org/D10679
      
      llvm-svn: 242423
    • Add PM extension point EP_VectorizerStart · 39a7bd18
      Tobias Grosser authored
      This extension point allows passes to be executed right before the vectorizer
      and other highly target-specific optimizations are run.
      
      llvm-svn: 242389
    • Create a wrapper pass for BranchProbabilityInfo. · ab23bfbc
      Cong Hou authored
      This new wrapper pass is useful when we want to do branch probability analysis conditionally (e.g. only in PGO mode) but don't want to add one more pass dependence.
      
      http://reviews.llvm.org/D11241
      
      llvm-svn: 242349
    • [LoopUnswitch] Add an else clause to IsTrivialUnswitchCondition() when... · 3f5ed156
      Chen Li authored
      [LoopUnswitch] Add an else clause to IsTrivialUnswitchCondition() when checking HeaderTerm instruction type
      
      Summary:
      This is a trivial code change with no functional effect.
      
      When LoopUnswitch determines a trivial unswitch condition, it checks whether the loop header's terminator instruction is a branch or a switch instruction, since a trivial unswitch condition can only apply to these two instruction types. The current code does not fail the check directly on other instruction types, but instead checks the nullness of the LoopExitBB variable. The added else clause makes the check fail immediately on other instruction types and makes the code more obvious.
      
      Reviewers: reames
      
      Subscribers: llvm-commits
      
      Differential Revision: http://reviews.llvm.org/D11239
      
      llvm-svn: 242345
  7. Jul 15, 2015
  8. Jul 14, 2015
    • GVN: use a static array instead of regenerating it each time. NFC. · 586b7419
      Tim Northover authored
      llvm-svn: 242202
    • GVN: tolerate an instruction being replaced without existing in the leaderboard · d5fdef01
      Tim Northover authored
      Sometimes an incidentally created instruction can duplicate a Value used
      elsewhere. It then often doesn't end up in the leader table. If it's later
      removed, we attempt to remove it from the leader table and segfault.
      
      Instead we should just ignore the removal request, which won't cause any
      problems. The reverse situation, where the original instruction is replaced by
      the new one (which you might think could leave the leader table empty) cannot
      occur, because the incidental instruction will never be found in the first
      place.
      
      llvm-svn: 242199
    • [SROA] Don't de-atomic volatile loads and stores · 62690b19
      David Majnemer authored
      Volatile loads and stores are made visible in global state regardless of
      what memory is involved.  It is not correct to disregard the ordering
      and synchronization scope because it is possible to synchronize with
      memory operations performed by hardware.
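
      For example, an access like the following hypothetical one carries an atomic ordering in addition to being volatile, and that ordering must survive SROA's rewriting of the underlying memory:
      
       %v = load atomic volatile i32, i32* %p seq_cst, align 4
       store atomic volatile i32 %v, i32* %q release, align 4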
      
      This partially addresses PR23737.
      
      llvm-svn: 242126
    • Update enforceKnownAlignment after the isWeakForLinker semantic change · 486fa397
      Reid Kleckner authored
      Previously we would refrain from attempting to increase the alignment of
      available_externally globals because they were considered weak for the
      linker. Now they are treated more like a declaration instead of a weak
      definition.
      
      This was causing SSE alignment faults in Chromium, when some code
      assumed it could increase the alignment of a dllimported global that it
      didn't control.  http://crbug.com/509256
      
      llvm-svn: 242091
  9. Jul 13, 2015
    • Loop idiom recognizer was replacing too many uses of popcount. · 90d95edb
      Pete Cooper authored
      When spotting that a loop can use ctpop, we were incorrectly replacing all uses of a value with a value derived from ctpop.
      
      The bug here was exposed because we were replacing a use that occurs prior to the ctpop with the ctpop value, creating a use before a def, i.e., we changed
      
       %tobool.5 = icmp ne i32 %num, 0
       store i1 %tobool.5, i1* %ptr
       br i1 %tobool.5, label %for.body.lr.ph, label %for.end
      
      to
      
       store i1 %1, i1* %ptr
       %0 = call i32 @llvm.ctpop.i32(i32 %num)
       %1 = icmp ne i32 %0, 0
       br i1 %1, label %for.body.lr.ph, label %for.end
      
      Even if we inserted the ctpop so that it dominates the store here, that would still be incorrect.  The store doesn’t want the result of ctpop.
      
      The fix is very simple, and involves replacing only the branch condition with the ctpop instead of all uses.
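
      With the fix, the rewritten block would look roughly like this (a sketch rather than verbatim compiler output): the store keeps the original condition, and only the branch condition is derived from ctpop.
      
       %tobool.5 = icmp ne i32 %num, 0
       store i1 %tobool.5, i1* %ptr
       %0 = call i32 @llvm.ctpop.i32(i32 %num)
       %1 = icmp ne i32 %0, 0
       br i1 %1, label %for.body.lr.ph, label %for.end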
      
      Reviewed by Hal Finkel.
      
      llvm-svn: 242068
    • Enable runtime unrolling with unroll pragma metadata · d7ebc241
      Mark Heffernan authored
      Enable runtime unrolling for loops with unroll count metadata ("#pragma unroll N")
      and a runtime trip count. Also, do not unroll loops with unroll full metadata if the
      loop has a runtime loop count. Previously, such loops would be unrolled with a
      very large threshold (pragma-unroll-threshold) if runtime unrolling happened to be
      enabled, resulting in a very large (and likely unwise) unroll factor.
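
      For context, the pragma shows up in IR as loop metadata of roughly this shape (a hypothetical example corresponding to "#pragma unroll 4"):
      
       br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
       ; ...
       !0 = distinct !{!0, !1}
       !1 = !{!"llvm.loop.unroll.count", i32 4}
      
      A loop carrying this metadata that also has a runtime trip count is now handled by runtime unrolling rather than by the very large pragma-unroll-threshold.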
      
      llvm-svn: 242047
    • Avoid using Loop::getSubLoopsVector. · e448b5be
      Benjamin Kramer authored
      Passes should never modify it; just use the const version. While there,
      reduce copying in LoopInterchange. No functional change intended.
      
      llvm-svn: 242041
    • Remove unused variable. · c1d63f74
      Rafael Espindola authored
      Sorry I missed it in the previous commit.
      
      llvm-svn: 242032