  1. Feb 25, 2014
    • Make DataLayout a plain object, not a pass. · 93512512
      Rafael Espindola authored
      Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM
      that don't handle passes to also use DataLayout.
      
      llvm-svn: 202168
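      For context, a minimal sketch of the pattern described above, using made-up names
      rather than LLVM's actual classes: the data lives in a plain object, and a thin
      pass-style wrapper merely holds an instance, so code that never runs under the
      pass manager can still construct and use the object directly.

        // Hypothetical names for illustration only; not the LLVM implementation.
        struct DataLayoutInfo {                  // plain object, no pass machinery needed
          unsigned PointerSizeInBits = 64;
        };

        struct DataLayoutWrapperPass {           // thin pass that just holds one
          DataLayoutInfo DL;
          const DataLayoutInfo &getDataLayout() const { return DL; }
        };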
    • Factor out calls to AA.getDataLayout(). · 6d6e87be
      Rafael Espindola authored
      llvm-svn: 202157
    • [SROA] Use the original load name with the SROA-prefixed IRB rather than · 25adb7b0
      Chandler Carruth authored
      just "load". This helps avoid pointless de-duping with order-sensitive
      numbers as we already have unique names from the original load. It also
      makes the resulting IR quite a bit easier to read.
      
      llvm-svn: 202140
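      A minimal sketch of the naming idea, assuming the modern
      IRBuilder::CreateLoad(Type*, Value*, Twine) overload rather than the commit's
      exact code: carrying the original load's name onto the rewritten load avoids
      the numbered de-dup suffixes that appear when every new load is just "load".

        #include "llvm/IR/IRBuilder.h"
        #include "llvm/IR/Instructions.h"

        using namespace llvm;

        // Name the rewritten load after the load it replaces instead of plain "load".
        Value *rewriteLoad(IRBuilder<> &IRB, LoadInst &OldLI, Value *NewPtr) {
          return IRB.CreateLoad(OldLI.getType(), NewPtr, OldLI.getName());
        }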
    • [SROA] Thread the ability to add a pointer-specific name prefix through · cb93cd2d
      Chandler Carruth authored
      the pointer adjustment code. This is the primary code path that creates
      totally new instructions in SROA and being able to lump them based on
      the pointer value's name for which they were created causes
      *significantly* fewer name collisions and general noise in the debug
      output. This is particularly significant because it is making it much
      harder to track down instability in the output of SROA, as name
      de-duplication is a totally harmless form of instability that gets in
      the way of seeing real problems.
      
      The new fancy naming scheme tries to dig out the root "pre-SROA" name
      for pointer values and associate that all the way through the pointer
      formation instructions. Digging out the root is important to prevent the
      multiple iterative rounds of SROA from just layering too much cruft on
      top of cruft here. We already track the layers of SROA's iteration in the
      alloca name prefix. We don't need to duplicate it here.
      
      Should have no functionality change, and shouldn't have any really
      measurable impact on NDEBUG builds, as most of the complex logic is
      debug-only.
      
      llvm-svn: 202139
    • [SROA] Rather than copying the logic for building a name prefix into the · 51175533
      Chandler Carruth authored
      PHI-pointer builder, just copy the builder and clobber the obvious
      fields.
      
      llvm-svn: 202136
    • [SROA] Simplify some of the logic to dig out the old pointer value by · 8183a50f
      Chandler Carruth authored
      using OldPtr more heavily. Lots of this code was written before the
      rewriter had an OldPtr member setup ahead of time. There are already
      asserts in place that should ensure this doesn't change any
      functionality.
      
      llvm-svn: 202135
    • [SROA] Adjust to new clang-format style. · 7625c54e
      Chandler Carruth authored
      llvm-svn: 202134
    • [SROA] Fix a *glaring* bug in r202091: you have to actually *write* · a8c4cc68
      Chandler Carruth authored
      the break statement, not just think it to yourself....
      
      No idea how this worked at all, much less survived most bots, my
      bootstrap, and some bot bootstraps!
      
      The Polly one didn't survive, and this was filed as PR18959. I don't
      have a reduced test case and honestly I'm not seeing the need. What we
      probably need here are better asserts / debug-build behavior in
      SmallPtrSet so that this madness doesn't make it so far.
      
      llvm-svn: 202129
    • Silence GCC warning · 26af6f7f
      Alexey Samsonov authored
      llvm-svn: 202119
    • Fix typos · 70b36995
      Alp Toker authored
      llvm-svn: 202107
    • [SROA] Add a debugging tool which shuffles the slices sequence prior to · 83cee772
      Chandler Carruth authored
      sorting it. This helps uncover latent reliance on the original ordering,
      which isn't guaranteed to be preserved by std::sort (but often is), and
      which is based on use-def chain orderings that also aren't (technically)
      guaranteed.
      
      Only available in C++11 debug builds, and behind a flag to prevent noise
      at the moment, but this is generally useful so figured I'd put it in the
      tree rather than keeping it out-of-tree.
      
      llvm-svn: 202106
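      A minimal standalone sketch of the debugging idea (not the commit's code):
      shuffling before a non-stable sort makes any hidden dependence on the incoming
      order show up quickly, since std::sort gives no guarantee about the relative
      order of equal elements.

        #include <algorithm>
        #include <random>
        #include <vector>

        // When ShuffleFirst is set (a debug-only flag in spirit), randomize the order
        // before sorting so latent reliance on the original ordering is exposed.
        void sortSlices(std::vector<int> &Slices, bool ShuffleFirst) {
          if (ShuffleFirst) {
            std::mt19937 Rng(std::random_device{}());
            std::shuffle(Slices.begin(), Slices.end(), Rng);
          }
          std::sort(Slices.begin(), Slices.end());
        }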
    • [SROA] Use a more direct way of determining whether we are processing · bb2a9324
      Chandler Carruth authored
      the destination operand or source operand of a memmove.
      
      It so happens that it was impossible for SROA to try to rewrite
      self-memmove where the operands are *identical*, because either such
      a thing is volatile (and we don't rewrite) or it is non-volatile, and we
      don't even register it as a use of the alloca.
      
      However, making the 'IsDest' test *rely* on this subtle fact is... Very
      confusing for the reader. We should use the direct and readily available
      test of the Use* which gives us concrete information about which operand
      is being rewritten.
      
      No functionality changed, I hope! ;]
      
      llvm-svn: 202103
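      A hedged sketch of the idea, not the exact SROA code: the Use already says which
      operand of the memmove is being rewritten, since the destination pointer of
      llvm.memcpy/llvm.memmove is operand 0 and the source pointer is operand 1.

        #include "llvm/IR/IntrinsicInst.h"
        #include "llvm/IR/Use.h"
        #include <cassert>

        using namespace llvm;

        // Decide dest-vs-source directly from the Use's operand number instead of
        // comparing pointer values against the alloca being rewritten.
        bool isDestOperand(const Use &U) {
          assert(isa<MemTransferInst>(U.getUser()) && "expected a memcpy/memmove use");
          return U.getOperandNo() == 0;
        }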
    • [SROA] Fix another instability in SROA with respect to the slice · 3bf18ed5
      Chandler Carruth authored
      ordering.
      
      The fundamental problem that we're hitting here is that the use-def
      chain ordering is *itself* not a stable thing to be relying on in the
      rewriting for SROA. Further, we use a non-stable sort over the slices to
      arrange them based on the section of the alloca they're operating on.
      With a debugging STL implementation (or different implementations in
      stage2 and stage3) this can cause stage2 != stage3.
      
      The specific aspect of this problem fixed in this commit deals with the
      rewriting and load-speculation around PHIs and Selects. This, like many
      other aspects of the use-rewriting in SROA, is really part of the
      "strong SSA-formation" that is doen by SROA where it works very hard to
      canonicalize loads and stores in *just* the right way to satisfy the
      needs of mem2reg[1]. When we have a select (or a PHI) with 2 uses of the
      same alloca, we test that loads downstream of the select are
      speculatable around it twice. If only one of the operands to the select
      needs to be rewritten, then if we get lucky we rewrite that one first
      and the select is immediately speculatable. This can cause the order of
      operand visitation, and thus the order of slices to be rewritten, to
      change an alloca from promotable to non-promotable and vice versa.
      
      The fix is to defer all of the speculation until *after* the rewrite
      phase is done. Once we've rewritten everything, we can accurately test
      for whether speculation will work (once, instead of twice!) and the
      order ceases to matter.
      
      This also happens to simplify the other subtlety of speculation -- we
      need to *not* speculate anything unless the result of speculating will
      make the alloca fully promotable by mem2reg. I had a previous attempt at
      simplifying this, but it was still pretty horrible.
      
      There is actually already a *really* nice test case for this in
      basictest.ll, but on multiple STL implementations and inputs, we just
      got "lucky". Fortunately, the test case is very small and we can
      essentially build it in exactly the opposite way to get reasonable
      coverage in both directions even from normal STL implementations.
      
      llvm-svn: 202092
    • Make some DataLayout pointers const. · aeff8a9c
      Rafael Espindola authored
      No functionality change. Just reduces the noise of an upcoming patch.
      
      llvm-svn: 202087
  2. Feb 22, 2014
  3. Feb 21, 2014
  4. Feb 19, 2014
    • X86 CodeGenPrep: sink shufflevectors before shifts · aeb8e06d
      Tim Northover authored
      On x86, shifting a vector by a scalar is significantly cheaper than shifting a
      vector by another fully general vector. Unfortunately, because SelectionDAG
      operates on just one basic block at a time, the shufflevector instruction that
      reveals whether the right-hand side of a shift *is* really a scalar is often
      not visible to CodeGen when it's needed.
      
      This adds another handler to CodeGenPrepare, to sink any useful shufflevector
      instructions down to the basic block where they're used, predicated on a target
      hook (since on other architectures, doing so will often just introduce extra
      real work).
      
      rdar://problem/16063505
      
      llvm-svn: 201655
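      A minimal sketch of the sinking step, with a made-up helper name rather than the
      commit's CodeGenPrepare code: move the shufflevector down into the block of the
      shift that uses it so instruction selection, which works one block at a time,
      can see that the shift amount is really a splatted scalar.

        #include "llvm/IR/Instructions.h"

        using namespace llvm;

        // Sink a shuffle into its user's block; a real pass would first consult the
        // target hook and check for other users in earlier blocks.
        void sinkShuffleToUser(Instruction *Shuffle, Instruction *Shift) {
          if (Shuffle->getParent() != Shift->getParent())
            Shuffle->moveBefore(Shift);
        }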
  5. Feb 18, 2014
  6. Feb 14, 2014
  7. Feb 11, 2014
    • [LPM] Switch LICM to actively use LCSSA in addition to preserving it. · fc25854b
      Chandler Carruth authored
      Fixes PR18753 and PR18782.
      
      This is necessary for LICM to preserve LCSSA correctly and efficiently.
      There is still some active discussion about whether we should be using
      LCSSA, but we can't just immediately stop using it and we *need* LICM to
      preserve it while we are using it. We can restore the old SSAUpdater
      driven code if and when there is a serious effort to remove the reliance
      on LCSSA from all of the loop passes.
      
      However, this also serves as a great example of why LCSSA is very nice
      to have. This change significantly simplifies the process of sinking
      instructions for LICM, and makes it quite a bit less expensive.
      
      It wouldn't even be as complex as it is except that I had to start the
      process of removing the big recursive LCSSA formation hammer in order to
      switch even this much of the re-forming code to asserting that LCSSA was
      preserved. I'll fully remove that next just to tidy things up until the
      LCSSA debate settles one way or the other.
      
      llvm-svn: 201148
    • [CodeGenPrepare] Undo changes that happened for the profitability check. · 5a69dda9
      Quentin Colombet authored
      The addressing mode matcher checks at some point the profitability of folding an
      instruction into the addressing mode. When the instruction to be folded has
      several uses, it checks that the instruction can be folded in each use.
      To do so, it creates a new matcher for each use and checks if the instruction is
      in the list of the matched instructions of this new matcher.
      
      The new matchers may promote some instructions and this has to be undone to keep
      the state of the original matcher consistent.
      
      A test case will follow.
      
      <rdar://problem/16020230>
      
      llvm-svn: 201121
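      A generic sketch of the rollback requirement, using hypothetical types (the
      actual matcher undoes individual promotions rather than copying state): run each
      per-use trial against a copy and commit it only on success, so the original
      matcher never observes a failed trial's changes.

        #include <functional>
        #include <vector>

        struct PromotedInst {};                  // hypothetical record of one promotion

        struct MatcherState {
          std::vector<PromotedInst> Promotions;  // instructions promoted so far
        };

        // Try folding for a single use against a copy of the state; commit the copy
        // only if the trial succeeds, so a failed trial leaves Original untouched.
        bool tryFoldForUse(MatcherState &Original,
                           const std::function<bool(MatcherState &)> &MatchOneUse) {
          MatcherState Trial = Original;         // snapshot for the speculative match
          if (!MatchOneUse(Trial))
            return false;                        // the trial's promotions are dropped
          Original = Trial;                      // commit only on success
          return true;
        }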
  8. Feb 10, 2014
  9. Feb 08, 2014
  10. Feb 06, 2014
  11. Feb 04, 2014
  12. Feb 01, 2014
    • [LPM] Apply a really big hammer to fix PR18688 by recursively reforming · 1665152c
      Chandler Carruth authored
      LCSSA when we promote to SSA registers inside of LICM.
      
      Currently, this is actually necessary. The promotion logic in LICM uses
      SSAUpdater which doesn't understand how to place LCSSA PHI nodes.
      Teaching it to do so would be a very significant undertaking. It may be
      worthwhile and I've left a FIXME about this in the code as well as
      starting a thread on llvmdev to try to figure out the right long-term
      solution.
      
      For now, the PR needs to be fixed. Short of using the promotion
      SSAUpdater to place both the LCSSA PHI nodes and the promoted PHI nodes,
      I don't see a cleaner or cheaper way of achieving this. Fortunately,
      LCSSA is relatively lazy and sparse -- it should only update
      instructions which need it. We can also skip the recursive variant when
      we don't promote to SSA values.
      
      llvm-svn: 200612
  13. Jan 29, 2014
    • [LPM] Fix PR18643, another scary place where loop transforms failed to · d4be9dc0
      Chandler Carruth authored
      preserve loop simplify of enclosing loops.
      
      The problem here starts with LoopRotation which ends up cloning code out
      of the latch into the new preheader it is building. This can create
      a new edge from the preheader into the exit block of the loop which
      breaks LoopSimplify form. The code tries to fix this by splitting the
      critical edge between the latch and the exit block to get a new exit
      block that only the latch dominates. This sadly isn't sufficient.
      
      The exit block may be an exit block for multiple nested loops. When we
      clone an edge from the latch of the inner loop to the new preheader
      being built in the outer loop, we create an exiting edge from the outer
      loop to this exit block. Despite breaking the LoopSimplify form for the
      inner loop, this is fine for the outer loop. However, when we split the
      edge from the inner loop to the exit block, we create a new block which
      is in neither the inner nor outer loop as the new exit block. This is
      a predecessor to the old exit block, and so the split itself takes the
      outer loop out of LoopSimplify form. We need to split every edge
      entering the exit block from inside a loop nested more deeply than the
      exit block in order to preserve all of the loop simplify constraints.
      
      Once we try to do that, a problem with splitting critical edges
      surfaces. Previously, we tried a very brute-force approach to updating LoopSimplify
      form by re-computing it for all exit blocks. We don't need to do this,
      and doing this much will sometimes but not always overlap with the
      LoopRotate bug fix. Instead, the code needs to specifically handle the
      cases which can start to violate LoopSimplify -- they aren't that
      common. We need to see if the destination of the split edge was a loop
      exit block in simplified form for the loop of the source of the edge.
      For this to be true, all the predecessors need to be in the exact same
      loop as the source of the edge being split. If the dest block was
      originally in this form, we have to split all of the edges back into
      this loop to recover it. The old mechanism of doing this was
      conservatively correct because at least *one* of the exiting blocks it
      rewrote was the DestBB and so the DestBB's predecessors were fixed. But
      this is a much more targeted way of doing it. Making it targeted is
      important, because ballooning the set of edges touched prevents
      LoopRotate from being able to split edges *it* needs to split to
      preserve loop simplify in a coherent way -- the critical edge splitting
      would sometimes find the other edges in need of splitting but not
      others.
      
      Many, *many* thanks for help from Nick reducing these test cases
      mightily. And helping lots with the analysis here as this one was quite
      tricky to track down.
      
      llvm-svn: 200393
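      A hedged sketch of the predecessor test described above, assuming the usual LLVM
      LoopInfo/CFG APIs rather than the commit's code: an exit block is in simplified
      form for a loop only if every one of its predecessors lies inside that loop.

        #include "llvm/Analysis/LoopInfo.h"
        #include "llvm/IR/CFG.h"

        using namespace llvm;

        // True when every predecessor of ExitBB is inside L, i.e. ExitBB is a
        // dedicated exit block for L in the LoopSimplify sense.
        bool allPredsInLoop(BasicBlock *ExitBB, const Loop *L) {
          for (BasicBlock *Pred : predecessors(ExitBB))
            if (!L->contains(Pred))
              return false;
          return true;
        }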
    • [LPM] Fix PR18642, a pretty nasty bug in IndVars that "never mattered" · 66f0b163
      Chandler Carruth authored
      because of the inside-out run of LoopSimplify in the LoopPassManager and
      the fact that LoopSimplify couldn't be "preserved" across two
      independent LoopPassManagers.
      
      Anyways, in that case, IndVars wasn't correctly preserving an LCSSA PHI
      node because it thought it was rewriting (via SCEV) the incoming value
      to a loop invariant value. While it may well be invariant for the
      current loop, it may be rewritten in terms of an enclosing loop's
      values. This in and of itself is fine, as the LCSSA PHI node in the
      enclosing loop for the inner loop value we're rewriting will have its
      own LCSSA PHI node if used outside of the enclosing loop. With me so
      far?
      
      Well, the current loop and the enclosing loop may share an exiting
      block and exit block, and when they do they also share LCSSA PHI nodes.
      In this case, it's not valid to RAUW through the LCSSA PHI node.
      
      Expected crazy test included.
      
      llvm-svn: 200372
  14. Jan 28, 2014
  15. Jan 27, 2014
  16. Jan 25, 2014
    • [LPM] Make LCSSA a utility with a FunctionPass that applies it to all · 8765cf70
      Chandler Carruth authored
      the loops in a function, and teach LICM to work in the presence of
      LCSSA.
      
      Previously, LCSSA was a loop pass. That made passes requiring it also be
      loop passes and unable to depend on function analysis passes easily. It
      also caused outer loops to have a different "canonical" form from inner
      loops during analysis. Instead, we go into LCSSA form and preserve it
      through the loop pass manager run.
      
      Note that this has the same problem as LoopSimplify that prevents
      enabling its verification -- loop passes which run at the end of the loop
      pass manager and don't preserve these are valid, but the subsequent loop
      pass runs of outer loops that do preserve this pass trigger too much
      verification and fail because the inner loop no longer verifies.
      
      The other problem this exposed is that LICM was completely unable to
      handle LCSSA form. It didn't preserve it and it actually would give up
      on moving instructions in many cases when they were used by an LCSSA phi
      node. I've taught LICM to support detecting LCSSA-form PHI nodes and to
      hoist and sink around them. This may actually let LICM fire
      significantly more because we put everything into LCSSA form to rotate
      the loop before running LICM. =/ Now LICM should handle that fine and
      preserve it correctly. The down side is that LICM has to require LCSSA
      in order to preserve it. This is just a fact of life for LCSSA. It's
      entirely possible we should completely remove LCSSA from the optimizer.
      
      The test updates are essentially accommodating LCSSA phi nodes in the
      output of LICM, and the fact that we now completely sink every
      instruction in ashr-crash below the loop bodies prior to unrolling.
      
      With this change, LCSSA is computed only three times in the pass
      pipeline. One of them could be removed (and potentially a SCEV run and
      a separate LoopPassManager entirely!) if we had a LoopPass variant of
      InstCombine that ran InstCombine on the loop body but refused to combine
      away LCSSA PHI nodes. Currently, this also prevents loop unrolling from
      being in the same loop pass manager as rotate, LICM, and unswitch.
      
      There is one thing that I *really* don't like -- preserving LCSSA in
      LICM is quite expensive. We end up having to re-run LCSSA twice for some
      loops after LICM runs because LICM can undo LCSSA both in the current
      loop and the parent loop. I don't really see good solutions to this
      other than to completely move away from LCSSA and using tools like
      SSAUpdater instead.
      
      llvm-svn: 200067
    • Revert "Revert "Add Constant Hoisting Pass" (r200034)" · f26beda7
      Juergen Ributzka authored
      This reverts commit r200058 and adds the using directive for
      ARMTargetTransformInfo to silence two g++ overload warnings.
      
      llvm-svn: 200062