  1. Aug 02, 2013
  2. Aug 01, 2013
  3. Jul 31, 2013
  4. Jul 30, 2013
  5. Jul 29, 2013
    • Nadav Rotem's avatar
      SLPVectorizer: update the debug location for the new instructions. · d9c74cc6
      Nadav Rotem authored
      llvm-svn: 187363
      d9c74cc6
    • Chandler Carruth's avatar
      Teach the AllocaPromoter which is wrapped around the SSAUpdater · cd7c8cdf
      Chandler Carruth authored
      infrastructure to do promotion without a domtree the same smarts about
      looking through GEPs, bitcasts, etc., that I just taught mem2reg about.
      This way, if SROA chooses to promote an alloca which still has some
      noisy instructions this code can cope with them.
      
      I've not used as principled of an approach here for two reasons:
      1) This code doesn't really need it as we were already set up to zip
         through the instructions used by the alloca.
      2) I view the code here as more of a hack, and hopefully a temporary one.
      
      The SSAUpdater path in SROA is a real sore point for me. It doesn't make
      a lot of architectural sense for many reasons:
      - We're likely to end up needing the domtree anyways in a subsequent
        pass, so why not compute it earlier and use it.
      - In the future we'll likely end up needing the domtree for parts of the
        inliner itself.
      - If we need to we could teach the inliner to preserve the domtree. Part
        of the re-work of the pass manager will allow this to be very powerful
        even in large SCCs with many functions.
      - Ultimately, computing a domtree has gotten significantly faster since
        the original SSAUpdater-using code went into ScalarRepl. We no longer
        use domfrontiers, and much of domtree is lazily done based on queries
        rather than eagerly.
      - At this point keeping the SSAUpdater-based promotion saves a total of
        0.7% on a build of the 'opt' tool for me. That's not a lot of
        performance given the complexity!
      
      So I'm leaving this a bit ugly in the hope that eventually we just
      remove all of this nonsense.
      
      I can't even readily test this because this code isn't reachable except
      through SROA. When I re-instate the patch that fast-tracks allocas
      already suitable for promotion, I'll add a testcase there that failed
      before this change. Before that, SROA will fix any test case I give it.
      
      llvm-svn: 187347
      cd7c8cdf
    • Nadav Rotem's avatar
      Don't vectorize when the attribute NoImplicitFloat is used. · 750e42cb
      Nadav Rotem authored
      llvm-svn: 187340
      750e42cb
    • Rafael Espindola's avatar
      Fix -Wdocumentation warnings. · caa776be
      Rafael Espindola authored
      llvm-svn: 187336
      caa776be
    • Chandler Carruth's avatar
      Update comments for SSAUpdater to use the modern doxygen comment · 6b55dbea
      Chandler Carruth authored
      standards for LLVM. Remove duplicated comments on the interface from the
      implementation file (implementation comments are left there of course).
      Also clean up, re-word, and fix a few typos and errors in the comments
      spotted along the way.
      
      This is in preparation for changes to these files and to keep the
      uninteresting tidying in a separate commit.
      
      llvm-svn: 187335
      6b55dbea
  6. Jul 28, 2013
  7. Jul 27, 2013
  8. Jul 26, 2013
    • Owen Anderson's avatar
      When InstCombine tries to fold away (fsub x, (fneg y)) into (fadd x, y), it is · e37c2e4d
      Owen Anderson authored
      also worthwhile for it to look through FP extensions and truncations, whose
      application commutes with fneg.
      
      llvm-svn: 187249
      e37c2e4d
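
The fold above relies on fneg commuting with FP extension and truncation. A minimal C++ sketch of that arithmetic identity follows. It models the IR patterns with plain float/double casts and is not InstCombine code; all function names are hypothetical, and the identities are only claimed for non-NaN, in-range inputs under the default round-to-nearest mode.

```cpp
#include <cassert>

// Models (fsub x, (fpext (fneg y))): negate the float, widen it, subtract.
double fsubOfExtendedNeg(double x, float y) {
  return x - static_cast<double>(-y);
}

// Models the folded form (fadd x, (fpext y)).
double faddOfExtended(double x, float y) {
  return x + static_cast<double>(y);
}

// Models fneg commuting with fptrunc: -(trunc d) versus trunc(-d).
// Round-to-nearest is symmetric around zero, so both sides agree.
bool negCommutesWithTrunc(double d) {
  return -static_cast<float>(d) == static_cast<float>(-d);
}

// Convenience check that the unfolded and folded forms agree for one input.
void checkFold(double x, float y) {
  assert(fsubOfExtendedNeg(x, y) == faddOfExtended(x, y));
}
```

Calling `checkFold(1.5, 0.25f)` passes because widening a float to double is exact, so negating before or after the extension produces the same value.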
    • Chandler Carruth's avatar
      Re-implement the analysis of uses in mem2reg to be significantly more · 9af38fc2
      Chandler Carruth authored
      robust. It now uses an InstVisitor and worklist to actually walk the
      uses of the Alloca transitively and detect the pattern which we can
      directly promote: loads & stores of the whole alloca and instructions we
      can completely ignore.
      
      Also, with this new implementation teach both the predicate for testing
      whether we can promote and the promotion engine itself to use the same
      code so we no longer have strange divergence between the two code paths.
      
      I've added some silly test cases to demonstrate that we can handle
      slightly more degenerate code patterns now. See below for why this
      is even interesting.
      
      Performance impact: roughly 1% regression in the performance of SROA or
      ScalarRepl on a large C++-ish test case where most of the allocas are
      basically ready for promotion. The reason is silly redundant
      work that I've left FIXMEs for and which I'll address in the next
      commit. I wanted to separate this commit as it changes the behavior.
      Once the redundant work in removing the dead uses of the alloca is
      fixed, this code appears to be faster than the old version. =]
      
      So why is this useful? Because the previous requirement for promotion
      required a *specific* visit pattern of the uses of the alloca to verify:
      we *had* to look for no more than 1 intervening use. The end goal is to
      have SROA automatically detect when an alloca is already promotable and
      directly hand it to the mem2reg machinery rather than trying to
      partition and rewrite it. This is a 25% or more performance improvement
      for SROA, and a significant chunk of the delta between it and
      ScalarRepl. To get there, we need to make mem2reg actually capable of
      promoting allocas which *look* promotable to SROA without having SROA do
      tons of work to massage the code into just the right form.
      
      This is actually the tip of the iceberg. There are tremendous potential
      savings we can realize here by de-duplicating work between mem2reg and
      SROA.
      
      llvm-svn: 187191
      9af38fc2
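
The worklist-based use analysis described in the commit above can be sketched with a toy model. This is not the mem2reg implementation: `Inst`, `Kind`, and `isPromotable` are hypothetical names, the set of "ignorable" instructions is an assumption (bitcasts and all-zero GEPs), and `Store` here always means a store *to* the alloca, never a store of its address.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy instruction kinds; a real IR has many more.
enum class Kind { Alloca, Load, Store, Bitcast, ZeroIndexGEP, Call };

struct Inst {
  Kind kind;
  std::vector<const Inst*> users;  // instructions that use this value
};

// Walk the uses of an alloca transitively. Loads and stores of the whole
// alloca are fine; pointer-preserving casts are looked through; anything
// else blocks promotion.
bool isPromotable(const Inst& alloca) {
  assert(alloca.kind == Kind::Alloca && "expected an alloca as the root");
  std::vector<const Inst*> worklist(alloca.users.begin(), alloca.users.end());
  std::unordered_set<const Inst*> visited;
  while (!worklist.empty()) {
    const Inst* inst = worklist.back();
    worklist.pop_back();
    if (!visited.insert(inst).second)
      continue;  // already handled via another path
    switch (inst->kind) {
      case Kind::Load:
      case Kind::Store:
        break;  // direct access, nothing further to walk
      case Kind::Bitcast:
      case Kind::ZeroIndexGEP:
        // Ignorable wrapper: its users must be checked as well.
        worklist.insert(worklist.end(), inst->users.begin(), inst->users.end());
        break;
      default:
        return false;  // e.g. a call that may capture the pointer
    }
  }
  return true;
}
```

Sharing one such walk between the "can we promote?" predicate and the promotion engine is what keeps the two code paths from diverging, which is the design point the commit message makes.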
    • Bill Schmidt's avatar
      [PowerPC] Support powerpc64le as a syntax-checking target. · 0a9170d9
      Bill Schmidt authored
      This patch provides basic support for powerpc64le as an LLVM target.
      However, use of this target will not actually generate little-endian
      code.  Instead, use of the target will cause the correct little-endian
      built-in defines to be generated, so that code that tests for
      __LITTLE_ENDIAN__, for example, will be correctly parsed for
      syntax-only testing.  Code generation will otherwise be the same as
      powerpc64 (big-endian), for now.
      
      The patch leaves open the possibility of creating a little-endian
      PowerPC64 back end, but there is no immediate intent to create such a
      thing.
      
      The LLVM portions of this patch simply add ppc64le coverage everywhere
      that ppc64 coverage currently exists.  There is nothing of any import
      worth testing until such time as little-endian code generation is
      implemented.  In the corresponding Clang patch, there is a new test
      case variant to ensure that correct built-in defines for little-endian
      code are generated.
      
      llvm-svn: 187179
      0a9170d9
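
As an illustration of the kind of source that the built-in defines affect, here is a small sketch of code that branches on `__LITTLE_ENDIAN__`. The function name is hypothetical, and `__BIG_ENDIAN__` is assumed to be the big-endian counterpart; which macro (if any) is defined depends on the compiler and target.

```cpp
#include <string>

// Reports the byte order implied by the predefined macros, if either is set.
std::string byteOrderFromDefines() {
#if defined(__LITTLE_ENDIAN__)
  return "little";
#elif defined(__BIG_ENDIAN__)
  return "big";
#else
  return "unknown";  // this compiler/target defines neither macro
#endif
}
```

With a syntax-only target, the point is that the correct preprocessor branch is selected and parsed, even though code generation does not change.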
  9. Jul 25, 2013
  10. Jul 24, 2013
    • Benjamin Kramer's avatar
      TRE: Move class into anonymous namespace. · 328da33d
      Benjamin Kramer authored
      While there, shrink a dangerously large SmallPtrSet.
      
      llvm-svn: 187050
      328da33d
    • Chandler Carruth's avatar
      Fix a problem I introduced in r187029 where we would over-eagerly · 58e25d39
      Chandler Carruth authored
      schedule an alloca for another iteration in SROA. This only showed up
      with a mixture of promotable and unpromotable selects and phis. Added
      a test case for this.
      
      llvm-svn: 187031
      58e25d39
    • Chandler Carruth's avatar
      Fix PR16687 where we were incorrectly promoting an alloca that had · 83ea195d
      Chandler Carruth authored
      pending speculation for a phi node. The problem here is that we were
      using growth of the speculation set as an indicator of whether
      speculation would occur, and if the phi node is already in the set we
      don't see it grow. This is a symptom of the fact that this signal is
      a total hack.
      
      Unfortunately, I couldn't really come up with a non-hacky way of
      signaling that promotion remains valid *after* speculation occurs, such
      that we only speculate when all else looks good for promotion. In the
      end, I went with at least a much more explicit approach of doing the
      work of queuing inside the phi and select processing and setting
      a preposterously named flag to convey that we're in the special state of
      requiring speculating before promotion.
      
      Thanks to Richard Trieu and Nick Lewycky for the excellent work reducing
      a testcase for this from a pretty giant, nasty assert in a big
      application. =] The testcase was excellent.
      
      llvm-svn: 187029
      83ea195d
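
The failure mode described for PR16687 (inferring pending speculation from set growth) can be shown in isolation. This is a sketch under assumptions, not the SROA code: `PhiId`, `pendingBySetGrowth`, `PromotionState`, and the flag name are all hypothetical.

```cpp
#include <set>

// Hypothetical stand-in for a phi node; an integer id is enough here.
using PhiId = int;

// Fragile signal: treat "the set got bigger" as "speculation is pending".
// If the phi is already in the set, the size is unchanged and the signal
// is silently lost.
bool pendingBySetGrowth(std::set<PhiId>& speculatable, PhiId phi) {
  const auto before = speculatable.size();
  speculatable.insert(phi);
  return speculatable.size() > before;
}

// More explicit alternative: record the state in a flag at the point where
// the phi is processed, independent of whether the insert changed the set.
struct PromotionState {
  std::set<PhiId> speculatable;
  bool needsSpeculationBeforePromotion = false;
};

void notePhi(PromotionState& state, PhiId phi) {
  state.speculatable.insert(phi);
  state.needsSpeculationBeforePromotion = true;
}
```

Inserting the same phi twice makes `pendingBySetGrowth` return false on the second call, while `notePhi` still sets the flag.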
    • Matt Arsenault's avatar
      Fix spelling · f64212b2
      Matt Arsenault authored
      llvm-svn: 186997
      f64212b2
  11. Jul 23, 2013
  12. Jul 22, 2013