  1. Nov 21, 2010
      Implement PR8644: forwarding a memcpy value to a byval, · 58f9f587
      Chris Lattner authored
      allowing the memcpy to be eliminated.
      
      Unfortunately, the requirements on byval's without explicit 
      alignment are really weak and impossible to predict in the 
      mid-level optimizer, so this doesn't kick in much with current
      frontends.  The fix is to change clang to set alignment on all
      byval arguments.
      
      llvm-svn: 119916
  2. Nov 20, 2010
  3. Nov 19, 2010
  4. Nov 18, 2010
      Factor code for testing whether replacing one value with another · aef146b8
      Duncan Sands authored
      preserves LCSSA form out of ScalarEvolution and into the LoopInfo
      class.  Use it to check that SimplifyInstruction simplifications
      are not breaking LCSSA form.  Fixes PR8622.
      
      llvm-svn: 119727
      Completely rework the data structure GVN uses to represent the value number to leader mapping. · c21c100f
      Owen Anderson authored
      Previously, this was a tree of hashtables, and a query recursed into the table for the immediate dominator,
      and so on up the dominator tree, if the initial lookup failed.  This led to really bad performance on tall,
      narrow CFGs.
      
      We can instead replace it with what is conceptually a multimap of value numbers to leaders (actually
      represented by a hashtable with a list of Value*'s as the value type), and then
      determine which leader from that set to use very cheaply thanks to the DFS numberings maintained by
      DominatorTree.  Because there are typically few duplicates of a given value, this scan tends to be
      quite fast.  Additionally, we use a custom linked list and BumpPtr allocation to avoid any unnecessary
      allocation in representing the value-side of the multimap.
      
      This change brings with it a 15% (!) improvement in the total running time of GVN on 403.gcc, which I
      think is pretty good considering that includes all the "real work" being done by MemDep as well.
      
      The one downside to this approach is that we can no longer use GVN to perform simple conditional propagation,
      but that seems like an acceptable loss since we now have LVI and CorrelatedValuePropagation to pick up
      the slack.  If you see conditional propagation that's not happening, please file bugs against LVI or CVP.
      
      llvm-svn: 119714
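
The leader lookup described in this message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the GVN code itself: `DomNode`, `Leader`, `LeaderTable`, and `findLeader` are hypothetical names, strings stand in for `Value*`, and the standard containers stand in for the hashtable and the custom bump-allocated linked list the message mentions. Dominance is modelled purely by nesting of DFS entry/exit numbers.

```cpp
#include <cassert>
#include <forward_list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a dominator tree node: only its DFS entry and
// exit numbers matter for this sketch.
struct DomNode {
  unsigned dfsIn;
  unsigned dfsOut;
};

// A dominates B exactly when B's DFS interval nests inside A's.
inline bool dominates(const DomNode &a, const DomNode &b) {
  return a.dfsIn <= b.dfsIn && b.dfsOut <= a.dfsOut;
}

struct Leader {
  std::string value;  // stand-in for a Value*
  DomNode block;      // block that defines the value
};

// Conceptually a multimap from value number to candidate leaders.
class LeaderTable {
  std::unordered_map<unsigned, std::forward_list<Leader>> table_;

public:
  void add(unsigned valueNumber, std::string value, DomNode block) {
    assert(block.dfsIn <= block.dfsOut && "malformed DFS interval");
    table_[valueNumber].push_front(Leader{std::move(value), block});
  }

  // Scan the (typically short) candidate list and return the first leader
  // whose defining block dominates the use, or nullptr if there is none.
  const Leader *findLeader(unsigned valueNumber, const DomNode &use) const {
    auto it = table_.find(valueNumber);
    if (it == table_.end())
      return nullptr;
    for (const Leader &l : it->second)
      if (dominates(l.block, use))
        return &l;
    return nullptr;
  }
};
```

Each dominance test is two integer comparisons, which is why a linear scan over a handful of candidates stays cheap compared with recursing through per-block tables.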
      slightly simplify code and substantially improve comment. Instead of · 1385dff8
      Chris Lattner authored
      saying "it would be bad", give an example of what is going on.
      
      llvm-svn: 119695
      remove a pointless restriction from memcpyopt. It was · 731caac7
      Chris Lattner authored
      refusing to optimize two memcpy's like this:
      
      copy A <- B
      copy C <- A
      
      if it couldn't prove that noalias(B,C).  We can eliminate
      the copy by producing a memmove instead of memcpy.
      
      llvm-svn: 119694
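
A source-level analogy of the rewrite, as a hedged sketch: `copyViaTemp` and `copyDirect` are hypothetical names, and the example only illustrates the overlap argument, not the other conditions the pass has to check. Two copies through a temporary A collapse into one copy from B to C; because B and C may overlap, the single copy has to be a `memmove`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Before: copy A <- B, then copy C <- A, going through a temporary A.
inline void copyViaTemp(char *c, const char *b, std::size_t n) {
  assert(n == 0 || (c != nullptr && b != nullptr));
  if (n == 0)
    return;
  std::vector<char> a(n);       // the temporary A
  std::memcpy(a.data(), b, n);  // copy A <- B (A is fresh, so no overlap)
  std::memcpy(c, a.data(), n);  // copy C <- A
}

// After: a single copy C <- B. Nothing proves that B and C are disjoint,
// so memmove is used; memcpy on overlapping buffers would be undefined.
inline void copyDirect(char *c, const char *b, std::size_t n) {
  assert(n == 0 || (c != nullptr && b != nullptr));
  if (n == 0)
    return;
  std::memmove(c, b, n);
}
```

Both functions leave the same bytes in C even when C and B are overlapping slices of one buffer, which is the property the transformation relies on.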
      remove another pointless noalias check: M is a memcpy, so the · c274a834
      Chris Lattner authored
      source and dest are known to not overlap.
      
      llvm-svn: 119692
      use AA::isNoAlias instead of open coding it. Remove an extraneous noalias check: · 75cfe985
      Chris Lattner authored
      there is no need to check to see if the source and dest of a memcpy are noalias,
      behavior is undefined if not.
      
      llvm-svn: 119691
      finish a thought. · 1e37bbaf
      Chris Lattner authored
      llvm-svn: 119690
      rearrange some code, splitting memcpy/memcpy optimization · 7e9b2ea3
      Chris Lattner authored
      out of processMemCpy into its own function.
      
      llvm-svn: 119687
      allow eliminating an alloca that is just copied from a constant global · ac570131
      Chris Lattner authored
      if it is passed as a byval argument.  The byval argument will just be a
      read, so it is safe to read from the original global instead.  This allows
      us to promote away the %agg.tmp alloca in PR8582.
      
      llvm-svn: 119686
      enhance the "alloca is just a memcpy from constant global" · f183d5c4
      Chris Lattner authored
      optimization to ignore calls that obviously can't modify the alloca
      because they are readonly/readnone.
      
      llvm-svn: 119683
      fix a small oversight in the "eliminate memcpy from constant global" · 7aeae25c
      Chris Lattner authored
      optimization.  If the alloca that is "memcpy'd from constant" also has
      a memcpy from *it*, ignore it: it is a load.  We now optimize the testcase to:
      
      define void @test2() {
        %B = alloca %T
        %a = bitcast %T* @G to i8*
        %b = bitcast %T* %B to i8*
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* %b, i8* %a, i64 124, i32 4, i1 false)
        call void @bar(i8* %b)
        ret void
      }
      
      previously we would generate:
      
      define void @test() {
        %B = alloca %T
        %b = bitcast %T* %B to i8*
        %G.0 = getelementptr inbounds %T* @G, i32 0, i32 0
        %tmp3 = load i8* %G.0, align 4
        %G.1 = getelementptr inbounds %T* @G, i32 0, i32 1
        %G.15 = bitcast [123 x i8]* %G.1 to i8*
        %1 = bitcast [123 x i8]* %G.1 to i984*
        %srcval = load i984* %1, align 1
        %B.0 = getelementptr inbounds %T* %B, i32 0, i32 0
        store i8 %tmp3, i8* %B.0, align 4
        %B.1 = getelementptr inbounds %T* %B, i32 0, i32 1
        %B.12 = bitcast [123 x i8]* %B.1 to i8*
        %2 = bitcast [123 x i8]* %B.1 to i984*
        store i984 %srcval, i984* %2, align 1
        call void @bar(i8* %b)
        ret void
      }
      
      llvm-svn: 119682
  5. Nov 17, 2010
  6. Nov 16, 2010
  7. Nov 14, 2010
  8. Nov 12, 2010
      Have GVN simplify instructions as it goes. For example, consider · 246b71c5
      Duncan Sands authored
      "%z = %x and %y".  If GVN can prove that %y equals %x, then it turns
      this into "%z = %x and %x".  With the new code, %z will be replaced
      with %x everywhere (and then deleted).  Previously %z would be value
      numbered too, which is a waste of time.  Also, while a clever value
      numbering algorithm would give %z the same value number as %x, our
      current one doesn't do so (at least I don't think it does).  The new
      logic has an essentially equivalent effect to what you would get if
      %z was given the same value number as %x, i.e. it should make value
      numbering smarter.  While there, get hold of target data once at the
      start rather than a gazillion times all over the place.
      
      llvm-svn: 118923
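
The "%z = %x and %y" example can be modelled with a toy simplifier. This is a sketch under assumptions, not the GVN or SimplifyInstruction code: `AndInst`, `Leaders`, `leaderOf`, and `simplifyAnd` are hypothetical names, and values are plain strings. Operands are first replaced by whatever value they were proven equal to, and an `and` whose operands then coincide folds to that operand.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy "and" instruction over named values.
struct AndInst {
  std::string lhs;
  std::string rhs;
};

// Maps a value to the value it has been proven equal to (hypothetical).
using Leaders = std::unordered_map<std::string, std::string>;

inline std::string leaderOf(const Leaders &leaders, const std::string &v) {
  auto it = leaders.find(v);
  return it == leaders.end() ? v : it->second;
}

// If both operands resolve to the same value, "x and x" is just x: write it
// to `out` and return true. Otherwise leave `out` alone and return false.
inline bool simplifyAnd(const AndInst &inst, const Leaders &leaders,
                        std::string &out) {
  assert(!inst.lhs.empty() && !inst.rhs.empty());
  const std::string a = leaderOf(leaders, inst.lhs);
  const std::string b = leaderOf(leaders, inst.rhs);
  if (a != b)
    return false;
  out = a;
  return true;
}
```

When the fold succeeds, every use of %z can be rewritten to the returned value and %z deleted, so %z never needs a value number of its own.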
      Enhance DSE to handle the case where a free call makes more than · d4b7fff2
      Dan Gohman authored
      one store dead. This is especially noticeable in
      SingleSource/Benchmarks/Shootout/objinst.
      
      llvm-svn: 118875
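
A toy model of the idea, offered as a sketch only: `Op`, `Inst`, and `removeStoresDeadAtFree` are hypothetical names, pointers are small integer ids, and the model assumes distinct ids never alias, which the real pass cannot assume. Scanning backwards from a free, every store to the freed pointer is dead until something reads that pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Store, Load, Free };

struct Inst {
  Op op;
  int ptr;  // toy pointer id; distinct ids are assumed never to alias
};

// Drop every store to P that reaches free(P) with no intervening load of P.
inline std::vector<Inst> removeStoresDeadAtFree(const std::vector<Inst> &in) {
  std::vector<bool> dead(in.size(), false);
  for (std::size_t f = 0; f < in.size(); ++f) {
    if (in[f].op != Op::Free)
      continue;
    // Walk backwards from the free; keep going past unrelated pointers.
    for (std::size_t i = f; i-- > 0;) {
      if (in[i].ptr != in[f].ptr)
        continue;
      if (in[i].op != Op::Store)
        break;  // a load (or an earlier free) of the same pointer ends the scan
      dead[i] = true;  // more than one store can be marked per free
    }
  }
  std::vector<Inst> out;
  for (std::size_t i = 0; i < in.size(); ++i)
    if (!dead[i])
      out.push_back(in[i]);
  assert(out.size() <= in.size());
  return out;
}
```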
  9. Nov 11, 2010
  10. Nov 10, 2010
  11. Nov 09, 2010
  12. Oct 29, 2010
      Give up on doing in-line instruction simplification during correlated value propagation. · 374e1464
      Owen Anderson authored
      Instruction simplification needs to be guaranteed never to be run on an unreachable block.  However, earlier block
      simplifications may have changed the CFG to make blocks that were reachable when we began our iteration unreachable
      by the time we try to simplify them.  (Note that this also means that our depth-first iterators were potentially
      being invalidated.)
      
      This should not have a large impact on code quality, since later runs of instcombine should pick up these simplifications.
      Fixes PR8506.
      
      llvm-svn: 117709
      Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support. · e8360b71
      John Thompson authored
      
      llvm-svn: 117667
  13. Oct 20, 2010
  14. Oct 19, 2010
  15. Oct 18, 2010
  16. Oct 16, 2010