  1. Oct 18, 2003
  2. Oct 17, 2003
  3. Oct 16, 2003
  4. Oct 15, 2003
  5. Oct 14, 2003
  6. Oct 13, 2003
  7. Oct 12, 2003
  8. Oct 10, 2003
  9. Oct 08, 2003
  10. Oct 07, 2003
  11. Oct 06, 2003
    • Minor speedups for the instcombine pass · e8ed4ef0
      Chris Lattner authored
      llvm-svn: 8894
    • Speed up the predicate used to decide when to inline by caching the size
      of callees between executions. · 6dc0ae2d
      Chris Lattner authored
      
      On eon, in release mode, this changes the inliner from taking 11.5712s
      to taking 2.2066s.  In debug mode, it went from taking 14.4148s to
      taking 7.0745s.  In release mode, this is a 24.7% speedup of gccas; in
      debug mode, it is a total speedup of 11.7%.
      
      This also makes the inliner slightly more aggressive.  This could be because we
      are not judging the size of the functions quite as accurately as before.
      When we start looking at the performance of the generated code, this can
      be investigated further.
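
      The caching idea above can be sketched roughly as follows.  This is not
      the inliner's actual code: Function, SizeCache, should_inline and the
      threshold value are all hypothetical names chosen for illustration, and
      the "size" is simply an instruction count.  The sketch assumes the cache
      is keyed by function name and is only refreshed when explicitly
      invalidated, which is one way a cached size could drift from the real one.

```python
# Hypothetical sketch: memoize a per-function size estimate so the inlining
# predicate does not re-scan the same callee every time it is considered.

class Function:
    def __init__(self, name, instructions):
        self.name = name
        self.instructions = list(instructions)


class SizeCache:
    def __init__(self):
        self._sizes = {}   # function name -> cached size estimate
        self.scans = 0     # number of full scans performed (for illustration)

    def size_of(self, fn):
        # Scan the function body only on the first request.
        if fn.name not in self._sizes:
            self.scans += 1
            self._sizes[fn.name] = len(fn.instructions)
        return self._sizes[fn.name]

    def invalidate(self, fn):
        # Drop a stale entry, e.g. after the function body has changed.
        self._sizes.pop(fn.name, None)


def should_inline(callee, cache, threshold=10):
    # Assumed predicate: inline when the cached size is under a threshold.
    return cache.size_of(callee) <= threshold
```

      If a callee grows after its size was cached and the entry is not
      invalidated, the predicate keeps seeing the older, smaller size, which
      would make it more willing to inline.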
      
      llvm-svn: 8893
    • Avoid doing pointless work. Amazingly, this makes us go faster. · 6aa34b0d
      Chris Lattner authored
      Running the inliner on 252.eon used to take 48.4763s; now it takes 14.4148s.
      
      In release mode, it went from taking 25.8741s to taking 11.5712s.
      
      This also fixes a FIXME.
      
      llvm-svn: 8890
    • This changes the PromoteMemToReg function to create "pruned" SSA form, not
      "minimal" SSA form (in other words, it doesn't insert dead PHIs). · c30f22f5
      Chris Lattner authored
      This speeds up the mem2reg pass very significantly because it doesn't have to
      do a lot of frivolous work in many common cases.
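
      As a rough illustration of the difference between the two forms, the
      sketch below places PHI nodes for a single variable.  It is not the
      PromoteMemToReg implementation: the function names are hypothetical, the
      dominance frontiers are assumed to be precomputed and passed in, and
      "use_blocks" is assumed to hold only blocks that read the variable
      before writing it.  Minimal placement takes the iterated dominance
      frontier of the defining blocks; pruned placement keeps only those
      blocks where the variable is live on entry.

```python
def iterated_frontier(def_blocks, frontier):
    """Minimal placement: iterated dominance frontier of the defining blocks."""
    phi_blocks = set()
    work = list(def_blocks)
    while work:
        block = work.pop()
        for f in frontier.get(block, ()):
            if f not in phi_blocks:
                phi_blocks.add(f)
                work.append(f)  # a PHI is itself a new definition
    return phi_blocks


def live_in_blocks(preds, def_blocks, use_blocks):
    """Blocks where the variable is live on entry (backward walk from uses)."""
    live = set()
    work = list(use_blocks)
    while work:
        block = work.pop()
        if block in live:
            continue
        live.add(block)
        for p in preds.get(block, ()):
            # A definition in p supplies the value, so liveness does not
            # extend above p unless p also reads the variable first.
            if p not in def_blocks:
                work.append(p)
    return live


def pruned_phi_blocks(def_blocks, use_blocks, frontier, preds):
    """Pruned placement: drop PHIs in blocks where the variable is dead."""
    minimal = iterated_frontier(def_blocks, frontier)
    return minimal & live_in_blocks(preds, def_blocks, use_blocks)
```

      For a diamond-shaped CFG where both arms store to the variable and
      nothing reads it afterwards, minimal placement puts a PHI in the merge
      block while pruned placement inserts none.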
      
      In the 252.eon function I have been playing with, this no longer inserts
      the 120 trivially dead PHI nodes that it used to (in the process of
      promoting 356 alloca instructions overall).  This speeds up the mem2reg
      pass from 1.2459s to 0.1284s.  More significantly, the DCE pass used to take
      2.4138s to remove the 120 dead PHI nodes that mem2reg constructed; now it
      takes 0.0134s (which is the time to scan the function and decide that there
      is nothing dead).  So overall, on this one function, we speed things up a
      total of 3.5179s, which is a 24.8x speedup!  :)
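
      The arithmetic behind these totals can be checked with a few lines.
      The timings are the ones quoted above; the variable names are only for
      illustration.  Note that 24.8 matches the time saved divided by the new
      combined time, while the ratio of old combined time to new combined
      time comes out near 25.8.

```python
# Timings quoted in the commit message, in seconds.
mem2reg_before, mem2reg_after = 1.2459, 0.1284
dce_before, dce_after = 2.4138, 0.0134

total_before = mem2reg_before + dce_before   # mem2reg + DCE before the change
total_after = mem2reg_after + dce_after      # mem2reg + DCE after the change
saved = total_before - total_after           # absolute time saved

saved_over_new = saved / total_after         # time saved relative to new total
old_over_new = total_before / total_after    # old total relative to new total

print(f"saved {saved:.4f}s, saved/new = {saved_over_new:.1f}, old/new = {old_over_new:.1f}")
```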
      
      This change is tested by the Mem2Reg/2003-10-05-DeadPHIInsertion.ll test,
      which now passes.
      
      llvm-svn: 8884
  12. Oct 05, 2003
    • Chris Lattner's avatar
      a906bacf
    • Speed up the mem2reg transform for allocas which are only read/written in a single
      basic block. · 80471529
      Chris Lattner authored
      This is amazingly common in code generated by the C/C++ front-ends.
      This change makes it not have to insert ANY phi nodes, whereas before it would insert
      a ton of dead ones which DCE would have to clean up.
      
      Thus, this fix improves compile-time performance of these trivial allocas in two ways:
        1. It doesn't have to do the walking and book-keeping for renaming
        2. It does not insert dead phi nodes for them which would have to
           subsequently be cleaned up.
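
      A minimal sketch of the single-block special case, under stated
      assumptions: instructions are modelled as tuples such as
      ("store", slot, value) and ("load", slot, dest), and the helper names
      blocks_touching and promote_single_block are hypothetical.  It also
      assumes that values loaded from the slot are only used later in the
      same block.  This is not the mem2reg code itself; it only shows why no
      PHI nodes and no CFG-wide renaming are needed when every access sits in
      one block.

```python
UNDEF = "undef"  # placeholder for a load that precedes every store


def blocks_touching(blocks, slot):
    """Names of the blocks containing a load or store of the given slot."""
    return {
        name
        for name, insts in blocks.items()
        for inst in insts
        if inst[0] in ("load", "store") and inst[1] == slot
    }


def promote_single_block(insts, slot):
    """Rewrite one block so the slot's loads and stores disappear."""
    current = UNDEF   # value currently held by the slot
    renamed = {}      # load result name -> value it stands for
    out = []
    for inst in insts:
        op = inst[0]
        if op == "store" and inst[1] == slot:
            # Remember the stored value; the store itself is dropped.
            current = renamed.get(inst[2], inst[2])
        elif op == "load" and inst[1] == slot:
            # The load result is just the last stored value.
            renamed[inst[2]] = current
        else:
            # Keep the instruction, substituting any renamed operands.
            out.append((op,) + tuple(renamed.get(a, a) for a in inst[1:]))
    return out
```

      A caller would first check that blocks_touching returns exactly one
      block for the slot, and only then apply promote_single_block to that
      block; slots touched in several blocks still need the general path.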
      
      On my favorite testcase from 252.eon, this special case handles 305 out of
      356 promoted allocas in the function.  It speeds up the mem2reg pass from 7.5256s
      to 1.2505s.  It inserts 677 fewer dead PHI nodes, which speeds up a subsequent
      -dce pass from 18.7524s to 2.4806s.
      
      There are still 120 trivially dead PHI nodes being inserted for variables used
      in multiple basic blocks, but they are not handled by this patch.
      
      llvm-svn: 8881