  1. Nov 26, 2006
  2. Nov 17, 2006
    • Chris Lattner's avatar
      If an indvar with a variable stride is used by the exit condition, go ahead · 21eba2da
      Chris Lattner authored
      and handle it like constant stride vars.  This fixes some bad codegen in
      variable stride cases.  For example, it compiles this:
      
      void foo(int k, int i) {
        for (k=i+i; k <= 8192; k+=i)
          flags2[k] = 0;
      }
      
      to:
      
      LBB1_1: #bb.preheader
              movl %eax, %ecx
              addl %ecx, %ecx
              movl L_flags2$non_lazy_ptr, %edx
      LBB1_2: #bb
              movb $0, (%edx,%ecx)
              addl %eax, %ecx
              cmpl $8192, %ecx
              jle LBB1_2      #bb
      LBB1_5: #return
              ret
      
      or (if the array is local and we are in dynamic-nonpic or static mode):
      
      LBB3_2: #bb
              movb $0, _flags2(%ecx)
              addl %eax, %ecx
              cmpl $8192, %ecx
              jle LBB3_2      #bb
      
      and:
      
              lis r2, ha16(L_flags2$non_lazy_ptr)
              lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
              slwi r3, r4, 1
      LBB1_2: ;bb
              li r5, 0
              add r6, r4, r3
              stbx r5, r2, r3
              cmpwi cr0, r6, 8192
              bgt cr0, LBB1_5 ;return
      
      instead of:
      
              leal (%eax,%eax,2), %ecx
              movl %eax, %edx
              addl %edx, %edx
              addl L_flags2$non_lazy_ptr, %edx
              xorl %esi, %esi
      LBB1_2: #bb
              movb $0, (%edx,%esi)
              movl %eax, %edi
              addl %esi, %edi
              addl %ecx, %esi
              cmpl $8192, %esi
              jg LBB1_5       #return
      
      and:
      
              lis r2, ha16(L_flags2$non_lazy_ptr)
              lwz r2, lo16(L_flags2$non_lazy_ptr)(r2)
              mulli r3, r4, 3
              slwi r5, r4, 1
              li r6, 0
              add r2, r2, r5
      LBB1_2: ;bb
              li r5, 0
              add r7, r3, r6
              stbx r5, r2, r6
              add r6, r4, r6
              cmpwi cr0, r7, 8192
              ble cr0, LBB1_2 ;bb
      
      This speeds up Benchmarks/Shootout/sieve from 8.533s to 6.464s and
      implements LoopStrengthReduce/var_stride_used_by_compare.ll
      
      llvm-svn: 31809
      21eba2da
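      The difference between the two PowerPC listings above can be sketched in C.
      This is only an illustration under stated assumptions: the function names,
      the LIMIT constant and the same_result helper are hypothetical, and the
      "old" shape is modeled loosely on the listing (a zero-based IV for the store
      plus a separately recomputed value for the compare), not taken from LSR itself.

```c
#include <assert.h>
#include <string.h>

enum { LIMIT = 8192 };

/* Old shape (assumed model): a zero-based IV `j` drives the store, and the
   exit test recomputes the next index as 3*i + j on every iteration. */
static void clear_two_values(unsigned char *flags, int i) {
    assert(i > 0);                        /* a zero stride would never terminate */
    if (2 * i > LIMIT) return;            /* loop guard, as in the preheader */
    unsigned char *base = flags + 2 * i;
    int j = 0, next;
    do {
        base[j] = 0;
        next = 3 * i + j;                 /* extra add kept only for the compare */
        j += i;
    } while (next <= LIMIT);
}

/* New shape: one IV `k` feeds both the store and the exit compare. */
static void clear_one_iv(unsigned char *flags, int i) {
    assert(i > 0);
    int k = 2 * i;
    if (k > LIMIT) return;
    do {
        flags[k] = 0;
        k += i;
    } while (k <= LIMIT);
}

/* Returns 1 when both shapes clear exactly the same bytes for stride i. */
static int same_result(int i) {
    static unsigned char a[LIMIT + 1], b[LIMIT + 1];
    memset(a, 1, sizeof a);
    memset(b, 1, sizeof b);
    clear_two_values(a, i);
    clear_one_iv(b, i);
    return memcmp(a, b, sizeof a) == 0;
}
```

      Both functions write the same bytes; the second simply needs one fewer
      live value and one fewer add per iteration, which is the effect the
      commit message describes.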
  3. Nov 02, 2006
    • Reid Spencer's avatar
      For PR786: · de46e484
      Reid Spencer authored
      Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
      fallout by removing unused variables. Remaining warnings have to do with
      unused functions (I didn't want to delete code without review) and unused
      variables in generated code. Maintainers should clean up the remaining
      issues when they see them. All changes pass DejaGnu tests and Olden.
      
      llvm-svn: 31380
      de46e484
  4. Oct 28, 2006
  5. Oct 20, 2006
    • Reid Spencer's avatar
      For PR950: · e0fc4dfc
      Reid Spencer authored
      This patch implements the first increment for the Signless Types feature.
      All changes pertain to removing the ConstantSInt and ConstantUInt classes
      in favor of just using ConstantInt.
      
      llvm-svn: 31063
      e0fc4dfc
  6. Aug 28, 2006
  7. Aug 27, 2006
  8. Aug 03, 2006
    • Chris Lattner's avatar
      · 3ff62017
      Chris Lattner authored
      Changes:
        1. Update an obsolete comment.
        2. Make the sorting by base an explicit (though still N^2) step, so
           that the code is clearer about what it is doing.
        3. Partition uses so that uses inside the loop are handled before uses
           outside the loop.
      
      Note that none of these changes currently changes the code inserted by LSR,
      but they are a stepping stone to getting there.
      
      This code is the result of some crazy pair programming with Nate. :)
      
      llvm-svn: 29493
      3ff62017
  9. Jul 18, 2006
  10. Jun 29, 2006
  11. Jun 09, 2006
  12. Apr 12, 2006
  13. Mar 24, 2006
  14. Mar 22, 2006
  15. Mar 18, 2006
  16. Mar 17, 2006
  17. Mar 16, 2006
    • Evan Cheng's avatar
      For each loop, keep track of all the IV expressions inserted indexed by · 3df447d3
      Evan Cheng authored
      stride. For a set of uses of the IV of a stride which is a multiple
      of another stride, do not insert a new IV expression. Rather, reuse the
      previous IV and rewrite the uses as uses of IV expression multiplied by
      the factor.
      
      e.g.
      x = 0 ...; x ++
      y = 0 ...; y += 4
      then uses of y can be rewritten as uses of 4*x on x86.
      
      llvm-svn: 26803
      3df447d3
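      The x/y example in this message can be written out as a small C sketch.
      The function names and sample_table are hypothetical, chosen only for
      illustration; the point is that a second IV whose stride is a multiple of
      an existing one can be dropped and its uses expressed as a scaled use of
      the first.

```c
#include <assert.h>

/* Two IVs: x steps by 1, y steps by 4. */
static long sum_two_ivs(const int *table, int n) {
    assert(n >= 0);
    long sum = 0;
    int y = 0;
    for (int x = 0; x < n; ++x) {
        sum += table[x] + y;
        y += 4;
    }
    return sum;
}

/* One IV: every use of y is rewritten as 4 * x. */
static long sum_reused_iv(const int *table, int n) {
    assert(n >= 0);
    long sum = 0;
    for (int x = 0; x < n; ++x)
        sum += table[x] + 4 * x;
    return sum;
}

/* Hypothetical sample input used to compare the two forms. */
static const int sample_table[5] = {1, 2, 3, 4, 5};
```

      On x86 a factor of 4 can typically be folded into a scaled addressing
      mode, which is presumably why the message singles out that target.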
  18. Mar 14, 2006
  19. Feb 04, 2006
  20. Jan 23, 2006
  21. Jan 11, 2006
  22. Dec 05, 2005
  23. Oct 21, 2005
  24. Oct 20, 2005
    • Chris Lattner's avatar
      Do NOT touch FP ops with LSR. This fixes a testcase Nate sent me, changing it from an · 0c0b38bb
      Chris Lattner authored
      inner loop like this:
      
      LBB_RateConvertMono8AltiVec_2:  ; no_exit
              lis r2, ha16(.CPI_RateConvertMono8AltiVec_0)
              lfs f3, lo16(.CPI_RateConvertMono8AltiVec_0)(r2)
              fmr f3, f3
              fadd f0, f2, f0
              fadd f3, f0, f3
              fcmpu cr0, f3, f1
              bge cr0, LBB_RateConvertMono8AltiVec_2  ; no_exit
      
      to an inner loop like this:
      
      LBB_RateConvertMono8AltiVec_1:  ; no_exit
              fsub f2, f2, f1
              fcmpu cr0, f2, f1
              fmr f0, f2
              bge cr0, LBB_RateConvertMono8AltiVec_1  ; no_exit
      
      Doh! good catch!
      
      llvm-svn: 23838
      0c0b38bb
  25. Oct 11, 2005
  26. Oct 09, 2005
  27. Oct 03, 2005
    • Chris Lattner's avatar
      Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In · f07a587c
      Chris Lattner authored
      particular, it should realize that phi's use their values in the pred block
      not the phi block itself.  This change turns our em3d loop from this:
      
      _test:
              cmpwi cr0, r4, 0
              bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
      LBB_test_1:     ; entry.loopexit_crit_edge
              li r2, 0
              b LBB_test_6    ; loopexit
      LBB_test_2:     ; entry.no_exit_crit_edge
              li r6, 0
      LBB_test_3:     ; no_exit
              or r2, r6, r6
              lwz r6, 0(r3)
              cmpw cr0, r6, r5
              beq cr0, LBB_test_6     ; loopexit
      LBB_test_4:     ; endif
              addi r3, r3, 4
              addi r6, r2, 1
              cmpw cr0, r6, r4
              blt cr0, LBB_test_3     ; no_exit
      LBB_test_5:     ; endif.loopexit.loopexit_crit_edge
              addi r3, r2, 1
              blr
      LBB_test_6:     ; loopexit
              or r3, r2, r2
              blr
      
      into:
      
      _test:
              cmpwi cr0, r4, 0
              bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
      LBB_test_1:     ; entry.loopexit_crit_edge
              li r2, 0
              b LBB_test_5    ; loopexit
      LBB_test_2:     ; entry.no_exit_crit_edge
              li r6, 0
      LBB_test_3:     ; no_exit
              lwz r2, 0(r3)
              cmpw cr0, r2, r5
              or r2, r6, r6
              beq cr0, LBB_test_5     ; loopexit
      LBB_test_4:     ; endif
              addi r3, r3, 4
              addi r6, r6, 1
              cmpw cr0, r6, r4
              or r2, r6, r6
              blt cr0, LBB_test_3     ; no_exit
      LBB_test_5:     ; loopexit
              or r3, r2, r2
              blr
      
      
      Unfortunately, this is actually worse code, because the register coalescer
      is getting confused somehow.  If it were doing its job right, it could turn the
      code into this:
      
      _test:
              cmpwi cr0, r4, 0
              bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
      LBB_test_1:     ; entry.loopexit_crit_edge
              li r6, 0
              b LBB_test_5    ; loopexit
      LBB_test_2:     ; entry.no_exit_crit_edge
              li r6, 0
      LBB_test_3:     ; no_exit
              lwz r2, 0(r3)
              cmpw cr0, r2, r5
              beq cr0, LBB_test_5     ; loopexit
      LBB_test_4:     ; endif
              addi r3, r3, 4
              addi r6, r6, 1
              cmpw cr0, r6, r4
              blt cr0, LBB_test_3     ; no_exit
      LBB_test_5:     ; loopexit
              or r3, r6, r6
              blr
      
      ... which I'll work on next. :)
      
      llvm-svn: 23604
      f07a587c
    • Chris Lattner's avatar
      Refactor some code into a function · e4ed42a4
      Chris Lattner authored
      llvm-svn: 23603
      e4ed42a4
    • Chris Lattner's avatar
      This break is bogus and I have no idea why it was there. Basically it prevents · 360928db
      Chris Lattner authored
      memoizing code when IV's are used by phinodes outside of loops.  In a simple
      example, we were getting this code before (note that r6 and r7 are isomorphic
      IV's):
      
              li r6, 0
              or r7, r6, r6
      LBB_test_3:     ; no_exit
              lwz r2, 0(r3)
              cmpw cr0, r2, r5
              or r2, r7, r7
              beq cr0, LBB_test_5     ; loopexit
      LBB_test_4:     ; endif
              addi r2, r7, 1
              addi r7, r7, 1
              addi r3, r3, 4
              addi r6, r6, 1
              cmpw cr0, r6, r4
              blt cr0, LBB_test_3     ; no_exit
      
      Now we get:
      
              li r6, 0
      LBB_test_3:     ; no_exit
              or r2, r6, r6
              lwz r6, 0(r3)
              cmpw cr0, r6, r5
              beq cr0, LBB_test_6     ; loopexit
      LBB_test_4:     ; endif
              addi r3, r3, 4
              addi r6, r2, 1
              cmpw cr0, r6, r4
              blt cr0, LBB_test_3     ; no_exit
      
      This was noticed in em3d.
      
      llvm-svn: 23602
      360928db
    • Chris Lattner's avatar
      When checking if we should move a split edge block outside of a loop, · 8fcce170
      Chris Lattner authored
      check the presplit pred, not the post-split pred.  This was causing us
      to make the wrong decision in some cases, leaving the critical edge block
      in the loop.
      
      llvm-svn: 23601
      8fcce170
  28. Sep 27, 2005
  29. Sep 13, 2005
  30. Sep 12, 2005
    • Chris Lattner's avatar
      Fix a regression from last night, which caused this pass to create invalid · 8048b85e
      Chris Lattner authored
      code for IV uses outside of loops that are not dominated by the latch block.
      We should only convert these uses to use the post-inc value if they ARE
      dominated by the latch block.
      
      Also use a new LoopInfo method to simplify some code.
      
      This fixes Transforms/LoopStrengthReduce/2005-09-12-UsesOutOutsideOfLoop.ll
      
      llvm-svn: 23318
      8048b85e
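      Why the dominance condition matters can be shown with a search loop in C.
      This is a sketch only: find_index, find_index_postinc_everywhere and
      sample are hypothetical names, and "latch" is used loosely for the point
      in the source where the increment happens. An exit taken before the
      increment is not dominated by it, so a use reached through that exit must
      see the pre-incremented value.

```c
#include <assert.h>

/* Hypothetical sample input. */
static const int sample[4] = {5, 7, 9, 11};

/* Correct: the early exit leaves before the increment, so the use after the
   loop sees the pre-incremented value on that path. */
static int find_index(const int *a, int n, int key) {
    assert(n >= 0);
    int i = 0;
    while (i < n) {
        if (a[i] == key)
            break;      /* this exit is NOT dominated by the increment */
        ++i;            /* the increment (the "latch" in this sketch) */
    }
    return i;
}

/* Deliberately wrong: models rewriting the outside use to the
   post-incremented value on every exit path. */
static int find_index_postinc_everywhere(const int *a, int n, int key) {
    assert(n >= 0);
    int i = 0, post = 0;
    while (i < n) {
        post = i + 1;
        if (a[i] == key)
            return post;   /* off by one: the increment never ran on this path */
        i = post;
    }
    return i;
}
```

      With the sample array and key 9, the correct version returns 2 while the
      deliberately wrong one returns 3; when the key is absent both return 4,
      because that exit is reached only after the increment.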
    • Chris Lattner's avatar
      _test: · a6764839
      Chris Lattner authored
              li r2, 0
      LBB_test_1:     ; no_exit.2
              li r5, 0
              stw r5, 0(r3)
              addi r2, r2, 1
              addi r3, r3, 4
              cmpwi cr0, r2, 701
              blt cr0, LBB_test_1     ; no_exit.2
      LBB_test_2:     ; loopexit.2.loopexit
              addi r2, r2, 1
              stw r2, 0(r4)
              blr

      Uses of IV's outside of the loop should use the post-incremented version
      of the IV, not the pre-incremented version.  This helps many loops (e.g. in sixtrack)
      which used to generate code like this (this is the code from the
      dont-hoist-simple-loop-constants.ll testcase):
      
      _test:
              li r2, 0                 **** IV starts at 0
      LBB_test_1:     ; no_exit.2
              or r5, r2, r2            **** Copy for loop exit
              li r2, 0
              stw r2, 0(r3)
              addi r3, r3, 4
              addi r2, r5, 1
              addi r6, r5, 2           **** IV+2
              cmpwi cr0, r6, 701
              blt cr0, LBB_test_1     ; no_exit.2
      LBB_test_2:     ; loopexit.2.loopexit
              addi r2, r5, 2       ****  IV+2
              stw r2, 0(r4)
              blr
      
      And now generates code like this:
      
      _test:
              li r2, 1               *** IV starts at 1
      LBB_test_1:     ; no_exit.2
              li r5, 0
              stw r5, 0(r3)
              addi r2, r2, 1
              addi r3, r3, 4
              cmpwi cr0, r2, 701     *** IV.postinc + 0
              blt cr0, LBB_test_1
      LBB_test_2:     ; loopexit.2.loopexit
              stw r2, 0(r4)          *** IV.postinc + 0
              blr
      
      llvm-svn: 23313
      a6764839
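      The two PowerPC listings in this message correspond roughly to the
      following C shapes. This is a sketch under assumptions: run_preinc,
      run_postinc, exit_values_match and the two constants are hypothetical
      names, and the bound 701 is taken from the compare in the listings.

```c
#include <assert.h>

enum { TRIP_LIMIT = 701, SLOTS = 700 };

/* Old shape: the pre-incremented IV stays live, and IV+2 is computed both
   for the exit test and again for the use outside the loop. */
static int run_preinc(int *p) {
    assert(p);
    int iv = 0, cur;
    do {
        cur = iv;            /* copy kept alive for the loop exit */
        *p++ = 0;
        iv = cur + 1;
    } while (cur + 2 < TRIP_LIMIT);
    return cur + 2;          /* use outside the loop */
}

/* New shape: the IV starts at 1 and the post-incremented value serves both
   the exit test and the use outside the loop. */
static int run_postinc(int *p) {
    assert(p);
    int iv = 1;
    do {
        *p++ = 0;
        iv += 1;
    } while (iv < TRIP_LIMIT);
    return iv;               /* IV.postinc + 0 */
}

/* Returns 1 when both shapes produce the same value after the loop. */
static int exit_values_match(void) {
    static int a[SLOTS], b[SLOTS];
    return run_preinc(a) == TRIP_LIMIT && run_postinc(b) == TRIP_LIMIT;
}
```

      Both loops run 700 times and leave 701 as the value used after the loop;
      the second form avoids the copy and the two extra adds.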