  1. Dec 07, 2006
  2. Dec 06, 2006
  3. Dec 01, 2006
  4. Nov 17, 2006
  5. Nov 04, 2006
  6. Nov 02, 2006
  7. Oct 12, 2006
  8. Sep 05, 2006
    • Fix a long-standing wart in the code generator: two-address instruction lowering · 13a5dcdd
      Chris Lattner authored
      actually *removes* one of the operands, instead of just assigning both operands
      the same register.  This makes reasoning about instructions unnecessarily complex,
      because you need to know if you are before or after register allocation to match
      up operand #'s with the target description file.
      
      Changing this also gets rid of a bunch of hacky code in various places.
      
      This patch also includes changes to fold loads into cmp/test instructions in
      the X86 backend, along with a significant simplification to the X86 spill
      folding code.
      
      llvm-svn: 30108
      13a5dcdd
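
A minimal sketch of the idea in the message above, not the code from the patch: a three-operand instruction is lowered to two-address form by copying the first source into the destination and then tying the two together, while keeping all three operand slots so operand numbers still line up with the target description. The `Instr` type, the `"copy"` opcode, and `lower_two_address` are hypothetical names, and the sketch assumes each destination is a fresh virtual register that differs from the second source.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    opcode: str
    operands: Tuple[str, ...]  # operand 0 is the destination


def lower_two_address(instrs: List[Instr]) -> List[Instr]:
    """Rewrite `dst = op src1, src2` as `dst = copy src1; dst = op dst, src2`.

    Assumption: dst is never the same register as src2, so the inserted
    copy cannot clobber a value that is still needed.
    """
    out = []
    for ins in instrs:
        if ins.opcode == "copy" or len(ins.operands) != 3:
            out.append(ins)
            continue
        dst, src1, src2 = ins.operands
        if dst != src1:
            out.append(Instr("copy", (dst, src1)))
        # Both tied operands stay in place and name the same register,
        # so the operand count is unchanged by lowering.
        out.append(Instr(ins.opcode, (dst, dst, src2)))
    return out
```

Keeping the tied operand means code that indexes operands does not need to know whether it runs before or after register allocation.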
  9. Aug 27, 2006
  10. Aug 25, 2006
    • Take advantage of the recent improvements to the liveintervals set (tracking · bdf12106
      Chris Lattner authored
      instructions which define each value#) to simplify and improve the coalescer.
      In particular, this patch:
      
      1. Implements iterative coalescing.
      2. Reverts an unsafe hack from handlePhysRegDef, superseding it with a
         better solution.
      3. Implements PR865, "coalescing" away the second copy in code like:
      
         A = B
         ...
         B = A
      
      This also includes changes to symbolically print registers in intervals
      when possible.
      
      llvm-svn: 29862
      bdf12106
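
The `A = B ... B = A` case from item 3 can be illustrated with a block-local approximation. This is only a sketch under stated assumptions: the patch itself works on live intervals and value numbers, whereas the hypothetical `drop_redundant_copies` below walks a straight-line instruction list of `(dest, opcode, sources)` tuples and remembers which register pairs are known to hold the same value.

```python
def drop_redundant_copies(instrs):
    """Remove a copy whose two registers are already known to be equal.

    instrs: list of (dest, opcode, sources) tuples; dest may be None
    for instructions that define no register.
    """
    equal = set()  # frozensets {x, y}: registers currently holding the same value
    out = []
    for dest, opcode, srcs in instrs:
        if opcode == "copy" and frozenset((dest, srcs[0])) in equal:
            continue  # e.g. `B = A` after `A = B` with neither redefined
        # Any definition of dest invalidates what was known about it.
        equal = {pair for pair in equal if dest not in pair}
        if opcode == "copy" and dest != srcs[0]:
            equal.add(frozenset((dest, srcs[0])))
        out.append((dest, opcode, srcs))
    return out
```

If either register is redefined between the two copies, the pair is forgotten and the second copy is kept.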
  11. Aug 21, 2006
  12. Jul 21, 2006
  13. Jul 20, 2006
  14. Jun 29, 2006
  15. May 04, 2006
  16. May 02, 2006
  17. May 01, 2006
  18. Apr 30, 2006
  19. Apr 28, 2006
    • Mapping of physregs can make it so that the designated and input physregs are · 79c50d96
      Chris Lattner authored
      the same.  In this case, don't emit a noop copy.
      
      llvm-svn: 28008
      79c50d96
    • When we have a two-address instruction where the input cannot be clobbered · 84e95d00
      Chris Lattner authored
      and is already available, instead of falling back to emitting a load, fall
      back to emitting a reg-reg copy.  This generates significantly better code
      for some SSE testcases, as SSE has lots of two-address instructions and
      none of them are read/modify/write.  As one example, this change does:
      
              pshufd %XMM5, XMMWORD PTR [%ESP + 84], 255
              xorps %XMM2, %XMM5
              cmpltps %XMM1, %XMM0
      -       movaps XMMWORD PTR [%ESP + 52], %XMM0
      -       movapd %XMM6, XMMWORD PTR [%ESP + 52]
      +       movaps %XMM6, %XMM0
              cmpltps %XMM6, XMMWORD PTR [%ESP + 68]
              movapd XMMWORD PTR [%ESP + 52], %XMM6
              movaps %XMM6, %XMM0
              cmpltps %XMM6, XMMWORD PTR [%ESP + 36]
              cmpltps %XMM3, %XMM0
      -       movaps XMMWORD PTR [%ESP + 20], %XMM0
      -       movapd %XMM7, XMMWORD PTR [%ESP + 20]
      +       movaps %XMM7, %XMM0
              cmpltps %XMM7, XMMWORD PTR [%ESP + 4]
              movapd XMMWORD PTR [%ESP + 20], %XMM7
              cmpltps %XMM4, %XMM0
      
      ... which is far better than a store followed by a load!
      
      llvm-svn: 28001
      84e95d00
  20. Feb 25, 2006
  21. Feb 04, 2006
  22. Feb 03, 2006
    • Fix VC++ compilation error caused by using a std::map iterator variable to receive · 3276ff7a
      Jeff Cohen authored
      a std::multimap iterator value.  For some reason, GCC doesn't have a problem with this.
      
      llvm-svn: 25927
      3276ff7a
    • Simplify some code · 774d4a19
      Chris Lattner authored
      llvm-svn: 25924
      774d4a19
    • Add code that checks for noop copies, which triggers when either: · 1ef239af
      Chris Lattner authored
      1. a target doesn't know how to fold load/stores into copies, or
      2. the spiller rewrites the input to a copy to the same register as the dest
         instead of to the reloaded reg.
      
      This will be moved/improved in the near future, but allows elimination of
      some ancient x86 hacks.  This eliminates 92 copies from SMG2000 on X86 and
      163 copies from 252.eon.
      
      llvm-svn: 25922
      1ef239af
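
As a rough illustration of the noop-copy check described above, the sketch below rewrites virtual registers to their assigned physical registers and drops any copy whose source and destination end up identical. The tuple encoding, the `assignment` map, and the name `drop_noop_copies` are assumptions made for this example, not the interfaces used in the patch.

```python
def drop_noop_copies(instrs, assignment):
    """Rewrite registers through `assignment`, then skip `R = R` copies.

    instrs: list of (opcode, dest, src) tuples.
    assignment: dict mapping virtual register names to physical ones.
    """
    out = []
    for opcode, dest, src in instrs:
        dest = assignment.get(dest, dest)
        src = assignment.get(src, src)
        if opcode == "copy" and dest == src:
            continue  # the copy became a noop after rewriting
        out.append((opcode, dest, src))
    return out
```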
    • Physregs may hold multiple stack slot values at the same time. Keep track · b7f24de4
      Chris Lattner authored
      of this, and use it to our advantage (bwahahah).  This allows us to eliminate another
      60 instructions from smg2000 on PPC (probably significantly more on X86).  A common
      old-new diff looks like this:
      
              stw r2, 3304(r1)
      -       lwz r2, 3192(r1)
              stw r2, 3300(r1)
      -       lwz r2, 3192(r1)
              stw r2, 3296(r1)
      -       lwz r2, 3192(r1)
              stw r2, 3200(r1)
      -       lwz r2, 3192(r1)
              stw r2, 3196(r1)
      -       lwz r2, 3192(r1)
      +       or r2, r2, r2
              stw r2, 3188(r1)
      
      and
      
      -       lwz r31, 604(r1)
      -       lwz r13, 604(r1)
      -       lwz r14, 604(r1)
      -       lwz r15, 604(r1)
      -       lwz r16, 604(r1)
      -       lwz r30, 604(r1)
      +       or r31, r30, r30
      +       or r13, r30, r30
      +       or r14, r30, r30
      +       or r15, r30, r30
      +       or r16, r30, r30
      +       or r30, r30, r30
      
      Removal of the R = R copies is coming next...
      
      llvm-svn: 25919
      b7f24de4
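
The bookkeeping this message describes can be sketched as two maps: one from each stack slot to the physreg holding its value, and one from each physreg to every slot it currently holds. The code below is a simplified model under stated assumptions, not the spiller's implementation; the `(op, reg, slot)` encoding and the name `reuse_spill_values` are hypothetical. A store does not evict the slots a register already holds, which is what lets the reloads in the first diff disappear.

```python
from collections import defaultdict


def reuse_spill_values(instrs):
    """Turn reloads into copies when a physreg already holds the slot's value.

    instrs: list of (op, reg, slot) with op in {"load", "store", "def"};
    slot is None for "def".
    """
    slot_reg = {}                  # stack slot -> physreg holding its value
    reg_slots = defaultdict(set)   # physreg -> all slots whose value it holds

    def clobber(reg):
        for slot in reg_slots.pop(reg, set()):
            del slot_reg[slot]

    def bind(reg, slot):
        old = slot_reg.get(slot)
        if old is not None:
            reg_slots[old].discard(slot)
        slot_reg[slot] = reg
        reg_slots[reg].add(slot)

    out = []
    for op, reg, slot in instrs:
        if op == "load":
            src = slot_reg.get(slot)
            if src is None:
                out.append(("load", reg, slot))
                clobber(reg)
                bind(reg, slot)
            else:
                # May be an R = R copy; removing those is a separate step.
                out.append(("copy", reg, src))
                if src != reg:
                    clobber(reg)  # conservative: keep the slot bound to src
        elif op == "store":
            out.append((op, reg, slot))
            bind(reg, slot)       # reg now holds this slot too, without eviction
        else:
            clobber(reg)
            out.append((op, reg, slot))
    return out
```

With only one slot tracked per register, the store to a second slot would overwrite the earlier binding and the following reload would have to stay.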
    • Fix a deficiency in the spiller that Evan noticed. In particular, consider · f3aef1b0
      Chris Lattner authored
      this code:
      
        store [stack slot #0],  R10
          = add R14, [stack slot #0]
      
      The spiller didn't know that the store made the value of [stack slot #0] available
      in R10 *IF* the store came from a copy instruction with the store folded into it.
      
      This patch teaches VirtRegMap to look at these stores and recognize the values
      they make available.  In one case Evan provided, this code:
      
              divsd %XMM0, %XMM1
              movsd %XMM1, QWORD PTR [%ESP + 40]
      1)      movsd QWORD PTR [%ESP + 48], %XMM1
      2)      movsd %XMM1, QWORD PTR [%ESP + 48]
              addsd %XMM1, %XMM0
      3)      movsd QWORD PTR [%ESP + 48], %XMM1
              movsd QWORD PTR [%ESP + 4], %XMM0
      
      turns into:
      
              divsd %XMM0, %XMM1
              movsd %XMM1, QWORD PTR [%ESP + 40]
              addsd %XMM1, %XMM0
      3)      movsd QWORD PTR [%ESP + 48], %XMM1
              movsd QWORD PTR [%ESP + 4], %XMM0
      
      In this case, instruction #2 was removed because of the value made
      available by #1, and inst #1 was later deleted because it is now
      never used before the stack slot is redefined by #3.
      
      This occurs here and there in a lot of code with high spilling.  On PPC,
      most of the removed loads/stores are LSU-reject-causing loads, which is
      nice.
      
      On X86, things are much better (because it spills more), where we nuke
      about 1% of the instructions from SMG2000 and several hundred from eon.
      
      More improvements to come...
      
      llvm-svn: 25917
      f3aef1b0
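
The second half of the example above, where store #1 is deleted once reload #2 is gone, can be sketched as a block-local pass that drops a spill store when the same slot is stored again before anything reads it. This is a simplified model rather than the VirtRegMap code; the `(op, reg, slot)` encoding and the name `remove_dead_spill_stores` are assumptions, and the last store to each slot is always treated as live.

```python
def remove_dead_spill_stores(instrs):
    """Delete stores to a slot that is overwritten before being read.

    instrs: list of (op, reg, slot); op "store" writes the slot, any other
    op with a non-None slot is treated as reading it.
    """
    pending = {}  # slot -> index in `out` of a store nothing has read yet
    out = []
    for op, reg, slot in instrs:
        if op == "store":
            prev = pending.get(slot)
            if prev is not None:
                out[prev] = None      # overwritten before any read: dead
            pending[slot] = len(out)
        elif slot is not None:
            pending.pop(slot, None)   # a read keeps the earlier store alive
        out.append((op, reg, slot))
    return [ins for ins in out if ins is not None]
```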