  1. Feb 03, 2006
    • Jeff Cohen's avatar
      Fix VC++ compilation error caused by using a std::map iterator variable to receive · 3276ff7a
      Jeff Cohen authored
      a std::multimap iterator value.  For some reason, GCC doesn't have a problem with this.
      
      llvm-svn: 25927
      3276ff7a
    • Chris Lattner's avatar
      Simplify some code · 774d4a19
      Chris Lattner authored
      llvm-svn: 25924
      774d4a19
    • Chris Lattner's avatar
      Add code that checks for noop copies, which triggers when either: · 1ef239af
      Chris Lattner authored
      1. a target doesn't know how to fold load/stores into copies, or
      2. the spiller rewrites the input to a copy to the same register as the dest
         instead of to the reloaded reg.
      
      This will be moved/improved in the near future, but allows elimination of
      some ancient x86 hacks.  This eliminates 92 copies from SMG2000 on X86 and
      163 copies from 252.eon.
      
      llvm-svn: 25922
      1ef239af
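A minimal sketch of the noop-copy check described above, assuming a simplified instruction record. `Inst`, `isNoopCopy`, and `removeNoopCopies` are hypothetical names for illustration and are not LLVM's `MachineInstr` API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified instruction: an opcode plus one destination
// and one source register number.
struct Inst {
  std::string opcode; // "copy", "add", ...
  unsigned dst;
  unsigned src;
};

// A copy whose source and destination are the same register does nothing.
inline bool isNoopCopy(const Inst &I) {
  return I.opcode == "copy" && I.dst == I.src;
}

// Erases every noop copy from the block and returns how many were removed.
std::size_t removeNoopCopies(std::vector<Inst> &code) {
  const std::size_t before = code.size();
  code.erase(std::remove_if(code.begin(), code.end(), isNoopCopy), code.end());
  assert(code.size() <= before);
  return before - code.size();
}
```

Applied to a block holding `copy r2, r2`, `add r3, r2`, and `copy r4, r2`, only the first instruction is removed.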
    • Chris Lattner's avatar
      Physregs may hold multiple stack slot values at the same time. Keep track · b7f24de4
      Chris Lattner authored
      of this, and use it to our advantage (bwahahah).  This allows us to eliminate another
      60 instructions from smg2000 on PPC (probably significantly more on X86).  A common
      old-new diff looks like this:
      
              stw r2, 3304(r1)
      -       lwz r2, 3192(r1)
              stw r2, 3300(r1)
      -       lwz r2, 3192(r1)
              stw r2, 3296(r1)
      -       lwz r2, 3192(r1)
              stw r2, 3200(r1)
      -       lwz r2, 3192(r1)
              stw r2, 3196(r1)
      -       lwz r2, 3192(r1)
      +       or r2, r2, r2
              stw r2, 3188(r1)
      
      and
      
      -       lwz r31, 604(r1)
      -       lwz r13, 604(r1)
      -       lwz r14, 604(r1)
      -       lwz r15, 604(r1)
      -       lwz r16, 604(r1)
      -       lwz r30, 604(r1)
      +       or r31, r30, r30
      +       or r13, r30, r30
      +       or r14, r30, r30
      +       or r15, r30, r30
      +       or r16, r30, r30
      +       or r30, r30, r30
      
      Removal of the R = R copies is coming next...
      
      llvm-svn: 25919
      b7f24de4
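One way to picture a physreg holding several stack slot values at once is a pair of maps: slot to register, and register to every slot it currently holds. The sketch below is an assumption about the shape of such bookkeeping, not the code from this commit; `AvailableSpills` and its method names are hypothetical.

```cpp
#include <cassert>
#include <map>

// Hypothetical tracker: register 0 is reserved to mean "no register".
class AvailableSpills {
  std::map<int, unsigned> SlotToReg;       // slot -> physreg holding its value
  std::multimap<unsigned, int> RegToSlots; // physreg -> all slots it holds

public:
  // Drop whatever is recorded for Slot.
  void forgetSlot(int Slot) {
    auto It = SlotToReg.find(Slot);
    if (It == SlotToReg.end())
      return;
    auto Range = RegToSlots.equal_range(It->second);
    for (auto RI = Range.first; RI != Range.second; ++RI) {
      if (RI->second == Slot) {
        RegToSlots.erase(RI);
        break;
      }
    }
    SlotToReg.erase(It);
  }

  // Record that Reg now also holds the value of Slot.
  void setAvailable(int Slot, unsigned Reg) {
    assert(Reg != 0 && "0 is reserved for 'no register'");
    forgetSlot(Slot);
    SlotToReg[Slot] = Reg;
    RegToSlots.insert({Reg, Slot});
  }

  // Returns the register holding Slot, or 0 if none.
  unsigned getReg(int Slot) const {
    auto It = SlotToReg.find(Slot);
    return It == SlotToReg.end() ? 0 : It->second;
  }

  // Reg was overwritten: every slot value it held is gone.
  void clobberReg(unsigned Reg) {
    auto Range = RegToSlots.equal_range(Reg);
    for (auto It = Range.first; It != Range.second; ++It)
      SlotToReg.erase(It->second);
    RegToSlots.erase(Range.first, Range.second);
  }
};
```

In the first diff above, r2 is loaded from 3192(r1) and then stored to 3304(r1), so both slots are available in r2 until r2 is redefined. That is why the repeated `lwz r2, 3192(r1)` reloads can go away.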
    • Chris Lattner's avatar
      Fix a deficiency in the spiller that Evan noticed. In particular, consider · f3aef1b0
      Chris Lattner authored
      this code:
      
        store [stack slot #0],  R10
          = add R14, [stack slot #0]
      
      The spiller didn't know that the store made the value of [stackslot#0] available
      in R10 *IF* the store came from a copy instruction with the store folded into it.
      
      This patch teaches VirtRegMap to look at these stores and recognize the values
      they make available.  In one case Evan provided, this code:
      
              divsd %XMM0, %XMM1
              movsd %XMM1, QWORD PTR [%ESP + 40]
      1)      movsd QWORD PTR [%ESP + 48], %XMM1
      2)      movsd %XMM1, QWORD PTR [%ESP + 48]
              addsd %XMM1, %XMM0
      3)      movsd QWORD PTR [%ESP + 48], %XMM1
              movsd QWORD PTR [%ESP + 4], %XMM0
      
      turns into:
      
              divsd %XMM0, %XMM1
              movsd %XMM1, QWORD PTR [%ESP + 40]
              addsd %XMM1, %XMM0
      3)      movsd QWORD PTR [%ESP + 48], %XMM1
              movsd QWORD PTR [%ESP + 4], %XMM0
      
      In this case, instruction #2 was removed because of the value made
      available by #1, and inst #1 was later deleted because it is now
      never used before the stack slot is redefined by #3.
      
      This occurs here and there in a lot of code with high spilling.  On PPC,
      most of the removed loads/stores are LSU-reject-causing loads, which is
      nice.
      
      On X86, things are much better (because it spills more), where we nuke
      about 1% of the instructions from SMG2000 and several hundred from eon.
      
      More improvements to come...
      
      llvm-svn: 25917
      f3aef1b0
  2. Feb 02, 2006
  3. Jan 23, 2006
  4. Jan 04, 2006
  5. Oct 06, 2005
  6. Oct 05, 2005
    • Chris Lattner's avatar
      Fix a bug in the local spiller, where we could take code like this: · 55149d78
      Chris Lattner authored
        store R12 -> [ss#2]
        R3 = load [ss#1]
        use R3
        R3 = load [ss#2]
        R4 = load [ss#1]
      
      and turn it into this code:
      
        store R12 -> [ss#2]
        R3 = load [ss#1]
        use R3
        R3 = R12
        R4 = R3    <- oops!
      
      The problem was that promoting R3 = load[ss#2] to a copy missed the fact that
      the instruction invalidated R3 at that point.
      
      llvm-svn: 23638
      55149d78
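The ordering issue behind this fix can be sketched with a toy rewriter over a made-up mini-IR. Everything here (`Op`, `Inst`, `rewriteBlock`) is hypothetical and far simpler than the real spiller; the point is only that turning a load into a copy must also forget any slot value the destination register held before.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical mini-IR, for illustration only.
enum class Op { Store, Load, Copy, Use };

struct Inst {
  Op op;
  unsigned reg; // Store: source reg; Load/Copy: destination reg; Use: reg read
  int slot;     // Store/Load: stack slot; otherwise unused
  unsigned src; // Copy: source reg; otherwise unused
};

// Rewrites stack slot loads into reg-reg copies within one basic block.
std::vector<Inst> rewriteBlock(const std::vector<Inst> &in) {
  std::map<int, unsigned> avail; // slot -> register currently holding it
  std::vector<Inst> out;

  // Forget every slot value held in register r (r is being redefined).
  auto invalidate = [&avail](unsigned r) {
    for (auto it = avail.begin(); it != avail.end();) {
      if (it->second == r)
        it = avail.erase(it);
      else
        ++it;
    }
  };

  for (const Inst &I : in) {
    switch (I.op) {
    case Op::Store:
      avail[I.slot] = I.reg; // the stored value stays live in the source reg
      out.push_back(I);
      break;
    case Op::Load: {
      auto it = avail.find(I.slot);
      if (it != avail.end() && it->second == I.reg)
        break; // value is already in the right register: drop the load
      if (it != avail.end()) {
        unsigned src = it->second;
        assert(src != I.reg);
        invalidate(I.reg); // the fix: the copy redefines the destination
        out.push_back({Op::Copy, I.reg, 0, src});
      } else {
        invalidate(I.reg);
        avail[I.slot] = I.reg;
        out.push_back(I);
      }
      break;
    }
    case Op::Copy:
      invalidate(I.reg);
      out.push_back(I);
      break;
    case Op::Use:
      out.push_back(I);
      break;
    }
  }
  return out;
}
```

Fed the five-instruction example above, this sketch emits `R3 = R12` for the second load of R3 and leaves `R4 = load [ss#1]` as a real load, because [ss#1] is no longer recorded as living in R3.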
  7. Sep 30, 2005
  8. Sep 19, 2005
    • Chris Lattner's avatar
      Teach the local spiller to turn stack slot loads into register-register copies · 2f838f21
      Chris Lattner authored
      when possible, avoiding the load (and avoiding the copy if the value is already
      in the right register).
      
      This patch came about when I noticed code like the following being generated:
      
        store R17 -> [SS1]
        ...blah...
        R4 = load [SS1]
      
      This was causing an LSU reject on the G5.  This problem was due to the register
      allocator folding spill code into a reg-reg copy (producing the load), which
      prevented the spiller from being able to rewrite the load into a copy, despite
      the fact that the value was already available in a register.  In the case
      above, we now rip out the R4 load and replace it with a R4 = R17 copy.
      
      This speeds up several programs on X86 (which spills a lot :) ), e.g.
      smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from
      68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc.  This may have a larger
      impact in some cases on the G5 (by avoiding LSU rejects), though it probably
      won't trigger as often (less spilling in general).
      
      Targets that implement folding of loads/stores into copies should implement
      the isLoadFromStackSlot hook to get this.
      
      llvm-svn: 23388
      2f838f21
  9. Sep 09, 2005
  10. Apr 22, 2005
  11. Apr 04, 2005
  12. Jan 23, 2005
  13. Jan 14, 2005
  14. Oct 26, 2004
  15. Oct 15, 2004
    • Chris Lattner's avatar
      This patch fixes the nasty bug that caused 175.vpr to fail for X86 last night. · 21522363
      Chris Lattner authored
      The problem occurred when trying to reload this instruction:
      
      MOV32mr %reg2326, 8, %reg2297, 4, %reg2295
      
      The value of reg2326 was available in EBX, so it was reused from there, instead
      of reloading it into EDX.
      
      The value of reg2297 was available in EDX, so it was reused from there, instead
      of reloading it into EDI.
      
      The value of reg2295 was not available, so we tried reloading it into EBX, its
      assigned register.  However, we checked and saw that we already reloaded
      something into EBX, so we chose what reg2326 was assigned to (EDX) and reloaded
      into that register instead.
      
      Unfortunately EDX had already been used by reg2297, so reloading into EDX
      clobbered the value used by the reg2326 operand, breaking the program.
      
      The fix for this is to check that the newly picked register is ok.  In this
      case we now find that EDX is already used and try using EDI, which succeeds.
      
      llvm-svn: 17006
      21522363
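The check described above amounts to validating every candidate reload register against the set of registers already used by the current instruction. The following is a rough sketch under that assumption; `pickReloadReg`, the fallback list, and the register numbering are hypothetical and do not reflect how the real spiller enumerates candidates.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Returns a register for the reload that does not collide with any register
// already used by this instruction, or 0 if none of the candidates is free.
// Register 0 is reserved to mean "no register".
unsigned pickReloadReg(unsigned assigned,
                       const std::vector<unsigned> &fallbacks,
                       const std::set<unsigned> &usedByThisInst) {
  assert(assigned != 0 && "0 is reserved for 'no register'");
  if (usedByThisInst.count(assigned) == 0)
    return assigned;
  // The fix: a fallback must pass the same check as the first choice.
  for (unsigned r : fallbacks)
    if (usedByThisInst.count(r) == 0)
      return r;
  return 0;
}
```

With EBX and EDX already in use, asking for EBX with fallbacks EDX and then EDI yields EDI, matching the outcome described in the commit message.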
    • Chris Lattner's avatar
      9af0572a
  16. Oct 02, 2004
  17. Oct 01, 2004
    • Chris Lattner's avatar
      Add a simple little improvement to the local spiller to keep track of stores · 04f52079
      Chris Lattner authored
      and delete them if they turn out to be dead.  This is a useful little hack
      that even speeds up some programs.  For example, it speeds up PtrDist/ks
      from 17.53s to 15.59s, and 188.ammp from 149s to 146s.
      
      This also speeds up llc :)
      
      llvm-svn: 16630
      04f52079
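A hedged sketch of block-local dead store tracking: remember the most recent store to each stack slot, and if another store to the same slot arrives before any load reads it, the earlier store is dead. `Kind`, `Inst`, and `findDeadStores` are hypothetical names, and stores still pending at the end of the block are conservatively treated as live.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical instruction record: only its kind and stack slot matter here.
enum class Kind { Store, Load, Other };

struct Inst {
  Kind kind;
  int slot; // ignored for Kind::Other
};

// Returns the indices of stores that are overwritten before being read.
std::vector<std::size_t> findDeadStores(const std::vector<Inst> &block) {
  std::map<int, std::size_t> pending; // slot -> index of its last unread store
  std::vector<std::size_t> dead;
  for (std::size_t i = 0; i < block.size(); ++i) {
    const Inst &I = block[i];
    if (I.kind == Kind::Load) {
      pending.erase(I.slot); // the last store to this slot was needed
    } else if (I.kind == Kind::Store) {
      auto it = pending.find(I.slot);
      if (it != pending.end()) {
        assert(it->second < i);
        dead.push_back(it->second); // never read before this redefinition
      }
      pending[I.slot] = i;
    }
  }
  return dead;
}
```

A caller would delete the returned indices from the block, highest index first so that the remaining indices stay valid.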
    • Chris Lattner's avatar
      Substantially revamp the local spiller, causing it to actually improve the · d3b1f6c7
      Chris Lattner authored
      generated code over the simple spiller.  The new local spiller generates
      substantially better code than the simple one in some cases, by reusing
      values that are loaded out of stack slots and kept available in registers.
      
      This primarily helps programs that are spilling a lot, and there is still
      stuff that can be done to improve it.  This patch makes the local spiller
      the default, as it's only a tiny bit slower than the simple spiller (it
      increases the runtime of llc by < 1%).
      
      Here are some numbers with speedups.
      
      Program    #reuse  old(s)    new(s)  Speedup
      
      Povray:     3452,  16.87 ->  15.93   (5.5%)
      177.mesa:   2176,   2.77 ->   2.76   (0%)
      179.art:      35,  28.43 ->  28.01   (1.5%)
      183.equake:   55,  61.44 ->  61.41   (0%)
      188.ammp:    869, 174    -> 149      (15%)
      
      164.gzip:     43,  40.73 ->  40.71   (0%)
      175.vpr:     351,  18.54 ->  17.34   (6.5%)
      176.gcc:    2471,   5.01 ->   4.92   (1.8%)
      181.mcf:      42,  79.30 ->  75.20   (5.2%)
      186.crafty:  484,  29.73 ->  30.04   (-1%)
      197.parser:  251,  10.47 ->  10.67   (-1%)
      252.eon:    1501,   1.98 ->   1.75   (12%)
      253.perlbm: 1183,  14.83 ->  14.42   (2.8%)
      254.gap:     825,   7.46 ->   7.29   (2.3%)
      255.vortex:  285,  10.51 ->  10.27   (2.3%)
      256.bzip2:    63,  55.70 ->  55.20   (0.9%)
      300.twolf:   830,  21.63 ->  22.00   (-1%)
      
      PtrDist/ks:   14,  32.75 ->  17.53   (46.5%)
      Olden/tsp:    46,   8.71 ->   8.24   (5.4%)
      Free/distray: 70,   1.09 ->   0.99   (9.2%)
      
      llvm-svn: 16629
      d3b1f6c7
  18. Sep 30, 2004
  19. Sep 02, 2004
    • Reid Spencer's avatar
      Changes For Bug 352 · 7c16caa3
      Reid Spencer authored
      Move include/Config and include/Support into include/llvm/Config,
      include/llvm/ADT and include/llvm/Support. From here on out, all LLVM
      public header files must be under include/llvm/.
      
      llvm-svn: 16137
      7c16caa3
  20. Aug 16, 2004
  21. Aug 15, 2004
  22. Jul 21, 2004
  23. Jul 16, 2004
  24. Jun 25, 2004
  25. Jun 02, 2004
  26. May 29, 2004