  1. Feb 29, 2004
  2. Feb 28, 2004
  3. Feb 27, 2004
  4. Feb 26, 2004
  5. Feb 25, 2004
    • SparcV8 regs are really 32-bit, not 64! Thanks, Chris. · 564654d6
      Misha Brukman authored
      llvm-svn: 11835
    • Clean up the tablegen descriptions for SparcV8. · f8dcdcc8
      Misha Brukman authored
      llvm-svn: 11834
    • 2122b969
      Misha Brukman authored
    • 0e3a7ca5
      Misha Brukman authored
    • Fix failures in 099.go due to the cfgsimplify pass creating switch instructions
      where there did not use to be any before · 64c9b223
      Chris Lattner authored
      llvm-svn: 11829
    • SparcV8 skeleton · 9a5bd7fc
      Brian Gaeke authored
      llvm-svn: 11828
    • Great renaming: Sparc --> SparcV9 · 94e95d2b
      Brian Gaeke authored
      llvm-svn: 11826
    • Teach the instruction selector how to transform 'array' GEP computations into X86
      scaled indexes. · 309327a4
      Chris Lattner authored

      This allows us to compile GEPs like this:
      
      int* %test([10 x { int, { int } }]* %X, int %Idx) {
              %Idx = cast int %Idx to long
              %X = getelementptr [10 x { int, { int } }]* %X, long 0, long %Idx, ubyte 1, ubyte 0
              ret int* %X
      }
      
      Into a single address computation:
      
      test:
              mov %EAX, DWORD PTR [%ESP + 4]
              mov %ECX, DWORD PTR [%ESP + 8]
              lea %EAX, DWORD PTR [%EAX + 8*%ECX + 4]
              ret
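
The scale of 8 and the displacement of 4 in the lea follow from the layout of the array element type. The sketch below models `{ int, { int } }` with Python's ctypes, assuming 32-bit ints and no padding (as on 32-bit x86); the class and function names are hypothetical and only illustrate the arithmetic, not the instruction selector's code.

```python
import ctypes

# Hypothetical model of the LLVM type { int, { int } }, assuming 32-bit ints.
class Inner(ctypes.Structure):
    _fields_ = [("value", ctypes.c_int32)]

class Element(ctypes.Structure):
    _fields_ = [("first", ctypes.c_int32), ("second", Inner)]

# Model of [10 x { int, { int } }].
ArrayType = Element * 10

def gep_byte_offset(idx):
    """Byte offset selected by the indices: long 0, long idx, ubyte 1, ubyte 0."""
    return (0 * ctypes.sizeof(ArrayType)   # long 0: stay at the pointed-to array
            + idx * ctypes.sizeof(Element) # long idx: step over whole elements
            + Element.second.offset        # ubyte 1: second field of the element
            + Inner.value.offset)          # ubyte 0: first field of the inner struct

SCALE = ctypes.sizeof(Element)                     # multiplier applied to %ECX
DISP = Element.second.offset + Inner.value.offset  # constant added in the lea
```

Under these assumptions the whole GEP reduces to base + SCALE * idx + DISP, which is the shape of an x86 scaled-index address.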
      
      Previously, it generated:

      test:
              mov %EAX, DWORD PTR [%ESP + 4]
              mov %ECX, DWORD PTR [%ESP + 8]
              shl %ECX, 3
              add %EAX, %ECX
              lea %EAX, DWORD PTR [%EAX + 4]
              ret
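
As a sanity check, the two sequences can be emulated with 32-bit wraparound arithmetic to confirm they produce the same value in %EAX. This is a minimal sketch: the function names are hypothetical, and it models only the register results (it ignores flags, which shl and add modify and lea does not).

```python
MASK32 = 0xFFFFFFFF  # keep results in 32 bits, like %EAX and %ECX

def lea_scaled(base, index):
    # lea %EAX, DWORD PTR [%EAX + 8*%ECX + 4]
    return (base + 8 * index + 4) & MASK32

def shift_add_lea(base, index):
    ecx = (index << 3) & MASK32  # shl %ECX, 3
    eax = (base + ecx) & MASK32  # add %EAX, %ECX
    return (eax + 4) & MASK32    # lea %EAX, DWORD PTR [%EAX + 4]

def sequences_agree(samples):
    """True if both sequences compute the same address for every (base, index)."""
    return all(lea_scaled(b, i) == shift_add_lea(b, i) for b, i in samples)
```

For example, `sequences_agree([(0x1000, 3), (0xFFFFFFF0, 5)])` compares the two forms, including a case that wraps around 32 bits.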
      
      This is useful for things like int/float/double arrays, as the indexing can be folded into
      the loads and stores, reducing register pressure and decreasing the pressure on the decode unit.
      With these changes, I expect our performance on 256.bzip2 and gzip to improve a lot.  On
      bzip2, for example, we go from this:
      
      10665 asm-printer           - Number of machine instrs printed
         40 ra-local              - Number of loads/stores folded into instructions
       1708 ra-local              - Number of loads added
       1532 ra-local              - Number of stores added
       1354 twoaddressinstruction - Number of instructions added
       1354 twoaddressinstruction - Number of two-address instructions
       2794 x86-peephole          - Number of peephole optimization performed
      
      to this:
      9873 asm-printer           - Number of machine instrs printed
        41 ra-local              - Number of loads/stores folded into instructions
      1710 ra-local              - Number of loads added
      1521 ra-local              - Number of stores added
       789 twoaddressinstruction - Number of instructions added
       789 twoaddressinstruction - Number of two-address instructions
      2142 x86-peephole          - Number of peephole optimization performed
      
      ... and these types of instructions are often in tight loops.
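
To put the two bzip2 listings side by side, the sketch below computes the absolute and relative change of each statistic. The numbers are copied from the ra-local listings above; the dictionary keys and helper names are hypothetical labels chosen for this illustration.

```python
# Statistics quoted above for bzip2 (before and after the change).
before = {
    "machine instrs printed": 10665,
    "loads/stores folded": 40,
    "loads added": 1708,
    "stores added": 1532,
    "two-address instructions": 1354,
    "peephole optimizations": 2794,
}
after = {
    "machine instrs printed": 9873,
    "loads/stores folded": 41,
    "loads added": 1710,
    "stores added": 1521,
    "two-address instructions": 789,
    "peephole optimizations": 2142,
}

def percent_change(old, new):
    """Relative change from old to new, in percent (negative means a drop)."""
    return 100.0 * (new - old) / old

def summarize(old_stats, new_stats):
    """Map each statistic to (absolute delta, percent change)."""
    return {name: (new_stats[name] - old_stats[name],
                   percent_change(old_stats[name], new_stats[name]))
            for name in old_stats}

for name, (delta, pct) in summarize(before, after).items():
    print(f"{name:28s} {delta:+6d} {pct:+7.1f}%")
```

On these figures, 792 fewer machine instructions are printed (roughly 7% of the original count), and the two-address instruction count drops by 565.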
      
      Linear scan is also helped, but not as much.  It goes from:
      
      8787 asm-printer           - Number of machine instrs printed
      2389 liveintervals         - Number of identity moves eliminated after coalescing
      2288 liveintervals         - Number of interval joins performed
      3522 liveintervals         - Number of intervals after coalescing
      5810 liveintervals         - Number of original intervals
       700 spiller               - Number of loads added
       487 spiller               - Number of stores added
       303 spiller               - Number of register spills
      1354 twoaddressinstruction - Number of instructions added
      1354 twoaddressinstruction - Number of two-address instructions
       363 x86-peephole          - Number of peephole optimization performed
      
      to:
      
      7982 asm-printer           - Number of machine instrs printed
      1759 liveintervals         - Number of identity moves eliminated after coalescing
      1658 liveintervals         - Number of interval joins performed
      3282 liveintervals         - Number of intervals after coalescing
      4940 liveintervals         - Number of original intervals
       635 spiller               - Number of loads added
       452 spiller               - Number of stores added
       288 spiller               - Number of register spills
       789 twoaddressinstruction - Number of instructions added
       789 twoaddressinstruction - Number of two-address instructions
       258 x86-peephole          - Number of peephole optimization performed
      
      Though I'm not complaining about the drop in the number of intervals.  :)
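
The liveintervals figures in both listings appear to satisfy a simple bookkeeping relation: if each join merges two intervals into one, the original interval count minus the number of joins should equal the count after coalescing. The sketch below checks this against the quoted numbers. It is an observation about the statistics above, not a description of how the pass computes them, and the names are hypothetical.

```python
# liveintervals statistics quoted above for the linear scan allocator.
runs = {
    "before": {"original": 5810, "joins": 2288, "after_coalescing": 3522},
    "after":  {"original": 4940, "joins": 1658, "after_coalescing": 3282},
}

def joins_account_for_intervals(stats):
    """Each join is assumed to merge two intervals into one."""
    return stats["original"] - stats["joins"] == stats["after_coalescing"]

def drop_in_original_intervals(all_runs):
    """How many fewer intervals exist before coalescing with the change applied."""
    return all_runs["before"]["original"] - all_runs["after"]["original"]
```

With these numbers the relation holds for both runs, and the change removes 870 intervals before coalescing even starts.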
      
      llvm-svn: 11820