  1. Nov 13, 2004
    • shld is a very high latency operation. · 049d33a7
      Chris Lattner authored
      Instead of emitting it for shifts of two or three, open code the equivalent
      operation, which is faster on Athlon and P4 (by a substantial margin).
      
      For example, instead of compiling this:
      
      long long X2(long long Y) { return Y << 2; }
      
      to:
      
      X2:
              movl 4(%esp), %eax
              movl 8(%esp), %edx
              shldl $2, %eax, %edx
              shll $2, %eax
              ret
      
      Compile it to:
      
      X2:
              movl 4(%esp), %eax
              movl 8(%esp), %ecx
              movl %eax, %edx
              shrl $30, %edx
              leal (%edx,%ecx,4), %edx
              shll $2, %eax
              ret
      
      Likewise, for << 3, compile to:
      
      X3:
              movl 4(%esp), %eax
              movl 8(%esp), %ecx
              movl %eax, %edx
              shrl $29, %edx
              leal (%edx,%ecx,8), %edx
              shll $3, %eax
              ret
      
      This matches icc, except that icc open codes the shifts as adds on the P4.
      
      llvm-svn: 17707
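The equivalence between the shld sequence and the open-coded sequence can be checked with a rough C model. This is only a sketch: the function names are hypothetical, the 64-bit value is split into two 32-bit halves to mimic the register pairs, and it says nothing about latency, only that both sequences compute the same result for shift amounts of 2 and 3.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the open-coded sequence (shrl + leal + shll).
   n is assumed to be 2 or 3, matching the lea scales 4 and 8. */
uint64_t shl64_open_coded(uint64_t y, unsigned n) {
    uint32_t eax = (uint32_t)y;          /* movl 4(%esp), %eax */
    uint32_t ecx = (uint32_t)(y >> 32);  /* movl 8(%esp), %ecx */
    uint32_t edx = eax;                  /* movl %eax, %edx */
    edx >>= 32 - n;                      /* shrl $30 (or $29), %edx */
    edx = edx + (ecx << n);              /* leal (%edx,%ecx,4 or 8), %edx */
    eax <<= n;                           /* shll $n, %eax */
    return ((uint64_t)edx << 32) | eax;
}

/* Hypothetical model of the original shld-based sequence. */
uint64_t shl64_shld(uint64_t y, unsigned n) {
    uint32_t eax = (uint32_t)y;          /* movl 4(%esp), %eax */
    uint32_t edx = (uint32_t)(y >> 32);  /* movl 8(%esp), %edx */
    edx = (edx << n) | (eax >> (32 - n)); /* shldl $n, %eax, %edx */
    eax <<= n;                           /* shll $n, %eax */
    return ((uint64_t)edx << 32) | eax;
}

/* Both models should agree with a plain 64-bit shift. */
void check_shift(uint64_t y, unsigned n) {
    assert(shl64_open_coded(y, n) == shl64_shld(y, n));
    assert(shl64_open_coded(y, n) == (y << n));
}
```

The add inside the lea acts as an or here because the shifted-down low half only occupies the low n bits, while the scaled high half has those bits clear.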
    • Add missing check · ef6bd92a
      Chris Lattner authored
      llvm-svn: 17706
    • Compile: · 8d521bb1
      Chris Lattner authored
      long long X3_2(long long Y) { return Y+Y; }
      int X(int Y) { return Y+Y; }
      
      into:
      
      X3_2:
              movl 4(%esp), %eax
              movl 8(%esp), %edx
              addl %eax, %eax
              adcl %edx, %edx
              ret
      X:
              movl 4(%esp), %eax
              addl %eax, %eax
              ret
      
      instead of:
      
      X3_2:
              movl 4(%esp), %eax
              movl 8(%esp), %edx
              shldl $1, %eax, %edx
              shll $1, %eax
              ret
      
      X:
              movl 4(%esp), %eax
              shll $1, %eax
              ret
      
      llvm-svn: 17705
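As a sanity check on the add/adc pair, the following C sketch models the 64-bit doubling on two 32-bit halves. The function names are hypothetical and the carry flag is modeled by hand; it is an illustration of why the sequence is equivalent to a shift by one, not the code generator's logic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "addl %eax, %eax; adcl %edx, %edx". */
uint64_t double_with_add_adc(uint64_t y) {
    uint32_t eax = (uint32_t)y;          /* low half */
    uint32_t edx = (uint32_t)(y >> 32);  /* high half */
    uint32_t carry = eax >> 31;          /* carry out of eax + eax is the old top bit */
    eax = eax + eax;                     /* addl %eax, %eax */
    edx = edx + edx + carry;             /* adcl %edx, %edx */
    return ((uint64_t)edx << 32) | eax;
}

/* The result should match both Y+Y and Y << 1 on unsigned 64-bit values. */
void check_double(uint64_t y) {
    assert(double_with_add_adc(y) == y + y);
    assert(double_with_add_adc(y) == (y << 1));
}
```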
    • Simplify handling of shifts to be the same as we do for adds. · 8c3e7b92
      Chris Lattner authored
      Add support for (X * C1) + (X * C2) (where * can be mul or shl), allowing us to fold:
      
         Y+Y+Y+Y+Y+Y+Y+Y
      
      into
      
              %tmp.8 = shl long %Y, ubyte 3           ; <long> [#uses=1]
      
      instead of
      
              %tmp.4 = shl long %Y, ubyte 2           ; <long> [#uses=1]
              %tmp.12 = shl long %Y, ubyte 2          ; <long> [#uses=1]
              %tmp.8 = add long %tmp.4, %tmp.12       ; <long> [#uses=1]
      
      This implements add.ll:test25
      
      Also add support for (X*C1)-(X*C2) -> X*(C1-C2), implementing sub.ll:test18
      
      llvm-svn: 17704
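The algebra behind these folds can be illustrated in C. This is a sketch under assumptions: the function names are hypothetical, and unsigned 64-bit arithmetic is used so that wraparound is well defined in C, whereas the IR above operates on `long`. A shift left by C is treated as a multiply by 2^C.

```c
#include <assert.h>
#include <stdint.h>

/* Unfolded and folded forms of (X * C1) + (X * C2). */
uint64_t sum_of_products(uint64_t x, uint64_t c1, uint64_t c2) { return x * c1 + x * c2; }
uint64_t folded_sum(uint64_t x, uint64_t c1, uint64_t c2) { return x * (c1 + c2); }

/* Unfolded and folded forms of (X * C1) - (X * C2). */
uint64_t diff_of_products(uint64_t x, uint64_t c1, uint64_t c2) { return x * c1 - x * c2; }
uint64_t folded_diff(uint64_t x, uint64_t c1, uint64_t c2) { return x * (c1 - c2); }

/* The Y+Y+Y+Y+Y+Y+Y+Y example from the message. */
uint64_t add_eight_times(uint64_t y) { return y + y + y + y + y + y + y + y; }

void check_folds(uint64_t y) {
    /* (Y << 2) + (Y << 2) is (Y * 4) + (Y * 4), which folds to Y * 8 == Y << 3. */
    assert(((y << 2) + (y << 2)) == (y << 3));
    assert(add_eight_times(y) == (y << 3));
    assert(sum_of_products(y, 4, 4) == folded_sum(y, 4, 4));
    assert(diff_of_products(y, 8, 3) == folded_diff(y, 8, 3));
}
```

Because the identities hold modulo 2^64, they remain valid even when the intermediate products overflow.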
    • New testcase · f6392b46
      Chris Lattner authored
      llvm-svn: 17703
    • Add support for shifts · 6912370a
      Chris Lattner authored
      llvm-svn: 17702
    • Fold: · 4efe20a1
      Chris Lattner authored
         (X + (X << C2)) --> X * ((1 << C2) + 1)
         ((X << C2) + X) --> X * ((1 << C2) + 1)
      
      This means that we now canonicalize "Y+Y+Y" into:
      
              %tmp.2 = mul long %Y, 3         ; <long> [#uses=1]
      
      instead of:
      
              %tmp.10 = shl long %Y, ubyte 1          ; <long> [#uses=1]
              %tmp.6 = add long %Y, %tmp.10           ; <long> [#uses=1]
      
      llvm-svn: 17701
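The fold can be illustrated with a small C check. The names are hypothetical, C2 is assumed to be less than 64, and unsigned arithmetic stands in for the IR's `long` so that overflow wraps predictably.

```c
#include <assert.h>
#include <stdint.h>

/* Left-hand side of the fold: X + (X << C2). */
uint64_t add_shifted(uint64_t x, unsigned c2) { return x + (x << c2); }

/* Right-hand side of the fold: X * ((1 << C2) + 1). */
uint64_t as_multiply(uint64_t x, unsigned c2) { return x * (((uint64_t)1 << c2) + 1); }

void check_fold(uint64_t x, unsigned c2) {
    assert(add_shifted(x, c2) == as_multiply(x, c2));
}

/* Y+Y+Y: Y+Y becomes Y << 1, and Y + (Y << 1) then becomes Y * 3. */
uint64_t triple(uint64_t y) { return y + y + y; }
```

With C2 = 1 the multiplier is 3, which is how "Y+Y+Y" ends up as a single `mul long %Y, 3`.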
    • Lazily create the abort message, so only translation units that use unwind will actually get it. · 2858e175
      Chris Lattner authored
      
      llvm-svn: 17700
    • Fix: CodeExtractor/2004-11-12-InvokeExtract.ll · 9b0291b1
      Chris Lattner authored
      llvm-svn: 17699
    • New testcase · 8cc98850
      Chris Lattner authored
      llvm-svn: 17698
    • Fix a bug where the code extractor would get a bit confused handling invoke instructions, setting DefBlock to a block it did not have dom info for. · 5bcca605
      Chris Lattner authored
      
      llvm-svn: 17697
  2. Nov 12, 2004
  3. Nov 11, 2004
  4. Nov 10, 2004
  5. Nov 09, 2004