  1. Feb 27, 2004
  2. Feb 26, 2004
    • John Criswell's avatar
      Fixes for PR258 and PR259. · feb7c49c
      John Criswell authored
      Functions with linkonce linkage are declared with weak linkage.
      Global floating point constants used to represent unprintable values
      (such as NaN and infinity) are declared static so that they don't interfere
      with other CBE generated translation units.
      
      llvm-svn: 11884
      feb7c49c
    • Chris Lattner's avatar
      Be a good little compiler and handle direct calls efficiently, even if there · 5ef1638d
      Chris Lattner authored
      are beastly ConstantPointerRefs in the way...
      
      llvm-svn: 11883
      5ef1638d
    • Alkis Evlogimenos's avatar
      Uncomment assertions that register# != 0 on calls to · 61719d48
      Alkis Evlogimenos authored
      MRegisterInfo::is{Physical,Virtual}Register. Apply appropriate fixes
      to relevant files.
      
      llvm-svn: 11882
      61719d48
    • Chris Lattner's avatar
      Since LLVM uses structure type equivalence, it isn't useful to keep around · 79636d7c
      Chris Lattner authored
      multiple type names for the same structural type.  Make DTE eliminate all
      but one of the type names
      
      llvm-svn: 11879
      79636d7c
    • Chris Lattner's avatar
      Use a map instead of annotations · 7140e469
      Chris Lattner authored
      llvm-svn: 11875
      7140e469
    • Chris Lattner's avatar
      remove obsolete comment · 234a2d4f
      Chris Lattner authored
      llvm-svn: 11872
      234a2d4f
    • Chris Lattner's avatar
      Make sure that at least one virtual method is defined in a .cpp file to avoid · 12003589
      Chris Lattner authored
      having the compiler emit RTTI and vtables to EVERY translation unit.
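
      A minimal sketch of the idiom, with hypothetical class names. Under the
      Itanium C++ ABI the vtable and RTTI are emitted only in the translation
      unit that defines the first non-inline virtual method, so giving each
      class one out-of-line virtual method pins them to a single .cpp file:

```cpp
// Would live in a header (hypothetical Shape.h).
struct Shape {
  virtual ~Shape();                 // declared only; defined out of line below
  virtual int sides() const { return 0; }
};

struct Triangle : Shape {
  int sides() const override;       // declared only; defined out of line below
};

// Would live in exactly one .cpp file (hypothetical Shape.cpp).
// These out-of-line definitions anchor each class's vtable to this file.
Shape::~Shape() {}
int Triangle::sides() const { return 3; }
```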
      
      llvm-svn: 11871
      12003589
    • Chris Lattner's avatar
      turn things like: · 21e941fb
      Chris Lattner authored
         if (X == 0 || X == 2)
      
      ...where the comparisons and branches are in different blocks... into a switch
      instruction.  This comes up a lot in various programs, and works well with
      the switch/switch merging code I checked in earlier.  For example, this testcase:
      
      int switchtest(int C) {
        return C == 0 ? f(123) :
               C == 1 ? f(3123) :
               C == 4 ? f(312) :
               C == 5 ? f(1234): f(444);
      }
      
      is converted into this:
              switch int %C, label %cond_false.3 [
                       int 0, label %cond_true.0
                       int 1, label %cond_true.1
                       int 4, label %cond_true.2
                       int 5, label %cond_true.3
              ]
      
      instead of a whole bunch of conditional branches.
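
      As a source-level illustration only (the pass works on LLVM IR, not on
      C), the two functions below compute the same result; the second has the
      shape that the transformation recovers. The body of f() is a
      hypothetical stand-in, since the testcase does not define it:

```cpp
// Hypothetical stand-in for the f() called by the testcase.
int f(int x) { return x + 1; }

// Chain of equality tests on one value, as written in the testcase.
int switchtest_chain(int C) {
  return C == 0 ? f(123) :
         C == 1 ? f(3123) :
         C == 4 ? f(312) :
         C == 5 ? f(1234) : f(444);
}

// Equivalent switch form: one multiway branch on C.
int switchtest_switch(int C) {
  switch (C) {
  case 0:  return f(123);
  case 1:  return f(3123);
  case 4:  return f(312);
  case 5:  return f(1234);
  default: return f(444);
  }
}
```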
      
      Admittedly the code is ugly, and incomplete.  To be complete, we need to add
      br -> switch merging and switch -> br merging.  For example, this testcase:
      
      struct foo { int Q, R, Z; };
      #define A (X->Q+X->R * 123)
      int test(struct foo *X) {
        return A  == 123 ? X1() :
              A == 12321 ? X2():
              (A == 111 || A == 222) ? X3() :
              A == 875 ? X4() : X5();
      }
      
      Gets compiled to this:
              switch int %tmp.7, label %cond_false.2 [
                       int 123, label %cond_true.0
                       int 12321, label %cond_true.1
                       int 111, label %cond_true.2
                       int 222, label %cond_true.2
              ]
      ...
      cond_false.2:           ; preds = %entry
              %tmp.52 = seteq int %tmp.7, 875         ; <bool> [#uses=1]
              br bool %tmp.52, label %cond_true.3, label %cond_false.3
      
      where the branch could be folded into the switch.
      
      This kind of thing occurs *ALL OF THE TIME*, especially in programs like
      176.gcc, which is a horrible mess of code.  It contains stuff like *shudder*:
      
      #define SWITCH_TAKES_ARG(CHAR) \
        (   (CHAR) == 'D' \
         || (CHAR) == 'U' \
         || (CHAR) == 'o' \
         || (CHAR) == 'e' \
         || (CHAR) == 'u' \
         || (CHAR) == 'I' \
         || (CHAR) == 'm' \
         || (CHAR) == 'L' \
         || (CHAR) == 'A' \
         || (CHAR) == 'h' \
         || (CHAR) == 'z')
      
      and
      
      #define CONST_OK_FOR_LETTER_P(VALUE, C)                 \
        ((C) == 'I' ? SMALL_INTVAL (VALUE)                    \
         : (C) == 'J' ? SMALL_INTVAL (-(VALUE))               \
         : (C) == 'K' ? (unsigned)(VALUE) < 32                \
         : (C) == 'L' ? ((VALUE) & 0xffff) == 0               \
         : (C) == 'M' ? integer_ok_for_set (VALUE)            \
         : (C) == 'N' ? (VALUE) < 0                           \
         : (C) == 'O' ? (VALUE) == 0                          \
         : (C) == 'P' ? (VALUE) >= 0                          \
         : 0)
      
      and
      
      #define LEGITIMIZE_ADDRESS(X,OLDX,MODE,WIN)                     \
      {                                                               \
        if (GET_CODE (X) == PLUS && CONSTANT_ADDRESS_P (XEXP (X, 1))) \
          (X) = gen_rtx (PLUS, SImode, XEXP (X, 0),                   \
                         copy_to_mode_reg (SImode, XEXP (X, 1)));     \
        if (GET_CODE (X) == PLUS && CONSTANT_ADDRESS_P (XEXP (X, 0))) \
          (X) = gen_rtx (PLUS, SImode, XEXP (X, 1),                   \
                         copy_to_mode_reg (SImode, XEXP (X, 0)));     \
        if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 0)) == MULT)   \
          (X) = gen_rtx (PLUS, SImode, XEXP (X, 1),                   \
                         force_operand (XEXP (X, 0), 0));             \
        if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 1)) == MULT)   \
          (X) = gen_rtx (PLUS, SImode, XEXP (X, 0),                   \
                         force_operand (XEXP (X, 1), 0));             \
        if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 0)) == PLUS)   \
          (X) = gen_rtx (PLUS, Pmode, force_operand (XEXP (X, 0), NULL_RTX),\
                         XEXP (X, 1));                                \
        if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 1)) == PLUS)   \
          (X) = gen_rtx (PLUS, Pmode, XEXP (X, 0),                    \
                         force_operand (XEXP (X, 1), NULL_RTX));      \
        if (GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == CONST       \
                 || GET_CODE (X) == LABEL_REF)                        \
          (X) = legitimize_address (flag_pic, X, 0, 0);               \
        if (memory_address_p (MODE, X))                               \
          goto WIN; }
      
      and others.  These macros get used multiple times of course.  These are such
      lovely candidates for macros, aren't they?  :)
      
      This code also nicely handles LLVM constructs that look like this:
      
        if (isa<CastInst>(I))
         ...
        else if (isa<BranchInst>(I))
         ...
        else if (isa<SetCondInst>(I))
         ...
        else if (isa<UnwindInst>(I))
         ...
        else if (isa<VAArgInst>(I))
         ...
      
      where the isa can obviously be a dyn_cast as well.  Switch instructions are a
      good thing.
      
      llvm-svn: 11870
      21e941fb
    • Chris Lattner's avatar
      No need to clear the map here, it will always be empty · 28a08859
      Chris Lattner authored
      llvm-svn: 11868
      28a08859
    • Chris Lattner's avatar
      Fix typo · 36ab728f
      Chris Lattner authored
      llvm-svn: 11864
      36ab728f
    • Chris Lattner's avatar
      The node doesn't have to have _no_ node flags, it just has to be complete and · 128e8419
      Chris Lattner authored
      not have any globals.
      
      llvm-svn: 11863
      128e8419
    • Chris Lattner's avatar
      Add _more_ functions · c8167b0e
      Chris Lattner authored
      llvm-svn: 11862
      c8167b0e
    • Chris Lattner's avatar
      Fix some warnings, some of which were spurious, and some of which were real · 9192bbda
      Chris Lattner authored
      bugs.  Thanks Brian!
      
      llvm-svn: 11859
      9192bbda
    • Misha Brukman's avatar
      Instructions to call and return from functions. · 1743c409
      Misha Brukman authored
      llvm-svn: 11858
      1743c409
    • Chris Lattner's avatar
      Two changes: · 71626b8f
      Chris Lattner authored
       1. Functions do not make things incomplete, only variables
       2. Constant global variables no longer need to be marked incomplete, because
          we are guaranteed that the initializer for the global will be in the
          graph we are hacking on now.  This makes resolution of indirect calls happen
          a lot more in the bottom-up (bu) pass, and supports things like vtables and the C counterparts
          (giant constant arrays of function pointers), etc...
      
      Testcase here: test/Regression/Analysis/DSGraph/constant_globals.ll
      
      llvm-svn: 11852
      71626b8f
    • Chris Lattner's avatar
      When building local graphs, clone the initializer for constant globals into each · fab2872b
      Chris Lattner authored
      local graph that uses the global.
      
      llvm-svn: 11850
      fab2872b
    • Alkis Evlogimenos's avatar
      Fix bugs found with recent addition of assertions in · e62ddd40
      Alkis Evlogimenos authored
      MRegisterInfo::is{Physical,Virtual}Register.
      
      llvm-svn: 11849
      e62ddd40
    • Chris Lattner's avatar
      Simplify the dead node elimination stuff · 6ce59b4a
      Chris Lattner authored
      Make the incompleteness marker faster by looping directly over the globals
      instead of over the scalars to find the globals
      
      Fix a bug where we didn't mark a global incomplete if it didn't have any
      outgoing edges.  This wouldn't break any current clients but is still wrong.
      
      llvm-svn: 11848
      6ce59b4a
    • Chris Lattner's avatar
      Add a bunch more functions · 5e5e0606
      Chris Lattner authored
      llvm-svn: 11847
      5e5e0606
    • Chris Lattner's avatar
      Try harder to get symbol info · 17bce881
      Chris Lattner authored
      llvm-svn: 11846
      17bce881
    • Brian Gaeke's avatar
      Represent va_list in interpreter as a (ec-stack-depth . var-arg-index) · 7b4be13f
      Brian Gaeke authored
      pair, and look up varargs in the execution stack every time, instead of
      just pushing iterators (which can be invalidated during callFunction())
      around.  (union GenericValue now has a "pair of uints" member, to support
      this mechanism.) Fixes Bug 234.
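
      A rough sketch of why an index pair is safer than an iterator, using
      hypothetical types (Frame, VAListHandle, nextVarArg) rather than the
      interpreter's real ones. The handle stays valid even if the execution
      stack reallocates, because it is resolved against the stack on every
      access:

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for one execution-stack frame.
struct Frame {
  std::vector<int> varArgs;
};

// (execution-stack depth, var-arg index), as described in the message above.
using VAListHandle = std::pair<unsigned, unsigned>;

// Look the argument up in the stack each time, then advance the index.
int nextVarArg(const std::vector<Frame> &stack, VAListHandle &h) {
  return stack[h.first].varArgs[h.second++];
}
```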
      
      llvm-svn: 11845
      7b4be13f
  3. Feb 25, 2004
    • Brian Gaeke's avatar
      Great sparc renaming fallout IV: Sparc --> SparcV9. · 84b76c9b
      Brian Gaeke authored
      llvm-svn: 11844
      84b76c9b
    • Alkis Evlogimenos's avatar
      Remove assert since it is breaking cases that it shouldn't. · a9f03fba
      Alkis Evlogimenos authored
      llvm-svn: 11841
      a9f03fba
    • Alkis Evlogimenos's avatar
      Add DenseMap template and actually use it for mapping virtual regs · d8bace7f
      Alkis Evlogimenos authored
      to objects.
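
      One plausible shape for such a map, sketched under assumptions: the
      class name, the constant kFirstVirtualRegister, and its value are all
      hypothetical and not taken from the commit. Since virtual register
      numbers are small consecutive integers, a vector indexed by
      (reg - first virtual register) gives constant-time lookup without
      hashing or tree traversal:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical: the number at which virtual registers start.
constexpr unsigned kFirstVirtualRegister = 1024;

template <typename T>
class DenseRegMap {
  std::vector<T> storage_;

public:
  // Make sure an entry exists for every virtual register up to `reg`.
  void grow(unsigned reg) {
    std::size_t need = reg - kFirstVirtualRegister + 1;
    if (need > storage_.size())
      storage_.resize(need);  // new entries are value-initialized
  }

  // Direct indexing; the caller must have called grow() for `reg` first.
  T &operator[](unsigned reg) { return storage_[reg - kFirstVirtualRegister]; }

  std::size_t size() const { return storage_.size(); }
};
```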
      
      llvm-svn: 11840
      d8bace7f
    • Chris Lattner's avatar
      My faith in programmers has been found to be totally misplaced. One would · 8d1da1ab
      Chris Lattner authored
      assume that if they don't intend to write to a global variable, that they
      would mark it as constant.  However, there are people that don't understand
      that the compiler can do nice things for them if they give it the information
      it needs.
      
      This pass looks for blatantly obvious globals that are only ever read from.
      Though it uses a trivially simple "alias analysis" of sorts, it is still able
      to do amazing things to important benchmarks.  253.perlbmk, for example,
      contains several ***GIANT*** function pointer tables that are not marked
      constant and should be.  Marking them constant allows the optimizer to turn
      a whole bunch of indirect calls into direct calls.  Note that only a link-time
      optimizer can do this transformation, but perlbmk does have several strings
      and other minor globals that can be marked constant by this pass when run
      from GCCAS.
      
      176.gcc has a ton of strings and large tables that get marked constant by
      this pass, both at compile time (38 of them) and at link time (48 more).
      Other benchmarks give similar results, though it seems like big ones have
      disproportionately more than small ones.
      
      This pass is extremely quick and does good things.  I'm going to enable it
      in gccas & gccld.  Not bad for 50 SLOC.
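
      The idea can be sketched on a toy model. Everything below (UseKind,
      Global, markReadOnlyGlobals) is hypothetical and is not LLVM's API or
      the pass's actual code; it only shows the "every use is a load" test,
      treating any other kind of use conservatively:

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of how a global is used.
enum class UseKind { Load, Store, Other };  // Other: e.g. address escapes

struct Global {
  std::string name;
  bool isConstant;
  std::vector<UseKind> uses;
};

// Mark every non-constant global whose uses are all loads.
// Returns how many globals were newly marked constant.
int markReadOnlyGlobals(std::vector<Global> &globals) {
  int marked = 0;
  for (Global &g : globals) {
    if (g.isConstant)
      continue;
    bool onlyLoads = true;
    for (UseKind u : g.uses) {
      if (u != UseKind::Load) {  // a store or unknown use blocks the change
        onlyLoads = false;
        break;
      }
    }
    if (onlyLoads) {
      g.isConstant = true;
      ++marked;
    }
  }
  return marked;
}
```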
      
      llvm-svn: 11836
      8d1da1ab
    • Misha Brukman's avatar
      SparcV8 regs are really 32-bit, not 64! Thanks, Chris. · 564654d6
      Misha Brukman authored
      llvm-svn: 11835
      564654d6
    • Misha Brukman's avatar
      Clean up the tablegen descriptions for SparcV8. · f8dcdcc8
      Misha Brukman authored
      llvm-svn: 11834
      f8dcdcc8
    • Misha Brukman's avatar
      2122b969
    • Misha Brukman's avatar
      0e3a7ca5
    • Chris Lattner's avatar
      Add an assertion · f5a393a1
      Chris Lattner authored
      llvm-svn: 11830
      f5a393a1
    • Chris Lattner's avatar
      Fix failures in 099.go due to the cfgsimplify pass creating switch instructions · 64c9b223
      Chris Lattner authored
      where there were none before
      
      llvm-svn: 11829
      64c9b223
    • Brian Gaeke's avatar
      SparcV8 skeleton · 9a5bd7fc
      Brian Gaeke authored
      llvm-svn: 11828
      9a5bd7fc
    • Brian Gaeke's avatar
      Great renaming: Sparc --> SparcV9 · 94e95d2b
      Brian Gaeke authored
      llvm-svn: 11826
      94e95d2b
    • Chris Lattner's avatar
      Add a bunch more functions used by perlbmk · 864c9014
      Chris Lattner authored
      llvm-svn: 11824
      864c9014
    • Chris Lattner's avatar
      Fix incorrect debug code · 9c6833c5
      Chris Lattner authored
      llvm-svn: 11821
      9c6833c5
    • Chris Lattner's avatar
      Teach the instruction selector how to transform 'array' GEP computations into X86 · 309327a4
      Chris Lattner authored
      scaled indexes.  This allows us to compile GEP's like this:
      
      int* %test([10 x { int, { int } }]* %X, int %Idx) {
              %Idx = cast int %Idx to long
              %X = getelementptr [10 x { int, { int } }]* %X, long 0, long %Idx, ubyte 1, ubyte 0
              ret int* %X
      }
      
      Into a single address computation:
      
      test:
              mov %EAX, DWORD PTR [%ESP + 4]
              mov %ECX, DWORD PTR [%ESP + 8]
              lea %EAX, DWORD PTR [%EAX + 8*%ECX + 4]
              ret
      
      Before it generated:
      test:
              mov %EAX, DWORD PTR [%ESP + 4]
              mov %ECX, DWORD PTR [%ESP + 8]
              shl %ECX, 3
              add %EAX, %ECX
              lea %EAX, DWORD PTR [%EAX + 4]
              ret
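
      To see where the 8 and the 4 in the lea come from, here is a small
      sketch assuming a 4-byte int. The struct names are hypothetical
      mirrors of the LLVM type { int, { int } }. Each array element is
      8 bytes and the selected field sits 4 bytes into it, so the address is
      base + 8*Idx + 4:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical C++ mirror of the LLVM type { int, { int } }.
struct Inner { std::int32_t v; };
struct Elem  { std::int32_t a; Inner b; };

static_assert(sizeof(Elem) == 8, "scale factor used by the lea");
static_assert(offsetof(Elem, b) == 4, "displacement used by the lea");

// Address of X[Idx].b.v through ordinary indexing (what the GEP expresses).
std::int32_t *field_addr(Elem (*X)[10], std::size_t Idx) {
  return &(*X)[Idx].b.v;
}

// The same address as explicit byte arithmetic: base + 8*Idx + 4.
std::int32_t *field_addr_scaled(Elem (*X)[10], std::size_t Idx) {
  unsigned char *base = reinterpret_cast<unsigned char *>(X);
  return reinterpret_cast<std::int32_t *>(base + sizeof(Elem) * Idx +
                                          offsetof(Elem, b));
}
```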
      
      This is useful for things like int/float/double arrays, as the indexing can be folded into
      the loads and stores, reducing register pressure and decreasing the pressure on the decode unit.
      With these changes, I expect our performance on 256.bzip2 and gzip to improve a lot.  On
      bzip2 for example, we go from this:
      
      10665 asm-printer           - Number of machine instrs printed
         40 ra-local              - Number of loads/stores folded into instructions
       1708 ra-local              - Number of loads added
       1532 ra-local              - Number of stores added
       1354 twoaddressinstruction - Number of instructions added
       1354 twoaddressinstruction - Number of two-address instructions
       2794 x86-peephole          - Number of peephole optimization performed
      
      to this:
      9873 asm-printer           - Number of machine instrs printed
        41 ra-local              - Number of loads/stores folded into instructions
      1710 ra-local              - Number of loads added
      1521 ra-local              - Number of stores added
       789 twoaddressinstruction - Number of instructions added
       789 twoaddressinstruction - Number of two-address instructions
      2142 x86-peephole          - Number of peephole optimization performed
      
      ... and these types of instructions are often in tight loops.
      
      Linear scan is also helped, but not as much.  It goes from:
      
      8787 asm-printer           - Number of machine instrs printed
      2389 liveintervals         - Number of identity moves eliminated after coalescing
      2288 liveintervals         - Number of interval joins performed
      3522 liveintervals         - Number of intervals after coalescing
      5810 liveintervals         - Number of original intervals
       700 spiller               - Number of loads added
       487 spiller               - Number of stores added
       303 spiller               - Number of register spills
      1354 twoaddressinstruction - Number of instructions added
      1354 twoaddressinstruction - Number of two-address instructions
       363 x86-peephole          - Number of peephole optimization performed
      
      to:
      
      7982 asm-printer           - Number of machine instrs printed
      1759 liveintervals         - Number of identity moves eliminated after coalescing
      1658 liveintervals         - Number of interval joins performed
      3282 liveintervals         - Number of intervals after coalescing
      4940 liveintervals         - Number of original intervals
       635 spiller               - Number of loads added
       452 spiller               - Number of stores added
       288 spiller               - Number of register spills
       789 twoaddressinstruction - Number of instructions added
       789 twoaddressinstruction - Number of two-address instructions
       258 x86-peephole          - Number of peephole optimization performed
      
      Though I'm not complaining about the drop in the number of intervals.  :)
      
      llvm-svn: 11820
      309327a4
    • Chris Lattner's avatar
      * Make the previous patch more efficient by not allocating a temporary MachineInstr · d1ee55d4
      Chris Lattner authored
        to do analysis.
      
      *** FOLD getelementptr instructions into loads and stores when possible,
          making use of some of the crazy X86 addressing modes.
      
      For example, the following C++ program fragment:
      
      struct complex {
          double re, im;
          complex(double r, double i) : re(r), im(i) {}
      };
      inline complex operator+(const complex& a, const complex& b) {
          return complex(a.re+b.re, a.im+b.im);
      }
      complex addone(const complex& arg) {
          return arg + complex(1,0);
      }
      
      Used to be compiled to:
      _Z6addoneRK7complex:
              mov %EAX, DWORD PTR [%ESP + 4]
              mov %ECX, DWORD PTR [%ESP + 8]
      ***     mov %EDX, %ECX
              fld QWORD PTR [%EDX]
              fld1
              faddp %ST(1)
      ***     add %ECX, 8
              fld QWORD PTR [%ECX]
              fldz
              faddp %ST(1)
      ***     mov %ECX, %EAX
              fxch %ST(1)
              fstp QWORD PTR [%ECX]
      ***     add %EAX, 8
              fstp QWORD PTR [%EAX]
              ret
      
      Now it is compiled to:
      _Z6addoneRK7complex:
              mov %EAX, DWORD PTR [%ESP + 4]
              mov %ECX, DWORD PTR [%ESP + 8]
              fld QWORD PTR [%ECX]
              fld1
              faddp %ST(1)
              fld QWORD PTR [%ECX + 8]
              fldz
              faddp %ST(1)
              fxch %ST(1)
              fstp QWORD PTR [%EAX]
              fstp QWORD PTR [%EAX + 8]
              ret
      
      Other programs should see similar improvements, across the board.  Note that
      in addition to reducing instruction count, this also reduces register pressure
      a lot, always a good thing on X86.  :)
      
      llvm-svn: 11819
      d1ee55d4
    • Chris Lattner's avatar
      Add a helper to create an addressing mode given all of the pieces. · 4b3514c1
      Chris Lattner authored
      llvm-svn: 11818
      4b3514c1