  1. Aug 10, 2005
    • Chris Lattner's avatar
      Fix some 80 column violations. · c6c4d99a
      Chris Lattner authored
      Once we compute the evolution for a GEP, tell SE (ScalarEvolution) about it,
      so that indirect users of the GEP can see the evolution as well.  This allows
      us to compile this testcase:
      
      void fbSolidFillmmx(int w, unsigned char *d) {
          while (w >= 64) {
              *(unsigned long long *) (d +  0) = 0;
              *(unsigned long long *) (d +  8) = 0;
              *(unsigned long long *) (d + 16) = 0;
              *(unsigned long long *) (d + 24) = 0;
              *(unsigned long long *) (d + 32) = 0;
              *(unsigned long long *) (d + 40) = 0;
              *(unsigned long long *) (d + 48) = 0;
              *(unsigned long long *) (d + 56) = 0;
              w -= 64;
              d += 64;
          }
      }
      
      into:
      
      .LBB_fbSolidFillmmx_2:  ; no_exit
              li r2, 0
              stw r2, 0(r4)
              stw r2, 4(r4)
              stw r2, 8(r4)
              stw r2, 12(r4)
              stw r2, 16(r4)
              stw r2, 20(r4)
              stw r2, 24(r4)
              stw r2, 28(r4)
              stw r2, 32(r4)
              stw r2, 36(r4)
              stw r2, 40(r4)
              stw r2, 44(r4)
              stw r2, 48(r4)
              stw r2, 52(r4)
              stw r2, 56(r4)
              stw r2, 60(r4)
              addi r4, r4, 64
              addi r3, r3, -64
              cmpwi cr0, r3, 63
              bgt .LBB_fbSolidFillmmx_2       ; no_exit
      
      instead of:
      
      .LBB_fbSolidFillmmx_2:  ; no_exit
              li r11, 0
              stw r11, 0(r4)
              stw r11, 4(r4)
              stwx r11, r10, r4
              add r12, r10, r4
              stw r11, 4(r12)
              stwx r11, r9, r4
              add r12, r9, r4
              stw r11, 4(r12)
              stwx r11, r8, r4
              add r12, r8, r4
              stw r11, 4(r12)
              stwx r11, r7, r4
              add r12, r7, r4
              stw r11, 4(r12)
              stwx r11, r6, r4
              add r12, r6, r4
              stw r11, 4(r12)
              stwx r11, r5, r4
              add r12, r5, r4
              stw r11, 4(r12)
              stwx r11, r2, r4
              add r12, r2, r4
              stw r11, 4(r12)
              addi r4, r4, 64
              addi r3, r3, -64
              cmpwi cr0, r3, 63
              bgt .LBB_fbSolidFillmmx_2       ; no_exit
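
In source terms, the improved output addresses every store as a constant displacement from the single pointer induction variable, instead of keeping each offset in its own register. The following is a minimal sketch of that shape, not the code from the commit: the name `solid_fill_words` and the byte-count return value are hypothetical additions for illustration, and a 4-byte `memset` stands in for each 32-bit `stw`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of the testcase: each store is a constant
   displacement from the one induction pointer d, like stw N(r4).
   Returns the number of bytes cleared (an addition for testing). */
size_t solid_fill_words(int w, unsigned char *d) {
    size_t cleared = 0;
    assert(d != NULL);
    while (w >= 64) {
        /* 16 word-sized stores at offsets 0, 4, ..., 60 from d. */
        for (int off = 0; off < 64; off += 4) {
            memset(d + off, 0, 4);
        }
        w -= 64;      /* matches: addi r3, r3, -64 */
        d += 64;      /* matches: addi r4, r4, 64  */
        cleared += 64;
    }
    return cleared;
}
```

Only whole 64-byte chunks are cleared, so a trailing remainder shorter than 64 bytes is left untouched, as in the original loop.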
      
      llvm-svn: 22737
      c6c4d99a
    • Chris Lattner's avatar
      implement two helper methods · b310ac4a
      Chris Lattner authored
      llvm-svn: 22736
      b310ac4a
    • Chris Lattner's avatar
      add two helper methods · 67017db5
      Chris Lattner authored
      llvm-svn: 22735
      67017db5
    • Chris Lattner's avatar
      Fix spelling, fix some canonicalizations broken by my last patch · 679f5b0b
      Chris Lattner authored
      llvm-svn: 22734
      679f5b0b
    • Chris Lattner's avatar
      I can't believe I caught this before Misha! :) · fc070f1f
      Chris Lattner authored
      llvm-svn: 22733
      fc070f1f
    • Chris Lattner's avatar
      add an optimization note · 54ee86ac
      Chris Lattner authored
      llvm-svn: 22732
      54ee86ac
  2. Aug 09, 2005
    • Chris Lattner's avatar
      14e060f7
    • Chris Lattner's avatar
      Add testcases for new rlwinm cases handled, patch by Jim Laskey! · 080f741f
      Chris Lattner authored
      llvm-svn: 22730
      080f741f
    • Chris Lattner's avatar
      Update the targets to the new SETCC/CondCodeSDNode interfaces. · 6ec7745e
      Chris Lattner authored
      llvm-svn: 22729
      6ec7745e
    • Chris Lattner's avatar
      Eliminate the SetCCSDNode in favor of a CondCodeSDNode class. · d47675ed
      Chris Lattner authored
      This pulls the CC out of the SetCC operation, making SETCC a standard ternary
      operation and CCs a standard DAG leaf.  This will make it possible for other
      nodes to use CCs as operands in the future.
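
The idea of a condition code as an ordinary leaf operand can be sketched with a toy expression DAG. This is a simplified illustration only: the `Node`, `Opcode`, and `CondCode` types and the `eval` function below are hypothetical and are not LLVM's actual SelectionDAG classes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node kinds; not LLVM's real definitions. */
enum Opcode { OP_CONSTANT, OP_CONDCODE, OP_SETCC };
enum CondCode { CC_EQ, CC_NE, CC_LT, CC_GT };

struct Node {
    enum Opcode op;
    long value;                 /* payload for OP_CONSTANT */
    enum CondCode cc;           /* payload for OP_CONDCODE */
    const struct Node *ops[3];  /* SETCC operands: lhs, rhs, cc leaf */
};

/* Evaluate a node. SETCC is a plain three-operand node whose third
   operand is a condition-code leaf rather than a field of the node. */
long eval(const struct Node *n) {
    switch (n->op) {
    case OP_CONSTANT:
        return n->value;
    case OP_CONDCODE:
        return (long)n->cc;
    case OP_SETCC: {
        long l = eval(n->ops[0]);
        long r = eval(n->ops[1]);
        assert(n->ops[2]->op == OP_CONDCODE);
        switch (n->ops[2]->cc) {
        case CC_EQ: return l == r;
        case CC_NE: return l != r;
        case CC_LT: return l < r;
        case CC_GT: return l > r;
        }
        return 0;
    }
    }
    return 0;
}
```

Because the condition code is just another operand in this sketch, any other node kind could take one as an operand without a dedicated subclass.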
      
      llvm-svn: 22728
      d47675ed
    • Chris Lattner's avatar
      2035c4f7
    • Chris Lattner's avatar
      4c62c647
    • Chris Lattner's avatar
      new reg test for a failure last night on ppc/darwin · 91fca092
      Chris Lattner authored
      llvm-svn: 22725
      91fca092
    • Chris Lattner's avatar
      SCEVAddExpr::get() of an empty list is invalid. · 02742710
      Chris Lattner authored
      llvm-svn: 22724
      02742710
    • Chris Lattner's avatar
      This is now implemented · 23e3fb9e
      Chris Lattner authored
      llvm-svn: 22723
      23e3fb9e
    • Chris Lattner's avatar
      Implement: LoopStrengthReduce/share_ivs.ll · a091ff17
      Chris Lattner authored
      Two changes:
        * Only insert one PHI node for each stride.  Other values are live-in
          values.  This cannot introduce higher register pressure than the
          previous approach, and can take advantage of reg+reg addressing modes.
        * Factor common base values out of uses before moving values from the
          base to the immediate fields.  This improves codegen by starting the
          stride-specific PHI node out at a common place for each IV use.
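
At the source level, the two changes above roughly correspond to replacing one pointer induction variable per array with a single shared index and loop-invariant bases. The sketch below is an assumption-laden illustration, not code from the commit; both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Before: one pointer induction variable per array, each bumped on
   every iteration (compare the chain of addi instructions). */
void copy3_separate_ivs(double *a, const double *b,
                        double *c, const double *d,
                        double *e, const double *f, size_t n) {
    assert(a && b && c && d && e && f);
    for (size_t i = 0; i < n; ++i) {
        *a++ = *b++;
        *c++ = *d++;
        *e++ = *f++;
    }
}

/* After: one shared index for the common stride; the bases stay
   loop-invariant, which maps onto reg+reg (indexed) addressing. */
void copy3_shared_iv(double *a, const double *b,
                     double *c, const double *d,
                     double *e, const double *f, size_t n) {
    assert(a && b && c && d && e && f);
    for (size_t i = 0; i < n; ++i) {
        a[i] = b[i];
        c[i] = d[i];
        e[i] = f[i];
    }
}
```

Both functions compute the same result; the difference is only in how many values must be updated per iteration.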
      
      As an example, we used to generate this for a loop in swim:
      
      .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
              lfd f0, 0(r8)
              stfd f0, 0(r3)
              lfd f0, 0(r6)
              stfd f0, 0(r7)
              lfd f0, 0(r2)
              stfd f0, 0(r5)
              addi r9, r9, 1
              addi r2, r2, 8
              addi r5, r5, 8
              addi r6, r6, 8
              addi r7, r7, 8
              addi r8, r8, 8
              addi r3, r3, 8
              cmpw cr0, r9, r4
              bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1
      
      now we emit:
      
      .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
              lfdx f0, r8, r2
              stfdx f0, r9, r2
              lfdx f0, r5, r2
              stfdx f0, r7, r2
              lfdx f0, r3, r2
              stfdx f0, r6, r2
              addi r10, r10, 1
              addi r2, r2, 8
              cmpw cr0, r10, r4
              bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1
      
      As another, more dramatic example, we used to emit this:
      
      .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
              lfd f0, 8(r21)
              lfd f4, 8(r3)
              lfd f5, 8(r27)
              lfd f6, 8(r22)
              lfd f7, 8(r5)
              lfd f8, 8(r6)
              lfd f9, 8(r30)
              lfd f10, 8(r11)
              lfd f11, 8(r12)
              fsub f10, f10, f11
              fadd f5, f4, f5
              fmul f5, f5, f1
              fadd f6, f6, f7
              fadd f6, f6, f8
              fadd f6, f6, f9
              fmadd f0, f5, f6, f0
              fnmsub f0, f10, f2, f0
              stfd f0, 8(r4)
              lfd f0, 8(r25)
              lfd f5, 8(r26)
              lfd f6, 8(r23)
              lfd f9, 8(r28)
              lfd f10, 8(r10)
              lfd f12, 8(r9)
              lfd f13, 8(r29)
              fsub f11, f13, f11
              fadd f4, f4, f5
              fmul f4, f4, f1
              fadd f5, f6, f9
              fadd f5, f5, f10
              fadd f5, f5, f12
              fnmsub f0, f4, f5, f0
              fnmsub f0, f11, f3, f0
              stfd f0, 8(r24)
              lfd f0, 8(r8)
              fsub f4, f7, f8
              fsub f5, f12, f10
              fnmsub f0, f5, f2, f0
              fnmsub f0, f4, f3, f0
              stfd f0, 8(r2)
              addi r20, r20, 1
              addi r2, r2, 8
              addi r8, r8, 8
              addi r10, r10, 8
              addi r12, r12, 8
              addi r6, r6, 8
              addi r29, r29, 8
              addi r28, r28, 8
              addi r26, r26, 8
              addi r25, r25, 8
              addi r24, r24, 8
              addi r5, r5, 8
              addi r23, r23, 8
              addi r22, r22, 8
              addi r3, r3, 8
              addi r9, r9, 8
              addi r11, r11, 8
              addi r30, r30, 8
              addi r27, r27, 8
              addi r21, r21, 8
              addi r4, r4, 8
              cmpw cr0, r20, r7
              bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1
      
      we now emit:
      
      .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
              lfdx f0, r21, r20
              lfdx f4, r3, r20
              lfdx f5, r27, r20
              lfdx f6, r22, r20
              lfdx f7, r5, r20
              lfdx f8, r6, r20
              lfdx f9, r30, r20
              lfdx f10, r11, r20
              lfdx f11, r12, r20
              fsub f10, f10, f11
              fadd f5, f4, f5
              fmul f5, f5, f1
              fadd f6, f6, f7
              fadd f6, f6, f8
              fadd f6, f6, f9
              fmadd f0, f5, f6, f0
              fnmsub f0, f10, f2, f0
              stfdx f0, r4, r20
              lfdx f0, r25, r20
              lfdx f5, r26, r20
              lfdx f6, r23, r20
              lfdx f9, r28, r20
              lfdx f10, r10, r20
              lfdx f12, r9, r20
              lfdx f13, r29, r20
              fsub f11, f13, f11
              fadd f4, f4, f5
              fmul f4, f4, f1
              fadd f5, f6, f9
              fadd f5, f5, f10
              fadd f5, f5, f12
              fnmsub f0, f4, f5, f0
              fnmsub f0, f11, f3, f0
              stfdx f0, r24, r20
              lfdx f0, r8, r20
              fsub f4, f7, f8
              fsub f5, f12, f10
              fnmsub f0, f5, f2, f0
              fnmsub f0, f4, f3, f0
              stfdx f0, r2, r20
              addi r19, r19, 1
              addi r20, r20, 8
              cmpw cr0, r19, r7
              bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1
      
      llvm-svn: 22722
      a091ff17
    • Chris Lattner's avatar
      Suck the base value out of the UsersToProcess vector into the BasedUser class to simplify the code. · 37c24cc9
      Chris Lattner authored
      Fuse two loops.
      
      llvm-svn: 22721
      37c24cc9
    • Chris Lattner's avatar
      Split MoveLoopVariantsToImediateField out from MoveImmediateValues. · 37ed895b
      Chris Lattner authored
      The first is a correctness thing, and the latter is an optimization thing.  This
      is also needed to support a future change.
      
      llvm-svn: 22720
      37ed895b
    • Nate Begeman's avatar
      Factor out some common code, and be smarter about when to emit load hi/lo code sequences. · c92787e1
      Nate Begeman authored
      
      llvm-svn: 22719
      c92787e1
    • Chris Lattner's avatar
      A testcase I don't want to break in the future · 319292a6
      Chris Lattner authored
      llvm-svn: 22718
      319292a6
  3. Aug 08, 2005