  1. Apr 15, 2007
      Remove ImmediateDominator analysis. The same information can be obtained from... · f35a1dbc
      Owen Anderson authored
      Remove ImmediateDominator analysis.  The same information can be obtained from DomTree.  A lot of code for
      constructing ImmediateDominator is now folded into DomTree construction.
      
      This is part of the ongoing work for PR217.
      
      llvm-svn: 36063
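      With the standalone pass gone, clients now read immediate dominators out of the dominator tree itself. A minimal C++ sketch of such a query, written against the modern DominatorTree/DomTreeNode API rather than anything in this patch (the header path and nullptr are anachronisms for 2007):

              // Query the immediate dominator of BB via the dominator tree.
              #include "llvm/IR/Dominators.h"
              using namespace llvm;

              BasicBlock *getIdom(DominatorTree &DT, BasicBlock *BB) {
                // A block's parent in the dominator tree is its immediate
                // dominator; the entry block has none.
                if (DomTreeNode *N = DT.getNode(BB))
                  if (DomTreeNode *IDom = N->getIDom())
                    return IDom->getBlock();
                return nullptr;
              }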
      fix SimplifyLibCalls/IsDigit.ll · f8a7bf31
      Chris Lattner authored
      llvm-svn: 36047
      Extend store merging to support the 'if/then' version in addition to if/then/else. · 4a6e0cbd
      Chris Lattner authored
      This sinks the two stores in this example into a single store in cond_next.  In this
      case, it allows elimination of the load as well:
      
              store double 0.000000e+00, double* @s.3060
              %tmp3 = fcmp ogt double %tmp1, 5.000000e-01             ; <i1> [#uses=1]
              br i1 %tmp3, label %cond_true, label %cond_next
      cond_true:              ; preds = %entry
              store double 1.000000e+00, double* @s.3060
              br label %cond_next
      cond_next:              ; preds = %entry, %cond_true
              %tmp6 = load double* @s.3060            ; <double> [#uses=1]
      
      This implements Transforms/InstCombine/store-merge.ll:test2
      
      llvm-svn: 36040
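      A hypothetical C++ sketch of the sinking step, written against the modern LLVM API; the helper name, the simplified signature, and the omission of the safety checks (intervening loads/stores, volatility) are all assumptions here, not this patch's code:

              #include "llvm/IR/BasicBlock.h"
              #include "llvm/IR/Instructions.h"
              using namespace llvm;

              // Merge a store in the 'then' block with the store in the
              // shared predecessor into a single store in the join block.
              void sinkStores(StoreInst *ThenSt, StoreInst *EntrySt,
                              BasicBlock *ThenBB, BasicBlock *EntryBB,
                              BasicBlock *JoinBB) {
                // A phi picks the value reaching the join along each edge.
                PHINode *Merged =
                    PHINode::Create(ThenSt->getValueOperand()->getType(), 2,
                                    "storemerge", &JoinBB->front());
                Merged->addIncoming(ThenSt->getValueOperand(), ThenBB);
                Merged->addIncoming(EntrySt->getValueOperand(), EntryBB);
                // One store of the merged value replaces both originals.
                new StoreInst(Merged, ThenSt->getPointerOperand(),
                              JoinBB->getFirstNonPHI());
                ThenSt->eraseFromParent();
                EntrySt->eraseFromParent();
              }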
      refactor some code, no functionality change. · 14a251b9
      Chris Lattner authored
      llvm-svn: 36037
      fix long lines · 28d921d0
      Chris Lattner authored
      llvm-svn: 36031
      Implement Transforms/InstCombine/vec_extract_elt.ll, transforming: · 7bfdd0ab
      Chris Lattner authored
      define i32 @test(float %f) {
              %tmp7 = insertelement <4 x float> undef, float %f, i32 0
              %tmp17 = bitcast <4 x float> %tmp7 to <4 x i32>
              %tmp19 = extractelement <4 x i32> %tmp17, i32 0
              ret i32 %tmp19
      }
      
      into:
      
      define i32 @test(float %f) {
              %tmp19 = bitcast float %f to i32                ; <i32> [#uses=1]
              ret i32 %tmp19
      }
      
      On PPC, this is the difference between:
      
      _test:
              mfspr r2, 256
              oris r3, r2, 8192
              mtspr 256, r3
              stfs f1, -16(r1)
              addi r3, r1, -16
              addi r4, r1, -32
              lvx v2, 0, r3
              stvx v2, 0, r4
              lwz r3, -32(r1)
              mtspr 256, r2
              blr
      
      and:
      
      _test:
              stfs f1, -4(r1)
              nop
              nop
              nop
              lwz r3, -4(r1)
              blr
      
      llvm-svn: 36025
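      What the optimized form computes, as portable C++ (a sketch of the semantics, not LLVM code): extracting lane 0 of the bitcast vector is nothing more than a reinterpretation of the float's bits, so no vector registers need to be touched at all.

              #include <cstdint>
              #include <cstring>

              std::uint32_t test(float f) {
                std::uint32_t bits;
                std::memcpy(&bits, &f, sizeof bits);  // the scalar bitcast
                return bits;
              }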
      Implement InstCombine/vec_demanded_elts.ll:test2. This allows us to turn · b37fb6a0
      Chris Lattner authored
      unsigned test(float f) {
       return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f*f ));
      }
      
      into:
      
      _test:
              movss 4(%esp), %xmm0
              mulss %xmm0, %xmm0
              movd %xmm0, %eax
              ret
      
      instead of:
      
      _test:
              movss 4(%esp), %xmm0
              mulss %xmm0, %xmm0
              xorps %xmm1, %xmm1
              movss %xmm0, %xmm1
              movd %xmm1, %eax
              ret
      
      GCC gets:
      
      _test:
              subl    $28, %esp
              movss   32(%esp), %xmm0
              mulss   %xmm0, %xmm0
              xorps   %xmm1, %xmm1
              movss   %xmm0, %xmm1
              movaps  %xmm1, %xmm0
              movd    %xmm0, 12(%esp)
              movl    12(%esp), %eax
              addl    $28, %esp
              ret
      
      llvm-svn: 36020
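      A toy C++ rendering of the demanded-elements reasoning (illustrative only; none of this is LLVM's code): walk backwards from the use, tracking which of the four lanes matter.

              #include <cstdio>

              int main() {
                // _mm_cvtsi128_si32 reads only lane 0 of its vector operand.
                unsigned demanded = 0b0001;
                // _mm_set_ss puts f*f in lane 0 and zeros in lanes 1-3, so
                // the zeroing matters only if lanes 1-3 are demanded.
                unsigned zeroedLanesUsed = demanded & 0b1110;
                std::printf("zeroed lanes used: 0x%x\n", zeroedLanesUsed);
                // Prints 0x0: the xorps/movss materialization is dead,
                // leaving just the scalar multiply and the lane-0 bit copy
                // (mulss + movd in the new code above).
                return 0;
              }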
      avoid copying sets and vectors around. · a6b56602
      Chris Lattner authored
      llvm-svn: 36017
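      The usual shape of such a fix (an illustration, not the actual diff): take the containers by const reference so the caller's sets and vectors are used in place.

              #include <set>
              #include <vector>

              // Copies the whole set and vector on every call:
              void processByValue(std::set<int> Live, std::vector<int> Order);

              // No copies; the caller's containers are used in place:
              void processByRef(const std::set<int> &Live,
                                const std::vector<int> &Order);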
  2. Apr 13, 2007
      Now that codegen prepare isn't defeating me, I can finally fix what I set · efd3051d
      Chris Lattner authored
      out to do! :)
      
      This fixes a problem where LSR would insert a bunch of code into each MBB
      that uses a particular subexpression (e.g. IV+base+C).  The problem is that
      this code cannot be CSE'd back together if inserted into different blocks.
      
      This patch changes LSR to attempt to insert a single copy of this code and
      share it, allowing codegenprepare to duplicate the code if it can be sunk
      into various addressing modes.  On CodeGen/ARM/lsr-code-insertion.ll,
      for example, this gives us code like:
      
              add r8, r0, r5
              str r6, [r8, #+4]
...
              ble LBB1_4      @cond_next
      LBB1_3: @cond_true
              str r10, [r8, #+4]
      LBB1_4: @cond_next
      ...
      LBB1_5: @cond_true55
              ldr r6, LCPI1_1
              str r6, [r8, #+4]
      
      instead of:
      
              add r10, r0, r6
              str r8, [r10, #+4]
      ...
              ble LBB1_4      @cond_next
      LBB1_3: @cond_true
              add r8, r0, r6
              str r10, [r8, #+4]
      LBB1_4: @cond_next
      ...
      LBB1_5: @cond_true55
              add r8, r0, r6
              ldr r10, LCPI1_1
              str r10, [r8, #+4]
      
      Besides being smaller and more efficient, this makes it immediately
      obvious that it is profitable to predicate LBB1_3 now :)
      
      llvm-svn: 35972
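      A source-level analog of the change (a hypothetical example; the names and types are made up): the shared subexpression is now computed once, in a block dominating all of its users, instead of once per using block.

              // Before: each block rebuilds base+IV itself, and the copies
              // cannot be CSE'd back together across blocks.
              void before(int *base, int iv, bool c, int a, int b) {
                base[iv + 1] = a;         // add + str
                if (c) base[iv + 1] = b;  // add recomputed + str
              }

              // After: one shared address computation, which codegen
              // prepare may still duplicate into an addressing mode
              // wherever that is free.
              void after(int *base, int iv, bool c, int a, int b) {
                int *p = &base[iv + 1];   // the single 'add r8, r0, r5'
                *p = a;
                if (c) *p = b;
              }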
      Completely rewrite addressing-mode related sinking of code. In particular, · feee64e9
      Chris Lattner authored
      this fixes problems where codegenprepare would sink expressions into load/stores
      that are not valid, and fixes cases where it would miss important valid ones.
      
      This fixes several serious codesize and perf issues, particularly on targets
      with complex addressing modes like arm and x86.  For example, now we compile
      CodeGen/X86/isel-sink.ll to:
      
      _test:
              movl 8(%esp), %eax
              movl 4(%esp), %ecx
              cmpl $1233, %eax
              ja LBB1_2       #F
      LBB1_1: #T
              movl $4, (%ecx,%eax,4)
              movl $141, %eax
              ret
      LBB1_2: #F
              movl (%ecx,%eax,4), %eax
              ret
      
      instead of:
      
      _test:
              movl 8(%esp), %eax
              leal (,%eax,4), %ecx
              addl 4(%esp), %ecx
              cmpl $1233, %eax
              ja LBB1_2       #F
      LBB1_1: #T
              movl $4, (%ecx)
              movl $141, %eax
              ret
      LBB1_2: #F
              movl (%ecx), %eax
              ret
      
      llvm-svn: 35970
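      Roughly what CodeGen/X86/isel-sink.ll computes, reconstructed here from the assembly above (a sketch, not the actual test source): the address arr+i feeds a memory operation on both sides of the branch, so sinking the computation into each use lets the selector fold it into the (%ecx,%eax,4) operands.

              int test(int *arr, unsigned i) {
                if (i <= 1233) {   // 'ja LBB1_2' takes the opposite edge
                  arr[i] = 4;      // movl $4, (%ecx,%eax,4)
                  return 141;
                }
                return arr[i];     // movl (%ecx,%eax,4), %eax
              }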
      Remove use of SlowOperationInformer. · 38705d54
      Devang Patel authored
      llvm-svn: 35967
      Undo previous check-in. · b730fe57
      Devang Patel authored
      llvm-svn: 35966
Hello uses LLVMSupport.a (SlowOperationInformer) · f929b861
      Devang Patel authored
      llvm-svn: 35965