  1. Jun 30, 2010
  2. Jun 29, 2010
    • change ABIArgInfo to hold its llvm type with PATypeHolder so that · cccaad95
      Chris Lattner authored
      it doesn't dangle as types get refined.  This fixes Shootout-C++/lists1
      and probably also PR7522.
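
      For context, PATypeHolder was the pre-3.0 LLVM handle that tracks a type
      across refinement; a rough sketch of the change's shape (the field name
      and surrounding struct are assumptions, not the actual ABIArgInfo):

      #include "llvm/AbstractTypeUser.h"  // pre-3.0 home of PATypeHolder

      // A PATypeHolder stays valid when an opaque type is refined to a
      // concrete one; a raw `const Type*` member would be left dangling.
      struct ArgInfoSketch {               // hypothetical stand-in for ABIArgInfo
        llvm::PATypeHolder CoerceToType;   // was: const llvm::Type *CoerceToType
        explicit ArgInfoSketch(const llvm::Type *T) : CoerceToType(T) {}
      };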
      
      llvm-svn: 107196
    • relax the CGFunctionInfo::CGFunctionInfo ctor to allow any sequence · 34d6281a
      Chris Lattner authored
      of CanQualTypes to be passed in.
      
      llvm-svn: 107176
    • fix PR7519: after thrashing around and remembering how all this stuff · ab1e65e2
      Chris Lattner authored
      works, the fix is quite simple: just make sure to call ConvertTypeRecursive
      when the function type being lowered is in the midst of ConvertType.
      
      llvm-svn: 107173
    • minor cleanups. · e70a007b
      Chris Lattner authored
      llvm-svn: 107150
    • Change X86_64ABIInfo to have ASTContext and TargetData ivars to · 22a931e3
      Chris Lattner authored
      avoid passing ASTContext down through all the methods it has.
      
      When classifying an argument, or argument piece, as INTEGER, check
      to see if we have a pointer at exactly the same offset in the 
      preferred type.  If so, use that pointer type instead of i64.  This
      allows us to compile a function taking a StringRef into something
      like this:
      
      define i8* @foo(i64 %D.coerce0, i8* %D.coerce1) nounwind ssp {
      entry:
        %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=4]
        %0 = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
        store i64 %D.coerce0, i64* %0
        %1 = getelementptr %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
        store i8* %D.coerce1, i8** %1
        %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
        %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
        %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
        %tmp3 = load i8** %tmp2                         ; <i8*> [#uses=1]
        %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1 ; <i8*> [#uses=1]
        ret i8* %add.ptr
      }
      
      instead of this:
      
      define i8* @foo(i64 %D.coerce0, i64 %D.coerce1) nounwind ssp {
      entry:
        %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=3]
        %0 = insertvalue %0 undef, i64 %D.coerce0, 0    ; <%0> [#uses=1]
        %1 = insertvalue %0 %0, i64 %D.coerce1, 1       ; <%0> [#uses=1]
        %2 = bitcast %struct.DeclGroup* %D to %0*       ; <%0*> [#uses=1]
        store %0 %1, %0* %2, align 1
        %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
        %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
        %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
        %tmp3 = load i8** %tmp2                         ; <i8*> [#uses=1]
        %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1 ; <i8*> [#uses=1]
        ret i8* %add.ptr
      }
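
      For reference, source along these lines produces the IR above (a
      reconstruction based on the DeclGroup example in commit 3dd716c3 below,
      not copied from this commit; linkage/mangling details elided):

      struct DeclGroup { long NumDecls; char *Y; };  // a StringRef-like pair
      char *foo(DeclGroup D) {
        return D.NumDecls + D.Y;  // index + pointer, matching the add.ptr GEP
      }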
      
      This implements rdar://7375902 - [codegen quality] clang x86-64 ABI lowering code punishing StringRef
      
      llvm-svn: 107123
    • plumb preferred types down into X86_64ABIInfo::classifyArgumentType, · 399d22ac
      Chris Lattner authored
      no functionality change.
      
      llvm-svn: 107115
    • Pass the LLVM IR version of argument types down into computeInfo. · 1d7c9f7f
      Chris Lattner authored
      This is somewhat annoying to do at this level, but it avoids
      having ABIInfo depend on CodeGenTypes for a hint.
      
      Nothing is using this yet, so no functionality change.
      
      llvm-svn: 107111
    • add IR names to coerced arguments. · 9e748e9d
      Chris Lattner authored
      llvm-svn: 107105
    • make the argument passing stuff in the FCA case smarter still, by · 15ec361b
      Chris Lattner authored
      avoiding making the FCA at all when the types exactly line up.  For
      example, before we made:
      
      %struct.DeclGroup = type { i64, i64 }
      
      define i64 @_Z3foo9DeclGroup(i64, i64) nounwind {
      entry:
        %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=3]
        %2 = insertvalue %struct.DeclGroup undef, i64 %0, 0 ; <%struct.DeclGroup> [#uses=1]
        %3 = insertvalue %struct.DeclGroup %2, i64 %1, 1 ; <%struct.DeclGroup> [#uses=1]
        store %struct.DeclGroup %3, %struct.DeclGroup* %D
        %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
        %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
        %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i64*> [#uses=1]
        %tmp3 = load i64* %tmp2                         ; <i64> [#uses=1]
        %add = add nsw i64 %tmp1, %tmp3                 ; <i64> [#uses=1]
        ret i64 %add
      }
      
      ... which has the pointless insertvalues that fastisel hates.  Now we
      make:
      
      %struct.DeclGroup = type { i64, i64 }
      
      define i64 @_Z3foo9DeclGroup(i64, i64) nounwind {
      entry:
        %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=4]
        %2 = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
        store i64 %0, i64* %2
        %3 = getelementptr %struct.DeclGroup* %D, i32 0, i32 1 ; <i64*> [#uses=1]
        store i64 %1, i64* %3
        %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
        %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
        %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i64*> [#uses=1]
        %tmp3 = load i64* %tmp2                         ; <i64> [#uses=1]
        %add = add nsw i64 %tmp1, %tmp3                 ; <i64> [#uses=1]
        ret i64 %add
      }
      
      This only kicks in when the x86-64 ABI lowering picks a coerce-to type
      whose elements line up exactly with the struct's fields.
      
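      The gist of the check, as a sketch (hypothetical helper, not the actual
      CGCall code): the FCA is bypassed only when every element of the
      coerce-to struct matches the destination field type at the same index.

      #include "llvm/IR/DerivedTypes.h"  // current header; 2010's was llvm/DerivedTypes.h

      // True when storing each scalar through its own GEP is safe, i.e. the
      // coerce-to struct and the in-memory struct agree field by field.
      static bool typesExactlyLineUp(llvm::StructType *CoerceTy,
                                     llvm::StructType *DestTy) {
        if (CoerceTy->getNumElements() != DestTy->getNumElements())
          return false;
        for (unsigned i = 0, e = CoerceTy->getNumElements(); i != e; ++i)
          if (CoerceTy->getElementType(i) != DestTy->getElementType(i))
            return false;
        return true;
      }
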
      llvm-svn: 107104
    • Change CGCall to handle the "coerce" case where the coerce-to type · 3dd716c3
      Chris Lattner authored
      is an FCA by passing each of the elements as individual scalars.  This
      produces code that fast isel is less likely to reject and is easier on
      the optimizers.
      
      For example, before we would compile:
      struct DeclGroup { long NumDecls; char * Y; };
      char * foo(DeclGroup D) {
        return D.NumDecls+D.Y;
      }
      
      to:
      %struct.DeclGroup = type { i64, i64 }
      
      define i64 @_Z3foo9DeclGroup(%struct.DeclGroup) nounwind {
      entry:
        %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=3]
        store %struct.DeclGroup %0, %struct.DeclGroup* %D, align 1
        %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
        %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
        %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i64*> [#uses=1]
        %tmp3 = load i64* %tmp2                         ; <i64> [#uses=1]
        %add = add nsw i64 %tmp1, %tmp3                 ; <i64> [#uses=1]
        ret i64 %add
      }
      
      Now we get:
      
      %0 = type { i64, i64 }
      %struct.DeclGroup = type { i64, i8* }
      
      define i8* @_Z3foo9DeclGroup(i64, i64) nounwind {
      entry:
        %D = alloca %struct.DeclGroup, align 8          ; <%struct.DeclGroup*> [#uses=3]
        %2 = insertvalue %0 undef, i64 %0, 0            ; <%0> [#uses=1]
        %3 = insertvalue %0 %2, i64 %1, 1               ; <%0> [#uses=1]
        %4 = bitcast %struct.DeclGroup* %D to %0*       ; <%0*> [#uses=1]
        store %0 %3, %0* %4, align 1
        %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i64*> [#uses=1]
        %tmp1 = load i64* %tmp                          ; <i64> [#uses=1]
        %tmp2 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 1 ; <i8**> [#uses=1]
        %tmp3 = load i8** %tmp2                         ; <i8*> [#uses=1]
        %add.ptr = getelementptr inbounds i8* %tmp3, i64 %tmp1 ; <i8*> [#uses=1]
        ret i8* %add.ptr
      }
      
      Elimination of the FCA inside the function is still to come.
      
      llvm-svn: 107099
    • make the trivial forms of CreateCoerced{Load|Store} trivial. · d200eda4
      Chris Lattner authored
      llvm-svn: 107091
  3. Jun 28, 2010
    • pass/return structs of char and short as i8/i16 to avoid · 93af3328
      Chris Lattner authored
      awful through-memory coercion, just like we do for i32 now.
      
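      For illustration (my example, not from the commit), the single-field
      structs this affects now lower like their underlying scalar:

      struct C1 { char c; };   // now passed/returned as i8
      struct S1 { short s; };  // now passed/returned as i16
      struct I1 { int i; };    // already passed as i32 before this change
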
      llvm-svn: 107078
    • more tidying up. · d776fb15
      Chris Lattner authored
      llvm-svn: 107076
    • random acts of tidying. · 0cf2419c
      Chris Lattner authored
      llvm-svn: 107050
    • X86-64: · a7d81ab7
      Chris Lattner authored
      pass/return structs of float/int as float/i32 instead of double/i64
      to make the code generated for the ABI cleaner.  Passing in the low part
      of a double is the same as passing in a float.
      
      For example, we now compile:
      
      struct DeclGroup { float NumDecls; };
      float foo(DeclGroup D);
      void bar(DeclGroup *D) {
       foo(*D);
      }
      
      into:
      
      %struct.DeclGroup = type { float }
      
      define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind {
      entry:
        %D.addr = alloca %struct.DeclGroup*, align 8    ; <%struct.DeclGroup**> [#uses=2]
        %agg.tmp = alloca %struct.DeclGroup, align 4    ; <%struct.DeclGroup*> [#uses=2]
        store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
        %tmp = load %struct.DeclGroup** %D.addr         ; <%struct.DeclGroup*> [#uses=1]
        %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
        %tmp2 = bitcast %struct.DeclGroup* %tmp to i8*  ; <i8*> [#uses=1]
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
        %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <float*> [#uses=1]
        %0 = load float* %coerce.dive, align 1          ; <float> [#uses=1]
        %call = call float @_Z3foo9DeclGroup(float %0)  ; <float> [#uses=0]
        ret void
      }
      
      instead of:
      
      %struct.DeclGroup = type { float }
      
      define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind {
      entry:
        %D.addr = alloca %struct.DeclGroup*, align 8    ; <%struct.DeclGroup**> [#uses=2]
        %agg.tmp = alloca %struct.DeclGroup, align 4    ; <%struct.DeclGroup*> [#uses=2]
        %tmp3 = alloca double                           ; <double*> [#uses=2]
        store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
        %tmp = load %struct.DeclGroup** %D.addr         ; <%struct.DeclGroup*> [#uses=1]
        %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
        %tmp2 = bitcast %struct.DeclGroup* %tmp to i8*  ; <i8*> [#uses=1]
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
        %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <float*> [#uses=1]
        %0 = bitcast double* %tmp3 to float*            ; <float*> [#uses=1]
        %1 = load float* %coerce.dive                   ; <float> [#uses=1]
        store float %1, float* %0, align 1
        %2 = load double* %tmp3                         ; <double> [#uses=1]
        %call = call float @_Z3foo9DeclGroup(double %2) ; <float> [#uses=0]
        ret void
      }
      
      which is this machine code (at -O0):
      
      __Z3barP9DeclGroup:
      	subq	$24, %rsp
      	movq	%rdi, 16(%rsp)
      	movq	16(%rsp), %rdi
      	leaq	8(%rsp), %rax
      	movl	(%rdi), %ecx
      	movl	%ecx, (%rax)
      	movss	8(%rsp), %xmm0
      	callq	__Z3foo9DeclGroup
      	addq	$24, %rsp
      	ret
      
      vs this:
      
      __Z3barP9DeclGroup:
      	subq	$24, %rsp
      	movq	%rdi, 16(%rsp)
      	movq	16(%rsp), %rdi
      	leaq	8(%rsp), %rax
      	movl	(%rdi), %ecx
      	movl	%ecx, (%rax)
      	movss	8(%rsp), %xmm0
      	movss	%xmm0, (%rsp)
      	movsd	(%rsp), %xmm0
      	callq	__Z3foo9DeclGroup
      	addq	$24, %rsp
      	ret
      
      At -O3, it is the difference between this now:
      
      __Z3barP9DeclGroup:
      	movss	(%rdi), %xmm0
      	jmp	__Z3foo9DeclGroup  # TAILCALL
      
      vs this before:
      
      __Z3barP9DeclGroup:
      	movl	(%rdi), %eax
      	movd	%rax, %xmm0
      	jmp	__Z3foo9DeclGroup  # TAILCALL
      
      llvm-svn: 107048
    • Minor refactoring of my last patch (radar 7860965 related). · c42461e1
      Fariborz Jahanian authored
      llvm-svn: 107047
    • Have __func__ and siblings point to block's implementation function · 36ad0e99
      Fariborz Jahanian authored
      name. Fixes radar 7860965.
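
      A minimal illustration of the new behavior (assumed example, compiled
      with -fblocks; the exact helper-function name is an assumption):

      #include <stdio.h>

      void run(void) {
        void (^b)(void) = ^{
          // __func__ now names the block's implementation function
          // (something like "__run_block_invoke"), not the enclosing "run".
          printf("%s\n", __func__);
        };
        b();
      }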
      
      llvm-svn: 107044
    • Fix UnitTests/2004-02-02-NegativeZero.c, which regressed when · c1028f68
      Chris Lattner authored
      I broke negate of FP values.
      
      llvm-svn: 107019
  4. Jun 27, 2010
    • Correctly destroy reference temporaries with global storage. Remove... · 3f48c603
      Anders Carlsson authored
      Correctly destroy reference temporaries with global storage. Remove ErrorUnsupported call when binding a global reference to a non-lvalue. Fixes PR7326.
      
      llvm-svn: 106983
    • Reduce indentation. · ca68d357
      Anders Carlsson authored
      llvm-svn: 106980
    • misc tidying · 818efb64
      Chris Lattner authored
      llvm-svn: 106978
    • finally get around to doing a significant cleanup to irgen: · 5e016ae9
      Chris Lattner authored
      have CGF create and make accessible standard int32, int64 and
      intptr types.  This fixes a ton of 80 column violations 
      introduced by LLVMContextification and cleans up stuff a lot.
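
      The shape of the cleanup, sketched (member and constructor names are
      assumptions, not the actual CGF declarations):

      #include "llvm/IR/DerivedTypes.h"

      // Create the commonly used integer types once and hang them off the
      // codegen state, instead of spelling out Type::getInt32Ty(Context)
      // at every use site (the 80-column offender this commit removes).
      struct CommonTypes {                 // hypothetical stand-in for CGF members
        llvm::IntegerType *Int32Ty, *Int64Ty, *IntPtrTy;
        CommonTypes(llvm::LLVMContext &C, unsigned PtrBits)
            : Int32Ty(llvm::Type::getInt32Ty(C)),
              Int64Ty(llvm::Type::getInt64Ty(C)),
              IntPtrTy(llvm::IntegerType::get(C, PtrBits)) {}
      };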
      
      llvm-svn: 106977
    • tidy up OrderGlobalInits · e000907e
      Chris Lattner authored
      llvm-svn: 106976
    • If coercing something from int or pointer type to int or pointer type · 055097f0
      Chris Lattner authored
      (potentially after unwrapping it from a struct) do it without going through
      memory.  We now compile:
      
      struct DeclGroup {
        unsigned NumDecls;
      };
      
      int foo(DeclGroup D) {
        return D.NumDecls;
      }
      
      into:
      
      %struct.DeclGroup = type { i32 }
      
      define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
      entry:
        %D = alloca %struct.DeclGroup, align 4          ; <%struct.DeclGroup*> [#uses=2]
        %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
        %coerce.val.ii = trunc i64 %0 to i32            ; <i32> [#uses=1]
        store i32 %coerce.val.ii, i32* %coerce.dive
        %tmp = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
        %tmp1 = load i32* %tmp                          ; <i32> [#uses=1]
        ret i32 %tmp1
      }
      
      instead of:
      
      %struct.DeclGroup = type { i32 }
      
      define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
      entry:
        %D = alloca %struct.DeclGroup, align 4          ; <%struct.DeclGroup*> [#uses=2]
        %tmp = alloca i64                               ; <i64*> [#uses=2]
        %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
        store i64 %0, i64* %tmp
        %1 = bitcast i64* %tmp to i32*                  ; <i32*> [#uses=1]
        %2 = load i32* %1, align 1                      ; <i32> [#uses=1]
        store i32 %2, i32* %coerce.dive
        %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
        %tmp2 = load i32* %tmp1                         ; <i32> [#uses=1]
        ret i32 %tmp2
      }
      
      ... which is quite a bit less terrifying.
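
      A sketch of the direct conversion (hypothetical helper name; the real
      logic lives in the coerced load/store paths): ints are truncated or
      extended and pointers round-trip through the target's intptr type,
      with no stack slot involved.

      #include "llvm/IR/IRBuilder.h"

      // Convert V to DstTy when both sides are integer or pointer types,
      // using register-to-register casts instead of a store/load pair.
      static llvm::Value *coerceIntOrPtr(llvm::Value *V, llvm::Type *DstTy,
                                         llvm::Type *IntPtrTy,
                                         llvm::IRBuilder<> &B) {
        if (V->getType() == DstTy)
          return V;
        if (V->getType()->isPointerTy())
          V = B.CreatePtrToInt(V, IntPtrTy);           // ptr -> iN
        if (DstTy->isPointerTy())
          return B.CreateIntToPtr(V, DstTy);           // iN -> ptr
        return B.CreateIntCast(V, DstTy, /*isSigned=*/false);  // trunc/zext
      }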
      
      llvm-svn: 106975
    • Same patch as the previous on the store side. Before we compiled this: · 895c52ba
      Chris Lattner authored
      struct DeclGroup {
        unsigned NumDecls;
      };
      
      int foo(DeclGroup D) {
        return D.NumDecls;
      }
      
      to:
      
      %struct.DeclGroup = type { i32 }
      
      define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
      entry:
        %D = alloca %struct.DeclGroup, align 4          ; <%struct.DeclGroup*> [#uses=2]
        %tmp = alloca i64                               ; <i64*> [#uses=2]
        store i64 %0, i64* %tmp
        %1 = bitcast i64* %tmp to %struct.DeclGroup*    ; <%struct.DeclGroup*> [#uses=1]
        %2 = load %struct.DeclGroup* %1, align 1        ; <%struct.DeclGroup> [#uses=1]
        store %struct.DeclGroup %2, %struct.DeclGroup* %D
        %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
        %tmp2 = load i32* %tmp1                         ; <i32> [#uses=1]
        ret i32 %tmp2
      }
      
      which caused fast isel bailouts due to the FCA load/store of %2.  Now
      we generate just this blissful code:
      
      %struct.DeclGroup = type { i32 }
      
      define i32 @_Z3foo9DeclGroup(i64) nounwind ssp noredzone {
      entry:
        %D = alloca %struct.DeclGroup, align 4          ; <%struct.DeclGroup*> [#uses=2]
        %tmp = alloca i64                               ; <i64*> [#uses=2]
        %coerce.dive = getelementptr %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
        store i64 %0, i64* %tmp
        %1 = bitcast i64* %tmp to i32*                  ; <i32*> [#uses=1]
        %2 = load i32* %1, align 1                      ; <i32> [#uses=1]
        store i32 %2, i32* %coerce.dive
        %tmp1 = getelementptr inbounds %struct.DeclGroup* %D, i32 0, i32 0 ; <i32*> [#uses=1]
        %tmp2 = load i32* %tmp1                         ; <i32> [#uses=1]
        ret i32 %tmp2
      }
      
      This avoids fastisel bailing out and is groundwork for a future patch.
      This reduces bailouts on CGStmt.ll to 911 from 935.
      
      llvm-svn: 106974
    • improve CreateCoercedLoad a bit to generate slightly less awful · 1cd6698a
      Chris Lattner authored
      IR when handling X86-64 by-value struct stuff.  For example, we
      used to compile this:
      
      struct DeclGroup {
        unsigned NumDecls;
      };
      
      int foo(DeclGroup D);
      void bar(DeclGroup *D) {
        foo(*D);
      }
      
      into:
      
      define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) ssp nounwind {
      entry:
        %D.addr = alloca %struct.DeclGroup*, align 8    ; <%struct.DeclGroup**> [#uses=2]
        %agg.tmp = alloca %struct.DeclGroup, align 4    ; <%struct.DeclGroup*> [#uses=2]
        %tmp3 = alloca i64                              ; <i64*> [#uses=2]
        store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
        %tmp = load %struct.DeclGroup** %D.addr         ; <%struct.DeclGroup*> [#uses=1]
        %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
        %tmp2 = bitcast %struct.DeclGroup* %tmp to i8*  ; <i8*> [#uses=1]
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
        %0 = bitcast i64* %tmp3 to %struct.DeclGroup*   ; <%struct.DeclGroup*> [#uses=1]
        %1 = load %struct.DeclGroup* %agg.tmp           ; <%struct.DeclGroup> [#uses=1]
        store %struct.DeclGroup %1, %struct.DeclGroup* %0, align 1
        %2 = load i64* %tmp3                            ; <i64> [#uses=1]
        call void @_Z3foo9DeclGroup(i64 %2)
        ret void
      }
      
      which would cause fastisel to bail out due to the first-class aggregate load %1.  With
      this patch we now compile it into the (still awful):
      
      define void @_Z3barP9DeclGroup(%struct.DeclGroup* %D) nounwind ssp noredzone {
      entry:
        %D.addr = alloca %struct.DeclGroup*, align 8    ; <%struct.DeclGroup**> [#uses=2]
        %agg.tmp = alloca %struct.DeclGroup, align 4    ; <%struct.DeclGroup*> [#uses=2]
        %tmp3 = alloca i64                              ; <i64*> [#uses=2]
        store %struct.DeclGroup* %D, %struct.DeclGroup** %D.addr
        %tmp = load %struct.DeclGroup** %D.addr         ; <%struct.DeclGroup*> [#uses=1]
        %tmp1 = bitcast %struct.DeclGroup* %agg.tmp to i8* ; <i8*> [#uses=1]
        %tmp2 = bitcast %struct.DeclGroup* %tmp to i8*  ; <i8*> [#uses=1]
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 4, i32 4, i1 false)
        %coerce.dive = getelementptr %struct.DeclGroup* %agg.tmp, i32 0, i32 0 ; <i32*> [#uses=1]
        %0 = bitcast i64* %tmp3 to i32*                 ; <i32*> [#uses=1]
        %1 = load i32* %coerce.dive                     ; <i32> [#uses=1]
        store i32 %1, i32* %0, align 1
        %2 = load i64* %tmp3                            ; <i64> [#uses=1]
        %call = call i32 @_Z3foo9DeclGroup(i64 %2) noredzone ; <i32> [#uses=0]
        ret void
      }
      
      which doesn't bail out.  On CGStmt.ll, this reduces fastisel bailouts from 958 to 935,
      and is the precursor of better things to come.
      
      llvm-svn: 106973
    • Change IR generation for return (in the simple case) to avoid doing silly · 3fcc790c
      Chris Lattner authored
      load/store nonsense in the epilog.  For example, for:
      
      int foo(int X) {
        int A[100];
        return A[X];
      }
      
      we used to generate:
      
        %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
        %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]
        store i32 %tmp1, i32* %retval
        %0 = load i32* %retval                          ; <i32> [#uses=1]
        ret i32 %0
      }
      
      which codegen'd to this code:
      
      _foo:                                   ## @foo
      ## BB#0:                                ## %entry
      	subq	$408, %rsp              ## imm = 0x198
      	movl	%edi, 400(%rsp)
      	movl	400(%rsp), %edi
      	movslq	%edi, %rax
      	movl	(%rsp,%rax,4), %edi
      	movl	%edi, 404(%rsp)
      	movl	404(%rsp), %eax
      	addq	$408, %rsp              ## imm = 0x198
      	ret
      
      Now we generate:
      
        %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i64 %idxprom ; <i32*> [#uses=1]
        %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]
        ret i32 %tmp1
      }
      
      and:
      
      _foo:                                   ## @foo
      ## BB#0:                                ## %entry
      	subq	$408, %rsp              ## imm = 0x198
      	movl	%edi, 404(%rsp)
      	movl	404(%rsp), %edi
      	movslq	%edi, %rax
      	movl	(%rsp,%rax,4), %eax
      	addq	$408, %rsp              ## imm = 0x198
      	ret
      
      This actually does matter, cutting out 2000 lines of IR from CGStmt.ll 
      for example.
      
      Another interesting effect is that altivec.h functions that are dead
      now get DCE'd by the inliner.  Hence all the changes to
      builtins-ppc-altivec.c to ensure the calls aren't dead.
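
      The mechanical change, sketched (hypothetical names): in the simple
      case the computed value feeds `ret` directly instead of being spilled
      to a %retval slot that the epilog immediately reloads.

      #include "llvm/IR/IRBuilder.h"

      llvm::ReturnInst *emitSimpleReturn(llvm::IRBuilder<> &B, llvm::Value *RV) {
        // before: B.CreateStore(RV, RetValSlot); ...epilog:
        //         B.CreateRet(B.CreateLoad(RetTy, RetValSlot));
        return B.CreateRet(RV);  // after: no round-trip through memory
      }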
      
      llvm-svn: 106970
    • reduce indentation · 726b3d09
      Chris Lattner authored
      llvm-svn: 106967
    • Implement rdar://7530813 - collapse multiple GEP instructions in IRgen · 6c5abe88
      Chris Lattner authored
      This avoids generating two GEPs for common array operations.  Before
      we would generate something like:
      
        %tmp = load i32* %X.addr                        ; <i32> [#uses=1]
        %arraydecay = getelementptr inbounds [100 x i32]* %A, i32 0, i32 0 ; <i32*> [#uses=1]
        %arrayidx = getelementptr inbounds i32* %arraydecay, i32 %tmp ; <i32*> [#uses=1]
        %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]
      
      Now we generate:
      
        %tmp = load i32* %X.addr                        ; <i32> [#uses=1]
        %arrayidx = getelementptr inbounds [100 x i32]* %A, i32 0, i32 %tmp ; <i32*> [#uses=1]
        %tmp1 = load i32* %arrayidx                     ; <i32> [#uses=1]
      
      Less IR is better at -O0.
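
      Sketched with the IRBuilder API (variable names assumed): the decay GEP
      and the index GEP fold into one two-index GEP over the array.

      #include "llvm/IR/IRBuilder.h"

      // Emit A[Idx] as a single GEP over the [100 x i32] alloca, rather than
      // an array-decay GEP followed by a second GEP off the decayed pointer.
      llvm::Value *emitArrayIndex(llvm::IRBuilder<> &B, llvm::Type *ArrayTy,
                                  llvm::Value *ArrayPtr, llvm::Value *Idx) {
        llvm::Value *Idxs[] = {B.getInt32(0), Idx};
        return B.CreateInBoundsGEP(ArrayTy, ArrayPtr, Idxs, "arrayidx");
      }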
      
      llvm-svn: 106966