  1. Apr 12, 2008
      This patch corrects the handling of byval arguments for tailcall · 634fc9a3
      Arnold Schwaighofer authored
      optimized x86-64 (and x86) calls so that they work (... at least for
      my test cases).
      
      Should fix the following problems:
      
      Problem 1: When I introduced the optimized handling of arguments for
      tail-called functions (using a sequence of copyto/copyfrom virtual
      registers instead of always lowering to the top of the stack), I did
      not handle byval arguments correctly, i.e. they did not work at all :).
      
      Problem 2: On x86-64, after the arguments of the tail-called function
      are moved to their registers (which include ESI/RSI etc.), tail call
      optimization performs the byval lowering, which causes the xSI, xDI,
      and xCX registers to be overwritten. This patch handles that by first
      moving the arguments to virtual registers; after the byval lowering,
      the arguments are moved from those virtual registers back to
      RSI/RDI/RCX.
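      
      For illustration, a minimal C-level sketch of the kind of call this
      describes (hypothetical code: a large aggregate passed by value
      becomes a byval argument, and -tailcallopt enables the lowering):
      
              /* a >16-byte struct is passed byval on x86-64 */
              struct big { long a, b, c, d, e; };
              
              long callee(long x, struct big s);
              
              long caller(long x, struct big s) {
                  /* the stack copy emitted for the byval `s` must not
                     clobber registers already holding callee's args */
                  return callee(x + 1, s);
              }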
      
      llvm-svn: 49584
      Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal · 544ab2c5
      Dan Gohman authored
      on any current target and aren't optimized in DAGCombiner. Instead
      of using intermediate nodes, expand the operations, choosing between
      simple loads/stores, target-specific code, and library calls,
      immediately.
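      
      A minimal sketch of that three-way choice (illustrative names and
      threshold, not the actual LLVM code):
      
              enum Lowering { InlineLoadsStores, TargetSpecific, LibCall };
              
              /* hypothetical helper mirroring the description above */
              enum Lowering chooseMemOpLowering(int SizeIsConstant,
                                                unsigned long Size,
                                                unsigned long InlineLimit,
                                                int TargetHasFastPath) {
                  if (!SizeIsConstant)
                      return LibCall;           /* unknown size: use the library */
                  if (Size <= InlineLimit)
                      return InlineLoadsStores; /* small copies: plain loads/stores */
                  if (TargetHasFastPath)
                      return TargetSpecific;    /* e.g. rep;movs on x86 */
                  return LibCall;
              }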
      
      Previously, the code to emit optimized code for these operations
      was only used at initial SelectionDAG construction time; now it is
      used at all times. This fixes some cases where rep;movs was being
      used for small copies where simple loads/stores would be better.
      
      This also cleans up code that checks for alignments less than 4;
      let the targets make that decision instead of doing it in
      target-independent code. This allows x86 to use rep;movs in
      low-alignment cases.
      
      Also, this fixes a bug that resulted in the use of rep;stos for
      memsets of 0 with non-constant memory size when the alignment was
      at least 4. It's better to use the library in this case, which
      can be significantly faster when the size is large.
      
      This also preserves more SourceValue information when memory
      intrinsics are lowered into simple loads/stores.
      
      llvm-svn: 49572
  2. Feb 19, 2008
      - When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should... · 6200c225
      Evan Cheng authored
      - When the DAG combiner is folding a bit convert into a BUILD_VECTOR,
        it should check whether the node is essentially a SCALAR_TO_VECTOR.
        Avoid turning (v8i16) <10, u, u, u> into <10, 0, u, u, u, u, u, u>;
        instead, simply convert it to a SCALAR_TO_VECTOR of the proper type
        (a rough sketch of the check follows below).
      - X86 now normalizes SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32
        SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC.
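      
      As an illustration of the first point, a BUILD_VECTOR behaves like a
      SCALAR_TO_VECTOR exactly when element 0 is its only defined operand
      (a hypothetical helper, not the DAG combiner's actual code):
      
              #include <stdbool.h>
              
              /* Defined[i] is true when BUILD_VECTOR operand i is a real
                 value rather than undef; if only element 0 is defined, no
                 zeros need to be materialized for the other lanes. */
              bool isEssentiallyScalarToVector(const bool *Defined,
                                               int NumElts) {
                  if (!Defined[0])
                      return false;
                  for (int i = 1; i < NumElts; ++i)
                      if (Defined[i])
                          return false;
                  return true;
              }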
      
      llvm-svn: 47290
  3. Nov 24, 2007
      Several changes: · f81d5886
      Chris Lattner authored
      1) Change the interface to TargetLowering::ExpandOperationResult to
         take and return entire NODES that need a result expanded, not just
         the value. This allows us to handle things like READCYCLECOUNTER,
         which returns two values (see the sketch after this list).
      2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES.
      3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new
         ExpandOperationResult.  This makes the result simpler and fully 
         general.
      4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes.
      5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM
         i64 shifts, allowing them to work with LegalizeDAGTypes.
      6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT,
         allowing them to work with LegalizeDAGTypes.
      
      LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when
      type legalization in LegalizeDAG is ifdef'd out.
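      
      A rough sketch of the interface change in point 1 (hypothetical C
      signatures; the real API is C++ and differs in detail):
      
              struct SDNode;  /* a DAG node, possibly with several results */
              
              /* before: expanded one result value at a time, so a node
                 producing two values could not be handled in one call */
              /* SDOperand ExpandOperationResult(SDOperand Op); */
              
              /* after: take the whole node and return its replacement, so
                 multi-result nodes like READCYCLECOUNTER can be expanded */
              struct SDNode *ExpandOperationResult(struct SDNode *N);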
      
      llvm-svn: 44300
  4. Nov 09, 2007
      Much improved pic jumptable codegen: · 797d56ff
      Evan Cheng authored
      Then:
              call    "L1$pb"
      "L1$pb":
              popl    %eax
        ...
      LBB1_1: # entry
              imull   $4, %ecx, %ecx
              leal    LJTI1_0-"L1$pb"(%eax), %edx
              addl    LJTI1_0-"L1$pb"(%ecx,%eax), %edx
              jmpl    *%edx
      
              .align  2
              .set L1_0_set_3,LBB1_3-LJTI1_0
              .set L1_0_set_2,LBB1_2-LJTI1_0
              .set L1_0_set_5,LBB1_5-LJTI1_0
              .set L1_0_set_4,LBB1_4-LJTI1_0
      LJTI1_0:
              .long    L1_0_set_3
              .long    L1_0_set_2
      
      Now:
              call    "L1$pb"
      "L1$pb":
              popl    %eax
        ...
      LBB1_1: # entry
              addl    LJTI1_0-"L1$pb"(%eax,%ecx,4), %eax
              jmpl    *%eax
      
        .align  2
        .set L1_0_set_3,LBB1_3-"L1$pb"
        .set L1_0_set_2,LBB1_2-"L1$pb"
        .set L1_0_set_5,LBB1_5-"L1$pb"
        .set L1_0_set_4,LBB1_4-"L1$pb"
      LJTI1_0:
              .long    L1_0_set_3
              .long    L1_0_set_2
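      
      In C terms, the improvement stores label-minus-picbase offsets in the
      table so that dispatch is a single add (an illustrative model of the
      "Now" output above, not generated code; the cast is for exposition):
      
              typedef void (*target_fn)(void);
              
              /* table[i] holds LBBi - "L1$pb"; picbase is the address
                 popped into %eax by the call/pop sequence */
              static void dispatch(const char *picbase, const long *table,
                                   unsigned i) {
                  target_fn t = (target_fn)(picbase + table[i]); /* addl */
                  t();                                           /* jmpl */
              }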
      
      llvm-svn: 43924
  5. Oct 26, 2007
      Loosen up iv reuse to allow reuse of the same stride but a larger type when... · 7f3d0247
      Evan Cheng authored
      Loosen up IV reuse to allow reuse of the same stride but a larger type
      when truncating from the larger type to the smaller type is free.
      For example, it turns this loop:
      LBB1_1: # entry.bb_crit_edge
              xorl    %ecx, %ecx
              xorw    %dx, %dx
              movw    %dx, %si
      LBB1_2: # bb
              movl    L_X$non_lazy_ptr, %edi
              movw    %si, (%edi)
              movl    L_Y$non_lazy_ptr, %edi
              movw    %dx, (%edi)
        addw    $4, %dx
        incw    %si
        incl    %ecx
        cmpl    %eax, %ecx
        jne     LBB1_2  # bb

      into
      
      LBB1_1: # entry.bb_crit_edge
              xorl    %ecx, %ecx
              xorw    %dx, %dx
      LBB1_2: # bb
              movl    L_X$non_lazy_ptr, %esi
              movw    %cx, (%esi)
              movl    L_Y$non_lazy_ptr, %esi
              movw    %dx, (%esi)
              addw    $4, %dx
        incl    %ecx
              cmpl    %eax, %ecx
              jne     LBB1_2  # bb
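      
      A hypothetical C source for a loop of this shape: `s` counts by 1 in
      16 bits, exactly like the 32-bit loop counter, so after the change it
      is rematerialized as a free truncation of the counter (%cx) instead
      of occupying its own register (%si):
      
              extern short X, Y;
              
              void f(int n) {
                  short s = 0, d = 0;
                  for (int i = 0; i < n; ++i) {  /* i lives in %ecx */
                      X = s;                     /* s == (short)i */
                      Y = d;
                      d += 4;                    /* d keeps its own IV, %dx */
                      s += 1;
                  }
              }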
      
      llvm-svn: 43375