  10. Mar 19, 2011
    • Daniel Dunbar · 327cd36f
      Revert r127953, "SimplifyCFG has stopped duplicating returns into
      predecessors to canonicalize IR"; it broke a lot of things.

      llvm-svn: 127954
    • Evan Cheng · 824a7113
      SimplifyCFG has stopped duplicating returns into predecessors to
      canonicalize IR to have a single return block (at least getting there)
      for optimizations. This is general goodness, but it would prevent some
      tail call optimizations. One specific case is code like this:

      int f1(void);
      int f2(void);
      int f3(void);
      int f4(void);
      int f5(void);
      int f6(void);
      int foo(int x) {
        switch(x) {
        case 1: return f1();
        case 2: return f2();
        case 3: return f3();
        case 4: return f4();
        case 5: return f5();
        case 6: return f6();
        }
      }

      =>

      LBB0_2:                                 ## %sw.bb
        callq   _f1
        popq    %rbp
        ret
      LBB0_3:                                 ## %sw.bb1
        callq   _f2
        popq    %rbp
        ret
      LBB0_4:                                 ## %sw.bb3
        callq   _f3
        popq    %rbp
        ret

      This patch teaches CodeGenPrepare to duplicate returns when the return
      value is a phi and the phi operands are produced by tail calls followed
      by an unconditional branch:

      sw.bb7:                                           ; preds = %entry
        %call8 = tail call i32 @f5() nounwind
        br label %return
      sw.bb9:                                           ; preds = %entry
        %call10 = tail call i32 @f6() nounwind
        br label %return
      return:
        %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ]
        ret i32 %retval.0

      This allows codegen to generate better code like this:

      LBB0_2:                                 ## %sw.bb
              jmp     _f1                     ## TAILCALL
      LBB0_3:                                 ## %sw.bb1
              jmp     _f2                     ## TAILCALL
      LBB0_4:                                 ## %sw.bb3
              jmp     _f3                     ## TAILCALL

      rdar://9147433

      llvm-svn: 127953
    • Nadav Rotem · e7a101cc
      Add support for legalizing UINT_TO_FP of vectors on platforms which do
      not have native support for this operation (such as X86). The legalized
      code uses two vector INT_TO_FP operations and is faster than
      scalarizing.

      llvm-svn: 127951
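      The idea behind this kind of legalization can be sketched in scalar C.
      This is an illustrative sketch of the split-and-recombine trick only,
      not the actual DAG-legalizer code: an unsigned 32-bit value may not fit
      a signed conversion, but each 16-bit half does, so two signed int-to-fp
      conversions plus a scale-and-add recover the same result.

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      /* Convert a 32-bit unsigned value to float using only signed
         int-to-fp conversions: each 16-bit half is non-negative and fits
         comfortably in a signed int, so both conversions are exact, and
         the final add performs the single rounding step.  A vector
         legalizer applies the same idea lane-wise. */
      static float uint_to_fp_split(uint32_t x) {
          int32_t hi = (int32_t)(x >> 16);     /* upper 16 bits, exact in float */
          int32_t lo = (int32_t)(x & 0xFFFF);  /* lower 16 bits, exact in float */
          return (float)hi * 65536.0f + (float)lo;
      }

      int main(void) {
          uint32_t tests[] = {0u, 1u, 65535u, 0x80000000u, 0xFFFFFFFFu, 1234567890u};
          for (unsigned i = 0; i < sizeof tests / sizeof tests[0]; i++)
              assert(uint_to_fp_split(tests[i]) == (float)tests[i]);
          printf("ok\n");
          return 0;
      }
      ```

      The result matches a direct conversion for every input because the two
      partial conversions are exact and only the final addition rounds.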
  19. Mar 05, 2011
    • Andrew Trick · 641e2d4f
      Increased the register pressure limit on x86_64 from 8 to 12 regs. This
      is the only change in this checkin that may affect the default
      scheduler. With better register tracking and heuristics, it doesn't
      make sense to artificially lower the register limit so much.

      Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
      give the scheduler a way to account for div and sqrt on targets that
      don't have an itinerary. It currently defaults to 10 (the actual number
      doesn't matter much), but only takes effect on the non-default
      schedulers: list-hybrid and list-ilp.

      Added several heuristics that can be individually disabled for the
      non-default sched=list-ilp mode. This helps us determine how much
      better we can do on a given benchmark than the default scheduler.
      Certain compute-intensive loops run much faster in this mode with the
      right set of heuristics, and it doesn't seem to have much negative
      impact elsewhere. Not all of the heuristics are needed, but we still
      need to experiment to decide which should be disabled by default for
      sched=list-ilp.

      llvm-svn: 127067
  21. Feb 28, 2011
    • David Greene · 20a1cbef
      [AVX] Add decode support for VUNPCKLPS/D instructions, both 128-bit and
      256-bit forms. Because the number of elements in a vector does not
      determine the vector type (4 elements could be v4f32 or v4f64), pass
      the full type of the vector to decode routines.

      llvm-svn: 126664
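      For reference, the unpack-low shuffle these decoders describe
      interleaves the low elements of its two sources. A scalar C sketch of
      the 128-bit UNPCKLPS semantics (illustrative only, not the decoder
      code itself):

      ```c
      #include <assert.h>
      #include <stdio.h>

      /* 128-bit UNPCKLPS: dst = { a[0], b[0], a[1], b[1] }.  The 256-bit
         form repeats this per 128-bit lane, which is one reason the decode
         routines need the full vector type, not just the element count. */
      static void unpcklps(const float a[4], const float b[4], float dst[4]) {
          dst[0] = a[0];
          dst[1] = b[0];
          dst[2] = a[1];
          dst[3] = b[1];
      }

      int main(void) {
          float a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8}, r[4];
          float want[4] = {1, 5, 2, 6};
          unpcklps(a, b, r);
          for (int i = 0; i < 4; i++)
              assert(r[i] == want[i]);
          printf("ok\n");
          return 0;
      }
      ```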
  24. Feb 23, 2011
    • David Greene · 9a6040dc
      [AVX] General VUNPCKL codegen support.

      llvm-svn: 126264
  25. Feb 22, 2011
    • Devang Patel · f3292b21
      Revert r124611, "Keep track of incoming argument's location while
      emitting LiveIns." In other words, do not keep track of the argument's
      location. The debugger (gdb) is not prepared to see line table entries
      for arguments; for the debugger, the "second" line table entry marks
      the beginning of the function body. Getting this working requires some
      coordination with the debugger:
       - The debugger needs to be aware of the prolog_end attribute attached
         to line table entries.
       - The compiler needs to accurately mark prolog_end in line table
         entries (at -O0 and at -O1+).

      llvm-svn: 126155
  28. Feb 17, 2011
    • David Greene · 3a2b508e
      [AVX] Reorganize X86ShuffleDecode into its own library (LLVMX86Utils.a)
      to break the cyclic library dependency between LLVMX86CodeGen.a and
      LLVMX86AsmParser.a. Previously this code lived in a header file and was
      marked static, but AVX requires some additional functionality here that
      won't be used by all clients. Since including unused static functions
      causes a gcc compiler warning, keeping it as a header would break
      builds that use -Werror. Putting this in its own library solves both
      problems at once.

      llvm-svn: 125765
  30. Feb 13, 2011
    • Chris Lattner · 46c01a30
      Enhance ComputeMaskedBits to know that aligned frame indexes have their
      low bits set to zero. This allows us to optimize out explicit stack
      alignment code, like in stack-align.ll:test4, when it is redundant.

      Doing this causes the code generator to start turning FI+cst into
      FI|cst all over the place, which is general goodness (that is the
      canonical form) except that various pieces of the code generator don't
      handle OR aggressively. Fix this by introducing a new
      SelectionDAG::isBaseWithConstantOffset predicate, and using it in
      places that are looking for ADD(X,CST). The ARM backend in particular
      was missing a lot of addressing-mode folding opportunities around OR.

      llvm-svn: 125470
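      The ADD-to-OR canonicalization is sound because the two operations
      agree whenever the operands share no set bits, which is exactly what
      known-zero low bits on an aligned frame index guarantee for small
      offsets. A scalar C sketch of the underlying identity (illustrative,
      not LLVM code; the base value is hypothetical):

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      /* a + b == (a | b) + (a & b), so ADD and OR coincide exactly when
         the operands have no common set bits.  A 16-byte-aligned base has
         its low 4 bits zero, so any offset below 16 qualifies. */
      static int add_equals_or(uint64_t base, uint64_t off) {
          return (base & off) == 0 && (base + off) == (base | off);
      }

      int main(void) {
          uint64_t base = 0x7ffdcafe0;   /* hypothetical stack slot address */
          base &= ~(uint64_t)0xF;        /* force 16-byte alignment */
          for (uint64_t off = 0; off < 16; off++)
              assert(add_equals_or(base, off));   /* ADD and OR agree */
          assert(!add_equals_or(3, 1));  /* shared bit: ADD carries, OR doesn't */
          printf("ok\n");
          return 0;
      }
      ```

      This is why a predicate like isBaseWithConstantOffset can treat
      OR(FI, cst) and ADD(FI, cst) uniformly once alignment is known.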
  31. Feb 11, 2011
    • David Greene · 79827a5a
      [AVX] Implement 256-bit vector lowering for SCALAR_TO_VECTOR. This
      largely completes support for 128-bit fallback lowering for code that
      is not 256-bit ready.

      llvm-svn: 125315
  32. Feb 10, 2011
    • David Greene · ce318e49
      [AVX] Implement 256-bit vector lowering for EXTRACT_VECTOR_ELT.

      llvm-svn: 125284
  33. Feb 09, 2011
    • David Greene · b36195ab
      [AVX] Implement 256-bit vector lowering for INSERT_VECTOR_ELT.

      llvm-svn: 125187