  1. Dec 11, 2012
    • llvm/tools: Add #include "llvm/TargetTransformInfo.h" · 256e013d
      NAKAMURA Takumi authored
      llvm-svn: 169817
    • Use multiclass for new-value store instructions with MEMri operand. · 92e71918
      Jyotsna Verma authored
      llvm-svn: 169814
    • dbb33281
      Nadav Rotem authored
    • Change some functions to take const pointers. · 6ee19d28
      Rafael Espindola authored
      llvm-svn: 169812
    • Stylistic tweak. · c2bd620f
      Evan Cheng authored
      llvm-svn: 169811
    • Add a triple to this test. · d4c0c6cb
      Chad Rosier authored
      llvm-svn: 169803
    • Fix a miscompile in the DAG combiner. Previously, we would incorrectly · b27041c5
      Chandler Carruth authored
      try to reduce the width of this load, and would end up transforming:
      
        (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
      to
        (truncate (zextload i32 <ptr+4> as i64) to i32)
      
      We lost the sext attached to the load while building the narrower i32
      load and replaced it with a zext, because lshr always zero-extends its
      result. Instead, bail out of this combine when there is a conflict
      between a sextload and a zext narrowing. The rest of the DAG combiner
      still optimizes the code down to the proper single instruction:
      
        movswl 6(...),%eax
      
      That is exactly what we wanted; a small C illustration of the
      sign-extension issue follows this entry. Previously we read past the end
      *and* missed the sign extension:
      
        movl 6(...), %eax
      
      llvm-svn: 169802
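      The DAG nodes above are hard to read in isolation, so here is a minimal,
      hypothetical C reduction of the same shape (not the commit's test case;
      the struct layout and names are invented for illustration). A 48-bit
      signed bit-field is sign-extended to 64 bits when loaded, shifted right
      by 32, and truncated to 32 bits, so the upper 16 bits of the result must
      be copies of the sign bit; that is exactly what a narrowed zero-extending
      load would lose.

        #include <stdio.h>
        #include <stdint.h>

        /* Hypothetical reduction: a 48-bit signed field, loaded with sign
         * extension, whose upper 32 bits are extracted via shift + truncate. */
        struct packed48 {
            int64_t value : 48;   /* the "i48" that is sign-extended on load */
        };

        static uint32_t upper_bits(const struct packed48 *p) {
            uint64_t widened = (uint64_t)p->value;   /* sextload i48 -> i64   */
            return (uint32_t)(widened >> 32);        /* lshr 32, trunc to i32 */
        }

        int main(void) {
            struct packed48 s = { -2 };   /* 48-bit pattern 0xFFFFFFFFFFFE */
            /* Correct result: 0xffffffff, the sign-filled upper half.  A
             * zero-extending narrowed load, as in the miscompile, would read
             * past the 48-bit field and drop the sign fill from the top 16
             * bits of the result. */
            printf("0x%08x\n", (unsigned)upper_bits(&s));
            return 0;
        }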
    • move X86-specific test · c4550d49
      Paul Redmond authored
      This test case uses -mcpu=corei7, so it belongs in CodeGen/X86.
      
      Reviewed by: Nadav
      
      llvm-svn: 169801
    • Fix grammar-o. · ceb1577b
      Bill Wendling authored
      llvm-svn: 169798
    • Fall back to the selection dag isel to select tail calls. · df42cf39
      Chad Rosier authored
      This shouldn't affect codegen for -O0 compiles, as tail call markers are
      not emitted in unoptimized compiles.  Testing with the external/internal
      nightly test suite revealed no change in compile-time performance.
      Testing with -O1, -O2 and -O3 with fast-isel enabled did not cause any
      compile-time or execution-time failures.  All tests were performed on my
      x86 machine.  I'll monitor our ARM testers to ensure no regressions occur
      there.
      
      In an upcoming clang patch I will be marking calls to
      objc_autoreleaseReturnValue and objc_retainAutoreleaseReturnValue as
      tail calls unconditionally.  While
      it's theoretically true that this is just an optimization, it's an
      optimization that we very much want to happen even at -O0, or else ARC
      applications become substantially harder to debug.
      
      Part of rdar://12553082
      
      llvm-svn: 169796
    • Refactor out the abbreviation handling into a separate class that · c8a310ed
      Eric Christopher authored
      controls each of the abbreviation sets (only a single one at the
      moment) and computes offsets separately as well for each set
      of DIEs.
      
      No real functional change; the ordering of abbreviations for the skeleton
      CU changed, but only because we now compute them in a separate order. Fix
      the testcase not to care.
      
      llvm-svn: 169793
    • Some enhancements for memcpy / memset inline expansion. · 79e2ca90
      Evan Cheng authored
      1. Teach it to use overlapping unaligned load / store to copy / set the
         trailing bytes, e.g. on x86, use two pairs of movups / movaps for
         17-31 byte copies (a sketch of this trick follows this entry).
      2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
         x86 and ARM.
      3. When expanding a memcpy from a constant string, do *not* replace the
         load with a constant if it's not possible to materialize an integer
         immediate with a single instruction (this required a new target hook:
         TLI.isIntImmLegal()).
      4. Use unaligned load / stores more aggressively if target hooks indicate
         they are "fast".
      5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
         Also increase the threshold to something reasonable (8 for memset, 4 pairs
         for memcpy).
      
      This significantly improves Dhrystone, up to 50% on ARM iOS devices.
      
      rdar://12760078
      
      llvm-svn: 169791
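      Item 1 is the easiest to picture in source form. The sketch below is a
      hand-written C illustration of the overlapping-copy idea, not LLVM's
      actual SelectionDAG lowering; the function name and the fixed 16-byte
      block size are assumptions for illustration, with each fixed-size memcpy
      standing in for a single unaligned 16-byte move (movups on x86).

        #include <string.h>

        /* Copy n bytes, 17 <= n <= 31, with two 16-byte block moves whose
         * second move overlaps the first, instead of one 16-byte move plus a
         * byte-by-byte tail.  Assumes dst and src do not overlap each other. */
        static void copy_17_to_31(void *dst, const void *src, size_t n) {
            unsigned char *d = (unsigned char *)dst;
            const unsigned char *s = (const unsigned char *)src;

            memcpy(d, s, 16);                    /* bytes [0, 16)             */
            memcpy(d + n - 16, s + n - 16, 16);  /* bytes [n-16, n), overlaps
                                                    the tail of the first move */
        }

      For n = 20, the two moves cover bytes 0-15 and 4-19; re-copying bytes
      4-15 is harmless and cheaper than a scalar tail loop.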
    • Optimistically analyse Phi cycles · edd62b14
      Arnold Schwaighofer authored
      Analyse Phis under the starting assumption that they are NoAlias, and
      recursively look at their inputs. If they MayAlias/MustAlias, there must
      be an input that makes them so. (A small C example of the kind of loop
      this helps follows this entry.)
      
      Addresses bug 14351.
      
      llvm-svn: 169788
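      As a rough illustration (hypothetical code, not the test case from the
      commit), consider a loop in which two pointers step through two distinct
      local arrays. In SSA form each pointer becomes a Phi whose incoming
      values are either the array base or the Phi's own increment; starting
      from the optimistic NoAlias assumption and recursing into those inputs
      lets the analysis conclude that the store never clobbers the load.

        #include <stddef.h>

        /* Hypothetical example: p and q become Phi nodes in the loop, each fed
         * only by its own distinct array and by its own increment, so the
         * optimistic Phi analysis can prove the store *p and the load *q are
         * NoAlias. */
        float scale_into(size_t n) {
            float src[64], dst[64];
            for (size_t i = 0; i < 64; ++i)
                src[i] = (float)i;

            float *p = dst;
            const float *q = src;
            float total = 0.0f;
            for (size_t i = 0; i < n && i < 64; ++i) {
                *p = *q * 2.0f;   /* store through p, load through q */
                total += *p;
                ++p;              /* each pointer feeds its own Phi cycle */
                ++q;
            }
            return total;
        }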
  2. Dec 10, 2012