  1. Dec 11, 2012
    • Change TargetLowering::getRepRegClassFor to take an MVT, instead of EVT. · 57b1694d
      Patrik Hagglund authored

      Accordingly, change RegDefIter to contain MVTs instead of EVTs.
      
      llvm-svn: 169838
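
      A hedged sketch of the shape of the change (not the verbatim patch):

        // Before: the hook took an extended value type.
        // virtual const TargetRegisterClass *getRepRegClassFor(EVT VT) const;

        // After: the hook is only ever queried with legal (simple) types,
        // so it can take an MVT directly.
        virtual const TargetRegisterClass *getRepRegClassFor(MVT VT) const;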
    • Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. · 3708e548
      Patrik Hagglund authored
      Accordingly, add helper functions getSimpleValueType (in parallel to
      getValueType) in SDValue, SDNode, and TargetLowering.
      
      This is the first in a series of patches.
      
      llvm-svn: 169837
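
      A minimal sketch of the new helper, assuming it follows getValueType
      and unwraps the EVT via EVT::getSimpleVT, which asserts the type is
      simple:

        // Plausible shape of the helper (free-function form for
        // illustration; the real ones are members of SDValue, SDNode and
        // TargetLowering).
        #include "llvm/CodeGen/SelectionDAGNodes.h"

        static llvm::MVT getSimpleValueType(const llvm::SDValue &V) {
          // EVT::getSimpleVT() asserts the type is simple (legal).
          return V.getValueType().getSimpleVT();
        }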
    • Fix a miscompile in the DAG combiner. · b27041c5
      Chandler Carruth authored

      Previously, we would incorrectly try to reduce the width of this
      load, and would end up transforming:
      
        (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
      to
        (truncate (zextload i32 <ptr+4> as i64) to i32)
      
      We lost the sext attached to the load while building the narrower i32
      load, and replaced it with a zext because lshr always zero-extends its
      result. Instead, bail out of this combine when there is a conflict
      between a sextload and a zext narrowing. The rest of the DAG combiner
      still optimizes the code down to the proper single instruction:
      
        movswl 6(...),%eax
      
      Which is exactly what we wanted. Previously we read past the end *and*
      missed the sign extension:
      
        movl 6(...), %eax
      
      llvm-svn: 169802
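
      To make the lost sign bits concrete, here is a small standalone
      sketch, assuming a negative value was sign-extended from 48 to 64
      bits; the masked expression models the bogus zext-narrowed load:

        #include <cstdint>
        #include <cstdio>

        int main() {
          // Stand-in for (sextload i48 <ptr> as i64) of a negative value.
          int64_t wide = -2; // 0xFFFFFFFFFFFFFFFE after sign extension
          // Correct: (truncate (lshr wide, 32) to i32) keeps the sign bits.
          int32_t good = (int32_t)((uint64_t)wide >> 32);
          // Miscompile: a zext-narrowed load only sees the 48 stored bits,
          // so bits [48,64) come back as zero instead of copies of the sign.
          int32_t bad = (int32_t)(((uint64_t)wide & 0xFFFFFFFFFFFFull) >> 32);
          std::printf("correct: %d  miscompiled: %d\n", good, bad); // -1 vs 65535
        }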
    • Fall back to the selection dag isel to select tail calls. · df42cf39
      Chad Rosier authored
      This shouldn't affect codegen for -O0 compiles, as tail call markers are
      not emitted in unoptimized compiles.  Testing with the external/internal
      nightly test suite reveals no change in compile-time performance.  Testing
      with -O1, -O2 and -O3 with fast-isel enabled did not cause any compile-time
      or execution-time failures.  All tests were performed on my x86 machine.
      I'll monitor our ARM testers to ensure no regressions occur there.
      
      In an upcoming clang patch I will be marking calls to
      objc_autoreleaseReturnValue and objc_retainAutoreleaseReturnValue as
      tail calls unconditionally.  While
      it's theoretically true that this is just an optimization, it's an
      optimization that we very much want to happen even at -O0, or else ARC
      applications become substantially harder to debug.
      
      Part of rdar://12553082
      
      llvm-svn: 169796
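
      A hypothetical sketch of the fallback shape, assuming a fast-isel
      call-selection routine; everything around the isTailCall() check is
      illustrative, not the committed code:

        #include "llvm/IR/Instructions.h"
        using namespace llvm;

        static bool fastIselSelectCall(const Instruction *I) {
          // Refuse to select tail calls in fast-isel; returning false makes
          // the caller fall back to SelectionDAG isel for this instruction.
          if (const CallInst *CI = dyn_cast<CallInst>(I))
            if (CI->isTailCall())
              return false;
          // ... normal fast-isel call handling would go here ...
          return true;
        }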
    • Refactor out the abbreviation handling into a separate class. · c8a310ed
      Eric Christopher authored

      The new class controls each of the abbreviation sets (only a single
      one at the moment) and also computes offsets separately for each set
      of DIEs.
      
      No real functional change; the ordering of abbreviations for the
      skeleton CU changed, but only because we now compute them in a
      separate order. Fix the testcase not to care.
      
      llvm-svn: 169793
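
      A hypothetical sketch of the holder's shape, assuming one object per
      abbreviation set; the class and member names here are illustrative,
      not the committed ones:

        #include <vector>

        // Illustrative only: DIEAbbrev and DIE stand for the AsmPrinter's
        // DIE abbreviation and DIE types.
        class DIEAbbrev;
        class DIE;

        // One holder per abbreviation set: interns abbreviations, assigns
        // numbers local to the set, and computes offsets for its own DIEs.
        class AbbreviationSet {
          std::vector<DIEAbbrev *> Abbreviations; // unique, in emission order
        public:
          void assignAbbrevNumber(DIEAbbrev &Abbrev);               // intern + number
          unsigned computeSizeAndOffset(DIE *Die, unsigned Offset); // per-set offsets
        };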
    • Some enhancements for memcpy / memset inline expansion. · 79e2ca90
      Evan Cheng authored
      1. Teach it to use overlapping unaligned load / store to copy / set the
         trailing bytes (see the sketch at the end of this entry). e.g. On x86,
         use two pairs of movups / movaps for 17-31 byte copies.
      2. Use f64 for memcpy / memset on targets where i64 is not legal but f64
         is. e.g. x86 and ARM.
      3. When expanding memcpy from a constant string, do *not* replace the load
         with a constant if it's not possible to materialize an integer
         immediate with a single instruction (this required a new target hook:
         TLI.isIntImmLegal()).
      4. Use unaligned load / stores more aggressively if target hooks indicate
         they are "fast".
      5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 /
         vst1.8. Also increase the threshold to something reasonable (8 for
         memset, 4 pairs for memcpy).
      
      This significantly improves Dhrystone, up to 50% on ARM iOS devices.
      
      rdar://12760078
      
      llvm-svn: 169791
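
      A minimal sketch of the overlapping-copy trick from item 1, in plain
      C++ and assuming unaligned 16-byte accesses are fast; each 16-byte
      std::memcpy stands in for one movups load or store:

        #include <cstddef>
        #include <cstdint>
        #include <cstring>

        // Copy n bytes, 17 <= n <= 31, with exactly two 16-byte load/store
        // pairs: one for the head and one, overlapping it, for the tail.
        void copy17to31(uint8_t *dst, const uint8_t *src, size_t n) {
          uint8_t head[16], tail[16];
          std::memcpy(head, src, 16);          // first unaligned 16-byte load
          std::memcpy(tail, src + n - 16, 16); // overlapping tail load
          std::memcpy(dst, head, 16);          // head store
          std::memcpy(dst + n - 16, tail, 16); // overlapping tail store
        }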
  2. Dec 07, 2012
    • Add higher-level API for dealing with bundled MachineInstrs. · fead62d4
      Jakob Stoklund Olesen authored
      This is still a work in progress. The purpose is to make bundling and
      unbundling operations explicit, and to catch errors where bundles are
      broken or created inadvertently.
      
      The old IsInsideBundle flag is replaced by two MI flags: BundledPred
      which has the same meaning as IsInsideBundle, and BundledSucc which is
      set on instructions that are bundled with a successor. Having two flags
      provides redundancy to detect when a bundle is inadvertently torn by a
      splice() or insert(), and it makes it possible to write bundle iterators
      that don't need to peek at adjacent instructions.
      
      The new flags can't be manipulated directly (once setIsInsideBundle is
      gone). Instead, there are MI functions to make and break bundle bonds.
      
      The setIsInsideBundle function will be removed in a future commit. It
      should be replaced by bundleWithPred().
      
      llvm-svn: 169583
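
      A hedged sketch of why two flags make iteration local, using accessors
      assumed to simply test the BundledPred/BundledSucc flags described
      above:

        #include "llvm/CodeGen/MachineInstr.h"

        // Bundle boundaries become decidable from a single instruction,
        // with no peeking at adjacent instructions.
        static bool startsBundle(const llvm::MachineInstr &MI) {
          return MI.isBundledWithSucc() && !MI.isBundledWithPred();
        }
        static bool endsBundle(const llvm::MachineInstr &MI) {
          return MI.isBundledWithPred() && !MI.isBundledWithSucc();
        }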