  1. Apr 09, 2012
    • Bill Wendling's avatar
      8a49d049
    • Chandler Carruth authored · 3779ac10
      Cleanup and relax a restriction on the matching of global offsets into
      x86 addressing modes. This allows PIE-based TLS offsets to fit directly
      into an addressing mode immediate offset, which is the last remaining
      code quality issue from PR12380. With this patch, that PR is completely
      fixed.
      
      To see why it is correct to match these offsets into addressing mode
      immediates, break it down by cases:
      1) 32-bit is trivially correct, and unmodified here.
      2) 64-bit non-small mode is unchanged and never matches.
      3) 64-bit small PIC code which is RIP-relative is handled specially in
         the match to try to fit RIP into the base register. If it fails, it
         now early exits. This behavior is unchanged by the patch.
      4) 64-bit small non-PIC code which is not RIP-relative continues to work
         as it did before. These immediates are safe because the ABI ensures
         they fit in small mode. This behavior is unchanged.
      5) 64-bit small PIC code which is *not* using RIP-relative addressing.
         This is the only case changed by the patch, and the primary place you
         see it is in TLS, either the win64 section offset TLS or Linux
         local-exec TLS model in a PIC compilation. Here the ABI again ensures
         that the immediates fit because we are in small mode, and any other
         operations required by the PIC relocation model have been handled
         outside the Wrapper node (extra loads etc. are made around the
         wrapper node in ISelLowering).
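
The five cases above can be read as a small decision procedure. The following is only a sketch of that reading, not the code in the patch: the types `CodeModel`, `Target`, and `Match`, the function `matchGlobalOffset`, and the `ripFitsInBase` flag are all hypothetical names introduced for illustration. It also assumes that "fitting RIP into the base register" can be summarized as a single yes/no input.

```cpp
// Hypothetical model of the five cases; none of these names come from LLVM.
enum class CodeModel { Small, Other };

struct Target {
  bool is64Bit;
  CodeModel model;
  bool isPIC;  // after the patch this no longer blocks non-RIP-relative folds
};

enum class Match { FoldIntoImmediate, FoldWithRIPBase, NoMatch };

// symbolIsRIPRelative: the wrapped symbol is addressed relative to RIP.
// ripFitsInBase: assumption standing in for "RIP can take the base register".
Match matchGlobalOffset(const Target &t, bool symbolIsRIPRelative,
                        bool ripFitsInBase) {
  if (!t.is64Bit)
    return Match::FoldIntoImmediate;  // case 1: 32-bit always folds
  if (t.model != CodeModel::Small)
    return Match::NoMatch;            // case 2: non-small 64-bit never folds
  if (symbolIsRIPRelative)            // case 3: fold only if RIP fits as base
    return ripFitsInBase ? Match::FoldWithRIPBase : Match::NoMatch;
  // Cases 4 and 5: small mode, not RIP-relative. The ABI keeps the offset in
  // range, so the fold is allowed for non-PIC and (with the patch) for PIC.
  return Match::FoldIntoImmediate;
}
```

In this model the `isPIC` field does not influence the result for non-RIP-relative symbols, which mirrors the point of case 5: the PIC-specific work happens around the wrapper node, not inside the match.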
      
      I've tested this as much as I can by comparing it with GCC's output, and
      everything appears safe. I discussed this with Anton and it made sense
      to him, at least at face value. That said, if there are issues with PIC
      code after this patch, yell and we can revert it.
      
      llvm-svn: 154304
    • Chandler Carruth authored · 84b83426
      Fold 15 tiny test cases into a single file that implements the
      comprehensive testing of TLS codegen for x86. Convert all of the ones
      that were still using grep to use FileCheck. Remove some redundancies
      between them.
      
      Perhaps most interestingly, expand the test cases so that they
      fully list the instruction snippet being tested. TLS operations are
      *very* narrowly defined, so these seem reasonably stable. More
      importantly, the existing test cases were already extremely fine-grained,
      expecting specific registers to be allocated. This just clarifies that
      no *other* instructions are expected, and fills in some crucial gaps
      that weren't being tested at all.
      
      This will make any subsequent changes to TLS much clearer during
      review.
      
      llvm-svn: 154303
    • Craig Topper authored · 6148fe65
      Optimize code a bit. No functional change intended.
      llvm-svn: 154299
  2. Apr 08, 2012
  3. Apr 07, 2012
    • Craig Topper authored · aa9aab5a
      Move vinsertf128 patterns near the instruction definitions. Add
      AddedComplexity to AVX2 vextracti128 patterns to give them priority
      over the integer versions of vextractf128 patterns.
      
      llvm-svn: 154268
    • Craig Topper authored · e09d1c5c
      Remove 'else' after 'if' that ends in return.
      llvm-svn: 154267
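
The commit itself is not shown here, so as an illustration of the style rule only, here is a minimal before/after sketch. Both functions are hypothetical and are not taken from the changed code.

```cpp
// Before: the 'else' is redundant because the 'if' branch always returns.
int clampToZeroBefore(int value) {
  if (value < 0) {
    return 0;
  } else {
    return value;
  }
}

// After: same behavior, one less level of nesting.
int clampToZeroAfter(int value) {
  if (value < 0)
    return 0;
  return value;
}
```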
    • Nadav Rotem authored · 71d07ae5
      1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new
         shuffle node because it could introduce new shuffle nodes that were not
         supported efficiently by the target.
      
      2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
         second shuffle reverses the transformation of the first shuffle.
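
To make the second point concrete, here is a rough sketch of how one might test whether a second shuffle undoes a first one. It is not the DAG combiner code from this commit. It assumes single-input shuffles, a mask convention of `result[i] = input[mask[i]]`, and negative indices for undefined lanes; the names `composeMasks` and `secondUndoesFirst` are made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Composes two single-input shuffle masks. With result[i] = input[mask[i]],
// applying `first` and then `second` reads lane first[second[i]] of the
// original vector. Undefined or out-of-range lanes become -1.
std::vector<int> composeMasks(const std::vector<int> &first,
                              const std::vector<int> &second) {
  std::vector<int> combined(second.size(), -1);
  for (std::size_t i = 0; i < second.size(); ++i) {
    int idx = second[i];
    if (idx >= 0 && static_cast<std::size_t>(idx) < first.size())
      combined[i] = first[idx];
  }
  return combined;
}

// True when the composed mask is the identity, i.e. the second shuffle
// restores every lane to where it was before the first shuffle.
// Conservative: any undefined lane makes this return false.
bool secondUndoesFirst(const std::vector<int> &first,
                       const std::vector<int> &second) {
  if (first.size() != second.size())
    return false;
  std::vector<int> combined = composeMasks(first, second);
  for (std::size_t i = 0; i < combined.size(); ++i)
    if (combined[i] != static_cast<int>(i))
      return false;
  return true;
}
```

When the check succeeds, both shuffles can be dropped in favor of the original vector, so no new shuffle node is created. That is the property that makes this form safer than the general shuffle-of-shuffle fold removed in point 1.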
      
      llvm-svn: 154266
    • Duncan Sands authored · 5f8397a9
      Convert floating point division by a constant into multiplication by the
      reciprocal when the reciprocal is exact. Under -ffast-math, do it even
      when the reciprocal is inexact. This substantially speeds up ac.f90 from
      the polyhedron benchmarks.
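
As a sketch of when the reciprocal is exact: for binary floating point, `x / c` and `x * (1/c)` agree for every `x` when `c` is a power of two and `1/c` is a normal number. The helper names below are hypothetical and this is not the LLVM implementation. The -ffast-math path, where inexact reciprocals are also accepted, is not modeled.

```cpp
#include <cmath>

// Writes 1/c to `reciprocal` and returns true when c is a finite, nonzero
// power of two whose reciprocal is a normal double. Returns false otherwise.
bool exactReciprocal(double c, double &reciprocal) {
  if (!std::isfinite(c) || c == 0.0)
    return false;
  int exponent = 0;
  double mantissa = std::frexp(c, &exponent);  // c = mantissa * 2^exponent
  if (std::fabs(mantissa) != 0.5)
    return false;                              // not a power of two
  double r = 1.0 / c;
  if (!std::isnormal(r))
    return false;                              // overflowed or denormal
  reciprocal = r;
  return true;
}

// Divides x by c, using a multiplication when that cannot change the result.
double divideByConstant(double x, double c) {
  double r = 0.0;
  if (exactReciprocal(c, r))
    return x * r;
  return x / c;
}
```

For example, dividing by 4.0 can become multiplying by 0.25, while dividing by 3.0 stays a division because 1/3 has no exact binary representation.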
      
      llvm-svn: 154265
    • Chandler Carruth authored · 75a1cf32
      Perform partial SROA on the helper hashing structure. I really wish the
      optimizers could do this for us, but expecting partial SROA of classes
      with template methods through cloning is probably expecting too much
      heroics. With this change, the begin/end pointer pairs which indicate
      the status of each loop iteration are actually passed directly into each
      layer of the combine_data calls, and the inliner has a chance to see
      when most of the combine_data function could be deleted by inlining.
      Similarly for 'length'.
      
      We have to be careful to limit the places where in/out reference
      parameters are used, as those will also prevent the inliner and
      optimizers from properly propagating constants.
      
      With this change, LLVM is able to fully inline and unroll the hash
      computation of small sets of values, such as two or three pointers.
      These now decompose into essentially straight-line code with no loops or
      function calls.
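
The general idea can be sketched in a few lines. This is a simplified illustration under assumed names (`store_value`, `pack_two`), not the hashing code itself: the buffer cursor is passed by value and the updated cursor is returned, so once the calls are inlined the cursor is an ordinary local value instead of state reached through a struct member or an in/out reference.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical simplification, not the code in LLVM's hashing header.
// Copies `value` into the buffer if it fits and returns the advanced cursor;
// returns the cursor unchanged when there is not enough room.
template <typename T>
char *store_value(char *cursor, char *end, const T &value) {
  if (static_cast<std::size_t>(end - cursor) < sizeof(T))
    return cursor;
  std::memcpy(cursor, &value, sizeof(T));
  return cursor + sizeof(T);
}

// Packs two values into a caller-provided buffer and returns the number of
// bytes used. The cursor lives in a local variable the whole time.
template <typename A, typename B>
std::size_t pack_two(const A &a, const B &b, char (&buffer)[64]) {
  char *cursor = buffer;
  char *const end = buffer + sizeof(buffer);
  cursor = store_value(cursor, end, a);
  cursor = store_value(cursor, end, b);
  return static_cast<std::size_t>(cursor - buffer);
}
```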
      
      There is still one code quality problem to be solved with the hashing --
      LLVM is failing to nuke the alloca. It removes all loads from the
      alloca, leaving only lifetime intrinsics and dead(!!) stores to the
      alloca. =/ Very unfortunate.
      
      llvm-svn: 154264