  1. Aug 13, 2012
  2. Aug 12, 2012
  3. Aug 10, 2012
  4. Aug 08, 2012
  5. Aug 06, 2012
  6. Aug 05, 2012
  7. Aug 03, 2012
    • Fall back to selection DAG isel for calls to builtin functions. · 3e6fa462
      Bob Wilson authored
      Fast isel doesn't currently have support for translating builtin function
      calls to target instructions.  For embedded environments where the library
      functions are not available, this is a matter of correctness and not
      just optimization.  Most of this patch is just arranging to make the
      TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>
      
      llvm-svn: 161232
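      As a rough illustration of the correctness issue above (the example is my
      own assumption, not taken from the commit): a call the compiler recognizes
      as a builtin, such as __builtin_memcpy, has to be expanded to target
      instructions in a freestanding/embedded build where no C library is
      linked, so fast isel now defers such calls to SelectionDAG isel.
      
      /* Hypothetical example, not from the commit: in a -ffreestanding build
         there is no libc memcpy to call, so the builtin must be lowered to
         target instructions rather than to a library call. */
      struct packet { char payload[16]; };
      
      void copy_packet(struct packet *dst, const struct packet *src) {
          __builtin_memcpy(dst, src, sizeof *dst);
      }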
  8. Aug 01, 2012
  9. Jul 25, 2012
  10. Jul 23, 2012
  11. Jul 17, 2012
  12. Jul 16, 2012
    • For something like · 75315b87
      Evan Cheng authored
      #include <stdint.h>
      
      uint32_t hi(uint64_t res)
      {
              uint32_t hi = res >> 32;
              return !hi;
      }
      
      llvm IR looks like this:
      define i32 @hi(i64 %res) nounwind uwtable ssp {
      entry:
        %lnot = icmp ult i64 %res, 4294967296
        %lnot.ext = zext i1 %lnot to i32
        ret i32 %lnot.ext
      }
      
      The optimizer has optimized away the right shift and the truncate, but
      the resulting constant is too large to fit in the 32-bit immediate
      field, so the x86 code is worse as a result:
              movabsq $4294967296, %rax       ## imm = 0x100000000
              cmpq    %rax, %rdi
              sbbl    %eax, %eax
              andl    $1, %eax
      
      This patch teaches the x86 lowering code to handle ult against a large immediate
      with trailing zeros. It will issue a right shift and a truncate followed by
      a comparison against a shifted immediate.
              shrq    $32, %rdi
              testl   %edi, %edi
              sete    %al
              movzbl  %al, %eax
      
      It also handles a ugt comparison against a large immediate with trailing
      bits set, i.e. X > 0x0ffffffff -> (X >> 32) >= 1 (a C sketch of this
      case follows this entry).
      
      rdar://11866926
      
      llvm-svn: 160312
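      A minimal C source for the ugt case mentioned above (my own sketch,
      assuming the same 64-bit comparison shape as the ult example in the
      commit message):
      
      #include <stdint.h>
      
      /* Compares against the large immediate 0x0ffffffff; with this patch the
         backend can instead emit a 32-bit right shift and compare the
         truncated value against the shifted immediate. */
      uint32_t gt_u32max(uint64_t x) {
          return x > 0x0ffffffffULL;   /* X > 0x0ffffffff -> (X >> 32) >= 1 */
      }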
  13. Jul 15, 2012
  14. Jul 12, 2012
  15. Jul 11, 2012
    • Nadav Rotem authored · d2bdcebb
      When ext-loading and trunc-storing vectors to memory on 32-bit x86
      systems, allow loads/stores of 64-bit values from xmm registers (a
      sketch of the affected pattern follows this entry).
      
      llvm-svn: 160044
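      A small sketch of the kind of code this affects (an assumption on my
      part, written with Clang's vector extensions; the commit shows no
      source): truncating a vector of 64-bit lanes to 32-bit lanes and storing
      the resulting 64-bit value, which can now stay in an xmm register on
      32-bit x86.
      
      typedef unsigned long long v2u64 __attribute__((vector_size(16)));
      typedef unsigned int       v2u32 __attribute__((vector_size(8)));
      
      /* Trunc-store: each 64-bit lane is narrowed to 32 bits and the
         resulting 8-byte vector is stored to memory. */
      void trunc_store(v2u32 *dst, v2u64 src) {
          *dst = __builtin_convertvector(src, v2u32);
      }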
  16. Jul 10, 2012
    • Nadav Rotem authored · d908ddc1
      Improve the loading of load-anyext vectors by allowing the codegen to load
      multiple scalars and insert them into a vector. Next, we shuffle the elements
      into the correct places, as before; a sketch follows this entry.
      Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, where the
      migration of bitcasts happened too late in the SelectionDAG process.
      
      llvm-svn: 159991
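      For illustration only (a hypothetical source pattern, again using
      Clang's vector extensions): an extending vector load, where narrow
      elements are loaded and widened, which is now lowered as scalar loads
      plus inserts followed by a shuffle.
      
      typedef unsigned char v4u8  __attribute__((vector_size(4)));
      typedef unsigned int  v4u32 __attribute__((vector_size(16)));
      
      /* Ext-load: four 8-bit elements are loaded from memory and widened
         to 32 bits each. */
      v4u32 load_and_widen(const v4u8 *src) {
          return __builtin_convertvector(*src, v4u32);
      }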
  17. Jul 05, 2012
  18. Jul 04, 2012
    • Ensure CopyToReg nodes are always glued to the call instruction. · 2dee8124
      Jakob Stoklund Olesen authored
      The CopyToReg nodes that set up the argument registers before a call
      must be glued to the call instruction. Otherwise, the scheduler may emit
      the physreg copies long before the call, causing long live ranges for
      the fixed registers.
      
      Besides disabling good register allocation, that can also expose
      problems when EmitInstrWithCustomInserter() splits a basic block during
      the live range of a physreg.
      
      llvm-svn: 159721
  19. Jul 01, 2012
  20. Jun 29, 2012
  21. Jun 26, 2012
    • Removed unused variable · 863d2d32
      Elena Demikhovsky authored
      llvm-svn: 159197
    • Rename to match other X86_64* names. · 8ed44466
      Bill Wendling authored
      llvm-svn: 159196
    • Shuffle optimization for AVX/AVX2. · 26088d2e
      Elena Demikhovsky authored
      The current patch optimizes frequently used shuffle patterns and yields the
      instruction-sequence reduction shown below (a plausible source pattern is
      sketched after this entry).
      Before:
             vshufps $-35, %xmm1, %xmm0, %xmm2  ## xmm2 = xmm0[1,3],xmm1[1,3]
             vpermilps $-40, %xmm2, %xmm2       ## xmm2 = xmm2[0,2,1,3]
             vextractf128 $1, %ymm1, %xmm1
             vextractf128 $1, %ymm0, %xmm0
             vshufps $-35, %xmm1, %xmm0, %xmm0  ## xmm0 = xmm0[1,3],xmm1[1,3]
             vpermilps $-40, %xmm0, %xmm0       ## xmm0 = xmm0[0,2,1,3]
             vinsertf128 $1, %xmm0, %ymm2, %ymm0
      After:
             vshufps $13, %ymm0, %ymm1, %ymm1   ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
             vshufps $13, %ymm0, %ymm0, %ymm0   ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
             vunpcklps %ymm1, %ymm0, %ymm0      ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]
      
      llvm-svn: 159188
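      One plausible source pattern for the sequence above (my guess from the
      shuffle masks, not taken from the commit) is interleaving the odd lanes
      of two 8 x float AVX vectors, written here with Clang's
      __builtin_shufflevector:
      
      typedef float v8f32 __attribute__((vector_size(32)));
      
      /* Interleave the odd lanes of a and b:
         result = { a[1], b[1], a[3], b[3], a[5], b[5], a[7], b[7] } */
      v8f32 interleave_odd(v8f32 a, v8f32 b) {
          return __builtin_shufflevector(a, b, 1, 9, 3, 11, 5, 13, 7, 15);
      }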