  1. Sep 26, 2012
  2. Sep 22, 2012
  3. Sep 21, 2012
  4. Sep 20, 2012
    • Re-work X86 code generation of atomic ops with spin-loop · 3237662b
      Michael Liao authored
      - Rewrite/merge pseudo-atomic instruction emitters to address the
        following issues:
        * Reduce one unnecessary load in spin-loop
      
          Previously, the spin-loop looked like:
      
              thisMBB:
              newMBB:
                ld  t1 = [bitinstr.addr]
                op  t2 = t1, [bitinstr.val]
                not t3 = t2  (if Invert)
                mov EAX = t1
                lcs dest = [bitinstr.addr], t3  [EAX is implicit]
                bz  newMBB
                fallthrough -->nextMBB
      
          The 'ld' at the beginning of newMBB should be lifted out of the loop,
          as lcs (or CMPXCHG on x86) will load the current memory value into
          EAX. The loop is refined as:
      
              thisMBB:
                EAX = LOAD [MI.addr]
              mainMBB:
                t1 = OP [MI.val], EAX
                LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined]
                JNE mainMBB
              sinkMBB:
      
        * Remove immopc as, so far, all pseudo-atomic instructions have
          all-register form only; there is no immediate operand.
      
        * Remove unnecessary attributes/modifiers in pseudo-atomic instruction
          td
      
        * Fix issues in PR13458
      
      - Add comprehensive tests on atomic ops on various data types.
        NOTE: Some of them are turned off due to missing functionality.
      
      - Revise tests due to the new spin-loop generated.
      
      llvm-svn: 164281
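The refined loop shape in the commit above (one load before the loop, then a compare-exchange that refreshes the expected value on failure) can be sketched at the source level. This is only an illustration in C++ using `std::atomic`, not the code LLVM emits; the helper name `atomic_fetch_nand` and the choice of a NAND operation are assumptions made for the example.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: atomically replaces the stored value with
// ~(old & val) and returns the old value.
std::uint32_t atomic_fetch_nand(std::atomic<std::uint32_t>& addr,
                                std::uint32_t val) {
    // thisMBB: a single load before the loop (EAX = LOAD [MI.addr]).
    std::uint32_t expected = addr.load(std::memory_order_relaxed);
    std::uint32_t desired;
    do {
        // mainMBB: compute the new value from the last observed value.
        desired = ~(expected & val);
        // On failure, compare_exchange_weak stores the current memory
        // value into `expected`, so the loop body never reloads it. This
        // plays the role of CMPXCHG implicitly redefining EAX.
    } while (!addr.compare_exchange_weak(expected, desired,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed));
    // sinkMBB: `expected` now holds the value that was replaced.
    return expected;
}
```

Calling `atomic_fetch_nand(a, 0x3C)` on an atomic holding `0xF0` should return `0xF0` and leave `~(0xF0 & 0x3C)` in `a`.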
  5. Sep 13, 2012
  6. Jun 01, 2012
    • Implement the local-dynamic TLS model for x86 (PR3985) · 789acfb6
      Hans Wennborg authored
      This implements codegen support for accesses to thread-local variables
      using the local-dynamic model, and adds a clean-up pass so that the base
      address for the TLS block can be re-used between local-dynamic accesses on
      an execution path.
      
      llvm-svn: 157818
  7. May 09, 2012
  8. May 07, 2012
    • X86: optimization for -(x != 0) · ef4e0479
      Manman Ren authored
      This patch will optimize -(x != 0) on X86
      FROM 
      cmpl	$0x01,%edi
      sbbl	%eax,%eax
      notl	%eax
      TO
      negl %edi
      sbbl %eax,%eax
      
      In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
      def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;
      
      rdar: 10961709
      llvm-svn: 156312
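Why the two instruction sequences above agree can be checked with a small model of the carry flag. The sketch below is an illustration only: the function names are hypothetical, and each function models the flag behaviour of the named instructions (NEG sets CF when its operand is non-zero; CMP sets CF on an unsigned borrow) rather than executing them.

```cpp
#include <cstdint>

// Reference value: 0 when x == 0, all ones otherwise.
std::uint32_t neg_of_nonzero(std::uint32_t x) {
    return 0u - static_cast<std::uint32_t>(x != 0);
}

// Models the old sequence: cmpl $1,%edi ; sbbl %eax,%eax ; notl %eax
std::uint32_t cmp_sbb_not_model(std::uint32_t x) {
    std::uint32_t carry = (x < 1u) ? 1u : 0u;  // CMP computes x - 1; CF = borrow
    std::uint32_t eax = 0u - carry;            // SBB eax,eax = eax - eax - CF
    return ~eax;                               // NOT flips every bit
}

// Models the new sequence: negl %edi ; sbbl %eax,%eax
std::uint32_t neg_sbb_model(std::uint32_t x) {
    std::uint32_t carry = (x != 0u) ? 1u : 0u; // NEG sets CF iff operand != 0
    return 0u - carry;                         // SBB eax,eax = -CF
}
```

Both models return the same value as `neg_of_nonzero` for every input, and the new sequence gets there without the trailing `notl`.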
  9. Apr 04, 2012
    • Always compute all the bits in ComputeMaskedBits. · ba0a6cab
      Rafael Espindola authored
      This allows us to keep passing reduced masks to SimplifyDemandedBits, but
      know about all the bits if SimplifyDemandedBits fails. This allows instcombine
      to simplify cases like the one in the included testcase.
      
      llvm-svn: 154011
  10. Mar 29, 2012
  11. Mar 19, 2012
  12. Feb 24, 2012
  13. Feb 16, 2012
  14. Jan 16, 2012
  15. Jan 12, 2012
  16. Dec 24, 2011
    • Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the X86ISelLowering C++ code · 7e9453e9
      Chandler Carruth authored
      Because this is lowered via an xor wrapped
      around a bsr, we want the dagcombine which runs after isel lowering to
      have a chance to clean things up. In particular, it is very common to
      see code which looks like:
      
        (sizeof(x)*8 - 1) ^ __builtin_clz(x)
      
      Which is trying to compute the most significant bit of 'x'. That's
      actually the value computed directly by the 'bsr' instruction, but if we
      match it too late, we'll get completely redundant xor instructions.
      
      The more naive code for the above (subtracting rather than using an xor)
      still isn't handled correctly due to the dagcombine getting confused.
      
      Also, while here fix an issue spotted by inspection: we should have been
      expanding the zero-undef variants to the normal variants when there is
      an 'lzcnt' instruction. Do so, and test for this. We don't want to
      generate unnecessary 'bsr' instructions.
      
      These two changes fix some regressions in encoding and decoding
      benchmarks. However, there is still a *lot* to be improved on in this
      type of code.
      
      llvm-svn: 147244
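The source-level idiom quoted in the commit above can be compared against a plain loop to see that it yields the index of the most significant set bit, which is the value `bsr` produces. This is a minimal sketch assuming a GCC- or Clang-compatible compiler (for `__builtin_clz`); the function names are hypothetical.

```cpp
#include <cassert>
#include <climits>

// The idiom from the commit message. Requires x != 0, because
// __builtin_clz(0) is undefined.
unsigned msb_index_xor(unsigned x) {
    assert(x != 0);
    const unsigned top = static_cast<unsigned>(sizeof(x) * CHAR_BIT - 1);
    return top ^ static_cast<unsigned>(__builtin_clz(x));
}

// Reference implementation: shift right until nothing is left.
unsigned msb_index_loop(unsigned x) {
    assert(x != 0);
    unsigned index = 0;
    while (x >>= 1) {
        ++index;
    }
    return index;
}
```

The xor works because the bit width minus one is all ones in its low bits, so xor-ing it with a leading-zero count in that range gives the same result as subtracting, which is the "more naive" form the commit mentions.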
  17. Dec 20, 2011
    • Begin teaching the X86 target how to efficiently codegen patterns that use the zero-undefined variants of CTTZ and CTLZ · 24680c24
      Chandler Carruth authored
      These are just simple
      patterns for now; there is more to be done to make real world code using
      these constructs be optimized and codegen'ed properly on X86.
      
      The existing tests are spiffed up to check that we no longer generate
      unnecessary cmov instructions, and that we generate the very important
      'xor' to transform bsr which counts the index of the most significant
      one bit to the number of leading (most significant) zero bits. Also they
      now check that when the variant with defined zero result is used, the
      cmov is still produced.
      
      llvm-svn: 146974
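The difference between the two variants discussed above shows up in source code as an extra guard for a zero input. The sketch below assumes a GCC- or Clang-compatible compiler; the function names are hypothetical, and the comments describe the general shape of the generated code as the commit message explains it, not verified compiler output.

```cpp
#include <cassert>
#include <climits>

// Zero-undefined variant: the caller promises x != 0, so no guard is
// needed. Per the commit, on x86 without lzcnt this becomes a bsr plus
// an xor.
unsigned ctlz_zero_undef(unsigned x) {
    assert(x != 0);
    return static_cast<unsigned>(__builtin_clz(x));
}

// Defined-at-zero variant: returns the full bit width for x == 0. This
// select is what the commit says still requires a cmov.
unsigned ctlz_defined(unsigned x) {
    const unsigned width = static_cast<unsigned>(sizeof(x) * CHAR_BIT);
    return x == 0 ? width : static_cast<unsigned>(__builtin_clz(x));
}
```

Code that has already ruled out zero can call the first form and avoid the extra select.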
  18. Oct 26, 2011
  19. Sep 13, 2011
  20. Sep 07, 2011
  21. Sep 03, 2011
    • Pseudo CMOV instructions don't clobber EFLAGS. · 1f72dd40
      Jakob Stoklund Olesen authored
      The explanation about a 0 argument being materialized as xor is no
      longer valid.  Rematerialization will check if EFLAGS is live before
      clobbering it.
      
      The code produced by X86TargetLowering::EmitLoweredSelect does not
      clobber EFLAGS.
      
      This causes one less testb instruction to be generated in the cmov.ll
      test case.
      
      llvm-svn: 139057
  22. Aug 30, 2011
  23. Aug 26, 2011
  24. Aug 24, 2011
  25. Aug 10, 2011
  26. Jul 27, 2011
  27. Jun 16, 2011
  28. May 21, 2011
  29. May 20, 2011
  30. May 19, 2011
  31. May 17, 2011
  32. May 11, 2011