  1. Sep 20, 2012
    • Re-work X86 code generation of atomic ops with spin-loop · 3237662b
      Michael Liao authored
      - Rewrite/merge pseudo-atomic instruction emitters to address the
        following issues:
        * Reduce one unnecessary load in the spin-loop (see the C++
          sketch after this entry).
      
          Previously, the spin-loop looked like:
      
              thisMBB:
              newMBB:
                ld  t1 = [bitinstr.addr]
                op  t2 = t1, [bitinstr.val]
                not t3 = t2  (if Invert)
                mov EAX = t1
                lcs dest = [bitinstr.addr], t3  [EAX is implicit]
                bz  newMBB
                fallthrough -->nextMBB
      
          the 'ld' at the beginning of newMBB can be lifted out of the
          loop, since lcs (CMPXCHG on x86) already loads the current
          memory value into EAX. The loop is therefore refined to:
      
              thisMBB:
                EAX = LOAD [MI.addr]
              mainMBB:
                t1 = OP [MI.val], EAX
                LCMPXCHG [MI.addr], t1, [EAX is implicitly used & defined]
                JNE mainMBB
              sinkMBB:
      
        * Remove immopc: so far, all pseudo-atomic instructions have an
          all-register form only, so there is no immediate operand.
      
        * Remove unnecessary attributes/modifiers in the pseudo-atomic
          instruction .td definitions.
      
        * Fix issues in PR13458
      
      - Add comprehensive tests of atomic ops on various data types.
        NOTE: some of them are disabled due to missing functionality.
      
      - Revise existing tests to match the newly generated spin-loop.
      
      llvm-svn: 164281
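      A minimal C++ sketch of the refined loop, assuming an atomic OR as
      the operation (the function and names here are illustrative, not
      from the commit): compare_exchange_weak refreshes the expected
      value with the current memory value when it fails, just as CMPXCHG
      leaves that value in EAX, which is why the explicit reload can be
      hoisted out of the loop.

          #include <atomic>
          #include <cstdint>

          uint32_t atomic_or(std::atomic<uint32_t> &mem, uint32_t val) {
            uint32_t seen = mem.load();   // thisMBB: EAX = LOAD [MI.addr]
            uint32_t next;
            do {
              next = seen | val;          // mainMBB: t1 = OP [MI.val], EAX
            } while (!mem.compare_exchange_weak(seen, next));
                                          // LCMPXCHG; JNE mainMBB
            return seen;                  // sinkMBB: old value
          }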
  2. Aug 02, 2012
  3. Jul 29, 2012
  4. Jul 28, 2012
  5. Jul 06, 2012
    • X86: peephole optimization to remove cmp instruction · c9656737
      Manman Ren authored
      For each CMP, we check whether there is an earlier SUB that makes
      the CMP redundant. We handle the case where the SUB operates on the
      same source operands as the CMP, including the case where the two
      source operands are swapped.
      
      llvm-svn: 159838
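      A hedged source-level illustration (not taken from the commit): the
      SUB already sets EFLAGS for a - b, so a following CMP on the same
      source operands is redundant and its flags can be reused.

          #include <cstdint>

          int64_t sub_then_select(int64_t a, int64_t b, int64_t &diff) {
            diff = a - b;            // SUB sets SF/ZF/OF/CF for a - b
            return (a < b) ? b : a;  // CMP a, b would compute the same flags
          }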
  6. Jul 04, 2012
  7. Jun 23, 2012
  8. Jun 07, 2012
    • Revert r157755. · 9c964181
      Manman Ren authored
      The reverted commit was intended to fix rdar://11540023 and was
      implemented as part of the peephole optimization. We can instead
      implement this in the SelectionDAG lowering phase.
      
      llvm-svn: 158122
  9. Jun 03, 2012
  10. Jun 01, 2012
  11. May 31, 2012
    • X86: replace SUB with CMP if possible · 9bccb64e
      Manman Ren authored
      This patch optimizes the following
              movq    %rdi, %rax
              subq    %rsi, %rax
              cmovsq  %rsi, %rdi
              movq    %rdi, %rax
      to
              cmpq    %rsi, %rdi
              cmovsq  %rsi, %rdi
              movq    %rdi, %rax
      
      This optimization is performed only if the actual result of the SUB
      is not used (see the source-level sketch after this entry).
      
      rdar://11540023
      llvm-svn: 157755
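      A hypothetical source function that can lower to the pattern above
      (illustrative only, ignoring signed-overflow subtleties): the value
      of a - b is consumed only through its sign, so the SUB can become a
      CMP, which sets the same flags without needing a destination
      register.

          #include <cstdint>

          int64_t max64(int64_t a, int64_t b) {
            return (a - b < 0) ? b : a;  // only the sign of a - b is used
          }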
    • Added FMA3 Intel instructions. · 602f3a26
      Elena Demikhovsky authored
      I disabled FMA3 autodetection, since the results may differ from
      the expected values for some benchmarks.
      I added tests for CodeGen and intrinsics.
      I did not change llvm.fma.f32/64 - it may be done later.
      
      llvm-svn: 157737
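      For context (illustrative, not from the commit): FMA3 instructions
      such as vfmadd213ss compute a*b + c as a fused operation with a
      single rounding step; std::fma requests the same semantics at the
      source level.

          #include <cmath>
          #include <cstdio>

          int main() {
            double r = std::fma(2.0, 3.0, 1.0);  // 2*3 + 1 = 7, rounded once
            std::printf("%g\n", r);
            return 0;
          }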
  12. Mar 17, 2012
  13. Feb 18, 2012
  14. Nov 15, 2011
    • Break false dependencies before partial register updates. · f8ad336b
      Jakob Stoklund Olesen authored
      Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix
      about instructions with partial register updates that cause
      unwanted false dependencies.
      
      The ExecutionDepsFix pass will break the false dependencies if the
      updated register was written in the previous N instructions.
      
      The small loop added to sse-domains.ll runs twice as fast with
      dependency-breaking instructions inserted.
      
      llvm-svn: 144602
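      A hedged example of the problem being addressed (assumed, not from
      the commit): CVTSI2SS writes only part of its XMM destination, so
      each convert below would otherwise carry a false dependency on the
      register's previous value; the pass breaks it with a
      dependency-breaking instruction (e.g. a zeroing idiom) when that
      register was written recently.

          #include <cstddef>

          void int_to_float(const int *in, float *out, std::size_t n) {
            for (std::size_t i = 0; i != n; ++i)
              out[i] = static_cast<float>(in[i]);  // lowers to CVTSI2SS
          }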
  15. Sep 29, 2011
  16. Sep 28, 2011
  17. Sep 08, 2011
    • * Combines Alignment, AuxInfo, and TB_NOT_REVERSABLE flag into a · 23eb5265
      Bruno Cardoso Lopes authored
      single field (Flags), which is a bitwise OR of items from the TB_*
      enum (see the sketch after this entry). This makes it easier to add
      new information in the future.
      
      * Gives every static array an equivalent layout: { RegOp, MemOp, Flags }
      
      * Adds a helper function, AddTableEntry, to avoid duplication of the
      insertion code.
      
      * Renames TB_NOT_REVERSABLE to TB_NO_REVERSE.
      
      * Adds TB_NO_FORWARD, which is analogous to TB_NO_REVERSE, except that
      it prevents addition of the Reg->Mem entry. (This is going to be used
      by Native Client, in the next CL).
      
      Patch by David Meyer
      
      llvm-svn: 139311
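      A sketch of the layout described above; the TB_* names and the
      { RegOp, MemOp, Flags } shape come from the commit message, but the
      exact values and definitions in LLVM may differ.

          #include <cstdint>

          enum : uint16_t {
            TB_NO_REVERSE = 1 << 0,  // suppress the Mem->Reg (unfold) entry
            TB_NO_FORWARD = 1 << 1,  // suppress the Reg->Mem (fold) entry
          };

          struct TableEntry {
            unsigned RegOp;  // register-form opcode
            unsigned MemOp;  // memory-form opcode
            uint16_t Flags;  // bitwise OR of TB_* items
          };

          // Every static array shares the { RegOp, MemOp, Flags } layout:
          static const TableEntry Table[] = {
              {/*RegOp=*/100, /*MemOp=*/200, TB_NO_REVERSE},
          };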
  18. Aug 08, 2011
  19. Jul 25, 2011
  20. Jul 01, 2011
  21. May 25, 2011
  22. Apr 15, 2011
  23. Apr 04, 2011
  24. Mar 05, 2011
    • Increased the register pressure limit on x86_64 from 8 to 12 · 641e2d4f
      Andrew Trick authored
      regs. This is the only change in this checkin that may affect the
      default scheduler. With better register tracking and heuristics, it
      doesn't make sense to artificially lower the register limit so much.
      
      Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef
      to give the scheduler a way to account for div and sqrt on targets
      that don't have an itinerary. It currently defaults to 10 (the
      actual number doesn't matter much), but only takes effect on the
      non-default schedulers: list-hybrid and list-ilp.
      
      Added several heuristics that can be individually disabled for the
      non-default sched=list-ilp mode. This helps us determine how much
      better we can do on a given benchmark than the default
      scheduler. Certain compute-intensive loops run much faster in this
      mode with the right set of heuristics, and it doesn't seem to have
      much negative impact elsewhere. Not all of the heuristics are needed,
      but we still need to experiment to decide which should be disabled by
      default for sched=list-ilp.
      
      llvm-svn: 127067
    • whitespace · 27c079e1
      Andrew Trick authored
      llvm-svn: 127065
  25. Feb 22, 2011
  26. Nov 28, 2010
  27. Nov 15, 2010
  28. Oct 19, 2010
  29. Oct 08, 2010
  30. Oct 03, 2010
  31. Sep 17, 2010