  1. Jun 03, 2012
  2. Jun 01, 2012
  3. May 31, 2012
    • X86: replace SUB with CMP if possible · 9bccb64e
      Manman Ren authored
      This patch will optimize the following
              movq    %rdi, %rax
              subq    %rsi, %rax
              cmovsq  %rsi, %rdi
              movq    %rdi, %rax
      to
              cmpq    %rsi, %rdi
              cmovsq  %rsi, %rdi
              movq    %rdi, %rax
      
      Perform this optimization if the actual result of SUB is not used.
      
      rdar: 11540023
      llvm-svn: 157755
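
      A minimal standalone C++ sketch of the condition this patch checks -
      SUB can become CMP when its register result is dead but its EFLAGS
      result is still read. The Inst model and chooseOpcode helper are
      illustrative, not LLVM's actual API:

              // Sketch only: models the rewrite condition, not X86InstrInfo.
              #include <iostream>
              #include <string>

              struct Inst {
                std::string Opcode; // e.g. "subq"
                bool ResultIsUsed;  // is the register result read later?
                bool FlagsAreUsed;  // is EFLAGS read later (e.g. by cmovsq)?
              };

              // Emit CMP when only the flags matter; CMP sets the same flags
              // as SUB but clobbers no destination register.
              std::string chooseOpcode(const Inst &I) {
                if (I.Opcode == "subq" && !I.ResultIsUsed && I.FlagsAreUsed)
                  return "cmpq";
                return I.Opcode;
              }

              int main() {
                Inst Sub{"subq", /*ResultIsUsed=*/false, /*FlagsAreUsed=*/true};
                std::cout << chooseOpcode(Sub) << "\n"; // prints "cmpq"
              }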
    • Added FMA3 Intel instructions. · 602f3a26
      Elena Demikhovsky authored
      I disabled FMA3 autodetection, since the results may differ from the expected values for some benchmarks.
      I added tests for CodeGen and intrinsics.
      I did not change llvm.fma.f32/64 - that may be done later.
      
      llvm-svn: 157737
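
      For context, FMA3 instructions implement a*b + c with a single
      rounding. As a hedged illustration (unrelated to the autodetection
      change), a compiler targeting an FMA3-capable x86 CPU may lower
      std::fma to one vfmadd instruction:

              #include <cmath>
              #include <cstdio>

              int main() {
                double a = 1.5, b = 2.0, c = 0.25;
                // Fused multiply-add: a*b + c rounded once; FMA3 hardware
                // can perform this as a single instruction.
                std::printf("%f\n", std::fma(a, b, c)); // 3.250000
              }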
  4. Mar 17, 2012
  5. Feb 18, 2012
  6. Nov 15, 2011
    • Break false dependencies before partial register updates. · f8ad336b
      Jakob Stoklund Olesen authored
      Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix
      about instructions whose partial register updates cause false
      dependencies.
      
      The ExecutionDepsFix pass will break the false dependencies if the
      updated register was written in the previous N instructions.
      
      The small loop added to sse-domains.ll runs twice as fast with
      dependency-breaking instructions inserted.
      
      llvm-svn: 144602
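
      A toy sketch of the clearance idea - not the real ExecutionDepsFix
      code; the register model and the threshold value are invented. An
      instruction that only partially updates its destination implicitly
      depends on the old value, so if that register was defined within the
      last N instructions, a dependency-breaking full def (e.g. xorps of
      the register with itself) is inserted first:

              #include <cstdio>
              #include <vector>

              struct Inst {
                int DefReg;         // register written by this instruction
                bool PartialUpdate; // true for e.g. cvtsi2ss, which merges
                                    // into its destination
              };

              const int Clearance = 16; // hypothetical "previous N" threshold

              int main() {
                std::vector<Inst> Prog = {
                    {0, false}, // full def of reg 0
                    {1, false},
                    {0, true},  // partial update of reg 0 soon after its def
                };
                std::vector<int> LastDef(8, -1000); // last write index per reg
                for (int i = 0; i < (int)Prog.size(); ++i) {
                  if (Prog[i].PartialUpdate &&
                      i - LastDef[Prog[i].DefReg] < Clearance)
                    std::printf("break dependency on reg %d before inst %d\n",
                                Prog[i].DefReg, i);
                  LastDef[Prog[i].DefReg] = i;
                }
              }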
  7. Sep 29, 2011
  8. Sep 28, 2011
  9. Sep 08, 2011
    • * Combines Alignment, AuxInfo, and TB_NOT_REVERSABLE flag into a · 23eb5265
      Bruno Cardoso Lopes authored
      single field (Flags), which is a bitwise OR of items from the TB_*
      enum. This makes it easier to add new information in the future.
      
      * Gives every static array an equivalent layout: { RegOp, MemOp, Flags }
      
      * Adds a helper function, AddTableEntry, to avoid duplication of the
      insertion code.
      
      * Renames TB_NOT_REVERSABLE to TB_NO_REVERSE.
      
      * Adds TB_NO_FORWARD, which is analogous to TB_NO_REVERSE, except that
      it prevents addition of the Reg->Mem entry. (This is going to be used
      by Native Client, in the next CL).
      
      Patch by David Meyer
      
      llvm-svn: 139311
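
      An illustrative sketch of the layout described above. The
      { RegOp, MemOp, Flags } shape, the bitwise-OR Flags field, and the
      TB_NO_FORWARD/TB_NO_REVERSE behavior follow the commit message; the
      enum encodings and opcode types are invented:

              #include <cstdint>
              #include <map>

              enum {
                TB_NO_REVERSE = 1 << 0, // suppress the Mem->Reg (unfold) entry
                TB_NO_FORWARD = 1 << 1, // suppress the Reg->Mem (fold) entry
                TB_ALIGN_16   = 1 << 2, // example alignment flag (hypothetical)
              };

              struct FoldTableEntry {
                uint16_t RegOp;
                uint16_t MemOp;
                uint32_t Flags; // bitwise OR of TB_* values
              };

              // Mirrors the AddTableEntry idea: one insertion point that
              // honors the flags for both directions.
              void AddTableEntry(std::map<uint16_t, FoldTableEntry> &Reg2Mem,
                                 std::map<uint16_t, FoldTableEntry> &Mem2Reg,
                                 const FoldTableEntry &E) {
                if (!(E.Flags & TB_NO_FORWARD))
                  Reg2Mem[E.RegOp] = E;
                if (!(E.Flags & TB_NO_REVERSE))
                  Mem2Reg[E.MemOp] = E;
              }

              int main() {
                std::map<uint16_t, FoldTableEntry> Reg2Mem, Mem2Reg;
                AddTableEntry(Reg2Mem, Mem2Reg, {1, 2, TB_NO_REVERSE});
                return (int)Mem2Reg.count(2); // 0: reverse entry suppressed
              }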
  10. Aug 08, 2011
  11. Jul 25, 2011
  12. Jul 01, 2011
  13. May 25, 2011
  14. Apr 15, 2011
  15. Apr 04, 2011
  16. Mar 05, 2011
    • Increased the register pressure limit on x86_64 from 8 to 12 · 641e2d4f
      Andrew Trick authored
      regs. This is the only change in this checkin that may affect the
      default scheduler. With better register tracking and heuristics, it
      doesn't make sense to artificially lower the register limit so much.
      
      Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
      give the scheduler a way to account for div and sqrt on targets that
      don't have an itinerary. It currently defaults to 10 (the actual
      number doesn't matter much), but only takes effect on non-default
      schedulers: list-hybrid and list-ilp.
      
      Added several heuristics that can be individually disabled for the
      non-default sched=list-ilp mode. This helps us determine how much
      better we can do on a given benchmark than the default
      scheduler. Certain compute intensive loops run much faster in this
      mode with the right set of heuristics, and it doesn't seem to have
      much negative impact elsewhere. Not all of the heuristics are needed,
      but we still need to experiment to decide which should be disabled by
      default for sched=list-ilp.
      
      llvm-svn: 127067
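
      A standalone illustration of the idea behind -sched-high-latency-cycles
      and isHighLatencyDef, using invented string opcodes (the real hook
      inspects machine opcodes): div and sqrt defs get a flat high latency,
      defaulting to 10 cycles per the message:

              #include <string>

              const unsigned HighLatencyCycles = 10; // -sched-high-latency-cycles

              // Divides and square roots dominate latency on targets
              // without an itinerary.
              bool isHighLatencyDef(const std::string &Opcode) {
                return Opcode == "div" || Opcode == "sqrt";
              }

              unsigned estimatedLatency(const std::string &Opcode) {
                return isHighLatencyDef(Opcode) ? HighLatencyCycles : 1;
              }

              int main() { return estimatedLatency("sqrt") == 10 ? 0 : 1; }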
    • whitespace · 27c079e1
      Andrew Trick authored
      llvm-svn: 127065
  17. Feb 22, 2011
  18. Nov 28, 2010
  19. Nov 15, 2010
  20. Oct 19, 2010
  21. Oct 08, 2010
  22. Oct 03, 2010
  23. Sep 17, 2010
  24. Sep 05, 2010
    • implement rdar://6653118 - fastisel should fold loads where possible. · eeba0c73
      Chris Lattner authored
      Since mem2reg isn't run at -O0, we get a ton of reloads from the stack.
      For example, this code:
      
      int foo(int x, int y, int z) {
        return x+y+z;
      }
      
      used to compile into:
      
      _foo:                                   ## @foo
      	subq	$12, %rsp
      	movl	%edi, 8(%rsp)
      	movl	%esi, 4(%rsp)
      	movl	%edx, (%rsp)
      	movl	8(%rsp), %edx
      	movl	4(%rsp), %esi
      	addl	%edx, %esi
      	movl	(%rsp), %edx
      	addl	%esi, %edx
      	movl	%edx, %eax
      	addq	$12, %rsp
      	ret
      
      Now we produce:
      
      _foo:                                   ## @foo
      	subq	$12, %rsp
      	movl	%edi, 8(%rsp)
      	movl	%esi, 4(%rsp)
      	movl	%edx, (%rsp)
      	movl	8(%rsp), %edx
      	addl	4(%rsp), %edx    ## Folded load
      	addl	(%rsp), %edx     ## Folded load
      	movl	%edx, %eax
      	addq	$12, %rsp
      	ret
      
      Fewer instructions and less register use = faster compiles.
      
      llvm-svn: 113102
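
      A toy model of the eligibility check - not FastISel itself. The point
      is that a stack reload whose value has exactly one use, by the
      immediately following instruction, need not occupy its own register;
      it can become the user's memory operand instead:

              #include <cstdio>

              struct LoadInfo {
                int NumUses;     // instructions reading the loaded value
                bool UserIsNext; // is the single user the next instruction?
              };

              bool canFoldLoad(const LoadInfo &L) {
                return L.NumUses == 1 && L.UserIsNext;
              }

              int main() {
                LoadInfo L{1, true};
                std::printf(canFoldLoad(L)
                                ? "addl 4(%%rsp), %%edx  ## Folded load\n"
                                : "separate movl reload + addl\n");
              }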
  25. Aug 26, 2010
  26. Aug 19, 2010
  27. Jul 22, 2010
  28. Jul 17, 2010
  29. Jul 13, 2010
  30. Jul 11, 2010
  31. Jul 09, 2010