  1. Apr 22, 2011
    • X86: Try to use a smaller encoding by transforming (X << C1) & C2 into (X & (C2 >> C1)) << C1. (Part of PR5039) · 4c816247
      Benjamin Kramer authored
      
      This tends to happen a lot with bitfield code generated by clang. A simple example for x86_64 is
      uint64_t foo(uint64_t x) { return (x&1) << 42; }
      which used to compile into bloated code:
      	shlq	$42, %rdi               ## encoding: [0x48,0xc1,0xe7,0x2a]
      	movabsq	$4398046511104, %rax    ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x00,0x04,0x00,0x00]
      	andq	%rdi, %rax              ## encoding: [0x48,0x21,0xf8]
      	ret                             ## encoding: [0xc3]
      
      With this patch we can fold the immediate into the 'and':
      	andq	$1, %rdi                ## encoding: [0x48,0x83,0xe7,0x01]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	shlq	$42, %rax               ## encoding: [0x48,0xc1,0xe0,0x2a]
      	ret                             ## encoding: [0xc3]
      
      It's possible to save another byte by using 'andl' instead of 'andq', but I currently see no way of doing
      that without making this code even more complicated. See the TODOs in the code.
      
      llvm-svn: 129990
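      
      A quick sanity check of the rewrite (a standalone C++ sketch, not part of
      the patch), using the constants from the example above: the two forms
      agree because the low C1 bits of (X << C1) are zero anyway, so clearing
      the low C1 bits of C2 changes nothing.
      
      	// Verify (X << C1) & C2 == (X & (C2 >> C1)) << C1 for C1 = 42,
      	// C2 = 1 << 42, over a few sample values of X.
      	#include <cassert>
      	#include <cstdint>
      	#include <initializer_list>
      	
      	int main() {
      	  const unsigned C1 = 42;
      	  const uint64_t C2 = 1ULL << 42;
      	  for (uint64_t x : {0ULL, 1ULL, 3ULL, 0xdeadbeefULL, ~0ULL}) {
      	    uint64_t wide   = (x << C1) & C2;          // shl, movabs, and
      	    uint64_t narrow = (x & (C2 >> C1)) << C1;  // and $imm8, shl
      	    assert(wide == narrow);
      	  }
      	}
      
      The extra byte mentioned above would come from dropping the REX.W prefix:
      32-bit operations zero-extend on x86_64, so 'andl $1, %edi' computes the
      same value as 'andq $1, %rdi' in one byte less.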
  2. Feb 13, 2011
      Enhance ComputeMaskedBits to know that aligned frameindexes · 46c01a30
      Chris Lattner authored
      have their low bits set to zero.  This allows us to optimize
      out explicit stack alignment code like in stack-align.ll:test4 when
      it is redundant.
      
      Doing this causes the code generator to start turning FI+cst into
      FI|cst all over the place, which is general goodness (that is the
      canonical form) except that various pieces of the code generator
      don't handle OR aggressively.  Fix this by introducing a new
      SelectionDAG::isBaseWithConstantOffset predicate, and using it
      in places that are looking for ADD(X,CST).  The ARM backend in
      particular was missing a lot of addressing mode folding opportunities
      around OR.
      
      llvm-svn: 125470
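      
      The reason the OR form is interchangeable (a standalone C++ sketch, not
      LLVM code): when the base is aligned so its low bits are known zero, and
      the constant offset fits entirely in those bits, no carries can occur, so
      OR and ADD compute the same address.
      
      	// An aligned base plus a small offset can be formed with OR,
      	// because the operands share no set bits.
      	#include <cassert>
      	#include <cstdint>
      	
      	int main() {
      	  const uint64_t base = 0x7ffffff0;  // 16-byte aligned: low 4 bits zero
      	  for (uint64_t cst = 0; cst < 16; ++cst) {
      	    assert((base & cst) == 0);           // offset fits in the zero bits
      	    assert((base | cst) == base + cst);  // so FI|cst == FI+cst
      	  }
      	}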
  3. Dec 05, 2010
    • It turns out that when ".with.overflow" intrinsics were added to the X86 · 364bb0a0
      Chris Lattner authored
      backend, they were all implemented except umul. This one fell back
      to the default implementation that did a hi/lo multiply and compared the
      top.  Fix this to check the overflow flag that the 'mul' instruction
      sets, so we can avoid an explicit test.  Now we compile:
      
      void *func(long count) {
            return new int[count];
      }
      
      into:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
      	testb	%cl, %cl                ## encoding: [0x84,0xc9]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      Other than the silly seto+test, this is using the overflow ('O') bit directly, so it's going in the right
      direction.
      
      llvm-svn: 120935
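      
      The two lowerings can be contrasted in standalone C++ (an illustrative
      sketch, assuming GCC/Clang's unsigned __int128 and __builtin_mul_overflow,
      neither of which is part of the patch): the old code widened the multiply
      and tested the high half; the new code consumes the flag that mulq
      already sets.
      
      	#include <cassert>
      	#include <cstdint>
      	
      	// Old lowering: hi/lo multiply, then compare the top half against
      	// zero (the extra "testq %rdx, %rdx").
      	static bool overflows_via_high_half(uint64_t a, uint64_t b, uint64_t &lo) {
      	  unsigned __int128 p = (unsigned __int128)a * b;
      	  lo = (uint64_t)p;
      	  return (uint64_t)(p >> 64) != 0;
      	}
      	
      	// New lowering: use the overflow flag the multiply already sets.
      	static bool overflows_via_flag(uint64_t a, uint64_t b, uint64_t &lo) {
      	  return __builtin_mul_overflow(a, b, &lo);
      	}
      	
      	int main() {
      	  uint64_t lo1, lo2;
      	  for (uint64_t a = 1; a != 0; a <<= 7) {   // sweep up through 2^63
      	    assert(overflows_via_high_half(a, 4, lo1) == overflows_via_flag(a, 4, lo2));
      	    assert(lo1 == lo2);
      	  }
      	}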