  1. Dec 19, 2010
    • reduce copy/paste programming with the power of for loops. · ae756e19
      Chris Lattner authored
      llvm-svn: 122187
    • X86 supports i8/i16 overflow ops (except i8 multiplies), we should · 1e8c032a
      Chris Lattner authored
      generate them.  
      
      Now we compile:
      
      define zeroext i8 @X(i8 signext %a, i8 signext %b) nounwind ssp {
      entry:
        %0 = tail call { i8, i1 } @llvm.sadd.with.overflow.i8(i8 %a, i8 %b)
        %cmp = extractvalue { i8, i1 } %0, 1
        br i1 %cmp, label %if.then, label %if.end
      
      into:
      
      _X:                                     ## @X
      ## BB#0:                                ## %entry
      	subl	$12, %esp
      	movb	16(%esp), %al
      	addb	20(%esp), %al
      	jo	LBB0_2
      
      Before we were generating:
      
      _X:                                     ## @X
      ## BB#0:                                ## %entry
      	pushl	%ebp
      	movl	%esp, %ebp
      	subl	$8, %esp
      	movb	12(%ebp), %al
      	testb	%al, %al
      	setge	%cl
      	movb	8(%ebp), %dl
      	testb	%dl, %dl
      	setge	%ah
      	cmpb	%cl, %ah
      	sete	%cl
      	addb	%al, %dl
      	testb	%dl, %dl
      	setge	%al
      	cmpb	%al, %ah
      	setne	%al
      	andb	%cl, %al
      	testb	%al, %al
      	jne	LBB0_2
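
      As a reconstruction (not part of the original commit, which shows only
      the IR): C source along the following lines, using the Clang/GCC
      __builtin_add_overflow builtin, produces the @llvm.sadd.with.overflow.i8
      call above; the overflow-path return value here is hypothetical.

      unsigned char X(signed char a, signed char b) {
        signed char sum;
        // The branch on the overflow bit becomes the 'addb' + 'jo' pair above.
        if (__builtin_add_overflow(a, b, &sum))
          return 0;                   // %if.then: overflow path (hypothetical)
        return (unsigned char)sum;    // %if.end
      }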
      
      llvm-svn: 122186
  2. Dec 05, 2010
    • Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags · 68861717
      Chris Lattner authored
      result.  This allows us to compile:
      
      void *test12(long count) {
            return new int[count];
      }
      
      into:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	movq	$-1, %rdi
      	cmovnoq	%rax, %rdi
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	seto	%cl
      	testb	%cl, %cl
      	movq	$-1, %rdi
      	cmoveq	%rax, %rdi
      	jmp	__Znam
      
      Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
      which would eliminate the need for the 'movq %rdi, %rax'.
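
      As a sketch of what the lowering computes (a hypothetical helper using
      the Clang/GCC __builtin_mul_overflow builtin, not the actual backend
      code): 'new int[count]' multiplies count by sizeof(int) and passes -1 to
      operator new[] on overflow, which is why a single 'mulq' plus 'cmovnoq'
      suffices.

      #include <new>

      void *test12_sketch(unsigned long count) {
        unsigned long bytes;
        if (__builtin_mul_overflow(count, sizeof(int), &bytes)) // 'mulq' sets OF
          bytes = (unsigned long)-1;  // 'cmovnoq': keep the product only if no overflow
        return operator new[](bytes); // 'jmp __Znam' (tail call)
      }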
      
      llvm-svn: 120936
    • it turns out that when ".with.overflow" intrinsics were added to the X86 · 364bb0a0
      Chris Lattner authored
backend, they were all implemented except umul.  That one fell back
to the default implementation, which did a hi/lo multiply and compared the
top half.  Fix this to check the overflow flag that the 'mul' instruction
      sets, so we can avoid an explicit test.  Now we compile:
      
      void *func(long count) {
            return new int[count];
      }
      
      into:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
      	testb	%cl, %cl                ## encoding: [0x84,0xc9]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      Other than the silly seto+test, this is using the o bit directly, so it's going in the right
      direction.
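
      To illustrate the two checks being compared (hypothetical helpers, using
      GCC/Clang's unsigned __int128 and __builtin_mul_overflow; a sketch, not
      the backend code):

      // Old default lowering: widen, multiply, and test the high 64 bits.
      bool overflows_hi_lo(unsigned long a, unsigned long b) {
        unsigned __int128 p = (unsigned __int128)a * b;
        return (unsigned long)(p >> 64) != 0;      // 'testq %rdx, %rdx'
      }

      // New lowering: 'mulq' already sets the overflow flag when the high half
      // is nonzero, so it can be consumed directly ('seto', 'jo', or 'cmovno').
      bool overflows_flag(unsigned long a, unsigned long b) {
        unsigned long lo;
        return __builtin_mul_overflow(a, b, &lo);
      }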
      
      llvm-svn: 120935
    • generalize the previous check to handle -1 on either side of the · 116580a1
      Chris Lattner authored
      select, inserting a not to compensate.  Add a missing isZero check
      that I lost somehow.
      
      This improves codegen of:
      
      void *func(long count) {
            return new int[count];
      }
      
      from:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      to:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	cmpq	$1, %rdx                ## encoding: [0x48,0x83,0xfa,0x01]
      	sbbq	%rdi, %rdi              ## encoding: [0x48,0x19,0xff]
      	notq	%rdi                    ## encoding: [0x48,0xf7,0xd7]
      	orq	%rax, %rdi              ## encoding: [0x48,0x09,0xc7]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
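
      The branch-free sequence computes the following, sketched as a
      hypothetical C++ helper (hi is the high half of the product in %rdx,
      lo the low half in %rax):

      unsigned long lo_or_minus1(unsigned long hi, unsigned long lo) {
        // 'cmpq $1, %rdx; sbbq %rdi, %rdi' materializes -1 exactly when hi == 0,
        unsigned long borrow = (hi == 0) ? (unsigned long)-1 : 0;
        // and 'notq' compensates for -1 being on the overflow side of the
        // select: the mask is 0 when hi == 0 and all-ones otherwise.
        unsigned long mask = ~borrow;
        return mask | lo;             // 'orq': lo on success, -1 on overflow
      }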
      
      llvm-svn: 120932
    • Improve an integer select optimization in two ways: · 342e6ea5
      Chris Lattner authored
      1. Generalize
          (select (x == 0), -1, 0) -> (sign_bit (x - 1))
      to:
          (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y
      
      2. Handle the identical pattern that happens with !=:
         (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y
      
      cmov is often high latency and can't fold immediates or
      memory operands.  For example, for (x == 0) ? -1 : 1, before
      we got:
      
      < 	testb	%sil, %sil
      < 	movl	$-1, %ecx
      < 	movl	$1, %eax
      < 	cmovel	%ecx, %eax
      
      now we get:
      
      > 	cmpb	$1, %sil
      > 	sbbl	%eax, %eax
      > 	orl	$1, %eax
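
      Sketched as a hypothetical C++ helper, the transformed pattern
      materializes an all-ones mask exactly when x == 0 and then ORs in y
      (here y = 1):

      int select_eq0(unsigned char x, int y) {
        int mask = (x == 0) ? -1 : 0;  // 'cmpb $1, %sil; sbbl %eax, %eax'
        return mask | y;               // 'orl $1, %eax' when y is the constant 1
      }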
      
      llvm-svn: 120929
    • Initialize HasPOPCNT. · 2bce78e8
      Bill Wendling authored
      llvm-svn: 120923
  3. Dec 04, 2010
    • Add patterns for the x86 popcnt instruction. · 2f489236
      Benjamin Kramer authored
      - Also adds a new POPCNT subtarget feature that is currently enabled if the target
        supports SSE4.2 (Nehalem) or SSE4A (Barcelona).
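
      A usage sketch (the function name is hypothetical): with the feature
      enabled, e.g. via -msse4.2 on a Nehalem-class target, this builtin
      should now select a single 'popcnt' instruction instead of a
      bit-twiddling expansion.

      int count_set_bits(unsigned x) {
        return __builtin_popcount(x);  // one 'popcntl' with +popcnt
      }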
      
      llvm-svn: 120917
    • Simplify code. No functionality change. · 8ceebfaa
      Benjamin Kramer authored
      llvm-svn: 120907
    • There are two reasons why we might want to use · 1c8ac8f0
      Rafael Espindola authored
      foo = a - b
      .long foo
      instead of just
      .long a - b
      
      First, on darwin9 in 64-bit mode the assembler produces the wrong result.  Second,
      if "a" is the end of the section, none of the darwin assemblers (9, 10, and MC)
      will consider a - b to be a constant, but all of them will if the dummy foo is created.
      
      Split how we handle these cases. The first one is something MC should take care
      of. The second one has to be handled by the caller.
      
      llvm-svn: 120889