  1. Dec 05, 2010
    • Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags · 68861717
      Chris Lattner authored
      result.  This allows us to compile:
      
      void *test12(long count) {
            return new int[count];
      }
      
      into:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	movq	$-1, %rdi
      	cmovnoq	%rax, %rdi
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	seto	%cl
      	testb	%cl, %cl
      	movq	$-1, %rdi
      	cmoveq	%rax, %rdi
      	jmp	__Znam
      
      Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
      which would eliminate the need for the 'movq %rdi, %rax'.
      
      llvm-svn: 120936
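      A rough C++ sketch (not the commit's code; the function name is made
      up and __builtin_mul_overflow is a GCC/Clang builtin) of the
      overflow-checked size computation behind `new int[count]`:

          #include <cstddef>
          #include <new>

          // count * sizeof(int), clamped to (size_t)-1 on overflow so that
          // operator new[] (__Znam) reliably throws.
          void *test12_sketch(long count) {
              std::size_t bytes;
              if (__builtin_mul_overflow((std::size_t)count, sizeof(int), &bytes))
                  bytes = (std::size_t)-1;   // matches `movq $-1, %rdi`
              return operator new[](bytes);  // the tail call to __Znam
          }
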
    • it turns out that when ".with.overflow" intrinsics were added to the X86 · 364bb0a0
      Chris Lattner authored
      backend, they were all implemented except umul.  This one fell back
      to the default implementation that did a hi/lo multiply and compared the
      top.  Fix this to check the overflow flag that the 'mul' instruction
      sets, so we can avoid an explicit test.  Now we compile:
      
      void *func(long count) {
            return new int[count];
      }
      
      into:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
      	testb	%cl, %cl                ## encoding: [0x84,0xc9]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      Other than the silly seto+test, this is using the o bit directly, so it's going in the right
      direction.
      
      llvm-svn: 120935
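      A minimal C++ sketch of the two overflow checks being contrasted
      (illustrative function names; unsigned __int128 and
      __builtin_mul_overflow are GCC/Clang extensions):

          #include <cstdint>

          // Old lowering: form the full 128-bit product and test the high
          // half, i.e. the `testq %rdx, %rdx` above.
          bool overflow_via_high_half(uint64_t a, uint64_t b) {
              unsigned __int128 wide = (unsigned __int128)a * b;
              return (uint64_t)(wide >> 64) != 0;
          }

          // New lowering: use the flag the multiply already sets; `mulq`
          // sets OF/CF exactly when the high half is nonzero.
          bool overflow_via_flag(uint64_t a, uint64_t b) {
              uint64_t lo;
              return __builtin_mul_overflow(a, b, &lo);
          }
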
    • generalize the previous check to handle -1 on either side of the · 116580a1
      Chris Lattner authored
      select, inserting a not to compensate.  Add a missing isZero check
      that I lost somehow.
      
      This improves codegen of:
      
      void *func(long count) {
            return new int[count];
      }
      
      from:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      to:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	cmpq	$1, %rdx                ## encoding: [0x48,0x83,0xfa,0x01]
      	sbbq	%rdi, %rdi              ## encoding: [0x48,0x19,0xff]
      	notq	%rdi                    ## encoding: [0x48,0xf7,0xd7]
      	orq	%rax, %rdi              ## encoding: [0x48,0x09,0xc7]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      llvm-svn: 120932
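      The cmp/sbb/not/or sequence above is a branchless select. A C++
      sketch with an illustrative function name:

          #include <cstdint>

          // Computes (hi != 0) ? -1 : lo without cmov. `cmpq $1, %rdx`
          // sets the carry flag iff hi == 0, and `sbbq %rdi, %rdi` turns
          // that carry into an all-ones mask.
          uint64_t select_sketch(uint64_t hi, uint64_t lo) {
              uint64_t mask = (uint64_t)0 - (hi == 0); // all-ones iff hi == 0
              return ~mask | lo;                       // notq + orq
          }
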
    • Improve an integer select optimization in two ways: · 342e6ea5
      Chris Lattner authored
      1. generalize 
          (select (x == 0), -1, 0) -> (sign_bit (x - 1))
      to:
          (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y
      
      2. Handle the identical pattern that happens with !=:
         (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y
      
      cmov is often high latency and can't fold immediates or
      memory operands.  For example for (x == 0) ? -1 : 1, before 
      we got:
      
      < 	testb	%sil, %sil
      < 	movl	$-1, %ecx
      < 	movl	$1, %eax
      < 	cmovel	%ecx, %eax
      
      now we get:
      
      > 	cmpb	$1, %sil
      > 	sbbl	%eax, %eax
      > 	orl	$1, %eax
      
      llvm-svn: 120929
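      The same borrow trick in C++ for (x == 0) ? -1 : 1, with an
      illustrative function name:

          #include <cstdint>

          int32_t sel_sketch(uint8_t x) {
              // cmpb $1 / sbbl: x - 1 borrows exactly when x == 0, giving
              // an all-ones mask; orl $1 then yields -1 or 1.
              int32_t mask = -(int32_t)(x < 1);
              return mask | 1;
          }
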
    • Initialize HasPOPCNT. · 2bce78e8
      Bill Wendling authored
      llvm-svn: 120923
  2. Dec 04, 2010
    • Add patterns for the x86 popcnt instruction. · 2f489236
      Benjamin Kramer authored
      - Also adds a new POPCNT subtarget feature that is currently enabled if the target
        supports SSE4.2 (nehalem) or SSE4A (barcelona).
      
      llvm-svn: 120917
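      A minimal C++ example (assumes a POPCNT-capable target, e.g.
      -march=nehalem or -mpopcnt) of code these patterns can match:

          #include <cstdint>

          int bits_set(uint64_t v) {
              return __builtin_popcountll(v); // can lower to a single popcntq
          }
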
    • Simplify code. No functionality change. · 8ceebfaa
      Benjamin Kramer authored
      llvm-svn: 120907
    • There are two reasons why we might want to use · 1c8ac8f0
      Rafael Espindola authored
      foo = a - b
      .long foo
      instead of just
      .long a - b
      
      First, on darwin9 in 64-bit mode the assembler produces the wrong result.
      Second, if "a" is at the end of the section, all darwin assemblers (9, 10,
      and mc) will not consider a - b to be a constant, but will if the dummy
      foo is created.
      
      Split how we handle these cases. The first one is something MC should take care
      of. The second one has to be handled by the caller.
      
      llvm-svn: 120889