  1. Dec 10, 2010
    • Formalize the notion that AVX and SSE are non-overlapping extensions from the... · 8b08f523
      Nate Begeman authored
      Formalize the notion that AVX and SSE are non-overlapping extensions from the compiler's point of view.  Per email discussion, we either want to always use VEX-prefixed instructions or never use them, and are taking "HasAVX" to mean "Always use VEX".  Passing -mattr=-avx,+sse42 should serve to restore legacy SSE support when desirable.
      
      llvm-svn: 121439
  2. Dec 05, 2010
    • Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags · 68861717
      Chris Lattner authored
      result.  This allows us to compile:
      
      void *test12(long count) {
            return new int[count];
      }
      
      into:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	movq	$-1, %rdi
      	cmovnoq	%rax, %rdi
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      test12:
      	movl	$4, %ecx
      	movq	%rdi, %rax
      	mulq	%rcx
      	seto	%cl
      	testb	%cl, %cl
      	movq	$-1, %rdi
      	cmoveq	%rax, %rdi
      	jmp	__Znam
      
      Of course it would be even better if the regalloc inverted the cmov to 'cmovoq',
      which would eliminate the need for the 'movq %rdi, %rax'.
      
      llvm-svn: 120936
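
      A minimal C++ sketch of the pattern being compiled here (the helper name is
      mine, and it assumes the GCC/Clang builtin __builtin_umull_overflow, which
      lowers to the llvm.umul.with.overflow intrinsic whose flags result this
      commit wires up):

      #include <new>

      void *checked_new_int_array(unsigned long count) {
            unsigned long bytes;
            // Overflow-checked count * sizeof(int); both the product and the
            // overflow bit come from the single 'mulq' in the asm above.
            bool overflow = __builtin_umull_overflow(count, sizeof(int), &bytes);
            if (overflow)
                  bytes = (unsigned long)-1;   // force the allocation to fail
            return ::operator new[](bytes);    // __Znam, the tail call above
      }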
    • it turns out that when ".with.overflow" intrinsics were added to the X86 · 364bb0a0
      Chris Lattner authored
      backend, they were all implemented except umul.  This one fell back
      to the default implementation that did a hi/lo multiply and compared the
      top.  Fix this to check the overflow flag that the 'mul' instruction
      sets, so we can avoid an explicit test.  Now we compile:
      
      void *func(long count) {
            return new int[count];
      }
      
      into:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	seto	%cl                     ## encoding: [0x0f,0x90,0xc1]
      	testb	%cl, %cl                ## encoding: [0x84,0xc9]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      instead of:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
      
      Other than the silly seto+test, this is using the o bit directly, so it's going in the right
      direction.
      
      llvm-svn: 120935
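
      A hedged C++ sketch of the two lowerings being compared (function names are
      mine; unsigned __int128 and __builtin_mul_overflow are GCC/Clang extensions
      used only for illustration):

      #include <cstdint>

      // Old default: widen, multiply, and test the high half (the 'testq %rdx').
      bool umul_overflow_via_high_half(uint64_t a, uint64_t b, uint64_t *out) {
            unsigned __int128 p = (unsigned __int128)a * b;
            *out = (uint64_t)p;
            return (uint64_t)(p >> 64) != 0;
      }

      // New lowering: use the overflow flag that 'mulq' already sets.
      bool umul_overflow_via_flag(uint64_t a, uint64_t b, uint64_t *out) {
            return __builtin_mul_overflow(a, b, out);
      }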
    • generalize the previous check to handle -1 on either side of the · 116580a1
      Chris Lattner authored
      select, inserting a not to compensate.  Add a missing isZero check
      that I lost somehow.
      
      This improves codegen of:
      
      void *func(long count) {
            return new int[count];
      }
      
      from:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	testq	%rdx, %rdx              ## encoding: [0x48,0x85,0xd2]
      	movq	$-1, %rdi               ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
      	cmoveq	%rax, %rdi              ## encoding: [0x48,0x0f,0x44,0xf8]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      to:
      
      __Z4funcl:                              ## @_Z4funcl
      	movl	$4, %ecx                ## encoding: [0xb9,0x04,0x00,0x00,0x00]
      	movq	%rdi, %rax              ## encoding: [0x48,0x89,0xf8]
      	mulq	%rcx                    ## encoding: [0x48,0xf7,0xe1]
      	cmpq	$1, %rdx                ## encoding: [0x48,0x83,0xfa,0x01]
      	sbbq	%rdi, %rdi              ## encoding: [0x48,0x19,0xff]
      	notq	%rdi                    ## encoding: [0x48,0xf7,0xd7]
      	orq	%rax, %rdi              ## encoding: [0x48,0x09,0xc7]
      	jmp	__Znam                  ## TAILCALL
                                              ## encoding: [0xeb,A]
      
      llvm-svn: 120932
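
      A small C++ sketch of the new "-1 on the other arm" case (the function name
      is mine): the mask built from the borrow of x - 1 has to be inverted once,
      which is exactly the extra 'notq' in the code above.

      #include <cstdint>

      uint64_t result_or_all_ones(uint64_t hi, uint64_t lo) {
            // (hi == 0) ? lo : -1
            uint64_t mask = -(uint64_t)(hi == 0);  // cmpq $1 / sbbq: all ones iff hi == 0
            return ~mask | lo;                     // notq / orq: lo if hi == 0, else -1
      }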
    • Improve an integer select optimization in two ways: · 342e6ea5
      Chris Lattner authored
      1. generalize 
          (select (x == 0), -1, 0) -> (sign_bit (x - 1))
      to:
          (select (x == 0), -1, y) -> (sign_bit (x - 1)) | y
      
      2. Handle the identical pattern that happens with !=:
         (select (x != 0), y, -1) -> (sign_bit (x - 1)) | y
      
      cmov is often high latency and can't fold immediates or
      memory operands.  For example, for (x == 0) ? -1 : 1, before
      we got:
      
      < 	testb	%sil, %sil
      < 	movl	$-1, %ecx
      < 	movl	$1, %eax
      < 	cmovel	%ecx, %eax
      
      now we get:
      
      > 	cmpb	$1, %sil
      > 	sbbl	%eax, %eax
      > 	orl	$1, %eax
      
      llvm-svn: 120929
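
      A minimal C++ sketch of transformation 1 (the function name is mine): for
      unsigned x, the borrow of x - 1 is set exactly when x == 0, so 'cmp $1; sbb'
      materializes an all-ones/all-zero mask without a cmov, and 'or' folds in
      the other arm.

      #include <cstdint>

      int32_t select_minus1_or_y(uint32_t x, int32_t y) {
            // (x == 0) ? -1 : y
            int32_t mask = -(int32_t)(x == 0);  // cmpb $1 / sbbl: all ones iff x == 0
            return mask | y;                    // orl: -1 when x == 0, otherwise y
      }

      With y == 1 this is exactly the cmpb/sbbl/orl sequence shown above.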
  3. Nov 15, 2010
    • add target operand flags for jump tables, constant pool and block address · edb9d84d
      Chris Lattner authored
      nodes to indicate when ha16/lo16 modifiers should be used.  This lets
      us pass PowerPC/indirectbr.ll.
      
      The one annoying thing about this patch is that the MCSymbolExpr isn't
      expressive enough to represent ha16(label1-label2), which we need on
      PowerPC.  I have a terrible hack in the meantime, but this will have
      to be revisited at some point.
      
      Last major conversion item left is global variable references.
      
      llvm-svn: 119105
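
      A short C++ illustration of what the ha16/lo16 modifiers denote (these
      helpers are mine, not LLVM API): a 32-bit address is split into an adjusted
      high half and a low half so that an addis/lwz-style pair reassembles it.

      #include <cstdint>

      uint16_t lo16(uint32_t addr) {
            return (uint16_t)(addr & 0xFFFF);
      }

      uint16_t ha16(uint32_t addr) {
            // The +0x8000 compensates for the sign extension of lo16 when the
            // low instruction adds it back.
            return (uint16_t)((addr + 0x8000) >> 16);
      }

      // Invariant: ((uint32_t)ha16(a) << 16) + (int16_t)lo16(a) == a (mod 2^32).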