- Apr 19, 2011
- Jakob Stoklund Olesen authored
This means that the new register allocator can be used with 'clang -mllvm -regalloc=greedy'. llvm-svn: 129764
- Eli Friedman authored
unnecessary work where possible. llvm-svn: 129763
- Jay Foad authored
llvm-svn: 129759
- Chris Lattner authored
en masse for C++ PODs. On my c++ test file, this cuts the fast isel rejects by 10x and shrinks the generated .s file by 5%. llvm-svn: 129755
- Chris Lattner authored
llvm-svn: 129753
- Chris Lattner authored
when they are a truncate from something else. This eliminates fully half of all the fastisel rejections on a test c++ file I'm working with, which should make a substantial improvement for -O0 compiles of c++ code. This fixes rdar://9297003 - fast isel bails out on all functions taking bools. llvm-svn: 129752
- Chris Lattner authored
Before we would bail out on i1 arguments altogether; now we just bail on non-constant ones. Also, we used to emit extraneous code. E.g. test12 was:

    movb $0, %al
    movzbl %al, %edi
    callq _test12

and test13 was:

    movb $0, %al
    xorl %edi, %edi
    movb %al, 7(%rsp)
    callq _test13f

Now we get:

    movl $0, %edi
    callq _test12

and:

    movl $0, %edi
    callq _test13f

llvm-svn: 129751
- Chris Lattner authored
    testb $1, %al
    je LBB0_2
## BB#1: ## %if.then
    movb $0, %al

instead of:

    testb $1, %al
    jne LBB0_1
    jmp LBB0_2
LBB0_1: ## %if.then
    movb $0, %al

how 'bout that. llvm-svn: 129749
- Chris Lattner authored
rdar://9297006 - a common cause of fast isel rejects on c++ code. llvm-svn: 129748
- Evan Cheng authored
is, it assumes addresses are 64-bit aligned (which should be the more common case). If the address is found not to be 64-bit aligned, getOperandLatency() adjusts the operand latency computation by one to compensate. rdar://9294833 llvm-svn: 129742
- Evan Cheng authored
llvm-svn: 129738
- Devang Patel authored
llvm-svn: 129735
- Ted Kremenek authored
Add BumpPtrAllocator::getTotalMemory() to allow clients to query how much memory a BumpPtrAllocator has allocated. llvm-svn: 129727
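A minimal sketch of the new query, assuming the LLVM Allocator.h of that era; the sizes and output text here are illustrative, not from the commit:

    #include "llvm/Support/Allocator.h"
    #include <cstdio>

    int main() {
      llvm::BumpPtrAllocator Alloc;
      // Carve two objects out of the allocator's slabs.
      (void)Alloc.Allocate(/*Size=*/128, /*Alignment=*/8);
      (void)Alloc.Allocate(/*Size=*/4096, /*Alignment=*/16);
      // r129727: report how much memory the allocator has allocated.
      std::printf("allocator footprint: %zu bytes\n", Alloc.getTotalMemory());
      return 0;
    }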
- Apr 18, 2011
- Jim Grosbach authored
llvm-svn: 129723
- Eric Christopher authored
true on success and false on failure. Update callers. llvm-svn: 129722
- Eli Friedman authored
llvm-svn: 129720
- Eli Friedman authored
small heap-allocated SmallString because it unconditionally forces a malloc. (Revised version of r129688, with the necessary flush() call.) llvm-svn: 129716
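The pattern these commits describe, as a hedged sketch (the function name and buffer size are illustrative): prefer a stack-backed SmallString over a small heap-allocated one, and remember the flush() call:

    #include "llvm/ADT/SmallString.h"
    #include "llvm/Support/raw_ostream.h"
    #include <string>

    std::string formatValue(int V) {
      // Stack-backed buffer: no malloc unless the output outgrows 64 bytes.
      // A small heap-allocated SmallString would force a malloc regardless.
      llvm::SmallString<64> Buf;
      llvm::raw_svector_ostream OS(Buf);
      OS << "value = " << V;
      OS.flush(); // the flush() call r129716 restored; Buf is complete after this
      return Buf.str().str();
    }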
- Devang Patel authored
llvm-svn: 129715
- Jakob Stoklund Olesen authored
the spilled register. This is quite common on ARM now that some stores have early-clobber defines. llvm-svn: 129714
- Sean Callanan authored
superclass variable is instantiated properly. llvm-svn: 129713
- Eric Christopher authored
registers for fast allocation a different way. This has us updating used registers only when we're using that exact register. Fixes rdar://9207598 llvm-svn: 129711
- Chandler Carruth authored
silences Clang's -Wunused-function when building in release mode. llvm-svn: 129709
- Chris Lattner authored
this fixes a few rejects on c++ iterator loops. llvm-svn: 129694
- Chris Lattner authored
the generated FastISel. X86 doesn't need to generate code to match ADD16ri8 since ADD16ri will do just fine. This is a small codesize win in the generated instruction selector. llvm-svn: 129692
- Eli Friedman authored
llvm-svn: 129689
- Eli Friedman authored
small heap-allocated SmallString because it unconditionally forces a malloc. llvm-svn: 129688
- Eli Friedman authored
reduces the number of calls to malloc(). llvm-svn: 129687
- Chris Lattner authored
simplifying them and exposing more information to tblgen. It would be nice if other target authors adopted this as well, particularly ARM since it has fastisel. llvm-svn: 129676
- Chris Lattner authored
kind of predicate: one that is specific to imm nodes. The predicate function specified here checks an int64_t directly instead of messing around with SDNodes. The virtue of this is that fastisel and other things can reason about these predicates. llvm-svn: 129675
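A hypothetical illustration of the new predicate style (the name and the constraint are invented for the example): because it is a pure function of the immediate's value rather than of an SDNode, clients outside the SelectionDAG, such as fast isel, can evaluate it too:

    #include <cstdint>

    // New-style imm predicate: inspects the raw value, not an SDNode.
    static bool isUInt8Imm(int64_t Imm) {
      return Imm >= 0 && Imm <= 255;
    }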
- Apr 17, 2011
- Chris Lattner authored
structure and fix some FIXMEs. We now have a TreePredicateFn class that handles all of the decoding of these things. This is an internal cleanup that has no impact on the code generated by tblgen. llvm-svn: 129670
- Chris Lattner authored
2. Implement rdar://9289501 - fast isel should fold trivial multiplies to shifts.
3. Teach tblgen to handle shift immediates that are different sizes than the shifted operands, eliminating some code from the X86 fast isel backend.
4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function instead of FastEmit_ri to simplify code.
llvm-svn: 129666
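For item 2, a sketch of the kind of source involved (an assumed test, not taken from the commit): a multiply by a constant power of two that fast isel can now select as a shift.

    // x * 8 can be selected as x << 3 instead of a multiply.
    int scale(int x) {
      return x * 8;
    }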
- Chris Lattner authored
when we have a global variable base and an index. Instead, just give up on folding the global variable. Before we'd generate:

_test: ## @test
## BB#0:
    movq _rtx_length@GOTPCREL(%rip), %rax
    leaq (%rax), %rax
    addq %rdi, %rax
    movzbl (%rax), %eax
    ret

now we generate:

_test: ## @test
## BB#0:
    movq _rtx_length@GOTPCREL(%rip), %rax
    movzbl (%rax,%rdi), %eax
    ret

The difference is even more significant when there is a scale involved. This fixes rdar://9289558 - total fail with addr mode formation at -O0/x86-64. llvm-svn: 129664
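A plausible reconstruction of the kind of source that hits this case (the original test is not in the log, so the exact code is an assumption): a load from a global array base plus a register index.

    // Hypothetical reconstruction; mirrors the _rtx_length symbol above.
    extern unsigned char rtx_length[];

    unsigned char test(long i) {
      return rtx_length[i]; // address = global base + register index
    }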
- Chris Lattner authored
less trivial things) into a dummy lea. Before we generated:

_test: ## @test
    movq _G@GOTPCREL(%rip), %rax
    leaq (%rax), %rax
    ret

now we produce:

_test: ## @test
    movq _G@GOTPCREL(%rip), %rax
    ret

This is part of rdar://9289558. llvm-svn: 129662
- Chris Lattner authored
llvm-svn: 129661
- Chris Lattner authored
rdar://9289512: The basic issue here is that bottom-up isel is matching the branch and compare, and was failing to fold the load into the branch/compare combo. Fixing this (by allowing folding into any instruction of a sequence that is selected) allows us to produce things like:

    cmpb $0, 52(%rax)
    je LBB4_2

instead of:

    movb 52(%rax), %cl
    cmpb $0, %cl
    je LBB4_2

This makes the generated -O0 code run a bit faster, but also speeds up compile time by putting less pressure on the register allocator and generating less code. This was one of the biggest classes of missing load folding. Implementing this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm) line count. llvm-svn: 129656
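An assumed example of the source pattern (the struct and names are invented to match the assembly above, not taken from the commit): a loaded field compared against zero feeding a branch, which can now collapse into a single memory-operand compare.

    struct Node {
      char pad[52];
      char flag; // at offset 52, echoing "cmpb $0, 52(%rax)" above
    };

    void handleClear(Node *);

    void process(Node *n) {
      if (n->flag == 0) // load folds into the compare: cmpb $0, 52(%rax); je ...
        handleClear(n);
    }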
- Chris Lattner authored
which don't need to check for falling off the end of a block *and* end of phi nodes, since terminators are never phis. llvm-svn: 129655
- Eli Friedman authored
llvm-svn: 129654
- Chris Lattner authored
rdar://9289583 - allowing us to fold the immediate into the 'and' in this case:

    int test1(int i) { return 8&i; }

llvm-svn: 129653