- Apr 21, 2011
Jakob Stoklund Olesen authored
TII::isTriviallyReMaterializable() shouldn't depend on any properties of the register being defined by the instruction. Rematerialization is going to create a new virtual register anyway. llvm-svn: 129882
- Apr 20, 2011
Jakob Stoklund Olesen authored
On the x86-64 and thumb2 targets, some registers are more expensive to encode than others in the same register class. Add a CostPerUse field to the TableGen register description, and make it available from TRI->getCostPerUse. This represents the cost of a REX prefix or a 32-bit instruction encoding required by choosing a high register. Teach the greedy register allocator to prefer cheap registers for busy live ranges (as indicated by spill weight). llvm-svn: 129864
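To make the preference concrete, here is a minimal standalone sketch. The CostPerUse field and TRI->getCostPerUse are from the commit; the surrounding harness (Reg, assignReg, HotThreshold) is hypothetical and only models the idea.

```cpp
#include <cstdint>
#include <vector>

// Toy model: each physical register carries a cost-per-use, e.g. 1 for
// x86-64 registers that need a REX prefix (R8-R15) and 0 otherwise.
struct Reg { const char *Name; uint8_t CostPerUse; };

// Hypothetical helper: for a busy live range (high spill weight), weigh
// encoding cost; for a cold one, take the first free register so cheap
// registers stay available for hot ranges. Assumes Free is non-empty.
const Reg *assignReg(const std::vector<Reg> &Free, float SpillWeight,
                     float HotThreshold) {
  if (SpillWeight < HotThreshold)
    return &Free.front();          // cold range: any register will do
  const Reg *Best = &Free.front();
  for (const Reg &R : Free)
    if (R.CostPerUse < Best->CostPerUse)
      Best = &R;                   // busy range: prefer the cheap encoding
  return Best;
}
```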
Rafael Espindola authored
llvm-svn: 129844
Eric Christopher authored
manually and pass all (now) 4 arguments to the mul libcall. Add a new ExpandLibCall for just this (copied gratuitously from type legalization). Fixes rdar://9292577 llvm-svn: 129842
Daniel Dunbar authored
triple component. llvm-svn: 129838
- Apr 19, 2011
Daniel Dunbar authored
- There is a minor semantic change here (evidenced by the test change) for Darwin triples that have no version component. I debated changing the default behavior of isOSVersionLT, but decided it made more sense for triples to be explicit. llvm-svn: 129802
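A hedged usage sketch of the resulting semantics; the header path and Triple/StringRef names are the LLVM API of that era, while isAtLeastDarwin10 is a made-up helper:

```cpp
#include "llvm/ADT/Triple.h"
using namespace llvm;

// With no version component, a Darwin triple now reports version 0, so
// isOSVersionLT answers "yes" to every comparison instead of assuming
// some default OS version.
bool isAtLeastDarwin10(StringRef TT) {
  Triple T(TT);
  // "x86_64-apple-darwin10" -> true; "x86_64-apple-darwin" -> false.
  return T.isOSDarwin() && !T.isOSVersionLT(10);
}
```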
Bob Wilson authored
Add an avoidWriteAfterWrite() target hook to identify register classes that suffer from write-after-write hazards. For those register classes, try to avoid writing the same register in two consecutive instructions. This is currently disabled by default. We should not spill to avoid hazards! The command line flag -avoid-waw-hazard can be used to enable WAW avoidance. llvm-svn: 129772
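A minimal sketch of the preference, not the LLVM code; pickDest and the integer register encoding are invented for illustration:

```cpp
#include <vector>

// Given the allocatable registers of a WAW-prone class in preference
// order, and the register defined by the previous instruction, prefer
// any register that breaks the back-to-back write. Crucially, if no
// alternative is free we accept the hazard; we never spill to avoid it.
// Assumes FreeRegs is non-empty.
int pickDest(const std::vector<int> &FreeRegs, int LastDef) {
  for (int R : FreeRegs)
    if (R != LastDef)
      return R;               // breaks the write-after-write pair
  return FreeRegs.front();    // hazard tolerated rather than spilling
}
```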
Jakob Stoklund Olesen authored
This means that the new register allocator can be used with 'clang -mllvm -regalloc=greedy'. llvm-svn: 129764
Eli Friedman authored
unnecessary work where possible. llvm-svn: 129763
Chris Lattner authored
en masse for C++ PODs. On my C++ test file, this cuts the fast isel rejects by 10x and shrinks the generated .s file by 5%. llvm-svn: 129755
- Apr 18, 2011
Eli Friedman authored
llvm-svn: 129720
Devang Patel authored
llvm-svn: 129715
Jakob Stoklund Olesen authored
the spilled register. This is quite common on ARM now that some stores have early-clobber defines. llvm-svn: 129714
Eric Christopher authored
registers for fast allocation a different way. This has us updating used registers only when we're using that exact register. Fixes rdar://9207598 llvm-svn: 129711
Chris Lattner authored
this fixes a few rejects on c++ iterator loops. llvm-svn: 129694
- Apr 17, 2011
Chris Lattner authored
2. Implement rdar://9289501 - fast isel should fold trivial multiplies to shifts.
3. Teach tblgen to handle shift immediates that are different sizes than the shifted operands, eliminating some code from the X86 fast isel backend.
4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function instead of FastEmit_ri to simplify code.
llvm-svn: 129666
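Item 2 in source form, as a standalone sketch; mulByConst is illustrative, not the fast isel code, and __builtin_ctz assumes GCC/Clang:

```cpp
#include <cassert>
#include <cstdint>

// A multiply by a constant power of two is just a left shift.
uint32_t mulByConst(uint32_t X, uint32_t C) {
  if (C != 0 && (C & (C - 1)) == 0)   // C is a power of two
    return X << __builtin_ctz(C);     // shl by log2(C) replaces the mul
  return X * C;
}

int main() {
  assert(mulByConst(7, 8) == (7u << 3)); // 8 = 2^3
  assert(mulByConst(7, 6) == 42);        // not a power of two: real multiply
}
```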
Chris Lattner authored
less trivial things) into a dummy lea. Before we generated:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	leaq	(%rax), %rax
	ret

now we produce:

_test:                                  ## @test
	movq	_G@GOTPCREL(%rip), %rax
	ret

This is part of rdar://9289558
llvm-svn: 129662
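The source shape behind such a test is presumably just taking the address of an external global; this is a reconstruction from the assembly, not the actual test file:

```cpp
// Under PIC on Darwin x86-64, the address of an external global is
// loaded from the GOT; the old code then copied it through a lea.
extern int G;
int *test() { return &G; } // movq _G@GOTPCREL(%rip), %rax ; ret
```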
Chris Lattner authored
rdar://9289512 - The basic issue here is that bottom-up isel is matching the branch and compare, and was failing to fold the load into the branch/compare combo. Fixing this (by allowing folding into any instruction of a sequence that is selected) allows us to produce things like:

	cmpb	$0, 52(%rax)
	je	LBB4_2

instead of:

	movb	52(%rax), %cl
	cmpb	$0, %cl
	je	LBB4_2

This makes the generated -O0 code run a bit faster, but also speeds up compile time by putting less pressure on the register allocator and generating less code. This was one of the biggest classes of missing load folding. Implementing this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm) line count.
llvm-svn: 129656
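A hypothetical source shape that produces the folded pattern; the struct layout is assumed so the byte lands at offset 52:

```cpp
struct S {
  char Pad[52];
  char Flag; // offset 52, matching "cmpb $0, 52(%rax)"
};

int touch(S *P) {
  // With the fold, the byte load feeds the compare directly: one cmpb
  // against memory plus a branch, no intermediate register copy.
  return P->Flag == 0 ? 1 : 2;
}
```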
Chris Lattner authored
which don't need to check for falling off the end of a block *and* end of phi nodes, since terminators are never phis. llvm-svn: 129655
Chris Lattner authored
rdar://9289583 - allowing us to fold the immediate into the 'and' in this case:

	int test1(int i) { return 8&i; }

llvm-svn: 129653
Eli Friedman authored
Returning a new node makes the code try to replace the old node, which in the included testcase is killed by CSE. llvm-svn: 129650
- Apr 16, 2011
Francois Pichet authored
For further information on this particular issue see: http://connect.microsoft.com/VisualStudio/feedback/details/520043/error-converting-from-null-to-a-pointer-type-in-std-pair llvm-svn: 129642
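The shape of the workaround, reconstructed from the linked report; the exact call sites in LLVM differ:

```cpp
#include <utility>

// MSVC rejects converting the macro NULL (an integral constant 0) to a
// pointer-typed std::pair member, so the null pointer must be spelled
// with an explicit pointer type.
// std::pair<int, char *> P(0, NULL);  // MSVC error: cannot convert NULL to char *
std::pair<int, char *> Q(0, static_cast<char *>(nullptr)); // accepted everywhere
```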
Benjamin Kramer authored
llvm-svn: 129639
Rafael Espindola authored
error in foo.o; no .eh_frame_hdr table will be created. llvm-svn: 129635
Evan Cheng authored
Fix divmod libcall lowering. Convert to {S|U}DIVREM first and then expand the node to a libcall. rdar://9280991 llvm-svn: 129633
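The source-level analogue of a DIVREM node; std::div is standard C++, and the backend's combined libcall plays the same role:

```cpp
#include <cassert>
#include <cstdlib>

int main() {
  // One combined operation yields both results, instead of lowering
  // '/' and '%' on the same operands to two separate libcalls.
  std::div_t QR = std::div(7, 3);
  assert(QR.quot == 2 && QR.rem == 1);
}
```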
Devang Patel authored
Introduce support to encode Objective-C property information in debugging information generated for an interface. llvm-svn: 129624
- Apr 15, 2011
Rafael Espindola authored
llvm-svn: 129600
Jakob Stoklund Olesen authored
The transferValues() function can now handle both singly and multiply defined values, as long as the resulting live range is known. Only rematerialized values have their live range recomputed by extendRange().

The updateSSA() function can now insert PHI values in bulk across multiple values in multiple target registers in one pass. The list of blocks received from transferValues() is in layout order, which seems to work well for the iterative algorithm. Blocks from extendRange() are still in reverse BFS order, but this function is used so rarely now that it doesn't matter.
llvm-svn: 129580
Jakob Stoklund Olesen authored
llvm-svn: 129579
Rafael Espindola authored
Change ELF systems to use CFI for producing the EH tables. This reduces the size of the clang binary in Debug builds from 690MB to 679MB. llvm-svn: 129571
Chris Lattner authored
Luis Felipe Strano Moraes! llvm-svn: 129558
NAKAMURA Takumi authored
It broke several builds. llvm-svn: 129557
- Apr 14, 2011
Owen Anderson authored
llvm-svn: 129522
Rafael Espindola authored
size of the clang binary in Debug builds from 690MB to 679MB. llvm-svn: 129518
Andrew Trick authored
This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet.

The primary motivation is to generate code closer to what people expect and rule out missed opportunities from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down by about 10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108.

Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump

Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion
llvm-svn: 129508
Chris Lattner authored
llvm-svn: 129503