Commits · 27e0a4ab86a9066f19e4027d2a6ead4e51e40d76 · Roger Ferrer / llvm-epi-0.8

Mar 05, 2011

Jakob Stoklund Olesen authored Mar 05, 2011

The coalescer can in very rare cases leave too large live intervals around after
rematerializing cheap-as-a-move instructions.

Linear scan doesn't really care, but live range splitting gets very confused
when a live range is killed by a ghost instruction.

I will fix this properly in the coalescer after 2.9 branches.

llvm-svn: 127096

27e0a4ab

Be explicit with abs(). Visual Studio workaround. · 25cedf3f
Andrew Trick authored Mar 05, 2011
```
llvm-svn: 127075
```
25cedf3f
Fix for -sched-high-latency-cycles in sched=list-ilp mode. · d7f4c216
Andrew Trick authored Mar 05, 2011
```
llvm-svn: 127071
```
d7f4c216
Missing comment. · b8390b7a
Andrew Trick authored Mar 05, 2011
```
llvm-svn: 127068
```
b8390b7a

Increased the register pressure limit on x86_64 from 8 to 12 · 641e2d4f

Andrew Trick authored Mar 05, 2011

regs. This is the only change in this checkin that may affect the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.

Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.

Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.

llvm-svn: 127067

641e2d4f

Rework the global split cost calculation. · 1a9b66c7

Jakob Stoklund Olesen authored Mar 05, 2011

The global cost is the sum of block frequencies for spill code that must be
inserted because preferences weren't met.

llvm-svn: 127062

1a9b66c7
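
The cost definition in 1a9b66c7 can be illustrated with a small sketch. It assumes that every mismatch between preferred and chosen placement, on block entry or on block exit, costs one unit of that block's frequency. `BlockPlacement` and `globalSplitCost` are hypothetical names for illustration, not LLVM types or functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-block record, not an LLVM type.
struct BlockPlacement {
  float Frequency;      // estimated execution frequency of the block
  bool PrefersRegEntry; // the live range would like a register on entry
  bool InRegEntry;      // the candidate split gives it a register on entry
  bool PrefersRegExit;  // the live range would like a register on exit
  bool InRegExit;       // the candidate split gives it a register on exit
};

// Sum the frequencies of the blocks that need spill code because the
// chosen placement disagrees with the preference (assumed cost model).
float globalSplitCost(const std::vector<BlockPlacement> &Blocks) {
  float Cost = 0.0f;
  for (const BlockPlacement &B : Blocks) {
    assert(B.Frequency >= 0.0f && "block frequencies are non-negative");
    if (B.PrefersRegEntry != B.InRegEntry)
      Cost += B.Frequency;
    if (B.PrefersRegExit != B.InRegExit)
      Cost += B.Frequency;
  }
  return Cost;
}
```

Under this model a single unmet preference inside a hot loop can outweigh many unmet preferences in cold blocks.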

Compute the constraints for global live range splitting from an interference pattern. · 4b598e15

Jakob Stoklund Olesen authored Mar 05, 2011

This simplifies the code and makes it faster too.

The interference patterns are saved for each candidate register. They will be
reused for actually executing the split. Work in progress.

llvm-svn: 127054

4b598e15

Teach the register scavenger to take subregs into account when finding a free register. · dc55428d
Jim Grosbach authored Mar 05, 2011
```
llvm-svn: 127049
```
dc55428d

Mar 04, 2011

Improve readability with some whitespace! · 40326989
Eric Christopher authored Mar 04, 2011
```
llvm-svn: 127043
```
40326989
Extract a method. No functional change. · 05a2f517
Jakob Stoklund Olesen authored Mar 04, 2011
```
llvm-svn: 127040
```
05a2f517

Go back to comparing spill weights when deciding if interference can be evicted. · d7e1bb80

Jakob Stoklund Olesen authored Mar 04, 2011

It gives better results. Sometimes, a live range can be large and still have
high spill weight. Such a range should not be spilled.

llvm-svn: 127036

d7e1bb80

Renumber slot indexes locally when possible. · b8e6fdc2

Jakob Stoklund Olesen authored Mar 04, 2011

Initially, slot indexes are quad-spaced. There is room for inserting up to 3
new instructions between the original instructions.

When we run out of indexes between two instructions, renumber locally using
double-spaced indexes. The original quad-spacing means that we catch up quickly,
and we only have to renumber a handful of instructions to get a monotonic
sequence. This is much faster than renumbering the whole function as we did
before.

llvm-svn: 127023

b8e6fdc2
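
The local renumbering described in b8e6fdc2 can be sketched on a plain vector of index numbers. This is a simplified model, not the SlotIndexes implementation: the spacings 4 and 2 come from the commit message, while the midpoint placement policy and the names `numberUniformly`, `insertIndex` and `isStrictlyIncreasing` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr unsigned InitialSpacing = 4; // quad-spaced initial numbering
constexpr unsigned LocalSpacing = 2;   // double-spaced local renumbering

// Number N instructions uniformly: 0, 4, 8, ...
std::vector<unsigned> numberUniformly(std::size_t N) {
  std::vector<unsigned> Idx(N);
  for (std::size_t I = 0; I < N; ++I)
    Idx[I] = static_cast<unsigned>(I) * InitialSpacing;
  return Idx;
}

// Insert a new index before position Pos (0 < Pos <= size).
// Returns how many existing entries had to be renumbered.
std::size_t insertIndex(std::vector<unsigned> &Idx, std::size_t Pos) {
  assert(Pos > 0 && Pos <= Idx.size() && "need a predecessor");
  const unsigned Prev = Idx[Pos - 1];
  if (Pos == Idx.size()) { // appending never needs renumbering
    Idx.push_back(Prev + InitialSpacing);
    return 0;
  }
  const auto Where = Idx.begin() + static_cast<std::ptrdiff_t>(Pos);
  const unsigned Next = Idx[Pos];
  if (Next - Prev > 1) { // a free number exists between the neighbours
    Idx.insert(Where, Prev + (Next - Prev) / 2);
    return 0;
  }
  // Out of numbers: renumber forward with double spacing until the
  // sequence is monotonic again.
  Idx.insert(Where, Prev + LocalSpacing);
  std::size_t Renumbered = 0;
  for (std::size_t I = Pos + 1; I < Idx.size() && Idx[I] <= Idx[I - 1]; ++I) {
    Idx[I] = Idx[I - 1] + LocalSpacing;
    ++Renumbered;
  }
  return Renumbered;
}

bool isStrictlyIncreasing(const std::vector<unsigned> &Idx) {
  for (std::size_t I = 1; I < Idx.size(); ++I)
    if (Idx[I] <= Idx[I - 1])
      return false;
  return true;
}
```

Because the untouched instructions further down are still quad-spaced, the double-spaced run catches up with them after a few entries, so the loop stops early instead of walking the whole function.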

Number SlotIndexes uniformly without looking at the number of defs on each instruction. · 348d8e8b

Jakob Stoklund Olesen authored Mar 04, 2011

You can't really predict how many indexes will be needed from the number of
defs, so let's keep it simple.

Also remove an extra empty index that was inserted after each basic block. It
was intended for live-out ranges, but it was never used that way.

llvm-svn: 127014

348d8e8b

Add SlotIndex statistics. · b88f6adf
Jakob Stoklund Olesen authored Mar 04, 2011
```
llvm-svn: 127007
```
b88f6adf
Tweak debug output. No functional changes. · d4f78895
Jakob Stoklund Olesen authored Mar 04, 2011
```
llvm-svn: 127006
```
d4f78895

Revert commit 126684 "Use the correct shift amount type". It is only the correct · 6bd10442

Duncan Sands authored Mar 04, 2011

type after type legalization has completed.  Before then it may simply not be big
enough to hold the shift amount, particularly on x86 which uses a very small type
for shifts (this issue broke stuff in the past which is why LegalizeTypes carefully
uses a large type for shift amounts).

llvm-svn: 127000

6bd10442
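
The size problem mentioned in 6bd10442 comes down to simple arithmetic: before type legalization a value can be wider than the target's shift amount type can describe. A minimal sketch, assuming an unsigned shift amount and using the hypothetical helper `shiftAmountFits`:

```cpp
#include <cassert>
#include <cstdint>

// True if every in-range shift count for a ValueBits-wide integer
// (0 .. ValueBits - 1) is representable in AmtBits unsigned bits.
bool shiftAmountFits(unsigned ValueBits, unsigned AmtBits) {
  assert(ValueBits > 0 && AmtBits > 0 && AmtBits < 64);
  const std::uint64_t MaxAmt = ValueBits - 1;
  return MaxAmt < (std::uint64_t(1) << AmtBits);
}
```

With an 8-bit shift amount, a 64-bit or 256-bit value is fine, but a 512-bit value is not: a shift count of 300 would be truncated to 44.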

Minor pre-RA-sched fixes and cleanup. · c88b7ecb

Andrew Trick authored Mar 04, 2011

Fix the PendingQueue, then disable it because it's not required for
the current schedulers' heuristics.
Fix the logic for the unused list-ilp scheduler.

llvm-svn: 126981

c88b7ecb

Precompute block frequencies, pow() isn't free. · c332e727
Jakob Stoklund Olesen authored Mar 04, 2011
```
llvm-svn: 126975
```
c332e727
Use an IndexedMap instead of a DenseMap for the live-out cache. · 1a69e233
Jakob Stoklund Olesen authored Mar 04, 2011
```
This speeds up updateSSA() so it only accounts for 5% of the live range
splitting time.

llvm-svn: 126972
```
1a69e233

There are times when the landing pad won't have a call to 'eh.selector' in · f3658f38

Bill Wendling authored Mar 03, 2011

it. It's been assumed up until now that it would be in its immediate
successor. However, this isn't necessarily the case. It could be in one of its
successor's successors.

Modify the code to more thoroughly check for an 'eh.selector' call in
successors. It only looks at a successor if we get there as a result of an
unconditional branch.

Testcase ObjC/exceptions-4.m in r126968.

llvm-svn: 126969

f3658f38
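
The search described in f3658f38 can be sketched as a walk along unconditional branches. The `Block` struct below is a hypothetical stand-in for a basic block, and the visited set is an added assumption to guard against branch cycles; the commit message does not say how the real code handles them.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical stand-in for a basic block.
struct Block {
  bool HasSelectorCall = false;      // contains a call to 'eh.selector'
  const Block *UncondSucc = nullptr; // set only if the block ends in an
                                     // unconditional branch
};

// Follow unconditional branches from the landing pad until a block with
// an 'eh.selector' call is found. Returns nullptr if the chain ends or
// loops back on itself first.
const Block *findSelectorBlock(const Block *LandingPad) {
  assert(LandingPad && "expected a landing pad");
  std::unordered_set<const Block *> Visited;
  for (const Block *B = LandingPad; B && Visited.insert(B).second;
       B = B->UncondSucc)
    if (B->HasSelectorCall)
      return B;
  return nullptr;
}
```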

Mar 03, 2011

Revert r123908; the code in question is completely untested and wrong. · d8a555bb
Eli Friedman authored Mar 03, 2011
```
llvm-svn: 126964
```
d8a555bb
Fix typo. · 63b3e763
Devang Patel authored Mar 03, 2011
```
llvm-svn: 126962
```
63b3e763
Fix thinko in previous check-in. · 34a7ab40
Devang Patel authored Mar 03, 2011
```
Add comment.

llvm-svn: 126959
```
34a7ab40

llvm::Function argument count is not a good indicator of how many arguments... · 4ab660b0

Devang Patel authored Mar 03, 2011

llvm::Function argument count is not a good indicator of how many arguments the function has at source level. If we need more space, just resize the vector conservatively. This vector is only used once per function.

llvm-svn: 126957

4ab660b0

Allow a target to choose whether to prefer the scavenger emergency spill slot · 7e200664
Jim Grosbach authored Mar 03, 2011
```
be next to the frame pointer or the stack pointer.

llvm-svn: 126956
```
7e200664

Renumber slot indexes uniformly instead of spacing according to the number of defs. · bfdbc115

Jakob Stoklund Olesen authored Mar 03, 2011

There are probably much larger speedups to be had by renumbering locally instead
of looping over the whole function. For now, the greedy register allocator is
25% faster.

llvm-svn: 126926

bfdbc115

Represent sentinel slot indexes with a null pointer. · 4ec757d5

Jakob Stoklund Olesen authored Mar 03, 2011

This is much faster than using a pointer to a ManagedStatic object accessed with
a function call. The greedy register allocator is 5% faster overall just from
the SlotIndex default constructor savings.

llvm-svn: 126925

4ec757d5

Avoid comparing invalid slot indexes, and assert that it doesn't happen. · 67a84d08

Jakob Stoklund Olesen authored Mar 03, 2011

The SlotIndex created by the default construction does not represent a position
in the function, and it doesn't make sense to compare it to other indexes.

llvm-svn: 126924

67a84d08
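
The idea in 67a84d08, together with the null-pointer sentinel from 4ec757d5, can be sketched with a small handle class. `IndexEntry` and `Index` are hypothetical names; this is a simplified model and not the real SlotIndex class.

```cpp
#include <cassert>

// Hypothetical list entry that carries the actual number.
struct IndexEntry {
  unsigned Number;
};

// Handle to an entry. A default-constructed handle holds a null pointer
// and does not represent any position in the function.
class Index {
  const IndexEntry *Entry = nullptr;

public:
  Index() = default;
  explicit Index(const IndexEntry *E) : Entry(E) {}

  bool isValid() const { return Entry != nullptr; }

  // Comparing an invalid handle is a logic error, so assert on it.
  bool operator<(const Index &RHS) const {
    assert(isValid() && RHS.isValid() && "comparing an invalid index");
    return Entry->Number < RHS.Entry->Number;
  }
};
```

Default construction is then a plain pointer store, with no function call and no shared static object to look up.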

Avoid comparing invalid slot indexes. · a04dddf7
Jakob Stoklund Olesen authored Mar 03, 2011
```
llvm-svn: 126922
```
a04dddf7
Cache basic block bounds instead of asking SlotIndexes::getMBBRange all the time. · 9a6382fc
Jakob Stoklund Olesen authored Mar 03, 2011
```
This speeds up the greedy register allocator by 15%.
DenseMap is not as fast as one might hope.

llvm-svn: 126921
```
9a6382fc
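
The caching in 9a6382fc can be sketched as a vector indexed by block number that is filled once up front. `BlockBoundsCache` and the `SlowLookup` callback are hypothetical; the callback stands in for the hash-map-backed query that the commit avoids repeating.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Range = std::pair<unsigned, unsigned>; // (start, end) of a block

// Dense cache of block bounds, indexed by block number.
class BlockBoundsCache {
  std::vector<Range> Bounds;

public:
  // Query the slow lookup exactly once per block.
  template <typename Lookup>
  BlockBoundsCache(std::size_t NumBlocks, Lookup SlowLookup) {
    Bounds.reserve(NumBlocks);
    for (std::size_t B = 0; B < NumBlocks; ++B)
      Bounds.push_back(SlowLookup(B));
  }

  // Constant-time array access, no hashing.
  const Range &get(std::size_t Block) const {
    assert(Block < Bounds.size() && "block number out of range");
    return Bounds[Block];
  }
};
```

This trades a little memory for a plain array access on every later query, which pays off when block numbers are small and dense.
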
Change the SplitEditor interface so a single instance can be shared for multiple splits. · c9601988
Jakob Stoklund Olesen authored Mar 03, 2011
```
llvm-svn: 126912
```
c9601988
Only run the updateSSA loop when we have actually seen multiple values. · 5ea0712e
Jakob Stoklund Olesen authored Mar 03, 2011
```
When only a single value has been seen, new PHIDefs are never needed.

llvm-svn: 126911
```
5ea0712e

Fix PHI handling in LiveIntervals::shrinkToUses(). · d58c8d12

Jakob Stoklund Olesen authored Mar 03, 2011

We need to wait until we meet a PHIDef in its defining block before resurrecting
PHIKills in the predecessors.

This should unbreak the llvm-gcc-build-x86_64-darwin10-x-mingw32-x-armeabi bot.

llvm-svn: 126905

d58c8d12

Avoid exponential blow-up when printing DAGs. · 24b3ba59

Bob Wilson authored Mar 02, 2011

David Greene changed CannotYetSelect() to print the full DAG including multiple
copies of operands reached through different paths in the DAG. Unfortunately
this blows up exponentially in some cases. The depth limit of 100 is way too
high to prevent this -- I'm seeing a message string of 150MB with a depth of
only 40 in one particularly bad case, even though the DAG has fewer than 200
nodes. Part of the problem is that the printing code is following chain
operands, so if you fail to select an operation with a chain, the printer will
follow all the chained operations back to the entry node.

llvm-svn: 126899

24b3ba59
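
The blow-up described in 24b3ba59 is easy to reproduce on a toy DAG in which every node uses the node below it twice. The sketch below counts node visits for a printer that expands every operand on every path and for one that visits each node once. `Node`, `buildLadder`, `naiveVisits` and `onceVisits` are hypothetical names; this is not SelectionDAG code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

struct Node {
  std::vector<const Node *> Operands;
};

// Depth + 1 nodes; node I uses node I - 1 as both of its operands.
std::vector<Node> buildLadder(std::size_t Depth) {
  std::vector<Node> Nodes(Depth + 1); // sized up front, so addresses are stable
  for (std::size_t I = 1; I <= Depth; ++I)
    Nodes[I].Operands = {&Nodes[I - 1], &Nodes[I - 1]};
  return Nodes;
}

// Visits made when every operand is expanded on every path.
std::uint64_t naiveVisits(const Node *N) {
  assert(N && "expected a node");
  std::uint64_t Count = 1;
  for (const Node *Op : N->Operands)
    Count += naiveVisits(Op);
  return Count;
}

// Visits made when each node is expanded at most once.
std::uint64_t onceVisits(const Node *N, std::unordered_set<const Node *> &Seen) {
  assert(N && "expected a node");
  if (!Seen.insert(N).second)
    return 0;
  std::uint64_t Count = 1;
  for (const Node *Op : N->Operands)
    Count += onceVisits(Op, Seen);
  return Count;
}
```

For a ladder of depth 20 the DAG has only 21 nodes, yet the naive traversal makes 2^21 - 1 visits, which is why a depth limit alone does not bound the output size.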

Turn the Edit member into a pointer so it can change dynamically. · 815196ca
Jakob Stoklund Olesen authored Mar 02, 2011
```
No functional change.

llvm-svn: 126898
```
815196ca

Transfer simply defined values directly without recomputing liveness and SSA. · 503b143a

Jakob Stoklund Olesen authored Mar 02, 2011

Values that map to a single new value in a new interval after splitting don't
need new PHIDefs, and if the parent value was never rematerialized the live
range will be the same.

llvm-svn: 126894

503b143a

Extract a method. No functional change. · 3648263a
Jakob Stoklund Olesen authored Mar 02, 2011
```
llvm-svn: 126893
```
3648263a

Mar 02, 2011

Can't introduce floating-point immediate constants after legalization. · 6b4007de
Stuart Hastings authored Mar 02, 2011
```
Radar 9056407.

llvm-svn: 126864
```
6b4007de
Fix some typos. · daed6f6c
Cameron Zwarich authored Mar 02, 2011
```
llvm-svn: 126829
```
daed6f6c
Move extendRange() into SplitEditor and delete the LiveRangeMap class. · 48af8923
Jakob Stoklund Olesen authored Mar 02, 2011
```
Extract the updateSSA() method from the too long extendRange().

LiveOutCache can be shared among all the new intervals since there is at most
one of the new ranges live out from each basic block.

llvm-svn: 126818
```
48af8923