Commits · f3ac7336848a3dfd09b624b07b2296f8afaa4f20 · Roger Ferrer / llvm-epi-0.8

Jan 05, 2011
- Add a hidden command line option to display edge bundle graphs as they are · f3ac7336
  Jakob Stoklund Olesen authored Jan 05, 2011
```
calculated.

llvm-svn: 122912
```
  f3ac7336
- 80-cols. · c673b21a
  Eric Christopher authored Jan 05, 2011
```
llvm-svn: 122909
```
  c673b21a
Jan 04, 2011
- Remove TODO, these appear to be implemented. · 98851810
  Eric Christopher authored Jan 04, 2011
```
llvm-svn: 122849
```
  98851810
- Turn the EdgeBundles class into a stand-alone machine CFG analysis pass. · f96ae684
  Jakob Stoklund Olesen authored Jan 04, 2011
```
The analysis will be needed by both the greedy register allocator and the
X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't
change.

This pass is very fast, usually showing up as 0.0% wall time.

llvm-svn: 122832
```
  f96ae684
- Switch to path halving from path compression for a small speedup. This also · 5cd3d718
  Cameron Zwarich authored Jan 04, 2011
```
makes getLeader() nonrecursive.

llvm-svn: 122811
```
  5cd3d718
- Eliminate repeated allocation of a per-BB DenseMap for a 4.6% reduction of time · 82e8332a
  Cameron Zwarich authored Jan 04, 2011
```
spent in StrongPHIElimination on 403.gcc.

llvm-svn: 122803
```
  82e8332a
- Clean up a funky pass registration that got passed over when I got rid of static constructors. · 2e28697c
  Owen Anderson authored Jan 04, 2011
```
llvm-svn: 122795
```
  2e28697c
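
The path-halving change in 5cd3d718 above can be illustrated with a small union-find sketch. This is not the code from StrongPHIElimination; the class name `DisjointSets` and its methods are hypothetical and only show the technique: while walking to the root, each visited node is re-pointed at its grandparent, so the lookup is a plain loop with no recursion.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical illustration of union-find with path halving.
class DisjointSets {
  std::vector<unsigned> parent;

public:
  explicit DisjointSets(unsigned n) : parent(n) {
    // Every element starts as the leader of its own set.
    std::iota(parent.begin(), parent.end(), 0u);
  }

  // Iterative lookup: each visited node is linked to its grandparent,
  // which roughly halves the path length on every traversal.
  unsigned find(unsigned x) {
    assert(x < parent.size());
    while (parent[x] != x) {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
    return x;
  }

  // Merge the sets containing a and b (no union-by-rank, for brevity).
  void unite(unsigned a, unsigned b) {
    a = find(a);
    b = find(b);
    if (a != b)
      parent[b] = a;
  }
};
```

Full path compression would need either recursion or a second pass over the path; path halving gets a similar flattening effect in a single pass.
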
Jan 03, 2011
- Use a RecyclingAllocator to allocate values for MachineCSE's ScopedHashTable for · 18f164f7
  Cameron Zwarich authored Jan 03, 2011
```
a 28% speedup of MachineCSE time on 403.gcc.

llvm-svn: 122735
```
  18f164f7
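
The idea behind the recycling allocator in 18f164f7 above can be sketched as a free list: storage released by one value is handed back to the next allocation instead of going through the general-purpose heap. The `RecyclingPool` class below is a hypothetical stand-in written for this note and is not LLVM's `RecyclingAllocator` interface.

```cpp
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical free-list pool: destroyed objects leave their slot on a
// free list, and the next create() reuses that slot.
template <typename T>
class RecyclingPool {
  union Slot {
    Slot *next;                                  // valid while the slot is free
    alignas(T) unsigned char storage[sizeof(T)]; // valid while a T lives here
  };

  Slot *free_list = nullptr;
  std::vector<Slot *> owned; // every slot ever obtained from the heap

public:
  RecyclingPool() = default;
  RecyclingPool(const RecyclingPool &) = delete;
  RecyclingPool &operator=(const RecyclingPool &) = delete;

  ~RecyclingPool() {
    for (Slot *s : owned)
      delete s;
  }

  template <typename... Args>
  T *create(Args &&...args) {
    Slot *s;
    if (free_list) { // reuse a recycled slot when one is available
      s = free_list;
      free_list = s->next;
    } else {         // otherwise fall back to the heap
      s = new Slot;
      owned.push_back(s);
    }
    return new (s->storage) T(std::forward<Args>(args)...);
  }

  void destroy(T *p) {
    p->~T();
    Slot *s = reinterpret_cast<Slot *>(p); // storage sits at offset 0 of Slot
    s->next = free_list;
    free_list = s;
  }

  std::size_t slots_allocated() const { return owned.size(); }
};
```

A table that inserts and erases many same-sized values, as a scoped hash table does when scopes are pushed and popped, keeps reusing the same few slots under this scheme.
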
Jan 02, 2011

split dom frontier handling stuff out to its own DominanceFrontier header, · bf0aa927
Chris Lattner authored Jan 02, 2011
```
so that Dominators.h is *just* domtree.  Also prune #includes a bit.

llvm-svn: 122714
```
bf0aa927

Try to reuse the value when lowering memset. · 25e6e06e

Benjamin Kramer authored Jan 02, 2011

This allows us to compile:
  void test(char *s, int a) {
    __builtin_memset(s, a, 15);
  }
into 1 mul + 3 stores instead of 3 muls + 3 stores.

llvm-svn: 122710

25e6e06e

Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors. · 2fdea4c8

Benjamin Kramer authored Jan 02, 2011

We could implement a DAGCombine to turn x * 0x0101 back into logic operations
on targets that don't support the multiply or where it is slow (P4), if someone
cares enough.

Example code:
  void test(char *s, int a) {
    __builtin_memset(s, a, 4);
  }
before:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    movl  %eax, %ecx
    shll  $8, %ecx
    orl %eax, %ecx
    movl  %ecx, %eax
    shll  $16, %eax
    orl %ecx, %eax
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret
after:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    imull $16843009, %eax, %eax   ## imm = 0x1010101
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret

llvm-svn: 122707

2fdea4c8
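
The equivalence that 2fdea4c8 relies on can be checked in isolation. The sketch below is not the SelectionDAG lowering itself; the function names are made up for illustration. It shows that replicating a byte across a 32-bit word with shifts and ors gives the same result as multiplying the zero-extended byte by 0x01010101.

```cpp
#include <cstdint>

// Replicate a byte into all four bytes of a word with shifts and ors,
// mirroring the "before" assembly.
std::uint32_t splat_with_shifts(std::uint8_t b) {
  std::uint32_t v = b;
  v |= v << 8;  // 0x0000BBBB
  v |= v << 16; // 0xBBBBBBBB
  return v;
}

// The same replication as a single multiply, mirroring the "after"
// assembly (imull $0x1010101). No carries occur because each partial
// product lands in its own byte.
std::uint32_t splat_with_multiply(std::uint8_t b) {
  return static_cast<std::uint32_t>(b) * 0x01010101u;
}
```

Truncating the wide result also yields the narrower splats (the low 16 bits equal `b * 0x0101`), which is the kind of value reuse that 25e6e06e describes.
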

Dec 30, 2010
- Use getVRegDef() instead of def_iterator. This leads to fewer defs being added · 2f6dc10c
  Cameron Zwarich authored Dec 30, 2010
```
with 2-address instructions, for about a 3.5% speedup of StrongPHIElimination on
403.gcc.

llvm-svn: 122635
```
  2f6dc10c
Dec 29, 2010
- None of the other pass names in CodeGen have terminating periods. · 329cd49c
  Cameron Zwarich authored Dec 29, 2010
```
llvm-svn: 122628
```
  329cd49c
- Instead of processing every instruction when splitting interferences, only · 0507f446
  Cameron Zwarich authored Dec 29, 2010
```
process those instructions that define phi sources. This is a 47% speedup of
StrongPHIElimination compile time on 403.gcc.

llvm-svn: 122627
```
  0507f446
- Add a missing word to a comment. · bfef0751
  Cameron Zwarich authored Dec 29, 2010
```
llvm-svn: 122625
```
  bfef0751
- Add text explaining an assertion. · 458fd305
  Cameron Zwarich authored Dec 29, 2010
```
llvm-svn: 122617
```
  458fd305
- Simplify some code in MachineVerifier that was doing the correct thing, but not · 6fe33fdd
  Cameron Zwarich authored Dec 28, 2010
```
in the most obvious way.

llvm-svn: 122610
```
  6fe33fdd
- Revert the optimization in r122596. It is correct for all current targets, but · 146666ea
  Cameron Zwarich authored Dec 28, 2010
```
it relies on assumptions that may not be true in the future.

llvm-svn: 122608
```
  146666ea
Dec 28, 2010

Avoid iterating every operand of an instruction in StrongPHIElimination, since · 92f6e429

Cameron Zwarich authored Dec 28, 2010

we are only interested in the defs when discovering interferences.

This is a 28% speedup running StrongPHIElimination on 403.gcc.

llvm-svn: 122596

92f6e429

Pacify the compiler. BestWeight cannot in fact be used uninitialized · 496770de
Duncan Sands authored Dec 28, 2010
```
in this function, but the compiler was warning that it might be when
doing a release build.

llvm-svn: 122595
```
496770de

Dec 27, 2010

Change an assertion to assert what the code actually relies upon. · 5e5cfbe8
Cameron Zwarich authored Dec 27, 2010
```
llvm-svn: 122586
```
5e5cfbe8

Land a first cut at StrongPHIElimination. There are only 5 new test failures · 25d046ce

Cameron Zwarich authored Dec 27, 2010

when running without the verifier, and I have not yet checked them to see if
the new results are still correct. There are more verifier failures, but they
all seem to be additional occurrences of verifier failures that occur with the
existing PHIElimination pass. There are a few obvious issues with the code:

1) It doesn't properly update the register equivalence classes during copy
insertion, and instead recomputes them before merging live intervals and
renaming registers. I wanted to keep this first patch simple for debugging
purposes, but it shouldn't be very hard to do this.

2) It doesn't mix the renaming and live interval merging with the copy insertion
process, which leads to a lot of virtual register churn. Virtual registers and
live intervals are created, only to later be merged into others. The code should
be smarter and only create a new virtual register if there is no existing
register in the same congruence class.

3) In one place the code uses a DenseMap per basic block, which is unnecessary
heap allocation. There should be an inline storage version of DenseMap.

I did a quick compile-time test of running llc on 403.gcc with and without
StrongPHIElimination. It is slightly slower with StrongPHIElimination, because
the small decrease in the coalescer runtime can't beat the increase in phi
elimination runtime. Perhaps fixing the above performance issues will narrow
the gap.

I also haven't yet run any tests of the quality of the generated code.

llvm-svn: 122582

25d046ce

Add knowledge of phi-def and phi-kill valnos to MachineVerifier's predecessor · b95bfe16

Cameron Zwarich authored Dec 27, 2010

valno verification. The "Different value live out of predecessor" check is
incorrect in the case of phi-def valnos, so just skip that check for phi-def
valnos and instead check that all of the valnos for predecessors have phi-kill.
Fixes PR8863.

llvm-svn: 122581

b95bfe16

Dec 24, 2010

Minor cleanup related to my latest scheduler changes. · 5ce945ca
Andrew Trick authored Dec 24, 2010
```
llvm-svn: 122545
```
5ce945ca

Fix a few cases where the scheduler is not checking for phys reg copies. The scheduling node may have a NULL DAG node, yuck. · c9405669

Andrew Trick authored Dec 24, 2010

llvm-svn: 122544

c9405669

Various bits of framework needed for precise machine-level selection · 10ffc2b6

Andrew Trick authored Dec 24, 2010

DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.

Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.

Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.

Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry
into the scheduler's available queue.

ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about its SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.

ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.

llvm-svn: 122541

10ffc2b6

whitespace · c416ba61
Andrew Trick authored Dec 24, 2010
```
llvm-svn: 122539
```
c416ba61
Simplify a check for implicit defs and remove a FIXME. · ab434079
Cameron Zwarich authored Dec 24, 2010
```
llvm-svn: 122537
```
ab434079

Dec 23, 2010

flags -> glue for selectiondag · 11a33811
Chris Lattner authored Dec 23, 2010
```
llvm-svn: 122509
```
11a33811
sdisel flag -> glue. · f647e95b
Chris Lattner authored Dec 23, 2010
```
llvm-svn: 122507
```
f647e95b
Reorganize ListScheduleBottomUp in preparation for modeling machine cycles and instruction issue. · 528fad91
Andrew Trick authored Dec 23, 2010
```
llvm-svn: 122491
```
528fad91
Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows multiple nodes per cycle. · a52f325c
Andrew Trick authored Dec 23, 2010
```
llvm-svn: 122474
```
a52f325c
In CheckForLiveRegDef use TRI->getOverlaps. · 12acde11
Andrew Trick authored Dec 23, 2010
```
llvm-svn: 122473
```
12acde11

Fixes PR8823: add-with-overflow-128.ll · 033efdf4

Andrew Trick authored Dec 23, 2010

In the bottom-up selection DAG scheduling, handle two-address
instructions that read/write unspillable registers. Treat
the entire chain of two-address nodes as a single live range.

llvm-svn: 122472

033efdf4

Change all self assignments X=X to (void)X, so that we can turn on a · 9b43f336
Jeffrey Yasskin authored Dec 23, 2010
```
new gcc warning that complains on self-assignments and
self-initializations.

llvm-svn: 122458
```
9b43f336
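
For readers unfamiliar with the idiom in 9b43f336, a minimal sketch follows. The function and parameter names are hypothetical. The point is that a variable read only inside an `assert` looks unused in release builds; a self-assignment silences that complaint but would trigger a self-assignment warning, while a cast to `void` marks the variable as used without assigning anything.

```cpp
#include <cassert>

// Hypothetical helper: 'expect_even' is read only by the assert, so when
// NDEBUG is defined nothing else refers to it.
int checked_half(int value, bool expect_even) {
  assert(!expect_even || value % 2 == 0);
  (void)expect_even; // marks the variable as used; replaces 'X = X'
  return value / 2;
}
```
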

DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code. · 1f4dfbbc

Benjamin Kramer authored Dec 22, 2010

example code:
unsigned foo(unsigned x, unsigned y) {
  if (x != 0) y--;
  return y;
}

before:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    sbbl  %eax, %eax              ## encoding: [0x19,0xc0]
    notl  %eax                    ## encoding: [0xf7,0xd0]
    addl  8(%esp), %eax           ## encoding: [0x03,0x44,0x24,0x08]
    ret                           ## encoding: [0xc3]

after:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    movl  8(%esp), %eax           ## encoding: [0x8b,0x44,0x24,0x08]
    adcl  $-1, %eax               ## encoding: [0x83,0xd0,0xff]
    ret                           ## encoding: [0xc3]

llvm-svn: 122455

1f4dfbbc
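
The identity behind the combine in 1f4dfbbc can be verified with ordinary integer arithmetic. The sketch below is not DAGCombiner code and its function names are illustrative. Sign-extending an i1 gives 0 or all ones (that is, -1), zero-extending it gives 0 or 1, so adding the former to X equals subtracting the latter from X.

```cpp
#include <cstdint>

// add (sext i1 b), x : the sign-extended bit is 0 or 0xFFFFFFFF (-1).
std::uint32_t add_sext(bool b, std::uint32_t x) {
  std::uint32_t sext = b ? 0xFFFFFFFFu : 0u;
  return sext + x; // unsigned arithmetic wraps modulo 2^32
}

// sub x, (zext i1 b) : the zero-extended bit is 0 or 1.
std::uint32_t sub_zext(bool b, std::uint32_t x) {
  std::uint32_t zext = b ? 1u : 0u;
  return x - zext;
}
```

In the `foo` example above, `b` corresponds to `x != 0` and the result is `y - 1` or `y`.
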

Dec 22, 2010
- When RegAllocGreedy decides to spill the interferences of the current register, · 0acb69d5
  Jakob Stoklund Olesen authored Dec 22, 2010
```
pick the victim with the lowest total spill weight.

llvm-svn: 122445
```
  0acb69d5
- Include a shadow of the original CFG edges in the edge bundle graph. · 29836e65
  Jakob Stoklund Olesen authored Dec 22, 2010
```
llvm-svn: 122444
```
  29836e65
- Fix a bug in ReduceLoadWidth that wasn't handling extending · cafc1e60
  Chris Lattner authored Dec 22, 2010
```
loads properly.  We miscompiled the testcase into:

_test:                                  ## @test
	movl	$128, (%rdi)
	movzbl	1(%rdi), %eax
	ret

Now we get a proper:

_test:                                  ## @test
	movl	$128, (%rdi)
	movsbl	(%rdi), %eax
	movzbl	%ah, %eax
	ret

This fixes PR8757.

llvm-svn: 122392
```
  cafc1e60
- more cleanups, move a check for "roundedness" earlier to reject · 9a499e96
  Chris Lattner authored Dec 22, 2010
```
unhandled cases faster and simplify code.

llvm-svn: 122391
```
  9a499e96
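
The victim selection described in 0acb69d5 above amounts to a minimum search over summed weights. The sketch below is only an assumption about the shape of that decision, not RegAllocGreedy's code: the `Candidate` struct and `pick_cheapest_victim` are hypothetical, and each candidate simply carries the spill weights of the live ranges that interfere with it.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one candidate physical register and the spill
// weights of the live ranges currently interfering with it.
struct Candidate {
  unsigned phys_reg;
  std::vector<float> interference_weights;
};

// Returns the index of the candidate whose interferences have the lowest
// total spill weight, or -1 when there are no candidates. Ties keep the
// earliest candidate.
int pick_cheapest_victim(const std::vector<Candidate> &cands) {
  int best = -1;
  float best_total = 0.0f;
  for (std::size_t i = 0; i < cands.size(); ++i) {
    float total = std::accumulate(cands[i].interference_weights.begin(),
                                  cands[i].interference_weights.end(), 0.0f);
    if (best < 0 || total < best_total) {
      best = static_cast<int>(i);
      best_total = total;
    }
  }
  return best;
}
```
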