- May 23, 2004
-
Tanya Lattner authored
llvm-svn: 13665
-
Tanya Lattner authored
llvm-svn: 13661
-
Vikram S. Adve authored
a direct callee may have indirect callees and so may have changed. llvm-svn: 13649
-
Brian Gaeke authored
llvm-svn: 13643
-
- May 21, 2004
-
Chris Lattner authored
llvm-svn: 13618
-
- May 20, 2004
-
Chris Lattner authored
fix the really bad code we're getting on PPC. llvm-svn: 13609
-
Brian Gaeke authored
a full 64-bit address, it must be adjusted to fit in the branch instruction's immediate field. (This is only used in the reoptimizer, for now.) llvm-svn: 13608
-
- May 19, 2004
-
Brian Gaeke authored
Fix a typo in a debug message. llvm-svn: 13607
-
Brian Gaeke authored
effects as) CloneFunctionInto(). llvm-svn: 13601
-
Brian Gaeke authored
CloneTrace, and because it is primarily an operation on ValueMaps. It is now a global (non-static) function which can be pulled in using ValueMapper.h. llvm-svn: 13600
-
- May 17, 2004
-
Brian Gaeke authored
correct error message. llvm-svn: 13590
-
- May 14, 2004
-
Brian Gaeke authored
Add better comments, including a better head-of-file comment.
Prune #includes.
Fix a FIXME that Chris put here by using doInitialization().
Use DEBUG() to print out debug msgs.
Give names to basic blocks inserted by this pass.
Expand tabs.
Use InsertProfilingInitCall() from ProfilingUtils to insert the initialize call.
llvm-svn: 13581
-
Brian Gaeke authored
MachineBasicBlocks instead. llvm-svn: 13568
-
Brian Gaeke authored
Get rid of separate numbering for LLVM BasicBlocks; use the automatically generated MachineBasicBlock numbering. llvm-svn: 13567
-
Brian Gaeke authored
LLVM BasicBlock operands. llvm-svn: 13566
-
- May 13, 2004
-
Chris Lattner authored
llvm-svn: 13565
-
Chris Lattner authored
in the size calculation. This is not something you want to see:

    Loop Unroll: F[main] Loop %no_exit  Loop Size = 2  Trip Count = 2147483648 - UNROLLING!

The problem was that 2*2147483648 == 0. Now we get:

    Loop Unroll: F[main] Loop %no_exit  Loop Size = 2  Trip Count = 2147483648 - TOO LARGE: 4294967296>100

Thanks to some anonymous person playing with the demo page that repeatedly caused zion to go into swapping land. That's one way to ensure you'll get a quick bugfix. :)
Testcase here: Transforms/LoopUnroll/2004-05-13-DontUnrollTooMuch.ll
llvm-svn: 13564
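The fix amounts to doing the size arithmetic in a wider type. A minimal C++ sketch of the guard (hedged: invented names and threshold, not the actual LoopUnroll code):

    #include <cstdint>
    #include <cstdio>

    // Widen before multiplying: in 32 bits, 2 * 2147483648 wraps to 0,
    // which made the heuristic think the unrolled loop was free.
    static bool shouldUnroll(std::uint32_t LoopSize, std::uint32_t TripCount,
                             std::uint32_t Threshold) {
      std::uint64_t UnrolledSize = std::uint64_t(LoopSize) * TripCount;
      return UnrolledSize <= Threshold;
    }

    int main() {
      // 2 * 2147483648 = 4294967296 > 100, so this refuses to unroll.
      std::printf("%s\n", shouldUnroll(2, 2147483648u, 100) ? "UNROLLING"
                                                            : "TOO LARGE");
    }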
-
Chris Lattner authored
and passing a null pointer into a function. For this testcase:

    void %test(int** %X) {
      store int* null, int** %X
      call void %test(int** null)
      ret void
    }

we now generate this:

    test:
      sub %ESP, 12
      mov %EAX, DWORD PTR [%ESP + 16]
      mov DWORD PTR [%EAX], 0
      mov DWORD PTR [%ESP], 0
      call test
      add %ESP, 12
      ret

instead of this:

    test:
      sub %ESP, 12
      mov %EAX, DWORD PTR [%ESP + 16]
      mov %ECX, 0
      mov DWORD PTR [%EAX], %ECX
      mov %EAX, 0
      mov DWORD PTR [%ESP], %EAX
      call test
      add %ESP, 12
      ret

llvm-svn: 13558
-
Chris Lattner authored
the alloca address into common operations like loads/stores. In a simple testcase like this (which is just designed to exercise the alloca A, nothing more):

    int %test(int %X, bool %C) {
      %A = alloca int
      store int %X, int* %A
      store int* %A, int** %G
      br bool %C, label %T, label %F
    T:
      call int %test(int 1, bool false)
      %V = load int* %A
      ret int %V
    F:
      call int %test(int 123, bool true)
      %V2 = load int* %A
      ret int %V2
    }

We now generate:

    test:
      sub %ESP, 12
      mov %EAX, DWORD PTR [%ESP + 16]
      mov %CL, BYTE PTR [%ESP + 20]
    ***  mov DWORD PTR [%ESP + 8], %EAX
      mov %EAX, OFFSET G
      lea %EDX, DWORD PTR [%ESP + 8]
      mov DWORD PTR [%EAX], %EDX
      test %CL, %CL
      je .LBB2 # PC rel: F
    .LBB1: # T
      mov DWORD PTR [%ESP], 1
      mov DWORD PTR [%ESP + 4], 0
      call test
    ***  mov %EAX, DWORD PTR [%ESP + 8]
      add %ESP, 12
      ret
    .LBB2: # F
      mov DWORD PTR [%ESP], 123
      mov DWORD PTR [%ESP + 4], 1
      call test
    ***  mov %EAX, DWORD PTR [%ESP + 8]
      add %ESP, 12
      ret

Instead of:

    test:
      sub %ESP, 20
      mov %EAX, DWORD PTR [%ESP + 24]
      mov %CL, BYTE PTR [%ESP + 28]
    ***  lea %EDX, DWORD PTR [%ESP + 16]
    ***  mov DWORD PTR [%EDX], %EAX
      mov %EAX, OFFSET G
      mov DWORD PTR [%EAX], %EDX
      test %CL, %CL
    ***  mov DWORD PTR [%ESP + 12], %EDX
      je .LBB2 # PC rel: F
    .LBB1: # T
      mov DWORD PTR [%ESP], 1
      mov %EAX, 0
      mov DWORD PTR [%ESP + 4], %EAX
      call test
    ***  mov %EAX, DWORD PTR [%ESP + 12]
    ***  mov %EAX, DWORD PTR [%EAX]
      add %ESP, 20
      ret
    .LBB2: # F
      mov DWORD PTR [%ESP], 123
      mov %EAX, 1
      mov DWORD PTR [%ESP + 4], %EAX
      call test
    ***  mov %EAX, DWORD PTR [%ESP + 12]
    ***  mov %EAX, DWORD PTR [%EAX]
      add %ESP, 20
      ret

llvm-svn: 13557
-
Chris Lattner authored
sized allocas in the entry block). Instead of generating code like this:

    entry:
      reg1024 = ESP+1234
    ... (much later)
      *reg1024 = 17

Generate code that looks like this:

    entry: (no code generated)
    ... (much later)
      t = ESP+1234
      *t = 17

The advantage being that we DRAMATICALLY reduce the register pressure for these silly temporaries (they were all being spilled to the stack, resulting in very silly code). This is actually a manual implementation of rematerialization. :) I have a patch to fold the alloca address computation into loads & stores, which will make this much better still, but just getting this right took way too much time and I'm sleepy.
llvm-svn: 13554
-
- May 12, 2004
-
Chris Lattner authored
broke obsequi and a lot of other things. It all boiled down to MBB being overloaded in an inner scope and me confusing it with the one in the outer scope. Ugh! llvm-svn: 13517
-
Brian Gaeke authored
llvm-svn: 13515
-
Brian Gaeke authored
MBBs start out as #-1. When an MBB is added to a MachineFunction, it gets the next available unique MBB number. If it is removed from a MachineFunction, it goes back to being #-1. llvm-svn: 13514
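A small C++ sketch of that lifecycle (hypothetical names, not the real MachineBasicBlock API):

    #include <cassert>

    struct Block {
      int Number = -1; // starts out as #-1: not in any function
    };

    struct Function {
      int NextNumber = 0;
      void insert(Block &B) {
        assert(B.Number == -1 && "block is already numbered");
        B.Number = NextNumber++; // next available unique number
      }
      void remove(Block &B) {
        B.Number = -1; // back to #-1 once removed
      }
    };

    int main() {
      Function F;
      Block B;
      F.insert(B); // B.Number == 0
      F.remove(B); // B.Number == -1 again
    }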
-
Chris Lattner authored
    mov DWORD PTR [%ESP + 4], 1

instead of:

    mov %EAX, 1
    mov DWORD PTR [%ESP + 4], %EAX

llvm-svn: 13494
-
Chris Lattner authored
give the extracted function a more useful name than just foo_code. llvm-svn: 13493
-
Chris Lattner authored
instruction in them. llvm-svn: 13490
-
Chris Lattner authored
PHI node entries from multiple outside-the-region blocks. This also fixes extraction of the entry block in a function. Yaay. This has successfully block-extracted all but one of the 33 blocks from the score_move function in obsequi. Hrm, I wonder which block the bug is in. :) llvm-svn: 13489
-
Chris Lattner authored
* Add a stub for severSplitPHINodes, which will allow us to bbextract bb's with PHI nodes in them soon.
* Remove unused arguments from findInputsOutputs.
* Dramatically simplify the code in findInputsOutputs. In particular, nothing really cares whether or not a PHI node is using something.
* Move moveCodeToFunction to after emitCallAndSwitchStatement, as that's the order they get called.
* Fix a bug where we would code extract a region that included a call to vastart. Like 'alloca', calls to vastart must stay in the function that they are defined in.
* Add some comments.
llvm-svn: 13482
-
Chris Lattner authored
from the extracted region. If the region has 0 or 1 exit blocks, the new function returns void. If it has 2 exits, it returns bool; otherwise it returns a ushort as before. This allows us to use a conditional branch instruction when there are two exit blocks, as often happens during block extraction. llvm-svn: 13481
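The policy reads roughly like this (a hedged C++ sketch, not the actual CodeExtractor code):

    #include <cstdio>

    enum class RetKind { Void, Bool, UShort };

    RetKind pickReturnKind(unsigned NumExitBlocks) {
      if (NumExitBlocks <= 1) return RetKind::Void; // nothing to dispatch on
      if (NumExitBlocks == 2) return RetKind::Bool; // one conditional branch
      return RetKind::UShort;                       // fall back to a switch
    }

    int main() {
      std::printf("%d\n", (int)pickReturnKind(2)); // 1, i.e. Bool
    }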
-
Chris Lattner authored
1. Get rid of the silly abort block. When doing bb extraction, we get one abort block for every block extracted, which is kinda annoying.
2. If the switch ends up having a single destination, turn it into an unconditional branch. I would like to add support for conditional branches, but to do this we will want to have the function return a bool instead of a ushort.
llvm-svn: 13478
-
- May 10, 2004
-
Chris Lattner authored
phi-elimination from 0.6s to 0.54s on kc++. llvm-svn: 13454
-
Chris Lattner authored
in the basic block being processed. This fixes PhiElimination on kimwitu++ from taking 105s to taking a much more reasonable 0.6s (in a debug build). llvm-svn: 13453
-
Chris Lattner authored
than before. Because this is the case, we can compute the first non-phi instruction once when de-phi'ing a block. This shaves ~4s off phi-elimination of _Z7yyparsev in kimwitu++, from 109s to 105s. There are still much more important gains to come. llvm-svn: 13452
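Since PHI nodes are grouped at the front of a block, the insertion point only has to be found once. A toy C++ sketch of the idea (invented types, not LLVM's):

    #include <cstddef>
    #include <vector>

    enum class Op { Phi, Other };
    struct Instr { Op Opcode; };

    // Scan for the first non-PHI once per block and reuse it as the
    // insertion point for every copy, instead of rescanning per PHI.
    std::size_t firstNonPhi(const std::vector<Instr> &Block) {
      std::size_t I = 0;
      while (I < Block.size() && Block[I].Opcode == Op::Phi)
        ++I;
      return I;
    }

    int main() {
      std::vector<Instr> B{{Op::Phi}, {Op::Phi}, {Op::Other}};
      return (int)firstNonPhi(B); // 2
    }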
-
Chris Lattner authored
compiling things like 'add long %X, 1'. The problem is that we were switching the order of the operands for longs even though we can't fold them yet. llvm-svn: 13451
-
Chris Lattner authored
when we see a read of a register. This is important in cases like:

    AL = ...
    AH = ...
       = AX

The read of AX must make both the AL and AH defs live until the use.
llvm-svn: 13444
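A hedged C++ sketch of alias-aware liveness (the alias table is hard-coded here for illustration; the real allocator derives it from the target description):

    #include <map>
    #include <string>
    #include <vector>

    const std::map<std::string, std::vector<std::string>> Aliases = {
        {"AX", {"AL", "AH"}}, {"AL", {"AX"}}, {"AH", {"AX"}}};

    // On a read, extend the live range of the register and of every
    // register aliasing it, so the AL and AH defs survive to the AX use.
    void handleRead(const std::string &Reg, int Point,
                    std::map<std::string, int> &LastUsePoint) {
      LastUsePoint[Reg] = Point;
      auto It = Aliases.find(Reg);
      if (It != Aliases.end())
        for (const std::string &A : It->second)
          LastUsePoint[A] = Point;
    }

    int main() {
      std::map<std::string, int> LastUse;
      handleRead("AX", 3, LastUse); // AL and AH are now live through 3 too
    }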
-
Chris Lattner authored
llvm-svn: 13440
-
Chris Lattner authored
llvm-svn: 13439
-
- May 09, 2004
-
Chris Lattner authored
syntactically loopify natural loops so that the GCC loop optimizer can find them. This should *dramatically* improve the performance of CBE-compiled code on targets that depend on GCC's loop optimizations (like PPC). llvm-svn: 13438
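The difference, in hand-written miniature (hedged: illustrative C++, not actual CBE output):

    // A natural loop emitted with gotos: correct, but GCC's loop
    // optimizer does not recognize it as a loop.
    int sumGoto(int N) {
      if (N <= 0) return 0;
      int I = 0, S = 0;
    loop:
      S += I;
      if (++I < N)
        goto loop;
      return S;
    }

    // The same loop "syntactically loopified", where GCC's loop
    // passes can find and optimize it.
    int sumLoop(int N) {
      int S = 0;
      for (int I = 0; I < N; ++I)
        S += I;
      return S;
    }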
-
Chris Lattner authored
llvm-svn: 13437
-
Chris Lattner authored
llvm-svn: 13436
-