- Jan 19, 2005
-
Chris Lattner authored
well as all of the other stuff in livevar. This fixes the compiler crash on fourinarow last night. llvm-svn: 19695
-
Chris Lattner authored
llvm-svn: 19694
-
Chris Lattner authored
llvm-svn: 19693
-
Chris Lattner authored
typically cost 1 cycle) instead of shld/shrd instructions (which are typically 6 or more cycles). This also saves code space. For example, instead of emitting:

    rotr:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %CL, BYTE PTR [%ESP + 8]
        shrd %EAX, %EAX, %CL
        ret
    rotli:
        mov %EAX, DWORD PTR [%ESP + 4]
        shrd %EAX, %EAX, 27
        ret

Emit:

    rotr32:
        mov %CL, BYTE PTR [%ESP + 8]
        mov %EAX, DWORD PTR [%ESP + 4]
        ror %EAX, %CL
        ret
    rotli32:
        mov %EAX, DWORD PTR [%ESP + 4]
        ror %EAX, 27
        ret

We also emit byte rotate instructions, which do not have a sh[lr]d counterpart at all. llvm-svn: 19692
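For reference, a minimal C++ sketch (mine, not part of the commit) of the rotate idiom that a single 32-bit ror/rol instruction implements; the names rotr32/rotl32 merely echo the labels in the assembly above, and the explicit masking keeps every shift well defined:

    #include <cstdint>

    uint32_t rotr32(uint32_t x, uint32_t n) {
        n &= 31;                                   // keep the amount in [0, 31]
        return (x >> n) | (x << ((32 - n) & 31));  // the whole expression is one right rotate
    }

    uint32_t rotl32(uint32_t x, uint32_t n) {
        n &= 31;
        return (x << n) | (x >> ((32 - n) & 31));  // the whole expression is one left rotate
    }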
-
Chris Lattner authored
llvm-svn: 19690
-
Chris Lattner authored
llvm-svn: 19689
-
Chris Lattner authored
llvm-svn: 19687
-
Chris Lattner authored
This allows us to generate this:

    foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EDX, DWORD PTR [%ESP + 8]
        shld %EDX, %EDX, 2
        shl %EAX, 2
        ret

instead of this:

    foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
        mov %EDX, %EAX
        shrd %EDX, %ECX, 30
        shl %EAX, 2
        ret

Note the magically transmogrifying immediate. llvm-svn: 19686
-
Chris Lattner authored
instead of doing it manually. llvm-svn: 19685
-
Chris Lattner authored
Add default impl of commuteInstruction. Add notes about ugly V9 code. llvm-svn: 19684
-
Chris Lattner authored
    foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EDX, DWORD PTR [%ESP + 8]
        shrd %EAX, %EDX, 2
        sar %EDX, 2
        ret

instead of this:

    test1:
        mov %ECX, DWORD PTR [%ESP + 4]
        shr %ECX, 2
        mov %EDX, DWORD PTR [%ESP + 8]
        mov %EAX, %EDX
        shl %EAX, 30
        or %EAX, %ECX
        sar %EDX, 2
        ret

and long << 2 to this:

    foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
***     mov %EDX, %EAX
        shrd %EDX, %ECX, 30
        shl %EAX, 2
        ret

instead of this:

    foo:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, %EAX
        shr %ECX, 30
        mov %EDX, DWORD PTR [%ESP + 8]
        shl %EDX, 2
        or %EDX, %ECX
        shl %EAX, 2
        ret

The extra copy (marked ***) can be eliminated when I teach the code generator that shrd32rri8 is really commutative. llvm-svn: 19681
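A hedged C++ illustration (mine, not from the commit) of the kind of source that produces the 64-bit shifts above when targeting 32-bit x86; the function names are made up:

    #include <cstdint>

    int64_t shift_right_2(int64_t x) { return x >> 2; }  // low half: shrd; high half: sar
    int64_t shift_left_2(int64_t x)  { return x << 2; }  // high half: shld; low half: shl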
-
Chris Lattner authored
select operations or to shifts that are by a constant. This automatically implements (with no special code) all of the special cases for shift by 32, shift by < 32 and shift by > 32. llvm-svn: 19679
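As a hedged sketch of the expansion strategy (my own C++ rendering, not the legalizer's code), a variable 64-bit left shift can be built from 32-bit shifts plus selects on the shift amount; the single structure below covers amounts below, equal to, and above 32 with no special cases (it assumes amt < 64):

    #include <cstdint>

    uint64_t shl64(uint32_t lo, uint32_t hi, uint32_t amt) {
        uint32_t s = amt & 31;
        // Funnel the low half into the high half; the inner select keeps s == 0 well defined.
        uint32_t hi_small = (hi << s) | (s ? lo >> (32 - s) : 0u);
        uint32_t lo_small = lo << s;
        // Select on bit 5 of the amount: for amt >= 32 the shifted low half becomes
        // the high half and the low half becomes zero.
        uint32_t out_hi = (amt & 32) ? lo_small : hi_small;
        uint32_t out_lo = (amt & 32) ? 0u : lo_small;
        return ((uint64_t)out_hi << 32) | out_lo;
    }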
-
Chris Lattner authored
llvm-svn: 19678
-
Chris Lattner authored
range. Either they are undefined (the default), they mask the shift amount to the size of the register (X86, Alpha, etc), or they extend the shift (PPC). This defaults to undefined, which is conservatively correct. llvm-svn: 19677
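To make the three behaviors concrete, a small C++ illustration (mine, not from the patch) of what a 32-bit left shift by an out-of-range amount such as 33 means under each rule: if it is undefined, the legalizer may assume it never happens; a masking target (X86, Alpha) computes x << (33 & 31), i.e. x << 1; an extending target (PPC) shifts everything out and produces 0:

    #include <cstdint>

    uint32_t shl_masked(uint32_t x, uint32_t amt)   { return x << (amt & 31); }          // X86/Alpha-style
    uint32_t shl_extended(uint32_t x, uint32_t amt) { return amt >= 32 ? 0 : x << amt; } // PPC-style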
-
- Jan 18, 2005
-
Chris Lattner authored
llvm-svn: 19675
-
Chris Lattner authored
FP_EXTEND from! llvm-svn: 19674
-
Chris Lattner authored
llvm-svn: 19673
-
Chris Lattner authored
don't need to even think about F32 in the X86 code anymore. llvm-svn: 19672
-
Chris Lattner authored
of zero and sign extends. llvm-svn: 19671
-
Chris Lattner authored
llvm-svn: 19670
-
Chris Lattner authored
do it. This results in better code on X86 for floats (because if strict precision is not required, we can elide some more expensive double -> float conversions like the old isel did), and allows other targets to emit CopyFromRegs that are not legal for arguments. llvm-svn: 19668
-
Chris Lattner authored
llvm-svn: 19667
-
Chris Lattner authored
llvm-svn: 19661
-
Tanya Lattner authored
llvm-svn: 19660
-
Chris Lattner authored
llvm-svn: 19659
-
Chris Lattner authored
* Insert some really pedantic assertions that will notice when we emit the same loads more than one time, exposing bugs. This turns a miscompilation in bzip2 into a compile-fail. yaay. llvm-svn: 19658
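The general idea behind such a check, as a hedged C++ sketch (the real assertions live inside the instruction selector and use its own node types, which are only stand-ins here):

    #include <cassert>
    #include <set>

    struct Node;  // stand-in for a selection-DAG load node

    std::set<const Node *> EmittedLoads;

    // Record that a load has been emitted; fire an assertion if the same load is
    // ever emitted a second time.
    void noteLoadEmitted(const Node *N) {
        bool Inserted = EmittedLoads.insert(N).second;
        assert(Inserted && "emitted the same load more than once!");
        (void)Inserted;
    }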
-
Chris Lattner authored
llvm-svn: 19657
-
Chris Lattner authored
llvm-svn: 19656
-
Chris Lattner authored
match (X+Y)+(Z << 1), because we match the X+Y first, consuming the index register, and then there is no place to put the Z. llvm-svn: 19652
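A hypothetical source pattern (not taken from the commit) showing why the order matters: the address below has three register-sized pieces, X, Y, and Z*2, but an x86 address supplies only one base and one scaled index, so the selector should claim the index slot for Z << 1 and add X + Y separately:

    char load_elt(char *X, long Y, long Z) {
        // Address is (X + Y) + (Z << 1): compute X + Y with an explicit add,
        // then address the byte as [tmp + Z*2].
        return X[Y + 2 * Z];
    }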
-
Chris Lattner authored
llvm-svn: 19651
-
Chris Lattner authored
emitted too early. In particular, this fixes Regression/CodeGen/X86/regpressure.ll:regpressure3. This also improves the 2nd basic block in 164.gzip:flush_block, which went from

    .LBBflush_block_1:  # loopentry.1.i
        movzx %EAX, WORD PTR [dyn_ltree + 20]
        movzx %ECX, WORD PTR [dyn_ltree + 16]
        mov DWORD PTR [%ESP + 32], %ECX
        movzx %ECX, WORD PTR [dyn_ltree + 12]
        movzx %EDX, WORD PTR [dyn_ltree + 8]
        movzx %EBX, WORD PTR [dyn_ltree + 4]
        mov DWORD PTR [%ESP + 36], %EBX
        movzx %EBX, WORD PTR [dyn_ltree]
        add DWORD PTR [%ESP + 36], %EBX
        add %EDX, DWORD PTR [%ESP + 36]
        add %ECX, %EDX
        add DWORD PTR [%ESP + 32], %ECX
        add %EAX, DWORD PTR [%ESP + 32]
        movzx %ECX, WORD PTR [dyn_ltree + 24]
        add %EAX, %ECX
        mov %ECX, 0
        mov %EDX, %ECX

to

    .LBBflush_block_1:  # loopentry.1.i
        movzx %EAX, WORD PTR [dyn_ltree]
        movzx %ECX, WORD PTR [dyn_ltree + 4]
        add %ECX, %EAX
        movzx %EAX, WORD PTR [dyn_ltree + 8]
        add %EAX, %ECX
        movzx %ECX, WORD PTR [dyn_ltree + 12]
        add %ECX, %EAX
        movzx %EAX, WORD PTR [dyn_ltree + 16]
        add %EAX, %ECX
        movzx %ECX, WORD PTR [dyn_ltree + 20]
        add %ECX, %EAX
        movzx %EAX, WORD PTR [dyn_ltree + 24]
        add %ECX, %EAX
        mov %EAX, 0
        mov %EDX, %EAX

... which results in less spilling in the function. This change alone speeds up 164.gzip from 37.23s to 36.24s on apoc. The default isel takes 37.31s. llvm-svn: 19650
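For readers following along, a hypothetical C++ reduction (not gzip's actual source; the struct and names are made up) with the same shape as the block above: it sums a 16-bit field from seven consecutive 4-byte array elements, which is where the seven movzx loads at offsets 0 through 24 come from:

    #include <cstdint>

    struct ct_data { uint16_t freq; uint16_t code; };  // 4 bytes per element
    ct_data dyn_ltree[7] = {};

    int sum_freqs() {
        int s = 0;
        for (int i = 0; i < 7; ++i)
            s += dyn_ltree[i].freq;  // one movzx per element once unrolled
        return s;
    }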
-
Chris Lattner authored
llvm-svn: 19649
-
Chris Lattner authored
llvm-svn: 19647
-
- Jan 17, 2005
-
Chris Lattner authored
llvm-svn: 19645
-
Chris Lattner authored
X86/reg-pressure.ll again, and allows us to do nice things in other cases. For example, we now codegen this sort of thing:

    int %loadload(int *%X, int* %Y) {
        %Z = load int* %Y
        %Y = load int* %X     ;; load between %Z and store
        %Q = add int %Z, 1
        store int %Q, int* %Y
        ret int %Y
    }

Into this:

    loadload:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EAX, DWORD PTR [%EAX]
        mov %ECX, DWORD PTR [%ESP + 8]
        inc DWORD PTR [%ECX]
        ret

where we weren't able to form the 'inc [mem]' before. This also lets the instruction selector emit loads in any order it wants to, which can be good for register pressure as well. llvm-svn: 19644
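Roughly the same example in C (my paraphrase of the 2005-era IR above, for readability): the load through X sits between the load through Y and the store back through Y, and the value loaded through X is what gets returned:

    int loadload(int *X, int *Y) {
        int Z = *Y;
        int V = *X;   // load between the load of *Y and the store through Y
        *Y = Z + 1;   // becomes 'inc DWORD PTR [mem]'
        return V;
    }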
-
Chris Lattner authored
1. Fold [mem] += (1|-1) into inc [mem]/dec [mem] to save some icache space.
2. Do not let token factor nodes prevent forming '[mem] op= val' folds.
llvm-svn: 19643
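A minimal illustration (mine, not a test from the commit) of the first fold: a read-modify-write increment that can become a single inc on the memory operand instead of a load, an add of 1, and a store:

    void bump(int *counter) {
        *counter += 1;   // foldable to: inc DWORD PTR [counter]
    }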
-
Chris Lattner authored
llvm-svn: 19642
-
Chris Lattner authored
llvm-svn: 19641
-
Chris Lattner authored
the basic block that uses them if possible. This is a big win on X86, as it lets us fold the argument loads into instructions and reduce register pressure (by not loading all of the arguments in the entry block). For this (contrived to show the optimization) testcase:

    int %argtest(int %A, int %B) {
        %X = sub int 12345, %A
        br label %L
    L:
        %Y = add int %X, %B
        ret int %Y
    }

we used to produce:

    argtest:
        mov %ECX, DWORD PTR [%ESP + 4]
        mov %EAX, 12345
        sub %EAX, %ECX
        mov %EDX, DWORD PTR [%ESP + 8]
    .LBBargtest_1:  # L
        add %EAX, %EDX
        ret

now we produce:

    argtest:
        mov %EAX, 12345
        sub %EAX, DWORD PTR [%ESP + 4]
    .LBBargtest_1:  # L
        add %EAX, DWORD PTR [%ESP + 8]
        ret

This also fixes the FIXME in the code. BTW, this occurs in real code. 164.gzip shrinks from 8623 to 8608 lines of .s file. The stack frame in huft_build shrinks from 1644->1628 bytes, inflate_codes shrinks from 116->108 bytes, and inflate_block from 2620->2612, due to fewer spills. Take that alkis. :-) llvm-svn: 19639
-
Chris Lattner authored
operations. The body of the if is less indented but unmodified in this patch. llvm-svn: 19638
-