  1. Nov 13, 2004
    • 049d33a7
      Chris Lattner authored
      shld is a very high latency operation. Instead of emitting it for shifts of
       two or three, open code the equivalent operation, which is faster on Athlon
       and P4 (by a substantial margin).
      
      For example, instead of compiling this:
      
      long long X2(long long Y) { return Y << 2; }
      
      to:
      
       X2:
              movl 4(%esp), %eax
              movl 8(%esp), %edx
              shldl $2, %eax, %edx
              shll $2, %eax
              ret
      
      Compile it to:
      
      X2:
              movl 4(%esp), %eax
              movl 8(%esp), %ecx
              movl %eax, %edx
              shrl $30, %edx
              leal (%edx,%ecx,4), %edx
              shll $2, %eax
              ret
      
      Likewise, for << 3, compile to:
      
      X3:
              movl 4(%esp), %eax
              movl 8(%esp), %ecx
              movl %eax, %edx
              shrl $29, %edx
              leal (%edx,%ecx,8), %edx
              shll $3, %eax
              ret
      
      This matches icc, except that icc open codes the shifts as adds on the P4.
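
       A rough C-level sketch of the open-coded form (the helper name and
       the split into two 32-bit halves are illustrative, not taken from
       the compiler): the new high word is the old high word shifted left
       by N, plus the N bits that spill out of the top of the low word.

       #include <stdint.h>

       /* Open-coded 64-bit shift left by a small constant n (2 or 3). */
       static uint64_t shl64_by_n(uint32_t lo, uint32_t hi, unsigned n) {
           uint32_t new_hi = (hi << n) + (lo >> (32 - n)); /* the lea pattern */
           uint32_t new_lo = lo << n;                      /* shll $n */
           return ((uint64_t)new_hi << 32) | new_lo;
       }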
      
      llvm-svn: 17707
    • Add missing check · ef6bd92a
      Chris Lattner authored
      llvm-svn: 17706
    • Compile: · 8d521bb1
      Chris Lattner authored
      long long X3_2(long long Y) { return Y+Y; }
      int X(int Y) { return Y+Y; }
      
      into:
      
      X3_2:
              movl 4(%esp), %eax
              movl 8(%esp), %edx
              addl %eax, %eax
              adcl %edx, %edx
              ret
      X:
              movl 4(%esp), %eax
              addl %eax, %eax
              ret
      
      instead of:
      
      X3_2:
              movl 4(%esp), %eax
              movl 8(%esp), %edx
              shldl $1, %eax, %edx
              shll $1, %eax
              ret
      
      X:
              movl 4(%esp), %eax
              shll $1, %eax
              ret
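
       A short C sketch of why one add plus one add-with-carry is enough
       (the helper name is illustrative): doubling the low word may carry
       out, and that carry is exactly what adcl folds into the high word.

       #include <stdint.h>

       static uint64_t double64(uint32_t lo, uint32_t hi) {
           uint32_t new_lo = lo + lo;          /* addl %eax, %eax */
           uint32_t carry  = new_lo < lo;      /* carry out of the low-word add */
           uint32_t new_hi = hi + hi + carry;  /* adcl %edx, %edx */
           return ((uint64_t)new_hi << 32) | new_lo;
       }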
      
      llvm-svn: 17705
  2. Oct 17, 2004
    • Don't print stuff out from the code generator. This broke the JIT horribly · 06855531
      Chris Lattner authored
      last night. :)  bork!
      
      llvm-svn: 17093
    • Rewrite support for cast uint -> FP. In particular, we used to compile this: · 839abf57
      Chris Lattner authored
      double %test(uint %X) {
              %tmp.1 = cast uint %X to double         ; <double> [#uses=1]
              ret double %tmp.1
      }
      
      into:
      
      test:
              sub %ESP, 8
              mov %EAX, DWORD PTR [%ESP + 12]
              mov %ECX, 0
              mov DWORD PTR [%ESP], %EAX
              mov DWORD PTR [%ESP + 4], %ECX
              fild QWORD PTR [%ESP]
              add %ESP, 8
              ret
      
       ... which basically zero-extends the value to 8 bytes, then does an fild of an
       8-byte signed int.
      
      Now we generate this:
      
      
      test:
              sub %ESP, 4
              mov %EAX, DWORD PTR [%ESP + 8]
              mov DWORD PTR [%ESP], %EAX
              fild DWORD PTR [%ESP]
              shr %EAX, 31
              fadd DWORD PTR [.CPItest_0 + 4*%EAX]
              add %ESP, 4
              ret
      
              .section .rodata
              .align  4
      .CPItest_0:
              .quad   5728578726015270912
      
      This does a 32-bit signed integer load, then adds in an offset if the sign
      bit of the integer was set.
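
       In C terms, the lowering amounts to roughly the following sketch
       (the table and helper names are illustrative, not the actual
       constant pool machinery):

       #include <stdint.h>

       /* 0.0f and 2^32 as floats, indexed by the sign bit of the input. */
       static const float fixup[2] = { 0.0f, 4294967296.0f };

       static double uint_to_double(uint32_t x) {
           double d = (double)(int32_t)x;   /* fild of a 32-bit signed int */
           return d + fixup[x >> 31];       /* add 2^32 back if the sign bit was set */
       }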
      
       It turns out that this is substantially faster than the preceding sequence.
      Consider this testcase:
      
      unsigned a[2]={1,2};
      volatile double G;
      
       int main() {
           int i;
           for (i = 0; i < 100000000; ++i)
               G += a[i&1];
           return 0;
       }
      
       On zion (a 3GHz P4 Xeon), this patch speeds up the testcase from 2.140s
      to 0.94s.
      
       On apoc, an Athlon MP 2100+, this patch speeds up the testcase from 1.72s
      to 1.34s.
      
      Note that the program takes 2.5s/1.97s on zion/apoc with GCC 3.3 -O3
      -fomit-frame-pointer.
      
      llvm-svn: 17083
    • Unify handling of constant pool indexes with the other code paths, allowing · 112fd88a
      Chris Lattner authored
       us to use index registers for CPIs
      
      llvm-svn: 17082
    • Give the asmprinter the ability to print memrefs with a constant pool index, · af19d396
      Chris Lattner authored
       index register, and scale
      
      llvm-svn: 17081
    • fold: · 653d8663
      Chris Lattner authored
        %X = and Y, constantint
        %Z = setcc %X, 0
      
      instead of emitting:
      
              and %EAX, 3
              test %EAX, %EAX
              je .LBBfoo2_2   # UnifiedReturnBlock
      
      We now emit:
      
              test %EAX, 3
              je .LBBfoo2_2   # UnifiedReturnBlock
      
       This triggers 581 times on 176.gcc, for example.
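
       A typical C-level source for this pattern (just an illustration of
       the kind of code that produces the and + setcc pair) would be:

       int is_aligned(unsigned x) {
           return (x & 3) == 0;   /* the and + compare-with-zero selects to a single test */
       }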
      
      llvm-svn: 17080
  3. Oct 06, 2004
    • Remove debugging code, fix encoding problem. This fixes the problems · 93867e51
      Chris Lattner authored
      the JIT had last night.
      
      llvm-svn: 16766
    • Codegen signed mod by 2 or -2 more efficiently. Instead of generating: · 6835dedb
      Chris Lattner authored
      t:
              mov %EDX, DWORD PTR [%ESP + 4]
              mov %ECX, 2
              mov %EAX, %EDX
              sar %EDX, 31
              idiv %ECX
              mov %EAX, %EDX
              ret
      
      Generate:
      t:
              mov %ECX, DWORD PTR [%ESP + 4]
      ***     mov %EAX, %ECX
              cdq
              and %ECX, 1
              xor %ECX, %EDX
              sub %ECX, %EDX
      ***     mov %EAX, %ECX
              ret
      
      Note that the two marked moves are redundant, and should be eliminated by the
      register allocator, but aren't.
      
      Compare this to GCC, which generates:
      
      t:
              mov     %eax, DWORD PTR [%esp+4]
              mov     %edx, %eax
              shr     %edx, 31
              lea     %ecx, [%edx+%eax]
              and     %ecx, -2
              sub     %eax, %ecx
              ret
      
      or ICC 8.0, which generates:
      
      t:
              movl      4(%esp), %ecx                                 #3.5
              movl      $-2147483647, %eax                            #3.25
              imull     %ecx                                          #3.25
              movl      %ecx, %eax                                    #3.25
              sarl      $31, %eax                                     #3.25
              addl      %ecx, %edx                                    #3.25
              subl      %edx, %eax                                    #3.25
              addl      %eax, %eax                                    #3.25
              negl      %eax                                          #3.25
              subl      %eax, %ecx                                    #3.25
              movl      %ecx, %eax                                    #3.25
              ret                                                     #3.25
      
      We would be in great shape if not for the moves.
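
       In C terms, the branchless sequence we now emit is roughly (helper
       name illustrative; assumes arithmetic right shift of signed ints,
       as on x86):

       int srem2(int x) {
           int sign = x >> 31;          /* cdq: 0 or -1 */
           int bit  = x & 1;            /* and %ECX, 1 */
           return (bit ^ sign) - sign;  /* xor/sub: negate when x is negative */
       }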
      
      llvm-svn: 16763
    • Fix a scary bug with signed division by a power of two. We used to generate: · 7bd8f133
      Chris Lattner authored
      s:   ;; X / 4
              mov %EAX, DWORD PTR [%ESP + 4]
              mov %ECX, %EAX
              sar %ECX, 1
              shr %ECX, 30
              mov %EDX, %EAX
              add %EDX, %ECX
              sar %EAX, 2
              ret
      
      When we really meant:
      
      s:
              mov %EAX, DWORD PTR [%ESP + 4]
              mov %ECX, %EAX
              sar %ECX, 1
              shr %ECX, 30
              add %EAX, %ECX
              sar %EAX, 2
              ret
      
       Hey, this also reduces register pressure :)
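
       For reference, the corrected sequence is the usual round-toward-zero
       bias before the arithmetic shift; roughly, in C (name illustrative;
       assumes arithmetic right shift of signed ints):

       int sdiv4(int x) {
           int bias = (int)((unsigned)(x >> 1) >> 30);  /* 3 if x < 0, else 0 */
           return (x + bias) >> 2;                      /* sar $2 after the fixup */
       }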
      
      llvm-svn: 16761
    • Codegen signed divides by 2 and -2 more efficiently. In particular · 147edd2f
      Chris Lattner authored
      instead of:
      
      s:   ;; X / 2
              movl 4(%esp), %eax
              movl %eax, %ecx
              shrl $31, %ecx
              movl %eax, %edx
              addl %ecx, %edx
              sarl $1, %eax
              ret
      
      t:   ;; X / -2
              movl 4(%esp), %eax
              movl %eax, %ecx
              shrl $31, %ecx
              movl %eax, %edx
              addl %ecx, %edx
              sarl $1, %eax
              negl %eax
              ret
      
      Emit:
      
      s:
              movl 4(%esp), %eax
              cmpl $-2147483648, %eax
              sbbl $-1, %eax
              sarl $1, %eax
              ret
      
      t:
              movl 4(%esp), %eax
              cmpl $-2147483648, %eax
              sbbl $-1, %eax
              sarl $1, %eax
              negl %eax
              ret
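
       The cmp/sbb pair just adds 1 to negative inputs before the shift so
       the arithmetic shift rounds toward zero like idiv would; the -2 case
       simply negates afterwards. As a C sketch (name illustrative; assumes
       arithmetic right shift of signed ints):

       int sdiv2(int x) {
           return (x + (x < 0)) >> 1;   /* sbb adds the bias, sar $1 divides */
       }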
      
      llvm-svn: 16760
    • Add some new instructions. Fix the asm string for sbb32rr · e9bfa5a2
      Chris Lattner authored
      llvm-svn: 16759