  1. Apr 27, 2005
    • Implement Value* tracking for loads and stores in the selection DAG · 4a73c2cf
      Andrew Lenharth authored
      Implement Value* tracking for loads and stores in the selection DAG.  This enables one to use alias analysis in the backends.
      
      (TRUNC)Stores and (EXT|ZEXT|SEXT)Loads have an extra SDOperand which is a SrcValueSDNode containing the Value*.  Note that if the operation is introduced by the backend, it will still have the operand, but the Value* will be null.
      
      llvm-svn: 21599
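      The Value* slot boils down to a simple pattern; here is a toy C sketch (an editor's illustration, not the commit's code; MemNode, IRValue, and may_alias are made-up names) of how an optional source-value operand lets a backend answer alias queries conservatively:

        #include <stdio.h>

        struct IRValue { const char *name; };

        struct MemNode {
          const char     *op;        /* "load" or "store" */
          struct IRValue *src_value; /* IR value behind the access, or NULL */
        };

        /* If either node was introduced by the backend (NULL slot), we must
           conservatively assume aliasing; otherwise this toy check stands in
           for a real alias-analysis query on the two IR values. */
        static int may_alias(const struct MemNode *a, const struct MemNode *b) {
          if (!a->src_value || !b->src_value)
            return 1;
          return a->src_value == b->src_value;
        }

        int main(void) {
          struct IRValue g1 = { "g1" }, g2 = { "g2" };
          struct MemNode ld    = { "load",  &g1 };
          struct MemNode st    = { "store", &g2 };
          struct MemNode spill = { "store", NULL }; /* backend-introduced */
          printf("ld vs st:    may alias = %d\n", may_alias(&ld, &st));
          printf("ld vs spill: may alias = %d\n", may_alias(&ld, &spill));
          return 0;
        }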
  2. Apr 25, 2005
    • implement some more logical compares with constants, so that: · f806459d
      Chris Lattner authored
      int foo1(int x, int y) {
        int t1 = x >= 0;
        int t2 = y >= 0;
        return t1 & t2;
      }
      int foo2(int x, int y) {
        int t1 = x == -1;
        int t2 = y == -1;
        return t1 & t2;
      }
      
      produces:
      
      _foo1:
              or r2, r4, r3
              srwi r2, r2, 31
              xori r3, r2, 1
              blr
      _foo2:
              and r2, r4, r3
              addic r2, r2, 1
              li r2, 0
              addze r3, r2
              blr
      
      instead of:
      
      _foo1:
              srwi r2, r4, 31
              xori r2, r2, 1
              srwi r3, r3, 31
              xori r3, r3, 1
              and r3, r2, r3
              blr
      _foo2:
              addic r2, r4, 1
              li r2, 0
              addze r2, r2
              addic r3, r3, 1
              li r3, 0
              addze r3, r3
              and r3, r2, r3
              blr
      
      llvm-svn: 21547
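      As a quick cross-check of the two folds, a small C harness (an editor's sketch, not part of the commit; foo1_folded and foo2_folded are made-up names mirroring the generated code):

        #include <assert.h>
        #include <stdint.h>

        /* Original: two sign tests combined with &. */
        static int foo1(int32_t x, int32_t y) { return (x >= 0) & (y >= 0); }
        /* Folded: both signs are clear iff the sign bit of (x|y) is clear. */
        static int foo1_folded(int32_t x, int32_t y) {
          return (int)(((uint32_t)(x | y) >> 31) ^ 1);
        }

        /* Original: two equality tests against -1. */
        static int foo2(int32_t x, int32_t y) { return (x == -1) & (y == -1); }
        /* Folded: both are all-ones iff (x & y) is all-ones. */
        static int foo2_folded(int32_t x, int32_t y) { return (x & y) == -1; }

        int main(void) {
          int32_t v[] = { 0, 1, -1, 2, -2, 42, INT32_MIN, INT32_MAX };
          for (unsigned i = 0; i < 8; i++)
            for (unsigned j = 0; j < 8; j++) {
              assert(foo1(v[i], v[j]) == foo1_folded(v[i], v[j]));
              assert(foo2(v[i], v[j]) == foo2_folded(v[i], v[j]));
            }
          return 0;
        }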
    • Codegen x < 0 | y < 0 as (x|y) < 0. This allows us to compile this to: · d373ff64
      Chris Lattner authored
      _foo:
              or r2, r4, r3
              srwi r3, r2, 31
              blr
      
      instead of:
      
      _foo:
              srwi r2, r4, 31
              srwi r3, r3, 31
              or r3, r2, r3
              blr
      
      llvm-svn: 21544
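      The same harness style verifies this fold (editor's sketch; orig and folded are made-up names):

        #include <assert.h>
        #include <stdint.h>

        static int orig(int32_t x, int32_t y) { return (x < 0) | (y < 0); }
        /* Either sign bit set iff the sign bit of (x|y) is set. */
        static int folded(int32_t x, int32_t y) {
          return (int)((uint32_t)(x | y) >> 31);
        }

        int main(void) {
          int32_t v[] = { 0, 1, -1, INT32_MIN, INT32_MAX };
          for (unsigned i = 0; i < 5; i++)
            for (unsigned j = 0; j < 5; j++)
              assert(orig(v[i], v[j]) == folded(v[i], v[j]));
          return 0;
        }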
  3. Apr 21, 2005
    • Improve and elimination. On PPC, for: · f6302441
      Chris Lattner authored
      bool %test(int %X) {
              %Y = and int %X, 8
              %Z = setne int %Y, 0
              ret bool %Z
      }
      
      we now generate this:
      
              rlwinm r2, r3, 0, 28, 28
              srwi r3, r2, 3
      
      instead of this:
      
              rlwinm r2, r3, 0, 28, 28
              srwi r2, r2, 3
              rlwinm r3, r2, 0, 31, 31
      
      I'll leave it to Nate to get it down to one instruction. :)
      
      
      llvm-svn: 21391
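      The redundancy being eliminated here is easy to check exhaustively in C (editor's sketch, not part of the commit):

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
          for (uint32_t x = 0; x < 256; x++) {
            uint32_t t = (x & 8u) >> 3; /* already 0 or 1 ...           */
            assert((t & 1u) == t);      /* ... so the final mask (the
                                           trailing rlwinm) is a no-op  */
            assert(t == (uint32_t)((x & 8u) != 0));
          }
          return 0;
        }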
    • Fold (x & 8) != 0 and (x & 8) == 8 into (x & 8) >> 3. · ab1ed775
      Chris Lattner authored
      This turns this PPC code:
      
              rlwinm r2, r3, 0, 28, 28
              cmpwi cr7, r2, 8
              mfcr r2
              rlwinm r3, r2, 31, 31, 31
      
      into this:
      
              rlwinm r2, r3, 0, 28, 28
              srwi r2, r2, 3
              rlwinm r3, r2, 0, 31, 31
      
      Next up, nuking the extra and.
      
      llvm-svn: 21390
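      Both source forms reduce to the same shift; a quick C check (editor's sketch, not part of the commit):

        #include <assert.h>
        #include <stdint.h>

        int main(void) {
          for (uint32_t x = 0; x < 256; x++) {
            assert((uint32_t)((x & 8u) != 0) == ((x & 8u) >> 3));
            assert((uint32_t)((x & 8u) == 8) == ((x & 8u) >> 3));
          }
          return 0;
        }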
  4. Apr 12, 2005
    • promote extload i1 -> extload i8 · 0b73a6d8
      Chris Lattner authored
      llvm-svn: 21258
    • Remove some redundant checks, add a couple of new ones · af5b25f1
      Chris Lattner authored
      This allows us to compile this:
      
      int foo (unsigned long a, unsigned long long g) {
        return a >= g;
      }
      
      To:
      
      foo:
              movl 8(%esp), %eax
              cmpl %eax, 4(%esp)
              setae %al
              cmpl $0, 12(%esp)
              sete %cl
              andb %al, %cl
              movzbl %cl, %eax
              ret
      
      instead of:
      
      foo:
              movl 8(%esp), %eax
              cmpl %eax, 4(%esp)
              setae %al
              movzbw %al, %cx
              movl 12(%esp), %edx
              cmpl $0, %edx
              sete %al
              movzbw %al, %ax
              cmpl $0, %edx
              cmove %cx, %ax
              movzbl %al, %eax
              ret
      
      llvm-svn: 21244
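      The underlying identity, modeling the 32-bit unsigned long against the 64-bit unsigned long long (editor's sketch; orig and folded are made-up names):

        #include <assert.h>
        #include <stdint.h>

        static int orig(uint32_t a, uint64_t g) { return (uint64_t)a >= g; }
        /* a (zero-extended) can only reach g when g's high word is zero,
           and then only the low words need comparing. */
        static int folded(uint32_t a, uint64_t g) {
          return ((uint32_t)(g >> 32) == 0) & (a >= (uint32_t)g);
        }

        int main(void) {
          uint32_t as[] = { 0, 1, 0xFFFFFFFFu };
          uint64_t gs[] = { 0, 1, 0xFFFFFFFFull, 0x100000000ull, ~0ull };
          for (unsigned i = 0; i < 3; i++)
            for (unsigned j = 0; j < 5; j++)
              assert(orig(as[i], gs[j]) == folded(as[i], gs[j]));
          return 0;
        }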
    • Emit comparisons against the sign bit better. Codegen this: · aedcabe8
      Chris Lattner authored
      bool %test1(long %X) {
              %A = setlt long %X, 0
              ret bool %A
      }
      
      like this:
      
      test1:
              cmpl $0, 8(%esp)
              setl %al
              movzbl %al, %eax
              ret
      
      instead of:
      
      test1:
              movl 8(%esp), %ecx
              cmpl $0, %ecx
              setl %al
              movzbw %al, %ax
              cmpl $0, 4(%esp)
              setb %dl
              movzbw %dl, %dx
              cmpl $0, %ecx
              cmove %dx, %ax
              movzbl %al, %eax
              ret
      
      llvm-svn: 21243
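      In C terms, the fold only needs to look at the high word (editor's sketch; orig and folded are made-up names):

        #include <assert.h>
        #include <stdint.h>

        static int orig(int64_t x) { return x < 0; }
        /* The sign of a 64-bit value lives entirely in its high 32 bits. */
        static int folded(int64_t x) {
          return (int32_t)((uint64_t)x >> 32) < 0;
        }

        int main(void) {
          int64_t v[] = { 0, 1, -1, INT64_MIN, INT64_MAX,
                          0x80000000ll, -0x80000000ll };
          for (unsigned i = 0; i < 7; i++)
            assert(orig(v[i]) == folded(v[i]));
          return 0;
        }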
    • Emit long comparison against -1 better. Instead of this (x86): · 71ff44e4
      Chris Lattner authored
      test2:
              movl 8(%esp), %eax
              notl %eax
              movl 4(%esp), %ecx
              notl %ecx
              orl %eax, %ecx
              cmpl $0, %ecx
              sete %al
              movzbl %al, %eax
              ret
      
      or this (PPC):
      
      _test2:
              nor r2, r4, r4
              nor r3, r3, r3
              or r2, r2, r3
              cntlzw r2, r2
              srwi r3, r2, 5
              blr
      
      Emit this:
      
      test2:
              movl 8(%esp), %eax
              andl 4(%esp), %eax
              cmpl $-1, %eax
              sete %al
              movzbl %al, %eax
              ret
      
      or this:
      
      _test2:
      .LBB_test2_0:   ;
              and r2, r4, r3
              cmpwi cr0, r2, -1
              li r3, 1
              li r2, 0
              beq .LBB_test2_2        ;
      .LBB_test2_1:   ;
              or r3, r2, r2
      .LBB_test2_2:   ;
              blr
      
      It seems like the PPC isel could do better for the R32 == -1 case.
      
      llvm-svn: 21242
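      The identity behind the improved form (editor's sketch; orig and folded are made-up names):

        #include <assert.h>
        #include <stdint.h>

        static int orig(int64_t x) { return x == -1; }
        /* Both halves are all-ones iff their AND is all-ones. */
        static int folded(int64_t x) {
          uint32_t lo = (uint32_t)x, hi = (uint32_t)((uint64_t)x >> 32);
          return (lo & hi) == 0xFFFFFFFFu;
        }

        int main(void) {
          int64_t v[] = { 0, -1, 1, -2, 0xFFFFFFFFll,
                          (int64_t)0xFFFFFFFF00000000ull, INT64_MIN };
          for (unsigned i = 0; i < 7; i++)
            assert(orig(v[i]) == folded(v[i]));
          return 0;
        }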
    • canonicalize x <u 1 -> x == 0. On this testcase: · 87bd6988
      Chris Lattner authored
      unsigned long long g;
      unsigned long foo (unsigned long a) {
        return (a >= g) ? 1 : 0;
      }
      
      It changes the ppc code from:
      
      _foo:
      .LBB_foo_0:     ; entry
              mflr r11
              stw r11, 8(r1)
              bl "L00000$pb"
      "L00000$pb":
              mflr r2
              addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
              lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
              lwz r4, 0(r2)
              lwz r2, 4(r2)
              cmplw cr0, r3, r2
              li r2, 1
              li r3, 0
              bge .LBB_foo_2  ; entry
      .LBB_foo_1:     ; entry
              or r2, r3, r3
      .LBB_foo_2:     ; entry
              cmplwi cr0, r4, 1
              li r3, 1
              li r5, 0
              blt .LBB_foo_4  ; entry
      .LBB_foo_3:     ; entry
              or r3, r5, r5
      .LBB_foo_4:     ; entry
              cmpwi cr0, r4, 0
              beq .LBB_foo_6  ; entry
      .LBB_foo_5:     ; entry
              or r2, r3, r3
      .LBB_foo_6:     ; entry
              rlwinm r3, r2, 0, 31, 31
              lwz r11, 8(r1)
              mtlr r11
              blr
      
      
      to:
      
      _foo:
      .LBB_foo_0:     ; entry
              mflr r11
              stw r11, 8(r1)
              bl "L00000$pb"
      "L00000$pb":
              mflr r2
              addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
              lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
              lwz r4, 0(r2)
              lwz r2, 4(r2)
              cmplw cr0, r3, r2
              li r2, 1
              li r3, 0
              bge .LBB_foo_2  ; entry
      .LBB_foo_1:     ; entry
              or r2, r3, r3
      .LBB_foo_2:     ; entry
              cntlzw r3, r4
              srwi r3, r3, 5
              cmpwi cr0, r4, 0
              beq .LBB_foo_4  ; entry
      .LBB_foo_3:     ; entry
              or r2, r3, r3
      .LBB_foo_4:     ; entry
              rlwinm r3, r2, 0, 31, 31
              lwz r11, 8(r1)
              mtlr r11
              blr
      
      llvm-svn: 21241
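      Both the canonicalization and the branch-free lowering it unlocks check out in C (editor's sketch; cntlzw here is a portable stand-in for the PPC instruction, which returns 32 for input 0):

        #include <assert.h>
        #include <stdint.h>

        static unsigned cntlzw(uint32_t x) {
          unsigned n = 0;
          while (n < 32 && !(x & 0x80000000u)) { x <<= 1; n++; }
          return n;
        }

        int main(void) {
          uint32_t v[] = { 0, 1, 2, 0x80000000u, 0xFFFFFFFFu };
          for (unsigned i = 0; i < 5; i++) {
            /* the canonicalization itself */
            assert((v[i] < 1u) == (v[i] == 0));
            /* the cntlzw + srwi-by-5 lowering of x == 0 */
            assert((cntlzw(v[i]) >> 5) == (unsigned)(v[i] == 0));
          }
          return 0;
        }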
  5. Apr 11, 2005
    • Teach the dag mechanism that this: · 8ffd0049
      Chris Lattner authored
      long long test2(unsigned A, unsigned B) {
              return ((unsigned long long)A << 32) + B;
      }
      
      is equivalent to this:
      
      long long test1(unsigned A, unsigned B) {
              return ((unsigned long long)A << 32) | B;
      }
      
      Now they are both codegen'd to this on ppc:
      
      _test2:
              blr
      
      or this on x86:
      
      test2:
              movl 4(%esp), %edx
              movl 8(%esp), %eax
              ret
      
      llvm-svn: 21231
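      Why + and | coincide here (editor's sketch; with_add and with_or are made-up names):

        #include <assert.h>
        #include <stdint.h>

        static uint64_t with_add(uint32_t a, uint32_t b) {
          return ((uint64_t)a << 32) + b;
        }
        /* The low 32 bits of (a << 32) are zero and b has no higher bits,
           so the addition can never carry: + and | agree bit for bit. */
        static uint64_t with_or(uint32_t a, uint32_t b) {
          return ((uint64_t)a << 32) | b;
        }

        int main(void) {
          uint32_t v[] = { 0, 1, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu };
          for (unsigned i = 0; i < 5; i++)
            for (unsigned j = 0; j < 5; j++)
              assert(with_add(v[i], v[j]) == with_or(v[i], v[j]));
          return 0;
        }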
    • Fix expansion of shifts by exactly NVT bits on arches (like X86) that have masking shifts · edd19706
      Chris Lattner authored
      
      This fixes the miscompilation of this:
      
      long long test1(unsigned A, unsigned B) {
              return ((unsigned long long)A << 32) | B;
      }
      
      into this:
      
      test1:
              movl 4(%esp), %edx
              movl %edx, %eax
              orl 8(%esp), %eax
              ret
      
      allowing us to generate this instead:
      
      test1:
              movl 4(%esp), %edx
              movl 8(%esp), %eax
              ret
      
      llvm-svn: 21230
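      The pitfall in sketch form (editor's illustration; shl64_expanded is a made-up model of the legalizer's expansion with NVT = i32): x86's 32-bit shifts mask the count mod 32, so a 64-bit shift by exactly 32 must become a word move rather than reach the hardware shifter.

        #include <assert.h>
        #include <stdint.h>

        static uint64_t shl64_expanded(uint64_t x, unsigned amt) {
          uint32_t lo = (uint32_t)x, hi = (uint32_t)(x >> 32), nlo, nhi;
          if (amt == 0) {
            nlo = lo; nhi = hi;
          } else if (amt == 32) {   /* the exact-NVT-bits case: on x86,
                                       lo << 32 would really be lo << 0 */
            nlo = 0; nhi = lo;
          } else if (amt > 32) {
            nlo = 0; nhi = lo << (amt - 32);
          } else {
            nlo = lo << amt;
            nhi = (hi << amt) | (lo >> (32 - amt));
          }
          return ((uint64_t)nhi << 32) | nlo;
        }

        int main(void) {
          uint64_t v[] = { 1, 0xDEADBEEFull, 0x123456789ABCDEF0ull };
          for (unsigned i = 0; i < 3; i++)
            for (unsigned amt = 0; amt < 64; amt++)
              assert(shl64_expanded(v[i], amt) == (v[i] << amt));
          return 0;
        }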
    • Fix libcall code to not pass a NULL Chain to LowerCallTo · add0c63a
      Nate Begeman authored
      Fix libcall code to not crash or assert looking for an ADJCALLSTACKUP node
        when it is known that there is no ADJCALLSTACKDOWN to match.
      Expand i64 multiply when ISD::MULHU is legal for the target.
      
      llvm-svn: 21214
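      The i64-multiply expansion mentioned here is the schoolbook split; a C sketch (editor's illustration; mulhu models the ISD::MULHU node, mul64_expanded is a made-up name):

        #include <assert.h>
        #include <stdint.h>

        /* Models ISD::MULHU: the high 32 bits of a 32x32 unsigned multiply. */
        static uint32_t mulhu(uint32_t a, uint32_t b) {
          return (uint32_t)(((uint64_t)a * b) >> 32);
        }

        /* a*b mod 2^64 from 32-bit pieces: the ahi*bhi term is shifted out
           entirely, and only the low halves of the cross products can reach
           the high word (all additions intentionally wrap mod 2^32). */
        static uint64_t mul64_expanded(uint64_t a, uint64_t b) {
          uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
          uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);
          uint32_t lo  = alo * blo;
          uint32_t hi  = mulhu(alo, blo) + alo * bhi + ahi * blo;
          return ((uint64_t)hi << 32) | lo;
        }

        int main(void) {
          uint64_t v[] = { 0, 1, 0xFFFFFFFFull, 0x100000001ull,
                           0xDEADBEEFCAFEF00Dull };
          for (unsigned i = 0; i < 5; i++)
            for (unsigned j = 0; j < 5; j++)
              assert(mul64_expanded(v[i], v[j]) == v[i] * v[j]);
          return 0;
        }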
    • Don't bother sign/zext_inreg'ing the result of an and operation if we know the result does not change as a result of the extend · e2427c9a
      Chris Lattner authored
      
      This improves codegen for Alpha on this testcase:
      
      int %a(ushort* %i) {
              %tmp.1 = load ushort* %i
              %tmp.2 = cast ushort %tmp.1 to int
              %tmp.4 = and int %tmp.2, 1
              ret int %tmp.4
      }
      
      Generating:
      
      a:
              ldgp $29, 0($27)
              ldwu $0,0($16)
              and $0,1,$0
              ret $31,($26),1
      
      instead of:
      
      a:
              ldgp $29, 0($27)
              ldwu $0,0($16)
              and $0,1,$0
              addl $0,0,$0
              ret $31,($26),1
      
      btw, alpha really should switch to livein/outs for args :)
      
      llvm-svn: 21213
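      The general point: once an "and" has bounded the value, re-extending it in-register changes nothing. A C check (editor's sketch; sext_inreg16 is a made-up model of an in-register sign-extend from 16 bits):

        #include <assert.h>
        #include <stdint.h>

        static int32_t sext_inreg16(int32_t x) { return (int32_t)(int16_t)x; }

        int main(void) {
          for (uint32_t v = 0; v <= 0xFFFFu; v++) {
            int32_t t = (int32_t)(v & 1u); /* the and already bounds it */
            assert(sext_inreg16(t) == t);  /* so the extend is a no-op  */
          }
          return 0;
        }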