Commits · 87e645c5bd3086c4efab0562e196a13eac732557 · Roger Ferrer / llvm-epi-0.8

Jan 11, 2005

Implement the LOADABLE_MODULE option when building a shared library. This · 87e645c5

Reid Spencer authored Jan 11, 2005

passes the -module option on the libtool command line to ensure that the
shared library being built can be dlopened and dlsym can work on that
module. LOADABLE_MODULE should be set only in conjunction with the
SHARED_LIBRARY directive. It should generally be used for any module that
is intended to be the target of an LLVM -load option. Note that loadable
modules will not have the lib prefix but otherwise look like shared
libraries. This is per the libtool recommendations and prevents these
special shared libraries from being linked in via the -l option to the linker.

llvm-svn: 19454

87e645c5

shift X, 0 -> X · a86fa445
Chris Lattner authored Jan 11, 2005
```
llvm-svn: 19453
```
a86fa445
Fix a bug emitting branches that broke a lot of programs. · 37ed2855
Chris Lattner authored Jan 11, 2005
```
llvm-svn: 19452
```
37ed2855

Be more careful where we set ContainsFPCode. We were missing a set in the · e44e6d16

Chris Lattner authored Jan 11, 2005

int -> FP casting code.  Note that we don't have to set it for FP operations
that take FP values as operands: whatever produces the FP value will set the
flag.

llvm-svn: 19451

e44e6d16

Fix a major bug in setcc/cmov folding, where we accidentally · 8fea42bd
Chris Lattner authored Jan 11, 2005
```
inverted the sense of the comparison.

llvm-svn: 19450
```
8fea42bd

Take register pressure into account when we have to decide whether to · 0d1f82ac

Chris Lattner authored Jan 11, 2005

evaluate the LHS or the RHS of an operation first.  This causes good things
to happen.  For example, instead of compiling a loop to this:

.LBBstrength_result7_1: # loopentry
        movl 16(%esp), %edi
        movl (%edi), %edi             ;;; LOAD
        movl (%ecx), %ebx
        movl $2, (%eax,%ebx,4)
        movl (%edx), %ebx
        movl %esi, %ebp
        addl $21, %ebp
        addl $42, %esi
        cmpl $0, %edi                 ;;; USE
        cmovne %esi, %ebp
        cmpl %ebp, %ebx
        movl %ebp, %esi
        jg .LBBstrength_result7_1

We now compile it to this:

.LBBstrength_result7_1: # loopentry
        movl %edi, %ebx
        addl $42, %ebx
        addl $21, %edi
        movl (%ecx), %ebp              ;; LOAD
        cmpl $0, %ebp                  ;; USE
        cmovne %ebx, %edi
        movl (%edx), %ebx
        movl $2, (%eax,%ebx,4)
        movl (%esi), %ebx
        cmpl %edi, %ebx
        jg .LBBstrength_result7_1

Which reduces register pressure enough (in this case) to avoid spilling in the
loop.

As another example, consider the CodeGen/X86/regpressure.ll testcase.  We
used to generate this code for both cases:

regpressure1:
        subl $32, %esp
        movl %esi, 12(%esp)
        movl %edi, 8(%esp)
        movl %ebx, 4(%esp)
        movl %ebp, (%esp)
        movl 36(%esp), %ecx
        movl (%ecx), %eax
        movl 4(%ecx), %edx
        movl %edx, 24(%esp)
        movl 8(%ecx), %edx
        movl %edx, 16(%esp)
        movl 12(%ecx), %edx
        movl 16(%ecx), %esi
        movl 20(%ecx), %edi
        movl 24(%ecx), %ebx
        movl %ebx, 28(%esp)
        movl 28(%ecx), %ebx
        movl 32(%ecx), %ebp
        movl %ebp, 20(%esp)
        movl 36(%ecx), %ecx
        imull 24(%esp), %eax
        imull 16(%esp), %eax
        imull %edx, %eax
        imull %esi, %eax
        imull %edi, %eax
        imull 28(%esp), %eax
        imull %ebx, %eax
        imull 20(%esp), %eax
        imull %ecx, %eax
        movl (%esp), %ebp
        movl 4(%esp), %ebx
        movl 8(%esp), %edi
        movl 12(%esp), %esi
        addl $32, %esp
        ret

This code is basically trying to do all of the loads first, then execute all
of the multiplies.  Because we run out of registers, lots of spill code happens.
We now generate this code for both cases:

regpressure1:
        movl 4(%esp), %ecx
        movl (%ecx), %eax
        movl 4(%ecx), %edx
        imull %edx, %eax
        movl 8(%ecx), %edx
        imull %edx, %eax
        movl 12(%ecx), %edx
        imull %edx, %eax
        movl 16(%ecx), %edx
        imull %edx, %eax
        movl 20(%ecx), %edx
        imull %edx, %eax
        movl 24(%ecx), %edx
        imull %edx, %eax
        movl 28(%ecx), %edx
        imull %edx, %eax
        movl 32(%ecx), %edx
        imull %edx, %eax
        movl 36(%ecx), %ecx
        imull %ecx, %eax
        ret

which is much nicer (when we fold loads into the muls it will be even better).
The old instruction selector used to produce the good code for regpressure1
but not for regpressure2, as it depended on the order of operations in the
LLVM code.

llvm-svn: 19449

0d1f82ac
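
The operand-ordering idea described in the commit above can be illustrated with a small sketch. This is not the code from the commit: it uses a Sethi-Ullman-style register estimate as a stand-in heuristic, and `Expr`, `leaf`, `binop`, `regsNeeded`, and `evaluateRHSFirst` are hypothetical names introduced only for illustration. The point is that two multiply chains that differ only in how they are associated get the same register estimate once the more demanding operand is evaluated first.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical expression tree, not an LLVM type: a node is either a leaf
// (for example a load) or a binary operation with two operands.
struct Expr {
  std::unique_ptr<Expr> lhs, rhs;  // both null for a leaf
};

std::unique_ptr<Expr> leaf() { return std::make_unique<Expr>(); }

std::unique_ptr<Expr> binop(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
  auto e = std::make_unique<Expr>();
  e->lhs = std::move(l);
  e->rhs = std::move(r);
  return e;
}

// Sethi-Ullman-style estimate of how many registers are needed to evaluate
// the tree without spilling, assuming the hungrier operand goes first.
int regsNeeded(const Expr *e) {
  if (!e->lhs) return 1;  // leaf: one register holds the loaded value
  int l = regsNeeded(e->lhs.get());
  int r = regsNeeded(e->rhs.get());
  // With equal needs, one result must be kept live while the other operand
  // is computed, which costs one extra register.
  return l == r ? l + 1 : std::max(l, r);
}

// Decide the evaluation order: true means "evaluate the RHS before the LHS".
bool evaluateRHSFirst(const Expr *e) {
  return regsNeeded(e->rhs.get()) > regsNeeded(e->lhs.get());
}
```

Under this estimate a chain such as `((a*b)*c)*d` and its mirror image `a*(b*(c*d))` both need two registers; only the choice of which operand to visit first differs.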

The pattern isel is aggressively codegen'ing all of the loads in these · 788bdba1

Chris Lattner authored Jan 11, 2005

functions together at the start of the basic block, causing massive spillage.
The old isel codegened the loads wherever they happened to land, so it
generated good code for the first case, but bad code for the second.

We really want the pattern isel to generate (the same) good code for both.

llvm-svn: 19448

788bdba1

Print SelectionDAGs bottom up, include extra info in the node labels · 1308b488
Chris Lattner authored Jan 11, 2005
```
llvm-svn: 19447
```
1308b488
Add support for bottom-up graphs. · 39c5808f
Chris Lattner authored Jan 11, 2005
```
llvm-svn: 19446
```
39c5808f
Add a marker for the graph root. · b241b443
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19445
```
b241b443
Put the operation name in each node, put the function name on the graph. · 12be0272
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19444
```
12be0272
Split out SDNode::getOperationName into its own method. · 9e4c7612
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19443
```
9e4c7612
Add a helper method. · 7fa992e9
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19442
```
7fa992e9
Implement initial selectiondag printing support. This gets us a nice · 7f65075b
Chris Lattner authored Jan 10, 2005
```
graph with no labels! :)

llvm-svn: 19441
```
7f65075b
Add support for graph operations, and add a viewGraph method to SelectionDAG. · e32371ba
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19440
```
e32371ba
Add a helper method · da7c0504
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19439
```
da7c0504

Jan 10, 2005

Fold setcc instructions into selects. · 1d13a92a
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19438
```
1d13a92a
Add conditional moves for the parity flag. · 5b589ec0
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19437
```
5b589ec0
Lower to the correct functions. This fixes FreeBench/fourinarow · be02d430
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19436
```
be02d430
Implement 8-bit multiply for X86. · 750d38b5
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19435
```
750d38b5
Rework constant pool handling so that function constant pools are no longer · 5326f357
Chris Lattner authored Jan 10, 2005
```
leaked to the system.  Now they are destroyed when the JITMemoryManager is
destroyed.

llvm-svn: 19434
```
5326f357
Apply feedback from Chris. · 3e62e7c6
Jeff Cohen authored Jan 10, 2005
```
llvm-svn: 19432
```
3e62e7c6

Apply feedback from Chris: · 703f7db2

Jeff Cohen authored Jan 10, 2005

  1. Rename createLoaderPass to CreateProfileLoaderPass
  2. Opt shouldn't use the pass registered in CodeGen.

llvm-svn: 19431

703f7db2

Implement a couple more simplifications. This lets us codegen: · 41b76414

Chris Lattner authored Jan 10, 2005

int test2(int * P, int* Q, int A, int B) {
        return P+A == P;
}

into:

test2:
        movl 4(%esp), %eax
        movl 12(%esp), %eax
        shll $2, %eax
        cmpl $0, %eax
        sete %al
        movzbl %al, %eax
        ret

instead of:

test2:
        movl 4(%esp), %eax
        movl 12(%esp), %ecx
        leal (%eax,%ecx,4), %ecx
        cmpl %eax, %ecx
        sete %al
        movzbl %al, %eax
        ret

ICC is producing worse code:

test2:
        movl      4(%esp), %eax                                 #8.5
        movl      12(%esp), %edx                                #8.5
        lea       (%edx,%edx), %ecx                             #9.9
        addl      %ecx, %ecx                                    #9.9
        addl      %eax, %ecx                                    #9.9
        cmpl      %eax, %ecx                                    #9.16
        movl      $0, %eax                                      #9.16
        sete      %al                                           #9.16
        ret                                                     #9.16

as is GCC (looks like our old code):

test2:
        movl    4(%esp), %edx
        movl    12(%esp), %eax
        leal    (%edx,%eax,4), %ecx
        cmpl    %edx, %ecx
        sete    %al
        movzbl  %al, %eax
        ret

llvm-svn: 19430

41b76414
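
The new `test2` code above compares `A*4` against zero instead of forming `P+A` and comparing it with `P`. A minimal sketch of why the two forms agree, assuming 32-bit pointers modeled as wrapping unsigned integers and `sizeof(int) == 4`; the function names are hypothetical and this is not the code from the commit:

```cpp
#include <cassert>
#include <cstdint>

// Original form: compute P + A (scaled by 4) and compare it with P.
bool compareUnfolded(std::uint32_t p, std::uint32_t a) {
  std::uint32_t sum = static_cast<std::uint32_t>(p + (a << 2));
  return sum == p;
}

// Folded form: P cancels on both sides, leaving (A * 4) == 0.
// Arithmetic is modulo 2^32, so cancelling P is valid even when the
// addition wraps around.
bool compareFolded(std::uint32_t a) {
  return static_cast<std::uint32_t>(a << 2) == 0;
}
```

Both functions return the same answer for every `p` and `a`, including an `a` whose scaled value wraps to zero.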

Fix incorrect constant folds, fixing Stepanov after the SHR patch. · 00c231ba
Chris Lattner authored Jan 10, 2005
```
llvm-svn: 19429
```
00c231ba
Update System project in Visual Studio to reflect renamed files. · c783e07a
Jeff Cohen authored Jan 10, 2005
```
llvm-svn: 19428
```
c783e07a

Constant fold shifts, turning this loop: · 0966a75e

Chris Lattner authored Jan 10, 2005

.LBB_Z5test0PdS__3:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        movl $16000, %ecx
        sarl $3, %ecx
        cmpl %eax, %ecx
        fstpl 16(%esp)
        #FP_REG_KILL
        jg .LBB_Z5test0PdS__3   # no_exit.1

into:

.LBB_Z5test0PdS__3:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        cmpl $2000, %eax
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__3   # no_exit.1

llvm-svn: 19427

0966a75e
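
In the loop above, `movl $16000, %ecx` followed by `sarl $3, %ecx` always yields 2000, so the shift can be evaluated at compile time. A minimal sketch of such a constant folder on 32-bit values, assuming shift amounts in the range 0 to 31; `ShiftOp` and `foldShift` are hypothetical names, not LLVM APIs:

```cpp
#include <cassert>
#include <cstdint>

enum class ShiftOp { Shl, Shr, Sar };  // left, logical right, arithmetic right

// Folds "value <op> amount" when both operands are constants.
// Values are handled as 32-bit patterns; amount must be in [0, 31].
std::uint32_t foldShift(ShiftOp op, std::uint32_t value, unsigned amount) {
  switch (op) {
  case ShiftOp::Shl:
    return static_cast<std::uint32_t>(value << amount);
  case ShiftOp::Shr:
    return value >> amount;
  case ShiftOp::Sar: {
    std::uint32_t shifted = value >> amount;
    // Replicate the sign bit into the vacated high bits.
    if (value & 0x80000000u)
      shifted |= static_cast<std::uint32_t>(~(0xFFFFFFFFu >> amount));
    return shifted;
  }
  }
  return value;  // not reached for valid ShiftOp values
}
```

Here `foldShift(ShiftOp::Sar, 16000, 3)` gives 2000, the immediate that appears in the `cmpl $2000, %eax` of the improved loop.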

Rename Unix/*.cpp and Win32/*.cpp to have a *.inc suffix so that the silly · c892a0db

Reid Spencer authored Jan 09, 2005

gdb debugger doesn't get confused about which file it is reading (the one in
lib/System or the one in lib/System/{Win32,Unix})

llvm-svn: 19426

c892a0db

Jan 09, 2005

Add some folds for == and != comparisons. This allows us to · fde3a212

Chris Lattner authored Jan 09, 2005

codegen this loop in stepanov:

no_exit.i:              ; preds = %entry, %no_exit.i, %then.i, %_Z5checkd.exit
        %i.0.0 = phi int [ 0, %entry ], [ %i.0.0, %no_exit.i ], [ %inc.0, %_Z5checkd.exit ], [ %inc.012, %then.i ]              ; <int> [#uses=3]
        %indvar = phi uint [ %indvar.next, %no_exit.i ], [ 0, %entry ], [ 0, %then.i ], [ 0, %_Z5checkd.exit ]          ; <uint> [#uses=3]
        %result_addr.i.0 = phi double [ %tmp.4.i.i, %no_exit.i ], [ 0.000000e+00, %entry ], [ 0.000000e+00, %then.i ], [ 0.000000e+00, %_Z5checkd.exit ]          ; <double> [#uses=1]
        %first_addr.0.i.2.rec = cast uint %indvar to int                ; <int> [#uses=1]
        %first_addr.0.i.2 = getelementptr [2000 x double]* %data, int 0, uint %indvar           ; <double*> [#uses=1]
        %inc.i.rec = add int %first_addr.0.i.2.rec, 1           ; <int> [#uses=1]
        %inc.i = getelementptr [2000 x double]* %data, int 0, int %inc.i.rec            ; <double*> [#uses=1]
        %tmp.3.i.i = load double* %first_addr.0.i.2             ; <double> [#uses=1]
        %tmp.4.i.i = add double %result_addr.i.0, %tmp.3.i.i            ; <double> [#uses=2]
        %tmp.2.i = seteq double* %inc.i, getelementptr ([2000 x double]* %data, int 0, int 2000)                ; <bool> [#uses=1]
        %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
        br bool %tmp.2.i, label %_Z10accumulateIPddET0_T_S2_S1_.exit, label %no_exit.i

To this:

.LBB_Z4testIPddEvT_S1_T0__1:    # no_exit.i
        fldl data(,%eax,8)
        fldl 16(%esp)
        faddp %st(1)
        fstpl 16(%esp)
        incl %eax
        movl %eax, %ecx
        shll $3, %ecx
        cmpl $16000, %ecx
        #FP_REG_KILL
        jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i

instead of this:

.LBB_Z4testIPddEvT_S1_T0__1:    # no_exit.i
        fldl data(,%eax,8)
        fldl 16(%esp)
        faddp %st(1)
        fstpl 16(%esp)
        incl %eax
        leal data(,%eax,8), %ecx
        leal data+16000, %edx
        cmpl %edx, %ecx
        #FP_REG_KILL
        jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i

llvm-svn: 19425

fde3a212

Add last four createXxxPass functions · 292845d2
Jeff Cohen authored Jan 09, 2005
```
llvm-svn: 19424
```
292845d2
Fix VC++ compilation error · 7d1670da
Jeff Cohen authored Jan 09, 2005
```
llvm-svn: 19423
```
7d1670da
Print the DAG out more like a DAG in nested format. · e6f7882c
Chris Lattner authored Jan 09, 2005
```
llvm-svn: 19422
```
e6f7882c
Print out nodes sorted by their address to make it easier to find them in a list. · 1270acc1
Chris Lattner authored Jan 09, 2005
```
llvm-svn: 19421
```
1270acc1

Codegen (Reg|imm)+&GV as an LEA, because we cannot put it into the immediate field · cf8fd0c0

Chris Lattner authored Jan 09, 2005

of an ADDri (due to current restrictions on MachineOperand :( ).  This allows
us to generate:

        leal Data+16000, %edx

instead of:

        movl $Data, %edx
        addl $16000, %edx

llvm-svn: 19420

cf8fd0c0

Add a simple transformation. This allows us to compile one of the inner · 3d5d5022

Chris Lattner authored Jan 09, 2005

loops in stepanov to this:

.LBB_Z5test0PdS__2:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        cmpl $2000, %eax
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__2

instead of this:

.LBB_Z5test0PdS__2:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        movl $data, %ecx
        movl %ecx, %edx
        addl $16000, %edx
        subl %ecx, %edx
        movl %edx, %ecx
        sarl $2, %ecx
        shrl $29, %ecx
        addl %ecx, %edx
        sarl $3, %edx
        cmpl %edx, %eax
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__2

The old instruction selector produced:

.LBB_Z5test0PdS__2:     # no_exit.1
        fldl 24(%esp)
        faddl data(,%eax,8)
        fstl 24(%esp)
        movl %eax, %ecx
        incl %ecx
        incl %eax
        leal data+16000, %edx
        movl $data, %edi
        subl %edi, %edx
        movl %edx, %edi
        sarl $2, %edi
        shrl $29, %edi
        addl %edi, %edx
        sarl $3, %edx
        cmpl %edx, %ecx
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__2   # no_exit.1

Which is even worse!

llvm-svn: 19419

3d5d5022
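
The `sarl $2` / `shrl $29` / `addl` / `sarl $3` sequence in the unoptimized listings above reads like the usual idiom for signed division by 8 with rounding toward zero, applied to `(data+16000) - data`. That reading is an interpretation of the assembly; the commit message does not name the transformation. A minimal sketch of the idiom on 32-bit patterns, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right on a 32-bit pattern; amount must be in [0, 31].
std::uint32_t sar(std::uint32_t bits, unsigned amount) {
  std::uint32_t shifted = bits >> amount;
  if (bits & 0x80000000u)  // negative: fill the vacated bits with ones
    shifted |= static_cast<std::uint32_t>(~(0xFFFFFFFFu >> amount));
  return shifted;
}

// Signed x / 8, rounding toward zero, returned as a 32-bit pattern.
std::uint32_t divideBy8(std::int32_t x) {
  std::uint32_t bits = static_cast<std::uint32_t>(x);
  // sarl $2 then shrl $29: the top three bits are copies of the sign bit,
  // so the bias is 7 for negative x and 0 otherwise.
  std::uint32_t bias = sar(bits, 2) >> 29;
  // addl then sarl $3: adding the bias makes the shift round toward zero.
  return sar(static_cast<std::uint32_t>(bits + bias), 3);
}
```

For the constant in the listing, `divideBy8(16000)` is 2000, which matches the `cmpl $2000, %eax` in the improved loop.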

Fix copy and pasto's for FP -> Int. This fixes fldry · 66d34302
Chris Lattner authored Jan 09, 2005
```
llvm-svn: 19418
```
66d34302
Fix a bug legalizing call instructions (make sure to remember all result · 9242c504
Chris Lattner authored Jan 09, 2005
```
values), and eliminate some switch statements.

llvm-svn: 19417
```
9242c504

Fix a minor bug legalizing dynamic_stackalloc. This allows us to compile · 02f5ce20

Chris Lattner authored Jan 09, 2005

std::__pad<wchar_t, std::char_traits<wchar_t> >::_S_pad(std::ios_base&, wchar_t, wchar_t*, wchar_t const*, int, int, bool)

from libstdc++

llvm-svn: 19416

02f5ce20

Teach legalize to deal with DYNAMIC_STACKALLOC (aka a dynamic llvm alloca) · ec26b48d
Chris Lattner authored Jan 09, 2005
```
llvm-svn: 19415
```
ec26b48d
Initial implementation of FP->INT and INT->FP casts · 282781c7
Chris Lattner authored Jan 09, 2005
```
Also, fix zero_extend from bool to i8, which fixes Shootout/objinst.

llvm-svn: 19414
```
282781c7