Commits · b7ebe65c561630310d580d1df7692fddff10b9c8 · Lorenzo Albano / LLVM bpEVL

Aug 12, 2005
- Change break critical edges to not remove, then insert, PHI node entries. · b7ebe65c
  Chris Lattner authored Aug 12, 2005
```
Instead, just update the BB in-place.  This is both faster, and it prevents
split-critical-edges from shuffling the PHI argument list unnecessarily.

llvm-svn: 22765
```
  b7ebe65c
- match gcc's use of tabs, makes diffs easier · 8c6701be
  Andrew Lenharth authored Aug 12, 2005
```
llvm-svn: 22764
```
  8c6701be
- .section cleanup, patch from Nicholas Riley · ca94102d
  Andrew Lenharth authored Aug 12, 2005
```
llvm-svn: 22763
```
  ca94102d
Aug 11, 2005

1. Added the function isOpcWithIntImmediate to simplify testing of operand with · a50f770a

Jim Laskey authored Aug 11, 2005

specified opcode and an integer constant right operand.

2. Modified ISD::SHL, ISD::SRL, ISD::SRA to use rlwinm when applied after a mask.

llvm-svn: 22761

a50f770a
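
To make item 2 of a50f770a concrete, the following is a minimal C sketch of what `rlwinm` computes and why a shift applied after a mask can fold into a single instruction. The helper names (`ppc_mask`, `rlwinm`, `mask_then_shl`) are illustrative and do not come from the patch; the semantics assumed are the usual PowerPC ones (rotate left, then AND with a mask running from bit MB to bit ME, where bit 0 is the most significant bit).

```c
#include <stdint.h>

/* Mask with ones from bit mb through bit me (PowerPC numbering: bit 0 is
   the most significant bit). Only the non-wrapping case mb <= me is
   handled in this sketch. */
static uint32_t ppc_mask(unsigned mb, unsigned me) {
    uint32_t from_mb = 0xFFFFFFFFu >> mb;        /* ones in bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31 - me); /* ones in bits 0..me  */
    return from_mb & to_me;
}

/* Rotate left by sh (0..31), then AND with the mask. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me) {
    uint32_t rot = sh ? (rs << sh) | (rs >> (32 - sh)) : rs;
    return rot & ppc_mask(mb, me);
}

/* A mask followed by a left shift, written as two operations. */
static uint32_t mask_then_shl(uint32_t x) {
    return (x & 0xFFu) << 4;
}

/* The same value as one rlwinm with SH=4, MB=20, ME=27. */
static uint32_t mask_then_shl_rlwinm(uint32_t x) {
    return rlwinm(x, 4, 20, 27);
}
```

The bits rotated in from the top land in the low four positions, which the mask clears, so the rotate behaves like a shift here.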

Tidied up the use of dyn_cast<ConstantSDNode> by using isIntImmediate more. · d418d752
Chris Lattner authored Aug 11, 2005
```
Patch by Jim Laskey.

llvm-svn: 22760
```
d418d752

Use a more efficient method of creating integer and float virtual registers · c5e1312b

Chris Lattner authored Aug 11, 2005

(avoids an extra level of indirection in MakeReg).

  defined MakeIntReg using RegMap->createVirtualRegister(PPC32::GPRCRegisterClass)
  defined MakeFPReg using RegMap->createVirtualRegister(PPC32::FPRCRegisterClass)

  s/MakeReg(MVT::i32)/MakeIntReg/
  s/MakeReg(MVT::f64)/MakeFPReg/

Patch by Jim Laskey!

llvm-svn: 22759

c5e1312b

Add a select_cc optimization for recognizing abs(int). This speeds up an · 5c7656fd
Nate Begeman authored Aug 11, 2005
```
integer MPEG encoding loop by a factor of two.

llvm-svn: 22758
```
5c7656fd
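
The message for 5c7656fd does not show the pattern being matched. As an assumption, the source shape is a compare-and-select such as `x < 0 ? -x : x`, which has a well-known branch-free equivalent. The C sketch below shows both forms; the function names are hypothetical and the instruction sequence the backend actually emits may differ.

```c
#include <stdint.h>

/* abs written as a compare followed by a select. */
static int32_t abs_select(int32_t x) {
    return x < 0 ? -x : x;
}

/* Branch-free form. s is 0 when x >= 0 and -1 (all ones) when x < 0.
   This assumes >> on a negative int32_t is an arithmetic shift, which is
   implementation-defined in C but true on common compilers. Like the
   select form, it overflows for INT32_MIN. */
static int32_t abs_branchless(int32_t x) {
    int32_t s = x >> 31;
    return (x ^ s) - s;
}
```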

Some SELECT_CC cleanups: · 180b0889

Nate Begeman authored Aug 11, 2005

1. move assertions for node creation to getNode()
2. legalize the values returned in ExpandOp immediately
3. Move select_cc optimizations from SELECT's getNode() to SELECT_CC's,
   allowing them to be cleaned up significantly.

This paves the way to pick up additional optimizations on SELECT_CC, such
as sum-of-absolute-differences.

llvm-svn: 22757

180b0889

Aug 10, 2005

Make SELECT illegal on PPC32, switch to using SELECT_CC, which more closely · 5646b181

Nate Begeman authored Aug 10, 2005

reflects what the hardware is capable of.  This significantly simplifies
the CC handling logic throughout the ISel.

llvm-svn: 22756

5646b181

Add new node, SELECT_CC. This node is for targets that don't natively · e5b86d74
Nate Begeman authored Aug 10, 2005
```
implement SELECT.

llvm-svn: 22755
```
e5b86d74
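
The message for e5b86d74 gives no operand list. As a rough model, assuming the node carries the two compared values, the two candidate results, and a condition code, the difference from SELECT can be sketched in C as follows. The enum and function names are hypothetical and are not LLVM's.

```c
#include <stdint.h>

/* Hypothetical condition codes for this sketch only. */
typedef enum { CC_EQ, CC_NE, CC_LT, CC_LE, CC_GT, CC_GE } cond_code;

/* Evaluate a signed comparison under the given condition code. */
static int compare(int32_t lhs, int32_t rhs, cond_code cc) {
    switch (cc) {
    case CC_EQ: return lhs == rhs;
    case CC_NE: return lhs != rhs;
    case CC_LT: return lhs <  rhs;
    case CC_LE: return lhs <= rhs;
    case CC_GT: return lhs >  rhs;
    case CC_GE: return lhs >= rhs;
    }
    return 0;
}

/* SELECT: picks a value from a boolean that was computed elsewhere. */
static int32_t select_op(int cond, int32_t tval, int32_t fval) {
    return cond ? tval : fval;
}

/* SELECT_CC: the comparison is part of the operation itself, so no
   separate boolean value has to be materialized. */
static int32_t select_cc(int32_t lhs, int32_t rhs,
                         int32_t tval, int32_t fval, cond_code cc) {
    return compare(lhs, rhs, cc) ? tval : fval;
}
```

Folding the comparison into the node suits hardware that branches on condition registers rather than on a boolean held in a general register.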

Changes for PPC32ISelPattern.cpp · 3428b956

Chris Lattner authored Aug 10, 2005

1. Clean up how SelectIntImmediateExpr handles use counts.
2. "Subtract from" was not clearing hi 16 bits.

Patch by Jim Laskey

llvm-svn: 22754

3428b956

Fix an oversight that may be causing PR617. · 21c0fd9e
Chris Lattner authored Aug 10, 2005
```
llvm-svn: 22753
```
21c0fd9e
remove some trickiness that broke yacr2 and some other programs last night · 62df7989
Chris Lattner authored Aug 10, 2005
```
llvm-svn: 22751
```
62df7989
Changed the XOR case to use the isOprNot predicate. · aeedcc7f
Chris Lattner authored Aug 10, 2005
```
Patch by Jim Laskey!

llvm-svn: 22750
```
aeedcc7f

1. Refactored handling of integer immediate values for add, or, xor and sub. · 67d07537

Chris Lattner authored Aug 10, 2005

   New routine: ISel::SelectIntImmediateExpr
2. Now checking use counts of large constants.  If the use count is > 2 then fall
   through so that the constant gets loaded into a register.

Source:

int %test1(int %a) {
entry:
       %tmp.1 = add int %a,      123456789      ; <int> [#uses=1]
       %tmp.2 = or  int %tmp.1,  123456789      ; <int> [#uses=1]
       %tmp.3 = xor int %tmp.2,  123456789      ; <int> [#uses=1]
       %tmp.4 = sub int %tmp.3, -123456789      ; <int> [#uses=1]
       ret int %tmp.4
}

Did Emit:

       .machine ppc970


       .text
       .align  2
       .globl  _test1
_test1:
.LBB_test1_0:   ; entry
       addi r2, r3, -13035
       addis r2, r2, 1884
       ori r2, r2, 52501
       oris r2, r2, 1883
       xori r2, r2, 52501
       xoris r2, r2, 1883
       addi r2, r2, 52501
       addis r3, r2, 1883
       blr


Now Emits:

       .machine ppc970


       .text
       .align  2
       .globl  _test1
_test1:
.LBB_test1_0:   ; entry
       lis r2, 1883
       ori r2, r2, 52501
       add r3, r3, r2
       or r3, r3, r2
       xor r3, r3, r2
       add r3, r3, r2
       blr

Patch by Jim Laskey!

llvm-svn: 22749

67d07537
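
The immediates in the 67d07537 listings (1883/52501 and 1884/-13035) are the two ways of splitting 123456789 into 16-bit halves. The C sketch below reproduces that arithmetic; the function names are hypothetical, and it only models the split, not the instruction selection code.

```c
#include <stdint.h>

/* Low half for ori/xori: used as an unsigned 16-bit field. */
static uint32_t logical_lo(uint32_t imm) { return imm & 0xFFFFu; }

/* High half for oris/xoris. */
static uint32_t logical_hi(uint32_t imm) { return imm >> 16; }

/* Low half for addi: the instruction sign-extends its 16-bit operand,
   so values of 0x8000 and above act as negative numbers. */
static int32_t arith_lo(uint32_t imm) {
    int32_t lo = (int32_t)(imm & 0xFFFFu);
    return lo >= 0x8000 ? lo - 0x10000 : lo;
}

/* High half for addis: one larger than the plain high half whenever the
   low half is negative, to compensate for the sign extension. Returned
   as a raw 16-bit field. */
static uint32_t arith_hi(uint32_t imm) {
    return ((imm - (uint32_t)arith_lo(imm)) >> 16) & 0xFFFFu;
}
```

For 123456789 (0x075BCD15) the logical split is 1883 and 52501, while the arithmetic split is 1884 and -13035, matching the `addi`/`addis` and `ori`/`oris` operands in the first listing.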

sorry!! this is temporary; for some reason the nasty constmul code seems to · 1c2f9fdf

Duraid Madina authored Aug 10, 2005

be an infinite loop when using g++-4.0.1*, this kills the ia64 nightly
tester. A proper fix shall be forthcoming!!! thanks for not killing me. :)

llvm-svn: 22748

1c2f9fdf

Fix a bug compiling: select (i32 < i32), f32, f32 · 5f56d71c
Chris Lattner authored Aug 10, 2005
```
llvm-svn: 22747
```
5f56d71c

Make loop-simplify produce better loops by turning PHI nodes like X = phi [X, Y] · f83ce5fa

Chris Lattner authored Aug 10, 2005

into just Y.  This often occurs when it separates loops that have collapsed loop
headers.  This implements LoopSimplify/phi-node-simplify.ll

llvm-svn: 22746

f83ce5fa
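
The rule in f83ce5fa can be sketched in C as below, with values modelled as plain integer ids. This is a simplified model under that assumption, not the pass's code: `simplify_phi` is a hypothetical name, and the sketch ignores any further conditions the real transformation checks.

```c
#include <stddef.h>

/* If every incoming value of the PHI is either the PHI itself or one
   single other value, return that other value; otherwise return -1.
   Ids are assumed to be non-negative. */
static int simplify_phi(int phi_id, const int *incoming, size_t n) {
    int unique = -1;
    for (size_t i = 0; i < n; ++i) {
        if (incoming[i] == phi_id)
            continue;                 /* self-reference adds nothing */
        if (unique != -1 && incoming[i] != unique)
            return -1;                /* two distinct inputs: keep PHI */
        unique = incoming[i];
    }
    return unique;
}
```

With X as id 0 and Y as id 7, `X = phi [X, Y]` yields 7, so uses of X can be rewritten to Y.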

Allow indvar simplify to canonicalize ANY affine IV, not just affine IVs with · 677d8578
Chris Lattner authored Aug 10, 2005
```
constant stride.  This implements Transforms/IndVarsSimplify/variable-stride-ivs.ll

llvm-svn: 22744
```
677d8578
Fix an obvious oops · 35c0e2ee
Chris Lattner authored Aug 10, 2005
```
llvm-svn: 22742
```
35c0e2ee

Teach LSR to strength reduce IVs that have a loop-invariant but non-constant stride. · edff91a4

Chris Lattner authored Aug 10, 2005

For code like this:

void foo(float *a, float *b, int n, int stride_a, int stride_b) {
  int i;
  for (i=0; i<n; i++)
      a[i*stride_a] = b[i*stride_b];
}

we now emit:

.LBB_foo2_2:    ; no_exit
        lfs f0, 0(r4)
        stfs f0, 0(r3)
        addi r7, r7, 1
        add r4, r2, r4
        add r3, r6, r3
        cmpw cr0, r7, r5
        blt .LBB_foo2_2 ; no_exit

instead of:

.LBB_foo_2:     ; no_exit
        mullw r8, r2, r7     ;; multiply!
        slwi r8, r8, 2
        lfsx f0, r4, r8
        mullw r8, r2, r6     ;; multiply!
        slwi r8, r8, 2
        stfsx f0, r3, r8
        addi r2, r2, 1
        cmpw cr0, r2, r5
        blt .LBB_foo_2  ; no_exit

loops with variable strides occur pretty often.  For example, in SPECFP2K
there are 317 variable strides in 177.mesa, 3 in 179.art, 14 in 188.ammp,
56 in 168.wupwise, 36 in 172.mgrid.

Now we can allow indvars to turn functions written like this:

void foo2(float *a, float *b, int n, int stride_a, int stride_b) {
  int i, ai = 0, bi = 0;
  for (i=0; i<n; i++)
    {
      a[ai] = b[bi];
      ai += stride_a;
      bi += stride_b;
    }
}

into code like the above for better analysis.  With this patch, they generate
identical code.

llvm-svn: 22740

edff91a4
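
As a source-level picture of what edff91a4 does to the loop, the C sketch below places the original `foo` next to a hand-written pointer-bumping version that mirrors the new assembly (one `add` per pointer, no `mullw`). `foo_reduced` and `strided_copies_agree` are hypothetical names added for illustration; the compiler performs this rewrite on its IR, not on C source.

```c
#include <string.h>

/* Original form from the commit message: the index is multiplied by the
   stride on every trip. */
static void foo(float *a, float *b, int n, int stride_a, int stride_b) {
    int i;
    for (i = 0; i < n; i++)
        a[i * stride_a] = b[i * stride_b];
}

/* Hand-written analogue of the reduced loop: each pointer is bumped by
   its loop-invariant stride, so no multiply remains in the body. The
   pointers are advanced once more after the last copy, so the buffers
   must have room for that final step. */
static void foo_reduced(float *a, float *b, int n, int stride_a, int stride_b) {
    int i;
    for (i = 0; i < n; i++) {
        *a = *b;
        a += stride_a;
        b += stride_b;
    }
}

/* Returns 1 if both forms leave the destination in the same state. */
static int strided_copies_agree(void) {
    float src[32], d1[32], d2[32];
    int i;
    for (i = 0; i < 32; i++) {
        src[i] = (float)i;
        d1[i] = d2[i] = -1.0f;
    }
    foo(d1, src, 5, 3, 2);         /* writes d1[0,3,6,9,12] */
    foo_reduced(d2, src, 5, 3, 2); /* same elements via pointer bumps */
    return memcmp(d1, d2, sizeof d1) == 0;
}
```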

Fix Regression/Transforms/LoopStrengthReduce/phi_node_update_multiple_preds.ll · dde7dc52
Chris Lattner authored Aug 10, 2005
```
by being more careful about updating PHI nodes

llvm-svn: 22739
```
dde7dc52

Fix some 80 column violations. · c6c4d99a

Chris Lattner authored Aug 09, 2005

Once we compute the evolution for a GEP, tell SE about it.  This lets users
of the GEP see it even when they are not direct users.  This allows us to compile
this testcase:

void fbSolidFillmmx(int w, unsigned char *d) {
    while (w >= 64) {
        *(unsigned long long *) (d +  0) = 0;
        *(unsigned long long *) (d +  8) = 0;
        *(unsigned long long *) (d + 16) = 0;
        *(unsigned long long *) (d + 24) = 0;
        *(unsigned long long *) (d + 32) = 0;
        *(unsigned long long *) (d + 40) = 0;
        *(unsigned long long *) (d + 48) = 0;
        *(unsigned long long *) (d + 56) = 0;
        w -= 64;
        d += 64;
    }
}

into:

.LBB_fbSolidFillmmx_2:  ; no_exit
        li r2, 0
        stw r2, 0(r4)
        stw r2, 4(r4)
        stw r2, 8(r4)
        stw r2, 12(r4)
        stw r2, 16(r4)
        stw r2, 20(r4)
        stw r2, 24(r4)
        stw r2, 28(r4)
        stw r2, 32(r4)
        stw r2, 36(r4)
        stw r2, 40(r4)
        stw r2, 44(r4)
        stw r2, 48(r4)
        stw r2, 52(r4)
        stw r2, 56(r4)
        stw r2, 60(r4)
        addi r4, r4, 64
        addi r3, r3, -64
        cmpwi cr0, r3, 63
        bgt .LBB_fbSolidFillmmx_2       ; no_exit

instead of:

.LBB_fbSolidFillmmx_2:  ; no_exit
        li r11, 0
        stw r11, 0(r4)
        stw r11, 4(r4)
        stwx r11, r10, r4
        add r12, r10, r4
        stw r11, 4(r12)
        stwx r11, r9, r4
        add r12, r9, r4
        stw r11, 4(r12)
        stwx r11, r8, r4
        add r12, r8, r4
        stw r11, 4(r12)
        stwx r11, r7, r4
        add r12, r7, r4
        stw r11, 4(r12)
        stwx r11, r6, r4
        add r12, r6, r4
        stw r11, 4(r12)
        stwx r11, r5, r4
        add r12, r5, r4
        stw r11, 4(r12)
        stwx r11, r2, r4
        add r12, r2, r4
        stw r11, 4(r12)
        addi r4, r4, 64
        addi r3, r3, -64
        cmpwi cr0, r3, 63
        bgt .LBB_fbSolidFillmmx_2       ; no_exit

llvm-svn: 22737

c6c4d99a

implement two helper methods · b310ac4a
Chris Lattner authored Aug 09, 2005
```
llvm-svn: 22736
```
b310ac4a
Fix spelling, fix some broken canonicalizations by my last patch · 679f5b0b
Chris Lattner authored Aug 09, 2005
```
llvm-svn: 22734
```
679f5b0b
add an optimization note · 54ee86ac
Chris Lattner authored Aug 09, 2005
```
llvm-svn: 22732
```
54ee86ac

Aug 09, 2005

add cc nodes to the AllNodes list so they show up in Graphviz output · 14e060f7
Chris Lattner authored Aug 09, 2005
```
llvm-svn: 22731
```
14e060f7
Update the targets to the new SETCC/CondCodeSDNode interfaces. · 6ec7745e
Chris Lattner authored Aug 09, 2005
```
llvm-svn: 22729
```
6ec7745e

Eliminate the SetCCSDNode in favor of a CondCodeSDNode class. This pulls the · d47675ed

Chris Lattner authored Aug 09, 2005

CC out of the SetCC operation, making SETCC a standard ternary operation and
CC's a standard DAG leaf.  This will make it possible for other nodes to use
CC's as operands in the future...

llvm-svn: 22728

d47675ed

Minor cleanup patch, no functionality changes. Written by Jim Laskey. · 2035c4f7
Chris Lattner authored Aug 09, 2005
```
llvm-svn: 22727
```
2035c4f7
Fix CodeGen/Generic/div-neg-power-2.ll, a regression from last night. · 4c62c647
Chris Lattner authored Aug 09, 2005
```
llvm-svn: 22726
```
4c62c647
SCEVAddExpr::get() of an empty list is invalid. · 02742710
Chris Lattner authored Aug 09, 2005
```
llvm-svn: 22724
```
02742710

Implement: LoopStrengthReduce/share_ivs.ll · a091ff17

Chris Lattner authored Aug 09, 2005

Two changes:
  * Only insert one PHI node for each stride.  Other values are live-in
    values.  This cannot introduce higher register pressure than the
    previous approach, and can take advantage of reg+reg addressing modes.
  * Factor common base values out of uses before moving values from the
    base to the immediate fields.  This improves codegen by starting the
    stride-specific PHI node out at a common place for each IV use.

As an example, we used to generate this for a loop in swim:

.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
        lfd f0, 0(r8)
        stfd f0, 0(r3)
        lfd f0, 0(r6)
        stfd f0, 0(r7)
        lfd f0, 0(r2)
        stfd f0, 0(r5)
        addi r9, r9, 1
        addi r2, r2, 8
        addi r5, r5, 8
        addi r6, r6, 8
        addi r7, r7, 8
        addi r8, r8, 8
        addi r3, r3, 8
        cmpw cr0, r9, r4
        bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1

now we emit:

.LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_2:        ; no_exit.7.i
        lfdx f0, r8, r2
        stfdx f0, r9, r2
        lfdx f0, r5, r2
        stfdx f0, r7, r2
        lfdx f0, r3, r2
        stfdx f0, r6, r2
        addi r10, r10, 1
        addi r2, r2, 8
        cmpw cr0, r10, r4
        bgt .LBB_main_no_exit_2E_6_2E_i_no_exit_2E_7_2E_i_1

As another more dramatic example, we used to emit this:

.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
        lfd f0, 8(r21)
        lfd f4, 8(r3)
        lfd f5, 8(r27)
        lfd f6, 8(r22)
        lfd f7, 8(r5)
        lfd f8, 8(r6)
        lfd f9, 8(r30)
        lfd f10, 8(r11)
        lfd f11, 8(r12)
        fsub f10, f10, f11
        fadd f5, f4, f5
        fmul f5, f5, f1
        fadd f6, f6, f7
        fadd f6, f6, f8
        fadd f6, f6, f9
        fmadd f0, f5, f6, f0
        fnmsub f0, f10, f2, f0
        stfd f0, 8(r4)
        lfd f0, 8(r25)
        lfd f5, 8(r26)
        lfd f6, 8(r23)
        lfd f9, 8(r28)
        lfd f10, 8(r10)
        lfd f12, 8(r9)
        lfd f13, 8(r29)
        fsub f11, f13, f11
        fadd f4, f4, f5
        fmul f4, f4, f1
        fadd f5, f6, f9
        fadd f5, f5, f10
        fadd f5, f5, f12
        fnmsub f0, f4, f5, f0
        fnmsub f0, f11, f3, f0
        stfd f0, 8(r24)
        lfd f0, 8(r8)
        fsub f4, f7, f8
        fsub f5, f12, f10
        fnmsub f0, f5, f2, f0
        fnmsub f0, f4, f3, f0
        stfd f0, 8(r2)
        addi r20, r20, 1
        addi r2, r2, 8
        addi r8, r8, 8
        addi r10, r10, 8
        addi r12, r12, 8
        addi r6, r6, 8
        addi r29, r29, 8
        addi r28, r28, 8
        addi r26, r26, 8
        addi r25, r25, 8
        addi r24, r24, 8
        addi r5, r5, 8
        addi r23, r23, 8
        addi r22, r22, 8
        addi r3, r3, 8
        addi r9, r9, 8
        addi r11, r11, 8
        addi r30, r30, 8
        addi r27, r27, 8
        addi r21, r21, 8
        addi r4, r4, 8
        cmpw cr0, r20, r7
        bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1

we now emit:

.LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_2:       ; no_exit.1.i19
        lfdx f0, r21, r20
        lfdx f4, r3, r20
        lfdx f5, r27, r20
        lfdx f6, r22, r20
        lfdx f7, r5, r20
        lfdx f8, r6, r20
        lfdx f9, r30, r20
        lfdx f10, r11, r20
        lfdx f11, r12, r20
        fsub f10, f10, f11
        fadd f5, f4, f5
        fmul f5, f5, f1
        fadd f6, f6, f7
        fadd f6, f6, f8
        fadd f6, f6, f9
        fmadd f0, f5, f6, f0
        fnmsub f0, f10, f2, f0
        stfdx f0, r4, r20
        lfdx f0, r25, r20
        lfdx f5, r26, r20
        lfdx f6, r23, r20
        lfdx f9, r28, r20
        lfdx f10, r10, r20
        lfdx f12, r9, r20
        lfdx f13, r29, r20
        fsub f11, f13, f11
        fadd f4, f4, f5
        fmul f4, f4, f1
        fadd f5, f6, f9
        fadd f5, f5, f10
        fadd f5, f5, f12
        fnmsub f0, f4, f5, f0
        fnmsub f0, f11, f3, f0
        stfdx f0, r24, r20
        lfdx f0, r8, r20
        fsub f4, f7, f8
        fsub f5, f12, f10
        fnmsub f0, f5, f2, f0
        fnmsub f0, f4, f3, f0
        stfdx f0, r2, r20
        addi r19, r19, 1
        addi r20, r20, 8
        cmpw cr0, r19, r7
        bgt .LBB_main_L_90_no_exit_2E_0_2E_i16_no_exit_2E_1_2E_i19_1

llvm-svn: 22722

a091ff17
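
The first swim example in a091ff17 can be pictured at source level as follows. This is an illustrative C sketch only: `copy3_separate` models the old output (one induction variable per array), `copy3_shared` models the new output (one shared index used with base+index addressing, like `lfdx`/`stfdx`), and all names here are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Old shape: every array has its own pointer, and each one is bumped on
   every iteration. */
static void copy3_separate(double *d0, const double *s0,
                           double *d1, const double *s1,
                           double *d2, const double *s2, size_t n) {
    for (size_t i = 0; i < n; i++) {
        *d0++ = *s0++;
        *d1++ = *s1++;
        *d2++ = *s2++;
    }
}

/* New shape: the base pointers stay fixed as live-in values and one
   shared index is the only thing that changes. */
static void copy3_shared(double *d0, const double *s0,
                         double *d1, const double *s1,
                         double *d2, const double *s2, size_t n) {
    for (size_t i = 0; i < n; i++) {
        d0[i] = s0[i];
        d1[i] = s1[i];
        d2[i] = s2[i];
    }
}

/* Returns 1 if both shapes produce identical destination arrays. */
static int copy3_forms_agree(void) {
    double s0[8], s1[8], s2[8];
    double a0[8], a1[8], a2[8], b0[8], b1[8], b2[8];
    for (size_t i = 0; i < 8; i++) {
        s0[i] = (double)i;
        s1[i] = 10.0 + (double)i;
        s2[i] = 20.0 + (double)i;
    }
    copy3_separate(a0, s0, a1, s1, a2, s2, 8);
    copy3_shared(b0, s0, b1, s1, b2, s2, 8);
    return memcmp(a0, b0, sizeof a0) == 0 &&
           memcmp(a1, b1, sizeof a1) == 0 &&
           memcmp(a2, b2, sizeof a2) == 0;
}
```

Because all six accesses advance by the same stride, a single PHI can serve them, which is why the `addi` chain shrinks in the new listings.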

Suck the base value out of the UsersToProcess vector into the BasedUser · 37c24cc9
Chris Lattner authored Aug 08, 2005
```
class to simplify the code.  Fuse two loops.

llvm-svn: 22721
```
37c24cc9

Split MoveLoopVariantsToImediateField out from MoveImmediateValues. The · 37ed895b

Chris Lattner authored Aug 08, 2005

first is a correctness thing, and the latter is an optimization thing.  This also
is needed to support a future change.

llvm-svn: 22720

37ed895b

Factor out some common code, and be smarter about when to emit load hi/lo · c92787e1
Nate Begeman authored Aug 08, 2005
```
code sequences.

llvm-svn: 22719
```
c92787e1

Aug 08, 2005
- Allow tools with "consume after" options (like lli) to take more positional · d09a9a78
  Chris Lattner authored Aug 08, 2005
```
opts than they take directly.  Thanks to John C for pointing this problem
out to me!

llvm-svn: 22717
```
  d09a9a78
- Remove getImmediateForOpcode, which is now dead. · 64068eb7
  Chris Lattner authored Aug 08, 2005
```
Patch by Jim Laskey.

llvm-svn: 22716
```
  64068eb7
- Add new immediate handling support for mul/div. · 25388199
  Chris Lattner authored Aug 08, 2005
```
Patch by Jim Laskey!

llvm-svn: 22715
```
  25388199
- Add support for OR/XOR/SUB immediates that are handled with the new immediate · 8e9dc319
  Chris Lattner authored Aug 08, 2005
```
way.  This allows ORI/ORIS pairs, for example.

llvm-svn: 22714
```
  8e9dc319