Commits · 0c26a0b9024b77f7e8c78e648ee0bbd2e15ae3f8 · Roger Ferrer / llvm-epi-0.8

Aug 07, 2005

add a small simplification that can be exposed after promotion/expansion · 0c26a0b9
Chris Lattner authored Aug 07, 2005
```
llvm-svn: 22691
```
0c26a0b9

* Use the new PHINode::hasConstantValue method to simplify some code · f4dd8c44

Chris Lattner authored Aug 07, 2005

* Teach this code to move allocas out of the loop when tail call eliminating
  a call marked 'tail'.  This implements TailCallElim/move_alloca_for_tail_call.ll
* Do not perform this transformation if a call is marked 'tail' and if there
  are allocas that we cannot move out of the loop in #2.  Doing so would increase
  the stack usage of the function.  This fixes PR615 and implements
  TailCallElim/dont-tce-tail-marked-call.ll.

llvm-svn: 22690

f4dd8c44
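
As a rough source-level illustration of what tail call elimination does, the sketch below shows a tail-recursive function next to the loop it is conceptually turned into. The names `sumTo` and `sumToLoop` are hypothetical, and the real pass operates on LLVM IR rather than C++; the comments on stack slots restate the concern from the commit message and do not show the pass's actual logic.

```cpp
// Hypothetical illustration; the actual pass rewrites LLVM IR, not C++.

// Tail-recursive form: the recursive call is the last thing the function does.
long sumTo(long n, long acc) {
  if (n == 0)
    return acc;
  return sumTo(n - 1, acc + n);
}

// Loop form: the call becomes a jump back to the top with updated arguments.
// Any fixed-size stack slot the function needs has to be allocated once,
// before the loop. Allocating it on every iteration would grow the stack,
// which is the situation the second and third items above guard against.
long sumToLoop(long n, long acc) {
  for (;;) {
    if (n == 0)
      return acc;
    acc += n;
    n -= 1;
  }
}
```

Both functions compute the same result; the loop form simply reuses one stack frame for every step.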

Aug 06, 2005

Consolidate the GPOpt stuff to all use the Subtarget, instead of still · 983a415b

Chris Lattner authored Aug 05, 2005

depending on the command line option.  Now the command line option just
sets the subtarget as appropriate.  G5 opts will now default to on for
G5-enabled nightly testers, among other machines.

llvm-svn: 22688

983a415b

Aug 05, 2005
- adjust to change in getSubtarget() api · 158acab9
  Chris Lattner authored Aug 05, 2005
```
llvm-svn: 22687
```
  158acab9
- Enable gp optimizations by default when available, even when a target triple · 431b8d80
  Chris Lattner authored Aug 05, 2005
```
is available, since the target triple doesn't specify whether to use gpopts
or not.

llvm-svn: 22685
```
  431b8d80
- add a note · 11fc319b
  Chris Lattner authored Aug 05, 2005
```
llvm-svn: 22681
```
  11fc319b
- Change FindEarliestCallSeqEnd (used by libcall insertion) to use a set to · 96ad3132
  Chris Lattner authored Aug 05, 2005
```
avoid revisiting nodes more than once.  This eliminates a source of
potentially exponential behavior.  For a small function in 191.fma3d
(hexah_stress_divergence_), this speeds up isel from taking > 20mins to
taking 0.07s.

llvm-svn: 22680
```
  96ad3132
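
The effect of tracking visited nodes in a set can be seen in a small stand-alone sketch. It assumes a toy DAG in which every node uses the node below it twice; `Node`, `walkNaive`, `walkOnce`, and `makeLadder` are hypothetical names and are not taken from FindEarliestCallSeqEnd or the SelectionDAG code.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical DAG node; not LLVM's SDNode.
struct Node {
  std::vector<const Node *> operands;
};

// Naive walk: revisits a shared operand once per path that reaches it.
std::size_t walkNaive(const Node *n) {
  std::size_t visits = 1;
  for (const Node *op : n->operands)
    visits += walkNaive(op);
  return visits;
}

// Walk with a visited set: each node is processed at most once.
std::size_t walkOnce(const Node *n, std::unordered_set<const Node *> &visited) {
  if (!visited.insert(n).second)
    return 0; // already seen, do not descend again
  std::size_t visits = 1;
  for (const Node *op : n->operands)
    visits += walkOnce(op, visited);
  return visits;
}

// Builds a chain of `depth` nodes on top of a leaf, where each node uses the
// node below it twice, so the number of paths doubles at every level.
std::vector<Node> makeLadder(std::size_t depth) {
  std::vector<Node> nodes(depth + 1);
  for (std::size_t i = 1; i <= depth; ++i)
    nodes[i].operands = {&nodes[i - 1], &nodes[i - 1]};
  return nodes;
}
```

On a ladder of depth 10 the naive walk makes 2047 visits while the set-based walk makes 11, one per node. The same shape of blow-up is what the commit message describes as potentially exponential behavior.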
- Fix a use-of-dangling-pointer bug, from the introduction of SrcValue's. · 1095dc94
  Chris Lattner authored Aug 05, 2005
```
llvm-svn: 22679
```
  1095dc94
- Fix a latent bug in the libcall inserter that was exposed by Nate's patch · cabdc345
  Chris Lattner authored Aug 05, 2005
```
yesterday.  This fixes whetstone and a bunch of programs in the External tests.

llvm-svn: 22678
```
  cabdc345
- don't crash when running the PPC backend on non-ppc hosts without specifying · 8c636bf8
  Chris Lattner authored Aug 05, 2005
```
a subtarget.

llvm-svn: 22677
```
  8c636bf8
- PHINode::hasConstantValue should never return the PHI itself, even if the · 6e709c13
  Chris Lattner authored Aug 05, 2005
```
PHI is its only operand.

llvm-svn: 22676
```
  6e709c13
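
The rule stated in this commit can be sketched on a simplified model of a PHI node. `Value`, `Phi`, and `constantValueOf` below are hypothetical stand-ins, not LLVM's classes or the real `PHINode::hasConstantValue` signature. The sketch ignores the dominance check discussed in the neighbouring commits, and what LLVM returns when the PHI is its own only operand is not stated in the message, so the sketch simply returns `nullptr` there.

```cpp
#include <vector>

// Hypothetical stand-in for an SSA value; not LLVM's Value class.
struct Value {};

// Hypothetical stand-in for a PHI node with a list of incoming values.
struct Phi : Value {
  std::vector<const Value *> incoming;
};

// Returns the single value every incoming edge agrees on, ignoring edges
// that refer back to the PHI itself, or nullptr if there is no such value.
// The PHI is never returned, even when it is its own only operand.
const Value *constantValueOf(const Phi &phi) {
  const Value *common = nullptr;
  for (const Value *v : phi.incoming) {
    if (v == &phi)
      continue; // a self-reference adds no information
    if (common && common != v)
      return nullptr; // two distinct incoming values
    common = v;
  }
  return common; // nullptr when every operand was the PHI itself
}
```

Skipping self-references is what lets a loop PHI of the form `phi(a, self)` simplify to `a`, while the final case keeps the function from handing the PHI back as its own replacement.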
- Fix an iterator invalidation problem when we decide a phi has a constant value · 1749aaa5
  Chris Lattner authored Aug 05, 2005
```
llvm-svn: 22675
```
  1749aaa5
- Make sure to clean CastedPointers after casts are potentially deleted. · 11e7a5ed
  Chris Lattner authored Aug 05, 2005
```
This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas

llvm-svn: 22673
```
  11e7a5ed
- now that hasConstantValue defaults to only returning values that dominate · 9f9c260b
  Chris Lattner authored Aug 05, 2005
```
the PHI node, this ugly code can vanish.

llvm-svn: 22672
```
  9f9c260b
- Invoke instructions do not dominate all successors · 37774aff
  Chris Lattner authored Aug 05, 2005
```
llvm-svn: 22671
```
  37774aff
- Now that hasConstantValue is more careful w.r.t. returning values that only · 6f58350d
  Chris Lattner authored Aug 05, 2005
```
dominate the PHI node, this code can go away.  This also makes passes more
aggressive, e.g. implementing Transforms/CondProp/phisimplify2.ll

llvm-svn: 22670
```
  6f58350d
- Use the bool argument to hasConstantValue to decide whether the client is · bcd8d2c6
  Chris Lattner authored Aug 05, 2005
```
prepared to deal with return values that do not dominate the PHI.  If we
cannot prove that the result dominates the PHI node, do not return it if
the client can't cope.

llvm-svn: 22669
```
  bcd8d2c6
- This code can handle non-dominating instructions · 257efb2a
  Chris Lattner authored Aug 05, 2005
```
llvm-svn: 22667
```
  257efb2a
- Mark hasConstantValue as a const method · 1d8b2487
  Chris Lattner authored Aug 05, 2005
```
llvm-svn: 22666
```
  1d8b2487
- Add an extra parameter that Chris requested · 0a94dec7
  Nate Begeman authored Aug 04, 2005
```
llvm-svn: 22665
```
  0a94dec7
- Fix a fixme in CondPropagate.cpp by moving a PhiNode optimization into · b392321c
  Nate Begeman authored Aug 04, 2005
```
BasicBlock's removePredecessor routine.  This requires shuffling around
the definition and implementation of hasConstantValue from Utils.h,cpp into
Instructions.h,cpp

llvm-svn: 22664
```
  b392321c
- Modify how immediates are removed from base expressions to deal with the fact · 45f8b6e7
  Chris Lattner authored Aug 04, 2005
```
that the symbolic evaluator is not always able to use subtraction to remove
expressions.  This makes the code faster, and fixes the last crash on 178.galgel.
Finally, add a statistic to see how many phi nodes are inserted.

On 178.galgel, we get the following stats:

2562 loop-reduce  - Number of PHIs inserted
3927 loop-reduce  - Number of GEPs strength reduced

llvm-svn: 22662
```
  45f8b6e7
Aug 04, 2005

Fix a fixme in LegalizeDAG · 77558da5
Nate Begeman authored Aug 04, 2005
```
llvm-svn: 22661
```
77558da5
Hack to naturally align doubles in the constant pool. Remove this once we · e3cbe102
Nate Begeman authored Aug 04, 2005
```
know what The Right Thing To Do is.

llvm-svn: 22660
```
e3cbe102
Use the new subtarget support to automatically choose the correct ABI · 295ea906
Nate Begeman authored Aug 04, 2005
```
and asm printer for PowerPC if one is not specified.

llvm-svn: 22659
```
295ea906

* Refactor some code into a new BasedUser::RewriteInstructionToUseNewBase · a6d7c355

Chris Lattner authored Aug 04, 2005

  method.
* Fix a crash on 178.galgel, where we would insert expressions before PHI
  nodes instead of into the PHI node predecessor blocks.

llvm-svn: 22657

a6d7c355

Fix a case that caused this to crash on 178.galgel · 0f7c0fa2
Chris Lattner authored Aug 04, 2005
```
llvm-svn: 22653
```
0f7c0fa2

Teach LSR about loop-variant expressions, such as loops like this: · acc42c4d

Chris Lattner authored Aug 04, 2005

  for (i = 0; i < N; ++i)
    A[i][foo()] = 0;

Here we still want to strength reduce the A[i] part, even though foo() is
loop-variant.

This also simplifies some of the 'CanReduce' logic.

This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll

llvm-svn: 22652

acc42c4d
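
A source-level sketch of the idea, under stated assumptions: the row width `Cols`, the function names, and the `col` callback (standing in for the loop-variant `foo()`, and given an argument only so the result is reproducible) are all hypothetical. The real transformation happens on LLVM IR inside the loop strength reduction pass.

```cpp
#include <cstddef>

constexpr std::size_t Cols = 4; // hypothetical row width

// Straightforward form: the address of A[i][col(i)] is recomputed from i
// on every iteration, which involves a multiply by the row size.
void clearIndexed(int (*A)[Cols], std::size_t N,
                  std::size_t (*col)(std::size_t)) {
  for (std::size_t i = 0; i < N; ++i)
    A[i][col(i)] = 0;
}

// Strength-reduced form: the A[i] part becomes a row pointer that advances
// by one row per iteration; only the loop-variant column is added each time.
void clearReduced(int (*A)[Cols], std::size_t N,
                  std::size_t (*col)(std::size_t)) {
  int (*row)[Cols] = A;
  for (std::size_t i = 0; i < N; ++i, ++row)
    (*row)[col(i)] = 0;
}
```

Both functions write the same elements; the second replaces the per-iteration multiply with a pointer increment and leaves the loop-variant part of the address untouched.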

Remove some more dead code. · 456044b7
Nate Begeman authored Aug 04, 2005
```
llvm-svn: 22650
```
456044b7

Refactor this code substantially with the following improvements: · eaf24725

Chris Lattner authored Aug 04, 2005

  1. We only analyze instructions once, guaranteed
  2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with
     something much simpler.

The next step is to handle expressions that are not all indvar+loop-invariant
values (e.g. handling indvar+loopvariant).

llvm-svn: 22649

eaf24725

No, IDEFs shouldn't be JITed · 5adb830b
Andrew Lenharth authored Aug 04, 2005
```
llvm-svn: 22648
```
5adb830b
* Unbreak release build · a54e201e
Misha Brukman authored Aug 04, 2005
```
* Add comments to #endif pragmas for readability

llvm-svn: 22647
```
a54e201e
* Unbreak optimized build (noticed by Eric van Riet Paap) · 41acd5e0
Misha Brukman authored Aug 04, 2005
```
* Comment #endif clauses for readability

llvm-svn: 22646
```
41acd5e0
Add Subtarget support to PowerPC. Next up, using it. · 3bcfcd94
Nate Begeman authored Aug 04, 2005
```
llvm-svn: 22644
```
3bcfcd94
refactor some code · 6f286b76
Chris Lattner authored Aug 04, 2005
```
llvm-svn: 22643
```
6f286b76
invert two if's to make the logic simpler · 65107490
Chris Lattner authored Aug 04, 2005
```
llvm-svn: 22641
```
65107490

When processing outer loops and we find uses of an IV in inner loops, make · a0102fbc

Chris Lattner authored Aug 04, 2005

sure to handle the use, just don't recurse into it.

This permits us to generate this code for a simple nested loop case:

.LBB_foo_0:     ; entry
        stwu r1, -48(r1)
        stw r29, 44(r1)
        stw r30, 40(r1)
        mflr r11
        stw r11, 56(r1)
        lis r2, ha16(L_A$non_lazy_ptr)
        lwz r30, lo16(L_A$non_lazy_ptr)(r2)
        li r29, 1
.LBB_foo_1:     ; no_exit.0
        bl L_bar$stub
        li r2, 1
        or r3, r30, r30
.LBB_foo_2:     ; no_exit.1
        lfd f0, 8(r3)
        stfd f0, 0(r3)
        addi r4, r2, 1
        addi r3, r3, 8
        cmpwi cr0, r2, 100
        or r2, r4, r4
        bne .LBB_foo_2  ; no_exit.1
.LBB_foo_3:     ; loopexit.1
        addi r30, r30, 800
        addi r2, r29, 1
        cmpwi cr0, r29, 100
        or r29, r2, r2
        bne .LBB_foo_1  ; no_exit.0
.LBB_foo_4:     ; return
        lwz r11, 56(r1)
        mtlr r11
        lwz r30, 40(r1)
        lwz r29, 44(r1)
        lwz r1, 0(r1)
        blr

instead of this:

_foo:
.LBB_foo_0:     ; entry
        stwu r1, -48(r1)
        stw r28, 44(r1)                   ;; uses an extra register.
        stw r29, 40(r1)
        stw r30, 36(r1)
        mflr r11
        stw r11, 56(r1)
        li r30, 1
        li r29, 0
        or r28, r29, r29
.LBB_foo_1:     ; no_exit.0
        bl L_bar$stub
        mulli r2, r28, 800           ;; unstrength-reduced multiply
        lis r3, ha16(L_A$non_lazy_ptr)   ;; loop invariant address computation
        lwz r3, lo16(L_A$non_lazy_ptr)(r3)
        add r2, r2, r3
        mulli r4, r29, 800           ;; unstrength-reduced multiply
        addi r3, r3, 8
        add r3, r4, r3
        li r4, 1
.LBB_foo_2:     ; no_exit.1
        lfd f0, 0(r3)
        stfd f0, 0(r2)
        addi r5, r4, 1
        addi r2, r2, 8                 ;; multiple stride 8 IV's
        addi r3, r3, 8
        cmpwi cr0, r4, 100
        or r4, r5, r5
        bne .LBB_foo_2  ; no_exit.1
.LBB_foo_3:     ; loopexit.1
        addi r28, r28, 1               ;;; Many IV's with stride 1
        addi r29, r29, 1
        addi r2, r30, 1
        cmpwi cr0, r30, 100
        or r30, r2, r2
        bne .LBB_foo_1  ; no_exit.0
.LBB_foo_4:     ; return
        lwz r11, 56(r1)
        mtlr r11
        lwz r30, 36(r1)
        lwz r29, 40(r1)
        lwz r28, 44(r1)
        lwz r1, 0(r1)
        blr

llvm-svn: 22640

a0102fbc
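
The commit does not show the source that produced these listings. As a hedged guess at its general shape, the sketch below copies each element from its right-hand neighbour in a two-dimensional array, first with plain indexing and then in the pointer form that the improved output resembles. `Rows`, `Cols`, and both function names are hypothetical, and the call to `bar` seen in the assembly is left out.

```cpp
#include <cstddef>

constexpr std::size_t Rows = 2; // hypothetical sizes, chosen for the sketch
constexpr std::size_t Cols = 3;

// Index form: every access recomputes its address from i and j.
void shiftIndexed(double (*A)[Cols]) {
  for (std::size_t i = 0; i < Rows; ++i)
    for (std::size_t j = 0; j + 1 < Cols; ++j)
      A[i][j] = A[i][j + 1];
}

// Pointer form: the outer loop advances one row pointer by a whole row, and
// the inner loop advances a single element pointer, reading the neighbour at
// a constant offset from it instead of keeping a second pointer.
void shiftReduced(double (*A)[Cols]) {
  double (*row)[Cols] = A;
  for (std::size_t i = 0; i < Rows; ++i, ++row) {
    double *p = *row;
    for (std::size_t j = 0; j + 1 < Cols; ++j, ++p)
      p[0] = p[1];
  }
}
```

In the pointer form there is one induction variable per loop level and no multiply, which corresponds to the single `addi r30, r30, 800` in the outer loop and the single `addi r3, r3, 8` in the inner loop of the first listing.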

Teach loop-reduce to see into nested loops, to pull out immediate values · fc624704

Chris Lattner authored Aug 03, 2005

pushed down by SCEV.

In a nested loop case, this allows us to emit this:

        lis r3, ha16(L_A$non_lazy_ptr)
        lwz r3, lo16(L_A$non_lazy_ptr)(r3)
        add r2, r2, r3
        li r3, 1
.LBB_foo_2:     ; no_exit.1
        lfd f0, 8(r2)        ;; Uses offset of 8 instead of 0
        stfd f0, 0(r2)
        addi r4, r3, 1
        addi r2, r2, 8
        cmpwi cr0, r3, 100
        or r3, r4, r4
        bne .LBB_foo_2  ; no_exit.1

instead of this:

        lis r3, ha16(L_A$non_lazy_ptr)
        lwz r3, lo16(L_A$non_lazy_ptr)(r3)
        add r2, r2, r3
        addi r3, r3, 8
        li r4, 1
.LBB_foo_2:     ; no_exit.1
        lfd f0, 0(r3)
        stfd f0, 0(r2)
        addi r5, r4, 1
        addi r2, r2, 8
        addi r3, r3, 8
        cmpwi cr0, r4, 100
        or r4, r5, r5
        bne .LBB_foo_2  ; no_exit.1

llvm-svn: 22639

fc624704

improve debug output · bb78c97e
Chris Lattner authored Aug 03, 2005
```
llvm-svn: 22638
```
bb78c97e

Scalar SSE: load +0.0 -> xorps/xorpd · 8d394eb7

Nate Begeman authored Aug 03, 2005

Scalar SSE: a < b ? c : 0.0 -> cmpss, andps
Scalar SSE: float -> i16 needs to be promoted

llvm-svn: 22637

8d394eb7
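
The first two patterns rest on bit-level facts that can be checked in portable C++. The sketch below emulates them on 32-bit integers; it assumes `float` is a 32-bit IEEE-754 type, uses the hypothetical names `selectOrZero` and `zeroByXor`, and is not the backend's instruction selection code.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Emulates the compare-then-mask idea behind `a < b ? c : 0.0`:
// the compare produces a mask, and an AND applies it to c.
float selectOrZero(float a, float b, float c) {
  // All-ones bits when the compare is true, all-zero bits when it is false.
  std::uint32_t mask = (a < b) ? 0xFFFFFFFFu : 0u;

  std::uint32_t bits;
  std::memcpy(&bits, &c, sizeof bits); // reinterpret c as raw bits
  bits &= mask;                        // keep c, or clear to the bits of +0.0

  float result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}

// XORing any bit pattern with itself gives all zeros, which is +0.0,
// so a register can be zeroed without loading a constant from memory.
float zeroByXor(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits ^= bits;
  float result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}
```

The masking trick only works because the "else" value is +0.0, whose bit pattern is all zeros; a different constant would need an extra blend step.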