- Aug 07, 2005
-
-
Chris Lattner authored
llvm-svn: 22691
-
Chris Lattner authored
* Teach this code to move allocas out of the loop when tail call eliminating a call marked 'tail'. This implements TailCallElim/move_alloca_for_tail_call.ll * Do not perform this transformation if a call is marked 'tail' and if there are allocas that we cannot move out of the loop in #2. Doing so would increase the stack usage of the function. This implements fixes PR615 and TailCallElim/dont-tce-tail-marked-call.ll. llvm-svn: 22690
-
- Aug 06, 2005
-
-
Chris Lattner authored
depending on the command line option. Now the command line option just sets the subtarget as appropriate. G5 opts will now default to on on G5-enabled nightly testers among other machines. llvm-svn: 22688
-
- Aug 05, 2005
-
-
Chris Lattner authored
llvm-svn: 22687
-
Chris Lattner authored
is available, since the target triple doesn't specify whether to use gpopts or not. llvm-svn: 22685
-
Chris Lattner authored
llvm-svn: 22681
-
Chris Lattner authored
avoid revisiting nodes more than once. This eliminates a source of potentially exponential behavior. For a small function in 191.fma3d (hexah_stress_divergence_), this speeds up isel from taking > 20mins to taking 0.07s. llvm-svn: 22680
-
Chris Lattner authored
llvm-svn: 22679
-
Chris Lattner authored
yesterday. This fixes whetstone and a bunch of programs in the External tests. llvm-svn: 22678
-
Chris Lattner authored
a subtarget. llvm-svn: 22677
-
Chris Lattner authored
PHI is its only operand. llvm-svn: 22676
-
Chris Lattner authored
llvm-svn: 22675
-
Chris Lattner authored
This fixes LSR crashes on 301.apsi, 191.fma3d, and 189.lucas llvm-svn: 22673
-
Chris Lattner authored
the PHI node, this ugly code can vanish. llvm-svn: 22672
-
Chris Lattner authored
llvm-svn: 22671
-
Chris Lattner authored
dominate the PHI node, this code can go away. This also makes passes more aggressive, e.g. implementing Transforms/CondProp/phisimplify2.ll llvm-svn: 22670
-
Chris Lattner authored
prepared to deal with return values that do not dominate the PHI. If we cannot prove that the result dominates the PHI node, do not return it if the client can't cope. llvm-svn: 22669
-
Chris Lattner authored
llvm-svn: 22667
-
Chris Lattner authored
llvm-svn: 22666
-
Nate Begeman authored
llvm-svn: 22665
-
Nate Begeman authored
BasicBlock's removePredecessor routine. This requires shuffling around the definition and implementation of hasContantValue from Utils.h,cpp into Instructions.h,cpp llvm-svn: 22664
-
Chris Lattner authored
that the symbolic evaluator is not always able to use subtraction to remove expressions. This makes the code faster, and fixes the last crash on 178.galgel. Finally, add a statistic to see how many phi nodes are inserted. On 178.galgel, we get the follow stats: 2562 loop-reduce - Number of PHIs inserted 3927 loop-reduce - Number of GEPs strength reduced llvm-svn: 22662
-
- Aug 04, 2005
-
-
Nate Begeman authored
llvm-svn: 22661
-
Nate Begeman authored
know what The Right Thing To Do is. llvm-svn: 22660
-
Nate Begeman authored
and asm printer for PowerPC if one is not specified. llvm-svn: 22659
-
Chris Lattner authored
method. * Fix a crash on 178.galgel, where we would insert expressions before PHI nodes instead of into the PHI node predecessor blocks. llvm-svn: 22657
-
Chris Lattner authored
llvm-svn: 22653
-
Chris Lattner authored
for (i = 0; i < N; ++i) A[i][foo()] = 0; here we still want to strength reduce the A[i] part, even though foo() is l-v. This also simplifies some of the 'CanReduce' logic. This implements Transforms/LoopStrengthReduce/ops_after_indvar.ll llvm-svn: 22652
-
Nate Begeman authored
llvm-svn: 22650
-
Chris Lattner authored
1. We only analyze instructions once, guaranteed 2. AnalyzeGetElementPtrUsers has been ripped apart and replaced with something much simpler. The next step is to handle expressions that are not all indvar+loop-invariant values (e.g. handling indvar+loopvariant). llvm-svn: 22649
-
Andrew Lenharth authored
llvm-svn: 22648
-
Misha Brukman authored
* Add comments to #endif pragmas for readability llvm-svn: 22647
-
Misha Brukman authored
* Comment #endif clauses for readability llvm-svn: 22646
-
Nate Begeman authored
llvm-svn: 22644
-
Chris Lattner authored
llvm-svn: 22643
-
Chris Lattner authored
llvm-svn: 22641
-
Chris Lattner authored
sure to handle the use, just don't recurse into it. This permits us to generate this code for a simple nested loop case: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r29, 44(r1) stw r30, 40(r1) mflr r11 stw r11, 56(r1) lis r2, ha16(L_A$non_lazy_ptr) lwz r30, lo16(L_A$non_lazy_ptr)(r2) li r29, 1 .LBB_foo_1: ; no_exit.0 bl L_bar$stub li r2, 1 or r3, r30, r30 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r3) stfd f0, 0(r3) addi r4, r2, 1 addi r3, r3, 8 cmpwi cr0, r2, 100 or r2, r4, r4 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r30, r30, 800 addi r2, r29, 1 cmpwi cr0, r29, 100 or r29, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 40(r1) lwz r29, 44(r1) lwz r1, 0(r1) blr instead of this: _foo: .LBB_foo_0: ; entry stwu r1, -48(r1) stw r28, 44(r1) ;; uses an extra register. stw r29, 40(r1) stw r30, 36(r1) mflr r11 stw r11, 56(r1) li r30, 1 li r29, 0 or r28, r29, r29 .LBB_foo_1: ; no_exit.0 bl L_bar$stub mulli r2, r28, 800 ;; unstrength-reduced multiply lis r3, ha16(L_A$non_lazy_ptr) ;; loop invariant address computation lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 mulli r4, r29, 800 ;; unstrength-reduced multiply addi r3, r3, 8 add r3, r4, r3 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 ;; multiple stride 8 IV's addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 .LBB_foo_3: ; loopexit.1 addi r28, r28, 1 ;;; Many IV's with stride 1 addi r29, r29, 1 addi r2, r30, 1 cmpwi cr0, r30, 100 or r30, r2, r2 bne .LBB_foo_1 ; no_exit.0 .LBB_foo_4: ; return lwz r11, 56(r1) mtlr r11 lwz r30, 36(r1) lwz r29, 40(r1) lwz r28, 44(r1) lwz r1, 0(r1) blr llvm-svn: 22640
-
Chris Lattner authored
pushed down by SCEV. In a nested loop case, this allows us to emit this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 li r3, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 8(r2) ;; Uses offset of 8 instead of 0 stfd f0, 0(r2) addi r4, r3, 1 addi r2, r2, 8 cmpwi cr0, r3, 100 or r3, r4, r4 bne .LBB_foo_2 ; no_exit.1 instead of this: lis r3, ha16(L_A$non_lazy_ptr) lwz r3, lo16(L_A$non_lazy_ptr)(r3) add r2, r2, r3 addi r3, r3, 8 li r4, 1 .LBB_foo_2: ; no_exit.1 lfd f0, 0(r3) stfd f0, 0(r2) addi r5, r4, 1 addi r2, r2, 8 addi r3, r3, 8 cmpwi cr0, r4, 100 or r4, r5, r5 bne .LBB_foo_2 ; no_exit.1 llvm-svn: 22639
-
Chris Lattner authored
llvm-svn: 22638
-
Nate Begeman authored
Scalar SSE: a < b ? c : 0.0 -> cmpss, andps Scalar SSE: float -> i16 needs to be promoted llvm-svn: 22637
-