  Oct 05, 2006
      add a new SimplifyDemandedVectorElts method, which works similarly to · 2deeaeac
      Chris Lattner authored
      SimplifyDemandedBits.  The idea is that some operations can be simplified if
      not all of the computed elements are needed.  Some targets (like x86) have a
       large number of intrinsics that operate on a single element but pass the other
       elements through unmodified.  If those other elements are not needed, the
      intrinsics can be simplified to scalar operations, and insertelement ops can
      be removed.
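
       The bookkeeping behind this can be sketched outside of LLVM.  The snippet
       below is a hypothetical, self-contained illustration (the struct, the
       names, and the mask handling are invented for this sketch; it is not the
       InstCombine implementation): given which lanes of the final vector its
       user reads, walking the insertelement chain backwards shows which inserts
       are dead and which lanes must still come from earlier values.

       // Hypothetical, self-contained sketch of the demanded-elements walk (not
       // the actual InstCombine code).
       #include <cstdint>
       #include <cstdio>
       #include <vector>

       struct InsertElt { unsigned Lane; const char *Scalar; };

       int main() {
         // Models: %v1 = insertelement undef, %f, 0; %v2 = insertelement %v1, 0.0, 1; ...
         std::vector<InsertElt> Chain = {{0, "%f"}, {1, "0.0"}, {2, "0.0"}, {3, "0.0"}};

         // The final user (an *.ss intrinsic feeding cvttss2si) reads lane 0 only.
         uint64_t Demanded = 1u << 0;

         // An insert is live only if its lane is still demanded; once handled,
         // that lane no longer has to be produced by anything earlier.
         for (auto I = Chain.rbegin(); I != Chain.rend(); ++I) {
           bool Live = (Demanded >> I->Lane) & 1;
           std::printf("lane %u (%s): %s\n", I->Lane, I->Scalar,
                       Live ? "demanded" : "dead, insert can be dropped");
           Demanded &= ~(uint64_t(1) << I->Lane);
         }
         std::printf("lanes still demanded of the initial vector: %llu\n",
                     (unsigned long long)Demanded);
       }

       Running this reports the three zero inserts as dead and lane 0 as the only
       lane still needed, which is why the chain below collapses to scalar code.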
      
       For example, this turns:
      
      ushort %Convert_sse(float %f) {
              %tmp = insertelement <4 x float> undef, float %f, uint 0                ; <<4 x float>> [#uses=1]
              %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1             ; <<4 x float>> [#uses=1]
              %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2           ; <<4 x float>> [#uses=1]
              %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3           ; <<4 x float>> [#uses=1]
              %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
              %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
              %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
              %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer )          ; <<4 x float>> [#uses=1]
              %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
              %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
              ret ushort %tmp69
      }
      
      into:
      
      ushort %Convert_sse(float %f) {
      entry:
              %tmp28 = sub float %f, 1.000000e+00             ; <float> [#uses=1]
              %tmp37 = mul float %tmp28, 5.000000e-01         ; <float> [#uses=1]
              %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0         ; <<4 x float>> [#uses=1]
              %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > )           ; <<4 x float>> [#uses=1]
              %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > )            ; <<4 x float>> [#uses=1]
              %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
              %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
              ret ushort %tmp69
      }
      
      which improves codegen from:
      
      _Convert_sse:
              movss LCPI1_0, %xmm0
              movss 4(%esp), %xmm1
              subss %xmm0, %xmm1
              movss LCPI1_1, %xmm0
              mulss %xmm0, %xmm1
              movss LCPI1_2, %xmm0
              minss %xmm0, %xmm1
              xorps %xmm0, %xmm0
              maxss %xmm0, %xmm1
              cvttss2si %xmm1, %eax
              andl $65535, %eax
              ret
      
      to:
      
      _Convert_sse:
              movss 4(%esp), %xmm0
              subss LCPI1_0, %xmm0
              mulss LCPI1_1, %xmm0
              movss LCPI1_2, %xmm1
              minss %xmm1, %xmm0
              xorps %xmm1, %xmm1
              maxss %xmm1, %xmm0
              cvttss2si %xmm0, %eax
              andl $65535, %eax
              ret
      
      
       This is just a first step; it can be extended in many ways.  Testcase here:
      Transforms/InstCombine/vec_demanded_elts.ll
      
      llvm-svn: 30752
  Sep 23, 2006
      Style changes only. Remove dead code, fix a comment. · 059c7926
      Nick Lewycky authored
      llvm-svn: 30588
      Be far more careful when splitting a loop header, either to form a preheader · 6bd6da40
      Chris Lattner authored
      or when splitting loops with a common header into multiple loops.  In particular
      the old code would always insert the preheader before the old loop header.  This
       is disastrous in cases where the loop hasn't been rotated.  For example, it can
      produce code like:
      
              .. outside the loop...
              jmp LBB1_2      #bb13.outer
      LBB1_1: #bb1
              movsd 8(%esp,%esi,8), %xmm1
              mulsd (%edi), %xmm1
              addsd %xmm0, %xmm1
              addl $24, %edi
              incl %esi
              jmp LBB1_3      #bb13
      LBB1_2: #bb13.outer
              leal (%edx,%eax,8), %edi
              pxor %xmm1, %xmm1
              xorl %esi, %esi
      LBB1_3: #bb13
              movapd %xmm1, %xmm0
              cmpl $4, %esi
              jl LBB1_1       #bb1
      
      Note that the loop body is actually LBB1_1 + LBB1_3, which means that the
       loop now contains an unconditional branch WITHIN it to jump around the inserted
      loop header (LBB1_2).  Doh.
      
      This patch changes the preheader insertion code to insert it in the right
      spot, producing this code:
      
              ... outside the loop, fall into the header ...
      LBB1_1: #bb13.outer
              leal (%edx,%eax,8), %esi
              pxor %xmm0, %xmm0
              xorl %edi, %edi
              jmp LBB1_3      #bb13
      LBB1_2: #bb1
              movsd 8(%esp,%edi,8), %xmm0
              mulsd (%esi), %xmm0
              addsd %xmm1, %xmm0
              addl $24, %esi
              incl %edi
      LBB1_3: #bb13
              movapd %xmm0, %xmm1
              cmpl $4, %edi
              jl LBB1_2       #bb1
      
       Totally crazy: no unconditional branch left inside the loop! :)
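
       To make the layout difference concrete, here is a hypothetical C++
       rendition of the two block orders above, written with gotos so the branch
       structure is explicit (it is not the source the assembly came from).  In
       the old layout every iteration executes an unconditional jump over the
       preheader to reach the header; in the new layout control falls through
       the preheader once and the loop body stays contiguous.

       #include <cstdio>

       // Old layout: the preheader (bb13_outer) sits between the body (bb1) and
       // the header (bb13), so every iteration jumps over it to reach the header.
       static int old_layout() {
         int i = 0, sum = 0;
         goto bb13_outer;          // jump over the body to reach the preheader
       bb1:
         sum += i;                 // ...loop body...
         ++i;
         goto bb13;                // unconditional branch taken on every iteration
       bb13_outer:
         i = 0;                    // ...loop setup, runs once...
       bb13:
         if (i < 4) goto bb1;      // loop test in the header
         return sum;
       }

       // New layout: the preheader comes first and falls through into the loop;
       // the only branch left inside the loop is the conditional backedge.
       static int new_layout() {
         int i = 0, sum = 0;
         i = 0;                    // bb13_outer: ...loop setup, runs once...
         goto bb13;                // executed once, outside the loop
       bb1:
         sum += i;                 // ...loop body...
         ++i;
       bb13:
         if (i < 4) goto bb1;
         return sum;
       }

       int main() { std::printf("%d %d\n", old_layout(), new_layout()); }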
      
      llvm-svn: 30587
      Teach UpdateDomInfoForRevectoredPreds to handle revectored preds that are not · 608cd05e
      Chris Lattner authored
       reachable, making it general-purpose enough for use by InsertPreheaderForLoop.
       Eliminate the custom dominator-info updating code in InsertPreheaderForLoop, using
      UpdateDomInfoForRevectoredPreds instead.
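
       The rule being reused can be sketched in isolation.  The toy program
       below (the block names and IDom map are made up; this is not the actual
       UpdateDomInfoForRevectoredPreds code) shows the idea: the new block's
       immediate dominator is the nearest common dominator of its reachable
       revectored predecessors, and predecessors with no dominator info, i.e.
       unreachable ones, are simply skipped.

       #include <cstdio>
       #include <map>
       #include <string>
       #include <vector>

       using Block = std::string;

       // Toy immediate-dominator map; blocks absent from it are unreachable.
       static std::map<Block, Block> IDom = {
           {"entry", "entry"}, {"a", "entry"}, {"b", "entry"}, {"c", "a"}};

       static int depth(Block B) {              // depth in the dominator tree
         int D = 0;
         while (IDom.at(B) != B) { B = IDom.at(B); ++D; }
         return D;
       }

       static Block nearestCommonDom(Block X, Block Y) {
         while (X != Y) {
           if (depth(X) < depth(Y)) Y = IDom.at(Y);
           else                     X = IDom.at(X);
         }
         return X;
       }

       int main() {
         // Predecessors revectored to the new block; "dead" is unreachable.
         std::vector<Block> Preds = {"c", "b", "dead"};
         Block NewIDom;
         bool Found = false;
         for (const Block &P : Preds) {
           if (!IDom.count(P)) continue;        // skip unreachable predecessors
           NewIDom = Found ? nearestCommonDom(NewIDom, P) : P;
           Found = true;
         }
         if (Found) std::printf("idom(NewBB) = %s\n", NewIDom.c_str());
       }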
      
      llvm-svn: 30586