Commits · 16ae43e90174b5bba0ee7a926bf687e501960c9f · Roger Ferrer / llvm-epi-0.8

Oct 06, 2006
- MachineBasicBlock::splice was incorrectly updating parent pointers on · 16ae43e9
  Chris Lattner authored Oct 06, 2006
```
instructions.

llvm-svn: 30760
```
  16ae43e9
- Make use of getStore(). · df9ac47e
  Evan Cheng authored Oct 05, 2006
```
llvm-svn: 30759
```
  df9ac47e
- Add getStore() helper function to create ISD::STORE nodes. · af309d29
  Evan Cheng authored Oct 05, 2006
```
llvm-svn: 30758
```
  af309d29
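The splice fix at the top of this list (16ae43e9) concerns an invariant that is easy to state in isolation: when instructions move from one block to another, each moved instruction must end up pointing at its new parent. The following is a minimal sketch of that invariant only, not the actual LLVM change. `Instr`, `Block`, and their members are hypothetical names, and `std::list` stands in for LLVM's intrusive list.

```cpp
#include <cassert>
#include <list>

struct Block;

// Hypothetical stand-in for a machine instruction: it records its owning block.
struct Instr {
  int id;
  Block* parent;
};

// Hypothetical stand-in for a machine basic block.
struct Block {
  std::list<Instr> instrs;

  void push_back(int id) { instrs.push_back({id, this}); }

  // Move [first, last) out of `from` and insert it before `where` in this
  // block. Every moved instruction is re-parented to this block.
  void splice(std::list<Instr>::iterator where, Block& from,
              std::list<Instr>::iterator first,
              std::list<Instr>::iterator last) {
    for (auto it = first; it != last; ++it)
      it->parent = this;
    instrs.splice(where, from.instrs, first, last);
  }
};
```

Re-parenting before the list splice keeps the loop simple, because the range `[first, last)` is still contiguous in the source list at that point.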
Oct 05, 2006

Don't crash if an MBB doesn't have an LLVM BB · 8b1a59a2
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30757
```
8b1a59a2
use a const ref for passing the vector to ArgumentLayout · decfeca5
Rafael Espindola authored Oct 05, 2006
```
llvm-svn: 30756
```
decfeca5
implement a ArgumentLayout class to factor code common to LowerFORMAL_ARGUMENTS and LowerCALL · e04df41c
Rafael Espindola authored Oct 05, 2006
```
implement FMDRR
add support for f64 function arguments

llvm-svn: 30754
```
e04df41c
Alias analysis code clean ups. · 6549d22e
Jim Laskey authored Oct 05, 2006
```
llvm-svn: 30753
```
6549d22e

add a new SimplifyDemandedVectorElts method, which works similarly to · 2deeaeac
Chris Lattner authored Oct 05, 2006
```
SimplifyDemandedBits.  The idea is that some operations can be simplified if
not all of the computed elements are needed.  Some targets (like x86) have a
large number of intrinsics that operate on a single element, but pass other
elts through unmodified.  If those other elements are not needed, the
intrinsics can be simplified to scalar operations, and insertelement ops can
be removed.

This turns (f.e.):

ushort %Convert_sse(float %f) {
        %tmp = insertelement <4 x float> undef, float %f, uint 0                ; <<4 x float>> [#uses=1]
        %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1             ; <<4 x float>> [#uses=1]
        %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2           ; <<4 x float>> [#uses=1]
        %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3           ; <<4 x float>> [#uses=1]
        %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer )          ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

into:

ushort %Convert_sse(float %f) {
entry:
        %tmp28 = sub float %f, 1.000000e+00             ; <float> [#uses=1]
        %tmp37 = mul float %tmp28, 5.000000e-01         ; <float> [#uses=1]
        %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0         ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > )           ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > )            ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

which improves codegen from:

_Convert_sse:
        movss LCPI1_0, %xmm0
        movss 4(%esp), %xmm1
        subss %xmm0, %xmm1
        movss LCPI1_1, %xmm0
        mulss %xmm0, %xmm1
        movss LCPI1_2, %xmm0
        minss %xmm0, %xmm1
        xorps %xmm0, %xmm0
        maxss %xmm0, %xmm1
        cvttss2si %xmm1, %eax
        andl $65535, %eax
        ret

to:

_Convert_sse:
        movss 4(%esp), %xmm0
        subss LCPI1_0, %xmm0
        mulss LCPI1_1, %xmm0
        movss LCPI1_2, %xmm1
        minss %xmm1, %xmm0
        xorps %xmm1, %xmm1
        maxss %xmm1, %xmm0
        cvttss2si %xmm0, %eax
        andl $65535, %eax
        ret

This is just a first step, it can be extended in many ways.  Testcase here:
Transforms/InstCombine/vec_demanded_elts.ll

llvm-svn: 30752
```
2deeaeac
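The demanded-elements idea in the commit above can be illustrated with a toy model. This is a sketch under stated assumptions, not the InstCombine implementation: all names here (`LaneMask`, `demandForScalarOp`, `insertIsRemovable`, `demandThroughInsert`, `removableInsertLanes`) are hypothetical. It tracks which lanes of a 4-lane vector are needed while walking the `Convert_sse` chain backwards from `cvttss2si`, which reads lane 0 only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bit i set means lane i of a 4-lane vector is needed by some user.
using LaneMask = std::uint8_t;
constexpr LaneMask kLane0 = 0x1;

// Lanes demanded from each operand of a single-element intrinsic such as
// sub.ss: lane 0 is computed from lane 0 of both operands, and lanes 1..3
// are passed through from the first operand.
struct OperandDemand {
  LaneMask lhs;
  LaneMask rhs;
};

OperandDemand demandForScalarOp(LaneMask demanded) {
  return {demanded, static_cast<LaneMask>(demanded & kLane0)};
}

// An insertelement writes one lane and passes the others through, so it can
// be dropped when the lane it writes is not demanded.
bool insertIsRemovable(LaneMask demanded, unsigned lane) {
  assert(lane < 4);
  return (demanded & (1u << lane)) == 0;
}

// Lanes of the incoming vector that an insertelement still needs.
LaneMask demandThroughInsert(LaneMask demanded, unsigned lane) {
  assert(lane < 4);
  return static_cast<LaneMask>(demanded & ~(1u << lane));
}

// Walk the chain from the commit message backwards and report which
// insertelement lanes turn out to be unnecessary.
std::vector<unsigned> removableInsertLanes() {
  LaneMask demanded = kLane0;              // cvttss2si reads lane 0 only
  for (int i = 0; i < 4; ++i)              // max.ss, min.ss, mul.ss, sub.ss
    demanded = demandForScalarOp(demanded).lhs;
  std::vector<unsigned> removable;
  for (unsigned lane = 4; lane-- > 0;) {   // inserts into lanes 3, 2, 1, 0
    if (insertIsRemovable(demanded, lane))
      removable.push_back(lane);
    demanded = demandThroughInsert(demanded, lane);
  }
  return removable;
}
```

In this model the inserts into lanes 3, 2, and 1 are reported as removable, which is consistent with the transformed IR in the commit message, where only the lane-0 insert survives and the unused constant lanes become `undef`.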

new testcase · 3d5e9818
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30751
```
3d5e9818
Add insertelement/extractelement helper ctors. · 65511ff6
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30750
```
65511ff6
Lower some min/max idioms to minss/maxss when unsafe fp math is enabled. · f2ef2435
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30748
```
f2ef2435
Check that jump tables wind up in the rodata section · 16b8f958
Andrew Lenharth authored Oct 05, 2006
```
llvm-svn: 30747
```
16b8f958
remove JumpTableTextSection · 40a95dd3
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30746
```
40a95dd3
Don't bother setting JumpTableTextSection, it is about to disappear · 8cfd10ef
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30745
```
8cfd10ef
Emit pic jumptables to the same section that the function is emitted to, · 66c1625a
Chris Lattner authored Oct 05, 2006
```
allowing label differences to work.  This fixes CodeGen/X86/pic_jumptable.ll

llvm-svn: 30744
```
66c1625a
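The "label differences" mentioned in the commit above are what make a PIC jump table position independent: each entry stores the distance from the table to a target rather than an absolute address. An assembler can typically fold such a difference to a constant only when both labels live in the same section, which is presumably why the table has to follow the function's section. The sketch below models this with plain integers as addresses. `buildRelativeTable` and `resolveEntry` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a table of (target - table) differences from link-time addresses.
std::vector<std::int32_t> buildRelativeTable(
    std::uint64_t tableAddr, const std::vector<std::uint64_t>& targets) {
  std::vector<std::int32_t> table;
  table.reserve(targets.size());
  for (std::uint64_t target : targets) {
    const std::int64_t diff = static_cast<std::int64_t>(target - tableAddr);
    table.push_back(static_cast<std::int32_t>(diff));
  }
  return table;
}

// Recover a target at run time by adding the stored difference to wherever
// the table actually ended up in memory.
std::uint64_t resolveEntry(std::uint64_t runtimeTableAddr,
                           const std::vector<std::int32_t>& table,
                           std::size_t index) {
  assert(index < table.size());
  return runtimeTableAddr +
         static_cast<std::uint64_t>(static_cast<std::int64_t>(table[index]));
}
```

Because the table and its targets move together when the image is loaded at a different base, the stored differences stay valid without any load-time fixups.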

Verify that jump tables are emitted to the same section as the function is, · bfe59e87
Chris Lattner authored Oct 05, 2006
```
when codegen'ing in pic mode.  This fixes a miscompilation of a switch stmt
in a template, as the template goes to a non-.text section.

llvm-svn: 30743
```
bfe59e87

Pass the MachineFunction into EmitJumpTableInfo. · a6a570e0
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30742
```
a6a570e0
implement and use getSectionForFunction · 38e2c8a0
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30741
```
38e2c8a0
Use getSectionForFunction. · 44316991
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30740
```
44316991
Use getSectionForFunction · d4d255a4
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30739
```
d4d255a4
use getSectionForFunction to decide which section to emit code into · c8c78982
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30738
```
c8c78982
Implement getSectionForFunction, use it when printing function body. · b82247b1
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30737
```
b82247b1
move getSectionForFunction to AsmPrinter · dc822411
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30736
```
dc822411
Move getSectionForFunction to AsmPrinter, change it to return a string. · 028d663e
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30735
```
028d663e
move getSectionForFunction to AsmPrinter. · 0dca9271
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30734
```
0dca9271
implement DarwinTargetAsmInfo::getSectionForFunction, use it when outputting · 0d236450
Chris Lattner authored Oct 05, 2006
```
function bodies

llvm-svn: 30733
```
0d236450
Give TargetAsmInfo a virtual dtor, add a new getSectionForFunction method. · afe6d7a1
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30732
```
afe6d7a1
emit jump table before debug info · 41e22a54
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30731
```
41e22a54
Always emit the jump table after the function so it's part of the same 'atom' · aad26a19
Chris Lattner authored Oct 05, 2006
```
as the function body.

llvm-svn: 30730
```
aad26a19
getFilename/getDirectory shouldn't abort if the global has no init. This · 19721e87
Chris Lattner authored Oct 04, 2006
```
can happen on bugpoint reduced testcases f.e..

llvm-svn: 30729
```
19721e87
Fix some typos that can cause a flag value to have more than one use. · f80dfa83
Evan Cheng authored Oct 04, 2006
```
llvm-svn: 30727
```
f80dfa83
Fix a static dtor issue · c374ec43
Chris Lattner authored Oct 04, 2006
```
llvm-svn: 30726
```
c374ec43

Oct 04, 2006
- Fix more static dtor issues · 8111c592
  Chris Lattner authored Oct 04, 2006
```
llvm-svn: 30725
```
  8111c592
- Fix some more static dtor issues. · 538c6eb0
  Chris Lattner authored Oct 04, 2006
```
llvm-svn: 30724
```
  538c6eb0
- Added option -disable-x86-shuffle-opti to disable X86 specific vector shuffle optimizations. · 8c5766ef
  Evan Cheng authored Oct 04, 2006
```
llvm-svn: 30723
```
  8c5766ef
- Formating. · 412aaabc
  Evan Cheng authored Oct 04, 2006
```
llvm-svn: 30722
```
  412aaabc
- More extensive alias analysis. · 708d0db2
  Jim Laskey authored Oct 04, 2006
```
llvm-svn: 30721
```
  708d0db2
- More long term solution · 0d5a0eae
  Jim Laskey authored Oct 04, 2006
```
llvm-svn: 30720
```
  0d5a0eae
- Pattern match min/max nodes when we have sse. This implements · 9259b1ef
  Chris Lattner authored Oct 04, 2006
```
CodeGen/X86/scalar_sse_minmax.ll

llvm-svn: 30719
```
  9259b1ef
- pattern match min/max nodes · 1e21d3a5
  Chris Lattner authored Oct 04, 2006
```
llvm-svn: 30718
```
  1e21d3a5