Commits · f13a7b376cf79b84076160705ae9f8d836a0e647 · Roger Ferrer / llvm-epi-0.8

Oct 07, 2006
- eliminate redundancy · f13a7b37
  Chris Lattner authored Oct 07, 2006
```
llvm-svn: 30783
```
  f13a7b37
- Fix a bug legalizing zero-extending i64 loads into 32-bit loads. The bottom · f9f90bc2
  Chris Lattner authored Oct 07, 2006
```
part was always forced to be sextload, even when we needed a zextload.

llvm-svn: 30782
```
  f9f90bc2
- Set the jt section · dc3064e2
  Chris Lattner authored Oct 06, 2006
```
llvm-svn: 30781
```
  dc3064e2
- initialize ivar · a389a612
  Chris Lattner authored Oct 06, 2006
```
llvm-svn: 30780
```
  a389a612
- If a target uses a GOT, put it in the jt data section, not the text · 9043823c
  Chris Lattner authored Oct 06, 2006
```
section.  This will fix alpha when Andrew implements
AlphaTargetMachine::getTargetLowering().

llvm-svn: 30779
```
  9043823c
- Alpha uses a got · 21fa7698
  Chris Lattner authored Oct 06, 2006
```
llvm-svn: 30778
```
  21fa7698
- Add support for targets to declare that they use a GOT · 1c52b57e
  Chris Lattner authored Oct 06, 2006
```
llvm-svn: 30777
```
  1c52b57e
- jump tables handle pic · 9d75324d
  Chris Lattner authored Oct 06, 2006
```
llvm-svn: 30776
```
  9d75324d
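
As an illustration of the zero-extending load fix (f9f90bc2) above, the sketch below models an extending load being expanded into two 32-bit halves. It is a small C++ model, not the legalizer's code: `Halves`, `ExtKind`, `signExtend16`, `expandExtLoad16`, `buggyExpandZExtLoad16`, and `join` are hypothetical names. The 16-bit memory width and the zero high half in the buggy variant are assumptions made for the example.

```cpp
#include <cstdint>

// Two 32-bit registers standing in for one i64 value on a 32-bit target.
struct Halves {
  std::uint32_t lo;
  std::uint32_t hi;
};

enum class ExtKind { Zero, Sign };

// Portable sign extension of a 16-bit pattern to 32 bits.
inline std::uint32_t signExtend16(std::uint16_t v) {
  std::uint32_t wide = v;
  return (wide & 0x8000u) ? (wide | 0xFFFF0000u) : wide;
}

// Hypothetical model: expand an extending 16-bit load into two halves.
inline Halves expandExtLoad16(std::uint16_t mem, ExtKind kind) {
  if (kind == ExtKind::Zero) {
    // zextload: the low half is zero-extended and the high half is zero.
    return Halves{mem, 0u};
  }
  // sextload: the low half is sign-extended and the high half repeats its sign.
  std::uint32_t lo = signExtend16(mem);
  std::uint32_t hi = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return Halves{lo, hi};
}

// Assumed shape of the bug: the low half is sign-extended even though a
// zero extension was requested.
inline Halves buggyExpandZExtLoad16(std::uint16_t mem) {
  return Halves{signExtend16(mem), 0u};
}

// Reassemble the 64-bit value from its halves.
inline std::uint64_t join(Halves h) {
  return (static_cast<std::uint64_t>(h.hi) << 32) | h.lo;
}
```

With the stored value `0x8000`, the zero-extending expansion yields `0x0000000000008000`, while the buggy variant sets bits 16 through 31 of the low half. Values with a clear top bit, such as `0x1234`, come out the same either way, so a bug of this kind would only show up for loads whose sign bit is set.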
Oct 06, 2006
- print labels even if a MBB doesn't have a corresponding LLVM BB, just don't · 4e107aa0
  Chris Lattner authored Oct 06, 2006
```
print the LLVM BB label.

llvm-svn: 30775
```
  4e107aa0
- add optional input flag to FMRRD · aa2a12f1
  Rafael Espindola authored Oct 06, 2006
```
llvm-svn: 30774
```
  aa2a12f1
- add support for calling functions that return double · 671f2528
  Rafael Espindola authored Oct 06, 2006
```
llvm-svn: 30771
```
  671f2528
- 80 col violation. · 5fe96802
  Evan Cheng authored Oct 06, 2006
```
llvm-svn: 30770
```
  5fe96802
- ugly codegen · 2421a179
  Chris Lattner authored Oct 06, 2006
```
llvm-svn: 30769
```
  2421a179
- Fix a miscompilation of: · f5839a08
  Chris Lattner authored Oct 06, 2006
```
long long foo(long long X) {
  return (long long)(signed char)(int)X;
}

Instead of:

_foo:
        extsb r2, r4
        srawi r3, r4, 31
        mr r4, r2
        blr

we now produce:

_foo:
        extsb r4, r4
        srawi r3, r4, 31
        blr

This fixes a miscompilation in ConstantFolding.cpp.

llvm-svn: 30768
```
  f5839a08
- fix some bugs affecting functions with no arguments · ef01656e
  Rafael Espindola authored Oct 06, 2006
```
llvm-svn: 30767
```
  ef01656e
- fix the stack alignment · 6024ea83
  Rafael Espindola authored Oct 06, 2006
```
llvm-svn: 30766
```
  6024ea83
- add support for calling functions that have double arguments · 5fe7909e
  Rafael Espindola authored Oct 06, 2006
```
llvm-svn: 30765
```
  5fe7909e
- Still need to support -mcpu=<> or cross compilation will fail. Doh. · ff1beda5
  Evan Cheng authored Oct 06, 2006
```
llvm-svn: 30764
```
  ff1beda5
- Do away with CPU feature list. Just use CPUID to detect MMX, SSE, SSE2, SSE3, and 64-bit support. · 9274f72e
  Evan Cheng authored Oct 06, 2006
```
llvm-svn: 30763
```
  9274f72e
- It appears the inline asm in GetCpuIDAndInfo() may clobber some registers if... · 4c1a804a
  Evan Cheng authored Oct 06, 2006
```
It appears the inline asm in GetCpuIDAndInfo() may clobber some registers if it isn't inlined (at < -O3). Force it to be inlined.

llvm-svn: 30762
```
  4c1a804a
- add an accessor · 469ea0c9
  Chris Lattner authored Oct 06, 2006
```
llvm-svn: 30761
```
  469ea0c9
- MachineBasicBlock::splice was incorrectly updating parent pointers on · 16ae43e9
  Chris Lattner authored Oct 06, 2006
```
instructions.

llvm-svn: 30760
```
  16ae43e9
- Make use of getStore(). · df9ac47e
  Evan Cheng authored Oct 05, 2006
```
llvm-svn: 30759
```
  df9ac47e
- Add getStore() helper function to create ISD::STORE nodes. · af309d29
  Evan Cheng authored Oct 05, 2006
```
llvm-svn: 30758
```
  af309d29
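
The miscompilation fixed in f5839a08 above can be reproduced with a small C++ model of the two PowerPC sequences from that commit message. This is a sketch for illustration only: the helpers `extsb`, `srawi31`, `join`, `oldSequence`, and `newSequence` are hypothetical names, and the model assumes the 32-bit convention in which the low word of `X` arrives in `r4` and the result is returned in `r3` (high word) and `r4` (low word).

```cpp
#include <cstdint>

// Reference semantics of the C function from the commit message.
inline std::int64_t foo(std::int64_t x) {
  return static_cast<std::int64_t>(
      static_cast<signed char>(static_cast<int>(x)));
}

// extsb: sign-extend the low byte of a 32-bit register.
inline std::uint32_t extsb(std::uint32_t r) {
  std::uint32_t b = r & 0xFFu;
  return (b & 0x80u) ? (b | 0xFFFFFF00u) : b;
}

// srawi rD, rS, 31: every result bit is a copy of the sign bit of rS.
inline std::uint32_t srawi31(std::uint32_t r) {
  return (r & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

// Combine r3 (high word) and r4 (low word) into the returned long long.
inline std::int64_t join(std::uint32_t r3, std::uint32_t r4) {
  return static_cast<std::int64_t>(
      (static_cast<std::uint64_t>(r3) << 32) | r4);
}

// Model of the old sequence: the shift reads r4 before it is sign-extended.
inline std::int64_t oldSequence(std::int64_t x) {
  std::uint32_t r4 = static_cast<std::uint32_t>(x);  // low word of X
  std::uint32_t r2 = extsb(r4);                      // extsb r2, r4
  std::uint32_t r3 = srawi31(r4);                    // srawi r3, r4, 31
  r4 = r2;                                           // mr r4, r2
  return join(r3, r4);
}

// Model of the new sequence: the shift reads the sign-extended byte.
inline std::int64_t newSequence(std::int64_t x) {
  std::uint32_t r4 = static_cast<std::uint32_t>(x);  // low word of X
  r4 = extsb(r4);                                    // extsb r4, r4
  std::uint32_t r3 = srawi31(r4);                    // srawi r3, r4, 31
  return join(r3, r4);
}
```

For `X = 0x80`, `foo` and `newSequence` both give -128, while `oldSequence` leaves the high word at zero and returns `0x00000000FFFFFF80`. The two sequences only differ when bit 7 and bit 31 of the low word disagree, which is why inputs such as `0x7F` behave identically in both.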
Oct 05, 2006

Don't crash if an MBB doesn't have an LLVM BB · 8b1a59a2
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30757
```
8b1a59a2
use a const ref for passing the vector to ArgumentLayout · decfeca5
Rafael Espindola authored Oct 05, 2006
```
llvm-svn: 30756
```
decfeca5
implement an ArgumentLayout class to factor code common to LowerFORMAL_ARGUMENTS and LowerCALL · e04df41c
Rafael Espindola authored Oct 05, 2006
```
implement FMDRR
add support for f64 function arguments

llvm-svn: 30754
```
e04df41c
Alias analysis code clean ups. · 6549d22e
Jim Laskey authored Oct 05, 2006
```
llvm-svn: 30753
```
6549d22e

add a new SimplifyDemandedVectorElts method, which works similarly to · 2deeaeac
Chris Lattner authored Oct 05, 2006
```
SimplifyDemandedBits.  The idea is that some operations can be simplified if
not all of the computed elements are needed.  Some targets (like x86) have a
large number of intrinsics that operate on a single element, but pass other
elts through unmodified.  If those other elements are not needed, the
intrinsics can be simplified to scalar operations, and insertelement ops can
be removed.

This turns (f.e.):

ushort %Convert_sse(float %f) {
        %tmp = insertelement <4 x float> undef, float %f, uint 0                ; <<4 x float>> [#uses=1]
        %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1             ; <<4 x float>> [#uses=1]
        %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2           ; <<4 x float>> [#uses=1]
        %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3           ; <<4 x float>> [#uses=1]
        %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer )          ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

into:

ushort %Convert_sse(float %f) {
entry:
        %tmp28 = sub float %f, 1.000000e+00             ; <float> [#uses=1]
        %tmp37 = mul float %tmp28, 5.000000e-01         ; <float> [#uses=1]
        %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0         ; <<4 x float>> [#uses=1]
        %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > )           ; <<4 x float>> [#uses=1]
        %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > )            ; <<4 x float>> [#uses=1]
        %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
        %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
        ret ushort %tmp69
}

which improves codegen from:

_Convert_sse:
        movss LCPI1_0, %xmm0
        movss 4(%esp), %xmm1
        subss %xmm0, %xmm1
        movss LCPI1_1, %xmm0
        mulss %xmm0, %xmm1
        movss LCPI1_2, %xmm0
        minss %xmm0, %xmm1
        xorps %xmm0, %xmm0
        maxss %xmm0, %xmm1
        cvttss2si %xmm1, %eax
        andl $65535, %eax
        ret

to:

_Convert_sse:
        movss 4(%esp), %xmm0
        subss LCPI1_0, %xmm0
        mulss LCPI1_1, %xmm0
        movss LCPI1_2, %xmm1
        minss %xmm1, %xmm0
        xorps %xmm1, %xmm1
        maxss %xmm1, %xmm0
        cvttss2si %xmm0, %eax
        andl $65535, %eax
        ret

This is just a first step, it can be extended in many ways.  Testcase here:
Transforms/InstCombine/vec_demanded_elts.ll

llvm-svn: 30752
```
2deeaeac
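
The demanded-elements idea described in 2deeaeac can be sketched as a backward propagation of a per-lane mask. The C++ below is a simplified model under stated assumptions, not the SimplifyDemandedVectorElts implementation: `LaneMask`, `OperandDemand`, `InsertDemand`, `demandForScalarOp`, and `demandForInsert` are hypothetical names, and only two operation shapes from the commit message are covered (single-element intrinsics and insertelement).

```cpp
// Bit i set means lane i of a 4-lane vector is needed by some user.
using LaneMask = unsigned;

// Which lanes of each operand a binary operation needs.
struct OperandDemand {
  LaneMask lhs;
  LaneMask rhs;
};

// Model of a single-element op such as sub.ss or min.ss: lane 0 of the
// result is computed from lane 0 of both operands, and lanes 1..3 are
// passed through from the first operand.
inline OperandDemand demandForScalarOp(LaneMask demanded) {
  // The second operand only contributes to lane 0 of the result.
  LaneMask rhs = demanded & 1u;
  // The first operand supplies lane 0 (as an input) and the pass-through lanes.
  return OperandDemand{demanded, rhs};
}

// What an insertelement at a given lane index needs from its inputs.
struct InsertDemand {
  LaneMask vec;       // lanes still needed from the vector operand
  bool scalarNeeded;  // false means the insert can be dropped
};

inline InsertDemand demandForInsert(LaneMask demanded, unsigned index) {
  LaneMask bit = 1u << index;
  // The inserted lane hides the vector operand's value in that lane.
  return InsertDemand{demanded & ~bit, (demanded & bit) != 0u};
}
```

In the `Convert_sse` example the final `cvttss2si` reads only lane 0, so the starting mask is `0x1`. Propagating it through `demandForScalarOp` leaves only lane 0 demanded in each constant operand, which matches the `undef` lanes in the simplified IR. `demandForInsert(0x1, 3)` reports that the insert into lane 3 is not needed, which is the condition under which the commit message says insertelement ops can be removed.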

new testcase · 3d5e9818
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30751
```
3d5e9818
Add insertelement/extractelement helper ctors. · 65511ff6
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30750
```
65511ff6
Lower some min/max idioms to minss/maxss when unsafe fp math is enabled. · f2ef2435
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30748
```
f2ef2435
Check that jump tables wind up in the rodata section · 16b8f958
Andrew Lenharth authored Oct 05, 2006
```
llvm-svn: 30747
```
16b8f958
remove JumpTableTextSection · 40a95dd3
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30746
```
40a95dd3
Don't bother setting JumpTableTextSection, it is about to disappear · 8cfd10ef
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30745
```
8cfd10ef
Emit pic jumptables to the same section that the function is emitted to, · 66c1625a
Chris Lattner authored Oct 05, 2006
```
allowing label differences to work.  This fixes CodeGen/X86/pic_jumptable.ll

llvm-svn: 30744
```
66c1625a

Verify that jump tables are emitted to the same section as the function is, · bfe59e87
Chris Lattner authored Oct 05, 2006
```
when codegen'ing in pic mode.  This fixes a miscompilation of a switch stmt
in a template, as the template goes to a non-.text section.

llvm-svn: 30743
```
bfe59e87

Pass the MachineFunction into EmitJumpTableInfo. · a6a570e0
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30742
```
a6a570e0
implement and use getSectionForFunction · 38e2c8a0
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30741
```
38e2c8a0
Use getSectionForFunction. · 44316991
Chris Lattner authored Oct 05, 2006
```
llvm-svn: 30740
```
44316991