  1. Oct 07, 2006
  2. Oct 06, 2006
  3. Oct 05, 2006
    • Don't crash if an MBB doesn't have an LLVM BB · 8b1a59a2
      Chris Lattner authored
      llvm-svn: 30757
    • use a const ref for passing the vector to ArgumentLayout · decfeca5
      Rafael Espindola authored
      llvm-svn: 30756
    • implement an ArgumentLayout class to factor code common to LowerFORMAL_ARGUMENTS and LowerCALL · e04df41c
      Rafael Espindola authored
      implement FMDRR
      add support for f64 function arguments
      
      llvm-svn: 30754
    • Alias analysis code clean ups. · 6549d22e
      Jim Laskey authored
      llvm-svn: 30753
    • add a new SimplifyDemandedVectorElts method, which works similarly to · 2deeaeac
      Chris Lattner authored
      SimplifyDemandedBits.  The idea is that some operations can be simplified if
      not all of the computed elements are needed.  Some targets (like x86) have a
      large number of intrinsics that operate on a single element, but pass other
      elts through unmodified.  If those other elements are not needed, the
      intrinsics can be simplified to scalar operations, and insertelement ops can
      be removed.
      
      This turns (for example):
      
      ushort %Convert_sse(float %f) {
              %tmp = insertelement <4 x float> undef, float %f, uint 0                ; <<4 x float>> [#uses=1]
              %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, uint 1             ; <<4 x float>> [#uses=1]
              %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, uint 2           ; <<4 x float>> [#uses=1]
              %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, uint 3           ; <<4 x float>> [#uses=1]
              %tmp28 = tail call <4 x float> %llvm.x86.sse.sub.ss( <4 x float> %tmp12, <4 x float> < float 1.000000e+00, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
              %tmp37 = tail call <4 x float> %llvm.x86.sse.mul.ss( <4 x float> %tmp28, <4 x float> < float 5.000000e-01, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
              %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp37, <4 x float> < float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00 > )               ; <<4 x float>> [#uses=1]
              %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> zeroinitializer )          ; <<4 x float>> [#uses=1]
              %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
              %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
              ret ushort %tmp69
      }
      
      into:
      
      ushort %Convert_sse(float %f) {
      entry:
              %tmp28 = sub float %f, 1.000000e+00             ; <float> [#uses=1]
              %tmp37 = mul float %tmp28, 5.000000e-01         ; <float> [#uses=1]
              %tmp375 = insertelement <4 x float> undef, float %tmp37, uint 0         ; <<4 x float>> [#uses=1]
              %tmp48 = tail call <4 x float> %llvm.x86.sse.min.ss( <4 x float> %tmp375, <4 x float> < float 6.553500e+04, float undef, float undef, float undef > )           ; <<4 x float>> [#uses=1]
              %tmp59 = tail call <4 x float> %llvm.x86.sse.max.ss( <4 x float> %tmp48, <4 x float> < float 0.000000e+00, float undef, float undef, float undef > )            ; <<4 x float>> [#uses=1]
              %tmp = tail call int %llvm.x86.sse.cvttss2si( <4 x float> %tmp59 )              ; <int> [#uses=1]
              %tmp69 = cast int %tmp to ushort                ; <ushort> [#uses=1]
              ret ushort %tmp69
      }
      
      which improves codegen from:
      
      _Convert_sse:
              movss LCPI1_0, %xmm0
              movss 4(%esp), %xmm1
              subss %xmm0, %xmm1
              movss LCPI1_1, %xmm0
              mulss %xmm0, %xmm1
              movss LCPI1_2, %xmm0
              minss %xmm0, %xmm1
              xorps %xmm0, %xmm0
              maxss %xmm0, %xmm1
              cvttss2si %xmm1, %eax
              andl $65535, %eax
              ret
      
      to:
      
      _Convert_sse:
              movss 4(%esp), %xmm0
              subss LCPI1_0, %xmm0
              mulss LCPI1_1, %xmm0
              movss LCPI1_2, %xmm1
              minss %xmm1, %xmm0
              xorps %xmm1, %xmm1
              maxss %xmm1, %xmm0
              cvttss2si %xmm0, %eax
              andl $65535, %eax
              ret
      
      
      This is just a first step; it can be extended in many ways.  Testcase here:
      Transforms/InstCombine/vec_demanded_elts.ll
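
      The demanded-elements idea described in this message can be illustrated
      with a small sketch. This is a toy model in C++, not the InstCombine code
      from this commit: `Node`, `countLiveInserts`, and `demoLiveInserts` are
      hypothetical names, a 32-bit mask stands in for the set of demanded lanes,
      and `LowOp` models only the first operand of a scalar intrinsic. It counts
      how many insertelement-style operations still matter once only some lanes
      of the result are read.

```cpp
#include <cassert>
#include <cstdint>

// Toy vector expression node (hypothetical; not an LLVM class).
struct Node {
  enum Kind { Undef, Insert, LowOp } kind;
  int lane;         // Insert: the lane that is overwritten
  const Node *vec;  // Insert/LowOp: the vector operand passed through
};

// Counts the Insert nodes whose written lane is still read.
// Bit i of `demanded` means lane i of the result is needed.
int countLiveInserts(const Node *n, uint32_t demanded) {
  if (n == nullptr || demanded == 0)
    return 0;
  switch (n->kind) {
  case Node::Undef:
    return 0;
  case Node::Insert: {
    assert(n->lane >= 0 && n->lane < 32);
    uint32_t bit = 1u << n->lane;
    if ((demanded & bit) == 0)
      return countLiveInserts(n->vec, demanded);  // dead insert: skip it
    // Live insert: it overwrites this lane, so the operand's copy of the
    // lane is no longer demanded.
    return 1 + countLiveInserts(n->vec, demanded & ~bit);
  }
  case Node::LowOp:
    // Lane 0 is computed from lane 0 and the other lanes pass through,
    // so the demanded set is handed to the operand unchanged.
    return countLiveInserts(n->vec, demanded);
  }
  return 0;
}

// Builds four inserts (lanes 0..3) feeding one LowOp, loosely mirroring
// the shape of the IR example in this message.
int demoLiveInserts(uint32_t demanded) {
  Node undef{Node::Undef, 0, nullptr};
  Node i0{Node::Insert, 0, &undef};
  Node i1{Node::Insert, 1, &i0};
  Node i2{Node::Insert, 2, &i1};
  Node i3{Node::Insert, 3, &i2};
  Node op{Node::LowOp, 0, &i3};
  return countLiveInserts(&op, demanded);
}
```

      With only lane 0 demanded (mask 0x1), as when the result feeds
      cvttss2si, `demoLiveInserts` reports one live insert; with all four lanes
      demanded (mask 0xF) it reports four.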
      
      llvm-svn: 30752
    • new testcase · 3d5e9818
      Chris Lattner authored
      llvm-svn: 30751
    • Add insertelement/extractelement helper ctors. · 65511ff6
      Chris Lattner authored
      llvm-svn: 30750
    • Chris Lattner's avatar
      f2ef2435
    • Check that jump tables wind up in the rodata section · 16b8f958
      Andrew Lenharth authored
      llvm-svn: 30747
    • remove JumpTableTextSection · 40a95dd3
      Chris Lattner authored
      llvm-svn: 30746
    • Chris Lattner's avatar
      8cfd10ef
    • Emit pic jumptables to the same section that the function is emitted to, · 66c1625a
      Chris Lattner authored
      allowing label differences to work.  This fixes CodeGen/X86/pic_jumptable.ll
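
      Why the section matters for label differences can be sketched with a toy
      model. The C++ below is an illustration under a simplifying assumption
      (an assembler folds `a - b` to a constant only when both labels are in
      the same section); `Label` and `labelDifference` are hypothetical names,
      and none of this is the AsmPrinter code touched by the commit.

```cpp
#include <cassert>
#include <string>

// A label modeled as a byte offset from the start of its section.
struct Label {
  std::string section;
  long offset;
};

// Computes a - b into `out` when the difference is a plain constant,
// which in this simplified model requires both labels to share a section.
// Returns false (leaving `out` untouched) otherwise.
bool labelDifference(const Label &a, const Label &b, long &out) {
  assert(a.offset >= 0 && b.offset >= 0);
  if (a.section != b.section)
    return false;
  out = a.offset - b.offset;
  return true;
}
```

      In this model, a table entry that refers to a block in the function's own
      section resolves to a fixed number, while one that crosses sections does
      not, which is the situation the commit avoids by keeping the table next
      to the function.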
      
      llvm-svn: 30744
    • Verify that jump tables are emitted to the same section as the function is, · bfe59e87
      Chris Lattner authored
      when codegen'ing in pic mode.  This fixes a miscompilation of a switch stmt
      in a template, as the template goes to a non-.text section.
      
      llvm-svn: 30743
    • Pass the MachineFunction into EmitJumpTableInfo. · a6a570e0
      Chris Lattner authored
      llvm-svn: 30742
    • implement and use getSectionForFunction · 38e2c8a0
      Chris Lattner authored
      llvm-svn: 30741
    • Use getSectionForFunction. · 44316991
      Chris Lattner authored
      llvm-svn: 30740