Apr 17, 2006
  • Codegen insertelement with constant insertion points as scalar_to_vector · 326870b4
      Chris Lattner authored
      and a shuffle.  For this:
      
      void %test2(<4 x float>* %F, float %f) {
              %tmp = load <4 x float>* %F             ; <<4 x float>> [#uses=2]
              %tmp3 = add <4 x float> %tmp, %tmp              ; <<4 x float>> [#uses=1]
              %tmp2 = insertelement <4 x float> %tmp3, float %f, uint 2               ; <<4 x float>> [#uses=2]
              %tmp6 = add <4 x float> %tmp2, %tmp2            ; <<4 x float>> [#uses=1]
              store <4 x float> %tmp6, <4 x float>* %F
              ret void
      }
      
      we now get this on X86 (which will get better):
      
      _test2:
              movl 4(%esp), %eax
              movaps (%eax), %xmm0
              addps %xmm0, %xmm0
              movaps %xmm0, %xmm1
              shufps $3, %xmm1, %xmm1
              movaps %xmm0, %xmm2
              shufps $1, %xmm2, %xmm2
              unpcklps %xmm1, %xmm2
              movss 8(%esp), %xmm1
              unpcklps %xmm1, %xmm0
              unpcklps %xmm2, %xmm0
              addps %xmm0, %xmm0
              movaps %xmm0, (%eax)
              ret
      
      instead of:
      
      _test2:
              subl $28, %esp
              movl 32(%esp), %eax
              movaps (%eax), %xmm0
              addps %xmm0, %xmm0
              movaps %xmm0, (%esp)
              movss 36(%esp), %xmm0
              movss %xmm0, 8(%esp)
              movaps (%esp), %xmm0
              addps %xmm0, %xmm0
              movaps %xmm0, (%eax)
              addl $28, %esp
              ret
      
      llvm-svn: 27765
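
The win is visible with SSE intrinsics: a minimal sketch, assuming a hypothetical helper named insert_lane2 (not code from this commit), of how an insert at constant index 2 becomes a scalar_to_vector (movss) plus shuffles instead of a round trip through the stack:

        #include <xmmintrin.h>

        /* Illustrative only: build <v0, v1, f, v3> with movss + shufps. */
        static __m128 insert_lane2(__m128 v, float f) {
            __m128 s  = _mm_set_ss(f);                                 /* scalar_to_vector: <f, 0, 0, 0> */
            __m128 hi = _mm_shuffle_ps(s, v, _MM_SHUFFLE(3, 3, 0, 0)); /* <f, f, v3, v3> */
            return _mm_shuffle_ps(v, hi, _MM_SHUFFLE(2, 0, 1, 0));     /* <v0, v1, f, v3> */
        }

Everything stays in registers; the old lowering spilled the vector, overwrote lane 2 in memory with movss, and reloaded the whole thing.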
Apr 02, 2006
  • This should be a win on every arch · 015eaf5f
      Andrew Lenharth authored
      llvm-svn: 27364
  • Add a little dag combine to compile this: · 4993249a
      Chris Lattner authored
      int %AreSecondAndThirdElementsBothNegative(<4 x float>* %in) {
      entry:
              %tmp1 = load <4 x float>* %in           ; <<4 x float>> [#uses=1]
              %tmp = tail call int %llvm.ppc.altivec.vcmpgefp.p( int 1, <4 x float> < float 0x7FF8000000000000, float 0.000000e+00, float 0.000000e+00, float 0x7FF8000000000000 >, <4 x float> %tmp1 )           ; <int> [#uses=1]
        %tmp2 = seteq int %tmp, 0               ; <bool> [#uses=1]
        %tmp3 = cast bool %tmp2 to int          ; <int> [#uses=1]
              ret int %tmp3
      }
      
      into this:
      
      _AreSecondAndThirdElementsBothNegative:
              mfspr r2, 256
              oris r4, r2, 49152
              mtspr 256, r4
              li r4, lo16(LCPI1_0)
              lis r5, ha16(LCPI1_0)
              lvx v0, 0, r3
              lvx v1, r5, r4
              vcmpgefp. v0, v1, v0
              mfcr r3, 2
              rlwinm r3, r3, 27, 31, 31
              mtspr 256, r2
              blr
      
      instead of this:
      
      _AreSecondAndThirdElementsBothNegative:
              mfspr r2, 256
              oris r4, r2, 49152
              mtspr 256, r4
              li r4, lo16(LCPI1_0)
              lis r5, ha16(LCPI1_0)
              lvx v0, 0, r3
              lvx v1, r5, r4
              vcmpgefp. v0, v1, v0
              mfcr r3, 2
              rlwinm r3, r3, 27, 31, 31
              xori r3, r3, 1
              cntlzw r3, r3
              srwi r3, r3, 5
              mtspr 256, r2
              blr
      
      llvm-svn: 27356
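
In the slower listing, xori inverts the CR6 bit and the cntlzw/srwi pair is PPC's branchless "== 0" test, so the three instructions together map a 0-or-1 value back to itself; the combine exploits this by folding the seteq into the intrinsic's CR6 predicate. A small C model of that identity (is_zero is an illustrative name, not LLVM code):

        #include <stdio.h>

        /* clz(x) >> 5 is 1 iff x == 0. PPC's cntlzw returns 32 for 0,
           which __builtin_clz leaves undefined, so special-case it. */
        static unsigned is_zero(unsigned x) {
            return (x ? (unsigned)__builtin_clz(x) : 32u) >> 5;
        }

        int main(void) {
            for (unsigned bit = 0; bit <= 1; ++bit)  /* a CR6 bit is 0 or 1 */
                printf("bit=%u  xori+cntlzw+srwi=%u\n", bit, is_zero(bit ^ 1));
            return 0;
        }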
  • Implement the Expand action for binary vector operations to break the binop · 87f08094
      Chris Lattner authored
      into elements and operate on each piece.  This allows generic vector integer
      multiplies to work on PPC, though the generated code is horrible.
      
      llvm-svn: 27347
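
A sketch of the Expand action in plain C, with the legalizer's extract/op/rebuild steps written out (types and names here are illustrative, not LLVM's API):

        typedef struct { int e[4]; } v4i32;   /* stand-in for <4 x int> */

        /* Expand a vector multiply the target lacks into scalar multiplies:
           extract each element, multiply, and rebuild the vector. */
        static v4i32 expand_mul(v4i32 a, v4i32 b) {
            v4i32 r;
            for (int i = 0; i < 4; ++i)
                r.e[i] = a.e[i] * b.e[i];
            return r;
        }

Four extracts, four scalar multiplies, and four inserts per binop is exactly why the generated code is horrible, but it is correct on any target that has the scalar operation.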
  • Intrinsics that just load from memory can be treated like loads: they don't · a9c59156
      Chris Lattner authored
      have to serialize against each other.  This allows us to schedule lvx's
      across each other, for example.
      
      llvm-svn: 27346
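
A C model of the scheduling freedom this buys, assuming an illustrative stand-in lvx_model rather than the real intrinsic: a pure-load intrinsic, like an ordinary load, has no side effects for another load to observe, so no chain edge is needed between them.

        typedef struct { float f[4]; } v4f32;

        static v4f32 lvx_model(const v4f32 *p) { return *p; }  /* reads memory, writes nothing */

        static v4f32 add_two(const v4f32 *a, const v4f32 *b) {
            /* The two loads below may be issued in either order or
               overlapped; neither serializes against the other. */
            v4f32 x = lvx_model(a);
            v4f32 y = lvx_model(b);
            v4f32 r;
            for (int i = 0; i < 4; ++i)
                r.f[i] = x.f[i] + y.f[i];
            return r;
        }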
  • Constant fold all of the vector binops. This allows us to compile this: · 0442a187
      Chris Lattner authored
      "vector unsigned char mergeLowHigh = (vector unsigned char)
      ( 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 );
      vector unsigned char mergeHighLow = vec_xor( mergeLowHigh, vec_splat_u8(8));"
      
      aka:
      
      void %test2(<16 x sbyte>* %P) {
        store <16 x sbyte> cast (<4 x int> xor (<4 x int> cast (<16 x ubyte> < ubyte 8, ubyte 9, ubyte 10, ubyte 11, ubyte 16, ubyte 17, ubyte 18, ubyte 19, ubyte 12, ubyte 13, ubyte 14, ubyte 15, ubyte 20, ubyte 21, ubyte 22, ubyte 23 > to <4 x int>), <4 x int> cast (<16 x sbyte> < sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8, sbyte 8 > to <4 x int>)) to <16 x sbyte>), <16 x sbyte> * %P
        ret void
      }
      
      into this:
      
      _test2:
              mfspr r2, 256
              oris r4, r2, 32768
              mtspr 256, r4
              li r4, lo16(LCPI2_0)
              lis r5, ha16(LCPI2_0)
              lvx v0, r5, r4
              stvx v0, 0, r3
              mtspr 256, r2
              blr
      
      instead of this:
      
      _test2:
              mfspr r2, 256
              oris r4, r2, 49152
              mtspr 256, r4
              li r4, lo16(LCPI2_0)
              lis r5, ha16(LCPI2_0)
              vspltisb v0, 8
              lvx v1, r5, r4
              vxor v0, v1, v0
              stvx v0, 0, r3
              mtspr 256, r2
              blr
      
      ... which occurs here:
      http://developer.apple.com/hardware/ve/calcspeed.html
      
      llvm-svn: 27343
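
The fold itself is just byte-wise arithmetic on two constant vectors. A quick C check of the table the compiler now emits straight from the constant pool (computed at run time here only to show the values):

        #include <stdint.h>
        #include <stdio.h>

        int main(void) {
            const uint8_t mergeLowHigh[16] =
                { 8, 9, 10, 11, 16, 17, 18, 19, 12, 13, 14, 15, 20, 21, 22, 23 };
            uint8_t mergeHighLow[16];
            for (int i = 0; i < 16; ++i)                 /* vec_xor(mergeLowHigh, vec_splat_u8(8)) */
                mergeHighLow[i] = mergeLowHigh[i] ^ 8;
            for (int i = 0; i < 16; ++i)
                printf("%u%s", mergeHighLow[i], i < 15 ? ", " : "\n");
            return 0;
        }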