- Apr 22, 2006

Evan Cheng authored
Don't do all the lowering stuff for 2-wide build_vectors. Also, a minor optimization for shuffles of undef. llvm-svn: 27946
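
For context, a minimal sketch (not part of the commit) of the kind of 2-wide build_vector this refers to; the function name and intrinsic choice are illustrative assumptions:

  #include <emmintrin.h>

  /* Hypothetical example: constructing a 2 x double vector from two
     scalars shows up as a 2-wide build_vector in the SelectionDAG. */
  __m128d make_pair(double a, double b) {
      return _mm_set_pd(b, a);   /* element 0 = a, element 1 = b */
  }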

Evan Cheng authored
Fix a performance regression. Use {p}shuf* when there are only two distinct elements in a build_vector. llvm-svn: 27945
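
As a hedged illustration (not from the commit) of a build_vector with only two distinct elements, where a single pshufd-style shuffle can replicate the pair; the function name is made up:

  #include <emmintrin.h>

  /* Only two distinct scalars, a and b, appear in this 4-wide vector,
     so it can be built by placing the pair once and shuffling it. */
  __m128i repeat_pair(int a, int b) {
      return _mm_set_epi32(b, a, b, a);   /* elements: a, b, a, b */
  }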

Chris Lattner authored
llvm-svn: 27943

Evan Cheng authored
movd always clears the top 96 bits, and movss does so when it is loading the value from memory. The net result is that codegen for 4-wide shuffles is much improved. It is near optimal if one or more elements are zero, e.g.

  __m128i test(int a, int b) { return _mm_set_epi32(0, 0, b, a); }

compiles to:

  _test:
        movd 8(%esp), %xmm1
        movd 4(%esp), %xmm0
        punpckldq %xmm1, %xmm0
        ret

Compare to gcc:

  _test:
        subl $12, %esp
        movd 20(%esp), %xmm0
        movd 16(%esp), %xmm1
        punpckldq %xmm0, %xmm1
        movq %xmm1, %xmm0
        movhps LC0, %xmm0
        addl $12, %esp
        ret

or icc:

  _test:
        movd 4(%esp), %xmm0      #5.10
        movd 8(%esp), %xmm3      #5.10
        xorl %eax, %eax          #5.10
        movd %eax, %xmm1         #5.10
        punpckldq %xmm1, %xmm0   #5.10
        movd %eax, %xmm2         #5.10
        punpckldq %xmm2, %xmm3   #5.10
        punpckldq %xmm3, %xmm0   #5.10
        ret                      #5.10

There is still room for improvement, for example the FP variant of the above example:

  __m128 test(float a, float b) { return _mm_set_ps(0.0, 0.0, b, a); }

  _test:
        movss 8(%esp), %xmm1
        movss 4(%esp), %xmm0
        unpcklps %xmm1, %xmm0
        xorps %xmm1, %xmm1
        movlhps %xmm1, %xmm0
        ret

The xorps and movlhps are unnecessary. This will require post-legalizer optimization to handle.
llvm-svn: 27939
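
For reference, a self-contained, compilable version of the two snippets quoted above (headers added and the functions renamed here to avoid a symbol clash; the bodies are as in the message):

  #include <xmmintrin.h>   /* SSE:  __m128,  _mm_set_ps    */
  #include <emmintrin.h>   /* SSE2: __m128i, _mm_set_epi32 */

  /* Integer variant: only the two low elements are non-zero. */
  __m128i test_int(int a, int b) {
      return _mm_set_epi32(0, 0, b, a);
  }

  /* FP variant discussed at the end of the message. */
  __m128 test_fp(float a, float b) {
      return _mm_set_ps(0.0, 0.0, b, a);
  }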

Nate Begeman authored
llvm-svn: 27938

Nate Begeman authored
llvm-svn: 27937

- Apr 21, 2006

Chris Lattner authored
llvm-svn: 27935

Chris Lattner authored
llvm-svn: 27934

Evan Cheng authored
scalar value. e.g.

  _mm_set_epi32(0, a, 0, 0);
  ==>
        movd 4(%esp), %xmm0
        pshufd $69, %xmm0, %xmm0

  _mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
  ==>
        movzbw 4(%esp), %ax
        movzwl %ax, %eax
        pxor %xmm0, %xmm0
        pinsrw $5, %eax, %xmm0

llvm-svn: 27923
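
A compilable restatement of the two intrinsic calls above (the wrapper names and header are added here and are not part of the commit):

  #include <emmintrin.h>

  /* Single non-zero scalar in a 4 x i32 vector: element 2 is a. */
  __m128i set_one_i32(int a) {
      return _mm_set_epi32(0, a, 0, 0);
  }

  /* Single non-zero scalar in a 16 x i8 vector: element 10 is a. */
  __m128i set_one_i8(char a) {
      return _mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0,
                          0, 0, 0, 0, 0, 0, 0, 0);
  }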

- Apr 20, 2006

Chris Lattner authored
llvm-svn: 27908

Chris Lattner authored
llvm-svn: 27907

Chris Lattner authored
llvm-svn: 27900

Chris Lattner authored
llvm-svn: 27895

Chris Lattner authored
llvm-svn: 27885

Evan Cheng authored
to a vector shuffle.
- VECTOR_SHUFFLE lowering change in preparation for more efficient codegen of vector shuffles with a zero (or any splat) vector.
llvm-svn: 27875
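
A hedged sketch of the kind of shuffle against a zero vector this prepares for; the intrinsic-level form and function name below are assumptions, not taken from the commit:

  #include <xmmintrin.h>

  /* Shuffle v against an all-zero vector: keep the two low floats of v
     and fill the two high lanes with zeros. */
  __m128 zero_upper_half(__m128 v) {
      return _mm_shuffle_ps(v, _mm_setzero_ps(), _MM_SHUFFLE(0, 0, 1, 0));
  }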

Chris Lattner authored
CodeGen/PowerPC/2006-04-19-vmaddfp-crash.ll llvm-svn: 27868

Evan Cheng authored
but i64 is not. If possible, change an i64 op to an f64 (e.g. load, constant) and then cast it back. llvm-svn: 27849

Evan Cheng authored
llvm-svn: 27847

Evan Cheng authored
instructions.
- Fixed a commute vector_shuffle bug.
llvm-svn: 27845

- Apr 19, 2006

Evan Cheng authored
llvm-svn: 27844

Evan Cheng authored
llvm-svn: 27843

Evan Cheng authored
- Added more movhlps and movlhps patterns. llvm-svn: 27842
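
As a hedged illustration of what movhlps and movlhps match at the intrinsic level (the wrapper names are made up for this example):

  #include <xmmintrin.h>

  /* movlhps: the low two floats of b become the high half of the result. */
  __m128 low_to_high(__m128 a, __m128 b) {
      return _mm_movelh_ps(a, b);
  }

  /* movhlps: the high two floats of b become the low half of the result. */
  __m128 high_to_low(__m128 a, __m128 b) {
      return _mm_movehl_ps(a, b);
  }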

Evan Cheng authored
llvm-svn: 27840

Evan Cheng authored
llvm-svn: 27836

Evan Cheng authored
- Increase cost (complexity) of patterns which match mov{h|l}ps ops. These are preferred over shufps in most cases. llvm-svn: 27835
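
A hedged example of a source pattern that movlps/movhps can match directly instead of a separate load plus shufps; the helper names here are illustrative only:

  #include <xmmintrin.h>

  /* movlps: replace the low two floats of v with two floats from memory. */
  __m128 load_low(__m128 v, const __m64 *p) {
      return _mm_loadl_pi(v, p);
  }

  /* movhps: replace the high two floats of v with two floats from memory. */
  __m128 load_high(__m128 v, const __m64 *p) {
      return _mm_loadh_pi(v, p);
  }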

Evan Cheng authored
llvm-svn: 27834

Chris Lattner authored
llvm-svn: 27832

Chris Lattner authored
llvm-svn: 27828

Chris Lattner authored
llvm-svn: 27827

- Apr 18, 2006

Evan Cheng authored
- PINSRWrmi encoding bug. llvm-svn: 27818
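
For context, a hedged sketch of source code that can select the register/memory/immediate (rmi) form of pinsrw; the function name is made up for illustration:

  #include <emmintrin.h>

  /* Insert a 16-bit value loaded from memory into word 3 of v. When the
     load is folded into the instruction, this uses the PINSRWrmi form. */
  __m128i insert_word3(__m128i v, const short *p) {
      return _mm_insert_epi16(v, *p, 3);
  }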

Evan Cheng authored
llvm-svn: 27817

Evan Cheng authored
llvm-svn: 27816

Evan Cheng authored
llvm-svn: 27815

Evan Cheng authored
llvm-svn: 27814

Evan Cheng authored
llvm-svn: 27813

Chris Lattner authored
llvm-svn: 27810

Chris Lattner authored
llvm-svn: 27809

Chris Lattner authored
void foo2(vector float *A, vector float *B) {
  vector float C = (vector float)vec_cmpeq(*A, *B);
  if (!vec_any_eq(*A, *B))
    *B = (vector float){0,0,0,0};
  *A = C;
}
llvm-svn: 27808

Evan Cheng authored
llvm-svn: 27807

Chris Lattner authored
llvm-svn: 27806