- Apr 22, 2006
-
-
Evan Cheng authored
Don't do all the lowering stuff for 2-wide build_vectors. Also, minor optimization for shuffle of undef. llvm-svn: 27946
-
Evan Cheng authored
Fix a performance regression. Use {p}shuf* when there are only two distinct elements in a build_vector. llvm-svn: 27945
-
Chris Lattner authored
llvm-svn: 27943
-
Chris Lattner authored
This allows Prolangs-C++/city and probably a bunch of other stuff to work well with the new front-end llvm-svn: 27941
-
Evan Cheng authored
movd always clears the top 96 bits, and movss does so when it's loading the value from memory. The net result is that codegen for 4-wide shuffles is much improved. It is near optimal if one or more elements is a zero. e.g.

    __m128i test(int a, int b) {
        return _mm_set_epi32(0, 0, b, a);
    }

compiles to:

    _test:
        movd 8(%esp), %xmm1
        movd 4(%esp), %xmm0
        punpckldq %xmm1, %xmm0
        ret

compare to gcc:

    _test:
        subl $12, %esp
        movd 20(%esp), %xmm0
        movd 16(%esp), %xmm1
        punpckldq %xmm0, %xmm1
        movq %xmm1, %xmm0
        movhps LC0, %xmm0
        addl $12, %esp
        ret

or icc:

    _test:
        movd 4(%esp), %xmm0       #5.10
        movd 8(%esp), %xmm3       #5.10
        xorl %eax, %eax           #5.10
        movd %eax, %xmm1          #5.10
        punpckldq %xmm1, %xmm0    #5.10
        movd %eax, %xmm2          #5.10
        punpckldq %xmm2, %xmm3    #5.10
        punpckldq %xmm3, %xmm0    #5.10
        ret                       #5.10

There is still room for improvement, for example the FP variant of the above example:

    __m128 test(float a, float b) {
        return _mm_set_ps(0.0, 0.0, b, a);
    }

    _test:
        movss 8(%esp), %xmm1
        movss 4(%esp), %xmm0
        unpcklps %xmm1, %xmm0
        xorps %xmm1, %xmm1
        movlhps %xmm1, %xmm0
        ret

The xorps and movlhps are unnecessary. This will require post-legalizer optimization to handle. llvm-svn: 27939
-
Nate Begeman authored
llvm-svn: 27938
-
Nate Begeman authored
llvm-svn: 27937
-
- Apr 21, 2006
-
-
Chris Lattner authored
llvm-svn: 27935
-
Chris Lattner authored
llvm-svn: 27934
-
Chris Lattner authored
miscompares). Switch RISC targets to use the list-td scheduler, which isn't. llvm-svn: 27933
-
Chris Lattner authored
llvm-svn: 27931
-
Chris Lattner authored
llvm-svn: 27930
-
Evan Cheng authored
scalar value. e.g.

    _mm_set_epi32(0, a, 0, 0);
    ==>
        movd 4(%esp), %xmm0
        pshufd $69, %xmm0, %xmm0

    _mm_set_epi8(0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
    ==>
        movzbw 4(%esp), %ax
        movzwl %ax, %eax
        pxor %xmm0, %xmm0
        pinsrw $5, %eax, %xmm0

llvm-svn: 27923
-
Chris Lattner authored
llvm-gcc4 bootstrap. Whenever a node is deleted by the dag combiner, it *must* be returned by the visit function, or the dag combiner will not know that the node has been processed (and will, e.g., send it to the target dag combine xforms). llvm-svn: 27922
-
- Apr 20, 2006
-
-
Chris Lattner authored
llvm-svn: 27912
-
Chris Lattner authored
llvm-svn: 27908
-
Chris Lattner authored
llvm-svn: 27907
-
Chris Lattner authored
llvm-svn: 27900
-
Chris Lattner authored
llvm-svn: 27899
-
Chris Lattner authored
llvm-svn: 27895
-
Chris Lattner authored
llvm-svn: 27893
-
Chris Lattner authored
llvm-svn: 27885
-
Andrew Lenharth authored
llvm-svn: 27881
-
Andrew Lenharth authored
can be converted to losslessly, we can continue the conversion to a direct call. llvm-svn: 27880
-
Evan Cheng authored
to a vector shuffle.
- VECTOR_SHUFFLE lowering change in preparation for more efficient codegen of vector shuffle with zero (or any splat) vector.
llvm-svn: 27875
-
Evan Cheng authored
The DAG combiner can turn a VAND V, <-1, 0, -1, -1> (i.e. clearing vector elements) into a vector shuffle with a zero vector. It only does so when TLI tells it the xform is profitable. llvm-svn: 27874
-
Chris Lattner authored
CodeGen/PowerPC/2006-04-19-vmaddfp-crash.ll llvm-svn: 27868
-
Chris Lattner authored
llvm-svn: 27863
-
Evan Cheng authored
but i64 is not. If possible, change an i64 op to an f64 (e.g. load, constant) and then cast it back. llvm-svn: 27849
-
Evan Cheng authored
llvm-svn: 27847
-
Chris Lattner authored
llvm-svn: 27846
-
Evan Cheng authored
instructions.
- Fixed a commute vector_shuffle bug.
llvm-svn: 27845
-
- Apr 19, 2006
-
-
Evan Cheng authored
llvm-svn: 27844
-
Evan Cheng authored
llvm-svn: 27843
-
Evan Cheng authored
- Added more movhlps and movlhps patterns. llvm-svn: 27842
-
Evan Cheng authored
llvm-svn: 27840
-
Evan Cheng authored
llvm-svn: 27836
-
Evan Cheng authored
- Increase cost (complexity) of patterns which match mov{h|l}ps ops. These are preferred over shufps in most cases. llvm-svn: 27835
-
Evan Cheng authored
llvm-svn: 27834
-
Chris Lattner authored
llvm-svn: 27832
-