- Apr 18, 2006
- Chris Lattner authored
  even/odd halves. Thanks to Nate telling me what's what.
  llvm-svn: 27793
- Chris Lattner authored
  vmuloub v5, v3, v2
  vmuleub v2, v3, v2
  vperm v2, v2, v5, v4
  This implements CodeGen/PowerPC/vec_mul.ll. With this, v16i8 multiplies are 6.79x faster than before. Overall, UnitTests/Vector/multiplies.c is now 2.45x faster with LLVM than with GCC. Remove the 'integer multiplies' todo from the README file.
  llvm-svn: 27792
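  A minimal sketch added here for illustration (not part of the original commit message), assuming GCC/Clang generic vector extensions; the hypothetical mul_v16u8 shows the kind of byte-vector multiply this lowering applies to:

      typedef unsigned char v16u8 __attribute__((vector_size(16)));

      /* Element-wise multiply of 16 unsigned bytes.  With this change the
       * PowerPC backend emits vmuloub/vmuleub plus a vperm for the multiply
       * instead of scalarizing it into 16 integer multiplies. */
      v16u8 mul_v16u8(v16u8 a, v16u8 b) {
          return a * b;
      }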
- Evan Cheng authored
  llvm-svn: 27790
- Chris Lattner authored
  li r5, lo16(LCPI1_0)
  lis r6, ha16(LCPI1_0)
  lvx v4, r6, r5
  vmulouh v5, v3, v2
  vmuleuh v2, v3, v2
  vperm v2, v2, v5, v4
  where v4 is:
  LCPI1_0:  ; <16 x ubyte>
  .byte 2
  .byte 3
  .byte 18
  .byte 19
  .byte 6
  .byte 7
  .byte 22
  .byte 23
  .byte 10
  .byte 11
  .byte 26
  .byte 27
  .byte 14
  .byte 15
  .byte 30
  .byte 31
  This is 5.07x faster on the G5 (measured) than lowering to scalar code + loads/stores.
  llvm-svn: 27789
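  An intrinsic-level sketch added for illustration (my paraphrase, not from the commit), assuming <altivec.h> and big-endian byte numbering; the permute-mask bytes are copied from the constant pool entry above:

      #include <altivec.h>

      /* vec_mule/vec_mulo give the full 32-bit products of the even/odd
       * halfword pairs; the permute mask then picks out the low 16 bits
       * of each product, which is the v8i16 multiply result. */
      vector unsigned short mul_v8u16(vector unsigned short a,
                                      vector unsigned short b) {
          vector unsigned int  even = vec_mule(a, b);   /* vmuleuh */
          vector unsigned int  odd  = vec_mulo(a, b);   /* vmulouh */
          vector unsigned char mask = {  2,  3, 18, 19,  6,  7, 22, 23,
                                        10, 11, 26, 27, 14, 15, 30, 31 };
          return (vector unsigned short)vec_perm(even, odd, mask);
      }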
- Chris Lattner authored
  scalarize the sequence into 4 mullw's and a bunch of load/store traffic. This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements PowerPC/vec_mul.ll.
  llvm-svn: 27788
- Evan Cheng authored
  llvm-svn: 27786
- Evan Cheng authored
  llvm-svn: 27784
- Evan Cheng authored
  llvm-svn: 27782
- Evan Cheng authored
  llvm-svn: 27779
- Apr 17, 2006
- Chris Lattner authored
  llvm-svn: 27778
- Chris Lattner authored
  allows us to codegen functions as:
  _test_rol:
  vspltisw v2, -12
  vrlw v2, v2, v2
  blr
  instead of:
  _test_rol:
  mfvrsave r2, 256
  mr r3, r2
  mtvrsave r3
  vspltisw v2, -12
  vrlw v2, v2, v2
  mtvrsave r2
  blr
  Testcase here: CodeGen/PowerPC/vec_vrsave.ll
  llvm-svn: 27777
- Evan Cheng authored
  llvm-svn: 27773
- Chris Lattner authored
  the vrsave register for the caller. This allows us to codegen a function as:
  _test_rol:
  mfspr r2, 256
  mr r3, r2
  mtspr 256, r3
  vspltisw v2, -12
  vrlw v2, v2, v2
  mtspr 256, r2
  blr
  instead of:
  _test_rol:
  mfspr r2, 256
  oris r3, r2, 40960
  mtspr 256, r3
  vspltisw v0, -12
  vrlw v2, v0, v0
  mtspr 256, r2
  blr
  llvm-svn: 27772
- Chris Lattner authored
  vspltisw v2, -12
  vrlw v2, v2, v2
  instead of:
  vspltisw v0, -12
  vrlw v2, v0, v0
  when a function is returning a value.
  llvm-svn: 27771
- Chris Lattner authored
  llvm-svn: 27770
- Chris Lattner authored
  llvm-svn: 27769
- Evan Cheng authored
  llvm-svn: 27768
- Chris Lattner authored
  llvm-svn: 27767
- Chris Lattner authored
  being a bit more clever, add support for odd splats from -31 to -17.
  llvm-svn: 27764
- Evan Cheng authored
  llvm-svn: 27763
- Evan Cheng authored
  llvm-svn: 27762
- Jeff Cohen authored
  llvm-svn: 27761
- Chris Lattner authored
  This implements vec_constants.ll:test_vsldoi and test_rol.
  llvm-svn: 27760
- Chris Lattner authored
  llvm-svn: 27758
- Evan Cheng authored
  llvm-svn: 27755
- Chris Lattner authored
  new patterns.
  llvm-svn: 27754
- Chris Lattner authored
  PowerPC/vec_constants.ll:test_29.
  llvm-svn: 27752
- Chris Lattner authored
  Efficiently codegen even splats in the range [-32,30]. This allows us to codegen <30,30,30,30> as:
  vspltisw v0, 15
  vadduwm v2, v0, v0
  instead of as a cp load.
  llvm-svn: 27750
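  A short sketch added for illustration (not part of the original commit message), assuming <altivec.h>; it spells out the arithmetic behind the example above:

      #include <altivec.h>

      /* 30 does not fit in vspltisw's signed 5-bit immediate, but 30 = 15 + 15,
       * so the splat can be built from two cheap instructions instead of a
       * constant-pool load. */
      vector signed int splat_30(void) {
          vector signed int half = vec_splat_s32(15);   /* vspltisw v0, 15 */
          return vec_add(half, half);                   /* vadduwm v2, v0, v0 */
      }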
- Chris Lattner authored
  if it can be implemented in 3 or fewer discrete altivec instructions, codegen it as such. This implements Regression/CodeGen/PowerPC/vec_perf_shuffle.ll.
  llvm-svn: 27748
- Chris Lattner authored
  llvm-svn: 27746
- Chris Lattner authored
  llvm-svn: 27744
- Chris Lattner authored
  llvm-svn: 27742
- Chris Lattner authored
  of various 4-element vectors.
  llvm-svn: 27739
- Apr 16, 2006
- Evan Cheng authored
  llvm-svn: 27734
- Evan Cheng authored
  llvm-svn: 27733
- Evan Cheng authored
  address has to be 16-byte aligned but the values aren't spilled to 128-bit locations.
  llvm-svn: 27732
- Chris Lattner authored
  one type (v4i32) so that we don't have to write patterns for each type, and so that more CSE opportunities are exposed.
  llvm-svn: 27731
- Chris Lattner authored
  Remove some done items from the todo list.
  llvm-svn: 27729
- Chris Lattner authored
  llvm-svn: 27726
- Chris Lattner authored
  go away when I start using Evan's binop type canonicalizer.
  llvm-svn: 27725