Custom lower v4i32 multiplies into a cute sequence, instead of having legalize
scalarize the sequence into 4 mullw's and a bunch of load/store traffic. This speeds up v4i32 multiplies 4.1x (measured) on a G5. This implements PowerPC/vec_mul.ll llvm-svn: 27788
Loading
Please register or sign in to comment