[mlir][arith] Support wide integer multiplication emulation
Emulate multiplication by splitting each input element of type i2N into 4 digits, each stored in type iN but only N/2 bits wide. This is so that the intermediate multiplications and additions do not overflow. We extract these i(N/2) digits from iN vector elements by masking (low digit) and shifting right (high digit).

The multiplication algorithm used is the standard (long) multiplication. Multiplying two i2N integers produces (at most) an i4N result, but because the top i2N bits are not needed, we do not compute them.

In total, this implementation performs 10 intermediate multiplications and 16 additions. The number of multiplications could be decreased by switching to a more efficient algorithm like Karatsuba. This would, however, require being able to perform (intermediate) wide additions and subtractions, so it is not clear that such an implementation would be more efficient.

I tested this on all 16-bit input pairs, when emulating i16 with i8.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D133629
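For illustration only, here is a minimal scalar sketch of the digit-splitting idea described above, for the tested case of emulating i16 with i8 (N = 8, so digits are 4 bits wide). It is not the code of the actual rewrite pattern: the names `Wide16`, `narrow8`, and `emulatedMul16` are hypothetical, and the sketch does not try to reproduce the exact operation sequence or the addition count stated above. It does perform the same 10 digit products (those with i + j < 4), and it asserts that every intermediate value fits in 8 bits.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical representation of one emulated 16-bit value as two 8-bit halves.
struct Wide16 {
  uint8_t low;
  uint8_t high;
};

// Checks that an intermediate value fits in 8 bits, then narrows it.
static uint8_t narrow8(unsigned value) {
  assert(value <= 0xFF && "intermediate value does not fit in 8 bits");
  return static_cast<uint8_t>(value);
}

// Returns the low 16 bits of x * y using only values that fit in 8 bits.
Wide16 emulatedMul16(Wide16 x, Wide16 y) {
  // Split each operand into four 4-bit digits:
  // mask for the low digit of each half, shift right for the high digit.
  const std::array<uint8_t, 4> xd = {narrow8(x.low & 0xF), narrow8(x.low >> 4),
                                     narrow8(x.high & 0xF), narrow8(x.high >> 4)};
  const std::array<uint8_t, 4> yd = {narrow8(y.low & 0xF), narrow8(y.low >> 4),
                                     narrow8(y.high & 0xF), narrow8(y.high >> 4)};

  // col[k] accumulates contributions of weight 16^k.
  // col[4] only receives bits above the low 16 and is discarded.
  std::array<uint8_t, 5> col = {};
  for (int i = 0; i < 4; ++i) {
    for (int j = 0; i + j < 4; ++j) {  // 10 products in total
      const uint8_t product = narrow8(xd[i] * yd[j]);  // at most 15 * 15 = 225
      col[i + j] = narrow8(col[i + j] + (product & 0xF));
      col[i + j + 1] = narrow8(col[i + j + 1] + (product >> 4));
    }
  }

  // Propagate carries from the lowest column upwards.
  std::array<uint8_t, 4> digit = {};
  uint8_t carry = 0;
  for (int k = 0; k < 4; ++k) {
    const uint8_t sum = narrow8(col[k] + carry);
    digit[k] = static_cast<uint8_t>(sum & 0xF);
    carry = static_cast<uint8_t>(sum >> 4);
  }

  // Reassemble the four result digits into two 8-bit halves.
  return Wide16{narrow8(digit[0] | (digit[1] << 4)),
                narrow8(digit[2] | (digit[3] << 4))};
}
```

Splitting each digit product into its own low and high digit before accumulating is one way to keep every column sum within 8 bits. The carry out of the top column and everything in `col[4]` belong to the upper half of the full product, which is omitted.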