Skip to content
Commit 1485fd29 authored by aartbik's avatar aartbik
Browse files

[mlir] [VectorOps] Improve scatter/gather CPU performance

Replaced the linearized address with the proper LLVM way of
defining vector of base + indices in SIMD style. This yields
much better code. Some prototype results with microbencmarking
sparse matrix x vector with 50% sparsity (about 2-3x faster):

         LINEARIZED     IMPROVED
GFLOPS  sdot  saxpy     sdot saxpy
16x16    1.6   1.4       4.4  2.1
32x32    1.7   1.6       5.8  5.9
64x64    1.7   1.7       6.4  6.4
128x128  1.7   1.7       5.9  5.9
256x256  1.6   1.6       6.1  6.0
512x512  1.4   1.4       4.9  4.7

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D84368
parent dab898f9
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment