[X86][SSE] Vectorize CTTZ + CTTZ_ZERO_UNDEF
Now that we have fast vector CTPOP implementations we can use this to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1)) Additionally, for AVX512CD that provides lzcnt instructions we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x)) Differential Revision: http://reviews.llvm.org/D12663 llvm-svn: 248091
Loading
Please sign in to comment