[CUDA] Use activemask.b32 instruction to implement __activemask w/ CUDA-9.2+
The vote.ballot instruction is gone in recent CUDA versions, and vote.sync.ballot cannot be used because it requires a thread mask parameter. Fortunately, PTX 6.2 (introduced with CUDA-9.2) provides the activemask.b32 instruction for this purpose.

Differential Revision: https://reviews.llvm.org/D66665
llvm-svn: 370792
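
For illustration only, here is a minimal sketch of how activemask.b32 can be reached through inline PTX. It is not the patch itself: the helper name active_mask_sketch, the kernel, and the host driver are all hypothetical, and the sketch assumes a CUDA 9.2 or newer toolchain so that PTX 6.2 is available.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not the name used in the patch): reads the mask of
// currently active threads in the warp via the PTX activemask.b32 instruction.
// Unlike vote.sync.ballot, it takes no thread mask operand.
__device__ unsigned int active_mask_sketch() {
  unsigned int mask;
  asm volatile("activemask.b32 %0;" : "=r"(mask));
  return mask;
}

// Only even-numbered lanes enter the branch and record the mask they observe.
__global__ void record_mask(unsigned int *out) {
  out[threadIdx.x] = 0;
  if (threadIdx.x % 2 == 0) {
    out[threadIdx.x] = active_mask_sketch();
  }
}

int main() {
  const int kThreads = 32;  // one warp
  unsigned int host[kThreads];
  unsigned int *dev = nullptr;

  if (cudaMalloc(&dev, sizeof(host)) != cudaSuccess) {
    std::fprintf(stderr, "cudaMalloc failed\n");
    return 1;
  }

  record_mask<<<1, kThreads>>>(dev);

  // cudaMemcpy on the default stream waits for the kernel to finish.
  cudaError_t err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
  cudaFree(dev);
  if (err != cudaSuccess) {
    std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    return 1;
  }

  // The exact value depends on how the hardware schedules the divergent lanes.
  std::printf("mask seen by lane 0: 0x%08x\n", host[0]);
  return 0;
}
```

The mask reflects whichever lanes happen to be executing together at that point, so the sketch prints the value rather than assuming a particular result.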