[NVPTX] Add intrinsics for shfl instructions.
Summary: Currently clang emits these instructions via inline (volatile) asm in the CUDA headers. Switching to intrinsics will let the optimizer reason across calls to these intrinsics. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D21160 llvm-svn: 272298
Loading
Please register or sign in to comment