Commit 2628641b, authored Jun 25, 2019 by Alex Zinenko; committed by A. Unique TensorFlower, Jun 25, 2019

GPUtoNVVM: adjust integer bitwidth when lowering special register ops

GPU dialect operations (launch and launch_func) use `index` type for thread and
block index values inside the kernel, for compatibility with affine loops.
NVVM dialect operations, following the NVVM intrinsics, use `!llvm.i32` type,
which does not necessarily have the same bit width as the lowered `index` type.
When the widths differ, sign-extend (indices are signed) or truncate the result
of the NVVM dialect operation to the bit width of the lowered `index` type
before passing it to other operations. This behavior is consistent with
`std.index_cast`, which we cannot use here since we are targeting LLVM dialect
types directly, rather than standard integer types.
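
The width adjustment described above can be modeled in plain C++ as a rough
sketch. This is not the code from this commit: `adjustToIndexWidth` and
`indexBits` are hypothetical names, and the function only mimics the
arithmetic effect of sign-extending or truncating a signed 32-bit value on
ordinary integers, not the emission of LLVM dialect operations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (illustration only, not part of the lowering pass).
// Models adjusting a signed 32-bit special-register value to an index type
// that is `indexBits` wide. The result is returned as a signed 64-bit value.
std::int64_t adjustToIndexWidth(std::int32_t value, unsigned indexBits) {
  assert(indexBits >= 1 && indexBits <= 64 && "unsupported index width");

  // Same width or wider: sign-extension preserves the signed value.
  if (indexBits >= 32) {
    return static_cast<std::int64_t>(value);
  }

  // Narrower: keep only the low `indexBits` bits, then reinterpret the
  // result as a signed integer of that width.
  const std::uint64_t modulus = std::uint64_t{1} << indexBits;
  const std::uint64_t low = static_cast<std::uint32_t>(value) & (modulus - 1);
  const bool negative = ((low >> (indexBits - 1)) & 1) != 0;
  return negative
             ? static_cast<std::int64_t>(low) - static_cast<std::int64_t>(modulus)
             : static_cast<std::int64_t>(low);
}
```

Under these assumptions, a 64-bit `index` leaves the signed value unchanged,
a 32-bit `index` needs no adjustment, and a narrower `index` keeps only the
low bits of the register value.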

PiperOrigin-RevId: 254980868

parent 10f320f7
