Commit 11d12670 authored Oct 08, 2019 by Alex Zinenko Committed by A. Unique TensorFlower Oct 08, 2019

GPUToCUDA: attach CUBIN to the nested module rather than to the function

Originally, attributes containing CUBIN blobs were attached to the kernel
function called by `gpu.launch_func`. This kernel is now contained in a nested
module that is used as a compilation unit. Attach compiled CUBIN blobs to the
module rather than to the function, since the module is what gets compiled.
This also avoids duplicating the attribute on multiple kernels within the same
module.
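
The duplication argument can be illustrated with a small toy model. This is
only a sketch and does not use any MLIR API: `Kernel`, `KernelModule`,
`compile_module`, and the attribute key `CUBIN_ATTR` are hypothetical names
chosen for illustration. It assumes that compiling a module yields a single
blob covering all kernels in it.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical attribute key, used only in this toy model.
CUBIN_ATTR = "cubin"


@dataclass
class Kernel:
    name: str
    attrs: Dict[str, bytes] = field(default_factory=dict)


@dataclass
class KernelModule:
    name: str
    kernels: List[Kernel]
    attrs: Dict[str, bytes] = field(default_factory=dict)


def compile_module(module: KernelModule) -> bytes:
    # Stand-in for real compilation: the module is the compilation unit,
    # so a single blob covers every kernel it contains.
    return ("CUBIN:" + ",".join(k.name for k in module.kernels)).encode()


def attach_to_functions(module: KernelModule) -> None:
    # Old scheme: every kernel function carries its own copy of the blob.
    blob = compile_module(module)
    for kernel in module.kernels:
        kernel.attrs[CUBIN_ATTR] = blob


def attach_to_module(module: KernelModule) -> None:
    # New scheme: the blob is stored once, on the nested module.
    module.attrs[CUBIN_ATTR] = compile_module(module)


def blob_count(module: KernelModule) -> int:
    # Count how many blob attributes exist on the module and its kernels.
    on_module = 1 if CUBIN_ATTR in module.attrs else 0
    on_kernels = sum(1 for k in module.kernels if CUBIN_ATTR in k.attrs)
    return on_module + on_kernels


if __name__ == "__main__":
    old = KernelModule("kernels", [Kernel("a"), Kernel("b")])
    attach_to_functions(old)
    new = KernelModule("kernels", [Kernel("a"), Kernel("b")])
    attach_to_module(new)
    print(blob_count(old), blob_count(new))
```

In this model, the per-function scheme stores one attribute per kernel, while
the per-module scheme stores exactly one regardless of how many kernels the
module holds.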

PiperOrigin-RevId: 273497303

parent 52e082b6
