Fix maskAndClamp in gpu.all_reduce.
The clamp value determines the returned predicate. Previously, the clamp value was fixed to 31 and the predicate was therefore always true. This is incorrect for partial warp reductions, but went unnoticed because the returned values happened to be zero (but it could be anything). PiperOrigin-RevId: 285343160
Loading
Please sign in to comment