refactor cudaarithm reductions:
* remove overloads with explicit buffer, now BufferPool is used
* added async versions for all reduce functions