Refactored GPU backend state into BackendContext (#1186)
* Refactored GPU backend state into BackendContext and moved it to the highest-level GPU_Backend. Some bugs appeared in the process and need investigation.
* extra *block_size
* Change grid_size to threads.
* Bug fix in softmax cache parameters.
* Additional bug fix for maxpool1d cache parameters.
* Bug fix in softmax cache parameters.
* Additional bug fix for maxpool1d cache parameters.
* Remove temporary print statements.
* Use nthreads in primitive hash.
* Switched from using stack references for the cudnn and cublas handles to heap pointers held only by the C struct GPURuntimeContext but managed by the GPU_Backend.
* Refactored the use of GPURuntimeContext* ctx throughout the emitters.
* Use std::prev instead of operator-- for memory iterator capture.
* Bug fix from abaf1d7.
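
The handle-ownership change and the std::prev change described above can be illustrated roughly as follows. This is a minimal sketch, not the code from this commit: `FakeCudnnHandle`, `FakeCublasHandle`, `RuntimeContextSketch`, `BackendSketch`, and `last_element` are hypothetical stand-ins so the example compiles without CUDA, and the real GPURuntimeContext and GPU_Backend may be laid out differently.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <memory>

// Hypothetical stand-ins for the cudnn/cublas handle types (assumption:
// used only so this sketch builds without the CUDA toolkit).
struct FakeCudnnHandle { int id = 1; };
struct FakeCublasHandle { int id = 2; };

// Plain C-style struct: it only holds raw pointers and owns nothing.
struct RuntimeContextSketch {
    FakeCudnnHandle* cudnn_handle;
    FakeCublasHandle* cublas_handle;
};

// The backend allocates the handles on the heap and manages their lifetime;
// the context struct merely points at them.
class BackendSketch {
public:
    BackendSketch()
        : m_cudnn(new FakeCudnnHandle()),
          m_cublas(new FakeCublasHandle()),
          m_ctx(new RuntimeContextSketch{m_cudnn.get(), m_cublas.get()}) {}

    // Emitted code would receive this pointer rather than references to
    // stack-allocated handles.
    RuntimeContextSketch* context() const { return m_ctx.get(); }

private:
    // Declared in the order they are initialized above.
    std::unique_ptr<FakeCudnnHandle> m_cudnn;
    std::unique_ptr<FakeCublasHandle> m_cublas;
    std::unique_ptr<RuntimeContextSketch> m_ctx;
};

// std::prev returns a new iterator one step back without modifying its
// argument, so it can be applied directly to the temporary from end().
// Precondition: values is not empty.
inline int last_element(const std::list<int>& values) {
    return *std::prev(values.end());
}
```

Because the handles live on the heap for as long as the backend object does, the pointers stored in the context stay valid across calls, which is not guaranteed for references to stack objects.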