Drwebb/gpu backend dot op (#413)
* Drwebb/gpu backend dot op (#387) * GPU Dot prod emitter switch statement * cuBLAS dot kernel call * Flush out arg substitution into gpu dot kernel call * Drwebb/gpu backend dot op (#392) * Take in CodeWriter into gpu op emitters * Introduce GPU function gen based on pass functions * Additional gpu emitter stubs * link cublas in to unit test and ngraph * Use static code gen methods for GPU, add new GPU op stubs * use pass manager to declare functions / cublas Updates * Prune down gpu_external_function wip * Switch back to GPU tensor views in GPU backend * Pass in cublas handle to GPU external function * cuMalloc memory in gpu tensor view * Use cuda runtime malloc and free for tensor view managment c * change GPU tensor view init, and use GPU tensor view for GPU call frame * include headers as system dirs * GPU tensor printing utility function * cublasSetPointer to device mode / Fix copyright notification lowercasing * Passing GPU dot product test using cuBLAS Clean up * Changes from review
Showing
This diff is collapsed.
This diff is collapsed.
Please
register
or
sign in
to comment