    Add GPURuntimeContext and GPUPrimitiveEmitter to the gpu transformer (#837) · 026bede0
    Chris Sullivan authored
    * Begin prototype of cudnn_emitter.
    
    * Added GPURuntimeContext to gpu_external_function for passing through to JIT functions.
    
    * gpu_emitters now utilize gpu runtime context.
    
    * Moved cublas and cudnn handles into GPURuntimeContext pointer and out of callframe EntryPoint.
    
    * Added CUDNNEmitter, comparable to MKLDNNEmitter,
    which allows for cudnn kernels to be defined via
    lambda primitives that are emitted and
    subsequently called during graph execution.
    An example implementation is provided for op::Sum.
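
    A minimal sketch of that pattern follows (illustrative only, not the actual ngraph source; the primitive alias, the handle argument, and the descriptor choices are assumptions): a cuDNN reduction comparable to op::Sum is captured in a std::function when the graph is compiled and replayed by index at execution time.

        // Sketch: wrap a cuDNN reduction (op::Sum-like) in a callable that can be
        // stored in a primitive table and invoked later by the generated code.
        #include <cudnn.h>
        #include <functional>
        #include <vector>

        using primitive = std::function<void(void** inputs, void** outputs)>;

        // Build a primitive that sums a 4d float tensor down to a single element
        // and return its index in the table; the emitted code keeps only the index.
        size_t build_sum_primitive(std::vector<primitive>& primitives,
                                   cudnnHandle_t handle,
                                   int n, int c, int h, int w)
        {
            cudnnTensorDescriptor_t in_desc, out_desc;
            cudnnCreateTensorDescriptor(&in_desc);
            cudnnCreateTensorDescriptor(&out_desc);
            cudnnSetTensor4dDescriptor(in_desc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT, n, c, h, w);
            cudnnSetTensor4dDescriptor(out_desc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT, 1, 1, 1, 1);

            cudnnReduceTensorDescriptor_t reduce_desc;
            cudnnCreateReduceTensorDescriptor(&reduce_desc);
            cudnnSetReduceTensorDescriptor(reduce_desc,
                                           CUDNN_REDUCE_TENSOR_ADD,
                                           CUDNN_DATA_FLOAT,
                                           CUDNN_NOT_PROPAGATE_NAN,
                                           CUDNN_REDUCE_TENSOR_NO_INDICES,
                                           CUDNN_32BIT_INDICES);

            float alpha = 1.0f, beta = 0.0f;
            primitives.emplace_back([=](void** inputs, void** outputs) {
                // A full implementation would query cudnnGetReductionWorkspaceSize
                // and pass a device workspace; both are omitted from this sketch.
                cudnnReduceTensor(handle, reduce_desc,
                                  nullptr, 0, // indices (not requested)
                                  nullptr, 0, // workspace (omitted)
                                  &alpha, in_desc, inputs[0],
                                  &beta, out_desc, outputs[0]);
            });
            return primitives.size() - 1;
        }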
    
    * GPURuntimeContext should be stored as unique_ptr in external function.
    
    * Extract raw pointer from unique for cudnn_emitter.
    
    * Removing unrelated code from PR.
    
    * GPURuntimeContext needs to be a strict C interface in case
    the native compiler and clang are utilizing different glibc ABIs.
    Updated to reflect this.
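
    Concretely, the context shared between the natively compiled runtime and the clang-compiled JIT code should contain only C-compatible members. A rough sketch of such a struct (the field set and names are assumptions, not the actual ngraph layout):

        // Sketch of a C-compatible runtime context: plain handles and raw
        // pointers only, no std:: members or virtual functions, so both
        // toolchains agree on its layout regardless of C++ library ABI.
        #include <cublas_v2.h>
        #include <cudnn.h>

        struct GPURuntimeContext
        {
            cudnnHandle_t cudnn_handle;   // created once by the external function
            cublasHandle_t cublas_handle; // shared by all emitted cuBLAS calls
        };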
    
    * Added cudnn::primitive typedef for better readability.
    
    * Moved allocation of CudaFunctionPool to external function
    so that it is available during gpu emission.
    
    * Fixed too-late initialization of cudart.
    
    * CUDNNEmitter moved into superset class GPUPrimitiveEmitter.
    The GPUPrimitiveEmitter handles the emission of all gpu primitives,
    including cudnn, cuda, and cublas. CUBLASEmitter support not yet included.
    
    * Added unordered_map for caching primitives in the gpu_emitter.
    
    * Added dtor to GPUPrimitiveEmitter to clean up compiled functions.
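
    The three bullets above roughly imply an emitter that owns the primitive table, caches previously built kernels by a string hash, and releases its resources in the destructor. A hedged sketch (class, member, and method names are illustrative, not the real interface):

        // Sketch of the primitive-emitter bookkeeping: a table of callables,
        // a hash -> index cache so identical kernels are built only once, and
        // a destructor responsible for releasing what the primitives own.
        #include <cstddef>
        #include <functional>
        #include <string>
        #include <unordered_map>
        #include <vector>

        class GPUPrimitiveEmitterSketch
        {
        public:
            using primitive = std::function<void(void**, void**)>;

            size_t insert(primitive f)
            {
                m_primitives.push_back(std::move(f));
                return m_primitives.size() - 1;
            }

            // Returns the cached index for this kernel signature, or SIZE_MAX.
            size_t lookup(const std::string& hash) const
            {
                auto it = m_primitive_map.find(hash);
                return it == m_primitive_map.end() ? static_cast<size_t>(-1) : it->second;
            }

            void cache(const std::string& hash, size_t index) { m_primitive_map[hash] = index; }

            ~GPUPrimitiveEmitterSketch()
            {
                // Destroy cuDNN/cuBLAS descriptors and unload compiled CUDA
                // functions held by the primitives here.
            }

        private:
            std::vector<primitive> m_primitives;
            std::unordered_map<std::string, size_t> m_primitive_map;
        };

    A cache key built from the op name and shapes, e.g. something like "maxpool_fprop_" + join(shape, '_'), would be looked up before constructing a new set of cuDNN descriptors.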
    
    * Adding back a serialized model graph that was accidentally removed.

    * Added a few additional helpers to use ngraph::row_major_strides.
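
    For reference, row-major strides are just suffix products of the shape; a small illustration of the computation (ngraph::row_major_strides is the real helper, this is only a sketch of the math):

        // Sketch: row-major strides are the running product of trailing dimensions.
        #include <cstddef>
        #include <vector>

        std::vector<size_t> row_major_strides_sketch(const std::vector<size_t>& shape)
        {
            std::vector<size_t> strides(shape.size(), 1);
            for (size_t i = shape.size(); i-- > 1;)
            {
                strides[i - 1] = strides[i] * shape[i];
            }
            return strides;
        }

        // Example: shape {2, 3, 4, 5} -> strides {60, 20, 5, 1}.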
    
    * Added whitespace per @fengleitian's comment.
    
    * Remove implicit type conversions from size_t to int.
    
    * Add op::MaxPool, op::MaxPoolBackprop and op::Pad to GPU transformer (#817)
    
    * Added pooling for 1 and 2 dimensions. 1d uses a cuda kernel and 2d utilizes cudnn.
    Padding is not yet supported.
    
    * Normalized call signature on gpu emission for 1d max pool. Added a few comments.
    
    * Max pool backprop impl. in progress. Amend this commit.
    
    * Max pool backprop implemented. Note that cuDNN
    requires the output tensor of the forward max pool as an input to the backward call,
    even though it is not needed for the computation.
    
    * Formatting and invocation for maxpool changed.
    
    * Fixed too-late initialization of cudart.
    
    * Added padding kernel that is used with maxpool. Need to investigate remaining tests.
    
    * Changed dimensionality check to correctly
    determine if data is 1d or not.
    
    * Added 3d MaxPooling (forward), verified by forcing 2d case to use Nd pooling routines.
    
    * Added 3d MaxPooling (backward), verified by forcing 2d case to use Nd pooling routines.
    
    * Moved cudnn prologues for maxpool into ngraph runtime and out of primitive so
    that the only execution occurring on the JIT runtime is the evaluation of the op kernel.
    
    * Refactored forward and backward pooling into single CUDNNEmitter::build_pooling interface
    with a runtime switch to determine if the op is forward or backward propagation.
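
    A hedged sketch of what such a forward/backward switch amounts to on the cuDNN side (the actual CUDNNEmitter::build_pooling signature and descriptor construction are not shown; names here are illustrative):

        // Sketch: one pooling entry point, with the direction selected at runtime.
        #include <cudnn.h>

        enum class Prop { Forward, Backward };

        void run_pooling_sketch(cudnnHandle_t handle,
                                cudnnPoolingDescriptor_t pool_desc,
                                cudnnTensorDescriptor_t x_desc,
                                cudnnTensorDescriptor_t y_desc,
                                Prop direction,
                                void* x, void* y, void* dy, void* dx)
        {
            float alpha = 1.0f, beta = 0.0f;
            if (direction == Prop::Forward)
            {
                cudnnPoolingForward(handle, pool_desc, &alpha, x_desc, x, &beta, y_desc, y);
            }
            else
            {
                // cuDNN asks for y (the forward output) in the backward call even
                // though max-pooling gradients can be computed without it.
                cudnnPoolingBackward(handle, pool_desc, &alpha,
                                     y_desc, y, y_desc, dy,
                                     x_desc, x, &beta, x_desc, dx);
            }
        }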
    
    * Cache preconstructed cudnn kernel for maxpool if it has already been constructed.
    
    * Forgot to add padding arrays back into cudnn kernel for MaxPool in the 2d case.
    
    * Fixed namespace issues and used join(..., '_').
    
    * Refactored 4d/Nd tensor descriptor builder into single function.
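
    A sketch of the idea (illustrative only, not the ngraph helper): one builder that takes the dimensions and picks the 4d or Nd cuDNN setter.

        // Sketch of a unified tensor-descriptor builder: use the 4d setter when
        // possible, fall back to the Nd setter (which takes explicit strides).
        #include <cudnn.h>
        #include <vector>

        cudnnTensorDescriptor_t build_tensor_descriptor_sketch(const std::vector<int>& dims)
        {
            cudnnTensorDescriptor_t desc;
            cudnnCreateTensorDescriptor(&desc);
            if (dims.size() == 4)
            {
                cudnnSetTensor4dDescriptor(desc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT,
                                           dims[0], dims[1], dims[2], dims[3]);
            }
            else
            {
                // Nd descriptors take row-major strides; cuDNN expects rank >= 3.
                std::vector<int> strides(dims.size(), 1);
                for (size_t i = dims.size(); i-- > 1;)
                {
                    strides[i - 1] = strides[i] * dims[i];
                }
                cudnnSetTensorNdDescriptor(desc, CUDNN_DATA_FLOAT,
                                           static_cast<int>(dims.size()),
                                           dims.data(), strides.data());
            }
            return desc;
        }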
    
    * Changed conditionals and comments. Now throws if MaxPool on more than 3 spatial dimensions is requested.
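
    Putting the dispatch described in the bullets above together (1d via a custom CUDA kernel, 2d/3d via cuDNN, anything larger rejected), a rough sketch with placeholder function names:

        // Sketch of the rank-based dispatch; emit_1d_max_pool and
        // emit_cudnn_max_pool are placeholder names, not ngraph functions.
        #include <cstddef>
        #include <stdexcept>

        void emit_max_pool_sketch(size_t spatial_rank)
        {
            if (spatial_rank == 1)
            {
                // emit_1d_max_pool(...);    // custom CUDA kernel path
            }
            else if (spatial_rank == 2 || spatial_rank == 3)
            {
                // emit_cudnn_max_pool(...); // cuDNN Nd pooling path
            }
            else
            {
                throw std::runtime_error("MaxPool with more than 3 spatial dimensions is unsupported");
            }
        }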
    
    * Fixed forward declare for GPURuntimeContext (class -> struct).
    
    * Clang complains about missing braces on brace-initializer. Fixed implicit conversions.
    
    * Fixed implicit conversions (clang).
    
    * Reverting changes on autodiff test for maxpool. @Krovatkin will update later.