    RNN fusion (inference) (#1459) · 4df5ea8b
    Chris Sullivan authored
    * Add op::Sigmoid to nvgpu.
    
    * Bring rnn fusion and concat passes over into GPU from IA. This is a temporary move until generalization and gpu specification can occur.
    
    * Add LSTM fusion and cudnn inference kernel. Next need recurrent fusion and layer fusion.
    
    * Formatting
    
    * Removed unnecessary extra output from LSTM op (rnn with seq. length = 1, so y = hy).
    
    * Add RNN fusion of LSTM cells within a recurrent layer.
    
    * Formatting.
    
    * Add fusion across RNN layers.
    
    * Formatting.
    
    * Add algebraic simplification.
    
    * Added rnn fusion tests.
    
    * Updated conditional on LSTM fusion to better distinguish bound nodes as ht vs xt.
    
    * Formatting.
    
    * Removed print statements.
    
    * Formatting.
    
    * Committing missing file.
    
    * Remove concat inputs pass and mkldnn references.
    
    * fix cmake paths
    
    * conflict resolution with merge from master.
    
    * remove explicit lstm op support. bare LSTM ops are converted to RNN ops for emission.
    
    * Formatting.
    
    * Use NGRAPH_ASSERT. Formatting of Intel copyright.
    
    * Add check on the feature size (shape) of the recurrent (hidden) input and cell state, to ensure they are the same size.
    
    * fix wrong rnn header
    
    * Formatting.
    
    * Add back lstm op to dispatch table.
    
    * Added RNN test which shows cudnn rnn kernel is not producing correct results.
    
    * With update to AlgSimpl. to simplify concat-reshape-slice, the check modified in this commit needed to be relaxed.
    
    * Bug fix in parameter tensor packing.
    
    * Alias third output element of RNN for cell state (bug fix).
    
    * Resolve numerical correctness issue with negative values in RNN (bug fix).
    Add minimal test to evaluate LSTM and compare with values calculated by hand.
    
    * Add tensor parameter sizes to kernel hash as they are kernel-specific.
    
    * Add 2 layer lstm fusion test against by-hand solution.
    
    * Export param concatenation to graph for cudnn kernel at both the single rnn layer and multi-layer.
    
    * Formatting.
    
    * Finishing touches after merge: add support for macro-expanded dispatch via op_tbl.
    
    * Simplify macro support for gpu ops.
    
    * Add CUDNN_VERSION >= 7200 defguards for RNN fusion.
    Need to decide how to notify user of increased performance with >= 7200.
    
    * Revert lstm_analytic test to explicitly copy data to tensor params.
    
    * Removed namespace arg from NGRAPH_GPU_OP.
    
    * Refactored macros to different header so op_tbl only contains op list.
    
    * Defguard on cudnn_descriptor<cudnnRNNDataDescriptor_t>.
    
    * doubles -> floats
    
    * Reorg. pass asserts, prepare to replace with non-throwing pass failures.
    
    * Remove Lstm op and replace it with Rnn.
    
    * Format
    
    * Utilize RETURN_IF_FALSE in rnn pass to avoid any RT asserts.
    Note that falling back to raw (no passes) graph for 2rnn_3lstm json from mxnet models
    results in a double free inside of the memory layout pass. Appears to be a bug
    in Reshape pass through.
    
    * Removed print statements. Add check on input data and recurrent data.
    
    * Don't reuse memory for non-destructive ops.
    
    * Add back Rnn test.
    
    * Formatting.
    
    * Clean up comments.
    
    * Update test per review comments.
Name
.ci
cmake
contrib/docker
doc
licenses
maint
python
src
test
.clang-format
.gitattributes
.gitignore
.gitmodules
.travis.yml
CMakeLists.txt
CODEOWNERS
CONTRIB.md
INSTALL.md
LICENSE
README.md
VERSION.in
changes.md