- 06 Dec, 2018 1 commit
Pruthvi authored
* Added reorder support for rnn weights_layer/iter
* i) fixed compilation issues ii) working but still observing precision error
* i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
* added support for src reorder to TNC from NTC
* reorder support for rnn output from NTC to TNC (these layout reorders are sketched after this list)
* added support for rnn weight reorder ldgoi -> ldigo; code refactor for lstm/rnn kernel in mkldnn emitter
* refactor rnn mkldnn kernel, change variable names
* fix RNN codegen kernel
* disable layer rnn fusion pass, to test CI
* method to validate recurrent rnn inputs
* add correlated matches for Recurrent RNN PM
* simplify reorder logic for rnn_weights; fix graph pattern for fusing rnn cell across time steps
* do weight reorders in rnn time-step fusion
* refactored LSTM graph pass
* bug fix for finding the lstm inputs deterministically; refactored LSTM graph pass into a single pass; made changes to LSTM RNN time-step fusion graph pass
* use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
* fix compilation error
* fix GNMT rnn fusion
* check if the node is in use before replacing in RNN graph passes
* i) fix style ii) fix topo sort issue in RNN graph pass
* style fix
* fix bug in simplify_concat pass
* replace Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2
* CSE for convert layout
* addressed PR comments
* optimization pass to remove Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2; conditional fusing of LSTM cells only for the decoder
* made changes to multi-layer RNN fusion callback
* fix asserts in RNN op
* added support to fuse layers when slc == dlc for RNN cells; bug fix on the sanity checks for RNN op
* support RNN layer fusion only while slc == dlc; bug fixes in the multi-layer rnn fusion callback
* capture reshape in the RNN weights
* addressed PR comments
* added comments in multi-layer PM callback; fuse only if slc == dlc across layers
* restore deleted 3_lstm_cell_forward.json file
* fix typo
* fix failing unit tests
* when processing an in-place slice, do not change the offset of the slice node if the argument pointer comes from a function input
* address PR feedback: process in-place slice after propagating in-place input
* set INTERMEDIATE role before propagating in-place input
* do not add temporaries to the variable name map before propagating in-place input in codegen
* fix a bug in codegen
* fix a bug in codegen slice
* re-enable disabled rnn unit test
* fix compiler error
* bug fix in the slicing logic for the layer-fused rnn cell; fix failing rnn unit test
* addressed PR comments; removed redundant checks from the rnn graph pass; simplified the rnn callback replace-node logic
* added new multi-layer rnn *.json file; fix test case
* [PRIVATE BRANCH] Style fixes (#2080): style fixes; change order of lstm gates
* [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113): style fixes for single-layer RNN fusion; style fixes to multi-layer RNN
* style fix
* disable GPU test
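For reference, the NTC -> TNC and ldgoi -> ldigo reorders mentioned above are pure layout transposes. Below is a minimal standalone C++ sketch of both, assuming dense row-major buffers; the function names and plain-loop form are illustrative only, since the actual emitter issues MKL-DNN reorder primitives rather than hand-written loops.

```cpp
#include <cstddef>
#include <vector>

// NTC (batch, time, channels) -> TNC (time, batch, channels) for the
// RNN src/dst tensors.
std::vector<float> reorder_ntc_to_tnc(const std::vector<float>& src,
                                      std::size_t N, std::size_t T, std::size_t C)
{
    std::vector<float> dst(src.size());
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t t = 0; t < T; ++t)
            for (std::size_t c = 0; c < C; ++c)
                dst[(t * N + n) * C + c] = src[(n * T + t) * C + c];
    return dst;
}

// RNN weights, ldgoi -> ldigo: within each (layer, direction) slab,
// transpose (gate, output, input) indexing to (input, gate, output).
std::vector<float> reorder_ldgoi_to_ldigo(const std::vector<float>& src,
                                          std::size_t L, std::size_t D,
                                          std::size_t G, std::size_t O,
                                          std::size_t I)
{
    std::vector<float> dst(src.size());
    for (std::size_t l = 0; l < L; ++l)
        for (std::size_t d = 0; d < D; ++d)
        {
            const std::size_t base = (l * D + d) * G * O * I;
            for (std::size_t g = 0; g < G; ++g)
                for (std::size_t o = 0; o < O; ++o)
                    for (std::size_t i = 0; i < I; ++i)
                        dst[base + (i * G + g) * O + o] =
                            src[base + (g * O + o) * I + i];
        }
    return dst;
}
```

Hoisting the weight reorder so it runs once per fused Rnn op, rather than once per LSTM cell, appears to be the point of the "do weight reorders in rnn time-step fusion" commit above.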
- 23 May, 2018 1 commit
Pruthvi authored
* Added pattern matcher for LSTM cell
* WIP: added support to replace the lstm cell instead of the subgraph
* WIP: LSTM pattern matcher, fuses recurrent cells
* WIP: added RNN CPU op
* WIP: mkldnn emitter code for fprop RNN
* WIP: RNN mkldnn integration; added mkldnn kernel for unidirectional LSTM in the CPU emitter (the per-cell recurrence is sketched after this list)
* add a getter for root node
* recurrent graph rewrite
* fix perms, rename match_root -> get_match_root
* fix compilation errors
* make match_root return the topmost match; fix tests
* WIP: GetOutputElement for handling multiple LSTM outputs; use RecurrentGraphRewrite for replacing nodes after matching LSTM cells
* WIP: LSTM multi-output + debug prints
* moved LSTM fusion to cpu_fusion
* WIP: added RNN superfused op
* WIP: towards RNN layer fusion
* WIP: multiple-output slicing RNN
* WIP: RNN multiple-outputs fusion across layers
* WIP: corrected input params for fused RNN op
* concat corresponding params across different LSTMs to form inputs to the fused RNN op
* i) added test case for RNN kernel ii) runs without errors
* refactored and moved LSTM class to a standalone file
* rename RNN -> Rnn, LSTM -> Lstm
* WIP: replace lstm slices with the consumer op
* slicing works on multiple RNN layers
* fixed all bugs
* added CPU RNN recurrent fusion; added CPU LSTM fusion; removed debug code; style fix
* added support to compute src_iter and dst_iter instead of taking zero_memory_desc; added unit test to compute one LSTM cell
* changed RNN op signature to accept the number of states in the basic RNN unit (GRU/LSTM/vanilla RNN) cell
* added sanity checks for RNN op
* fixed issue related to patching the graph while replacing the RNN sliced outputs
* fixed issue so the input symbols are fed to the RNN op in the order X0, X1, ... Xt
* added unit test for multi-layer RNN fusion
* removed debug statements
* i) added multi-layered serialized graph ii) fixed compilation issue
* addressed PR comments
* i) WIP MKLDNN layout for RNN op ii) added test case for INTERPRETER vs. CPU Rnn results
* fixed bug w.r.t. src_layer feature size in rnn mkldnn emitter code; refactored cpu_fusion rnn test case
* merge origin/master with branch pruthvi/lstm_fusion
* style fix
* added test case for multiple RNN layers
* i) make rnn an mkldnn op if it meets the constraints ii) assert if rnn is not an mkldnn op
* fix unit test failure
* added support to reliably identify the hidden-state and input symbols from the nodes collected by the pattern matcher; fixed failing unit tests
* style fix
* removed the "node type" dependency for replacing the intermediate LSTM outputs
* addressed PR comments
* fix unit test
* added MKLDNN emitter for LSTM op; graph pass to concat LSTM input recurrent state tensors; CPU layout assignment for LSTM op; fixed bugs in rnn/lstm unit tests; made changes to use replace_output instead of replace_node for replacing matched graph nodes in the LSTM/RNN fusion pass (cherry picked from commit d16fc709265cc0a73e60c6d5f6d2878e7b908aca)
* style fix
* renamed passes and style fixes
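The fusion work above replaces chains of matched LSTM cells with a single fused Rnn op; the per-cell computation being matched is the standard LSTM recurrence. A minimal scalar C++ sketch follows, assuming hidden size H, gate order [i, f, g, o], and precomputed gate pre-activations; this is not the actual CPU emitter code, and the gate order in particular is an assumption (a later commit even reorders the lstm gates).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One LSTM cell step for hidden size H, given the 4*H gate
// pre-activations z = W*x_t + U*h_prev + b laid out as [i, f, g, o].
// The cell state c and hidden state h are updated in place.
void lstm_cell_step(const std::vector<float>& z,
                    std::vector<float>& c,
                    std::vector<float>& h,
                    std::size_t H)
{
    auto sigmoid = [](float x) { return 1.0f / (1.0f + std::exp(-x)); };
    for (std::size_t k = 0; k < H; ++k)
    {
        const float i = sigmoid(z[0 * H + k]);   // input gate
        const float f = sigmoid(z[1 * H + k]);   // forget gate
        const float g = std::tanh(z[2 * H + k]); // candidate cell update
        const float o = sigmoid(z[3 * H + k]);   // output gate
        c[k] = f * c[k] + i * g;                 // new cell state c_t
        h[k] = o * std::tanh(c[k]);              // new hidden state h_t
    }
}
```

Because h_t of one cell feeds the next cell's pre-activations, consecutive matched cells form a chain; the recurrent fusion pass collects that chain and emits one Rnn op over all time steps instead of one kernel per cell.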