1. 06 Dec, 2018 1 commit
    • Pruthvi/fix rnn precision (#1874) · 73da681a
      Pruthvi authored
      * - Added reorder support for rnn weights_layer/iter
      
      * i) fixed compilation issues ii) working but still observing precision error
      
      * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
      
      * i) added support for reordering src from NTC to TNC
      
      * reorder support for rnn output from NTC to TNC (see the layout sketch after this list)
      
      * - added support for rnn weight reorder ldgoi -> ldigo (see the weight-reorder sketch after this list)
      - code refactor for lstm/rnn kernel in mkldnn emitter
      
      * - refactor rnn mkldnn kernel, change variable names
      
      * fix RNN codegen kernel
      
      * disable layer rnn fusion pass, to test CI
      
      * method to validate recurrent rnn inputs
      
      * add correlated matches for Recurrent RNN PM
      
      * - simplify reorder logic for rnn_weights
      - fix graph pattern for fusing rnn cell across time steps
      
      * do weight reorders in rnn timestep fusion
      
      * refactored LSTM graph pass
      
      * - Bug fix for finding the lstm inputs deterministically
      - Refactored LSTM graph pass to single pass
      - made changes to LSTM RNN time step fusion graph pass
      
      * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
      
      * fix compilation error
      
      * Fix GNMT rnn fusion
      
      * check if the node is in use before replacing in RNN graph passes
      
      * i) fix style ii) fix topo sort issue in RNN graph pass
      
      * style fix
      
      * fix bug in simplify_concat pass
      
      * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2 (see the concat-of-slices sketch after this list)
      
      * cse for convert layout
      
      * addressed PR comments
      
      * - optimization pass to remove Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
      - conditional fusing of LSTM cells only for the decoder
      
      * made changes to multi-layer RNN fusion callback
      
      * fix asserts in RNN op
      
      * - added support to fuse layers when slc=dlc for RNN cells
      - bug fix on the sanity checks for RNN Op
      
      * - support RNN layer fusion until slc = dlc
      - bug fixes in multi-layer rnn fusion callback
      
      * capture reshape in the RNN weights
      
      * Addressed PR comments
      
      * - added comments in multi-layer PM callback
      - fuse only if slc == dlc across layers
      
      * restore deleted 3_lstm_cell_forward.json file
      
      * fix typo
      
      * fix failing unit tests
      
      * When processing an in-place slice, do not change the offset of the slice node if the argument pointer comes from a function input (see the offset sketch after this list).
      
      * Address PR feedback: process in place slice after propagating in place input.
      
      * Set INTERMEDIATE role before propagating in place input.
      
      * Do not add temporaries to the variable name map before propagating in place input in codegen.
      
      * Fix a bug in codegen.
      
      * Fix a bug in codegen slice.
      
      * reenable disabled rnn unit test
      
      * fix compiler error
      
      * - bug fix in the slicing logic for the layer-fused rnn cell
      - fix failing rnn unit test
      
      * - Addressed PR comments
      - removed redundant checks from the rnn graph pass
      - simplified rnn call back replace node logic
      
      * - added new multilayer rnn *.json file
      - fix test case
      
      * [PRIVATE BRANCH] Style fixes (#2080)
      
      * Style fixes
      
      * change order of lstm gates
      
      * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)
      
      * Style fixes for single-layer RNN fusion
      
      * Style fixes to multi-layer RNN
      
      * style fix
      
      * disable GPU test
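
      The NTC/TNC reorders mentioned in this list convert RNN activation tensors between batch-major (NTC: batch, time, channels) and time-major (TNC: time, batch, channels) layouts. Below is a minimal standalone sketch of what such a reorder computes, assuming dense row-major buffers; the actual emitter goes through mkldnn reorder primitives rather than explicit loops, and the function name is illustrative only.

      ```cpp
      #include <cstddef>
      #include <vector>

      // ntc[n][t][c] -> tnc[t][n][c]: swap the two outer dimensions.
      std::vector<float> ntc_to_tnc(const std::vector<float>& ntc,
                                    size_t N, size_t T, size_t C)
      {
          std::vector<float> tnc(N * T * C);
          for (size_t n = 0; n < N; ++n)
              for (size_t t = 0; t < T; ++t)
                  for (size_t c = 0; c < C; ++c)
                      tnc[(t * N + n) * C + c] = ntc[(n * T + t) * C + c];
          return tnc;
      }
      ```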
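
      The ldgoi -> ldigo weight reorder is a pure axis permutation; in mkldnn's format naming the letters stand for layers, directions, input channels, gates, and output channels. A sketch of the permutation as plain loops over dense buffers (the real pass emits an mkldnn reorder primitive instead):

      ```cpp
      #include <cstddef>
      #include <vector>

      // ldgoi (layers, dirs, gates, output, input) ->
      // ldigo (layers, dirs, input, gates, output)
      std::vector<float> ldgoi_to_ldigo(const std::vector<float>& w,
                                        size_t L, size_t D, size_t G,
                                        size_t O, size_t I)
      {
          std::vector<float> out(L * D * I * G * O);
          for (size_t l = 0; l < L; ++l)
              for (size_t d = 0; d < D; ++d)
                  for (size_t g = 0; g < G; ++g)
                      for (size_t o = 0; o < O; ++o)
                          for (size_t i = 0; i < I; ++i)
                              out[(((l * D + d) * I + i) * G + g) * O + o] =
                                  w[(((l * D + d) * G + g) * O + o) * I + i];
          return out;
      }
      ```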
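
      The Concat bypass (replacing Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2) hinges on one check: if the slices, taken in the Concat's argument order, exactly tile the sliced axis, the Concat merely reassembles what Lstm1 already produced, so Lstm2 can read Lstm1's output directly. A minimal sketch of that range check with hypothetical names; the real pass must also verify that every slice reads the same tensor, that strides are one, and so on.

      ```cpp
      #include <cstddef>
      #include <utility>
      #include <vector>

      // Each pair is the [begin, end) range a slice covers along the
      // concatenation axis, listed in the Concat's argument order.
      bool slices_reassemble_source(
          const std::vector<std::pair<size_t, size_t>>& slices,
          size_t source_extent)
      {
          size_t expected_begin = 0;
          for (const auto& s : slices)
          {
              if (s.first != expected_begin || s.second <= s.first)
                  return false; // gap, overlap, or empty slice
              expected_begin = s.second;
          }
          return expected_begin == source_extent; // whole axis covered
      }
      ```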
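
      The in-place-slice items boil down to a rule: an in-place Slice aliases its argument's buffer at a fixed offset, but when that buffer backs a function input the offset must be left untouched, since parameter buffers are owned by the caller and can be rebound on every call. A sketch with hypothetical types, not nGraph's actual memory-assignment code:

      ```cpp
      #include <cstddef>

      struct TensorInfo
      {
          size_t offset;          // byte offset into the backing buffer
          bool is_function_input; // true for Parameter-backed tensors
      };

      void assign_in_place_slice_offset(const TensorInfo& arg,
                                        TensorInfo& slice_out,
                                        size_t slice_start_bytes)
      {
          if (arg.is_function_input)
              return; // leave the slice's own offset untouched
          slice_out.offset = arg.offset + slice_start_bytes;
      }
      ```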
  2. 23 May, 2018 1 commit
    • LSTM fusion + RNN fusion across time slices for single layer (#826) · 1d08f073
      Pruthvi authored
      * - Added pattern matcher for LSTM cell
      
      * WIP added support to replace lstm cell instead of subgraph
      
      * WIP LSTM pattern matcher, fuses recurrent cells
      
      * WIP added RNN CPU op
      
      * WIP mkldnn emitter code for fprop RNN
      
      * WIP RNN mkldnn integration
      - Added mkldnn kernel for uni directional LSTM in the CPU emitter
      
      * add a getter for root node
      
      * recurrent graph rewrite
      
      * fix perms, rename match_root -> get_match_root
      
      * fix comp errors
      
      * make match_root return the topmost match; fix tests
      
      * - WIP GetOutputElement for handling multiple LSTM outputs
      - use RecurrentGraphRewrite for replacing node after matching LSTM cells
      
      * WIP LSTM multi Output + debug prints
      
      * moved LSTM fusion to cpu_fusion
      
      * WIP added RNN superfused OP
      
      * WIP towards RNN layer fusion
      
      * WIP multiple output slicing RNN
      
      * WIP RNN multiple outputs fusion across layers
      
      * WIP corrected input params for fused RNN OP
      
      * concat corresponding params across different LSTMs to form inputs to the fused RNN op
      
      * i) Added test case for RNN kernel ii) runs without errors
      
      * refactored and moved LSTM class to standalone file
      
      * Rename RNN -> Rnn , LSTM -> Lstm
      
      * WIP replace lstm slices to the consumer op
      
      * Slicing works on multiple RNN layers
      
      * fixed all bugs
      
      * - Added CPU RNN Recurrent Fusion
      - Added CPU LSTM fusion
      - removed debug code
      - style fix
      
      * - Added support to compute src_iter and dst_iter instead of taking zero_memory_desc
      - Added unit test to compute one LSTM cell
      
      * changed RNN op signature to accept the number of states in the basic unit of an RNN (GRU/LSTM/vanilla RNN) cell
      
      * added sanity checks for RNN op
      
      * Fixed issue related to patching the graph while replacing the RNN sliced outputs
      
      * Fixed issue so that the input symbols are fed to the RNN op in the order X0, X1, ..., Xt (see the timestep-ordering sketch after this list)
      
      * Added unit test for multi layer RNN fusion
      
      * Removed debug statements
      
      * i) Added multilayered serialized graph ii) fixed compilation issue
      
      * Addressed PR comments
      
      * i) WIP MKLDNN layout for RNN Op ii) added test case for INTERPRETER vs. CPU Rnn results
      
      * - Fixed bug w.r.t. src_layer feature size in rnn mkldnn emitter code
      - Refactored cpu_fusion rnn test case
      
      * merge origin/master with branch pruthvi/lstm_fusion
      
      * style fix
      
      * Added test case for multiple RNN layers
      
      * i) make rnn an mkldnn op if it meets the constraints ii) assert if rnn is not an mkldnn op (see the constraint-check sketch after this list)
      
      * fix unit test failure
      
      * - Added support to reliably identify the hidden state and input symbols from the nodes collected by the pattern matcher
      - Fixed failing unit tests
      
      * style fix
      
      * - removed "node type" dependency to replace the intermediate LSTM outputs
      
      * Addressed PR comments
      
      * Fix unit test
      
      * - added MKLDNN emitter for LSTM op
      - graph pass to concat LSTM input recurrent state tensors
      - CPU layout assignment for LSTM Op
      - Fixed bugs in rnn/lstm unit tests
      - made changes to use replace_output instead of replace_node for replacing matched graph nodes in LSTM/RNN fusion pass
      
      (cherry picked from commit d16fc709265cc0a73e60c6d5f6d2878e7b908aca)
      
      * style fix
      
      * Renamed passes and style fixes
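
      Two items above concern input ordering: the per-timestep input symbols collected by the pattern matcher must be concatenated in timestep order (X0, X1, ..., Xt) to form the fused RNN op's layer input. A minimal sketch with hypothetical names, assuming each symbol carries a flattened [batch, feature] tensor:

      ```cpp
      #include <algorithm>
      #include <cstddef>
      #include <vector>

      struct TimestepInput
      {
          size_t timestep;         // which step this symbol feeds
          std::vector<float> data; // flattened [batch, feature] tensor
      };

      std::vector<float> build_src_layer(std::vector<TimestepInput> inputs)
      {
          // The matcher may collect symbols in arbitrary order; sort by
          // timestep so the fused op sees X0, X1, ..., Xt.
          std::sort(inputs.begin(), inputs.end(),
                    [](const TimestepInput& a, const TimestepInput& b) {
                        return a.timestep < b.timestep;
                    });
          std::vector<float> src_layer;
          for (const auto& in : inputs)
              src_layer.insert(src_layer.end(), in.data.begin(), in.data.end());
          return src_layer;
      }
      ```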
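
      The "make rnn an mkldnn op" item implies a constraint check followed by a hard assert, presumably because the fused Rnn op has no non-mkldnn implementation in this backend. The log does not spell out the constraints; the float32 element type and matching feature sizes below are plausible examples only, with hypothetical names:

      ```cpp
      #include <cassert>
      #include <cstddef>

      struct RnnAttrs
      {
          bool is_f32;                   // element type is float32
          size_t src_layer_feature_size; // slc
          size_t dst_layer_feature_size; // dlc
      };

      bool rnn_is_mkldnn_compatible(const RnnAttrs& a)
      {
          return a.is_f32 &&
                 a.src_layer_feature_size == a.dst_layer_feature_size;
      }

      void assign_rnn_op(const RnnAttrs& a)
      {
          // Assert rather than silently fall back: the fused op cannot
          // be executed without mkldnn.
          assert(rnn_is_mkldnn_compatible(a) && "Rnn op must be an mkldnn op");
      }
      ```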