1. 27 Feb, 2019 1 commit
    • Reuse memory for CPU backend. (#2238) · b277627a
      Amy Zhuang authored
      * Reuse memory for CPU backend.
      
      * Use NGRAPH_REUSE_MEMORY to enable memory reuse.
      
      * Add a test.
      
      * Move make_function to test_tools.cpp.
      
      * Add more comments.
      
      * Address PR Feedback: add a method to CPU backend.
      
      * Add a member to CPUOpAnnotations to remove redundant code.
      
      * Overload compile function for CPU backend.
      
      * Move make_function out of test_tools.
      
      * Address PR Feedback.
      
      * Use modified liveness analysis in CPUMemoryAssignment pass.
      
      * Use lambda expression.
      
      * Fix style error.
      
      * Check if any user of the tensor has destructive oi when building tensor alias map.
      
      * Fix a bug.
      
      * Check if tensor has multiple users.
      
      * Allow tensor alias for destructive oi node.
      
      * Update multiple_users_tensor set along the chain of in place ops.
      
      * No tensor alias if input is parameter or constant.
      
      * Use buffer sets in CPU memory assignment;
      tensors sharing the same memory buffer are put into the same set.
      
      * Add more checks and do not combine sets when allowing destructive oi.
      
      * Style fix.
      
      * Do not allow destructive oi if the input tensor uses function input memory.
      
      Update set label.
      
      * Add unit tests.
      
      * Style fix.
      
      * Get the correct size for memcpy when the input is padded.
      
      * Style fix.
      
      * Address PR feedback.
      
      * Address PR feedback.
      
      * Move make_function in cpu_test to after #if 0 and before the disabled test.
      
      * Add utility functions.
      
      Use iterator.
      
      Rename variables.
      
      * Add pass attributes and move cpu memory assignment to common passes (#2504)
      b277627a
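      The buffer-set scheme above groups tensors that alias one another (e.g. through in-place ops) into a single set, and a buffer is reclaimed only once every member of its set is dead. Below is a minimal standalone sketch of that bookkeeping, gated by NGRAPH_REUSE_MEMORY as in the commit; all names and liveness values are hypothetical, not the actual CPUMemoryAssignment code.

        #include <cstdlib>
        #include <iostream>
        #include <map>
        #include <set>
        #include <string>

        int main()
        {
            // NGRAPH_REUSE_MEMORY gates the optimization, as in the commit above.
            const char* flag = std::getenv("NGRAPH_REUSE_MEMORY");
            bool reuse_memory = flag && std::string(flag) == "1";

            // Buffer set id -> tensors sharing one allocation (e.g. via in-place ops).
            std::map<size_t, std::set<std::string>> buffer_sets = {
                {0, {"conv_out", "relu_out"}}, // relu ran in place on conv_out
                {1, {"pool_out"}}};

            // Hypothetical last-use index per buffer set from liveness analysis.
            std::map<size_t, size_t> last_use = {{0, 3}, {1, 5}};

            size_t current_op = 4; // position of the op being assigned memory
            for (const auto& bs : buffer_sets)
            {
                // A buffer is reusable only if reuse is enabled and its whole set is dead.
                bool dead = last_use[bs.first] < current_op;
                std::cout << "set " << bs.first
                          << (reuse_memory && dead ? ": reusable" : ": keep") << "\n";
            }
            return 0;
        }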
  2. 25 Feb, 2019 1 commit
    • Pruthvi/bi rnn (#2232) · a444f7a9
      Pruthvi authored
      * - Added reorder support for rnn weights_layer/iter
      
      * i) fixed compilation issues ii) working but still observing precision error
      
      * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
      
      * i) added support for src reorder to TNC from NTC
      
      * reorder support for rnn output from NTC to TNC
      
      * - added support for rnn weight reorder ldgoi -> ldigo
      - code refactor for lstm/rnn kernel in mkldnn emitter
      
      * - refactor rnn mkldnn kernel, change variable names
      
      * fix RNN codegen kernel
      
      * disable layer rnn fusion pass, to test CI
      
      * method to validate recurrent rnn inputs
      
      * add correlated matches for Recurrent RNN PM
      
      * - simplify reorder logic for rnn_weights
      - fix graph pattern for fusing rnn cell across time steps
      
      * do weights reorders in rnn timesteps fusion
      
      * refactored LSTM graph pass
      
      * - Bug fix for finding the lstm inputs deterministically
      - Refactored LSTM graph pass to single pass
      - made changes to LSTM RNN time step fusion graph pass
      
      * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
      
      * fix compilation error
      
      * Fix GNMT rnn fusion
      
      * check if the node is in use before replacing in RNN graph passes
      
      * i) fix style ii) fix topo sort issue in RNN graph pass
      
      * style fix
      
      * fix bug in simplify_concat pass
      
      * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2
      
      * cse for convert layout
      
      * addressed PR comments
      
      * - optimization pass to remove  Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
      - conditional fusing of LSTM cells only for the decoder
      
      * made changes to multi layer RNN fusion callback
      
      * fix asserts in RNN op
      
      * - added support to fuse layers when slc=dlc for RNN cells
      - bug fix on the sanity checks for RNN Op
      
      * - support RNN layer fusion till slc = dlc
      - bug fixes in multi layer rnn fusion call back
      
      * capture reshape in the RNN weights
      
      * Addressed PR comments
      
      * - added comments in multi layer PM call back
      - fuse only if slc == dlc across layers
      
      * restore deleted 3_lstm_cell_forward.json file
      
      * fix typo
      
      * fix failing unit tests
      
      * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input.
      
      * Address PR feedback: process in place slice after propagating in place input.
      
      * Set INTERMEDIATE role before propagating in place input.
      
      * Do not add temporaries to the variable name map before propagating in place input in codegen.
      
      * Fix a bug in codegen.
      
      * Fix a bug in codegen slice.
      
      * reenable disabled rnn unit test
      
      * fix compiler error
      
      * - bug fix in the slicing logic for the layer fused rnn cell
      - fix failing rnn unit test
      
      * - Addressed PR comments
      - removed redundant checks from the rnn graph pass
      - simplified rnn call back replace node logic
      
      * - added new multilayer rnn *.json file
      - fix test case
      
      * [PRIVATE BRANCH] Style fixes (#2080)
      
      * Style fixes
      
      * change order of lstm gates
      
      * WIP bi rnn
      
      * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)
      
      * Style fixes for single-layer RNN fusion
      
      * Style fixes to multi-layer RNN
      
      * added callback routine for bi-directional rnn
      
      * fix rnn op ctor, rnn mkldnn emitter to accommodate bi-directional rnn
      
      * style fix
      
      * added helper function for rnn's to query direction and cell_type
      
      * fix clang error
      
      * - unit test case for bi rnn fusion
      - style fix
      
      * - updated bi-rnn graph pass to handle reverse and reverse_seq ops in the predicate
      - added bi-rnn INTERPRETER v/s CPU unit test case
      - add support in mkldnn_utils to create_md with tnc/ntc format
      
      * - added enum type to deduce rnn_type
      
      * Addressed PR comments
          - handle reshapes from {t, n, c} to {n, t, c} in the graph pass
      
      * fix style
      
      * fix clang error
      
      * fix style
      
      * i) move enum specific to rnn to separate header
      a444f7a9
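      The TNC/NTC reorders mentioned in this entry are {time, batch} transposes of a 3-D activation tensor; the real work is done by MKL-DNN reorder primitives, so the loop below is only a reference sketch of the layout change.

        #include <cstddef>
        #include <vector>

        // Copy a dense NTC tensor (batch N, time T, channels C) into TNC layout.
        // Illustrative stand-in for the MKL-DNN reorder primitive used in the pass.
        std::vector<float> ntc_to_tnc(const std::vector<float>& src,
                                      size_t N, size_t T, size_t C)
        {
            std::vector<float> dst(src.size());
            for (size_t n = 0; n < N; ++n)
                for (size_t t = 0; t < T; ++t)
                    for (size_t c = 0; c < C; ++c)
                        dst[(t * N + n) * C + c] = src[(n * T + t) * C + c];
            return dst;
        }

      Swapping the src and dst index expressions gives the TNC -> NTC direction used for the RNN output reorder.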
  3. 02 Feb, 2019 1 commit
    • Pruthvi/fix input matrix fusion (#2381) · 917efb94
      Pruthvi authored
      * - check to verify that the data_slices share the same weights
      
      * add the serialized graph
      
      * - explicitly fuse the data slices, so all the parameters partitioned by slices are in contiguous memory locations
      - fixes all the failing test cases
      917efb94
  4. 06 Dec, 2018 1 commit
    • Pruthvi/fix rnn precision (#1874) · 73da681a
      Pruthvi authored
      * - Added reorder support for rnn weights_layer/iter
      
      * i) fixed compilation issues ii) working but still observing precision error
      
      * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
      
      * i) added support for src reorder to TNC from NTC
      
      * reorder support for rnn output from NTC to TNC
      
      * - added support for rnn weight reorder ldgoi -> ldigo
      - code refactor for lstm/rnn kernel in mkldnn emitter
      
      * - refactor rnn mkldnn kernel, change variable names
      
      * fix RNN codegen kernel
      
      * disable layer rnn fusion pass, to test CI
      
      * method to validate recurrent rnn inputs
      
      * add correlated matches for Recurrent RNN PM
      
      * - simplify reorder logic for rnn_weights
      - fix graph pattern for fusing rnn cell across time steps
      
      * do weights reorders in rnn timesteps fusion
      
      * refactored LSTM graph pass
      
      * - Bug fix for finding the lstm inputs deterministically
      - Refactored LSTM graph pass to single pass
      - made changes to LSTM RNN time step fusion graph pass
      
      * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
      
      * fix compilation error
      
      * Fix GNMT rnn fusion
      
      * check if the node is in use before replacing in RNN graph passes
      
      * i) fix style ii) fix topo sort issue in RNN graph pass
      
      * style fix
      
      * fix bug in simplify_concat pass
      
      * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2
      
      * cse for convert layout
      
      * addressed PR comments
      
      * - optimization pass to remove  Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
      - conditional fusing of LSTM cells only for the decoder
      
      * made changes to multi layer RNN fusion callback
      
      * fix asserts in RNN op
      
      * - added support to fuse layers when slc=dlc for RNN cells
      - bug fix on the sanity checks for RNN Op
      
      * - support RNN layer fusion till slc = dlc
      - bug fixes in multi layer rnn fusion call back
      
      * capture reshape in the RNN weights
      
      * Addressed PR comments
      
      * - added comments in multi layer PM call back
      - fuse only if slc == dlc across layers
      
      * restore deleted 3_lstm_cell_forward.json file
      
      * fix typo
      
      * fix failing unit tests
      
      * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input.
      
      * Address PR feedback: process in place slice after propagating in place input.
      
      * Set INTERMEDIATE role before propagating in place input.
      
      * Do not add temporaries to the variable name map before propagating in place input in codegen.
      
      * Fix a bug in codegen.
      
      * Fix a bug in codegen slice.
      
      * reenable disabled rnn unit test
      
      * fix compiler error
      
      * - bug fix in the slicing logic for the layer fused rnn cell
      - fix failing rnn unit test
      
      * - Addressed PR comments
      - removed redundant checks from the rnn graph pass
      - simplified rnn call back replace node logic
      
      * - added new multilayer rnn *.json file
      - fix test case
      
      * [PRIVATE BRANCH] Style fixes (#2080)
      
      * Style fixes
      
      * change order of lstm gates
      
      * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)
      
      * Style fixes for single-layer RNN fusion
      
      * Style fixes to multi-layer RNN
      
      * style fix
      
      * disable GPU test
      73da681a
  5. 26 Sep, 2018 1 commit
    • add nGraph quantize op (#1661) · d640fac3
      Adam Straw authored
      * adding nGraph Quantize op
      
      * unit test failing for floating point exception
      
      * unit test working in float
      
      * unit test working in uint8
      
      * improved type checking and polished unit test - passing
      
      * quantized axes working
      
      * inclusive project method
      
      * add round mode
      
      * TODO cleanup
      
      * code format
      
      * adding serializer support - fails build
      
      * add serializer support
      
      * make CPU quantize op work (new tests for int8, clamp)
      
      * fix build failure
      
      * fix GPU build issue
      
      * fix GPU unit test manifest
      
      * use quantized offset
      
      * add is_quantized field to element::Type
      
      * add reduce function to coordinate.hpp
      d640fac3
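      The Quantize op introduced here maps real values to integers, roughly q = clamp(round(x / scale) + offset, q_min, q_max), with a selectable round mode. Below is a scalar uint8 sketch under that reading; the function name and the ties-to-even choice are assumptions, not the nGraph API.

        #include <algorithm>
        #include <cmath>
        #include <cstdint>

        // Scalar uint8 quantization sketch: q = clamp(round(x / scale) + offset, 0, 255).
        // std::nearbyint rounds ties to even under the default rounding mode, standing
        // in for one of the selectable round modes mentioned above.
        uint8_t quantize_u8(float x, float scale, uint8_t offset)
        {
            float q = std::nearbyint(x / scale) + static_cast<float>(offset);
            return static_cast<uint8_t>(std::min(255.0f, std::max(0.0f, q)));
        }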
  6. 30 Jun, 2018 1 commit
    • Pruthvi/fix rnn output (#1135) · c4c24cb0
      Pruthvi authored
      * - Fixed replace output for the multi layer recurrent cell state tensor output
      - Modified rnn add_output to consider direction and n_layer while calculating the output size for mkldnn dst_layer and dst_iter
      
      * fix unit test failure
      c4c24cb0
  7. 15 Jun, 2018 1 commit
    • RNN fusion across layers (#1085) · f75b8006
      Pruthvi authored
      * - Added graph pass for fusing RNN op across layer
      - Added test case for INTERPRETER v/s CPU for verifying layer-fused RNN
      - more sanity checks in the RNN fusion graph pass
      - added support to replace the recurrent cell state correctly in the fused RNN op
      
      * Fixed multi layer rnn fusion unit test failure
      
      * Addressed PR comments
      f75b8006
  8. 07 Jun, 2018 1 commit
    • ngraph-1676 batch dot fusion (#1071) · 6f5e3ac7
      Louis Feng authored
      * batch dot pattern wip.
      
      * batch dot pattern wip.
      
      * added batch dot op.
      
      * batch dot compute testing.
      
      * correct gemm parameters.
      
      * renaming matrix fusions passes and update tests.
      
      * clean up.
      
      * clang format.
      
      * more clean ups.
      
      * clang format.
      
      * added CPUBatchDotFusion to default cpu passes.
      
      * added missing header.
      
      * added element type check.
      6f5e3ac7
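      Batch dot fuses a group of per-slice Dot ops into one batched GEMM call with the correct gemm parameters. As a reference for the intended semantics only (the pass itself emits BLAS calls), here is a naive row-major batched matrix product:

        #include <cstddef>
        #include <vector>

        // Reference semantics for batch dot: C[b] = A[b] * B[b] over a batch of
        // row-major M x K and K x N matrices. The fusion pass emits a batched GEMM
        // instead of this naive loop.
        void batch_dot(const std::vector<float>& A, const std::vector<float>& B,
                       std::vector<float>& C, size_t batch, size_t M, size_t K, size_t N)
        {
            for (size_t b = 0; b < batch; ++b)
                for (size_t m = 0; m < M; ++m)
                    for (size_t n = 0; n < N; ++n)
                    {
                        float acc = 0.0f;
                        for (size_t k = 0; k < K; ++k)
                            acc += A[b * M * K + m * K + k] * B[b * K * N + k * N + n];
                        C[b * M * N + m * N + n] = acc;
                    }
        }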
  9. 31 May, 2018 1 commit
  10. 23 May, 2018 1 commit
    • LSTM fusion + RNN fusion across time slices for single layer (#826) · 1d08f073
      Pruthvi authored
      * - Added pattern matcher for LSTM cell
      
      * WIP added support to replace lstm cell instead of subgraph
      
      * WIP LSTM pattern matcher, fuses recurrent cells
      
      * WIP added RNN CPU op
      
      * WIP mkldnn emitter code for fprop RNN
      
      * WIP RNN mkldnn integration
      - Added mkldnn kernel for uni-directional LSTM in the CPU emitter
      
      * add a getter for root node
      
      * recurrent graph rewrite
      
      * fix perms, rename match_root -> get_match_root
      
      * fix comp errors
      
      * make match_root return the topmost match; fix tests
      
      * - WIP GetOutputElement for handling multiple LSTM o/ps
      - use RecurrentGraphRewrite for replacing node after matching LSTM cells
      
      * WIP LSTM multi Output + debug prints
      
      * moved LSTM fusion to cpu_fusion
      
      * WIP added RNN superfused OP
      
      * WIP towards RNN layer fusion
      
      * WIP multiple output slicing RNN
      
      * WIP RNN multiple o/ps fusion across layer
      
      * WIP corrected input params for fused RNN OP
      
      * concat corresponding params across different LSTMs to form inputs to the fused RNN op
      
      * i) Added test case for RNN kernel ii) runs without errors
      
      * refactored and moved LSTM class to standalone file
      
      * Rename RNN -> Rnn , LSTM -> Lstm
      
      * WIP replace lstm slices to the consumer op
      
      * Slicing works on multiple RNN layers
      
      * fixed all bugs
      
      * - Added CPU RNN Recurrent Fusion
      - Added CPU LSTM fusion
      - removed debug code
      - style fix
      
      * - Added support to compute src_iter and dst_iter instead of taking zero_memory_desc
      - Added unit test to compute one LSTM cell
      
      * changed RNN op signature to accept the number of states in the basic unit of an RNN (GRU/LSTM/vanilla RNN) cell
      
      * added sanity checks for RNN op
      
      * Fixed issue related to patching the graph while replacing the RNN sliced outputs
      
      * Fixed issue to feed the input symbols in the order X0, X1, ...Xt to the RNN op
      
      * Added unit test for multi layer RNN fusion
      
      * Removed debug statements
      
      * i) Added multilayered serialized graph ii) fixed compilation issue
      
      * Addressed PR comments
      
      * i) WIP MKLDNN layout for RNN Op ii) added test case for INTERPRETER v/s CPU Rnn results
      
      * - Fixed bug w.r.t. src_layer feature size in rnn mkldnn emitter code
      - Refactored cpu_fusion rnn test case
      
      * merge origin/master with branch pruthvi/lstm_fusion
      
      * style fix
      
      * Added test case for multiple RNN layers
      
      * i) make rnn an mkldnn op if it meets the constraints ii) assert if rnn is not an mkldnn op
      
      * fix unit test failure
      
      * - Added support to reliably identify the hidden state and input symbols from the nodes collected by the pattern matcher
      - Fixed failing unit tests
      
      * style fix
      
      * - removed "node type" dependency to replace the intermediate LSTM outputs
      
      * Addressed PR comments
      
      * Fix unit test
      
      * - added MKLDNN emitter for LSTM op
      - graph pass to concat LSTM input recurrent state tensors
      - CPU layout assignment for LSTM Op
      - Fixed bug in rnn/lstm unit test's
      - made changes to use replace_output instead of replace_node for replacing matched graph nodes in LSTM/RNN fusion pass
      
      (cherry picked from commit d16fc709265cc0a73e60c6d5f6d2878e7b908aca)
      
      * style fix
      
      * Renamed passes and style fixes
      1d08f073
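      For orientation, the cell these fusion passes match computes the standard four LSTM gates. A compact scalar sketch, assuming the affine gate inputs are precomputed; the gate order here is illustrative only (MKL-DNN prescribes its own):

        #include <cmath>

        struct LstmState { float h; float c; };

        // One scalar LSTM step, assuming the four affine gate inputs
        // z* = W*x + R*h_prev + b have already been computed.
        LstmState lstm_cell(float zi, float zf, float zc, float zo, float c_prev)
        {
            auto sigmoid = [](float v) { return 1.0f / (1.0f + std::exp(-v)); };
            float i = sigmoid(zi); // input gate
            float f = sigmoid(zf); // forget gate
            float o = sigmoid(zo); // output gate
            float c = f * c_prev + i * std::tanh(zc); // new cell state
            return {o * std::tanh(c), c};             // new hidden and cell state
        }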
  11. 30 Mar, 2018 1 commit
  12. 09 Mar, 2018 1 commit
    • Pruthvi/sigmoid (#614) · 5885c09a
      Pruthvi authored
      * - Added sigmoid fusion pass
      - added mkldnn emitter code for sigmoid
      
      * - corrected sigmoid expected values
      - add layout assignment for sigmoid op
      
      * - added asserts in cpu fusion for sigmoid
      - style fix
      
      * remove debug prints
      
      * NGMX-371 #comment addressed PR comments: i) Added sigmoid unit test case with 3D input ii) support in cpu_emitter for sigmoid to handle all input shapes
      
      * NGMX-371 #comment use shape_size() to calculate the 1d input size
      5885c09a
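      The shape_size() remark above works because an elementwise sigmoid is shape-agnostic: the kernel needs only the flattened element count. A minimal sketch under that assumption (the accumulate call mirrors what nGraph's shape_size() computes):

        #include <cmath>
        #include <cstddef>
        #include <functional>
        #include <numeric>
        #include <vector>

        // Elementwise sigmoid over a tensor of any rank: the kernel needs only the
        // flattened element count, which is what nGraph's shape_size() computes.
        void sigmoid_kernel(const float* in, float* out, const std::vector<size_t>& shape)
        {
            size_t count = std::accumulate(shape.begin(), shape.end(), size_t{1},
                                           std::multiplies<size_t>());
            for (size_t i = 0; i < count; ++i)
                out[i] = 1.0f / (1.0f + std::exp(-in[i]));
        }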
  13. 20 Feb, 2018 1 commit
  14. 14 Feb, 2018 1 commit
    • pattern matcher for BatchnormFprop + mkldnn integration in the CPU emitter (#468) · 34b1322d
      Pruthvi authored
      * fuse dot(a,b) + c
      
      cblas_gemm working on mlp
      
      rebase & small fixes
      
      enable debug output
      
      support replacing function's outputs
      
      * WIP pattern matching for variance
      
      * - Added pattern matcher graph to look up variance (subgraph) in bn
      - Added test case to verify the variance graph pattern
      
      * added batch norm mean pattern matcher.
      
      * remove reshapes
      
      (cherry picked from commit ecad321fb1b1bc3f7facda229beb940118ca0701)
      
      * fixed mean test to use Matcher.
      
      * resolve merge conflict in test/pattern.cpp
      
      * WIP bn fprop pattern
      
      * fprop bn fusion working
      
      * - Added unit test case to read the bn serialized *.json file and run bn fprop fusion pass
      - Added batchnorm header file and defined the bn class to emit the mkldnn kernel
      - Added pattern matcher for fprop bn in CPU graph_rewrite pass
      
      * WIP MKLDNN fprop bn emitter code
      
      * completed fprop batchnorm kernel in CPU emitter
      
      * fixed bug in the emitter code for fprop bn
      
      * - Fixed compilation issues
      - unit tests are passing for bn emitter fprop code
      
      * Added support to compute fprop bn with mean and variance as input
      
      * resolved compilation issues
      
      * refactored bn fprop code
      
      * - added batchnorm src file to the CMakeFilelist
      - moved bn fusion under CPU runtime/pass/cpu_fusion
      - fixed compilation issue
      
      * Resolved compilation issues in bn emitted code
      
      * Added debug statements in fprop bn emitted code
      
      * added batchnorm.cpp src file
      
      * - Added test case to test fprop batchnorm with known tensor values
      - fixed bug related to defining weights in fprop bn
      
      * - Added test case for fprop batchnorm Op
      - Added test case for mean and variance pattern matcher
      - Added fprop bn *.json file with input having 4 dims (mb2c3h2w2)
      - refactored fprop bn op class
      
      * Style fix
      
      * - Removed Debug symbols
      
      * - Fixed header template with correct year
      - appended mkldnn.hpp in the CPU generated code
      
      *  Addressed PR review comments
       -  added support for batchnorm op in serializer and de-serializer
       - added more sanity checks in the bn constructor
       - renamed "BatchnormFprop" -> BatchNorm
      
      * - Addressed PR review comments
      - replaced auto with specific mkldnn::type in emitted bn kernel
      - modified function signature to take 'eps' as double instead of <Node> type
      
      * added missing header files, resolved compilation issue
      
      * style fix
      
      * Addressed PR comments
      1. initialized member variables for bn in the same order as they are defined
      2. renamed bn member variables to start with m_* as per coding convention
      3. moved bn fusion test to test/cpu_fusion.cpp
      4. style fix
      5. added more checks to evaluate type and shape of inputs to bn
      
      * Added support for EMITDECL macro for batchnorm
      
      * - made correction to batchnorm src file name batchnorm -> batch_norm as per coding guidelines
      - corrected bn copy_with_new_args() method
      
      * Removed redundant SqrtOp support in serializer
      34b1322d
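      The mean and variance subgraphs that the matchers above recognize compute, per channel, mean = sum(x)/n and var = sum((x - mean)^2)/n, then normalize as gamma * (x - mean) / sqrt(var + eps) + beta, with eps passed as a double per the bullet above. A single-channel reference sketch (semantics only, not the MKL-DNN kernel):

        #include <cmath>
        #include <cstddef>
        #include <vector>

        // Fprop batch norm over one channel's n values; gamma and beta are that
        // channel's scale and shift. Reference semantics only, not the MKL-DNN kernel.
        void bn_fprop_channel(const std::vector<float>& x, std::vector<float>& y,
                              float gamma, float beta, double eps)
        {
            const size_t n = x.size();
            float mean = 0.0f;
            for (float v : x) mean += v;
            mean /= n;
            float var = 0.0f;
            for (float v : x) var += (v - mean) * (v - mean);
            var /= n;
            const float inv_std = 1.0f / std::sqrt(var + static_cast<float>(eps));
            y.resize(n);
            for (size_t i = 0; i < n; ++i)
                y[i] = gamma * (x[i] - mean) * inv_std + beta;
        }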
  15. 01 Feb, 2018 1 commit
  16. 17 Jan, 2018 1 commit
  17. 29 Dec, 2017 1 commit
    • Get value types out of public API, multi-values from Function (#340) · d092cb91
      Scott Cyphers authored
      * Function can have multiple results
      Remove external use of ValueType, TupleType, Tuple
      Remove many external uses of Output and Input
      
      * corresponding CPU backend changes
      
      * Update master changes.
      
      * Remove type arg from Function, add changes.md
      
      * Merge changes.
      
      * Move bodies to .cpp, add brief doc
      
      * Merge CPU changes.
      
      * Remove xla includes from non-xla files
      
      * Remove xla from tests
      
      * First part of xla tuple support
      
      * change fprop_cache to assume multi-output bprop functions
      
      * New wrappers for handling tuples with XLA
      
      * Review comments
      
      * remove old xla files
      
      * fix merge errors
      
      * hand edit models to use multi output instead of tuples
      d092cb91
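      After this change a Function can return several results directly rather than wrapping them in a Tuple. A hedged caller-side sketch; the constructor shape and spellings (NodeVector, ParameterVector, element::f32) follow later nGraph releases and should be treated as assumptions here:

        #include <memory>
        #include "ngraph/ngraph.hpp"

        // Hypothetical usage sketch of a two-result Function with no ValueType or
        // Tuple involved; spellings follow later nGraph releases.
        std::shared_ptr<ngraph::Function> make_two_output_function()
        {
            using namespace ngraph;
            auto a = std::make_shared<op::Parameter>(element::f32, Shape{2, 2});
            auto b = std::make_shared<op::Parameter>(element::f32, Shape{2, 2});
            auto sum = std::make_shared<op::Add>(a, b);
            auto prod = std::make_shared<op::Multiply>(a, b);
            return std::make_shared<Function>(NodeVector{sum, prod}, ParameterVector{a, b});
        }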
  18. 28 Dec, 2017 1 commit
  19. 12 Dec, 2017 1 commit