1. 18 Jan, 2019 1 commit
    • Adds backprop to BatchDot op, allowing fusion in training. (#2297) · ef778693
      Louis Feng authored
      * batch dot bprop WIP.
      
      * WIP.
      
      * testing.
      
      * clean up debug code.
      
      * comments and var name change.
      
      * clean up.
      
      * format style, batch dot differentiable pass.
      
      * removed debug output.
      
      * added unit test to autodiff, refactored make_function -> make_function_from_file.
      
      * fixed build warning.
      
      * fixed gpu build error.
      
      * clang format fix.
      
      * allow test_tools.cpp to find SERIALIZED_ZOO
      
      * remove cmake redef.
      
      * fix unused macro.
      
      * making test cpu only.
      
      * testing build var
      
      * macro test
      
      * verbose makefile test
      
      * style fix
      
      * verbose make
      
      * test/util needs test/models.
      
      * removed debug output.
      
      * refactor fusion type.
      
      * refactor fusion type.
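      For reference, the math the new BatchDot backprop has to produce, as a minimal
      standalone sketch (plain loops, not the nGraph kernel added in this commit): for
      each batch b with C[b] = A[b] x B[b], the adjoints are dA[b] = dC[b] x B[b]^T and
      dB[b] = A[b]^T x dC[b].

        #include <cstddef>
        #include <vector>
        using std::size_t;

        // Reference gradients for a batched dot C[b] = A[b] (MxK) * B[b] (KxN).
        // Tensors are flat row-major: A is [batch, M, K], B is [batch, K, N], dC is [batch, M, N].
        void batch_dot_backprop(const std::vector<float>& A, const std::vector<float>& B,
                                const std::vector<float>& dC, std::vector<float>& dA,
                                std::vector<float>& dB, size_t batch, size_t M, size_t K, size_t N)
        {
            dA.assign(batch * M * K, 0.f);
            dB.assign(batch * K * N, 0.f);
            for (size_t b = 0; b < batch; ++b)
                for (size_t m = 0; m < M; ++m)
                    for (size_t n = 0; n < N; ++n)
                    {
                        float g = dC[b * M * N + m * N + n];
                        for (size_t k = 0; k < K; ++k)
                        {
                            dA[b * M * K + m * K + k] += g * B[b * K * N + k * N + n]; // dC * B^T
                            dB[b * K * N + k * N + n] += A[b * M * K + m * K + k] * g; // A^T * dC
                        }
                    }
        }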
  2. 03 Jan, 2019 1 commit
  3. 19 Dec, 2018 1 commit
  4. 07 Dec, 2018 2 commits
    • Update slice kernels (#2180) · a16c4961
      Jayaram Bobba authored
      * initial commit for update slice op
      
      * Finished up update_slice fusion and added codegen support
      
      * style fixes
      
      * Added unit test for in-place update-slice strided
      
      * change pattern name
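      Roughly what the in-place strided update-slice computes, sketched in 2-D with plain
      loops (the real kernel handles arbitrary rank and MKLDNN layouts; names below are
      illustrative only):

        #include <cassert>
        #include <cstddef>
        #include <vector>
        using std::size_t;

        // Copies `update` (u_rows x u_cols) into `data` (rows x cols), starting at
        // (row0, col0) and stepping by the given strides; only the selected elements
        // of `data` are overwritten, everything else is left in place.
        void update_slice_2d(std::vector<float>& data, size_t rows, size_t cols,
                             const std::vector<float>& update, size_t u_rows, size_t u_cols,
                             size_t row0, size_t col0, size_t row_stride, size_t col_stride)
        {
            if (u_rows == 0 || u_cols == 0)
                return;
            assert(row0 + (u_rows - 1) * row_stride < rows);
            assert(col0 + (u_cols - 1) * col_stride < cols);
            for (size_t i = 0; i < u_rows; ++i)
                for (size_t j = 0; j < u_cols; ++j)
                    data[(row0 + i * row_stride) * cols + (col0 + j * col_stride)] =
                        update[i * u_cols + j];
        }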
    • Backend API change pre-work (#2064) · e0933553
      Robert Kimball authored
      * change compile call to return Handle
      
      * make CPU require compile() before call()
      
      * fix unit tests to call compile() before call()
      
      * fix failing ops
      
      * update unit test
      
      * revert some changes
      
      * more fixups
      
      * more diff cleanup
      
      * a few more issues addressed
      
      * more fixes
      
      * update API
      
      * more updates
      
      * fix test_ops.py
      
      * fix
      
      * another attempt to fix
      
      * fix unit test
      
      * fix test error
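      The calling pattern this pre-work moves toward, shown with toy stand-in types
      (Function, Handle and Backend below are stubs, not the real nGraph declarations):
      compile() is called once up front and returns a handle, and call() is then invoked
      on that handle.

        #include <memory>
        #include <vector>

        struct Function
        {
            // stands in for the graph to be executed
        };

        struct Handle
        {
            // runs the previously compiled function on the given tensors
            void call(std::vector<float>& outputs, const std::vector<float>& inputs)
            {
                outputs = inputs; // placeholder "execution"
            }
        };

        struct Backend
        {
            // compile() must now be called before any call()
            std::shared_ptr<Handle> compile(const std::shared_ptr<Function>&)
            {
                return std::make_shared<Handle>();
            }
        };

        int main()
        {
            auto f = std::make_shared<Function>();
            Backend backend;
            auto handle = backend.compile(f); // compile once...
            std::vector<float> in{1, 2, 3}, out;
            handle->call(out, in);            // ...then call as many times as needed
        }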
  5. 06 Dec, 2018 2 commits
    • DEX Loop Kernel (updated) (#2156) · 8fc481a3
      Nick Korovaiko authored
      * one output
      
      passing tests
      
      clean up
      
      fix build breaks
      
      * move generators into a separate file
    • Pruthvi/fix rnn precision (#1874) · 73da681a
      Pruthvi authored
      * - Added reorder support for rnn weights_layer/iter
      
      * i) fixed compilation issues ii) working but still observing precision error
      
      * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
      
      * i) added support for src reorder to TNC from NTC
      
      * reorder support for rnn output from NTC to TNC
      
      * - added support for rnn weight reorder ldgoi -> ldigo
      - code refactor for lstm/rnn kernel in mkldnn emitter
      
      * - refactor rnn mkldnn kernel, change variable names
      
      * fix RNN codegen kernel
      
      * disable layer rnn fusion pass, to test CI
      
      * method to validate recurrent rnn inputs
      
      * add correlated matches for Recurrent RNN PM
      
      * - simplify reorder logic for rnn_weights
      - fix graph pattern for fusing rnn cell across time steps
      
      * do weights reorders in rnn timesteps fusion
      
      * refactored LSTM graph pass
      
      * - Bug fix for finding the lstm inputs deterministically
      - Refactored LSTM graph pass to single pass
      - made changes to LSTM RNN time step fusion graph pass
      
      * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
      
      * fix compilation error
      
      * Fix GNMT rnn fusion
      
      * check if the node is in use before replacing in RNN graph passes
      
      *  i) fix style ii) fix topo sort issue in RNN graph pass
      
      * style fix
      
      * fix bug in simplify_concat pass
      
      * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2
      
      * cse for convert layout
      
      * addressed PR comments
      
      * - optimization pass to remove  Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
      - conditional fusing of LSTM cells only for the decoder
      
      * made changes to multi layer RNN fusion callback
      
      * fix asserts in RNN op
      
      * - added support to fuse layers when slc=dlc for RNN cells
      - bug fix on the sanity checks for RNN Op
      
      * - support RNN layer fusion till slc = dlc
      - bug fixes in multi layer rnn fusion call back
      
      * capture reshape in the RNN weights
      
      * Addressed PR comments
      
      * - added comments in multi layer PM call back
      - fuse only if slc == DLC across layers
      
      * restore deleted 3_lstm_cell_forward.json file
      
      * fix typo
      
      * fix failing unit tests
      
      * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input.
      
      * Address PR feedback: process in place slice after propagating in place input.
      
      * Set INTERMEDIATE role before propagating in place input.
      
      * Do not add temporaries to the variable name map before propagating in place input in codegen.
      
      * Fix a bug in codegen.
      
      * Fix a bug in codegen slice.
      
      * reenable disabled rnn unit test
      
      * fix compiler error
      
      * - bug fix in the slicing logic for the layer fused rnn cell
      - fix failing rnn unit test
      
      * - Addressed PR comments
      - removed redundant checks from the rnn graph pass
      - simplified rnn call back replace node logic
      
      * - added new multilayer rnn *.json file
      - fix test case
      
      * [PRIVATE BRANCH] Style fixes (#2080)
      
      * Style fixes
      
      * change order of lstm gates
      
      * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)
      
      * Style fixes for single-layer RNN fusion
      
      * Style fixes to multi-layer RNN
      
      * style fix
      
      * disable GPU test
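      One of the reorders these passes keep inserting is a layout change between NTC
      (batch, time, channels) and TNC; a minimal sketch of that reorder, assuming dense
      row-major storage (illustrative only, not the MKLDNN reorder primitive):

        #include <cstddef>
        #include <vector>
        using std::size_t;

        // src is laid out [N][T][C]; the result is laid out [T][N][C].
        std::vector<float> ntc_to_tnc(const std::vector<float>& src, size_t N, size_t T, size_t C)
        {
            std::vector<float> dst(N * T * C);
            for (size_t n = 0; n < N; ++n)
                for (size_t t = 0; t < T; ++t)
                    for (size_t c = 0; c < C; ++c)
                        dst[(t * N + n) * C + c] = src[(n * T + t) * C + c];
            return dst;
        }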
  6. 05 Dec, 2018 1 commit
    • Support for 5D batchnorm (#2055) · d4f8bfdc
      Pruthvi authored
      * - modified cpu_assignment pass to support bn with input 5D
      - added test cases for 5D bn and 5D bn+relu
      
      * - Address PR comments
      - used mkldnn_utils to validate bn for mkldnn
      
      * fix compilation error
      
      * Addressed PR comments
      - added helpers in mkldnn_utils for assigning ngraph Op as MKLDNN op
      - helper function for bn mkldnn assignment
      
      * fix clang error
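      For context, the math a 5-D (NCDHW) batch norm applies, as a plain per-channel
      reference loop rather than the MKLDNN path this commit enables (inference flavour,
      with precomputed mean/variance):

        #include <cmath>
        #include <cstddef>
        #include <vector>
        using std::size_t;

        // Normalizes x in place: each channel c gets its own mean/var and is then
        // scaled by gamma[c] and shifted by beta[c].
        void batchnorm_ncdhw(std::vector<float>& x, const std::vector<float>& gamma,
                             const std::vector<float>& beta, const std::vector<float>& mean,
                             const std::vector<float>& var, size_t N, size_t C, size_t D,
                             size_t H, size_t W, float eps = 1e-5f)
        {
            size_t spatial = D * H * W;
            for (size_t n = 0; n < N; ++n)
                for (size_t c = 0; c < C; ++c)
                {
                    float scale = gamma[c] / std::sqrt(var[c] + eps);
                    float* p = x.data() + (n * C + c) * spatial;
                    for (size_t i = 0; i < spatial; ++i)
                        p[i] = scale * (p[i] - mean[c]) + beta[c];
                }
        }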
  7. 28 Nov, 2018 1 commit
    • Cyphers/bnorm back (#2129) · 403a09ce
      Scott Cyphers authored
      * Fix batchnorm argument order, cleanup some comments, fix backprop
      
      * Merge error
      
      * Clean up training function, organize inference test
      
      * BatchNormInference tests
      
      * Training case
      
      * Training test
      
      * Fix autodiff BatchNorm test
      
      * Cleanup
      
      * Move file to doc checkout
      
      * Update disabled test name in igpu manifest
      Fix unused variable
      
      * Unit tests disabled
      
      * Review comments
  8. 21 Nov, 2018 1 commit
    • Adding leaky relu (#2096) · 587b96e5
      Jayaram Bobba authored
      * Adding leaky relu
      
      * Silence compiler warning around fp compares
      
      * Fix copy-paste error and enable in-place for relu mkldnn kernels
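      The op itself is tiny; a reference version of the kernel (illustrative only, not
      the in-place MKLDNN path mentioned above):

        #include <cstddef>
        using std::size_t;

        // f(x) = x for x > 0, alpha * x otherwise.
        void leaky_relu(const float* in, float* out, size_t count, float alpha)
        {
            for (size_t i = 0; i < count; ++i)
                out[i] = in[i] > 0.0f ? in[i] : alpha * in[i];
        }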
  9. 16 Nov, 2018 1 commit
  10. 11 Nov, 2018 1 commit
    • add isfinite check for all_close (#2028) · 702d465a
      Fenglei authored
      * add isfinite check
      
      * style
      
      * output 5 diff and total diff
      
      * output limit of diff for all_close_f
      
      * fix bug
      
      * disable tests
      
      * remove failing unit test that does not make sense.
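      A sketch of the kind of guard this adds, assuming a simple element-wise tolerance
      check (not the actual ngraph::test::all_close implementation): non-finite values
      fail the comparison immediately instead of slipping through the tolerance test.

        #include <cmath>
        #include <cstddef>
        #include <vector>
        using std::size_t;

        bool all_close(const std::vector<float>& a, const std::vector<float>& b,
                       float rtol = 1e-5f, float atol = 1e-8f)
        {
            if (a.size() != b.size())
                return false;
            for (size_t i = 0; i < a.size(); ++i)
            {
                if (!std::isfinite(a[i]) || !std::isfinite(b[i]))
                    return false; // NaN/Inf can never be "close"
                if (std::abs(a[i] - b[i]) > atol + rtol * std::abs(b[i]))
                    return false;
            }
            return true;
        }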
  11. 07 Nov, 2018 1 commit
  12. 31 Oct, 2018 1 commit
    • Change Backend::create to return std::unique_ptr<Backend> (#1909) · 05a404a8
      Robert Kimball authored
      * create unique_ptr backend
      
      * unit test cleanup
      
      * address more code that was recently added
      
      * change from reference to pointer when passing backend to reduce the number of lines changed.
      
      * fix build error
      
      * fix python wrapper
      
      * style
      
      * more specific treatment for unique_ptr
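      The ownership shape this change introduces, with stand-in types (create_backend and
      run_test below are illustrative, not the real nGraph declarations): the factory
      returns std::unique_ptr so the caller owns the backend, and helpers take a plain
      pointer so existing call sites change as little as possible.

        #include <memory>
        #include <string>

        struct Backend
        {
            virtual ~Backend() = default;
        };

        struct CpuBackend : Backend
        {
        };

        std::unique_ptr<Backend> create_backend(const std::string& /*type*/)
        {
            return std::make_unique<CpuBackend>();
        }

        void run_test(Backend* backend) // pointer rather than reference, per the commit
        {
            (void)backend;
        }

        int main()
        {
            auto backend = create_backend("CPU"); // caller owns it via unique_ptr
            run_test(backend.get());
        }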
  13. 30 Oct, 2018 1 commit
    • Gauri/groupconv batchnorm (#1900) · c637d629
      gaurides authored
      * Initial implementation of GroupConv+BatchNorm fusion
      
      * Added GroupConv+BatchNorm with Relu fusion
      
      * Added changes to fuse with BoundedRelu
      
      * Changed BoundedRelu to Relu
      
      * Added test; Code cleanup
      
      * Code formatting
      
      * Removed dead code
      
      * Added test cases and other misc
      
      * Bug fix in group conv callback and general cleanup
      
      * Address PR feedback
      
      * Minor edit to comment. MKLDNN divides both input and output channels by groups
      
      * Style fixes and PR feedback
  14. 22 Oct, 2018 1 commit
    • BatchNorm splitting into ops (2nd try) (#1828) · 1beec46b
      Nick Korovaiko authored
      * split bn into bn_inference bn_training
      
      * fix warnings
      
      * Add GPU support for the new BN ops (#1569)
      
      * Add GPU support and change batchnorm_globalstats test to use BNInference.
      
      * Changed test back to using BNTraining for global stats and updated cudnn backend to account for it.
      
      * Fix issues in merge with master.
      
      * Formatting.
      
      * CPU fixes
      
      * remove 5-arg training BN for now
      
      * more fixes
      
      * python batchnorm changes
      
      * fix onnx_import
      
      * fix a call to the BatchNormInference c-tor
      
      * yet another fix to BatchNormInference c-tor
      
      * AND yet another fix to batchnorm_inference c-tor
      
      * ops.py
      
      * address adam's feedback
      
      * Remove unnecessary parameter/argument.
      
      * remove batch_norm_training_relu_with_global_stats
      
      * remove bn_relu (training)
  15. 15 Oct, 2018 1 commit
  16. 12 Oct, 2018 1 commit
  17. 08 Oct, 2018 1 commit
    • IAT: More convolution folding optimizations (#1712) · 00b4453d
      Jayaram Bobba authored
      * Check output shape when setting memory layout for slice op.
      
      * Miscellaneous fusion and other optimizations for inception-resnetv2
      - ConvBias Batchnorm folding
      - ConvBias Affine folding
      - Check if MKLDNN can slice a given layout and select layouts
        appropriately
      
      * Fixed unit test and bug in conv bias pattern
      
      * Addressed PR feedback
      
      * Addressed PR feedback
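      The arithmetic behind the ConvBias+BatchNorm folding above, as a standalone sketch:
      inference-time batch norm is an affine transform per output channel, so it can be
      absorbed into the convolution's weights and bias ahead of time:
          W'[oc] = W[oc] * gamma[oc] / sqrt(var[oc] + eps)
          b'[oc] = (b[oc] - mean[oc]) * gamma[oc] / sqrt(var[oc] + eps) + beta[oc]

        #include <cmath>
        #include <cstddef>
        #include <vector>
        using std::size_t;

        // weights is [out_channels, weights_per_channel] flattened; bias is [out_channels].
        void fold_batchnorm_into_conv(std::vector<float>& weights, std::vector<float>& bias,
                                      const std::vector<float>& gamma, const std::vector<float>& beta,
                                      const std::vector<float>& mean, const std::vector<float>& var,
                                      size_t out_channels, size_t weights_per_channel, float eps = 1e-5f)
        {
            for (size_t oc = 0; oc < out_channels; ++oc)
            {
                float scale = gamma[oc] / std::sqrt(var[oc] + eps);
                for (size_t i = 0; i < weights_per_channel; ++i)
                    weights[oc * weights_per_channel + i] *= scale;
                bias[oc] = (bias[oc] - mean[oc]) * scale + beta[oc];
            }
        }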
  18. 05 Oct, 2018 1 commit
  19. 02 Oct, 2018 1 commit
    • Pruthvi/rnn fusion (#1677) · 18e41513
      Pruthvi authored
      * WIP input * weights rnn optimization
      
      * concat + slicing + replacing new node works
      
      * WIP unit test case of fusing rnn inputs
      
      * - Added unit test case for fusing rnn input weights
      - registered CPURnnMatFusion_v1/v2 in codegen and DEX
      
      * fixed redeclaration of a variable
      
      * Refactored rnn input transformation passes into a single pass
      
      * Refactored CPURnnMatFusion call back functions
      
      * change random generator range to include negative values in unit test
      
      * address PR comments
      
      * don't fuse if the shapes of the data slices don't match
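      The idea behind the fused input transform, sketched with a plain reference matmul
      (illustrative names, not the CPURnnMatFusion pass itself): T small per-timestep
      products x_t * W become one large product over the concatenated inputs, and slice t
      of the result is just rows [t*N, (t+1)*N).

        #include <cstddef>
        #include <vector>
        using std::size_t;

        // Row-major reference matmul: (rows x inner) * (inner x cols).
        std::vector<float> matmul(const std::vector<float>& a, const std::vector<float>& b,
                                  size_t rows, size_t inner, size_t cols)
        {
            std::vector<float> c(rows * cols, 0.f);
            for (size_t r = 0; r < rows; ++r)
                for (size_t k = 0; k < inner; ++k)
                    for (size_t j = 0; j < cols; ++j)
                        c[r * cols + j] += a[r * inner + k] * b[k * cols + j];
            return c;
        }

        // xs holds T time steps of shape N x I laid out contiguously, so the fused form
        // is a single (T*N x I) * (I x H) product instead of T separate ones.
        std::vector<float> fused_input_transform(const std::vector<float>& xs,
                                                 const std::vector<float>& W,
                                                 size_t T, size_t N, size_t I, size_t H)
        {
            return matmul(xs, W, T * N, I, H);
        }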
  20. 29 Sep, 2018 1 commit
  21. 21 Sep, 2018 1 commit
    • Add CPU horizontal fusion pass for inception. (#1577) · 2d2b3b2f
      Amy Zhuang authored
      * Add CPU horizontal fusion pass for inception.
      
      * Name change.
      
      * Move horizontal fusion to cpu_fusion.
      
      * Change horizontal fusion pass for inception to a general horizontal fusion pass.
      Add a unit test conv_horizontal_fusion to cpu_fusion.
      
      * Rename files.
      
      * Correct cpu_fusion.hpp.
      
      * Add NGRAPH_DEBUG.
      
      * Set native layout when input format of slice is nChw16c or nChw8c and lower bound of
      channels is not a multiple of 16 or 8.
  22. 14 Sep, 2018 1 commit
    • Cyphers/layout (#1602) · 2f79f707
      Scott Cyphers authored
      * Remove "view"
      Simplify layout
      
      * Fix merge error
      
      * fix build error
      
      * PR1602. IntelGPU backend. Compilation fixed.
  23. 13 Sep, 2018 1 commit
    • Handle unsupported op in nbench (#1531) · fe676f72
      Robert Kimball authored
      * add unsupported_op exception
      
      * unsupported_op test
      
      * add printout of unsupported op in model
      
      * fix GPU dispatcher check
      
      * fix test designation
      
      * catch exceptions on single file runs too
      
      * add unsupported_op exception where needed
      
      * remove unsupported_op class
      
      * add unassigned op exception
      
      * add unit test
      
      * catch unsupported op in nbench
      
      * add cpu test back
      
      * update all latest merges
      
      * mode change
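      The error-handling pattern described above, with stand-in types (unsupported_op and
      compile_model below are stubs, and the commit ultimately settles on an "unassigned
      op" exception): the backend throws a dedicated exception type and nbench catches it
      and reports the offending op instead of aborting the whole run.

        #include <iostream>
        #include <stdexcept>
        #include <string>

        struct unsupported_op : std::runtime_error
        {
            explicit unsupported_op(const std::string& op)
                : std::runtime_error("unsupported op: " + op)
            {
            }
        };

        void compile_model(const std::string& model)
        {
            // a real backend would walk the graph here; this stub always fails
            throw unsupported_op("SomeExoticOp in " + model);
        }

        int main()
        {
            try
            {
                compile_model("model.json");
            }
            catch (const unsupported_op& e)
            {
                std::cerr << e.what() << "\n"; // report and keep going
            }
        }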
  24. 11 Sep, 2018 1 commit
    • Add conv add fusion (#1526) · 37174c90
      gaurides authored
      * Add conv add fusion
      
      * Updated file permissions and cpu_fusion order
      
      * Formatted code using maint/apply-code-format.sh
      
      * Fixed minor review comments
      
      * Use NODE_VALIDATION_ASSERT instead of throw ngraph_error; upgrade baseline and fix issues
      
      * Some more fixes
  25. 29 Aug, 2018 2 commits
  26. 27 Aug, 2018 1 commit
  27. 15 Aug, 2018 1 commit
  28. 13 Aug, 2018 1 commit
  29. 10 Aug, 2018 1 commit
  30. 07 Aug, 2018 1 commit
    • Switch to using more expressive layout descriptors instead of numeric layout names (#1278) · 69c51c27
      Jayaram Bobba authored
      * Switch to using mkldnn memory descriptors for layout
      
      * More changes for using mkldnn descriptor instead of format
      
      * Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding
      
      * Rotate mkldnn layouts on transpose
      
      * Modifications to builder reshape to skip rotated layouts
      
      * More fixes to layouts and removes axis order from cpu layout descriptor
      
      * Code cleanup
      
      * Removed shuffle folding pass since the functionality is subsumed by the layout pass
      
      * Canonicalize a few more formats to keep MKLDNN happy.
      
      * Style fixes
      
      * Style fixes
      
      * Style fixes
      
      * Addressed PR feedback and added reshape passthrough for non-transpose cases
      
      * Adjust named formats for weights tensors to keep MKLDNN happy
      
      * Style fixes
      
      * resolved merge issues
  31. 18 Jul, 2018 1 commit
  32. 17 Jul, 2018 1 commit
    • Added more convolution variants to DEX (#1223) · 9bb0b653
      Jayaram Bobba authored
      * CPU Direct Execution: Implement ConvertLayout and refactor
      
      * CPU Direct Execution: Implement Convolution
      
      * 1) Adds computation reuse to direct execution
      2) Add avg_pool, broadcast and convolution_bias to direct execution
      3) Moved some computation reuse utility functions to graph_utils
      
      * Use lists instead of vectors to avoid reallocation overheads
      
      * - Added convolution variants to direct execution
      - Removed ConvolutionBiasRelu, use ConvolutionBias instead
      - Reduced code duplication by moving functionality to mkldnn_emitter
        from cpu_emitter
      
      * Style fix
      
      * Moved mkldnn build_convolution to a templated method
      
      * Style fix
      
      * refactored mkldnn conv bprop builders
      
      * Style fix
  33. 11 Jul, 2018 1 commit
  34. 03 Jul, 2018 1 commit
  35. 02 Jul, 2018 3 commits
    • move sigmoid to core fusion (#1132) · d05b5e39
      Sandeep authored
      * declare sigmoid for core fusion
      
      * add simple test for sigmoid
      
      * info fusion status
      
      * cp op as main op
      
      * builds as expected
      
      * move sigmoid fusion code
      
      * add reference kernel
      
      * sigmoid bprop reference kernel and clang-format
      
      * add delta to bprop
      
      * fprop called
      
      * compiles bprop
      
      * move tests
      
      * serializer support
      
      * address comments in code
      
      * add doc
      
      * naming similar to core ops
      
      * fix failing test
      
      * fix failing test
      
      * address clang issue
      
      * more changes
      
      * change test macro
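      Reference fprop/bprop kernels in the spirit of the ones moved here (plain loops,
      only the math): s(x) = 1 / (1 + exp(-x)), and the backward pass scales the incoming
      delta by s(x) * (1 - s(x)).

        #include <cmath>
        #include <cstddef>
        using std::size_t;

        void sigmoid(const float* in, float* out, size_t count)
        {
            for (size_t i = 0; i < count; ++i)
                out[i] = 1.0f / (1.0f + std::exp(-in[i]));
        }

        void sigmoid_backprop(const float* in, const float* delta, float* out, size_t count)
        {
            for (size_t i = 0; i < count; ++i)
            {
                float s = 1.0f / (1.0f + std::exp(-in[i]));
                out[i] = delta[i] * s * (1.0f - s);
            }
        }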
    • MKLDNN BoundedRelu implementation for Relu6 (#1179) · eaa6091c
      Pruthvi authored
      * 1. Added MKLDNN BoundedRelu op support for Relu6
      2. CpuLayout && CPU assignment pass for BoundedRelu Op
      3. Unit test INTERPRETER vs. CPU for BoundedReluOp
      4. MKLDNN and default emitter code for BoundedReluOp
      
      * Removed Debug prints
      
      * 1. Added support for boundedrelu to work on any constant literal
      2. unit test case for rank2, rank3, rank4 for bounded relu without serialized graph
      
      * Removed is_six() method
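      A reference BoundedRelu, clamping to [0, alpha]; with alpha = 6 this is exactly
      Relu6, and the later bullets generalize it to any constant upper bound (sketch only,
      not the MKLDNN kernel):

        #include <algorithm>
        #include <cstddef>
        using std::size_t;

        void bounded_relu(const float* in, float* out, size_t count, float alpha)
        {
            for (size_t i = 0; i < count; ++i)
                out[i] = std::min(std::max(in[i], 0.0f), alpha);
        }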
    • Conv+bias shape check for better error detection (#1176) · e42e5815
      Louis Feng authored
      * Reshape bias to 1D for conv + bias bprop fusion
      
      * Reshape goe2 back to 2D before replacing
      
      * added shape checks to validate conv+bias op.
      
      * removed conv+bias backprop merge for separate PR review.
      
      * fixed conv_bias_bprop test.
      
      * minor changes to error messages.
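      The kind of shape check this adds, as an illustrative helper (not the nGraph
      validation code): the fused conv+bias is only well-formed when the bias is rank 1
      and its length equals the convolution's output-channel count.

        #include <cstddef>
        #include <stdexcept>
        #include <vector>
        using std::size_t;

        // filter_shape is expected as {OC, IC, KH, KW}; bias_shape as {OC}.
        void check_conv_bias_shapes(const std::vector<size_t>& filter_shape,
                                    const std::vector<size_t>& bias_shape)
        {
            if (bias_shape.size() != 1)
                throw std::invalid_argument("conv+bias: bias must be rank 1");
            if (filter_shape.empty() || bias_shape[0] != filter_shape[0])
                throw std::invalid_argument("conv+bias: bias length must equal output channels");
        }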