1. 11 Feb, 2019 1 commit
    • Mixed-precision fusions (#2401) · 13b4966b
      Jayaram Bobba authored
      * CPUQuantFusion pass and some fusions for converting mixed precision sub-graphs to int8 fused ops
      
      * - Added unit tests and misc bug fixes for mixed-precision fusions
      - Adjust fused sum_scale in quantization builders instead of mkldnn
        primitive creation
      13b4966b
  2. 04 Feb, 2019 1 commit
    • Windows support. (#2394) · 45a0fb47
      Robert Kimball authored
      * fix windows build
      
      * wip
      
      * mkldnn seems to build
      
      * address various errors building cpu backend with MSVC
      
      * wip
      
      * wip
      
      * Windows support.
      
          * Delete dependency of LLVM when building with MSVC.
      
      * Define EIGEN_HAS_CONSTEXPR when using MSVS.
      
      * Fix MSVC build errors.
      
          * Incorrect argument to 'decltype'. This is a VC bug. Work around the
          error by renaming the function.
      
          * MINMAX issue in matmul_bias.cpp.
      
          * Correct TBB_LINK_LIBS on Windows.
      
      * Fix MSVC link errors.
      
          1. Redefinition problems in cpu_builder.obj and convert_layout.obj.
          cpu_builder.hpp contains an implicit implementation of
          function runtime::cpu::Builder::build for cpu::op::ConvertLayout.
          The fix is deleting the registration item in cpu_builder.cpp and
          using REGISTER_CPU_OP_BUILDER in convert_layout.cpp.
      
          2. Fix the dependent libraries path on Windows. It should be *.lib
          not *.dll when linking these libraries.
      
      * Set visibility for CPU backend to fix the MSVC linker error.
      
          MSVC complains that the .def file exceeds the size limitation
          when using CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS. All the functions
          with CPU_BACKEND_API are used by unit test or nbench.
      
      * Fix unit test build errors on Windows.
      
          * backend_unary_elementwise.in.cpp: Use all_close_f to test case
          BACKEND sqrt
      
          * cpu_fusion.cpp: Fix 'NUM_STEPS' cannot be implicitly
          captured because no default capture mode has been specified
      
          * cpu_test.cpp: Use portable setenv and unsetenv from misc.hpp.
      
          * tools.cpp: Use portable fpopen from misc.hpp.
      
          * misc.hpp/misc.cpp: Add new files to host misc functions that Linux and
          Windows implement differently.
      
      * Make Debug mode work with MSVC.
      
      * style
      
      * fix line ending
      45a0fb47
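The portable setenv and unsetenv helpers mentioned for cpu_test.cpp could look roughly like the sketch below. The names `portable_setenv` and `portable_unsetenv` are hypothetical, since the contents of misc.hpp are not shown in this log. The sketch assumes `_putenv_s` on Windows and the POSIX `setenv`/`unsetenv` elsewhere.

```cpp
#include <cassert>
#include <stdlib.h>

// Hypothetical wrapper (assumption: not the actual misc.hpp API).
// Sets an environment variable on both Windows and POSIX systems.
int portable_setenv(const char* name, const char* value, int overwrite)
{
    assert(name != nullptr && value != nullptr);
#ifdef _WIN32
    // _putenv_s always overwrites, so honor `overwrite` by hand.
    if (!overwrite && getenv(name) != nullptr)
    {
        return 0;
    }
    return _putenv_s(name, value);
#else
    return setenv(name, value, overwrite);
#endif
}

// Hypothetical wrapper: removes an environment variable.
int portable_unsetenv(const char* name)
{
    assert(name != nullptr);
#ifdef _WIN32
    // On Windows, an empty value removes the variable.
    return _putenv_s(name, "");
#else
    return unsetenv(name);
#endif
}
```

A test would call `portable_setenv("SOME_FLAG", "1", 1)` before the code under test and `portable_unsetenv("SOME_FLAG")` afterwards, with no platform-specific branches in the test itself.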
  3. 02 Feb, 2019 1 commit
    • Pruthvi/fix input matrix fusion (#2381) · 917efb94
      Pruthvi authored
      * -   check to verify if the data_slices shares the same weights
      
      * add the serialized graph
      
      * - explicitly fuse the data slices, so all the parameters partitioned by slices are in contiguous memory locations
      - fixes all the failing test cases
      917efb94
  4. 18 Jan, 2019 1 commit
    • Adds backprop to BatchDot op, allows fusion in training. (#2297) · ef778693
      Louis Feng authored
      * batch dot bprop WIP.
      
      * WIP.
      
      * testing.
      
      * clean up debug code.
      
      * comments and var name change.
      
      * clean up.
      
      * format style, batch dot differentiable pass.
      
      * removed debug output.
      
      * added unit test to autodiff, refactored make_function -> make_function_from_file.
      
      * fixed build warning.
      
      * fixed gpu build error.
      
      * clang format fix.
      
      * allow test_tools.cpp to find SERIALIZED_ZOO
      
      * remove cmake redef.
      
      * fix unused macro.
      
      * making test cpu only.
      
      * testing build var
      
      * macro test
      
      * verbose makefile test
      
      * style fix
      
      * verbose make
      
      * test/util needs test/models.
      
      * removed debug output.
      
      * refactor fusion type.
      
      * refactor fusion type.
      ef778693
  5. 03 Jan, 2019 1 commit
  6. 19 Dec, 2018 1 commit
  7. 07 Dec, 2018 2 commits
    • Update slice kernels (#2180) · a16c4961
      Jayaram Bobba authored
      * initial commit for update slice op
      
      * Finished up update_slice fusion and added codegen support
      
      * style fixes
      
      * Added unit test for in-place update-slice strided
      
      * change pattern name
      a16c4961
    • Backend API change pre-work (#2064) · e0933553
      Robert Kimball authored
      * change compile call to return Handle
      
      * make CPU require compile() before call()
      
      * fix unit tests to call compile() before call()
      
      * fix failing ops
      
      * update unit test
      
      * revert some changes
      
      * more fixups
      
      * more diff cleanup
      
      * a few more issues addressed
      
      * more fixes
      
      * update API
      
      * more updates
      
      * fix test_ops.py
      
      * fix
      
      * another attempt to fix
      
      * fix unit test
      
      * fix test error
      e0933553
  8. 06 Dec, 2018 2 commits
    • DEX Loop Kernel (updated) (#2156) · 8fc481a3
      Nick Korovaiko authored
      * one output
      
      passing tests
      
      clean up
      
      fix build breaks
      
      * move generators into a separate file
      8fc481a3
    • Pruthvi/fix rnn precision (#1874) · 73da681a
      Pruthvi authored
      * - Added reorder support for rnn weights_layer/iter
      
      * i) fixed compilation issues ii) working but still observing precision error
      
      * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
      
      * i) added support for src reorder to TNC from NTC
      
      * reorder support for rnn output from NTC to TNC
      
      * - added support for rnn weight reorder ldgoi -> ldigo
      - code refactor for lstm/rnn kernel in mkldnn emitter
      
      * - refactor rnn mkldnn kernel, change variable names
      
      * fix RNN codegen kernel
      
      * disable layer rnn fusion pass, to test CI
      
      * method to validate recurrent rnn inputs
      
      * add correlated matches for Recurrent RNN PM
      
      * - simplify reorder logic for rnn_weights
      - fix graph pattern for fusing rnn cell across time steps
      
      * do weights reorders in rnn timesteps fusion
      
      * refactored LSTM graph pass
      
      * - Bug fix for finding the lstm inputs deterministically
      - Refactored LSTM graph pass to single pass
      - made changes to LSTM RNN time step fusion graph pass
      
      * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
      
      * fix compilation error
      
      * Fix GNMT rnn fusion
      
      * check if the node is in use before replacing in RNN graph passes
      
      *  i) fix style ii) fix topo sort issue in RNN graph pass
      
      * style fix
      
      * fix bug in simplify_concat pass
      
      * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2
      
      * cse for convert layout
      
      * addressed PR comments
      
      * - optimization pass to remove  Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
      - conditional fusing of LSTM cells only for the decoder
      
      * made changes to multi layer RNN fusion callback
      
      * fix asserts in RNN op
      
      * - added support to fuse layers when slc=dlc for RNN cells
      - bug fix on the sanity checks for RNN Op
      
      * - support RNN layer fusion till slc = dlc
      - bug fixes in multi layer rnn fusion call back
      
      * capture reshape in the RNN weights
      
      * Addressed PR comments
      
      * - added comments in multi layer PM call back
      - fuse only if slc == DLC across layers
      
      * restore deleted 3_lstm_cell_forward.json file
      
      * fix typo
      
      * fix failing unit tests
      
      * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input.
      
      * Address PR feedback: process in place slice after propagating in place input.
      
      * Set INTERMEDIATE role before propagating in place input.
      
      * Do not add temporaries to the variable name map before propagating in place input in codegen.
      
      * Fix a bug in codegen.
      
      * Fix a bug in codegen slice.
      
      * reenable disabled rnn unit test
      
      * fix compiler error
      
      * - bug fix in the slicing logic for the layer fused rnn cell
      - fix failing rnn unit test
      
      * - Addressed PR comments
      - removed redundant checks from the rnn graph pass
      - simplified rnn call back replace node logic
      
      * - added new multilayer rnn *.json file
      - fix test case
      
      * [PRIVATE BRANCH] Style fixes (#2080)
      
      * Style fixes
      
      * change order of lstm gates
      
      * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)
      
      * Style fixes for single-layer RNN fusion
      
      * Style fixes to multi-layer RNN
      
      * style fix
      
      * disable GPU test
      73da681a
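The `ldgoi -> ldigo` weight reorder named in this commit can be illustrated with a plain index permutation. This is only a sketch: it assumes the MKL-DNN meaning of the letters (l = layers, d = directions, g = gates, o = output channels, i = input channels) and dense row-major storage. The function name is hypothetical, and the commit itself performs the reorder through MKL-DNN rather than a hand-written loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies weights stored as [L][D][G][O][I]
// into a new buffer stored as [L][D][I][G][O].
std::vector<float> reorder_ldgoi_to_ldigo(const std::vector<float>& src,
                                          std::size_t L,
                                          std::size_t D,
                                          std::size_t G,
                                          std::size_t O,
                                          std::size_t I)
{
    assert(src.size() == L * D * G * O * I);
    std::vector<float> dst(src.size());
    for (std::size_t l = 0; l < L; ++l)
        for (std::size_t d = 0; d < D; ++d)
            for (std::size_t g = 0; g < G; ++g)
                for (std::size_t o = 0; o < O; ++o)
                    for (std::size_t i = 0; i < I; ++i)
                    {
                        // Row-major offset in the source (ldgoi) layout.
                        std::size_t s = (((l * D + d) * G + g) * O + o) * I + i;
                        // Row-major offset in the destination (ldigo) layout.
                        std::size_t t = (((l * D + d) * I + i) * G + g) * O + o;
                        dst[t] = src[s];
                    }
    return dst;
}
```

With one layer, one direction and one gate, the reorder reduces to transposing an O x I matrix into I x O.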
  9. 05 Dec, 2018 1 commit
    • Support for 5D batchnorm (#2055) · d4f8bfdc
      Pruthvi authored
      * - modified cpu_assignment pass to support bn with input 5D
      - added test cases for 5D bn and 5D bn+relu
      
      * - Address PR comments
      - used mkldnn_utils to validate bn for mkldnn
      
      * fix compilation error
      
      * Addressed PR comments
      - added helpers in mkldnn_utils for assigning ngraph Op as MKLDNN op
      - helper function for bn mkldnn assignment
      
      * fix clang error
      d4f8bfdc
  10. 28 Nov, 2018 1 commit
    • Cyphers/bnorm back (#2129) · 403a09ce
      Scott Cyphers authored
      * Fix batchnorm argument order, cleanup some comments, fix backprop
      
      * Merge error
      
      * Clean up training function, organize inference test
      
      * BatchNormInference tests
      
      * Training case
      
      * Training test
      
      * Fix autodiff BatchNorm test
      
      * Cleanup
      
      * Move file to doc checkout
      
      * Update disabled test name in igpu manifest
      Fix unused variable
      
      * Unit tests disables
      
      * Review comments
      403a09ce
  11. 21 Nov, 2018 1 commit
    • Adding leaky relu (#2096) · 587b96e5
      Jayaram Bobba authored
      * Adding leaky relu
      
      * Silence compiler warning around fp compares
      
      * Fix copy-paste error and enable in-place for relu mkldnn kernels
      587b96e5
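For reference, leaky relu passes positive inputs through and scales the rest by a small slope. The sketch below shows the elementwise definition and an in-place variant. The function names and the `alpha` parameter name are assumptions, and the commit's actual kernels are MKL-DNN based rather than a scalar loop like this one.

```cpp
#include <cassert>
#include <vector>

// Elementwise leaky relu: x for x > 0, alpha * x otherwise.
float leaky_relu(float x, float alpha)
{
    return x > 0.0f ? x : alpha * x;
}

// In-place variant: overwrites the input buffer instead of
// allocating a separate output.
void leaky_relu_inplace(std::vector<float>& values, float alpha)
{
    assert(alpha >= 0.0f);
    for (float& v : values)
    {
        v = leaky_relu(v, alpha);
    }
}
```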
  12. 16 Nov, 2018 1 commit
  13. 11 Nov, 2018 1 commit
    • add isfinite check for all_close (#2028) · 702d465a
      Fenglei authored
      * add isfinite check
      
      * style
      
      * output 5 diff and total diff
      
      * output limit of diff for all_close_f
      
      * fix bug
      
      * disable tests
      
      * remove failing unit test that does not make sense.
      702d465a
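The idea behind this commit can be sketched as a tolerance comparison that rejects non-finite values and caps how many mismatches it prints. The sketch is an assumption-laden illustration, not the nGraph test utility: the function name, the default tolerances and the `max_reported` parameter are hypothetical, and the limit of five reported differences is taken from the commit message.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical helper: true when every pair is finite and satisfies
// |a - b| <= atol + rtol * |b|. Prints at most `max_reported`
// mismatches, then the total count.
bool all_close_sketch(const std::vector<float>& a,
                      const std::vector<float>& b,
                      float rtol = 1e-5f,
                      float atol = 1e-8f,
                      std::size_t max_reported = 5)
{
    assert(a.size() == b.size());
    std::size_t diff_count = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        // NaN or infinity on either side counts as a mismatch.
        bool finite = std::isfinite(a[i]) && std::isfinite(b[i]);
        bool close = finite && std::abs(a[i] - b[i]) <= atol + rtol * std::abs(b[i]);
        if (!close)
        {
            if (diff_count < max_reported)
            {
                std::cerr << "mismatch at " << i << ": " << a[i] << " vs " << b[i] << "\n";
            }
            ++diff_count;
        }
    }
    if (diff_count > 0)
    {
        std::cerr << "total mismatches: " << diff_count << "\n";
    }
    return diff_count == 0;
}
```

The explicit finiteness test makes the treatment of NaN and infinity independent of how the tolerance inequality happens to be written.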
  14. 07 Nov, 2018 1 commit
  15. 31 Oct, 2018 1 commit
    • Change Backend::create to return std::unique_ptr<Backend> (#1909) · 05a404a8
      Robert Kimball authored
      * create unique_ptr backend
      
      * unit test cleanup
      
      * address more code that was recently added
      
      * change from reference to pointer when passing backend to reduce the number of lines changed.
      
      * fix build error
      
      * fix python wrapper
      
      * style
      
      * more specific treatment for unique_ptr
      05a404a8
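The ownership change described here can be sketched as follows. The classes below are hypothetical stand-ins, not nGraph's real `Backend` hierarchy; only the pattern of `create` returning `std::unique_ptr<Backend>`, and helpers taking a raw pointer, is drawn from the commit message.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a backend interface.
class Backend
{
public:
    virtual ~Backend() = default;
    virtual std::string name() const = 0;

    // The factory hands sole ownership to the caller.
    static std::unique_ptr<Backend> create(const std::string& type);
};

// Hypothetical concrete backend.
class CpuBackend : public Backend
{
public:
    std::string name() const override { return "CPU"; }
};

std::unique_ptr<Backend> Backend::create(const std::string& type)
{
    if (type == "CPU")
    {
        return std::unique_ptr<Backend>(new CpuBackend());
    }
    throw std::invalid_argument("unknown backend: " + type);
}

// Helpers take a non-owning raw pointer, so call sites pass backend.get().
std::string describe(const Backend* backend)
{
    assert(backend != nullptr);
    return "backend=" + backend->name();
}
```

A caller would write `auto backend = Backend::create("CPU");` and pass `backend.get()` to helpers, which is consistent with the commit's note about switching from references to pointers to limit the number of changed lines.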
  16. 30 Oct, 2018 1 commit
    • Gauri/groupconv batchnorm (#1900) · c637d629
      gaurides authored
      * Initial implementation of GroupConv+BatchNorm fusion
      
      * Added GroupConv+BatchNorm with Relu fusion
      
      * Added changes to fuse with BoundedRelu
      
      * Changed BoundedRelu to Relu
      
      * Added test; Code cleanup
      
      * Code formatting
      
      * Removed dead code
      
      * Added test cases and other misc
      
      * Bug fix in group conv callback and general cleanup
      
      * Address PR feedback
      
      * Minor edit to comment. MKLDNN divides both input and output channels by groups
      
      * Style fixes and PR feedback
      c637d629
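The comment about MKLDNN dividing both input and output channels by groups can be made concrete with a small shape calculation. This sketch assumes the grouped weight dimension order {groups, out/groups, in/groups, kh, kw}; the function name is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: weight dimensions for a grouped 2D convolution,
// with both channel counts divided by the number of groups.
std::array<std::size_t, 5> grouped_weight_dims(std::size_t groups,
                                               std::size_t out_channels,
                                               std::size_t in_channels,
                                               std::size_t kh,
                                               std::size_t kw)
{
    assert(groups > 0);
    assert(out_channels % groups == 0 && in_channels % groups == 0);
    return {{groups, out_channels / groups, in_channels / groups, kh, kw}};
}
```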
  17. 22 Oct, 2018 1 commit
    • BatchNorm splitting into ops (2nd try) (#1828) · 1beec46b
      Nick Korovaiko authored
      * split bn into bn_inference bn_training
      
      * fix warnings
      
      * Add GPU support for the new BN ops (#1569)
      
      * Add GPU support and change batchnorm_globalstats test to use BNInference.
      
      * Changed test back to using BNTraining for global stats and updated cudnn backend to account for it.
      
      * Fix issues in merge with master.
      
      * Formatting.
      
      * CPU fixes
      
      * remove 5-arg training BN for now
      
      * more fixes
      
      * python batchnorm changes
      
      * fix onnx_import
      
      * fix a call BatchNormInference c-tor
      
      * yet another fix to BatchNormInference c-tor
      
      * AND yet another fix to batchnorm_inference c-tor
      
      * ops.py
      
      * address adam's feedback
      
      * Remove unnecessary parameter/argument.
      
      * remove batch_norm_training_relu_with_global_stats
      
      * remove bn_relu (training)
      1beec46b
  18. 15 Oct, 2018 1 commit
  19. 12 Oct, 2018 1 commit
  20. 08 Oct, 2018 1 commit
    • IAT: More convolution folding optimizations (#1712) · 00b4453d
      Jayaram Bobba authored
      * Check output shape when setting memory layout for slice op.
      
      * Miscellaneous fusion and other optimizations for inception-resnetv2
      - ConvBias Batchnorm folding
      - ConvBias Affine folding
      - Check if MKLDNN can slice a given layout and select layouts
        appropriately
      
      * Fixed unit test and bug in conv bias pattern
      
      * Addressed PR feedback
      
      * Addressed PR feedback
      00b4453d
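ConvBias Batchnorm folding relies on a standard identity: an inference-time batch norm applied to a convolution output can be absorbed into the convolution's weights and bias. The sketch below shows that identity for one output channel. It is a generic illustration and not the code from this commit; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical container for one output channel's parameters.
struct ChannelParams
{
    std::vector<float> weights; // all filter weights for this output channel
    float bias;
};

// Folds y = gamma * (conv + bias - mean) / sqrt(variance + eps) + beta
// into a plain convolution with rescaled weights and a new bias.
ChannelParams fold_batchnorm(const ChannelParams& conv,
                             float gamma,
                             float beta,
                             float mean,
                             float variance,
                             float eps)
{
    assert(variance + eps > 0.0f);
    const float scale = gamma / std::sqrt(variance + eps);
    ChannelParams folded;
    folded.weights.reserve(conv.weights.size());
    for (float w : conv.weights)
    {
        folded.weights.push_back(w * scale);
    }
    folded.bias = (conv.bias - mean) * scale + beta;
    return folded;
}
```

After folding, the batch norm node is redundant, since the same output comes from a single convolution with bias.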
  21. 05 Oct, 2018 1 commit
  22. 02 Oct, 2018 1 commit
    • Pruthvi/rnn fusion (#1677) · 18e41513
      Pruthvi authored
      * WIP input * weights rnn optimization
      
      * concat + slicing + replacing new node works
      
      * WIP unit test case of fusing rnn inputs
      
      * - Added unit test case for fusing rnn input weights
      - registered CPURnnMatFusion_v1/v2 in codegen and DEX
      
      * fixed redeclaration of a variable
      
      * Refactored rnn input transformation passes into a single pass
      
      * Refactored CPURnnMatFusion call back functions
      
      * change random generator range to include -ve values in unit test
      
      * address PR comments
      
      * don't fuse if the shapes of the data slices don't match
      18e41513
  23. 29 Sep, 2018 1 commit
  24. 21 Sep, 2018 1 commit
    • Add CPU horizontal fusion pass for inception. (#1577) · 2d2b3b2f
      Amy Zhuang authored
      * Add CPU horizontal fusion pass for inception.
      
      * Name change.
      
      * Move horizontal fusion to cpu_fusion.
      
      * Change horizontal fusion pass for inception to a general horizontal fusion pass.
      Add a unit test conv_horizontal_fusion to cpu_fusion.
      
      * Rename files.
      
      * Correct cpu_fusion.hpp.
      
      * Add NGRAPH_DEBUG.
      
      * Set native layout when input format of slice is nChw16c or nChw8c and lower bound of
      channels is not a multiple of 16 or 8.
      2d2b3b2f
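The last item in this commit describes a layout decision for slices of channel-blocked tensors. A minimal sketch of the check, assuming a block size of 16 for nChw16c and 8 for nChw8c, follows. The function name is hypothetical and the real pass likely checks more than the lower bound.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true when a channel slice starting at
// `lower_bound` can stay in a layout blocked by `block` channels.
// When it returns false, the slice would fall back to native layout.
bool slice_keeps_blocked_layout(std::size_t lower_bound, std::size_t block)
{
    assert(block > 0);
    return lower_bound % block == 0;
}
```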
  25. 14 Sep, 2018 1 commit
    • Cyphers/layout (#1602) · 2f79f707
      Scott Cyphers authored
      * Remove "view"
      Simplify layout
      
      * Fix merge error
      
      * fix build error
      
      * PR1602. IntelGPU backend. Compilation fixed.
      2f79f707
  26. 13 Sep, 2018 1 commit
    • Handle unsupported op in nbench (#1531) · fe676f72
      Robert Kimball authored
      * add unsupported_op exception
      
      * unsupported_op test
      
      * add printout of unsupported op in model
      
      * fix GPU dispatcher check
      
      * fix test designation
      
      * catch exceptions on single file runs too
      
      * add unsupported_op exception where needed
      
      * remove unsupported_op class
      
      * add unassigned op exception
      
      * add unit test
      
      * catch unsupported op in nbench
      
      * add cpu test back
      
      * update all latest merges
      
      * mode change
      fe676f72
  27. 11 Sep, 2018 1 commit
    • Add conv add fusion (#1526) · 37174c90
      gaurides authored
      * Add conv add fusion
      
      * Updated file permissions and cpu_fusion order
      
      * Formatted code using maint/apply-code-format.sh
      
      * Fixed minor review comments
      
      * Use NODE_VALIDATION_ASSERT instead of throw ngraph_error; upgrade baseline and fix issues
      
      * Some more fixes
      37174c90
  28. 29 Aug, 2018 2 commits
  29. 27 Aug, 2018 1 commit
  30. 15 Aug, 2018 1 commit
  31. 13 Aug, 2018 1 commit
  32. 10 Aug, 2018 1 commit
  33. 07 Aug, 2018 1 commit
    • Switch to using more expressive layout descriptors instead of numeric layout names (#1278) · 69c51c27
      Jayaram Bobba authored
      * Switch to using mkldnn memory descriptors for layout
      
      * More changes for using mkldnn descriptor instead of format
      
      * Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding
      
      * Rotate mkldnn layouts on transpose
      
      * Modifications to builder reshape to skip rotated layouts
      
      * More fixes to layouts and removes axis order from cpu layout descriptor
      
      * Code cleanup
      
      * Removed shuffle folding pass since the functionality is subsumed by the layout pass
      
      * Canonicalize a few more formats to keep MKLDNN happy.
      
      * Style fixes
      
      * Style fixes
      
      * Style fixes
      
      * Addressed PR feedback and added reshape passthrough for non-transpose cases
      
      * Adjust named formats for weights tensors to keep MKLDNN happy
      
      * Style fixes
      
      * resolved merge issues
      69c51c27
  34. 18 Jul, 2018 1 commit
  35. 17 Jul, 2018 1 commit
    • Added more convolution variants to DEX (#1223) · 9bb0b653
      Jayaram Bobba authored
      * CPU Direct Execution: Implement ConvertLayout and refactor
      
      * CPU Direct Execution: Implement Convolution
      
      * 1) Adds computation reuse to direct execution
      2) Add avg_pool, broadcast and convolution_bias to direct execution
      3) Moved some computation reuse utility functions to graph_utils
      
      * Use lists instead of vectors to avoid reallocation overheads
      
      * - Added convolution variants to direct execution
      - Removed ConvolutionBiasRelu, use ConvolutionBias instead
      - Reduced code duplication by moving functionality to mkldnn_emitter
        from cpu_emitter
      
      * Style fix
      
      * Moved mkldnn build_convolution to a templated method
      
      * Style fix
      
      * refactored mkldnn conv bprop builders
      
      * Style fix
      9bb0b653
  36. 11 Jul, 2018 1 commit
  37. 03 Jul, 2018 1 commit