1. 05 Jun, 2019 1 commit
  2. 24 May, 2019 1 commit
    • [Fused] LeakyRelu op (#2919) · 5650e913
      Michał Karzyński authored
      * [Fused] LeakyRelu op
      
      * Add LeakyRelu to serializer
      
      * Add unit tests
      
      * Fix merge branch 'master' into mkarzyns/fused_leaky_relu
      
      * Change broadcasting rules to NumPy style
      
      * Remove std:: and ngraph:: prefixes
      
      * Rename CPU Runtime LeakyRelu to CPULeakyRelu
      
      * Style apply
      
      * Fix cpu_fusion.fuse_leaky_relu test
      
      * Use eigen's tanh in the fused sigmoid multiply kernel (#2946)
      
      * Merge branch 'master' into mkarzyns/fused_leaky_relu
      
      * Add LeakyRelu to Intel GPU backend op list
      
      * Add LeakyRelu to Intel GPU backend op list
      5650e913
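For context, LeakyRelu passes positive inputs through unchanged and scales negative inputs by a slope factor alpha. A minimal reference sketch of that element-wise semantics (illustrative only, not the nGraph kernel):

```cpp
#include <cstddef>

// Reference semantics of LeakyRelu: f(x) = x if x > 0, else alpha * x.
void leaky_relu_ref(const float* in, float* out, std::size_t count, float alpha)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        out[i] = in[i] > 0.0f ? in[i] : alpha * in[i];
    }
}
```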
  3. 23 May, 2019 2 commits
  4. 22 May, 2019 1 commit
    • Change FusionType to enum class and use EnumMask (#2957) · a65b5155
      Louis Feng authored
      * constexpr ctor for EnumMask
      
      * added pass properties to core passes.
      
      * change fusion type to have better type safety.
      
      * refactor to use enum mask.
      
      * remove extra code.
      
      * added constants for FusionType backward compatibility.
      
      * spelling.
      
      * grammar fix.
      a65b5155
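The point of the change is type safety: a plain enum used as a bitmask silently accepts arbitrary integers, while an enum class combined with a small mask wrapper only accepts its own enumerators. A minimal sketch of the idea, with hypothetical names rather than the actual EnumMask implementation:

```cpp
#include <cstdint>

enum class FusionKind : uint32_t // hypothetical example enum
{
    DIFFERENTIABLE = 0x1,
    REGULAR = 0x2,
};

// Tiny EnumMask-style wrapper: only FusionKind values can be combined or tested.
class FusionKindMask
{
public:
    constexpr FusionKindMask(FusionKind k) : m_value(static_cast<uint32_t>(k)) {}
    constexpr FusionKindMask operator|(FusionKindMask other) const
    {
        return FusionKindMask(m_value | other.m_value);
    }
    constexpr bool is_set(FusionKind k) const
    {
        return (m_value & static_cast<uint32_t>(k)) != 0;
    }

private:
    constexpr explicit FusionKindMask(uint32_t v) : m_value(v) {}
    uint32_t m_value;
};

// Usage: the mask accepts only FusionKind values, never raw integers.
constexpr FusionKindMask all_fusions = FusionKindMask(FusionKind::DIFFERENTIABLE) | FusionKind::REGULAR;
```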
  5. 13 May, 2019 1 commit
    • Fix clang compiler warnings (#2898) · b94a042d
      Scott Cyphers authored
      * Fix clang compiler warnings
      
      * Remove unintended file.
      
      * style
      
      * Not part of PR
      
      * Another extra closure ref
      
      * More warnings from merges
      
      * Lambda arg was used
      b94a042d
  6. 29 Apr, 2019 1 commit
  7. 26 Apr, 2019 1 commit
  8. 17 Apr, 2019 1 commit
    • DeconvBias (#2716) · 03f13e4b
      gaurides authored
      * deconv optimizations for dcgan
      
      * Added test cases
      
      * modified some tests, not working at this point
      
      * Removed temp code
      
      * fixes to get unit test to pass
      
      * Added node validation checks
      
      * Update mkldnn emitter to memory reuse design
      
      * Code cleanup
      
      * Fix to enable deconv to select the right kernel
      
      * Fix file permissions
      
      * Disabled unit test cases
      
      * Remove unused variable
      
      * Address PR feedback
      
      * Removed dead code
      
      * Style check
      
      * removed dead code
      03f13e4b
  9. 16 Apr, 2019 1 commit
    • Moves some fused convolution ops to core FusedOps (#2733) · 6b5016e5
      Jayaram Bobba authored
      * - Moves some fused convolution ops to core FusedOps
      - Adds support for decomposing and replacing multi-output FusedOps
      - Adds query callbacks to FusedOpDecomposition to check if a FusedOp is
        supported by a backend
      - Adds core fusion patterns for FusedOps
      
      * style fix
      
      * Added comments on FOP_FUSIONS
      
      * gpu convolution 1d bug fix (#2741)
      
      * Fix bug with dex-only compilation and addressed PR comments
      6b5016e5
  10. 12 Apr, 2019 1 commit
    • Deprecate direct access to descriptor::Input and descriptor::Output (#2724) · 5490bae5
      Adam Procter authored
      * Add NodeInput and NodeOutput classes
      
      * Deprecate Node::get_inputs, Node::get_outputs, Node::get_output_inputs. Remove Node::get_input_from and Node::get_output_from
      
      * Privatize most fields of Node
      
      * Make deprecation of descriptor-munching classes optional
      
      * Review comments
      
      * Adapt ReshapeSinking to use raw pointers for NodeInput
      
      * Fix ZDTE (thought I had already done in this branch, weird); style
      
      * wip
      
      * Change get_node_outputs() and get_node_inputs() to return vectors
      
      * Updates after merge
      
      * Whoops, forgot to define these functions
      
      * {NodeInput,NodeOutput} -> {Input,Output}
      
      * Kill shared_ptr in Output
      
      * Move Input and Output into node.hpp
      
      * Templatize the underlying node (sub)type in Input and Output
      
      * Eliminate some get_input_* and get_output_* functions
      
      * Change get_outputs and get_inputs back to their original names; rename NGRAPH_DEPRECATE_IO_DESCRIPTORS to NGRAPH_DEPRECATE_OLD_NODE_APIS
      
      * Miscellaneous cleanup
      
      * More cleanup
      
      * Unbreak CPU build
      
      * Simplify unit tests
      
      * Make Node less friendly
      
      * Deprecate more get_output_* and get_input_* functions
      
      * A couple of PR comments
      
      * Make the deprecation stuff more generally available
      
      * Better comment
      
      * Be more consistent about [] vs. at
      5490bae5
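The thrust of this PR is to stop handing out the internal descriptor objects and instead expose lightweight port handles that identify an input or output by owning node and index. A rough sketch of what such a handle looks like (illustrative shape only, not the exact nGraph classes):

```cpp
#include <cstddef>

class Node; // the graph node type (declaration only, for illustration)

// Lightweight handle to one output port of a node: a pointer plus an index.
template <typename NodeType>
class OutputHandle
{
public:
    OutputHandle(NodeType* node, std::size_t index) : m_node(node), m_index(index) {}
    NodeType* get_node() const { return m_node; }
    std::size_t get_index() const { return m_index; }

private:
    NodeType* m_node;
    std::size_t m_index;
};
```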
  11. 11 Apr, 2019 1 commit
    • [Dynamic Shape] Moving BatchDot to Core Op (#2691) · cc8dd452
      Louis Feng authored
      * batch dot WIP.
      
      * cpu backend refactor and unit tests pass.
      
      * WIP.
      
      * batch dot interpreter implementation.
      
      * minor clean up.
      
      * more clean up.
      
      * patching the gpu backends.
      
      * added more tests, fixes, etc.
      
      * fixed compile error.
      
      * renamed batch dot to batch matmul.
      
      * refactor WIP.
      
      * fixes some tests and formatting.
      
      * more fixes.
      cc8dd452
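BatchMatMul (the renamed BatchDot) applies an independent matrix multiply to each pair of 2-D slices in two 3-D tensors. A naive reference sketch assuming dense row-major [batch, m, k] and [batch, k, n] inputs (illustrative, not the interpreter kernel):

```cpp
#include <cstddef>

// C[b] = A[b] * B[b] for each batch b; all buffers are dense and row-major.
void batch_matmul_ref(const float* A, const float* B, float* C,
                      std::size_t batch, std::size_t m, std::size_t k, std::size_t n)
{
    for (std::size_t b = 0; b < batch; ++b)
    {
        for (std::size_t i = 0; i < m; ++i)
        {
            for (std::size_t j = 0; j < n; ++j)
            {
                float acc = 0.0f;
                for (std::size_t p = 0; p < k; ++p)
                {
                    acc += A[b * m * k + i * k + p] * B[b * k * n + p * n + j];
                }
                C[b * m * n + i * n + j] = acc;
            }
        }
    }
}
```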
  12. 09 Apr, 2019 1 commit
  13. 26 Mar, 2019 1 commit
  14. 21 Mar, 2019 1 commit
    • [ONNX] Enable Pad modes for ONNX pad operator (#2590) · f8146495
      tsocha authored
      * Add support for negative padding
      
      * Use std::bind in pad builder check
      
      * Add support for negative padding in CPU backend
      
      * Updated kernel to do pad+slice
      
      * Remove type conversion warnings
      
      * Fix review comments
      
      * Remove interior padding from core op and interpreter stuff
      
      * Update backends other than GPU for retirement of padding_interior
      
      * Skeleton of support for edge/reflect padding
      
      * Post-merge cleanup
      
      * Attempt reference implementation for EDGE.
      
      * Fix the edge-padding reference, and add some unit tests
      
      * Implement REFLECT padding ref; add tests
      
      * Fixes to the CPU stuff so it compiles now
      
      * Fix test
      
      * Add support for different pad modes
      
      * Restore a stub get_padding_interior function, and tweak some stale comments
      
      * Update ONNX importer to not supply interior padding value; add checks for padding-too-small for EDGE and REFLECT
      
      * Typo
      
      * Bop a warning
      
      * Attempt fix to INTELGPU backend
      
      * Attempt another fix to INTELGPU backend
      
      * Fix pyapi
      
      * Style apply
      
      * Add support for padding modes
      
      * Remove unnecessary node validation checks
      
      * Remove tests for minimal reflect and edge pad
      
      * Remove commented tests
      
      * Remove unnecessary Asserts
      
      * Little update of pad documentation
      
      * Monospace for pad_mode options
      
      * Revert "Remove tests for minimal reflect and edge pad"
      
      This reverts commit 81e4787ea47195b832cab1452dde698bc05776fe.
      
      * Revert "Remove unnecesary node validation checks"
      
      This reverts commit 7e68db7564f3c9b1fd40e7db1d1bda4e0677cad9.
      
      * Test only spatial dims
      
      * axis -> spatial axis
      
      * Fix typo
      
      * Style check
      
      * Update test
      
      * Add CoordinateDiff include
      
      * Remove pad_mode from tree visualization
      
      * Convert padding into NVShape
      
      * Skip failing tests on GPU
      
      * Revert mode change
      
      * Remove merge artifact
      
      * Rename pad kernel into pad_ref
      f8146495
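EDGE and REFLECT padding differ only in how an out-of-range output coordinate is mapped back into the input: EDGE clamps to the nearest border element, while REFLECT mirrors around the border without repeating it. A single-axis sketch of that mapping, assuming the pad width is smaller than the axis length (which is what the added padding-too-small checks enforce):

```cpp
#include <cstdint>

// Map a padded-output coordinate back to an input coordinate for one axis.
// pos may be negative (inside the leading pad) or >= size (inside the trailing pad).
int64_t map_padded_index(int64_t pos, int64_t size, bool reflect)
{
    if (pos < 0)
    {
        return reflect ? -pos : 0;          // REFLECT mirrors around element 0, EDGE clamps to it
    }
    if (pos >= size)
    {
        return reflect ? 2 * size - 2 - pos // mirror around the last element
                       : size - 1;          // clamp to the last element
    }
    return pos;
}
```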
  15. 18 Mar, 2019 1 commit
    • Change floating point comparisons from == to all_close_f (#2620) · 56e160ba
      Robert Kimball authored
      * change float comparisons from == to all_close_f
      
      * style
      
      * address a few more direct float comparisons
      
      * add missing include
      
      * specify tightest tolerance for Broadcast and Reshape tests
      
      * Increased tightness of float testing
      
      Increased tightness of float testing via MIN_FLOAT_TOLERANCE_BITS parameter
      
      * style
      56e160ba
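Exact == comparison is brittle for floating-point results because rounding error accumulates differently across backends and optimization levels; all_close_f instead requires values to agree within a tolerance expressed in matching mantissa bits. A simplified absolute/relative tolerance check in the same spirit (not the actual all_close_f implementation):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Compare two float vectors element-wise within absolute/relative tolerances.
bool all_close_simple(const std::vector<float>& a, const std::vector<float>& b,
                      float rtol = 1e-5f, float atol = 1e-8f)
{
    if (a.size() != b.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (std::fabs(a[i] - b[i]) > atol + rtol * std::fabs(b[i]))
        {
            return false;
        }
    }
    return true;
}
```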
  16. 26 Feb, 2019 1 commit
    • More quantized fusion patterns (#2480) · b8106133
      Jayaram Bobba authored
      * Add QuantizedConcat
      
      * Remove unused variables and add check for size of mins and maxes vector
      
      * Resolve conflicts
      
      * Merged with master and addressed some PR feedback
      
      * Maxpool and Avgpool fusions. Exclude Q from conv+relu fusion
      
      * Remove single-user check from fusions
      
      * Quantized concat fusion
      
      * workaround: do reshape sinking by default
      
      * style fix
      
      * check scales for QuantizedConcat
      
      * use compare_constants
      
      * remove stale comment
      
      * Handle all concat cases from arg size 2 to 6
      
      * addressed feedback
      b8106133
  17. 25 Feb, 2019 1 commit
    • Pruthvi/bi rnn (#2232) · a444f7a9
      Pruthvi authored
      * - Added reorder support for rnn weights_layer/iter
      
      * i) fixed compilation issues ii) working but still observing precision error
      
      * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
      
      * i) added support for src reorder to TNC from NTC
      
      * reorder support for rnn output from NTC to TNC
      
      * - added support for rnn weight reorder ldgoi -> ldigo
      - code refactor for lstm/rnn kernel in mkldnn emitter
      
      * - refactor rnn mkldnn kernel, change variable names
      
      * fix RNN codegen kernel
      
      * disable layer rnn fusion pass, to test CI
      
      * method to validate recurrent rnn inputs
      
      * add correlated matches for Recurrent RNN PM
      
      * - simplify reorder logic for rnn_weights
      - fix graph pattern for fusing rnn cell across time steps
      
      * do weights reorders in rnn timesteps fusion
      
      * refactored LSTM graph pass
      
      * - Bug fix for finding the lstm inputs deterministically
      - Refactored LSTM graph pass to single pass
      - made changes to LSTM RNN time step fusion graph pass
      
      * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
      
      * fix compilation error
      
      * Fix GNMT rnn fusion
      
      * check if the node is in use before replacing in RNN graph passes
      
      *  i) fix style ii) fix topo sort issue in RNN graph pass
      
      * style fix
      
      * fix bug in simplify_concat pass
      
      * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2
      
      * cse for convert layout
      
      * addressed PR comments
      
      * - optimization pass to remove  Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
      - conditional fusing of LSTM cells only for the decoder
      
      * made changes to multi layer RNN fusion callback
      
      * fix asserts in RNN op
      
      * - added support to fuse layers when slc=dlc for RNN cells
      - bug fix on the sanity checks for RNN Op
      
      * - support RNN layer fusion till slc = dlc
      - bug fixes in multi layer rnn fusion call back
      
      * capture reshape in the RNN weights
      
      * Addressed PR comments
      
      * - added comments in multi layer PM call back
      - fuse only if slc == DLC across layers
      
      * restore deleted 3_lstm_cell_forward.json file
      
      * fix typo
      
      * fix failing unit tests
      
      * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input.
      
      * Address PR feedback: process in place slice after propagating in place input.
      
      * Set INTERMEDIATE role before propagating in place input.
      
      * Do not add temporaries to the variable name map before propagating in place input in codegen.
      
      * Fix a bug in codegen.
      
      * Fix a bug in codegen slice.
      
      * reenable disabled rnn unit test
      
      * fix compiler error
      
      * - bug fix in the slicing logic for the layer fused rnn cell
      - fix failing rnn unit test
      
      * - Addressed PR comments
      - removed redundant checks from the rnn graph pass
      - simplified rnn call back replace node logic
      
      * - added new multilayer rnn *.json file
      - fix test case
      
      * [PRIVATE BRANCH] Style fixes (#2080)
      
      * Style fixes
      
      * change order of lstm gates
      
      * WIP bi rnn
      
      * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)
      
      * Style fixes for single-layer RNN fusion
      
      * Style fixes to multi-layer RNN
      
      * added callback routine for bi-directional rnn
      
      * fix rnn op ctor, rnn mkldnn emitter to accommodate bi-directional rnn
      
      * style fix
      
      * added helper function for rnn's to query direction and cell_type
      
      * fix clang error
      
      * - unit test case for bi rnn fusion
      - style fix
      
      * - updated bi-rnn graph pass to handle reverse and reverse_seq ops in the predicate
      - added bi-rnn interpreter vs. CPU unit test case
      - add support in mkldnn_utils to create_md with tnc/ntc format
      
      * - added enum type to deduce rnn_type
      
      * Addressed PR comments
          - handle reshapes from {t, n, c} to {n, t, c} in the graph pass
      
      * fix style
      
      * fix clang error
      
      * fix style
      
      * i) move enum specific to rnn to separate header
      a444f7a9
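Several of the commits above reorder RNN activations between NTC (batch, time, channel) and TNC (time, batch, channel) layouts before handing them to mkldnn. A tiny sketch of what that reorder amounts to for dense row-major buffers (illustrative only, not the mkldnn reorder primitive):

```cpp
#include <cstddef>

// Copy a [N, T, C] buffer into [T, N, C] order (NTC -> TNC).
void reorder_ntc_to_tnc(const float* src, float* dst,
                        std::size_t N, std::size_t T, std::size_t C)
{
    for (std::size_t n = 0; n < N; ++n)
    {
        for (std::size_t t = 0; t < T; ++t)
        {
            for (std::size_t c = 0; c < C; ++c)
            {
                dst[(t * N + n) * C + c] = src[(n * T + t) * C + c];
            }
        }
    }
}
```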
  18. 22 Feb, 2019 1 commit
  19. 11 Feb, 2019 1 commit
    • Mixed-precision fusions (#2401) · 13b4966b
      Jayaram Bobba authored
      * CPUQuantFusion pass and some fusions for converting mixed-precision sub-graphs to int8 fused ops
      
      * - Added unit tests and misc bug fixes for mixed-precision fusions
      - Adjust fused sum_scale in quantization builders instead of mkldnn
        primitive creation
      13b4966b
  20. 04 Feb, 2019 1 commit
    • Windows support. (#2394) · 45a0fb47
      Robert Kimball authored
      * fix windows build
      
      * wip
      
      * mkldnn seems to build
      
      * address various errors building cpu backend with MSVC
      
      * wip
      
      * wip
      
      * Windows support.
      
          * Delete dependency of LLVM when building with MSVC.
      
      * Define EIGEN_HAS_CONSTEXPR when using MSVS.
      
      * Fix MSVC build errors.
      
          * Incorrect argument to 'decltype'. It is a VC bug. Work around the
          error by renaming the function.
      
          * MINMAX issue in matmul_bias.cpp.
      
          * Correct TBB_LINK_LIBS on Windows.
      
      * Fix MSVC link errors.
      
          1. Redefinition problems in cpu_builder.obj and convert_layout.obj,
          caused by cpu_builder.hpp containing an implicit implementation of
          runtime::cpu::Builder::build for cpu::op::ConvertLayout. The fix is
          to delete the registration item in cpu_builder.cpp and use
          REGISTER_CPU_OP_BUILDER in convert_layout.cpp.
      
          2. Fix the dependent libraries path on Windows. It should be *.lib
          not *.dll when linking these libraries.
      
      * Set visibility for CPU backend to fix the MSVC linker error.
      
          MSVC complains that the .def file exceeds the size limitation
          when using CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS. All the functions
          with CPU_BACKEND_API are used by unit tests or nbench.
      
      * Fix unit test build errors on Windows.
      
          * backend_unary_elementwise.in.cpp: Use all_close_f to test case
          BACKEND sqrt
      
          * cpu_fusion.cpp: Fix 'NUM_STEPS' cannot be implicitly
          captured because no default capture mode has been specified
      
          * cpu_test.cpp: Use portable setenv and unsetenv from misc.hpp.
      
          * tools.cpp: Use portable fpopen from misc.hpp.
      
          * misc.hpp/misc.cpp: Add new files to host misc functions for which
          Linux and Windows use different implementations.
      
      * Make Debug mode work with MSVC.
      
      * style
      
      * fix line ending
      45a0fb47
  21. 02 Feb, 2019 1 commit
    • Pruthvi/fix input matrix fusion (#2381) · 917efb94
      Pruthvi authored
      * - check to verify that the data_slices share the same weights
      
      * add the serialized graph
      
      * - explicitly fuse the data slices, so all the parameters partitioned by slices are in a contiguous memory location
      - fixes all the failing test cases
      917efb94
  22. 18 Jan, 2019 1 commit
    • Adds backprop to BatchDot op, allows fusion in training. (#2297) · ef778693
      Louis Feng authored
      * batch dot bprop WIP.
      
      * WIP.
      
      * testing.
      
      * clean up debug code.
      
      * comments and var name change.
      
      * clean up.
      
      * format style, batch dot differentiable pass.
      
      * removed debug output.
      
      * added unit test to autodiff, refactored make_function -> make_function_from_file.
      
      * fixed build warning.
      
      * fixed gpu build error.
      
      * clang format fix.
      
      * allow test_tools.cpp to find SERIALIZED_ZOO
      
      * remove cmake redef.
      
      * fix unused macro.
      
      * making test cpu only.
      
      * testing build var
      
      * macro test
      
      * verbose makefile test
      
      * style fix
      
      * verbose make
      
      * test/util needs test/models.
      
      * removed debug output.
      
      * refactor fusion type.
      
      * refactor fusion type.
      ef778693
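The backprop added here follows the usual matmul gradient rule applied per batch: for C = A·B with incoming gradient dC, dA = dC·Bᵀ and dB = Aᵀ·dC. A sketch of the dA half for one batch, assuming the same row-major layout as the forward sketch earlier (illustrative only):

```cpp
#include <cstddef>

// Gradient of C = A * B w.r.t. A for one batch: dA = dC * B^T.
// Shapes: dC is [m, n], B is [k, n], dA is [m, k]; all buffers row-major.
void batch_dot_backprop_dA(const float* dC, const float* B, float* dA,
                           std::size_t m, std::size_t k, std::size_t n)
{
    for (std::size_t i = 0; i < m; ++i)
    {
        for (std::size_t p = 0; p < k; ++p)
        {
            float acc = 0.0f;
            for (std::size_t j = 0; j < n; ++j)
            {
                acc += dC[i * n + j] * B[p * n + j]; // B indexed as B^T
            }
            dA[i * k + p] = acc;
        }
    }
}
```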
  23. 03 Jan, 2019 1 commit
  24. 19 Dec, 2018 1 commit
  25. 07 Dec, 2018 2 commits
    • Update slice kernels (#2180) · a16c4961
      Jayaram Bobba authored
      * initial commit for update slice op
      
      * Finished up update_slice fusion and added codegen support
      
      * style fixes
      
      * Added unit test for in-place update-slice strided
      
      * change pattern name
      a16c4961
    • Backend API change pre-work (#2064) · e0933553
      Robert Kimball authored
      * change compile call to return Handle
      
      * make CPU require compile() before call()
      
      * fix unit tests to call compile() before call()
      
      * fix failing ops
      
      * update unit test
      
      * revert some changes
      
      * more fixups
      
      * more diff cleanup
      
      * a few more issues addressed
      
      * more fixes
      
      * update API
      
      * more updates
      
      * fix test_ops.py
      
      * fix
      
      * another attempt to fix
      
      * fix unit test
      
      * fix test error
      e0933553
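The direction of this pre-work is that a backend compiles a function once, returns a handle, and every execution then goes through that handle rather than compiling implicitly on the first call. A comment-level sketch of the intended flow, with approximate names taken from the commit messages (the exact signatures are in the PR):

```cpp
// Sketch only; names approximate the API described in the commit messages above.
// auto backend = runtime::Backend::create("CPU");
// auto handle  = backend->compile(function);               // compile once, keep the handle
// backend->call(handle, {result_tensor}, {input_tensor});  // every call goes through the handle
```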
  26. 06 Dec, 2018 2 commits
    • DEX Loop Kernel (updated) (#2156) · 8fc481a3
      Nick Korovaiko authored
      * one output
      
      passing tests
      
      clean up
      
      fix build breaks
      
      * move generators into a separate file
      8fc481a3
    • Pruthvi/fix rnn precision (#1874) · 73da681a
      Pruthvi authored
      * - Added reorder support for rnn weights_layer/iter
      
      * i) fixed compilation issues ii) working but still observing precision error
      
      * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
      
      * i) added support for src reorder to TNC from NTC
      
      * reorder support for rnn output from NTC to TNC
      
      * - added support for rnn weight reorder ldgoi -> ldigo
      - code refactor for lstm/rnn kernel in mkldnn emitter
      
      * - refactor rnn mkldnn kernel, change variable names
      
      * fix RNN codegen kernel
      
      * disable layer rnn fusion pass, to test CI
      
      * method to validate recurrent rnn inputs
      
      * add correlated matches for Recurrent RNN PM
      
      * - simplify reorder logic for rnn_weights
      - fix graph pattern for fusing rnn cell across time steps
      
      * do weights reorders in rnn timesteps fusion
      
      * refactored LSTM graph pass
      
      * - Bug fix for finding the lstm inputs deterministically
      - Refactored LSTM graph pass to single pass
      - made changes to LSTM RNN time step fusion graph pass
      
      * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
      
      * fix compilation error
      
      * Fix GNMT rnn fusion
      
      * check if the node is in use before replacing in RNN graph passes
      
      *  i) fix style ii) fix topo sort issue in RNN graph pass
      
      * style fix
      
      * fix bug in simplify_concat pass
      
      * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2
      
      * cse for convert layout
      
      * addressed PR comments
      
      * - optimization pass to remove  Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
      - conditional fusing of LSTM cells only for the decoder
      
      * made changes to multi layer RNN fusion callback
      
      * fix asserts in RNN op
      
      * - added support to fuse layers when slc=dlc for RNN cells
      - bug fix on the sanity checks for RNN Op
      
      * - support RNN layer fusion till slc = dlc
      - bug fixes in multi layer rnn fusion call back
      
      * capture reshape in the RNN weights
      
      * Addressed PR comments
      
      * - added comments in multi layer PM call back
      - fuse only if slc == DLC across layers
      
      * restore deleted 3_lstm_cell_forward.json file
      
      * fix typo
      
      * fix failing unit tests
      
      * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input.
      
      * Address PR feedback: process in place slice after propagating in place input.
      
      * Set INTERMEDIATE role before propagating in place input.
      
      * Do not add temporaries to the variable name map before propagating in place input in codegen.
      
      * Fix a bug in codegen.
      
      * Fix a bug in codegen slice.
      
      * reenable disabled rnn unit test
      
      * fix compiler error
      
      * - bug fix in the slicing logic for the layer fused rnn cell
      - fix failing rnn unit test
      
      * - Addressed PR comments
      - removed redundant checks from the rnn graph pass
      - simplified rnn call back replace node logic
      
      * - added new multilayer rnn *.json file
      - fix test case
      
      * [PRIVATE BRANCH] Style fixes (#2080)
      
      * Style fixes
      
      * change order of lstm gates
      
      * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)
      
      * Style fixes for single-layer RNN fusion
      
      * Style fixes to multi-layer RNN
      
      * style fix
      
      * disable GPU test
      73da681a
  27. 05 Dec, 2018 1 commit
    • Support for 5D batchnorm (#2055) · d4f8bfdc
      Pruthvi authored
      * - modified cpu_assignment pass to support bn with input 5D
      - added test cases for 5D bn and 5D bn+relu
      
      * - Address PR comments
      - used mkldnn_utils to validate bn for mkldnn
      
      * fix compilation error
      
      * Addressed PR comments
      - added helpers in mkldnn_utils for assigning ngraph Op as MKLDNN op
      - helper function for bn mkldnn assignment
      
      * fix clang error
      d4f8bfdc
  28. 28 Nov, 2018 1 commit
    • Cyphers/bnorm back (#2129) · 403a09ce
      Scott Cyphers authored
      * Fix batchnorm argument order, cleanup some comments, fix backprop
      
      * Merge error
      
      * Clean up training function, organize inference test
      
      * BatchNormInference tests
      
      * Training case
      
      * Training test
      
      * Fix autodiff BatchNorm test
      
      * Cleanup
      
      * Move file to doc checkout
      
      * Update disabled test name in igpu manifest
      Fix unused variable
      
      * Unit tests disables
      
      * Review comments
      403a09ce
  29. 21 Nov, 2018 1 commit
    • Adding leaky relu (#2096) · 587b96e5
      Jayaram Bobba authored
      * Adding leaky relu
      
      * Silence compiler warning around fp compares
      
      * Fix copy-paste error and enable in-place for relu mkldnn kernels
      587b96e5
  30. 16 Nov, 2018 1 commit
  31. 11 Nov, 2018 1 commit
    • add isfinite check for all_close (#2028) · 702d465a
      Fenglei authored
      * add isfinite check
      
      * style
      
      * output 5 diff and total diff
      
      * output limit of diff for all_close_f
      
      * fix bug
      
      * disable tests
      
      * remove failing unit test that does not make sense.
      702d465a
  32. 07 Nov, 2018 1 commit
  33. 31 Oct, 2018 1 commit
    • Change Backend::create to return std::unique_ptr&lt;Backend&gt; (#1909) · 05a404a8
      Robert Kimball authored
      * create unique_ptr backend
      
      * unit test cleanup
      
      * address more code that was recently added
      
      * change from reference to pointer when passing backend to reduce the number of lines changed.
      
      * fix build error
      
      * fix python wrapper
      
      * style
      
      * more specific treatment for unique_ptr
      05a404a8
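The change makes ownership explicit: callers now receive a std::unique_ptr instead of a raw or shared pointer, and helpers take a non-owning raw pointer so that fewer call sites have to change. A minimal sketch of that pattern with illustrative names (not the nGraph classes themselves):

```cpp
#include <memory>
#include <string>

struct Backend
{
    virtual ~Backend() = default;
};

struct InterpreterBackend : Backend
{
};

// Factory returning unique_ptr: the caller clearly owns the backend.
std::unique_ptr<Backend> create_backend(const std::string& type)
{
    if (type == "INTERPRETER")
    {
        return std::unique_ptr<Backend>(new InterpreterBackend());
    }
    return nullptr;
}

int main()
{
    auto backend = create_backend("INTERPRETER");
    // Helpers observe through a raw, non-owning pointer so call sites barely change.
    Backend* observer = backend.get();
    return observer != nullptr ? 0 : 1;
}
```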
  34. 30 Oct, 2018 1 commit
    • Gauri/groupconv batchnorm (#1900) · c637d629
      gaurides authored
      * Initial implementation of GroupConv+BatchNorm fusion
      
      * Added GroupConv+BatchNorm with Relu fusion
      
      * Added changes to fuse with BoundedRelu
      
      * Changed BoundedRelu to Relu
      
      * Added test; Code cleanup
      
      * Code formatting
      
      * Removed dead code
      
      * Added test cases and other misc
      
      * Bug fix in group conv callback and general cleanup
      
      * Address PR feedback
      
      * Minor edit to comment. MKLDNN divides both input and output channels by groups
      
      * Style fixes and PR feedback
      c637d629
  35. 22 Oct, 2018 1 commit
    • BatchNorm splitting into ops (2nd try) (#1828) · 1beec46b
      Nick Korovaiko authored
      * split bn into bn_inference bn_training
      
      * fix warnings
      
      * Add GPU support for the new BN ops (#1569)
      
      * Add GPU support and change batchnorm_globalstats test to use BNInference.
      
      * Changed test back to using BNTraining for global stats and updated cudnn backend to account for it.
      
      * Fix issues in merge with master.
      
      * Formatting.
      
      * CPU fixes
      
      * remove 5-arg training BN for now
      
      * more fixes
      
      * python batchnorm changes
      
      * fix onnx_import
      
      * fix a call BatchNormInference c-tor
      
      * yet another fix to BatchNormInference c-tor
      
      * AND yet another fix to batchnorm_inference c-tor
      
      * ops.py
      
      * address adam's feedback
      
      * Remove unnecessary parameter/argument.
      
      * remove batch_norm_training_relu_with_global_stats
      
      * remove bn_relu (training)
      1beec46b
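Splitting BatchNorm into training and inference variants reflects that inference is a pure per-channel normalization with precomputed statistics, while training also has to compute batch statistics and feed backprop. A reference sketch of the inference computation for one channel (illustrative only):

```cpp
#include <cmath>
#include <cstddef>

// BatchNormInference for one channel: y = gamma * (x - mean) / sqrt(var + eps) + beta.
void batch_norm_inference_channel(const float* x, float* y, std::size_t count,
                                  float gamma, float beta,
                                  float mean, float variance, float eps)
{
    const float inv_std = 1.0f / std::sqrt(variance + eps);
    for (std::size_t i = 0; i < count; ++i)
    {
        y[i] = gamma * (x[i] - mean) * inv_std + beta;
    }
}
```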
  36. 15 Oct, 2018 1 commit
  37. 12 Oct, 2018 1 commit