- 11 Feb, 2019 1 commit
-
-
Jayaram Bobba authored
* CPUQuantFusion pass and some fusions for converting mixed precision sub-graphs to int8 fused ops * - Added unit tests and misc bug fixes for mixed-precision fusions - Adjust fused sum_scale in quantization builders instead of mkldnn primitive creation
-
- 04 Feb, 2019 1 commit
-
-
Robert Kimball authored
* fix windows build * wip * mkldnn seems to build * address various errors building cpu backend with MSVC * wip * wip * Windows support. * Delete dependency of LLVM when building with MSVC. * Define EIGEN_HAS_CONSTEXPR when using MSVS. * Fix MSVC build errors. * Incorrect argument to 'decltype'. It is a VC bug. Work around the error by renaming the function. * MINMAX issue in matmul_bias.cpp. * Correct TBB_LINK_LIBS on Windows. * Fix MSVC link errors. 1. Redefinition problems in cpu_builder.obj and convert_layout.obj. This is because cpu_builder.hpp contains an implicit implementation of function runtime::cpu::Builder::build for cpu::op::ConvertLayout. The fix is deleting the registration item in cpu_builder.cpp and using REGISTER_CPU_OP_BUILDER in convert_layout.cpp. 2. Fix the dependent libraries path on Windows. It should be *.lib not *.dll when linking these libraries. * Set visibility for CPU backend to fix the MSVC linker error. MSVC complains that the .def file exceeds the size limitation when using CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS. All the functions with CPU_BACKEND_API are used by unit test or nbench. * Fix unit test build errors on Windows. * backend_unary_elementwise.in.cpp: Use all_close_f to test case BACKEND sqrt * cpu_fusion.cpp: Fix 'NUM_STEPS' cannot be implicitly captured because no default capture mode has been specified * cpu_test.cpp: Use portable setenv and unsetenv from misc.hpp. * tools.cpp: Use portable fpopen from misc.hpp. * misc.hpp/misc.cpp: Add new files to host misc functions for which Linux and Windows use different implementations. * Make Debug mode work with MSVC. * style * fix line ending
-
- 02 Feb, 2019 1 commit
-
-
Pruthvi authored
* - check to verify if the data_slices share the same weights * add the serialized graph * - explicitly fuse the data slices, so all the parameters partitioned by slices are in contiguous memory locations - fixes all the failing test cases
-
- 18 Jan, 2019 1 commit
-
-
Louis Feng authored
* batch dot bprop WIP. * WIP. * testing. * clean up debug code. * comments and var name change. * clean up. * format style, batch dot differentiable pass. * removed debug output. * added unit test to autodiff, refactored make_function -> make_function_from_file. * fixed build warning. * fixed gpu build error. * clang format fix. * all test_tools.cpp to find SERIALIZED_ZOO * remove cmake redef. * fix unused macro. * making test cpu only. * testing build var * macro test * verbose makefile test * style fix * verbose make * test/util needs test/models. * removed debug output. * refactor fusion type. * refactor fusion type.
-
- 03 Jan, 2019 1 commit
-
-
Robert Kimball authored
* update licenses for 2019 * style
-
- 19 Dec, 2018 1 commit
-
-
Robert Kimball authored
* make validate public * move compile call outside of call for unit tests * fix compile error * one more error
-
- 07 Dec, 2018 2 commits
-
-
Jayaram Bobba authored
* initial commit for update slice op * Finished up update_slice fusion and added codegen support * style fixes * Added unit test for in-place update-slice strided * change pattern name
-
Robert Kimball authored
* change compile call to return Handle * make CPU require compile() before call() * fix unit tests to call compile() before call() * fix failing ops * update unit test * revert some changes * more fixups * more diff cleanup * a few more issues addressed * more fixes * update API * more updates * fix test_ops.py * fix * another attempt to fix * fix unit test * fix test error
-
- 06 Dec, 2018 2 commits
-
-
Nick Korovaiko authored
* one output passing tests clean up fix build breaks * move generators into a separate file
-
Pruthvi authored
* - Added reorder support for rnn weights_layer/iter * i) fixed compilation issues ii) working but still observing precision error * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter * i) added support for src reorder to TNC from NTC * reorder support for rnn output from NTC to TNC * - added support for rnn weight reorder ldgoi -> ldigo - code refactor for lstm/rnn kernel in mkldnn emitter * - refactor rnn mkldnn kernel, change variable names * fix RNN codegen kernel * disable layer rnn fusion pass, to test CI * method to validate recurrent rnn inputs * add correlated matches for Recurrent RNN PM * - simplify reorder logic for rnn_weights - fix graph pattern for fusing rnn cell across time steps * do weights reorders in rnn timesteps fusion * refactored LSTM graph pass * - Bug fix for finding the lstm inputs deterministically - Refactored LSTM graph pass to single pass - made changes to LSTM RNN time step fusion graph pass * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass * fix compilation error * Fix GNMT rnn fusion * check if the node is in use before replacing in RNN graph passes * i) fix style ii) fix topo sort issue in RNN graph pass * style fix * fix bug in simplify_concat pass * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2 * cse for convert layout * addressed PR comments * - optimization pass to remove Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2 - conditional fusing of LSTM cells only for the decoder * made changes to multi layer RNN fusion callback * fix asserts in RNN op * - added support to fuse layers when slc=dlc for RNN cells - bug fix on the sanity checks for RNN Op * - support RNN layer fusion till slc = dlc - bug fixes in multi layer rnn fusion call back * capture reshape in the RNN weights * Addressed PR comments * - added comments in multi layer PM call back - fuse only if slc == DLC across layers * restore deleted 3_lstm_cell_forward.json file * fix typo * fix failing unit tests * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input. * Address PR feedback: process in place slice after propagating in place input. * Set INTERMEDIATE role before propagating in place input. * Do not add temporaries to the variable name map before propagating in place input in codegen. * Fix a bug in codegen. * Fix a bug in codegen slice. * reenable disabled rnn unit test * fix compiler error * - bug fix in the slicing logic for the layer fused rnn cell - fix failing rnn unit test * - Addressed PR comments - removed redundant checks from the rnn graph pass - simplified rnn call back replace node logic * - added new multilayer rnn *.json file - fix test case * [PRIVATE BRANCH] Style fixes (#2080) * Style fixes * change order of lstm gates * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113) * Style fixes for single-layer RNN fusion * Style fixes to multi-layer RNN * style fix * disable GPU test
-
- 05 Dec, 2018 1 commit
-
-
Pruthvi authored
* - modified cpu_assignment pass to support bn with 5D input - added test cases for 5D bn and 5D bn+relu * - Address PR comments - used mkldnn_utils to validate bn for mkldnn * fix compilation error * Addressed PR comments - added helpers in mkldnn_utils for assigning ngraph Op as MKLDNN op - helper function for bn mkldnn assignment * fix clang error
-
- 28 Nov, 2018 1 commit
-
-
Scott Cyphers authored
* Fix batchnorm argument order, cleanup some comments, fix backprop * Merge error * Clean up training function, organize inference test * BatchNormInference tests * Training case * Training test * Fix autodiff BatchNorm test * Cleanup * Move file to doc checkout * Update disabled test name in igpu manifest Fix unused variable * Unit tests disables * Review comments
-
- 21 Nov, 2018 1 commit
-
-
Jayaram Bobba authored
* Adding leaky relu * Silence compiler warning around fp compares * Fix copy-paste error and enable in-place for relu mkldnn kernels
-
- 16 Nov, 2018 1 commit
-
-
Robert Kimball authored
* Move ParameterVector and ResultVector to the ngraph namespace where they belong * update python wrapper * more python fixes * style * Update setup.py * fix some new code
-
- 11 Nov, 2018 1 commit
-
-
Fenglei authored
* add isfinite check * style * output 5 diff and total diff * output limit of diff for all_close_f * fix bug * disable tests * remove failing unit test that does not make sense.
-
- 07 Nov, 2018 1 commit
-
-
Amy Zhuang authored
* Do not fuse nodes if one node is predecessor of another node in horizontal fusion. * Add dead node check and remove predecessor check in horizontal fusion.
-
- 31 Oct, 2018 1 commit
-
-
Robert Kimball authored
* create unique_ptr backend * unit test cleanup * address more code that was recently added * change from reference to pointer when passing backend to reduce the number of lines changed. * fix build error * fix python wrapper * style * more specific treatment for unique_ptr
-
- 30 Oct, 2018 1 commit
-
-
gaurides authored
* Initial implementation of GroupConv+BatchNorm fusion * Added GroupConv+BatchNorm with Relu fusion * Added changes to fuse with BoundedRelu * Changed BoundedRelu to Relu * Added test; Code cleanup * Code formatting * Removed dead code * Added test cases and other misc * Bug fix in group conv callback and general cleanup * Address PR feedback * Minor edit to comment. MKLDNN divides both input and output channels by groups * Style fixes and PR feedback
-
- 22 Oct, 2018 1 commit
-
-
Nick Korovaiko authored
* split bn into bn_inference bn_training * fix warnings * Add GPU support for the new BN ops (#1569) * Add GPU support and change batchnorm_globalstats test to use BNInference. * Changed test back to using BNTraining for global stats and updated cudnn backend to account for it. * Fix issues in merge with master. * Formatting. * CPU fixes * remove 5-arg training BN for now * more fixes * python batchnorm changes * fix onnx_import * fix a call BatchNormInference c-tor * yet another fix to BatchNormInference c-tor * AND yet another fix to batchnorm_inference c-tor * ops.py * address adam's feedback * Remove unnecessary parameter/argument. * remove batch_norm_training_relu_with_global_stats * remove bn_relu (training)
-
- 15 Oct, 2018 1 commit
-
-
Nick Korovaiko authored
-
- 12 Oct, 2018 1 commit
-
-
Amy Zhuang authored
-
- 08 Oct, 2018 1 commit
-
-
Jayaram Bobba authored
* Check output shape when setting memory layout for slice op. * Miscellaneous fusion and other optimizations for inception-resnetv2 - ConvBias Batchnorm folding - ConvBias Affine folding - Check if MKLDNN can slice a given layout and select layouts appropriately * Fixed unit test and bug in conv bias pattern * Addressed PR feedback * Addressed PR feedback
-
- 05 Oct, 2018 1 commit
-
-
Jaikrishnan Menon authored
-
- 02 Oct, 2018 1 commit
-
-
Pruthvi authored
* WIP input * weights rnn optimization * concat + slicing + replacing new node works * WIP unit test case of fusing rnn inputs * - Added unit test case for fusing rnn input weights - registered CPURnnMatFusion_v1/v2 in codegen and DEX * fixed redeclaration of a variable * Refactored rnn input transformation passes into a single pass * Refactored CPURnnMatFusion call back functions * change random generator range to include -ve values in unit test * address PR comments * don't fuse if the shapes of the data slices don't match
-
- 29 Sep, 2018 1 commit
-
-
Robert Kimball authored
* rename files * rename runtime TensorView to Tensor * rename HostTensorView to HostTensor
-
- 21 Sep, 2018 1 commit
-
-
Amy Zhuang authored
* Add CPU horizontal fusion pass for inception. * Name change. * Move horizontal fusion to cpu_fusion. * Change horizontal fusion pass for inception to a general horizontal fusion pass. Add a unit test conv_horizontal_fusion to cpu_fusion. * Rename files. * Correct cpu_fusion.hpp. * Add NGRAPH_DEBUG. * Set native layout when input format of slice is nChw16c or nChw8c and lower bound of channels is not a multiple of 16 or 8.
-
- 14 Sep, 2018 1 commit
-
-
Scott Cyphers authored
* Remove "view" Simplify layout * Fix merge error * fix build error * PR1602. IntelGPU backend. Compilation fixed.
-
- 13 Sep, 2018 1 commit
-
-
Robert Kimball authored
* add unsupported_op exception * unsupported_op test * add printout of unsupported op in model * fix GPU dispatcher check * fix test designation * catch exceptions on single file runs too * add unsupported_op exception where needed * remove unsupported_op class * add unassigned op exception * add unit test * catch unsupported op in nbench * add cpu test back * update all latest merges * mode change
-
- 11 Sep, 2018 1 commit
-
-
gaurides authored
* Add conv add fusion * Updated file permissions and cpu_fusion order * Formatted code using maint/apply-code-format.sh * Fixed minor review comments * Use NODE_VALIDATION_ASSERT instead of throw ngraph_error; upgrade baseline and fix issues * Some more fixes
-
- 29 Aug, 2018 2 commits
-
-
Robert Kimball authored
* use line comments instead of multiline comments for license header * update more * update new files * more header updates * style
-
Pruthvi authored
disabled RNN test to work around RNN unit test failure on Mac due to bug in MKLDNN scratchpad creation (#1502)
-
- 27 Aug, 2018 1 commit
-
-
Robert Kimball authored
* normalize comments * address review comments
-
- 15 Aug, 2018 1 commit
-
-
Jayaram Bobba authored
* Fold affine transformations on 4d convolution * Handle more cases for affine parameters * Style fix
-
- 13 Aug, 2018 1 commit
-
-
Robert Kimball authored
* enable parameter validation for all unit tests
-
- 10 Aug, 2018 1 commit
-
-
Jayaram Bobba authored
* Dex non-mkldnn version of clipped relu * Change to static_cast
-
- 07 Aug, 2018 1 commit
-
-
Jayaram Bobba authored
* Switch to using mkldnn memory descriptors for layout * More changes for using mkldnn descriptor instead of format * Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding * Rotate mkldnn layouts on transpose * Modifications to builder reshape to skip rotated layouts * More fixes to layouts and removes axis order from cpu layout descriptor * Code cleanup * Removed shuffle folding pass since the functionality is subsumed by the layout pass * Canonicalize a few more formats to keep MKLDNN happy. * Style fixes * Style fixes * Style fixes * Addressed PR feedback and added reshape passthrough for non-transpose cases * Adjust named formats for weights tensors to keep MKLDNN happy * Style fixes * resolved merge issues
-
- 18 Jul, 2018 1 commit
-
-
Nick Korovaiko authored
* cpu loop kernel fusion pass * remove extra code * bounded relu test * address scotts feedback
-
- 17 Jul, 2018 1 commit
-
-
Jayaram Bobba authored
* CPU Direct Execution: Implement ConvertLayout and refactor * CPU Direct Execution: Implement Convolution * 1) Adds computation reuse to direct execution 2) Add avg_pool, broadcast and convolution_bias to direct execution 3) Moved some computation reuse utility functions to graph_utils * Use lists instead of vectors to avoid reallocation overheads * - Added convolution variants to direct execution - Removed ConvolutionBiasRelu, use ConvolutionBias instead - Reduced code duplication by moving functionality to mkldnn_emitter from cpu_emitter * Style fix * Moved mkldnn build_convolution to a templated method * Style fix * refactored mkldnn conv bprop builders * Style fix
-
- 11 Jul, 2018 1 commit
-
-
Pruthvi authored
-
- 03 Jul, 2018 1 commit
-
-
Louis Feng authored
* hacking to support dot of 3 by 2 inputs with gemm_batch. * clean up.
-