- 15 Oct, 2018 1 commit
-
-
Nick Korovaiko authored
-
- 12 Oct, 2018 1 commit
-
-
Amy Zhuang authored
-
- 08 Oct, 2018 1 commit
-
-
Jayaram Bobba authored
* Check output shape when setting memory layout for slice op.
* Miscellaneous fusion and other optimizations for inception-resnetv2
  - ConvBias Batchnorm folding
  - ConvBias Affine folding
  - Check if MKLDNN can slice a given layout and select layouts appropriately
* Fixed unit test and bug in conv bias pattern
* Addressed PR feedback
* Addressed PR feedback
-
- 05 Oct, 2018 1 commit
-
-
Jaikrishnan Menon authored
-
- 02 Oct, 2018 1 commit
-
-
Pruthvi authored
* WIP input * weights rnn optimization
* concat + slicing + replacing new node works
* WIP unit test case of fusing rnn inputs
* - Added unit test case for fusing rnn input weights
  - Registered CPURnnMatFusion_v1/v2 in codegen and DEX
* Fixed redeclaration of a variable
* Refactored rnn input transformation passes into a single pass
* Refactored CPURnnMatFusion callback functions
* Change random generator range to include negative values in unit test
* Address PR comments
* Don't fuse if the shapes of the data slices don't match
-
- 29 Sep, 2018 1 commit
-
-
Robert Kimball authored
* rename files
* rename runtime TensorView to Tensor
* rename HostTensorView to HostTensor
-
- 21 Sep, 2018 1 commit
-
-
Amy Zhuang authored
* Add CPU horizontal fusion pass for inception.
* Name change.
* Move horizontal fusion to cpu_fusion.
* Change horizontal fusion pass for inception to a general horizontal fusion pass. Add a unit test conv_horizontal_fusion to cpu_fusion.
* Rename files.
* Correct cpu_fusion.hpp.
* Add NGRAPH_DEBUG.
* Set native layout when input format of slice is nChw16c or nChw8c and lower bound of channels is not a multiple of 16 or 8.
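The last item above states a layout rule for slice. A minimal sketch of that rule follows, assuming the blocked formats nChw16c and nChw8c use channel blocks of 16 and 8 respectively. The function name and the handling of other formats are hypothetical and are not taken from the nGraph source.

```python
# Channel block sizes implied by the blocked format names in the commit message.
BLOCK_SIZE = {"nChw16c": 16, "nChw8c": 8}


def slice_needs_native_layout(input_format: str, channel_lower_bound: int) -> bool:
    """Return True when a slice should fall back to the native layout.

    Hypothetical helper: a blocked input can only be sliced in place when the
    slice starts on a channel block boundary.
    """
    block = BLOCK_SIZE.get(input_format)
    if block is None:
        # Assumption: the rule only concerns the two blocked formats named above.
        return False
    return channel_lower_bound % block != 0


if __name__ == "__main__":
    # A slice starting at channel 8 cuts through a 16-channel block.
    print(slice_needs_native_layout("nChw16c", 8))
    # A slice starting at channel 32 is aligned to a 16-channel block.
    print(slice_needs_native_layout("nChw16c", 32))
```

The sketch checks only the lower bound of the channel axis, as the commit message does. Any further checks the actual pass performs are not modeled here.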
-
- 14 Sep, 2018 1 commit
-
-
Scott Cyphers authored
* Remove "view" Simplify layout
* Fix merge error
* fix build error
* PR1602. IntelGPU backend. Compilation fixed.
-
- 13 Sep, 2018 1 commit
-
-
Robert Kimball authored
* add unsupported_op exception
* unsupported_op test
* add printout of unsupported op in model
* fix GPU dispatcher check
* fix test designation
* catch exceptions on single file runs too
* add unsupported_op exception where needed
* remove unsupported_op class
* add unassigned op exception
* add unit test
* catch unsupported op in nbench
* add cpu test back
* update all latest merges
* mode change
-
- 11 Sep, 2018 1 commit
-
-
gaurides authored
* Add conv add fusion
* Updated file permissions and cpu_fusion order
* Formatted code using maint/apply-code-format.sh
* Fixed minor review comments
* Use NODE_VALIDATION_ASSERT instead of throw ngraph_error; upgrade baseline and fix issues
* Some more fixes
-
- 29 Aug, 2018 2 commits
-
-
Robert Kimball authored
* use line comments instead of multiline comments for license header
* update more
* update new files
* more header updates
* style
-
Pruthvi authored
Disabled RNN test to work around a unit test failure on macOS caused by a bug in MKLDNN scratchpad creation (#1502)
-
- 27 Aug, 2018 1 commit
-
-
Robert Kimball authored
* normalize comments
* address review comments
-
- 15 Aug, 2018 1 commit
-
-
Jayaram Bobba authored
* Fold affine transformations on 4d convolution
* Handle more cases for affine parameters
* Style fix
-
- 13 Aug, 2018 1 commit
-
-
Robert Kimball authored
* enable parameter validation for all unit tests
-
- 10 Aug, 2018 1 commit
-
-
Jayaram Bobba authored
* Dex non-mkldnn version of clipped relu
* Change to static_cast
-
- 07 Aug, 2018 1 commit
-
-
Jayaram Bobba authored
* Switch to using mkldnn memory descriptors for layout
* More changes for using mkldnn descriptor instead of format
* Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding
* Rotate mkldnn layouts on transpose
* Modifications to builder reshape to skip rotated layouts
* More fixes to layouts and removes axis order from cpu layout descriptor
* Code cleanup
* Removed shuffle folding pass since the functionality is subsumed by the layout pass
* Canonicalize a few more formats to keep MKLDNN happy.
* Style fixes
* Style fixes
* Style fixes
* Addressed PR feedback and added reshape passthrough for non-transpose cases
* Adjust named formats for weights tensors to keep MKLDNN happy
* Style fixes
* resolved merge issues
-
- 18 Jul, 2018 1 commit
-
-
Nick Korovaiko authored
* cpu loop kernel fusion pass
* remove extra code
* bounded relu test
* address Scott's feedback
-
- 17 Jul, 2018 1 commit
-
-
Jayaram Bobba authored
* CPU Direct Execution: Implement ConvertLayout and refactor
* CPU Direct Execution: Implement Convolution
* 1) Adds computation reuse to direct execution
  2) Add avg_pool, broadcast and convolution_bias to direct execution
  3) Moved some computation reuse utility functions to graph_utils
* Use lists instead of vectors to avoid reallocation overheads
* - Added convolution variants to direct execution
  - Removed ConvolutionBiasRelu, use ConvolutionBias instead
  - Reduced code duplication by moving functionality to mkldnn_emitter from cpu_emitter
* Style fix
* Moved mkldnn build_convolution to a templated method
* Style fix
* refactored mkldnn conv bprop builders
* Style fix
-
- 11 Jul, 2018 1 commit
-
-
Pruthvi authored
-
- 03 Jul, 2018 1 commit
-
-
Louis Feng authored
* hacking to support dot of 3 by 2 inputs with gemm_batch.
* clean up.
-
- 02 Jul, 2018 3 commits
-
-
Sandeep authored
* declare sigmoid for core fusion
* add simple test for sigmoid
* info fusion status
* cp op as main op
* builds as expected
* move sigmoid fusion code
* add reference kernel
* sigmoid bprop reference kernel and clang-format
* add delta to bprop
* fprop called
* compiles bprop
* move tests
* serializer support
* address comments in code
* add doc
* naming similar to core ops
* fix failing test
* fix failing test
* address clang issue
* more changes
* change test macro
-
Pruthvi authored
* 1. Added MKLDNN BoundedRelu op support for Relu6
  2. CpuLayout && CPU assignment pass for BoundedRelu Op
  3. Unit test interpreter vs. CPU for BoundedReluOp
  4. MKLDNN and default emitter code for BoundedReluOp
* Removed Debug prints
* 1. Added support for boundedrelu to work on any constant literal
  2. Unit test case for rank2, rank3, rank4 for bounded relu without serialized graph
* Removed is_six() method
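For readers unfamiliar with the op, bounded ReLU clamps each value to the range [0, alpha], and Relu6 is the case alpha = 6. The following is a reference sketch of that definition only, written for illustration. It is not the nGraph kernel or the MKLDNN primitive, and the function name is hypothetical.

```python
def bounded_relu(values, alpha):
    """Clamp every element to [0, alpha]; alpha = 6 gives Relu6."""
    return [min(max(v, 0.0), alpha) for v in values]


if __name__ == "__main__":
    # Negative inputs clamp to 0, large inputs clamp to alpha.
    print(bounded_relu([-1.0, 3.0, 8.0], 6.0))
    # Any constant upper bound works, matching the "any constant literal" item above.
    print(bounded_relu([-1.0, 3.0, 8.0], 2.5))
```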
-
Louis Feng authored
* Reshape bias to 1D for conv + bias bprop fusion
* Reshape goe2 back to 2D before replacing
* added shape checks to validate conv+bias op.
* removed conv+bias backprop merge for separate PR review.
* fixed conv_bias_bprop test.
* minor changes to error messages.
-
- 30 Jun, 2018 1 commit
-
-
Nick Korovaiko authored
* collector
* keeping track of inputs; simplifying a merging strategy; adding LKGraph
* LoopKernel Collector
* address feedback
* address feedback 2
* address feedback 3
-
- 26 Jun, 2018 1 commit
-
-
Jayaram Bobba authored
* inplace compute
* fix warnings
* Initial support for convolution sum fusion
* Added in-place support for conv sum fusion and test cases
* reverting spurious changes
* Bug fix to account for inplace input in conv sum fusion
* fix compilation error
* Addressed PR feedback
-
- 22 Jun, 2018 1 commit
-
-
Matthew Brookhart authored
-
- 19 Jun, 2018 1 commit
-
-
Nick Korovaiko authored
* loop kernel + tests
* remove commented out code
* remove commented code; add comments
* copy_with_new_args +test
* add comment
* fix comp errors
-
- 15 Jun, 2018 1 commit
-
-
Pruthvi authored
* - Added graph pass for fusing RNN op across layer
  - Added test case for interpreter vs. CPU for verifying layer fused RNN
  - More sanity checks in the RNN fusion graph pass
  - Added support to replace the recurrent cell state correctly in the fused RNN op
* Fixed multi layer rnn fusion unit test failure
* Addressed PR comments
-
- 13 Jun, 2018 1 commit
-
-
Nick Korovaiko authored
* group conv init
* add GroupConvolution op; refine checks in fusion logic
* add an emitter, cpu assignment
* cpu_layout
* add checks to algebraic simplification
* updating emitter logic for groupconvolution
* working before refactoring
* moving primitive creation logic to mkldnn_emitter
* group convolution graph test
* rename an opt
* address jbobba's feedback
-
- 07 Jun, 2018 1 commit
-
-
Louis Feng authored
* batch dot pattern wip.
* batch dot pattern wip.
* added batch dot op.
* batch dot compute testing.
* correct gemm parameters.
* renaming matrix fusions passes and update tests.
* clean up.
* clang format.
* more clean ups.
* clang format.
* added CPUBatchDotFusion to default cpu passes.
* added missing header.
* added element type check.
-
- 06 Jun, 2018 1 commit
-
-
Nishant Patel authored
* Support 3-D pool with mkldnn
* Move execute() to test_tools.hpp
-
- 31 May, 2018 1 commit
-
-
Louis Feng authored
-
- 30 May, 2018 2 commits
-
-
Nick Korovaiko authored
* refactor cpworkspaceinsertion for mxnet
* rename mxnet functions to adhere to ngraph naming convention
* define a member static const int in a cpp file to resolve a linking issue
-
Nishant Patel authored
-
- 23 May, 2018 1 commit
-
-
Pruthvi authored
* Added pattern matcher for LSTM cell
* WIP added support to replace lstm cell instead of subgraph
* WIP LSTM pattern matcher, fuses recurrent cells
* WIP added RNN CPU op
* WIP mkldnn emitter code for fprop RNN
* WIP RNN mkldnn integration
  - Added mkldnn kernel for unidirectional LSTM in the CPU emitter
* add a getter for root node
* recurrent graph rewrite
* fix perms, rename match_root -> get_match_root
* fix comp errors
* make match_root return the topmost match; fix tests
* - WIP GetOutputElement for handling multiple LSTM outputs
  - Use RecurrentGraphRewrite for replacing node after matching LSTM cells
* WIP LSTM multi Output + debug prints
* moved LSTM fusion to cpu_fusion
* WIP added RNN superfused OP
* WIP towards RNN layer fusion
* WIP multiple output slicing RNN
* WIP RNN multiple outputs fusion across layer
* WIP corrected input params for fused RNN OP
* concat corresponding params across different LSTMs to form inputs to RNN fused op
* i) Added test case for RNN kernel ii) runs without errors
* refactored and moved LSTM class to standalone file
* Rename RNN -> Rnn, LSTM -> Lstm
* WIP replace lstm slices to the consumer op
* Slicing works on multiple RNN layers
* fixed all bugs
* - Added CPU RNN Recurrent Fusion
  - Added CPU LSTM fusion
  - Removed debug code
  - Style fix
* - Added support to compute src_iter and dst_iter instead of taking zero_memory_desc
  - Added unit test to compute one LSTM cell
* changed RNN op signature to accept number of states in basic unit of RNN (GRU/LSTM/vanilla RNN) cell
* added sanity checks for RNN op
* Fixed issue related to patching the graph while replacing the RNN sliced outputs
* Fixed issue to feed the input symbols in the order X0, X1, ...Xt to the RNN op
* Added unit test for multi layer RNN fusion
* Removed debug statements
* i) Added multilayered serialized graph ii) fixed compilation issue
* Addressed PR comments
* i) WIP MKLDNN layout for RNN Op ii) added test case for INTERPRETER vs. CPU Rnn results
* - Fixed bug w.r.t. src_layer feature size in rnn mkldnn emitter code
  - Refactored cpu_fusion rnn test case
* merge origin/master with branch pruthvi/lstm_fusion
* style fix
* Added test case for multiple RNN layers
* i) make rnn as mkldnn op if it meets the constraints ii) assert if rnn is not mkldnn op
* fix unit test failure
* - Added support to reliably identify the hidden state and input symbols from the nodes collected by Pattern matcher
  - Fixed failing unit tests
* style fix
* Removed "node type" dependency to replace the intermediate LSTM outputs
* Addressed PR comments
* Fix unit test
* - Added MKLDNN emitter for LSTM op
  - Graph pass to concat LSTM input recurrent state tensors
  - CPU layout assignment for LSTM Op
  - Fixed bug in rnn/lstm unit tests
  - Made changes to use replace_output instead of replace_node for replacing matched graph nodes in LSTM/RNN fusion pass (cherry picked from commit d16fc709265cc0a73e60c6d5f6d2878e7b908aca)
* style fix
* Renamed passes and style fixes
-
- 21 May, 2018 1 commit
-
-
Jayaram Bobba authored
* Batch norm folding
* Addressed PR feedback
* Style fixes
* Style fix
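As background for this commit, batch norm folding in general rewrites an inference-time convolution followed by batch normalization into a single convolution with rescaled weights and an adjusted bias. The sketch below illustrates the standard arithmetic on a 1x1 convolution at one pixel. It assumes per-output-channel statistics and a bias term; all function names are hypothetical, and it does not reproduce the nGraph pass.

```python
import math


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch norm parameters into conv weights and bias.

    weights[c] is the flat list of filter weights for output channel c.
    """
    folded_w, folded_b = [], []
    for c in range(len(weights)):
        scale = gamma[c] / math.sqrt(var[c] + eps)
        folded_w.append([w * scale for w in weights[c]])
        folded_b.append(beta[c] + (bias[c] - mean[c]) * scale)
    return folded_w, folded_b


def conv1x1(x, weights, bias):
    """1x1 convolution at a single pixel; x holds the input channel values."""
    return [sum(w * xi for w, xi in zip(weights[c], x)) + bias[c]
            for c in range(len(weights))]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization applied per channel."""
    return [gamma[c] * (y[c] - mean[c]) / math.sqrt(var[c] + eps) + beta[c]
            for c in range(len(y))]


# Illustrative values only.
weights = [[1.0, 2.0], [0.5, -1.0]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.5], [0.0, 1.0]
mean, var = [0.3, -0.1], [4.0, 0.25]
x = [2.0, -3.0]

unfused = batch_norm(conv1x1(x, weights, bias), gamma, beta, mean, var)
fw, fb = fold_batch_norm(weights, bias, gamma, beta, mean, var)
fused = conv1x1(x, fw, fb)
```

With these definitions `fused` and `unfused` agree up to floating-point rounding, which is the property a folding pass relies on.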
-
- 16 May, 2018 1 commit
-
-
Nick Korovaiko authored
* give frontends some flexibility over fusions they would like to run
* address jbobba's feedback
-
- 08 May, 2018 1 commit
-
-
Nick Korovaiko authored
* MaxPoolWithIndices CPU Fusion
* fix test to pass checks in cpu_fusion
* pass test
* clean up
* add a new pass, add layouts
* remove the opt from cpu_fusion
* refactor cpu_layout logic for maxpool, clean up comments
* add comment w.r.t. indices tensor
* rename to cpu_workspace_insertion
* add CPUWorkspaceInsertion pass for TF
-
- 04 May, 2018 1 commit
-
-
Jayaram Bobba authored
* Adding support for mkldnn convolution+bias+relu op to use in batch norm folding
* Style fix
* Style fix
-