- 29 Aug, 2018 2 commits
-
-
Robert Kimball authored
* use line comments instead of multiline comments for license header * update more * update new files * more header updates * style
-
Pruthvi authored
Disabled RNN test to work around RNN unit test failure on macOS due to a bug in MKLDNN scratchpad creation (#1502)
-
- 27 Aug, 2018 1 commit
-
-
Robert Kimball authored
* normalize comments * address review comments
-
- 15 Aug, 2018 1 commit
-
-
Jayaram Bobba authored
* Fold affine transformations on 4d convolution * Handle more cases for affine parameters * Style fix
-
- 13 Aug, 2018 1 commit
-
-
Robert Kimball authored
* enable parameter validation for all unit tests
-
- 10 Aug, 2018 1 commit
-
-
Jayaram Bobba authored
* Dex non-mkldnn version of clipped relu * Change to static_cast
-
- 07 Aug, 2018 1 commit
-
-
Jayaram Bobba authored
* Switch to using mkldnn memory descriptors for layout
* More changes for using mkldnn descriptor instead of format
* Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding
* Rotate mkldnn layouts on transpose
* Modifications to builder reshape to skip rotated layouts
* More fixes to layouts and removes axis order from cpu layout descriptor
* Code cleanup
* Removed shuffle folding pass since the functionality is subsumed by the layout pass
* Canonicalize a few more formats to keep MKLDNN happy.
* Style fixes
* Style fixes
* Style fixes
* Addressed PR feedback and added reshape passthrough for non-transpose cases
* Adjust named formats for weights tensors to keep MKLDNN happy
* Style fixes
* resolved merge issues
-
- 18 Jul, 2018 1 commit
-
-
Nick Korovaiko authored
* cpu loop kernel fusion pass * remove extra code * bounded relu test * address Scott's feedback
-
- 17 Jul, 2018 1 commit
-
-
Jayaram Bobba authored
* CPU Direct Execution: Implement ConvertLayout and refactor
* CPU Direct Execution: Implement Convolution
* 1) Adds computation reuse to direct execution
  2) Add avg_pool, broadcast and convolution_bias to direct execution
  3) Moved some computation reuse utility functions to graph_utils
* Use lists instead of vectors to avoid reallocation overheads
* - Added convolution variants to direct execution
  - Removed ConvolutionBiasRelu, use ConvolutionBias instead
  - Reduced code duplication by moving functionality to mkldnn_emitter from cpu_emitter
* Style fix
* Moved mkldnn build_convolution to a templated method
* Style fix
* refactored mkldnn conv bprop builders
* Style fix
-
- 11 Jul, 2018 1 commit
-
-
Pruthvi authored
-
- 03 Jul, 2018 1 commit
-
-
Louis Feng authored
* hacking to support dot of 3 by 2 inputs with gemm_batch. * clean up.
-
- 02 Jul, 2018 3 commits
-
-
Sandeep authored
* declare sigmoid for core fusion
* add simple test for sigmoid
* info fusion status
* cp op as main op
* builds as expected
* move sigmoid fusion code
* add reference kernel
* sigmoid bprop reference kernel and clang-format
* add delta to bprop
* fprop called
* compiles bprop
* move tests
* serializer support
* address comments in code
* add doc
* naming similar to core ops
* fix failing test
* fix failing test
* address clang issue
* more changes
* change test macro
-
Pruthvi authored
* 1. Added MKLDNN BoundedRelu op support for Relu6 2. CpuLayout && CPU assignment pass for BoundedRelu Op 3. Unit test interpreter vs. CPU for BoundedReluOp 4. MKLDNN and default emitter code for BoundedReluOp * Removed Debug prints * 1. Added support for boundedrelu to work on any constant literal 2. unit test case for rank2, rank3, rank4 for bounded relu without serialized graph * Removed is_six() method
-
Louis Feng authored
* Reshape bias to 1D for conv + bias bprop fusion * Reshape goe2 back to 2D before replacing * added shape checks to validate conv+bias op. * removed conv+bias backprop merge for separate PR review. * fixed conv_bias_bprop test. * minor changes to error messages.
-
- 30 Jun, 2018 1 commit
-
-
Nick Korovaiko authored
* collector * keeping track of inputs; simplifying a merging strategy; adding LKGraph * LoopKernel Collector * address feedback * address feedback 2 * address feedback 3
-
- 26 Jun, 2018 1 commit
-
-
Jayaram Bobba authored
* inplace compute * fix warnings * Initial support for convolution sum fusion * Added in-place support for conv sum fusion and test cases * reverting spurious changes * Bug fix to account for inplace input in conv sum fusion * fix compilation error * Addressed PR feedback
-
- 22 Jun, 2018 1 commit
-
-
Matthew Brookhart authored
-
- 19 Jun, 2018 1 commit
-
-
Nick Korovaiko authored
* loop kernel + tests * remove commented out code * remove commented code; add comments * copy_with_new_args +test * add comment * fix comp errors
-
- 15 Jun, 2018 1 commit
-
-
Pruthvi authored
* - Added graph pass for fusing RNN op across layer - Added test case for interpreter vs. cpu for verifying layer fused RNN - more sanity checks in the RNN fusion graph pass - added support to replace the recurrent cell state correctly in the fused RNN op * Fixed multi layer rnn fusion unit test failure * Addressed PR comments
-
- 13 Jun, 2018 1 commit
-
-
Nick Korovaiko authored
* group conv init * add GroupConvolution op; refine checks in fusion logic * add an emitter, cpu assignment * cpu_layout * add checks to algebraic simplification * updating emitter logic for groupconvolution * working before refactoring * moving primitive creation logic to mkldnn_emitter * group convolution graph test * rename an opt * address jbobba's feedback
-
- 07 Jun, 2018 1 commit
-
-
Louis Feng authored
* batch dot pattern wip. * batch dot pattern wip. * added batch dot op. * batch dot compute testing. * correct gemm parameters. * renaming matrix fusions passes and update tests. * clean up. * clang format. * more clean ups. * clang format. * added CPUBatchDotFusion to default cpu passes. * added missing header. * added element type check.
-
- 06 Jun, 2018 1 commit
-
-
Nishant Patel authored
* Support 3-D pool with mkldnn * Move execute() to test_tools.hpp
-
- 31 May, 2018 1 commit
-
-
Louis Feng authored
-
- 30 May, 2018 2 commits
-
-
Nick Korovaiko authored
* refactor cpworkspaceinsertion for mxnet * rename mxnet functions to adhere to ngraph naming convention * define a member static const int in a cpp file to resolve a linking issue
-
Nishant Patel authored
-
- 23 May, 2018 1 commit
-
-
Pruthvi authored
* - Added pattern matcher for LSTM cell
* WIP added support to replace lstm cell instead of subgraph
* WIP LSTM pattern matcher, fuses recurrent cells
* WIP added RNN CPU op
* WIP mkldnn emitter code for fprop RNN
* WIP RNN mkldnn integration - Added mkldnn kernel for uni directional LSTM in the CPU emitter
* add a getter for root node
* recurrent graph rewrite
* fix perms, rename match_root -> get_match_root
* fix comp errors
* make match_root return the topmost match; fix tests
* - WIP GetOutputElement for handling multiple LSTM o/ps
  - use RecurrentGraphRewrite for replacing node after matching LSTM cells
* WIP LSTM multi Output + debug prints
* moved LSTM fusion to cpu_fusion
* WIP added RNN superfused OP
* WIP towards RNN layer fusion
* WIP multiple output slicing RNN
* WIP RNN multiple o/ps fusion across layer
* WIP corrected input params for fused RNN OP
* concat corresponding params across different LSTM to form inputs to RNN fused op
* i) Added test case for RNN kernel ii) runs without errors
* refactored and moved LSTM class to standalone file
* Rename RNN -> Rnn , LSTM -> Lstm
* WIP replace lstm slices to the consumer op
* Slicing works on multiple RNN layers
* fixed all bugs
* - Added CPU RNN Recurrent Fusion
  - Added CPU LSTM fusion
  - removed debug code
  - style fix
* - Added support to compute src_iter and dst_iter instead of taking zero_memory_desc
  - Added unit test to compute one LSTM cell
* changed RNN op signature to accept number of states in basic unit of RNN (GRU/LSTM/vanilla RNN) cell
* added sanity checks for RNN op
* Fixed issue related to patching the graph while replacing the RNN sliced outputs
* Fixed issue to feed the input symbols in the order X0, X1, ...Xt to the RNN op
* Added unit test for multi layer RNN fusion
* Removed debug statements
* Added multilayered serialized graph ii) fixed compilation issue
* Addressed PR comments
* i) WIP MKLDNN layout for RNN Op ii) added test case for INTERPRETER v/s CPU Rnn results
* - Fixed bug w.r.t. src_layer feature size in rnn mkldnn emitter code
  - Refactored cpu_fusion rnn test case
* merge origin/master with branch pruthvi/lstm_fusion
* style fix
* Added test case for multiple RNN layers
* i) make rnn as mkldnn op if it meets the constraints ii) assert if rnn is not mkldnn op
* fix unit test failure
* - Added support to reliably identify the hidden state and input symbols from the nodes collected by Pattern matcher
  - Fixed failing unit tests
* style fix
* - removed "node type" dependency to replace the intermediate LSTM outputs
* Addressed PR comments
* Fix unit test
* - added MKLDNN emitter for LSTM op
  - graph pass to concat LSTM input recurrent state tensors
  - CPU layout assignment for LSTM Op
  - Fixed bug in rnn/lstm unit tests
  - made changes to use replace_output instead of replace_node for replacing matched graph nodes in LSTM/RNN fusion pass (cherry picked from commit d16fc709265cc0a73e60c6d5f6d2878e7b908aca)
* style fix
* Renamed passes and style fixes
-
- 21 May, 2018 1 commit
-
-
Jayaram Bobba authored
* Batch norm folding * Addressed PR feedback * Style fixes * Style fix
-
- 16 May, 2018 1 commit
-
-
Nick Korovaiko authored
* give frontends some flexibility over fusions they would like to run * address jbobba's feedback
-
- 08 May, 2018 1 commit
-
-
Nick Korovaiko authored
* MaxPoolWithIndices CPU Fusion * fix test to pass checks in cpu_fusion * pass test * clean up * add a new pass, add layouts * remove the opt from cpu_fusion * refactor cpu_layout logic for maxpool, clean up comments * add comment w.r.t. indices tensor * rename to cpu_workspace_insertion * add CPUWorkspaceInsertion pass for TF
-
- 04 May, 2018 1 commit
-
-
Jayaram Bobba authored
* Adding support for mkldnn convolution+bias+relu op to use in batch norm folding * Style fix * Style fix
-
- 23 Apr, 2018 1 commit
-
-
Nick Korovaiko authored
* any -> skip * run style check
-
- 18 Apr, 2018 1 commit
-
-
Nick Korovaiko authored
* CPU weight fusion initial version * add tests for weight_fusion * address @jbobba's feedback * before cleaning up convolution_weight_optimization.cpp * clean up, rename, fix perms, fix format
-
- 16 Apr, 2018 1 commit
-
-
Nick Korovaiko authored
* get_input_op -> get_argument * more replacing * more replacing2
-
- 13 Apr, 2018 1 commit
-
-
Robert Kimball authored
* remove deprecated * remove all legacy Backend API usage remove deprecated files * pull in changes from master * fix GPU calls * disable tests in convolution generator * update per PR comments. Enable performance counter feature. * update per PR comments * fix build error * fix conditionally compiled test :(
-
- 09 Apr, 2018 1 commit
-
-
Jaikrishnan Menon authored
* CPU: Fuse zero-padded convolution backprop filters * CPU: Add a testcase for zero-padded convolution backprop filters fusion
-
- 04 Apr, 2018 1 commit
-
-
Nick Korovaiko authored
* refactor Adjoints to support multi-output ops * passing tests * switch to generate_adjoints(deltas) and backprop_node * remove debugging code * fix error msg * fix typo adjoitns * fix comp errors in mnist_mlp
-
- 03 Apr, 2018 1 commit
-
-
Scott Cyphers authored
* Fix clang warnings on macos * Conditionalize warning on Apple clang version.
-
- 02 Apr, 2018 1 commit
-
-
Pruthvi authored
* WIP support bn training for global_stats (cherry picked from commit eb81a37328ea177b1d58c9eebdbb345e0fa25f0d) * - Style fix - Fix test case * Addressed PR comments - added support for bn training/inference with a same ctor - added more verbose comments in bn header * Fixed bn serializer and default value in bn ctor for bwd compatibility * proposed docs change * - Addressed PR comments - added support to compute bn inference/training using same mkldnn kernel with global stats * fix unit bn relu unit test
-
- 30 Mar, 2018 1 commit
-
-
Nick Korovaiko authored
* initial refactoring using PM * unit test pass * cosmetic changes * add another rnn test * address louis' feedback * lower-case labels
-
- 29 Mar, 2018 1 commit
-
-
Nick Korovaiko authored
-