- 27 Feb, 2019 11 commits
-
-
Sang Ik Lee authored
-
Amy Zhuang authored
* Refactor to create MKLDNN primitives on the first iteration: add, avg_pool, batch_norm, bounded_relu, concat, convert_layout, leaky_relu, lrn, max_pool, quantized_avg_pool, quantized_max_pool, relu, sigmoid, slice, softmax. * Refactor to create MKLDNN primitives on the first iteration: pooling backward, convolution. * Refactor to create MKLDNN primitives on the first iteration: convolution backward, rnn, lstm, quantization, dequantization. * Delete one duplicate declaration. * Create and pass mkldnn descriptors/primitive-descriptors for ops. * Create and pass mkldnn descriptors for convolution backward ops. * Remove one unused variable. * Remove unused variables. * Remove unused variables. * Address PR feedback. * Fix a bug. * Add one parameter to build_quantize_reorder. * Address PR feedback. * Fix bi-rnn issue.
-
Robert Kimball authored
* rename and document the node name methods * address more references to renamed methods * fix compile error * fix build error
-
Amy Zhuang authored
* Reuse memory for CPU backend. * Use NGRAPH_REUSE_MEMORY to enable memory reuse. * Add a test. * Move make_function to test_tools.cpp. * Add more comments. * Address PR Feedback: add a method to CPU backend. * Add a member to CPUOpAnnotations to remove redundant code. * Overload compile function for CPU backend. * Move make_function out of test_tools. * Address PR Feedback. * Use modified liveness analysis in CPUMemoryAssignment pass. * Use lambda expression. * Fix style error. * Check if any user of the tensor has destructive oi when building tensor alias map. * Fix a bug. * Check if tensor has multiple users. * Allow tensor alias for destructive oi node. * Update multiple_users_tensor set along the chain of in place ops. * No tensor alias if input is parameter or constant. * Use buffer sets in cpu memory assignment, tensors sharing the same memory buffer are put into the same set. * Add more checks and do not combine sets when allowing destructive oi. * Style fix. * Do not allow destructive oi if the input tensor uses function input memory. Update set label. * Add unit tests. * Style fix. * Get the correct size for memcpy when the input is padded. * Style fix. * Address PR feedback. * Address PR feedback. * Move make_function in cpu_test after #if 0 and before the disabled test. * Add utility functions. Use iterator. Rename variables. * Add pass attributes and move cpu memory assignment to common passes (#2504)
-
Scott Cyphers authored
* Add info about lib versions in an easy to find place * Review comments
-
Sergey Shalnov authored
-
Robert Kimball authored
* function call working * fix compile error * fix compile error * add attribute support to plot_graph * fix build error * fix merge error * better colors for FunctionCall op
-
Leona C authored
* Cleaner API doc reference for compile call * Add a useful table for nGraph namespaces * Remove layout namespace * Show exploding kernel problem on illustration like IEEE preso * WIP branch for new documentation restructuring that is a huge pain * Fix the doc reorg mess * Fix underline * List of passes disclaimer note * Update disclaimers on README * More cleanup of doc reorg * Update core docs * Update overview on core * Add PR feedback * Get rid of all the gazillion of doc build errors from rearranging stuff * Add section on tutorials * Update branch * Cleanup intro * Add better detail to overview * Revise buildlb instructions and add better title for contributing to doc * Note about unit tests * Editing * Update core overview namespace table and fix more broken links due to ToC changes * Update normalized boolean build defaults * Update for PR 2507 * Incorporate new PR feedback review
-
Ayan Moitra authored
* Int unit tests that fail with bfloat * move tests out of single file * style * Incorporate Bob's comments * edits * Incorporate comments * style * edits * Add failing test to intel gpu manifest * comments incorporated
-
Sergey Shalnov authored
-
tsocha authored
* Remove get_numpy_broadcast_shape helper function * Remove numpy_style_broadcast_for_binary_operation helper function * Remove TODO * Review fix pt. 1 * Remove parameters as shape containers * Fix LSTM * Review fix pt. 1 * Style apply * Use old comment
-
- 26 Feb, 2019 11 commits
-
-
Jayaram Bobba authored
* Add QuantizedConcat * Remove unused variables and add check for size of mins and maxes vector * Resolve conflicts * Merged with master and addressed some PR feedback * Maxpool and Avgpool fusions. Exclude Q from conv+relu fusion * Remove single-user check from fusions * Quantized concat fusion * workaround: do reshape sinking by default * style fix * check scales for QuantizedConcat * use compare_constants * remove stale comment * Handle all concat cases from arg size 2 to 6 * addressed feedback
-
Sergey Shalnov authored
* IntelGPU backend: Relu and Sigmoid datatypes support * fix for OpenCL constants * add const to variables * PR2500. Style fix
-
Adam Rogowiec authored
* Utility functions for calculating Lp norm. * Use functor object as a reduction operation. * Use new api of make_ng_reduction_op. * Use utility norm functions for reduction operations. * Onnx GlobalLpPool operator. * Ensure correct shapes after lp_norm reduction. * Remove unused function overload. * Fix shapes and tensor types. * Unit tests. * Update comments. * Update supported ops status table. * Fix: take absolute value of input tensor elements. * UT: with odd value p-norm. * Fix: move taking abs value into respective lp-norm functions. * Fix clang -Wdocumentation-unknown-command error. * Update supported op status table with new Jira ticket for Erf op. * Update supported_ops status table. * Update interface of make_ng_reduction_op - accept std::function object. * Update to use new make_ng_reduction_op api. * Remove unused header. * Fix errors on CentOS.
-
Robert Kimball authored
* Move codewriter out of codegen to ngraph root. It is useful for more than writing code. * remove codewriter.* from intel gpu backend and use ngraph version * fix merge issues
-
Rob Earhart authored
* Add a direct-to-Tile op * Disable dequantize_dynamic_offset * Add missing Py op defn * Generic passthrough op; serialization * Appease Linux builds * Add gpu handlers * Disable floor_int32 for now
-
Sang Ik Lee authored
-
Sandeep authored
* fix a bug on finalize when uninitialized bool * change this_init_comm -> m_init_comm move init to header
-
Pruthvi authored
* MKLDNN chooses the algorithm that will potentially give the best performance, based on the convolution dimensions and the number of logical processors available. (For auto-dispatching to work as intended, use the same thread affinity settings when creating the convolution as when executing the convolution.) The relationship between convolution sizes and the best-performing algorithm is empirical, based on performance observations. * bump mkldnn version to V0.18-rc * Revert "- MKLDNN would choose the algorithm which will potentially give best performance based on" This reverts commit 904beb8ad8d4e829fbae5f38a803ea80a72b3ffd. * Update mkl-dnn patch for soversion removal.
-
Adam Rogowiec authored
-
Robert Kimball authored
-
Tomasz Dołbniak authored
-
- 25 Feb, 2019 5 commits
-
-
Michał Karzyński authored
-
Aleksey Marchuk authored
* Update of MLSL git tag * Use last MLSL commit * Use last valid MLSL commit
-
Sang Ik Lee authored
Update TBB build script for Windows. Fix typo. Fix incorrect omp lib name on Windows. Fix incorrect tbb.dll path on Windows. Make LIBRARY and ARCHIVE output directory consistent. Function missing on Windows. Update test::util::all_close() to fix compilation issue on Windows. Export CPU_Executable on Windows. Change nbench path for unit-test on Windows. Change copy to copy_if_different. Install CPU backend on Windows. Disable tools test on Windows. Disable two failing unit tests on Windows CPU. Fix incorrect CPU backend install path on Windows.
-
Diego Caballero authored
* [CPUCodegen] Remove unnecessary forward declaration. * [CPUCodegen] Introduce CPURuntimeContextCG for standalone codegen generation. This patch introduces CPURuntimeContextCG. This class is aimed at removing the dependency between nGraph and the generated code in codegen mode. It will be used to hold the runtime context in codegen mode and it will be emitted in the generated code. For now, CPURuntimeContextCG only contains TBB's graph and global context. Follow-up patches will migrate more members in CPURuntimeContext to CPURuntimeContextCG for codegen mode. Testing results: - Before: NGRAPH_CODEGEN=1 test/unit-test [----------] Global test environment tear-down [==========] 2503 tests from 54 test cases ran. (290406 ms total) [ PASSED ] 2490 tests. - After: NGRAPH_CODEGEN=1 test/unit-test [----------] Global test environment tear-down [==========] 2503 tests from 54 test cases ran. (412616 ms total) [ PASSED ] 2490 tests. * [CPUCodegen] Refactor function parameters string * Fix bug in CPU_CallFrame destructor impacting DEX * [Standalone] Replace assert with NGRAPH_ASSERT
-
Pruthvi authored
* - Added reorder support for rnn weights_layer/iter * i) fixed compilation issues ii) working but still observing precision error * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter * i) added support for src reorder to TNC from NTC * reorder support for rnn output from NTC to TNC * - added support for rnn weight reorder ldgoi -> ldigo - code refactor for lstm/rnn kernel in mkldnn emitter * - refactor rnn mkldnn kernel, change variable names * fix RNN codegen kernel * disable layer rnn fusion pass, to test CI * method to validate recurrent rnn inputs * add correlated matches for Recurrent RNN PM * - simplify reorder logic for rnn_weights - fix graph pattern for fusing rnn cell across time steps * do weights reorders in rnn timesteps fusion * refactored LSTM graph pass * - Bug fix for finding the lstm inputs deterministically - Refactored LSTM graph pass to single pass - made changes to LSTM RNN time step fusion graph pass * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass * fix compilation error * Fix GNMT rnn fusion * check if the node is in use before replacing in RNN graph passes * i) fix style ii) fix topo sort issue in RNN graph pass * style fix * fix bug in simplify_concat pass * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2 * cse for convert layout * addressed PR comments * - optimization pass to remove Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2 - conditional fusing of LSTM cells only for the decoder * made changes to multi layer RNN fusion callback * fix asserts in RNN op * - added support to fuse layers when slc=dlc for RNN cells - bug fix on the sanity checks for RNN Op * - support RNN layer fusion till slc = dlc - bug fixes in multi layer rnn fusion call back * capture reshape in the RNN weights * Addressed PR comments * - added comments in multi layer PM call back - fuse only if slc == DLC across layers * restore deleted 3_lstm_cell_forward.json file * fix typo * fix failing unit tests * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input. * Address PR feedback: process in place slice after propagating in place input. * Set INTERMEDIATE role before propagating in place input. * Do not add temporaries to the variable name map before propagating in place input in codegen. * Fix a bug in codegen. * Fix a bug in codegen slice. * reenable disabled rnn unit test * fix compiler error * - bug fix in the slicing logic for the layer fused rnn cell - fix failing rnn unit test * - Addressed PR comments - removed redundant checks from the rnn graph pass - simplified rnn call back replace node logic * - added new multilayer rnn *.json file - fix test case * [PRIVATE BRANCH] Style fixes (#2080) * Style fixes * change order of lstm gates * WIP bi rnn * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113) * Style fixes for single-layer RNN fusion * Style fixes to multi-layer RNN * added callback routine for bi-directional rnn * fix rnn op ctor, rnn mkldnn emitter to accommodate bi directional rnn * style fix * added helper function for rnn's to query direction and cell_type * fix clang error * - unit test case for bi rnn fusion - style fix * - updated bi-rnn graph pass to handle reverse and reverse_seq ops in the predicate - added bi-rnn inter v/s cpu unit test case - add support in mkldnn_utils to create_md with tnc/ntc format * - added enum type to deduce rnn_type * Addressed PR comments - handle reshapes from {t, n, c} to {n, t, c} in the graph pass * fix style * fix clang error * fix style * i) move enum specific to rnn to separate header
-
- 23 Feb, 2019 4 commits
-
-
Sergey Shalnov authored
-
Leona C authored
* Cleaner API doc reference for compile call * Add a useful table for nGraph namespaces * Remove layout namespace * Show exploding kernel problem on illustration like IEEE preso * WIP branch for new documentation restructuring that is a huge pain * Fix the doc reorg mess * Fix underline * List of passes disclaimer note * Update disclaimers on README * More cleanup of doc reorg * Update core docs * Update overview on core * Add PR feedback * Get rid of all the gazillion of doc build errors from rearranging stuff * Add section on tutorials * Update branch * Cleanup intro * Add better detail to overview
-
Adam Rogowiec authored
* Function for retrieving number of node outputs. * Handle optional trimmed outputs. * Fix compilation error on clang. * Fix error for number of outputs. - Iterate over the minimum of the number of outputs we return and the number of outputs of the respective node in the graph. Some outputs may be optional and trimmed, and some op implementations may not return all outputs (e.g. Dropout, where we do not return the additional optional output). * Update graph.cpp * Add dropout ONNX op. * Revert to iterate over node outputs in graph. * Use more appropriate word in comment.
-
Amy Zhuang authored
-
- 22 Feb, 2019 6 commits
-
-
Sang Ik Lee authored
-
Nishant Patel authored
* Add QuantizedConcat * Remove unused variables and add check for size of mins and maxes vector * Resolve conflicts * Merged with master and addressed some PR feedback * Avoid float comparison * make min/max vector, add dequant/quanti * fix dequant/quant scales * fix CI build issue
-
Robert Kimball authored
* use calls for new backend API in unit tests * fix compile error * fix compile error
-
Sergey Shalnov authored
* IntelGPU backend: Convolution support for double and minor code cleanup * PR2479. custom kernel selection fix
-
tsocha authored
* [ONNX] Overriding custom ops * Add UT * Style Check * Review & style fix
-
aslepko authored
Changing clDNN to latest commit.
-
- 21 Feb, 2019 3 commits
-
-
Sergey Shalnov authored
* IntelGPU backend: Quantize operations * Update intelgpu_op_custom_kernels.cpp
-
tsocha authored
-
Tomasz Dołbniak authored
-