1. 27 Feb, 2019 2 commits
  2. 26 Feb, 2019 11 commits
    • More quantized fusion patterns (#2480) · b8106133
      Jayaram Bobba authored
      * Add QuantizedConcat
      
      * Remove unused variables and add check for size of mins and maxes vector
      
      * Resolve conflicts
      
      * Merged with master and addressed some PR feedback
      
      * Maxpool and Avgpool fusions. Exclude Q from conv+relu fusion
      
      * Remove single-user check from fusions
      
      * Quantized concat fusion
      
      * workaround: do reshape sinking by default
      
      * style fix
      
      * check scales for QuantizedConcat
      
      * use compare_constants
      
      * remove stale comment
      
      * Handle all concat cases from arg size 2 to 6
      
      * addressed feedback
      b8106133
    • IntelGPU backend: Relu and Sigmoid datatypes support (#2500) · 3863180d
      Sergey Shalnov authored
      * IntelGPU backend: Relu and Sigmoid datatypes support
      
      * fix for OpenCL constants
      
      * add const to variables
      
      * PR2500. Style fix
      3863180d
    • [ONNX] GlobalLpPool operator (#2476) · d357cb92
      Adam Rogowiec authored
      * Utility functions for calculating Lp norm.
      
      * Use functor object as a reduction operation.
      
      * Use new api of make_ng_reduction_op.
      
      * Use utility norm functions for reduction operations.
      
      * Onnx GlobalLpPool operator.
      
      * Ensure correct shapes after lp_norm reduction.
      
      * Remove unused function overload.
      
      * Fix shapes and tensor types.
      
      * Unit tests.
      
      * Update comments.
      
      * Update supported ops status table.
      
      * Fix: take absolute value of input tensor elements.
      
      * UT: with odd value p-norm.
      
      * Fix: move taking abs value into respective lp-norm functions.
      
      * Fix clang -Wdocumentation-unknown-command error.
      
      * Update supported op status table with new Jira ticket for Erf op.
      
      * Update supported_ops status table.
      
      * Update interface of make_ng_reduction_op - accept std::function object.
      
      * Update to use new make_ng_reduction_op api.
      
      * Remove unused header.
      
      * Fix errors on CentOS.
      d357cb92
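The GlobalLpPool commit above mentions taking the absolute value of the input elements inside the lp-norm functions and adding a unit test with an odd p. A minimal Python sketch of why that matters, assuming a nested-list input shaped [N][C][H][W]; the names `lp_norm` and `global_lp_pool` are hypothetical and are not the nGraph utility functions:

```python
def lp_norm(values, p=2):
    """Lp norm of a flat sequence: (sum(|x|^p))^(1/p).

    Taking abs() first keeps the sum non-negative for odd p,
    so the fractional power below is always well defined.
    """
    if p < 1:
        raise ValueError("p must be >= 1")
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def global_lp_pool(tensor, p=2):
    """Reduce each channel's spatial values to one Lp norm.

    tensor: nested lists shaped [N][C][H][W] (an assumption for this sketch).
    Returns nested lists shaped [N][C][1][1].
    """
    return [
        [[[lp_norm([v for row in channel for v in row], p)]] for channel in sample]
        for sample in tensor
    ]


# One sample, one channel, 2x2 spatial values with mixed signs.
x = [[[[1.0, -2.0], [2.0, -4.0]]]]
pooled_p2 = global_lp_pool(x, p=2)  # sqrt(1 + 4 + 4 + 16)
pooled_p3 = global_lp_pool(x, p=3)  # cube root of (1 + 8 + 8 + 64)
```

Without the absolute value, the p = 3 sum for this input would be negative, and raising it to the power 1/3 would no longer give a real norm.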
    • Move CodeWriter out of codegen to ngraph root. (#2473) · c2974ac2
      Robert Kimball authored
      * Move codewriter out of codegen to ngraph root. It is useful for more than writing code.
      
      * remove codewriter.* from intel gpu backend and use ngraph version
      
      * fix merge issues
      c2974ac2
    • Convert PlaidML Tile op to generic ngraph passthrough op (#2361) · cf33669b
      Rob Earhart authored
      * Add a direct-to-Tile op
      
      * Disable dequantize_dynamic_offset
      
      * Add missing Py op defn
      
      * Generic passthrough op; serialization
      
      * Appease Linux builds
      
      * Add gpu handlers
      
      * Disable floor_int32 for now
      cf33669b
    • Sang Ik Lee
      fb4db5f6
    • fix a bug on finalize when uninitialized bool (#2498) · ee5567c4
      Sandeep authored
      * fix a bug on finalize when uninitialized bool
      
      * change this_init_comm -> m_init_comm
      
      move init to header
      ee5567c4
    • Upgrades MKLDNN to V0.18-rc (#2486) · 278632dd
      Pruthvi authored
      * - MKLDNN would choose the algorithm which will potentially give best performance based on
      - convolution dimensions and the number of logical processors available.
      
      - (For auto-dispatching to work as intended,
      - use the same thread affinity settings when creating the convolution as when executing the convolution.)
      - The relationship between convolution sizes and the best performing algorithm is empirically based on performance observations
      
      * bump mkldnn version to V0.18-rc
      
      * Revert "- MKLDNN would choose the algorithm which will potentially give best performance based on"
      
      This reverts commit 904beb8ad8d4e829fbae5f38a803ea80a72b3ffd.
      
      * Update mkl-dnn patch for soversion removal.
      278632dd
    • [ONNX] Enhance LSTM support. (#2408) · 6e6c8af4
      Adam Rogowiec authored
      6e6c8af4
    • Robert Kimball
    • Tomasz Dołbniak
      e8538ba0
  3. 25 Feb, 2019 5 commits
    • Michał Karzyński
      521e31fd
    • Update of MLSL git tag (#2474) · d3453447
      Aleksey Marchuk authored
      * Update of MLSL git tag
      
      * Use last MLSL commit
      
      * Use last valid MLSL commit
      d3453447
    • Update mkl-dnn build script. (#2487) · 65ac0e68
      Sang Ik Lee authored
      Update TBB build script for Windows.
      
      Fix typo.
      
      Fix incorrect omp lib name on Windows.
      
      Fix incorrect tbb.dll path on Windows.
      
      Make LIBRARY and ARCHIVE output directory consistent.
      
      Function missing on Windows.
      
      Update test::util::all_close() to fix compilation issue on Windows
      
      Export CPU_Executable on Windows.
      
      Change nbench path for unit-test on Windows.
      
      Change copy to copy_if_different.
      
      Install CPU backend on Windows.
      
      Disable tools test on Windows.
      
      Disable two failing unit tests on Windows CPU.
      
      Fix incorrect CPU backend install path on Windows.
      65ac0e68
    • [Standalone] Introduce CPURuntimeContextCG for standalone codegen generation. (#2421) · e9162eb5
      Diego Caballero authored
      * [CPUCodegen] Remove unnecessary forward declaration.
      
      * [CPUCodegen] Introduce CPURuntimeContextCG for standalone codegen generation.
      
      This patch introduces CPURuntimeContextCG. This class is aimed at
      removing the dependency between nGraph and the generated code in
      codegen mode. It will be used to hold the runtime context in
      codegen mode and it will be emitted in the generated code. For now,
      CPURuntimeContextCG only contains TBB's graph and global context.
      Follow-up patches will migrate more members in CPURuntimeContext to
      CPURuntimeContextCG for codegen mode.
      
      Testing results:
        - Before: NGRAPH_CODEGEN=1 test/unit-test
          [----------] Global test environment tear-down
          [==========] 2503 tests from 54 test cases ran. (290406 ms total)
          [  PASSED  ] 2490 tests.
      
        - After: NGRAPH_CODEGEN=1 test/unit-test
          [----------] Global test environment tear-down
          [==========] 2503 tests from 54 test cases ran. (412616 ms total)
          [  PASSED  ] 2490 tests.
      
      * [CPUCodegen] Refactor function parameters string
      
      * Fix bug in CPU_CallFrame destructor impacting DEX
      
      * [Standalone] Replace assert with NGRAPH_ASSERT
      e9162eb5
    • Pruthvi/bi rnn (#2232) · a444f7a9
      Pruthvi authored
      * - Added reorder support for rnn weights_layer/iter
      
      * i) fixed compilation issues ii) working but still observing precision error
      
      * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
      
      * i) added support for src reorder to TNC from NTC
      
      * reorder support for rnn output from NTC to TNC
      
      * - added support for rnn weight reorder ldgoi -> ldigo
      - code refactor for lstm/rnn kernel in mkldnn emitter
      
      * - refactor rnn mkldnn kernel, change variable names
      
      * fix RNN codegen kernel
      
      * disable layer rnn fusion pass, to test CI
      
      * method to validate recurrent rnn inputs
      
      * add correlated matches for Recurrent RNN PM
      
      * - simplify reorder logic for rnn_weights
      - fix graph pattern for fusing rnn cell across time steps
      
      * do weights reorders in rnn timesteps fusion
      
      * refactored LSTM graph pass
      
      * - Bug fix for finding the lstm inputs deterministically
      - Refactored LSTM graph pass to single pass
      - made changes to LSTM RNN time step fusion graph pass
      
      * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
      
      * fix compilation error
      
      * Fix GNMT rnn fusion
      
      * check if the node is in use before replacing in RNN graph passes
      
      *  i) fix style ii) fix topo sort issue in RNN graph pass
      
      * style fix
      
      * fix bug in simplify_concat pass
      
      * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2
      
      * cse for convert layout
      
      * addressed PR comments
      
      * - optimization pass to remove  Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
      - conditional fusing of LSTM cells only for the decoder
      
      * made changes to multi layer RNN fusion callback
      
      * fix asserts in RNN op
      
      * - added support to fuse layers when slc=dlc for RNN cells
      - bug fix on the sanity checks for RNN Op
      
      * - support RNN layer fusion till slc = dlc
      - bug fixes in multi layer rnn fusion call back
      
      * capture reshape in the RNN weights
      
      * Addressed PR comments
      
      * - added comments in multi layer PM call back
      - fuse only if slc == DLC across layers
      
      * restore deleted 3_lstm_cell_forward.json file
      
      * fix typo
      
      * fix failing unit tests
      
      * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input.
      
      * Address PR feedback: process in place slice after propagating in place input.
      
      * Set INTERMEDIATE role before propagating in place input.
      
      * Do not add temporaries to the variable name map before propagating in place input in codegen.
      
      * Fix a bug in codegen.
      
      * Fix a bug in codegen slice.
      
      * reenable disabled rnn unit test
      
      * fix compiler error
      
      * - bug fix in the slicing logic for the layer fused rnn cell
      - fix failing rnn unit test
      
      * - Addressed PR comments
      - removed redundant checks from the rnn graph pass
      - simplified rnn call back replace node logic
      
      * - added new multilayer rnn *.json file
      - fix test case
      
      * [PRIVATE BRANCH] Style fixes (#2080)
      
      * Style fixes
      
      * change order of lstm gates
      
      * WIP bi rnn
      
      * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)
      
      * Style fixes for single-layer RNN fusion
      
      * Style fixes to multi-layer RNN
      
      * added callback routine for bi-directional rnn
      
      * fix rnn op ctor, rnn mkldnn emitter to accommodate bi-directional rnn
      
      * style fix
      
      * added helper function for rnn's to query direction and cell_type
      
      * fix clang error
      
      * - unit test case for bi rnn fusion
      - style fix
      
      * - updated bi-rnn graph pass to handle reverse and reverse_seq ops in the predicate
      - added bi-rnn inter v/s cpu unit test case
      - add support in mkldnn_utils to create_md with tnc/ntc format
      
      * - added enum type to deduce rnn_type
      
      * Addressed PR comments
          - handle reshapes from {t, n, c} to {n, t, c} in the graph pass
      
      * fix style
      
      * fix clang error
      
      * fix style
      
      * i) move enum specific to rnn to separate header
      a444f7a9
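Several bullets in the bi-rnn commit above refer to reordering RNN inputs and outputs between NTC and TNC layouts. As a rough illustration only, assuming N is the batch size, T the number of time steps, and C the feature size, the reorder amounts to swapping the first two axes. The helper name `ntc_to_tnc` is hypothetical and this is not the MKLDNN reorder primitive used by the commit:

```python
def ntc_to_tnc(batch_major):
    """Swap the first two axes of nested lists: [N][T][C] -> [T][N][C].

    batch_major[b][s] is the feature vector of batch item b at time step s.
    The result is indexed as result[s][b], i.e. time-major.
    """
    n = len(batch_major)
    t = len(batch_major[0]) if n else 0
    # Copy each feature vector so the result does not alias the input.
    return [[list(batch_major[b][s]) for b in range(n)] for s in range(t)]


# N = 2 sequences, T = 3 time steps, C = 2 features per step.
ntc = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
]
tnc = ntc_to_tnc(ntc)
```

Here `tnc[s][b]` holds the same feature vector as `ntc[b][s]`, so applying the same swap twice returns the original layout.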
  4. 23 Feb, 2019 4 commits
    • Sergey Shalnov
      f8632ea0
    • Reorganize doc folders for core-related doc on fusion, graph rewrite, and compiler passes (#2466) · fd0ed37c
      Leona C authored
      * Cleaner API doc reference for compile call
      
      * Add a useful table for nGraph namespaces
      
      * Remove layout namespace
      
      * Show exploding kernel problem on illustration like IEEE preso
      
      * WIP branch for new documentation restructuring that is a huge pain
      
      * Fix the doc reorg mess
      
      * Fix underline
      
      * List of passes disclaimer note
      
      * Update disclaimers on README
      
      * More cleanup of doc reorg
      
      * Update core docs
      
      * Update overview on core
      
      * Add PR feedback
      
      * Get rid of all the gazillion of doc build errors from rearranging stuff
      
      * Add section on tutorials
      
      * Update branch
      
      * Cleanup intro
      
      * Add better detail to overview
      fd0ed37c
    • [ONNX] Handle trimmed optional outputs. (#2434) · 12b5f085
      Adam Rogowiec authored
      * Function for retrieving number of node outputs.
      
      * Handle optional trimmed outputs.
      
      * Fix compilation err on clang.
      
      * Fix error for number of outputs.
      
      - Iterate over the minimum of the number of outputs we return and the number
        of outputs of the respective node in the graph. Some outputs may be
        optional and trimmed, and some op implementations may not return
        all outputs (e.g. Dropout, where we do not return the additional
        optional output).
      
      * Update graph.cpp
      
      * Add dropout ONNX op.
      
      * Revert to iterate over node outputs in graph.
      
      * Use more appropriate word in comment.
      12b5f085
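The "Fix error for number of outputs" bullet in the commit above describes iterating over the minimum of the outputs an op implementation returns and the outputs the graph node declares. A minimal sketch of that rule, with hypothetical names (`bind_outputs`, `declared_names`, `produced_values`) that do not come from the nGraph ONNX importer; note that a later bullet in the same commit reverts to iterating over the node outputs in the graph:

```python
def bind_outputs(declared_names, produced_values):
    """Pair declared output names with produced values, position by position.

    Only the first min(len(declared_names), len(produced_values)) pairs are
    bound, so a node with trimmed optional outputs, or an implementation that
    returns fewer outputs than declared, does not cause an index error.
    """
    count = min(len(declared_names), len(produced_values))
    return {declared_names[i]: produced_values[i] for i in range(count)}


# The node declares two outputs, but the implementation returns only one
# (for example, an op that skips its additional optional output).
fewer_produced = bind_outputs(["y", "mask"], [10])

# The node's optional output was trimmed, but the implementation returns two.
fewer_declared = bind_outputs(["y"], [10, 20])
```

In both cases only the leading output `y` is bound, which is the behaviour the bullet describes.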
    • Amy Zhuang
  5. 22 Feb, 2019 6 commits
  6. 21 Feb, 2019 3 commits
  7. 20 Feb, 2019 2 commits
  8. 19 Feb, 2019 3 commits
  9. 18 Feb, 2019 2 commits
  10. 16 Feb, 2019 2 commits