- 28 Jun, 2018 3 commits
-
-
Fenglei authored
* enable multi datatype support for Cudnn. refactor binary ops using cudnn
* fix bugs
* add tests to skip list that CUDNN does not support
* no int support on cudnn for backward pooling
* no GPU.dot_4d_5d_multi_axis_big_fp64_VERY_SLOW test anymore
* clang format
* throw if datatype is int8 or int32 for backward pooling
* comments
* fix list in unit_test.manifest
* add type support for alpha, beta
* fix bugs
* datatype support for alpha, beta
* missing ()
* clang format
* batchnorm backward bug fix
* remove debug info
* change member function name to snake case. remove comments
* use nullptr instead of NULL
* code style, use cuDNN everywhere in comments
* add cudnn host parameters memory manager.
* change name to allocate_by_datatype
* compiled
* debug
* fix bug: using list instead of vector, vector address will change each time it resizes
* add CUDNN_DATA_UINT8 and CUDNN_DATA_UINT8x4
-
Adam Straw authored
* constant broadcast folding
* code review feedback
-
Chris Sullivan authored
* Move maxpool and avgpool into CudaKernelBuilder and add cache parameters to kernel name for broadcast which are required for correct lookup.
* Styling.
* Add space before avg_pool.
-
- 27 Jun, 2018 5 commits
-
-
Fenglei authored
* add gpu_timer to external function
* compiled version
* working version
* using block_begin and block_end
* add the missing ' ;'
-
Nick Korovaiko authored
* get_get_output_elements
* fix comp error
* address scott's feedback
-
Nick Korovaiko authored
* group conv fix
* group conv fix
* fix typo
-
Pruthvi authored
* 1. Added mkldnn support for Softmax
  2. layout assignment for mkldnn softmax
* added assert to check softmax axis for mkldnn
-
Artur Wojcik authored
* onnx: add importer cmakes
* onnx: use file(DOWNLOAD ...) command to download onnx.proto
* onnx: add Protobuf minimal required version
-
- 26 Jun, 2018 10 commits
-
-
Robert Kimball authored
-
Robert Kimball authored
-
Robert Kimball authored
-
Robert Kimball authored
* cmake runs for interpreter
* more updates towards building on windows
-
Jayaram Bobba authored
* inplace compute
* fix warnings
* Initial support for convolution sum fusion
* Added in-place support for conv sum fusion and test cases
* reverting spurious changes
* Bug fix to account for inplace input in conv sum fusion
* fix compilation error
* Addressed PR feedback
-
Nick Korovaiko authored
-
shssf authored
* First IntelGPU backend based on clDNN with empty functions
* Backend/API: Conflicts resolved and comments addressed
-
L.S. Cook authored
* editing how to execute computation file for clarity and linenos
* Add placeholder for runtime docs
* Update section on backends, interpreter, and FPGA options
* add updated master to fix python_ci
* Weird autosummary issue reverted
* Clarify new section
* fix up docs
* Update pattern matcher doc based on Nik's presentation slides WIP
* Update doc structure and examples
* remove old folder
* Fix broken Tensorview refs
* . helping people document code more efficiently
* PR review edits
* Finish PR review comment fixes so far
* split patternmatcher PR
* small fixes to PM docs
* remove mark tags from source code
* Final PR cleanup edits
-
Chris Sullivan authored
Moved maxpool padding to GPUAllocator, changed pad_required bool to include asymmetric padding check, and removed an error in gpu_emitter where allocation was happening twice for temporary memory (merge failure). (#1152)
-
Igor Kaplounenko authored
* updated to work with llvm 8.1 that tensorflow is built with
* sane extensions on the mac
* not doing rpath on apple
* apply style
-
- 25 Jun, 2018 4 commits
-
-
Nick Korovaiko authored
-
Nick Korovaiko authored
* inplace compute
* fix warnings
* address bob's feedback
* bob's feedback 2
* bobs feedback 3
* address bob's feedback 4
-
Robert Kimball authored
* remove reference to ngraph core code from codegen. add stand-alone implementations of needed functions
* fixed potential pointer leak
* clean up file_util
* more file util cleanup, removing unused functions
* interpreter works on mac
* CPU and INTERPRETER build and pass unit tests on macos
* move get_directory to file_util
* cleanup
-
Nick Korovaiko authored
* switch to using has_class for op::Skip
* apply format
-
- 23 Jun, 2018 1 commit
-
-
Nick Korovaiko authored
-
- 22 Jun, 2018 2 commits
-
-
Nick Korovaiko authored
-
Matthew Brookhart authored
-
- 21 Jun, 2018 2 commits
-
-
Adam Straw authored
* adding constant propagation pass
* adding test/constant_propagation.cpp
* template make_constant_reshape function
* code review feedback
* add missing files
-
Robert Kimball authored
-
- 20 Jun, 2018 3 commits
-
-
Nick Korovaiko authored
* serialize logic for reverse_sequence
* Added serializer support for Softmax
-
Adam Procter authored
* Fix bug with concat for 0-size tensors
* Simplify test for zero-length axes, per PR comments
-
Scott Cyphers authored
* [doc] Fix code snippet in derive-for-training
* Fix another code snippet in derive-for-training
-
- 19 Jun, 2018 4 commits
-
-
Nick Korovaiko authored
* add assert to make sure we don't replace unreachable nodes
* fix unittest failures
* sparsity fix
-
Robert Kimball authored
* fix mkldnn rpath
* fix compile warning
* close backends when exiting
* set backend output directory of backends to the ngraph output directory
* Aprocter/patch patch (#1119)
* Move more rpath stuff inside if(NOT APPLE)
* fix repatch problem with mkldnn library
* add updated patch command for older versions of cmake
-
Nick Korovaiko authored
* loop kernel + tests
* remove commented out code
* remove commented code; add comments
* copy_with_new_args + test
* add comment
* fix comp errors
-
Jayaram Bobba authored
* Move to depth-first serialization of graph for better cache behavior
* Added comment
* Force 64 byte stack alignment to avoid crashes from unaligned AVX loads/stores
* Revert "Force 64 byte stack alignment to avoid crashes from unaligned AVX loads/stores". This reverts commit 84346420fbd0fbd5d05a4a1e8f5fae12bdc7348b.
* revert to breadth-first serialization
-
- 18 Jun, 2018 6 commits
-
-
Jayaram Bobba authored
DEX Part 2
-
Jayaram Bobba authored
-
Nick Korovaiko authored
-
Jaikrishnan Menon authored
-
Fenglei authored
* enable more gpu ops test
-
Jayaram Bobba authored
-