- 13 Sep, 2018 6 commits
-
-
Robert Kimball authored
* add unsupported_op exception * unsupported_op test * add printout of unsupported op in model * fix GPU dispatcher check * fix test designation * catch exceptions on single file runs too * add unsupported_op exception where needed * remove unsupported_op class * add unassigned op exception * add unit test * catch unsupported op in nbench * add cpu test back * update all latest merges * mode change
-
Chris Sullivan authored
Clang chooses to use a __vectorcall optimization in which address pointers are vector-loaded in gpu::invoke_primitive. This results in a segfault when the stack is not aligned. Since the GPU transformer does not rely on the CPU for compute, we disable optimizations for the emitted function.
-
Amy Zhuang authored
* Modify DEX OneHot op: use generator. * Cast index to int.
-
Nick Korovaiko authored
* topological sort with cdeps * add control deps API, fix unit tests * rollback adjoints changes * fix test failures, add more tests * remove dead code * address Scott's feedback
-
Fenglei authored
-
gaurides authored
-
- 12 Sep, 2018 7 commits
-
-
Jayaram Bobba authored
-
gaurides authored
* Add in_place support for ReplaceSlice * Add emit_replace_slice_inplace kernel * changed file permissions to original * Formatted code using maint/apply-code-format.sh * Removed data type check and removed dead code * Removed setting mkldnn_op(true). ReplaceSlice is not an mkldnn op
-
Adam Rogowiec authored
* Add missing header. * Test for ReduceSum * Simple tests for reductions - L1/L2/LogSum/LogSumExp/Max/Mean/Min/Prod/SumSquare. * Add floating point literal suffix * Fix typo
-
L.S. Cook authored
* Update fusion doc and add ONNX build flag to buildlb doc * Fix PR comments * Final PR review comments addressed * Fix link on reformatted doc README * Delete index.rst.save
-
Robert Kimball authored
* add option to copy input/output data for each iteration * add support for stale buffers
-
Nishant Patel authored
* Add support for Quantized Pooling(Max + Avg) op via mkldnn for IA backend (codegen + DEX) * Add checks for min and max * Extracting out the common code from codegen and DEX * Use call_with_validate
-
tsocha authored
* [ONNX] Shape operator * Review fix pt. 1 * Style check
-
- 11 Sep, 2018 4 commits
-
-
Robert Kimball authored
* wip * interpreter use switch instead of if/else * more cleanup * make nop elimination run on all backends * revert * use single include file to define all ops so there is only one instance * move op.tbl to ngraph/op dir as it is useful. Added usage example. * add some comments where needed * revert some changes to reduce delta * add const * add more const * simplify using NodeWrapper * update per review comments * update per review comments * update per review comments * remove switch warning as it is not supported in older gcc
-
Nick Korovaiko authored
-
Michał Karzyński authored
-
gaurides authored
* Add conv add fusion * Updated file permissions and cpu_fusion order * Formatted code using maint/apply-code-format.sh * Fixed minor review comments * Use NODE_VALIDATION_ASSERT instead of throw ngraph_error; upgrade baseline and fix issues * Some more fixes
-
- 10 Sep, 2018 1 commit
-
-
shssf authored
* IntelGPU backend: BatchNorm operation optimization * PR1579. Function moved by request
-
- 08 Sep, 2018 1 commit
-
-
Adam Rogowiec authored
* ReduceSum and ReduceSumSquare ONNX operations. * Add new reduction ops: ReduceLogSum, ReduceLogSumExp, ReduceMax, ReduceMin, ReduceMean, ReduceProd. * Add ReduceL1 and ReduceL2 * Utility generic functions generating monotonic sequences of values. * Review comments: return AxisSet not std::vector * Use common functions for generating monotonic sequence. * Review comments.
-
- 07 Sep, 2018 5 commits
-
-
shssf authored
-
Nishant Patel authored
* Add support for Dequantize op via mkldnn for IA backend (codegen + DEX) * Remove unused variable * Static cast target range
-
Nick Korovaiko authored
* constant + pad * adding broadcast test back
-
shssf authored
-
tsocha authored
-
- 06 Sep, 2018 4 commits
-
-
Sang Ik Lee authored
* Implement TopK. * Update python wrappers for TopK, ArgMin and ArgMax. * Address some reviewer comments. * Add type property check tests for TopK. Set correct TopK behavior for K==0. * TopK: Add 1d and 3d unit tests. * Address more reviewer comments. * Apply code style.
-
L.S. Cook authored
* Update doc build v and fix doc on captioning * Clarify to build the library * update link on README
-
Chris Sullivan authored
Double curly-brace initialization (required by clang for non-templated functions) causes a compiler error on CentOS (#1561). Since clang does not enforce the warning for templated functions, we can avoid the CentOS compiler error by using only a single set of curly braces here.
-
Artur Wojcik authored
* onnx: add missing header files * onnxifi: implementation of onnxGetBackendIDs * onnxifi: add unit tests for onnxGetBackendIDs * onnxifi: change std::out_of_range to std::length_error * onnxifi: after review changes Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
-
- 05 Sep, 2018 2 commits
-
-
Nick Korovaiko authored
* simplify result copy elimination * gpu fix * remove include header * circumvent gpu issue * add a whitespace
-
Adam Rogowiec authored
-
- 04 Sep, 2018 10 commits
-
-
Robert Kimball authored
-
Fenglei authored
* add cuda reduce * clang format * fix bugs * fix bug * add 1d reduce * clang format * fix bugs * unroll loop * remove debug info * revert tests * unroll 1D reduce op * add comments * using cudnn for nd to scalar reduction * remove cuda 1d reduction since cudnn version is faster * remove 1D kernel * fix bugs * 1d multi block size * remove debug * change kernel name * add reduce to scalar optimization, add test * fix bugs and tune parameters * clang format * update comments * update comments * update comments * clang format * update comments * remove wrong comments, apply clang format * resolve Bob's comment * clang format * pass shared mem size from cuLaunchKernel, set unroll loop size through host code * remove unused code. clang format * change reduce to thread with shfl for each warp first * add seed * unroll size
-
tsocha authored
-
Scott Cyphers authored
* Merge descriptor::TensorView into descriptor::Tensor * fix GPU build
-
Avijit authored
* Added cmake flags to specify D_GLIBCXX_USE_CXX11_ABI and disable building of doc * Renamed the NGRAPH_DOC_BUILD_ENABLE flag based on PR feedback
-
shssf authored
* IntelGPU backend: Sum operation optimization * PR1545. Comments addressed. Test added. Helper function refactored.
-
Michał Karzyński authored
-
Michał Karzyński authored
-
tsocha authored
-
Artur Wojcik authored
-