- 12 Mar, 2018 3 commits
-
-
fenglei.tian authored
-
fenglei.tian authored
-
Christian Convey authored
-
- 11 Mar, 2018 4 commits
-
-
Robert Kimball authored
* fix detailed timing flag
* more detailed info
-
Robert Kimball authored
-
Robert Kimball authored
Use op::Constant's data rather than emitting the data in the generated cpp code. This makes compile times for trained models something like 100x faster. (#624)
-
Jayaram Bobba authored
-
- 10 Mar, 2018 5 commits
-
-
Jayaram Bobba authored
Add mkldnn layouts to Maxpool and Maxpoolbackprop
-
Jayaram Bobba authored
-
fenglei.tian authored
-
fenglei.tian authored
-
fenglei.tian authored
-
- 09 Mar, 2018 12 commits
-
-
Chris Sullivan authored
* Refactored unary elementwise ops into a single interface that is adaptable to elementwise ops with an arbitrary number of inputs.
* Renamed EmitUnaryElementwise -> EmitElementwise. Implemented first binary elementwise op (Power).
* Refactored some of the boilerplate code for emitting CUDA kernels to nvrtc out of the emit functions and into the CudaFunctionPool static singleton. CodeWriter now saves CUDA kernels to ./gpu_codegen.
* Added ops Divide, Subtract & Sign to the GPU transformer. Subtract and Sign both use custom device helper functions, which have math kernels defined for the op in gpu_cuda_kernel_ops.hpp and which are built by a new get_device_helper function.
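The idea of one emitter that covers elementwise ops with any number of inputs can be illustrated with a minimal sketch. This is not the code from the commit: `emit_elementwise_kernel` and its parameters are hypothetical names chosen for illustration, and the sketch only builds the CUDA source string that would later be handed to a runtime compiler.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not the repository's API): builds CUDA source for an
// elementwise kernel that applies `op` to `num_inputs` input arrays.
std::string emit_elementwise_kernel(const std::string& name,
                                    const std::string& type,
                                    const std::string& op,
                                    std::size_t num_inputs)
{
    std::ostringstream src;
    src << "extern \"C\" __global__ void " << name << "(";
    // One const input pointer per operand: in0, in1, ...
    for (std::size_t i = 0; i < num_inputs; ++i)
    {
        src << "const " << type << "* in" << i << ", ";
    }
    src << type << "* out, unsigned int n)\n{\n";
    src << "    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;\n";
    src << "    if (tid < n)\n    {\n";
    // Apply the op to the tid-th element of every input.
    src << "        out[tid] = " << op << "(";
    for (std::size_t i = 0; i < num_inputs; ++i)
    {
        if (i > 0)
        {
            src << ", ";
        }
        src << "in" << i << "[tid]";
    }
    src << ");\n    }\n}\n";
    return src.str();
}
```

Under these assumptions, a binary op such as Power could be generated with `emit_elementwise_kernel("cuda_power", "float", "powf", 2)`, and a unary op with the same function and `num_inputs` set to 1.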
-
Louis Feng authored
NGMX-296 Convolution + Bias with MKLDNN
-
Louis Feng authored
-
Louis Feng authored
-
Louis Feng authored
Also fixed conv+bias cpu layout bugs.
-
Louis Feng authored
* fixed memory leak.
* clang format.
-
Fenglei authored
-
Fenglei authored
* update gpu_emitter to use template
* add template
-
fenglei.tian authored
-
Nick Korovaiko authored
-
Nick Korovaiko authored
* removing extra copies due to op::Result
* remove comment
* fix comment
* switch to a flag version
* add copyright header and #pragma once
* add impl file, rename result_elimination.hpp to result_copy_elimination.hpp to match the opt name
* add cpp suffix to result_copy_elimination
* use in-class member init
-
Pruthvi authored
* Added sigmoid fusion pass; added mkldnn emitter code for sigmoid
* corrected sigmoid expected values; add layout assignment for sigmoid op
* added asserts in cpu fusion for sigmoid; style fix
* remove debug prints
* NGMX-371 #comment addressed PR comments: added sigmoid unit test case with 3D input; support in cpu_emmiter for sigmoid to handle all input shapes
* NGMX-371 #comment use shape_size() to calculate the 1d input size
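The last item flattens an input of any rank to a 1d size by taking the element count of its shape. A minimal sketch of that computation follows, assuming a shape is just a list of dimension sizes; the `Shape` alias and this `shape_size` body are stand-ins written for illustration, not the library's definitions.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the library's shape type: a list of dimension sizes.
using Shape = std::vector<std::size_t>;

// Number of elements in a tensor of the given shape, i.e. the product of all
// dimensions. An empty shape (a scalar) yields 1.
std::size_t shape_size(const Shape& shape)
{
    return std::accumulate(
        shape.begin(), shape.end(), std::size_t{1}, std::multiplies<std::size_t>());
}
```

With this definition, a 3D input of shape {2, 3, 4} is treated as a 1d array of 24 elements, so an elementwise op like sigmoid does not need rank-specific handling.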
-
- 08 Mar, 2018 16 commits
-
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
fenglei.tian authored
-
fenglei.tian authored
-
Jayaram Bobba authored
Jbobba/batchnorm layouts
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
Louis Feng authored
-
Jayaram Bobba authored
-
Nick Korovaiko authored
* remove broadcast from matmulbias
* fix comments
* working gemm-based broadcast
* fix clang warning
-
fenglei.tian authored
-
Fenglei Tian authored
Merge branch 'tfl/gpu_emitter_template' of github.com:NervanaSystems/private-ngraph-cpp into tfl/gpu_emitter_template
-
Fenglei Tian authored
-
fenglei.tian authored
-
Fenglei Tian authored
-