1. 24 Apr, 2018 3 commits
  2. 23 Apr, 2018 6 commits
  3. 21 Apr, 2018 3 commits
  4. 20 Apr, 2018 4 commits
  5. 18 Apr, 2018 5 commits
    • remove obsolete classes (#867) · 37fca35c
      Robert Kimball authored
      * remove obsolete classes
    • Remove usage of CMAKE_MAKE_PROGRAM as it slows down parallel build (#871) · 392eeb3f
      Sang Ik Lee authored
      * Remove usage of CMAKE_MAKE_PROGRAM as it slows down parallel build
      
      * Make make properly propagate to child and add back targeted build.
      
      * Revert "Make make properly propagate to child and add back targeted build."
      
      This reverts commit b4b1d8db0f0d42850e53d4e0f773261c292ccaf6.
    • GPU Padding - add support for custom pad value and interior padding (#860) · 0be581c0
      Chris Sullivan authored
      * cuda_emitter::build_pad now utilizes pad_value.
      
      * Added TypeInfo class for dispatching c-type information from the underlying ngraph element::Type.
        Adjusted test to use all_close when comparing floating point values (max_pool_2d_1channel_1image_overpadded).
      
      * Refactored max_pool_1d into cuda_emitter so that numeric_limits<c_type>::lowest() could be used for initial max value.
      Test max_pool_2d_1channel_1image_padded_negative_values now enabled and passes.
      
      * Removed old function and switched to size_t to match ngraph.
      
      * Added virtual dtor.
      
      * Adding support for interior padding. All op::Pad functionality is now included.
      
      * Added more info to the runtime_error used for checking tensor dimensions. Removed commented-out code.
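
      A quick way to read the interior-padding support added above: op::Pad's output shape grows by the edge padding plus the pad elements inserted between neighbouring input elements. A minimal sketch of that shape arithmetic in plain C++ (illustrative only, not the nGraph implementation; the function name is made up):

        #include <cstddef>
        #include <vector>

        // Output shape of a pad with per-axis edge padding (below/above) and
        // `interior` pad elements inserted between adjacent input elements.
        std::vector<size_t> padded_shape(const std::vector<size_t>& in,
                                         const std::vector<size_t>& below,
                                         const std::vector<size_t>& above,
                                         const std::vector<size_t>& interior)
        {
            std::vector<size_t> out(in.size());
            for (size_t d = 0; d < in.size(); ++d)
            {
                size_t inner = (in[d] == 0) ? 0 : in[d] + (in[d] - 1) * interior[d];
                out[d] = below[d] + inner + above[d];
            }
            return out;
        }
        // Example: in = {3, 3}, below = above = {1, 1}, interior = {2, 2}
        // gives {9, 9}: each axis becomes 3 + 2*2 = 7 interior elements plus 1+1 edge pad.
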
    • Weight Fusion (#853) · 8cb48d37
      Nick Korovaiko authored
      * CPU weight fusion initial version
      
      * add tests for weight_fusion
      
      * address @jbobba's feedback
      
      * before cleaning up convolution_weight_optimization.cpp
      
      * clean up, rename, fix perms, fix format
    • Louis Feng authored
  6. 17 Apr, 2018 3 commits
  7. 16 Apr, 2018 8 commits
  8. 13 Apr, 2018 7 commits
    • Remove legacy Backend API (#848) · ec501913
      Robert Kimball authored
      * remove deprecated
      
      * remove all legacy Backend API usage
      
      remove deprecated files
      
      * pull in changes from master
      
      * fix GPU calls
      
      * disable tests in convolution generator
      
      * update per PR comments. Enable performance counter feature.
      
      * update per PR comments
      
      * fix build error
      
      * fix conditionally compiled test :(
    • BatchNorm documentation (#856) · 1e091f6f
      Scott Cyphers authored
      * BatchNorm documentation
      
      * Fix typo, install URL
      
      * Switch to desired BatchNorm
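
      For orientation, the operation those docs cover is the standard per-channel normalization y = gamma * (x - mean) / sqrt(variance + eps) + beta. A hedged, minimal inference-time sketch in plain C++ (illustrative only, not the nGraph kernel; names are made up):

        #include <cmath>
        #include <cstddef>
        #include <vector>

        // Inference-time batch norm for data laid out as [N, C, spatial...],
        // flattened row-major, with one gamma/beta/mean/variance per channel.
        void batch_norm_inference(std::vector<float>& data,
                                  const std::vector<float>& gamma,
                                  const std::vector<float>& beta,
                                  const std::vector<float>& mean,
                                  const std::vector<float>& variance,
                                  size_t batch, size_t channels, size_t spatial,
                                  float eps = 1e-5f)
        {
            for (size_t n = 0; n < batch; ++n)
                for (size_t c = 0; c < channels; ++c)
                {
                    float scale = gamma[c] / std::sqrt(variance[c] + eps);
                    for (size_t s = 0; s < spatial; ++s)
                    {
                        float& x = data[(n * channels + c) * spatial + s];
                        x = scale * (x - mean[c]) + beta[c];
                    }
                }
        }
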
    • Nick Korovaiko authored
    • added the reference OS marker to the image name defined in the contrib/docker/Makefile (#841) · 638f36ee
      DawnStone authored
      fixed variable settings in contrib/docker/make-dimage.sh script
    • [Py] Add python wrapper for nGraph Reduce operation. (#827) · c80a1076
      arogowie-intel authored
      * Add python wrapper for nGraph Reduce operation.
      
      - Add UT.
      
      * Refactoring.
      
      - Add UT case with default reduction on all axes.
      
      * Extend `reduce` operation signature to also accept a `Function` object.
      
      - Add UT case.
      
      * Fix formatting errors.
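
      The extended `reduce` described above folds a tensor with a caller-supplied binary function over a set of reduction axes, defaulting to all axes when none are given. A rough sketch of those semantics for the 2-D case in plain C++ (illustrative only; this is not the nGraph or Python-wrapper API):

        #include <cstddef>
        #include <functional>
        #include <set>
        #include <vector>

        // Reduce a rows x cols matrix with binary function `f` and initial value `init`.
        // axes == {0, 1} (the default) collapses everything to a single scalar;
        // axes == {0} collapses rows, axes == {1} collapses columns.
        std::vector<float> reduce_2d(const std::vector<float>& data,
                                     size_t rows, size_t cols,
                                     std::function<float(float, float)> f,
                                     float init,
                                     std::set<size_t> axes = {0, 1})
        {
            size_t out_rows = axes.count(0) ? 1 : rows;
            size_t out_cols = axes.count(1) ? 1 : cols;
            std::vector<float> out(out_rows * out_cols, init);
            for (size_t r = 0; r < rows; ++r)
                for (size_t c = 0; c < cols; ++c)
                {
                    size_t orow = axes.count(0) ? 0 : r;
                    size_t ocol = axes.count(1) ? 0 : c;
                    float& acc = out[orow * out_cols + ocol];
                    acc = f(acc, data[r * cols + c]);
                }
            return out;
        }
        // reduce_2d(x, 2, 3, [](float a, float b) { return a + b; }, 0.0f)
        // sums all six elements; passing {1} as the axes argument gives per-row sums.
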
    • e7cf2662
      Robert Kimball authored
    • Add GPURuntimeContext and GPUPrimitiveEmitter to the gpu transformer (#837) · 026bede0
      Chris Sullivan authored
      * Begin prototype of cudnn_emitter.
      
      * Added GPURuntimeContext to gpu_external_function for passing through to JIT functions.
      
      * gpu_emitters now utilize gpu runtime context.
      
      * Moved cublas and cudnn handles into GPURuntimeContext pointer and out of callframe EntryPoint.
      
      * Added CUDNNEmitter, comparable to MKLDNNEmitter,
      which allows for cudnn kernels to be defined via
      lambda primitives that are emitted and
      subsequently called during graph execution.
      An example implementation is provided for op::Sum.
      
      * GPURuntimeContext should be stored as unique_ptr in external function.
      
      * Extract raw pointer from unique for cudnn_emitter.
      
      * Removing unrelated code from PR.
      
      * GPURuntimeContext needs to be a strict C interface in case
      the native compiler and clang are utilizing different glibc ABIs.
      Updated to reflect this.
      
      * Added cudnn::primitive typedef for better readability.
      
      * Moved allocation of CudaFunctionPool to external function
      so that it is available during gpu emission.
      
      * Fixed too-late initialization of cudart.
      
      * CUDNNEmitter moved into superset class GPUPrimitiveEmitter.
      The GPUPrimitiveEmitter handles the emission of all gpu primitives,
      including cudnn, cuda, and cublas. CUBLASEmitter support not yet included.
      
      * Added unordered_map for caching primitives in the gpu_emitter.
      
      * Added dtor to GPUPrimitiveEmitter to cleanup compiled functions.
      
      * Adding back a serialized model graph that was accidentally removed.
      
      * Added a few additional helpers to use ngraph::row_major_strides.
      
      * added whitespace per @fengleitian's comment
      
      * Remove implicit type conversions from size_t to int.
      
      * Add op::MaxPool, op::MaxPoolBackprop and op::Pad to GPU transformer (#817)
      
      * Added pooling for 1 and 2 dimensions. 1d uses a cuda kernel and 2d utilizes cudnn.
      Padding is not yet supported.
      
      * Normalized call signature on gpu emission for 1d max pool. Added a few comments.
      
      * Max pool backprop implementation in progress. Amend this commit.
      
      * Max pool backprop implemented. Note that cuDNN
      requests the output tensor for the maxpool operation but it is not required for computation.
      
      * Formatting and invocation for maxpool changed.
      
      * Fixed too-late initialization of cudart.
      
      * Added padding kernel that is used with maxpool. Need to investigate remaining tests.
      
      * Changed dimensionality check to correctly
      determine if data is 1d or not.
      
      * Added 3d MaxPooling (forward), verified by forcing 2d case to use Nd pooling routines.
      
      * Added 3d MaxPooling (backward), verified by forcing 2d case to use Nd pooling routines.
      
      * Moved cudnn prologues for maxpool into ngraph runtime and out of primitive so
      that the only execution occurring on the JIT runtime is the evaluation of the op kernel.
      
      * Refactored forward and backward pooling into single CUDNNEmitter::build_pooling interface
      with a runtime switch to determine if the op is forward or backward propagation.
      
      * Cache preconstructed cudnn kernel for maxpool if it has already been constructed.
      
      * Forgot to add padding arrays back into cudnn kernel for MaxPool in the 2d case.
      
      * Fixed namespace issues and used join(...,'_')
      
      * Refactored 4d/Nd tensor descriptor builder into single function.
      
      * Changed conditionals and comments. Now throws if MaxPool on more than 3 spatial dimensions is requested.
      
      * Fixed forward declare for GPURuntimeContext (class -> struct).
      
      * Clang complains about missing braces on brace-initializer. Fixed implicit conversions.
      
      * Fixed implicit conversions (clang).
      
      * Reverting changes on autodiff test for maxpool. @Krovatkin will update later.
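
      The CUDNNEmitter/GPUPrimitiveEmitter arrangement described above comes down to: build each kernel once as a callable, keep it in a table, hand the generated code an index to invoke at run time, and use an unordered_map so an identical request reuses the cached primitive. A minimal sketch of that pattern in plain C++ (no CUDA/cuDNN calls; class and method names are illustrative, not the nGraph ones):

        #include <cstddef>
        #include <functional>
        #include <string>
        #include <unordered_map>
        #include <vector>

        // A "primitive" is just a callable taking the kernel's input/output pointers.
        using primitive = std::function<void(void** inputs, void** outputs)>;

        class PrimitiveEmitter
        {
        public:
            // Return the index of a previously built primitive for `key`,
            // or build it with `builder`, cache it, and return the new index.
            size_t register_primitive(const std::string& key,
                                      const std::function<primitive()>& builder)
            {
                auto it = m_cache.find(key);
                if (it != m_cache.end())
                    return it->second;
                m_primitives.emplace_back(builder());
                size_t index = m_primitives.size() - 1;
                m_cache.emplace(key, index);
                return index;
            }

            // Called from the generated code at graph-execution time.
            void invoke(size_t index, void** inputs, void** outputs) const
            {
                m_primitives[index](inputs, outputs);
            }

        private:
            std::vector<primitive> m_primitives;
            std::unordered_map<std::string, size_t> m_cache;
        };
        // Usage sketch:
        //   size_t idx = emitter.register_primitive("max_pool_2d_3x3", [] {
        //       return primitive([](void** in, void** out) { /* launch kernel here */ });
        //   });
        //   ... later, during execution: emitter.invoke(idx, inputs, outputs);
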
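      The point above about GPURuntimeContext being a strict C interface (the host compiler and the clang-compiled JIT code may sit on different C++/glibc ABIs) boils down to sharing only a plain C struct of opaque handles and function pointers across that boundary. A hedged sketch of the idea (illustrative only; not the actual nGraph struct):

        #include <stddef.h>

        extern "C" {

        // Opaque handle types: the JITed code never dereferences these, it only
        // forwards the pointers to library calls, so no C++ types cross the boundary.
        typedef struct cublas_handle_t cublas_handle_t;
        typedef struct cudnn_handle_t cudnn_handle_t;

        // Each emitted primitive is reachable through a plain C function pointer.
        typedef void (*gpu_primitive_fn)(void** inputs, void** outputs);

        // POD context handed to the JITed function at call time.
        struct gpu_runtime_context_sketch
        {
            cublas_handle_t* cublas_handle;
            cudnn_handle_t* cudnn_handle;
            gpu_primitive_fn* primitives; // table of emitted kernels
            size_t num_primitives;
        };

        // Helper with C linkage that the generated code can call directly.
        inline void invoke_primitive(gpu_runtime_context_sketch* ctx,
                                     size_t index, void** inputs, void** outputs)
        {
            ctx->primitives[index](inputs, outputs);
        }

        } // extern "C"
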
  9. 12 Apr, 2018 1 commit