Commits · b409cc72e7b69bdbd4a52f97c9393f86eec2f16d · submodule / ngraph

20 Jul, 2018 15 commits
- CPU Direct Execution: Implement Maximum and Minimum · b409cc72
  Jaikrishnan Menon authored Jul 20, 2018
  
  b409cc72
- CPU Direct Execution: Implement Sum reductions · 57f28799
  Jaikrishnan Menon authored Jul 20, 2018
  
  57f28799
- CPU Direct Execution: Implement logical and relational op kernels · 95279e32
  Jaikrishnan Menon authored Jul 20, 2018
  
  95279e32
- CPU Direct Execution: Add cwise Pow, Log and Exp kernels · 5dc3867b
  Jaikrishnan Menon authored Jul 20, 2018
  
  5dc3867b
- CPU Direct Execution: Implement Min · 08ab88be
  Jaikrishnan Menon authored Jul 20, 2018
  
  08ab88be
- CPU Direct Execution: Implement common reduction builder and Max · 60ae9e8c
  Jaikrishnan Menon authored Jul 20, 2018
```
Also modify existing kernel so it works within the builder framework
```
  60ae9e8c
- CPU Direct Execution: Implement Broadcast · 2d543ee4
  Jaikrishnan Menon authored Jul 20, 2018
  
  2d543ee4
- CPU Direct Execution: Migrate Reshape to auto-registration · 751e6269
  Jaikrishnan Menon authored Jul 20, 2018
  
  751e6269
- CPU Direct Execution: Migrate all convolution variants to auto-registration · f8d4daa2
  Jaikrishnan Menon authored Jul 20, 2018
  
  f8d4daa2
- CPU Direct Execution: Migrate AvgPool to auto-registration · 9376bd0c
  Jaikrishnan Menon authored Jul 20, 2018
  
  9376bd0c
- CPU Direct Execution: Disable transitive behavior for pass execution · aebd416a
  Jaikrishnan Menon authored Jul 20, 2018
  
  aebd416a
- CPU Direct Execution: Implement Floor, Negative and Sqrt · b56828fb
  Jaikrishnan Menon authored Jul 20, 2018
  
  b56828fb
- CPU Direct Execution: Implement Logical and Relational binary ops · 0692dd8c
  Jaikrishnan Menon authored Jul 20, 2018
```
and remove broadcast, which will be replaced with an Eigen implementation
```
  0692dd8c
- CPU Direct Execution: Implement Subtract · e2d1b5bc
  Jaikrishnan Menon authored Jul 20, 2018
  
  e2d1b5bc
- CPU Direct Execution: Refactor and implement builder auto-registration · 6576932f
  Jaikrishnan Menon authored Jul 20, 2018
```
This allows op builders to be self-contained changesets
```
  6576932f
19 Jul, 2018 2 commits

update version and add glossary defs (#1215) · 1df7602e

L.S. Cook authored Jul 19, 2018

* update version and add glossary defs

* clean up graph rewrite code blocks

* PR feedback

* add better details to LSTM def

* RNN def generalized

* adding fancy formulas to RNN def glossary entry

* Address API breaking change in PR 1164

* all of the documentation re default install path needed updated with pr 1164

* Assert manual compilation process to build ngraph_dist locally as a sensible default

1df7602e

IntelGPUBackend: const, div, maxpool and max operations (#1234) · 8908c9df

shssf authored Jul 19, 2018

* IntelGPUBackend: const, div, maxpool and max operations

* IntelGPUBackend: negative, abs, relu, sqrt, tanh and substract operations

* Update intelgpu_backend.cpp

8908c9df

18 Jul, 2018 13 commits

Pool tests updated to check all backends (#1245) · e2255fbd
Robert Kimball authored Jul 18, 2018
```
* make pool test check backends other than CPU

* more unit test cleanup
```
e2255fbd
Fix incorrect divide-by-zero test (#1243) · 7c7c5d62
Jaikrishnan Menon authored Jul 18, 2018

7c7c5d62
Fix run_passes semantics, but default to old transitive behavior (#1242) · e8fa980a
Jaikrishnan Menon authored Jul 18, 2018

e8fa980a

onnx [3]: add 'constant' operator (#1197) · ad12723b

Artur Wojcik authored Jul 18, 2018

* onnx: add 'constant' operator
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>

* onnx: getting attribute value by name
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>

* onnx: fix code style
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>

* onnx: fix clang compilation warnings
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>

* onnx: incorporate review comments
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>

ad12723b

CPU Loop Kernel Fusion optimization (#1190) · e3ad1b31

Nick Korovaiko authored Jul 18, 2018

* cpu loop kernel fusion pass

*  remove extra code

* bounded relu test

* address scotts feedback

e3ad1b31

IntelGPUBackend: BatchNorm 5x1 operation (#1244) · 9a3a0314

shssf authored Jul 18, 2018

* IntelGPUBackend: BatchNorm 5x1 operation

* Update intelgpu_op_batchnorm.cpp

* PR1244 Comments are adressed

9a3a0314

change install rpath to (#1237) · f3d7946b
Robert Kimball authored Jul 18, 2018

f3d7946b
Emit cacheable attribute on parameter node (#1240) · e91897e2
Jayaram Bobba authored Jul 18, 2018

e91897e2
Fix a segfault in the strided conv optimization (#1230) · 4334fe56
Adam Procter authored Jul 18, 2018
```
* Fix a segfault in the strided conv optimization

* Only bail if all *live* users are not Convolution
```
4334fe56
Patch typo (#1233) · 782594d2
L.S. Cook authored Jul 18, 2018
```
* Draft of updates for JIRA tasks WIP

* meta to correct spelling
```
782594d2

Modify TBB graph nodes creation and deletion (#1226) · 5190ee90

Amy Zhuang authored Jul 18, 2018

* Modify TBB graph nodes creation and deletion
* Add a graph* member to CPURuntimeContext.
* Create nodes the first time a function is called, all the following calls only exectue the computation.
* Delete nodes when cleanup_runtime_context is called.

* Add TBB global_control* and task_scheduler_init* members to CPURuntimeContext.

* Remove one comment.
Do not write two TBB header files and one #define to generated c++ source code.

* Move TBB header file and #define before other header files in generated c++ source code.

* Move one comment to the top in generated c++ source code.

5190ee90

Inplace results (#1162) · 36473a8a
Nick Korovaiko authored Jul 18, 2018
```
* inplace results

* fix parameter propagation

* fix python tests
```
36473a8a
Make Common Function Elimination a pass (#1236) · f078800c
Robert Kimball authored Jul 18, 2018
```
* change GPU to use cfe pass

* update per review comments
```
f078800c

17 Jul, 2018 2 commits

Fix broken tensor write/bug introduced by a cleanup (#1232) · 4d3e4721
Jaikrishnan Menon authored Jul 17, 2018

4d3e4721

Added more convolution variants to DEX (#1223) · 9bb0b653

Jayaram Bobba authored Jul 17, 2018

* CPU Direct Execution: Implement ConvertLayout and refactor

* CPU Direct Execution: Implement Convolution

* 1) Adds computation reuse to direct execution
2) Add avg_pool, broadcast and convolution_bias to direct execution
3) Moved some computation reuse utility functions to graph_utils

* Use lists instead of vectors to avoid reallocation overheads

* - Added convolution variants to direct execution
- Removed ConvolutionBiasRelu, use ConvolutionBias instead
- Reduced code duplication by moving functionality to mkldnn_emitter
  from cpu_emitter

* Style fix

* Moved mkldnn build_convolution to a templated method

* Style fix

* refactored mkldnn conv bprop builders

* Style fix

9bb0b653

14 Jul, 2018 4 commits
- move long building tests to the be the first tests built with the hope of… · cce0c224
  Robert Kimball authored Jul 14, 2018
```
move long building tests to the be the first tests built with the hope of reducing build time. (#1229)
```
  cce0c224
- Remove regex from Backend.cpp (#1228) · 295ed8eb
  Robert Kimball authored Jul 14, 2018
  
  295ed8eb
- make cuda call async, add cuda async timer (#1204) · e6c3d5e3
  Fenglei authored Jul 14, 2018
```
* using async gpu timers

* remove sync for cuda calls, add async gpu stopwatch, add count to timing-detail

* add debug sync

* make timer static

* move timer to runtime context
```
  e6c3d5e3
- Draft of updates for JIRA tasks WIP (#1227) · 1a074e5a
  L.S. Cook authored Jul 14, 2018
```
* Draft of updates for JIRA tasks WIP

* correct typo

* more cleanup

* more cleanup
```
  1a074e5a
13 Jul, 2018 4 commits

Refactored GPU backend state into BackendContext (#1186) · 55a25d41

Chris Sullivan authored Jul 13, 2018

* Refactored GPU backend state into BackendContext and moved it to the highest level GPU_Backend.
Some bugs have appeared in so doing. Needs investigation.

* extra *block_size

* change grid_size to threads

* Bug fix in softmax cache parameters.

* Additional bug fix for maxpool1d cache parameters.

* Bug fix in softmax cache parameters.

* Additional bug fix for maxpool1d cache parameters.

* Remove temporary print statements.

* Use nthreads in primitive hash.

* Switched from using stack references for cudnn and cublas handles to heap pointers held only the c-struct GPURuntimeContext but managed by the GPU_Backend.

* Refactored the use of GPURuntimeContext* ctx throughout the emitters.

* Use std::prev instead of operator-- for memory iteratory capture

* bug fix from abaf1d7

55a25d41

Backend/API: Implementation of the call method for IntelGPU (#1199) · 8bde818c

dmyershov authored Jul 13, 2018

* Backend/API: Implementation of the call method for IntelGPU

* intel_gpu_style_fix_1199

* Copy memory from clDNN to Tensor

* Code style fix in 1199.2

8bde818c

get_subgraph_outputs (towards checking that intermediate nodes in a matched graph not used) (#1207) · 83e7dba5
Nick Korovaiko authored Jul 13, 2018
```
* get_subgraph_outputs

* simplify the condition
```
83e7dba5
minor speed increase (#1218) · 33b54ce1
Robert Kimball authored Jul 13, 2018

33b54ce1