- 19 Jul, 2018 2 commits
-
L.S. Cook authored
* update version and add glossary defs * clean up graph rewrite code blocks * PR feedback * add better details to LSTM def * RNN def generalized * adding fancy formulas to RNN def glossary entry * Address API breaking change in PR 1164 * all documentation regarding the default install path needed updating with PR 1164 * Assert the manual compilation process to build ngraph_dist locally as a sensible default
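The "fancy formulas" added to the RNN glossary entry are not reproduced in this log; the standard single-layer RNN recurrence (an assumption about what the entry contains, with the usual symbol names) is:

```latex
h_t = \tanh\left(W_{xh} x_t + W_{hh} h_{t-1} + b_h\right), \qquad
y_t = W_{hy} h_t + b_y
```

Here \(x_t\) is the input at step \(t\), \(h_t\) the hidden state, and the \(W\) matrices and \(b\) vectors are learned parameters shared across time steps.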
-
shssf authored
* IntelGPUBackend: const, div, maxpool and max operations * IntelGPUBackend: negative, abs, relu, sqrt, tanh and subtract operations * Update intelgpu_backend.cpp
-
- 18 Jul, 2018 13 commits
-
Robert Kimball authored
* make pool test check backends other than CPU * more unit test cleanup
-
Jaikrishnan Menon authored
-
Jaikrishnan Menon authored
-
Artur Wojcik authored
* onnx: add 'constant' operator Signed-off-by: Artur Wojcik <artur.wojcik@intel.com> * onnx: getting attribute value by name Signed-off-by: Artur Wojcik <artur.wojcik@intel.com> * onnx: fix code style Signed-off-by: Artur Wojcik <artur.wojcik@intel.com> * onnx: fix clang compilation warnings Signed-off-by: Artur Wojcik <artur.wojcik@intel.com> * onnx: incorporate review comments Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
-
Nick Korovaiko authored
* cpu loop kernel fusion pass * remove extra code * bounded relu test * address Scott's feedback
-
shssf authored
* IntelGPUBackend: BatchNorm 5x1 operation * Update intelgpu_op_batchnorm.cpp * PR 1244 comments are addressed
-
Robert Kimball authored
-
Jayaram Bobba authored
-
Adam Procter authored
* Fix a segfault in the strided conv optimization * Only bail if all *live* users are not Convolution
-
L.S. Cook authored
* Draft of updates for JIRA tasks WIP * meta to correct spelling
-
Amy Zhuang authored
* Modify TBB graph nodes creation and deletion * Add a graph* member to CPURuntimeContext. * Create nodes the first time a function is called; all the following calls only execute the computation. * Delete nodes when cleanup_runtime_context is called. * Add TBB global_control* and task_scheduler_init* members to CPURuntimeContext. * Remove one comment. Do not write two TBB header files and one #define to generated C++ source code. * Move TBB header file and #define before other header files in generated C++ source code. * Move one comment to the top in generated C++ source code.
-
Nick Korovaiko authored
* inplace results * fix parameter propagation * fix python tests
-
Robert Kimball authored
* change GPU to use cfe pass * update per review comments
-
- 17 Jul, 2018 2 commits
-
Jaikrishnan Menon authored
-
Jayaram Bobba authored
* CPU Direct Execution: Implement ConvertLayout and refactor * CPU Direct Execution: Implement Convolution * 1) Adds computation reuse to direct execution 2) Add avg_pool, broadcast and convolution_bias to direct execution 3) Moved some computation reuse utility functions to graph_utils * Use lists instead of vectors to avoid reallocation overheads * - Added convolution variants to direct execution - Removed ConvolutionBiasRelu, use ConvolutionBias instead - Reduced code duplication by moving functionality to mkldnn_emitter from cpu_emitter * Style fix * Moved mkldnn build_convolution to a templated method * Style fix * refactored mkldnn conv bprop builders * Style fix
-
- 14 Jul, 2018 4 commits
-
Robert Kimball authored
Move long-building tests to be the first tests built, with the hope of reducing build time. (#1229)
-
Robert Kimball authored
-
Fenglei authored
* using async gpu timers * remove sync for cuda calls, add async gpu stopwatch, add count to timing-detail * add debug sync * make timer static * move timer to runtime context
-
L.S. Cook authored
* Draft of updates for JIRA tasks WIP * correct typo * more cleanup * more cleanup
-
- 13 Jul, 2018 11 commits
-
Chris Sullivan authored
* Refactored GPU backend state into BackendContext and moved it to the highest-level GPU_Backend. Some bugs have appeared in so doing. Needs investigation. * extra *block_size * change grid_size to threads * Bug fix in softmax cache parameters. * Additional bug fix for maxpool1d cache parameters. * Remove temporary print statements. * Use nthreads in primitive hash. * Switched from using stack references for cudnn and cublas handles to heap pointers held only by the C-struct GPURuntimeContext but managed by the GPU_Backend. * Refactored the use of GPURuntimeContext* ctx throughout the emitters. * Use std::prev instead of operator-- for memory iterator capture * bug fix from abaf1d7
-
dmyershov authored
* Backend/API: Implementation of the call method for IntelGPU * intel_gpu_style_fix_1199 * Copy memory from clDNN to Tensor * Code style fix in 1199.2
-
Nick Korovaiko authored
* get_subgraph_outputs * simplify the condition
-
Robert Kimball authored
-
Jaikrishnan Menon authored
-
Jayaram Bobba authored
* CPU Direct Execution: Implement ConvertLayout and refactor * CPU Direct Execution: Implement Convolution * 1) Adds computation reuse to direct execution 2) Add avg_pool, broadcast and convolution_bias to direct execution 3) Moved some computation reuse utility functions to graph_utils * Use lists instead of vectors to avoid reallocation overheads * - Style fix * style fix
-
Fenglei authored
* refactor external function * working version * fix bug * add emit_functions, emit_declare_constants, emit_declare_functions * add std:: * add functions declaration * fix bugs * fix bugs * separate temp memory allocation and release * add invoke_constant_ptr function, clean up outputs for function * fix bugs, compiled ok * add ctx to emit_declare_constant * cleanup code, code style * remove using std, code style * revert std changes * change function names based on Chris's comments * add ResultCopyElimination to pass_manager * clang format
-
shssf authored
* Backend/API: Implementation of ADD and MUL operations in the compile method for IntelGPU * Branch merge conflicts resolved * Parameters number check moved to function. RESULT operation handling added.
-
Louis Feng authored
-
Chris Sullivan authored
* Bug fix in softmax cache parameters. * Additional bug fix for maxpool1d cache parameters. * Formatting. * Use nthreads in primitive hash.
-
Fenglei authored
* add gpu_timer to external function * compiled version * working version * using block_begin and block_end * add the missing ';' * move slice to cuda emitter * change size_t to uint32_t in kernel * working version * change block size from 1 to 64 * fix bugs * nthreads needs to be size_t in broadcast op * add rank to kernel name hash * change reshape to cuda_emitter * fix bugs * bug, remove rank from kernel * clang format * update slice in convolution * resolve index conflict * change align to align_to_blocksize, add overflow check * add grid size check and fix pool merge bug * code style, change names * fix merge conflict * change kernel_runner to kernel_launch
-
- 12 Jul, 2018 4 commits
-
Louis Feng authored
* reshape in place without copying data if possible. * added reshape and broadcast to CSE. * Fixed debug messages.
-
Robert Kimball authored
* remove custom install path * fix travis build * Add NGRAPH_INSTALL_PREFIX as an alias for CMAKE_INSTALL_PREFIX to make our unit tests pass. * change install path setting
-
Robert Kimball authored
* open only the unversioned library but check that it is built against the correct version of ngraph * review comments
-
Fenglei authored
* add CUDA_SAFE_CALL to all cuda calls * add CUDA_RT_SAFE_CALL * add null ptr check before free * init pointer to nullptr * consolidate conditions
-
- 11 Jul, 2018 2 commits
-
Jaikrishnan Menon authored
* CPU Direct Execution: Implement ConvertLayout and refactor * CPU Direct Execution: Implement Convolution
-
Pruthvi authored
-
- 10 Jul, 2018 1 commit
-
Adam Rogowiec authored
* Enable retrieving data from Constant in python. * Test on wide value range.
-
- 09 Jul, 2018 1 commit
-
Robert Kimball authored
* Faster liveness. Memory manager optimized for non-sharing of tensors. Add pass manager profiler. * Move pass profiler to a separate PR * Move Memory Layout optimizations to a separate PR * use find instead of count
-