Commits · 237c4803c59138bd4c521765977b0e46dd94c333 · submodule / ngraph

02 Aug, 2018 11 commits

LRN (#1282) · 237c4803

Nick Korovaiko authored Aug 02, 2018

* lrn init

* fix comment

* mkldnn lrn (#1295)

* add serializer + fix compiler warnings

237c4803

Fix first_iteration (#1294) · 83a9d252

Jaikrishnan Menon authored Aug 02, 2018

* Fix the first_iteration flag so it works when more than one call-frame exists

Static variables defined in lambda expressions are not private to a lambda, so
move this flag to the runtime context.

* Shave off a few microseconds by initializing intermediates exactly once

* Make all execution paths use first_iteration in the runtime context

83a9d252
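
The reasoning behind the first_iteration fix above can be illustrated with a minimal C++ sketch. This is not the nGraph code: `RuntimeContext` is a hypothetical stand-in and both kernel factories are invented for illustration. A `static` local inside a lambda body belongs to the closure type's single call operator, so every closure created from the same lambda expression shares it. Keeping the flag in a per-call-frame context avoids that sharing.

```cpp
#include <cassert> // used by the usage check below
#include <functional>

// Hypothetical stand-in for a per-call-frame runtime context; the real
// nGraph type is not shown in this log.
struct RuntimeContext
{
    bool first_iteration = true;
};

// Problematic pattern: the static flag belongs to the lambda's single
// operator(), so every closure produced here shares one flag.
std::function<bool()> make_kernel_with_static()
{
    return [] {
        static bool first_iteration = true;
        const bool was_first = first_iteration;
        first_iteration = false;
        return was_first;
    };
}

// Alternative: keep the flag in the context handed to the kernel, so each
// call frame tracks its own first iteration.
std::function<bool(RuntimeContext&)> make_kernel_with_context()
{
    return [](RuntimeContext& ctx) {
        const bool was_first = ctx.first_iteration;
        ctx.first_iteration = false;
        return was_first;
    };
}

// Usage check: closures share the static, contexts stay independent.
void demo_first_iteration()
{
    auto a = make_kernel_with_static();
    auto b = make_kernel_with_static();
    assert(a());  // the very first call sees the flag set
    assert(!b()); // b observes the flag already cleared through a

    RuntimeContext frame1, frame2;
    auto kernel = make_kernel_with_context();
    assert(kernel(frame1));
    assert(kernel(frame2)); // a second frame still gets its first iteration
    assert(!kernel(frame1));
}
```

With the static version, a second call frame never sees its own first iteration, which matches the failure mode the commit message describes when more than one call frame exists.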

[Py] Add __repr__ to Strides and CoordDiff (#1291) · 870ab827

Michał Karzyński authored Aug 02, 2018

```
* [Py] Add __repr__ to Strides and CoordDiff

* Apply clang-format

* Repr fix

* Apply clang-format
```

870ab827

[Py] Add convolution_backprop_data to API (#1292) · 5c56923a

Michał Karzyński authored Aug 02, 2018

```
* [Py] Add convolution_backprop_data to API

* Conv fix
```

5c56923a

softmax & convolution memory primitive caching (#1290) · bb94fa85

Chris Sullivan authored Aug 02, 2018

* Updated softmax.

* Formatting.

* Updated convolution.

* Use build_primitive overloading. Add helper to emit type_string given a node.

* Formatting.

* Update ConvolutionBackpropData.

* convolution backprop & max pool memory primitive caching (#1303)

* Updated ConvolutionBackpropFilters.
* Update MaxPool.

* Update Max and Min. (#1307)

bb94fa85

gpu element op optimize (#1287) · eba9439b

Fenglei authored Aug 02, 2018

```
* move add,mult,min,max,sqrt to elementwise_op, increase op per threads
```

eba9439b

Implement trigonometric ops for direct execution. (#1289) · bcd1daa2

Amy Zhuang authored Aug 02, 2018

```
* Implement trigonometric ops for direct execution.

* Rename files.
```

bcd1daa2

Fix SUSE build and run errors (#1284) · 54e5a816

Robert Kimball authored Aug 02, 2018

* build on suse w/gcc 4.8.5

* fix SUSE build error

* add comments

* remove template function

* update per review comment

* fix nan check emitted code

54e5a816

Interpreter implementation of batch norm bprop (#934) · c6a0fae3

varun-intel authored Aug 02, 2018

* updated

* type prop

* disable test in manifest

* try to exclude

* style

* double

* double

* more

* style

* more

* vecs

* fix goe

c6a0fae3

wip (#1293) · 2a64baca
Robert Kimball authored Aug 02, 2018

2a64baca

Work around some buggy (and deprecated) rpath directives (#1256) · 84546bbc

Jaikrishnan Menon authored Aug 02, 2018

* Work around some buggy (and deprecated) rpath directives

* Add missing newline

* Revert "Add missing newline"

This reverts commit 95aebb7f14850afcd59c53ece0bb4663b8c38660.

* Encoding fixes

84546bbc

01 Aug, 2018 6 commits

More efficient sum for some cases (#1251) · f8941a12

Louis Feng authored Aug 01, 2018

* hacking to support dot of 3 by 2 inputs with gemm_batch.

* clean up.

* testing inplace reshape.

* fixed a compile error.

* added comments on todo.

* check for output.

* check for annotation.

* more optimizations WIP.

* sum simd.

* moved parallel for

* testing sum vectorization.

* fixed merge errors.

* sum wip.

* more logic.

* sum refactor and clean up.

* clean up.

* removed unrelated changes.

* removed related changes from merge.

* fixed clang compile errors.

f8941a12

IntelGPU backend: Sum and redeveloped Broadcast operation (#1276) · 92adea38
shssf authored Aug 01, 2018

92adea38

move onehot and reverse op to cuda_emitter (#1266) · cb84305e

Fenglei authored Aug 01, 2018

* move to cuda_emitter

* fix bug, clang format

* size_t to uint32_t

* reverse_axes

* add rank back, clang format

* remove unused code and file

* remove unused code and file

* manually merge with master

cb84305e

IntelGPU backend: Power, Sigmoid and ReluBackprop operations (#1286) · 167844e4

Anna Alberska authored Aug 01, 2018

* IntelGPU backend: Power, Sigmoid and ReluBackprop operations

* style changed to ReluBackprop

* Update intelgpu_backend.cpp

167844e4

IntelGPU backend: Convolution operation (#1285) · f534650d
dmyershov authored Aug 01, 2018

f534650d

Refactoring MMB (#1224) · 6bca3efd

Nick Korovaiko authored Aug 01, 2018

* rank3xrank2 cpu_emitter version 1

* refactoring matmulbias

* add comment

6bca3efd

29 Jul, 2018 2 commits

IntelGPU backend: Dot operation (partially implemented) (#1275) · 5927bbe4

shssf authored Jul 29, 2018

* IntelGPU backend: Dot operation (partially implemented)

* PR1275. Debug output deleted.

* PR1275. Comments addressed

5927bbe4

Add no-throw error checks (#1264) · c007740b

Chris Sullivan authored Jul 29, 2018

* Broadcast and Pad bug fix.

* Added NO_THROW version of the cuda error checking defines. Now utilizing these in dtors.

This reverts commit 68d9d6eafb1475c83c47229ab3c784c3d392ddbd.

* Revert "Broadcast and Pad bug fix."

This reverts commit 099c79792a2e7b9b8727b48de90f623953691f4c.

c007740b
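
The motivation for a no-throw variant of an error-checking macro can be sketched as follows. Destructors are implicitly `noexcept`, so a check that throws inside one ends in `std::terminate`; a variant that only reports the error is safe to use there. Everything named here is an assumption for illustration: `Status` stands in for a CUDA-style status code, and `SKETCH_CHECK`, `SKETCH_CHECK_NO_THROW`, `release`, and `Buffer` are hypothetical, not the macros or types added by the commit.

```cpp
#include <cassert> // used by the usage check below
#include <iostream>
#include <sstream>
#include <stdexcept>

// Hypothetical stand-in for a CUDA-style status code such as cudaError_t.
enum class Status
{
    success,
    failure
};

// Counts errors reported by the no-throw path (only so the sketch is testable).
static int g_reported_errors = 0;

// Throwing variant: suitable for ordinary code paths.
#define SKETCH_CHECK(expr)                                                     \
    do                                                                         \
    {                                                                          \
        if ((expr) != Status::success)                                         \
        {                                                                      \
            std::stringstream ss;                                              \
            ss << "check failed: " #expr " at " << __FILE__ << ":" << __LINE__; \
            throw std::runtime_error(ss.str());                                \
        }                                                                      \
    } while (0)

// No-throw variant: reports the failure and continues, so it is safe in
// destructors, which are implicitly noexcept.
#define SKETCH_CHECK_NO_THROW(expr)                                            \
    do                                                                         \
    {                                                                          \
        if ((expr) != Status::success)                                         \
        {                                                                      \
            ++g_reported_errors;                                               \
            std::cerr << "check failed: " #expr " at " << __FILE__ << ":"      \
                      << __LINE__ << std::endl;                                \
        }                                                                      \
    } while (0)

// Hypothetical resource-release call that can fail.
Status release(bool ok)
{
    return ok ? Status::success : Status::failure;
}

class Buffer
{
public:
    explicit Buffer(bool release_ok)
        : m_release_ok(release_ok)
    {
    }
    // A throwing check here would terminate the program on failure.
    ~Buffer() { SKETCH_CHECK_NO_THROW(release(m_release_ok)); }

private:
    bool m_release_ok;
};

// Usage check: a failing release in a destructor is reported, not thrown.
void demo_no_throw()
{
    const int before = g_reported_errors;
    {
        Buffer b(false);
    }
    assert(g_reported_errors == before + 1);
}
```

The throwing form stays the default elsewhere, since silently continuing after a failed call is rarely what normal code paths want.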

28 Jul, 2018 3 commits

Rama/travis fixes fornow (#1272) · cd5fe431

rsketine authored Jul 28, 2018

* Update Dockerfile

Took out make -j 8 because we are running out of virtual memory.
Added df -k to get disk space info in the logs.

* Update .travis.yml

We bring up 3 containers here; I am reducing this to 2. If they are deployed on different machines, this should not matter.

* Update .travis.yml

3 in parallel should pass as well; 2 of them are passing, so trying this out.

cd5fe431

IntelGPU backend: Pad operation (#1267) · e2e7042a
shssf authored Jul 28, 2018
```
* IntelGPU backend: Pad operation

* PR1267. Comments addressed
```
e2e7042a

Add TBB flow graphs to DEX. (#1247) · 1c2e5b7a

Amy Zhuang authored Jul 28, 2018

* Add TBB flow graphs to DEX.

* Make edges from dummy start node to head nodes when traversing nodes.

* Use static_cast to cast TBB graph node.
Undefine __TBB_PREVIEW_LIGHTWEIGHT_POLICY.

* Code formatting.

* Remove clang -Wreserved-id-macro warning.

1c2e5b7a

27 Jul, 2018 8 commits

Add NGRAPH_INTRA_OP_PARALLELISM to control size of thread pools. (#1248) · 8ad38f2e
Amy Zhuang authored Jul 27, 2018
```
* Add NGRAPH_INTRA_OP_PARALLELISM to control size of thread pools.

* Initialize variable.
```
8ad38f2e
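
Reading a thread-pool size from an environment variable such as NGRAPH_INTRA_OP_PARALLELISM could look roughly like the sketch below. This log does not show how nGraph parses the variable, so the function names, the fallback behavior, and the upper bound of 1024 are all assumptions made for illustration.

```cpp
#include <cassert> // used by the usage check below
#include <cstdlib>

// Parse a thread count from text. Falls back when the text is missing,
// malformed, or outside [1, 1024] (the upper bound is an arbitrary
// assumption of this sketch).
int parse_thread_count(const char* text, int fallback)
{
    if (text == nullptr || *text == '\0')
    {
        return fallback;
    }
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    if (*end != '\0' || value < 1 || value > 1024)
    {
        return fallback;
    }
    return static_cast<int>(value);
}

// Hypothetical accessor: consult the environment, else use the fallback.
int intra_op_parallelism(int fallback)
{
    return parse_thread_count(std::getenv("NGRAPH_INTRA_OP_PARALLELISM"), fallback);
}

// Usage check for the parsing rules.
void demo_thread_count()
{
    assert(parse_thread_count("4", 8) == 4);
    assert(parse_thread_count(nullptr, 8) == 8);
    assert(parse_thread_count("abc", 8) == 8);
}
```

Separating the parsing from the `std::getenv` call keeps the rules easy to test without touching the process environment.
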

is_contained (#1257) · 81c48453

Nick Korovaiko authored Jul 27, 2018

81c48453

switch to using get_subgraph_outputs in LoopKernel (#1255) · 6457ed2e

Nick Korovaiko authored Jul 27, 2018

6457ed2e

CSE constant (#1271) · 953c65f8

Nick Korovaiko authored Jul 27, 2018

953c65f8

[Py] change python logger names from __file__ to __name__ (#1260) · f0283c6f

tsocha authored Jul 27, 2018

```
* Update input_validation.py

* Update runtime.py

* Update types.py

* Update broadcasting.py
```

f0283c6f

gpu concat optimize (#1259) · 38ba5c12

Fenglei authored Jul 27, 2018

```
* optimize concat

* compile success

* multi inputs

* clang format
```

38ba5c12

Add some convenience macros/classes for error messages (#1258) · deacf29a

Adam Procter authored Jul 27, 2018

* Testing out some ideas for better error messages on AvgPool

* Add uncaught_exception() check to ConstructionAssertLogger dtor

* More general assertion class, not homed inside Node

* Minor formatting change

* NODE_ASSERT for type prop failure

* Produce lighter-weight DummyAssertionHandler when assertion succeeds

* New ctor for AssertionHelper that takes a single location arg; more const&-ness for the constructors

* Remove move constructor for AssertionHelper; fix broken test in assertion.cpp

* Miscellaneous improvements

* Templatized AssertionHelper so different exception classes can be used; implemented TYPE_CHECK_ASSERT around this
* Changed from a "stack" of locations to a single location (the stack was too complicated)
* Added "FAIL" classes/macros which do not take a condition

* Rename a helper function

* Cleanup, cruft removal

* Add test to make sure the assert helper has the lifetime we expect

* Missing includes

deacf29a
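
The assertion-helper idea described in the bullets above (a temporary that collects a message with `<<` and throws from its destructor unless an exception is already propagating) can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the commit: `AssertionSketch` and its constructor arguments are hypothetical names, and the real AssertionHelper is templated on the exception class, which this sketch omits.

```cpp
#include <cassert> // used by the usage check below
#include <exception>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical, simplified assertion helper. A failed condition makes the
// destructor throw, so the message streamed in with << is complete by then.
class AssertionSketch
{
public:
    AssertionSketch(bool condition, const char* file, int line)
        : m_failed(!condition)
    {
        if (m_failed)
        {
            m_message << file << ":" << line << ": ";
        }
    }

    // Destructors are noexcept by default; this one must be allowed to throw.
    ~AssertionSketch() noexcept(false)
    {
        // Throwing while another exception is unwinding would terminate.
        if (m_failed && !exception_in_flight())
        {
            throw std::runtime_error(m_message.str());
        }
    }

    template <typename T>
    AssertionSketch& operator<<(const T& value)
    {
        if (m_failed)
        {
            m_message << value; // skip formatting work when the check passed
        }
        return *this;
    }

private:
    static bool exception_in_flight()
    {
#if __cplusplus >= 201703L
        return std::uncaught_exceptions() > 0;
#else
        return std::uncaught_exception();
#endif
    }

    bool m_failed;
    std::stringstream m_message;
};

// Usage check: the thrown message carries the location and streamed text.
void demo_assertion()
{
    std::string message;
    try
    {
        AssertionSketch(false, "pool.cpp", 7) << "bad window size " << 0;
    }
    catch (const std::runtime_error& e)
    {
        message = e.what();
    }
    assert(message == "pool.cpp:7: bad window size 0");
}
```

The temporary is destroyed at the end of the full expression, which is why the throw happens only after every `<<` operand has been appended.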

numpy is also needed to build python API docs (#1253) · 289586ab
L.S. Cook authored Jul 27, 2018

289586ab

26 Jul, 2018 4 commits

change function to take reference rather than shared_ptr (#1238) · 3b578db4
Robert Kimball authored Jul 26, 2018

3b578db4

Doc distributed training (#1104) · f78133d2

L.S. Cook authored Jul 26, 2018

* editing how to execute computation file for clarity and linenos

* Add placeholder for runtime docs

* Update section on backends, interpreter, and FPGA options

* add updated master to fix python_ci

* Weird autosummary issue reverted

* Clarify new section

* fix up docs

* Update pattern matcher doc based on Nik's presentation slides WIP

* Update doc structure and examples

* remove old folder

* Fix broken Tensorview refs

* new section on distr training

* updated index w/drafted outline

* . helping people document code more efficiently

* edit WIP branch

* WIP editing

* WIP editing

* init distributed doc

* PR review edits

* modify dist doc and dist mnist_mlp

* Finish PR review comment fixes so far

* Improving distributed training docs

* Fix build error now that we have documented interface backends use

* update example build and run

* update how-to distributed training doc

* Editing distr train docs

* Reword section to avoid strange doc build error

* rebuild for zero errors for CI

* split patternmatcher PR

* PR feedback added

* Add more help and detail for MXNet and neon distr

* Resolve merge conflicts due to patternmatcher doc split

* Resolve merge conflicts due to patternmatcher doc split

* Resolve build errors manually

* These files are already added to the branch

* fix style

* update with glossary def and link to Intel paper on synchronous SGD

* fix link to sgd

* remove comm_rank in dist example

f78133d2

Fix launch parameter bug for broadcast and pad. (#1261) · dcdaf26e
Chris Sullivan authored Jul 26, 2018
```
* Broadcast and Pad bug fix.
```
dcdaf26e

IntelGPU backend: broadcast operation (#1252) · d4349db8

shssf authored Jul 26, 2018

* IntelGPUBackend: Broadcast operation

* IntelGPUBackend: more tests for Broadcast operation

* Move macro to static C function in Broadcast tests

d4349db8

25 Jul, 2018 1 commit
- Merge pull request #1249 from NervanaSystems/jmenon/dex4 · 8c1aad8f
  Jayaram Bobba authored Jul 25, 2018
```
CPU Direct Execution Part 4
```
  8c1aad8f

23 Jul, 2018 5 commits
- Merge branch 'master' into dex4 · 0b20b1a7
  Jaikrishnan Menon authored Jul 23, 2018
  
  0b20b1a7
- CMake: Allow switching out the default linker for gold (#1254) · 7a35cf81
  Jaikrishnan Menon authored Jul 23, 2018
  
  7a35cf81
- CPU Direct Execution: Implement Or · 2c69f2bf
  Jaikrishnan Menon authored Jul 23, 2018
  
  2c69f2bf
- CPU Direct Execution: Implement And · cf220930
  Jaikrishnan Menon authored Jul 23, 2018
  
  cf220930
- Revert "CI: Enable gold" · 05545092
  Jaikrishnan Menon authored Jul 23, 2018
```
This reverts commit 549a4fd1.
```
  05545092