Commits · 8009b47568821278ddc13a3871589dd986c008f1 · submodule / ngraph

04 Aug, 2018 2 commits
- Fix bugs in StaticInitializer and CudaContextManager (#1321) · 8009b475
  Chris Sullivan authored Aug 04, 2018
```
* Bug fix: StaticInitializer.

* Make CudaContextManager a member of GPU_Backend::BackendContext.

* fix formatting
```
- IntelGPU backend: Code refactored. No algo changed. (#1328) · 8ab89b29
  shssf authored Aug 04, 2018
  
03 Aug, 2018 15 commits

nbench: add option to run all models in a directory (#1279) · 2b26df18
Robert Kimball authored Aug 03, 2018
```
* add option to run all models in a directory

* add print for exception from benchmark
```
bn bprop test fix, comments and throws (#1325) · 11b992a7
Nick Korovaiko authored Aug 03, 2018


Preallocate intermediate buffers (#1231) · 0599a628

Chris Sullivan authored Aug 03, 2018

* Utilize GPUMemoryManager/Allocator for preallocation of intermediate tensor buffer memory.

* Formatting.

* Merge with master required rework of memory due to CFE pass. Moved function memory pool allocation to pass as a result.

* Formatting.

* Added pass source files.

* Updated tests to account for new assert check. All GPUAllocators should be destructed before allocation is made in GPUMemoryManager.

* GPUAllocator::close() can be used to close the allocator prior to destruction

* Removed open allocators. Replaced check with inspection of pass::MemoryManager node list.

* Formatting.

* Rename m_memory_buffers -> m_tensor_memory_buffers. Use full path to static alignment variable.

* FunctionMemoryReservation -> TensorMemoryReservation. Only return true in pass if reservation is made (bug fix).

* Moved static compilation mutex.

* Update external function with new pass name.

* GPU_ExternalFunction: Add s_memory_pool_alignment, remove optimize_and_assemble method.

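The preallocation described in this commit can be pictured as a planning step that hands out aligned offsets into one pool, so the pool can be allocated once afterwards. The following is only a rough sketch under that assumption: `PoolPlanner`, `reserve`, and `total_size` are hypothetical names, not the GPUMemoryManager or GPUAllocator API, and the alignment is assumed to be non-zero.

```cpp
#include <cstddef>

// Hypothetical planner (not the actual nGraph classes): it only records
// aligned offsets so that a single pool can be allocated after planning.
class PoolPlanner
{
public:
    // `alignment` is assumed to be greater than zero.
    explicit PoolPlanner(std::size_t alignment)
        : m_alignment(alignment)
    {
    }

    // Reserve `size` bytes and return the offset of the reservation
    // from the start of the pool.
    std::size_t reserve(std::size_t size)
    {
        std::size_t offset = m_total;
        m_total += round_up(size);
        return offset;
    }

    // Number of bytes the pool must provide for all reservations so far.
    std::size_t total_size() const { return m_total; }

private:
    // Round `size` up to the next multiple of the alignment.
    std::size_t round_up(std::size_t size) const
    {
        return (size + m_alignment - 1) / m_alignment * m_alignment;
    }

    std::size_t m_alignment;
    std::size_t m_total = 0;
};
```

With a 64-byte alignment, reserving 100 bytes and then 1 byte yields offsets 0 and 128 and a pool of 192 bytes.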

IntelGPU backend: BatchNorm operation completely redeveloped (#1318) · 45b50d06
shssf authored Aug 03, 2018

fix travis build...I hope (#1317) · 39278e7d
Robert Kimball authored Aug 03, 2018


Upstream for versioning (#1316) · b99dc1ef

L.S. Cook authored Aug 03, 2018

* update frameworkdocs

* revise docs with new MXNet bridge code instructions

* revise docs with new MXNet bridge code instructions

* remove broken merge conflict


IntelGPU backend: Select operation (#1314) · 77a703c2
dmyershov authored Aug 03, 2018


Upstream for versioning (#1309) · f8926a7b

L.S. Cook authored Aug 03, 2018

* update frameworkdocs

* revise docs with new MXNet bridge code instructions

* revise docs with new MXNet bridge code instructions


Added DEX execution support for ReluBprop (#1305) · 87b5758d
Pruthvi authored Aug 03, 2018

Added DEX support for (MaxPool + AvgPool) Backprop op for CPU backend (#1302) · 1fdf2d98
Pruthvi authored Aug 03, 2018
```
* Added DEX support for MaxPoolBackprop op for CPU backend

* Added DEX execution support for AvgPoolBackprop
```
Propagate input buffers for passthrough kernels (#1312) · 2a0e43ef
Jayaram Bobba authored Aug 03, 2018

Start of windows build (#1306) · ef309cf6
Robert Kimball authored Aug 03, 2018
```
* compiles but does not link
```
IntelGPU backend: Slice operation (#1304) · 7d6a41f3
shssf authored Aug 03, 2018

DEX MaxPoolWithIndices (#1299) · c38c76a7
Nick Korovaiko authored Aug 03, 2018
```
* dex max_pool_with_indices

* maxpoolwithindices (#1300)
```
dex group convolution (#1297) · b1239af4
Nick Korovaiko authored Aug 03, 2018


02 Aug, 2018 12 commits

CPU Direct Execution: Implement product reductions (#1296) · 1011f6c7
Jaikrishnan Menon authored Aug 02, 2018


LRN (#1282) · 237c4803

Nick Korovaiko authored Aug 02, 2018

* lrn init

* fix comment

* mkldnn lrn (#1295)

* add serializer + fix compiler warnings


Fix first_iteration (#1294) · 83a9d252

Jaikrishnan Menon authored Aug 02, 2018

* Fix the first_iteration flag so it works when more than one call-frame exists

Static variables defined in lambda expressions are not private to a lambda so
move this to the runtime context

* Shave off a few microseconds by initializing intermediates exactly once

* Make all execution paths use first_iteration in the runtime context

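The problem this commit describes can be reproduced in a few lines. A static local inside a lambda body belongs to the closure type's call operator, so every closure created from the same lambda expression shares it. The sketch below contrasts that with a flag kept in a per-call-frame context; `RuntimeContext`, `make_static_kernel`, and `make_context_kernel` are hypothetical stand-ins, not the actual nGraph types.

```cpp
#include <functional>

// Hypothetical stand-in for a per-call-frame runtime context.
struct RuntimeContext
{
    bool first_iteration = true;
};

// Tracks "first iteration" with a static local. The static is shared by
// every closure created from this lambda expression, so only the very
// first call across ALL call frames sees `true`.
std::function<bool()> make_static_kernel()
{
    return []() {
        static bool first_iteration = true;
        bool was_first = first_iteration;
        first_iteration = false;
        return was_first;
    };
}

// Tracks "first iteration" in the context passed by the caller, so each
// call frame gets its own flag.
std::function<bool(RuntimeContext&)> make_context_kernel()
{
    return [](RuntimeContext& ctx) {
        bool was_first = ctx.first_iteration;
        ctx.first_iteration = false;
        return was_first;
    };
}
```

Two kernels built by `make_static_kernel()` behave like one: once the first has run, the second never observes its own first iteration. With `make_context_kernel()`, each `RuntimeContext` reports its first iteration independently.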

[Py] Add __repr__ to Strides and CoordDiff (#1291) · 870ab827
Michał Karzyński authored Aug 02, 2018
```
* [Py] Add __repr__ to Strides and CoordDiff

* Apply clang-format

* Repr fix

* Apply clang-format
```
[Py] Add convolution_backprop_data to API (#1292) · 5c56923a
Michał Karzyński authored Aug 02, 2018
```
* [Py] Add convolution_backprop_data to API

* Conv fix
```

softmax & convolution memory primitive caching (#1290) · bb94fa85

Chris Sullivan authored Aug 02, 2018

* Updated softmax.

* Formatting.

* Updated convolution.

* Use build_primitive overloading. Add helper to emit type_string given a node.

* Formatting.

* Update ConvolutionBackpropData.

* convolution backprop & max pool memory primitive caching (#1303)

* Updated ConvolutionBackpropFilters.
* Update MaxPool.

* Update Max and Min. (#1307)


gpu element op optimize (#1287) · eba9439b
Fenglei authored Aug 02, 2018
```
* move add,mult,min,max,sqrt to elementwise_op, increase op per threads
```
Implement trigonometric ops for direct execution. (#1289) · bcd1daa2
Amy Zhuang authored Aug 02, 2018
```
* Implement trigonometric ops for direct execution.

* Rename files.
```

Fix SUSE build and run errors (#1284) · 54e5a816

Robert Kimball authored Aug 02, 2018

* build on suse w/gcc 4.8.5

* fix SUSE build error

* add comments

* remove template function

* update per review comment

* fix nan check emitted code


Interpreter implementation of batch norm bprop (#934) · c6a0fae3

varun-intel authored Aug 02, 2018

* updated

* type prop

* disable test in manifest

* try to exclude

* style

* double

* double

* more

* style

* more

* vecs

* fix goe


wip (#1293) · 2a64baca
Robert Kimball authored Aug 02, 2018


Work around some buggy (and deprecated) rpath directives (#1256) · 84546bbc

Jaikrishnan Menon authored Aug 02, 2018

* Work around some buggy (and deprecated) rpath directives

* Add missing newline

* Revert "Add missing newline"

This reverts commit 95aebb7f14850afcd59c53ece0bb4663b8c38660.

* Encoding fixes


01 Aug, 2018 6 commits

More efficient sum for some cases (#1251) · f8941a12

Louis Feng authored Aug 01, 2018

* hacking to support dot of 3 by 2 inputs with gemm_batch.

* clean up.

* testing inplace reshape.

* fixed a compile error.

* added comments on todo.

* check for output.

* check for annotation.

* more optimizations WIP.

* sum simd.

* moved parallel for

* testing sum vectorization.

* fixed merge errors.

* sum wip.

* more logic.

* sum refactor and clean up.

* clean up.

* removed unrelated changes.

* removed related changes from merge.

* fixed clang compile errors.


IntelGPU backend: Sum and redeveloped Broadcast operation (#1276) · 92adea38
shssf authored Aug 01, 2018


move onehot and reverse op to cuda_emitter (#1266) · cb84305e

Fenglei authored Aug 01, 2018

* move to cuda_emitter

* fix bug, clang format

* size_t to uint32_t

* reverse_axes

* add rank back, clang format

* remove unused code and file

* remove unused code and file

* manually merge with master


IntelGPU backend: Power, Sigmoid and ReluBackprop operations (#1286) · 167844e4

Anna Alberska authored Aug 01, 2018

* IntelGPU backend: Power, Sigmoid and ReluBackprop operations

* style changed to ReluBackprop

* Update intelgpu_backend.cpp


IntelGPU backend: Convolution operation (#1285) · f534650d
dmyershov authored Aug 01, 2018


Refactoring MMB (#1224) · 6bca3efd

Nick Korovaiko authored Aug 01, 2018

* rank3xrank2 cpu_emitter version 1

* refactoring matmulbias

* add comment


29 Jul, 2018 2 commits

IntelGPU backend: Dot operation (partially implemented) (#1275) · 5927bbe4

shssf authored Jul 29, 2018

* IntelGPU backend: Dot operation (partially implemented)

* PR1275. Debug output deleted.

* PR1275. Comments addressed


Add no-throw error checks (#1264) · c007740b

Chris Sullivan authored Jul 29, 2018

* Broadcast and Pad bug fix.

* Added NO_THROW version of the cuda error checking defines. Now utilizing these in dtors.

This reverts commit 68d9d6eafb1475c83c47229ab3c784c3d392ddbd.

* Revert "Broadcast and Pad bug fix."

This reverts commit 099c79792a2e7b9b8727b48de90f623953691f4c.

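The reason for a non-throwing check in destructors is that destructors are implicitly `noexcept` since C++11, so an exception escaping one ends in `std::terminate`. The sketch below shows the general pattern only. `Status`, `CHECK_CALL`, `CHECK_CALL_NO_THROW`, `release_resource`, and `Context` are hypothetical names standing in for CUDA error codes and the defines this commit adds; they are not the actual nGraph macros.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical status type standing in for a CUDA error code.
enum class Status
{
    ok,
    failure
};

// Counts failures reported by the non-throwing check (for illustration).
static int g_logged_errors = 0;

// Throwing check: suitable for ordinary code paths.
#define CHECK_CALL(expr)                                                     \
    do                                                                       \
    {                                                                        \
        if ((expr) != Status::ok)                                            \
        {                                                                    \
            throw std::runtime_error(std::string("call failed: ") + #expr);  \
        }                                                                    \
    } while (0)

// Non-throwing check: logs the failure and continues, so it is safe to
// use inside a destructor.
#define CHECK_CALL_NO_THROW(expr)                                            \
    do                                                                       \
    {                                                                        \
        if ((expr) != Status::ok)                                            \
        {                                                                    \
            ++g_logged_errors;                                               \
            std::fprintf(stderr, "call failed: %s\n", #expr);                \
        }                                                                    \
    } while (0)

// Hypothetical cleanup call that can fail.
Status release_resource(bool succeed)
{
    return succeed ? Status::ok : Status::failure;
}

class Context
{
public:
    explicit Context(bool release_ok)
        : m_release_ok(release_ok)
    {
    }

    // A failed release is logged instead of thrown.
    ~Context() { CHECK_CALL_NO_THROW(release_resource(m_release_ok)); }

private:
    bool m_release_ok;
};
```

Destroying a `Context` whose release fails only increments the error count and prints a message, while the same failure under `CHECK_CALL` raises `std::runtime_error`.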

28 Jul, 2018 3 commits

Rama/travis fixes for now (#1272) · cd5fe431

rsketine authored Jul 28, 2018

* Update Dockerfile

Took out make -j 8 as we are running out of virtual memory.
Added df -k to get the disk space info in the logs.

* Update .travis.yml

We bring up 3 containers here, so I am reducing this to 2 to see whether it helps; if they are deployed on different machines, it should not matter.

* Update .travis.yml

3 in parallel should pass as well; 2 of them are passing, so trying this out.


IntelGPU backend: Pad operation (#1267) · e2e7042a
shssf authored Jul 28, 2018
```
* IntelGPU backend: Pad operation

* PR1267. Comments addressed
```

Add TBB flow graphs to DEX. (#1247) · 1c2e5b7a

Amy Zhuang authored Jul 28, 2018

* Add TBB flow graphs to DEX.

* Make edges from dummy start node to head nodes when traversing nodes.

* Use static_cast to cast TBB graph node.
Undefine __TBB_PREVIEW_LIGHTWEIGHT_POLICY.

* Code formatting.

* Remove clang -Wreserved-id-macro warning.
