Commits · e5e8d03cc8bed9738e5617633efe9e33ddad77d5 · submodule / ngraph

08 Aug, 2018 6 commits
- Use the right op (#1359) · e5e8d03c
  Jaikrishnan Menon authored Aug 08, 2018
  
  e5e8d03c
- CPU Direct Execution: Implement ReplaceSlice (#1357) · 045ab6bb
  Jaikrishnan Menon authored Aug 08, 2018
```
* CPU Direct Execution: Implement ReplaceSlice

* Remove scalar variant
```
  045ab6bb
- CPU Direct Execution: Implement Slice (#1356) · a2ba381d
  Jaikrishnan Menon authored Aug 08, 2018
  
  a2ba381d
- DEX: Reduce function (#1349) · ec45be4b
  Jaikrishnan Menon authored Aug 08, 2018
```
* CPU Direct Execution: Implement Reduce

* Workarounds for ancient CI compilers

* Fix return types

* Review comments
```
  ec45be4b
- DEX BatchDot (#1319) · 21012673
  Nick Korovaiko authored Aug 08, 2018
```
* batchdot with debug statements

* clean up

* address feedback
```
  21012673
- Avoid a memcpy (#1358) · 15d39100
  Jaikrishnan Menon authored Aug 08, 2018
  
  15d39100
07 Aug, 2018 14 commits

- Disallow tracing control outside of init to save some microseconds (#1353) · d1d8c4a7
  Jaikrishnan Menon authored Aug 07, 2018
  
  d1d8c4a7
- DEX Micro-optimizations: Use the mapped-value reference capture idiom everywhere (#1352) · f277e1c2
  Jaikrishnan Menon authored Aug 07, 2018
  
  f277e1c2
- CPU Direct Execution: Implement ReduceWindow (#1351) · 4efcb76e
  Jaikrishnan Menon authored Aug 07, 2018
  
  4efcb76e
- DEX LRN (#1344) · c2e98505
  Nick Korovaiko authored Aug 07, 2018
```
* DEX LRN

* merge after jbobba's changes
```
  c2e98505
- reduce fprop cache outputs (#1343) · efa2561e
  Matthew Brookhart authored Aug 07, 2018

  * reduce fprop cache outputs
  * refactor traverse nodes
  * Slight refactor, add test, address PR comments
  * fix formatting

  efa2561e
- DEX: Softmax (#1341) · f1c29c9c
  Jaikrishnan Menon authored Aug 07, 2018

  * Add helper macros to select from a partial set of ranks and element types
  * CPU Direct Execution: Implement Softmax
  * Add softmax builder to the build script
  * Update

  f1c29c9c
- Add helper macros to select from a partial set of ranks and element types (#1339) · 74a7ef7f
  Jaikrishnan Menon authored Aug 07, 2018
  
  74a7ef7f
- IntelGPU backend: Reverse operation implementation (#1338) · 49d15902
  dmyershov authored Aug 07, 2018
  
  49d15902
- IntelGPU backend: And, Or operations (#1337) · 91a3bf87
  Anna Alberska authored Aug 07, 2018

  * IntelGPU backend: And, Or operations
  * Code format update: intelgpu_backend.cpp and intelgpu_op_custom_kernels.cpp
  * Update logical operations

  91a3bf87
- cuda optimize softmax (#1310) · 154dc47a
  Fenglei authored Aug 07, 2018

  * Updated softmax.
  * Formatting.
  * Updated convolution.
  * Use build_primitive overloading. Add helper to emit type_string given a node.
  * Formatting.
  * Update ConvolutionBackpropData.
  * convolution backprop & max pool memory primitive caching (#1303)
  * Updated ConvolutionBackpropFilters.
  * Update MaxPool.
  * Update Max and Min. (#1307)
  * softmax optimization
  * fix bug
  * fix bugs
  * clang format
  * remove comments
  * add softmax divide
  * fix bugs
  * fix bug
  * fix bug
  * clang format
  * remove unused header
  * register
  * using single parameters instead of array
  * using build_elementwise instead of build_elementwise_collective
  * remove workspace as csullivan suggested

  154dc47a
- IntelGPU backend: AvgPool operation(partially) (#1336) · 8db7b24b
  Anna Alberska authored Aug 07, 2018

  * IntelGPU backend: AvgPool operation(partially)
  * Code format update intelgpu_backend.cpp
  * Delete code duplication in pooling ops intelgpu_backend.cpp

  8db7b24b
- Auto. gen. kernel signatures and argument expansion (#1326) · 8476dea0
  Chris Sullivan authored Aug 07, 2018

  * Add GPUKernelArgs for storing kernel arguments.
  * Formatting.
  * Resolve tensor addresses when extracting arg list via GPUKernelArgs.
  * Updated arg list resolution so that placeholder arguments can be added anywhere in the argument list.
  * const ref. args and changed add_args to use add_arg. also expanded type_names map.
  * GPUKernelArgs bug fix for return values.
  * add_placeholders expects pointers for later resolution
  * Formatting.
  * Add comments to GPUKernelArgs
  * Changed GPUKernelArgs interface to use a runtime variable number of arguments.
  * Removed/updated comment.
  * Address review comments: Remove combined address resolution and argument list retrieval. Remove unnecessary extra type entries in type_map.
  * Add space between pragma once and includes.
  * Broadcast optimization (#1322)
  * Implement GPUKernelArgs with op::Broadcast.
  * Removed excess type insertion in kernel signature for broadcast impl.
  * Support new auto kernel signature generation for op::Broadcast. Add boolean to helpers to determine if parameters are registers or arrays.
  * Removed commented code.
  * Update broadcast impl. for new GPUKernelArgs interface.
  * Updated based on interface change to GPUKernelArgs.
  * Formatting.
  * CUDNNHostParameters now implement GPUHostParameters. (#1324)
  * Formatting.

  8476dea0
- Switch to using more expressive layout descriptors instead of numeric layout names (#1278) · 69c51c27
  Jayaram Bobba authored Aug 07, 2018

  * Switch to using mkldnn memory descriptors for layout
  * More changes for using mkldnn descriptor instead of format
  * Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding
  * Rotate mkldnn layouts on transpose
  * Modifications to builder reshape to skip rotated layouts
  * More fixes to layouts and removes axis order from cpu layout descriptor
  * Code cleanup
  * Removed shuffle folding pass since the functionality is subsumed by the layout pass
  * Canonicalize a few more formats to keep MKLDNN happy.
  * Style fixes
  * Style fixes
  * Style fixes
  * Addressed PR feedback and added reshape passthrough for non-transpose cases
  * Adjust named formats for weights tensors to keep MKLDNN happy
  * Style fixes
  * resolved merge issues

  69c51c27
- Fix date in license header (#1342) · 5f77fe86
  Jaikrishnan Menon authored Aug 07, 2018
  
  5f77fe86

06 Aug, 2018 3 commits
- CPU Direct Execution: Implement Pad (#1320) · e2064cc2
  Jaikrishnan Menon authored Aug 06, 2018
```
* CPU Direct Execution: Implement Pad

* Add Pad builder to the build script

* Add missed changes during commit
```
  e2064cc2
- IntelGPU backend: Product operation (#1334) · f1c3e4ab
  shssf authored Aug 06, 2018
  
  f1c3e4ab
- IntelGPU backend: Sum operation bug fix (#1330) · 81216a9e
  shssf authored Aug 06, 2018
```
* IntelGPU backend: Sum operation bug fix

* PR1330. Style fix
```
  81216a9e
05 Aug, 2018 4 commits
- IntelGPU backend: Max and Min operations (#1333) · 4f26640b
  shssf authored Aug 05, 2018
  
  4f26640b
- IntelGPU backend: Greater, Less, Equal operations (#1331) · f9ded0b1
  shssf authored Aug 05, 2018
  
  f9ded0b1
- IntelGPU backend: Dot_2x2 operation bug fix (#1329) · c5889b2b
  shssf authored Aug 05, 2018
  
  c5889b2b
- IntelGPU backend: Allow zero size Shape (#1332) · 0405a870
  shssf authored Aug 05, 2018
  
  0405a870
04 Aug, 2018 2 commits
- Fix bugs in StaticInitializer and CudaContextManager (#1321) · 8009b475
  Chris Sullivan authored Aug 04, 2018
```
* Bug fix: StaticInitializer.

* Make CudaContextManager a member of GPU_Backend::BackendContext.

* fix formatting
```
  8009b475
- IntelGPU backend: Code refactored. No algo changed. (#1328) · 8ab89b29
  shssf authored Aug 04, 2018
  
  8ab89b29
03 Aug, 2018 11 commits

- nbench: add option to run all models in a directory (#1279) · 2b26df18
  Robert Kimball authored Aug 03, 2018
```
* add option to run all models in a directory

* add print for exception from benchmark
```
  2b26df18
- bn bprop test fix, comments and throws (#1325) · 11b992a7
  Nick Korovaiko authored Aug 03, 2018
  
  11b992a7
- Preallocate intermediate buffers (#1231) · 0599a628
  Chris Sullivan authored Aug 03, 2018

  * Utilize GPUMemoryManager/Allocator for preallocation of intermediate tensor buffer memory.
  * Formatting.
  * Merge with master required rework of memory due to CFE pass. Moved function memory pool allocation to pass as a result.
  * Formatting.
  * Added pass source files.
  * Updated tests to account for new assert check. All GPUAllocators should be deconstructed before allocation is made in GPUMemoryManager.
  * GPUAllocator::close() can be used to close the allocator prior to destruction
  * Removed open allocators. Replaced check with inspection of pass::MemoryManager node list.
  * Formatting.
  * Rename m_memory_buffers -> m_tensor_memory_buffers. Use full path to static alignment variable.
  * FunctionMemoryReservation -> TensorMemoryReservation. Only return true in pass if reservation is made (bug fix).
  * Moved static compilation mutex.
  * Update external function with new pass name.
  * GPU_ExternalFunction: Add s_memory_pool_alignment, remove optimize_and_assemble method.

  0599a628
- IntelGPU backend: BatchNorm operation completely redeveloped (#1318) · 45b50d06
  shssf authored Aug 03, 2018
  
  45b50d06
- fix travis build...I hope (#1317) · 39278e7d
  Robert Kimball authored Aug 03, 2018
  
  39278e7d
- Upstream for versioning (#1316) · b99dc1ef
  L.S. Cook authored Aug 03, 2018

  * update frameworkdocs
  * revise docs with new MXNet bridge code instructions
  * revise docs with new MXNet bridge code instructions
  * remove broken merge conflict

  b99dc1ef
- IntelGPU backend: Select operation (#1314) · 77a703c2
  dmyershov authored Aug 03, 2018
  
  77a703c2
- Upstream for versioning (#1309) · f8926a7b
  L.S. Cook authored Aug 03, 2018

  * update frameworkdocs
  * revise docs with new MXNet bridge code instructions
  * revise docs with new MXNet bridge code instructions

  f8926a7b
- Added DEX execution support for ReluBprop (#1305) · 87b5758d
  Pruthvi authored Aug 03, 2018
  
  87b5758d
- Added DEX support for (MaxPool + AvgPool) Backprop op for CPU backend (#1302) · 1fdf2d98
  Pruthvi authored Aug 03, 2018
```
* Added DEX support for MaxPoolBackprop op for CPU backend

* Added DEX execution support for AvgPoolBackprop
```
  1fdf2d98
- Propagate input buffers for passthrough kernels (#1312) · 2a0e43ef
  Jayaram Bobba authored Aug 03, 2018
  
  2a0e43ef