- 10 Aug, 2018 (2 commits)

Robert Kimball authored
Chris Sullivan authored
* Support GPUKernelArgs in Elementwise-collective and Nd-Convolution.
* Update op::ReplaceSlice to use GPUKernelArgs and unroll coordinate transform loop.
* Formatting.
* Moved function signature for global kernels back to emitter body.
* Formatting.

- 09 Aug, 2018 (6 commits)

Robert Kimball authored
Jaikrishnan Menon authored
Pruthvi authored
* Added DEX support for BatchNormRelu
* Templatized build_batchnorm_emitter

shssf authored
Scott Cyphers authored
dmyershov authored
* IntelGPU backend: Concat operation implementation
* Several remarks were fixed
* Remaining remarks were fixed; list of tests for INTELGPU was updated
* PR1363: Minor fixes

- 08 Aug, 2018 (19 commits)

Robert Kimball authored
Jaikrishnan Menon authored
* Make pool alignment a constexpr
* Fix ODR-use

Jaikrishnan Menon authored
* Add an option to exclude the first iteration
* Switch to warmup iterations
* Cleanup

Pruthvi authored
Jaikrishnan Menon authored
Pruthvi authored
* Added DEX support for BoundedRelu
* Refactored bounded_relu in cpu_emitter to use mkldnn_emitter helper methods
* Removed unwanted templatization for bounded_relu mkldnn_emitter

Jaikrishnan Menon authored
Jaikrishnan Menon authored
Pruthvi authored
* Added DEX execution support for Lstm
* Added DEX execution support for Rnn
* Style fix
* Used mkldnn_utils helper function for building DEX Lstm memory desc
* Used mkldnn_utils helper function for building DEX Rnn memory desc
* Addressed PR comments
* Refactored rnn & lstm cpu_emitter code to use the mkldnn_emitter helper methods

L.S. Cook authored
* Re-align docs with code example for mnist
* Also fix dist_mnist and add context highlight

Jaikrishnan Menon authored
shssf authored
* IntelGPU backend: Tests updated. Code refactored. No algorithms changed.
* PR1362: Debug code removed

Chris Sullivan authored
* GPUShape(int32_t) -> NVShape(uint32_t), NVDiff(int32_t)
* Update code merged from master.
* Add nvshape.hpp and nvdiff.hpp.

Jaikrishnan Menon authored
Jaikrishnan Menon authored
* CPU Direct Execution: Implement ReplaceSlice
* Remove scalar variant

Jaikrishnan Menon authored
Jaikrishnan Menon authored
* CPU Direct Execution: Implement Reduce
* Workarounds for ancient CI compilers
* Fix return types
* Review comments

Nick Korovaiko authored
* Batchdot with debug statements
* Clean up
* Address feedback

Jaikrishnan Menon authored
- 07 Aug, 2018 (13 commits)

Jaikrishnan Menon authored
Jaikrishnan Menon authored
Jaikrishnan Menon authored
Nick Korovaiko authored
* DEX LRN
* Merge after jbobba's changes

Matthew Brookhart authored
* Reduce fprop cache outputs
* Refactor traverse nodes
* Slight refactor, add test, address PR comments
* Fix formatting

Jaikrishnan Menon authored
* Add helper macros to select from a partial set of ranks and element types
* CPU Direct Execution: Implement Softmax
* Add softmax builder to the build script
* Update

Jaikrishnan Menon authored
dmyershov authored
Anna Alberska authored
* IntelGPU backend: And, Or operations
* Code format update: intelgpu_backend.cpp and intelgpu_op_custom_kernels.cpp
* Update logical operations

Fenglei authored
* Updated softmax.
* Formatting.
* Updated convolution.
* Use build_primitive overloading. Add helper to emit type_string given a node.
* Formatting.
* Update ConvolutionBackpropData.
* Convolution backprop & max pool memory primitive caching (#1303)
* Updated ConvolutionBackpropFilters.
* Update MaxPool.
* Update Max and Min. (#1307)
* Softmax optimization
* Fix bug
* Fix bugs
* Clang format
* Remove comments
* Add softmax divide
* Fix bugs
* Fix bug
* Clang format
* Remove unused header
* Register
* Use single parameters instead of array
* Use build_elementwise instead of build_elementwise_collective
* Remove workspace as csullivan suggested

Anna Alberska authored
* IntelGPU backend: AvgPool operation (partially)
* Code format update intelgpu_backend.cpp
* Delete code duplication in pooling ops intelgpu_backend.cpp

Chris Sullivan authored
* Add GPUKernelArgs for storing kernel arguments.
* Formatting.
* Resolve tensor addresses when extracting arg list via GPUKernelArgs.
* Updated arg list resolution so that placeholder arguments can be added anywhere in the argument list.
* Const ref. args and changed add_args to use add_arg; also expanded type_names map.
* GPUKernelArgs bug fix for return values.
* add_placeholders expects pointers for later resolution
* Formatting.
* Add comments to GPUKernelArgs
* Changed GPUKernelArgs interface to use a runtime variable number of arguments.
* Removed/updated comment.
* Address review comments: Remove combined address resolution and argument list retrieval. Remove unnecessary extra type entries in type_map.
* Add space between pragma once and includes.
* Broadcast optimization (#1322)
* Implement GPUKernelArgs with op::Broadcast.
* Removed excess type insertion in kernel signature for broadcast impl.
* Support new auto kernel signature generation for op::Broadcast. Add boolean to helpers to determine if parameters are registers or arrays.
* Removed commented code.
* Update broadcast impl. for new GPUKernelArgs interface.
* Updated based on interface change to GPUKernelArgs.
* Formatting.
* CUDNNHostParameters now implement GPUHostParameters. (#1324)
* Formatting.

Jayaram Bobba authored
* Switch to using mkldnn memory descriptors for layout
* More changes for using mkldnn descriptor instead of format
* Removed mkldnn format from cpu layout descriptor. TODO: shuffle folding
* Rotate mkldnn layouts on transpose
* Modifications to builder reshape to skip rotated layouts
* More fixes to layouts; removed axis order from cpu layout descriptor
* Code cleanup
* Removed shuffle folding pass since the functionality is subsumed by the layout pass
* Canonicalize a few more formats to keep MKLDNN happy.
* Style fixes
* Style fixes
* Style fixes
* Addressed PR feedback and added reshape passthrough for non-transpose cases
* Adjust named formats for weights tensors to keep MKLDNN happy
* Style fixes
* Resolved merge issues