Commits · d1d8c4a7f1279991cf73a21ac3ce7db05fd46d08 · submodule / ngraph

07 Aug, 2018 2 commits

reduce fprop cache outputs (#1343) · efa2561e

* reduce fprop cache outputs

* refactor traverse nodes

* Slight refactor, add test, address PR comments

* fix formatting

Switch to using more expressive layout descriptors instead of numeric layout names (#1278) · 69c51c27

Jayaram Bobba authored 6 years ago

* Switch to using mkldnn memory descriptors for layout

* More changes for using mkldnn descriptor instead of format

* Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding

* Rotate mkldnn layouts on transpose

* Modifications to builder reshape to skip rotated layouts

* More fixes to layouts and removes axis order from cpu layout descriptor

* Code cleanup

* Removed shuffle folding pass since the functionality is subsumed by the layout pass

* Canonicalize a few more formats to keep MKLDNN happy.

* Style fixes

* Style fixes

* Style fixes

* Addressed PR feedback and added reshape passthrough for non-transpose cases

* Adjust named formats for weights tensors to keep MKLDNN happy

* Style fixes

* resolved merge issues

03 Aug, 2018 2 commits

bn bprop test fix, comments and throws (#1325) · 11b992a7
Nick Korovaiko authored 6 years ago

Preallocate intermediate buffers (#1231) · 0599a628

Chris Sullivan authored 6 years ago

* Utilize GPUMemoryManager/Allocator for preallocation of intermediate tensor buffer memory.

* Formatting.

* Merge with master required rework of memory due to CFE pass. Moved function memory pool allocation to pass as a result.

* Formatting.

* Added pass source files.

* Updated tests to account for new assert check. All GPUAllocators should be deconstructed before allocation is made in GPUMemoryManager.

* GPUAllocator::close() can be used to close the allocator prior to destruction

* Removed open allocators. Replaced check with inspection of pass::MemoryManager node list.

* Formatting.

* Rename m_memory_buffers -> m_tensor_memory_buffers. Use full path to static alignment variable.

* FunctionMemoryReservation -> TensorMemoryReservation. Only return true in pass if reservation is made (bug fix).

* Moved static compilation mutex.

* Update external function with new pass name.

* GPU_ExternalFunction: Add s_memory_pool_alignment, remove optimize_and_assemble method.

02 Aug, 2018 3 commits

LRN (#1282) · 237c4803

Nick Korovaiko authored 6 years ago

* lrn init

* fix comment

* mkldnn lrn (#1295)

* add serializer + fix compiler warnings

Interpreter implementation of batch norm bprop (#934) · c6a0fae3

varun-intel authored 6 years ago

* updated

* type prop

* disable test in manifest

* try to exclude

* style

* double

* double

* more

* style

* more

* vecs

* fix goe

wip (#1293) · 2a64baca
Robert Kimball authored 6 years ago

27 Jul, 2018 3 commits

is_contained (#1257) · 81c48453
Nick Korovaiko authored 6 years ago

CSE constant (#1271) · 953c65f8
Nick Korovaiko authored 6 years ago

Add some convenience macros/classes for error messages (#1258) · deacf29a

Adam Procter authored 6 years ago

* Testing out some ideas for better error messages on AvgPool

* Add uncaught_exception() check to ConstructionAssertLogger dtor

* More general assertion class, not homed inside Node

* Minor formatting change

* NODE_ASSERT for type prop failure

* Produce lighter-weight DummyAssertionHandler when assertion succeeds

* New ctor for AssertionHelper that takes a single location arg; more const&-ness for the constructors

* Remove move constructor for AssertionHelper; fix broken test in assertion.cpp

* Miscellaneous improvements

* Templatized AssertionHelper so different exception classes can be used; implemented TYPE_CHECK_ASSERT around this
* Changed from a "stack" of locations to a single location (the stack was too complicated)
* Added "FAIL" classes/macros which do not take a condition

* Rename a helper function

* Cleanup, cruft removal

* Add test to make sure the assert helper has the lifetime we expect

* Missing includes

26 Jul, 2018 1 commit

IntelGPU backend: broadcast operation (#1252) · d4349db8

shssf authored 6 years ago

* IntelGPUBackend: Broadcast operation

* IntelGPUBackend: more tests for Broadcast operation

* Move macro to static C function in Broadcast tests

18 Jul, 2018 3 commits
- Pool tests updated to check all backends (#1245) · e2255fbd
  Robert Kimball authored 6 years ago
```
* make pool test check backends other than CPU

* more unit test cleanup
```
- Fix incorrect divide-by-zero test (#1243) · 7c7c5d62
  Jaikrishnan Menon authored 6 years ago
  
- CPU Loop Kernel Fusion optimization (#1190) · e3ad1b31
  Nick Korovaiko authored 6 years ago
```
* cpu loop kernel fusion pass

* remove extra code

* bounded relu test

* address Scott's feedback
```
17 Jul, 2018 1 commit

Added more convolution variants to DEX (#1223) · 9bb0b653

Jayaram Bobba authored 6 years ago

* CPU Direct Execution: Implement ConvertLayout and refactor

* CPU Direct Execution: Implement Convolution

* 1) Adds computation reuse to direct execution
2) Add avg_pool, broadcast and convolution_bias to direct execution
3) Moved some computation reuse utility functions to graph_utils

* Use lists instead of vectors to avoid reallocation overheads

* - Added convolution variants to direct execution
- Removed ConvolutionBiasRelu, use ConvolutionBias instead
- Reduced code duplication by moving functionality to mkldnn_emitter
  from cpu_emitter

* Style fix

* Moved mkldnn build_convolution to a templated method

* Style fix

* refactored mkldnn conv bprop builders

* Style fix

14 Jul, 2018 1 commit
- move long building tests to be the first tests built with the hope of… · cce0c224
  Robert Kimball authored 6 years ago
```
move long building tests to be the first tests built with the hope of reducing build time. (#1229)
```
13 Jul, 2018 1 commit
- get_subgraph_outputs (towards checking that intermediate nodes in a matched graph are not used) (#1207) · 83e7dba5
  Nick Korovaiko authored 6 years ago
```
* get_subgraph_outputs

* simplify the condition
```
12 Jul, 2018 2 commits

Added reshape and broadcast to CSE (#1221) · cf568ef9

Louis Feng authored 6 years ago

* reshape inplace without copying data if possible.

* added reshape and broadcast to CSE.

* Fixed debug messages.

Bob/backend list (#1220) · 8e1954d0

Robert Kimball authored 6 years ago

* open only the unversioned library but check that it is built against the correct version of ngraph

* review comments

11 Jul, 2018 1 commit
- Disabled RNN fusion pass in IA transformer (#1217) · 4cd2c602
  Pruthvi authored 6 years ago
  
09 Jul, 2018 2 commits

Liveness optimizations (#1210) · 0c721561

Robert Kimball authored 6 years ago

* Faster liveness.

Memory manager optimized for non-sharing of tensors.
Add pass manager profiler.

* Move pass profiler to a separate PR

* Move Memory Layout optimizations to a separate PR

* use find instead of count

Cache functions so the backend does not need to recompile (#1209) · ffe3a631
Robert Kimball authored 6 years ago
```
* Cache some generated functions in backwards tests to speed performance

* more caching
```

07 Jul, 2018 1 commit

New backend construction/destruction API (#1171) · ad4dd5b0

Robert Kimball authored 6 years ago

* complete the new backend construction/destruction API
* close each dlopen
* don't close libraries for now as it causes python to segfault

06 Jul, 2018 2 commits

Use mkldnn reorder only for transpose/dimshuffles. (#1188) · 5be99c0a

Nishant Patel authored 6 years ago

* Usage of mkldnn reshape updated

* update reshape condition for mkldnn

* Add a test case and order in which conditions are checked

Collect matched nodes (#1166) · e07637c0
Nick Korovaiko authored 6 years ago
```
* collect matched nodes

* clear m_matched_list

* tests

* address feedback
```

03 Jul, 2018 2 commits
- Batch dot operation for rank 3 multiply with rank 2 tensors (#1180) · 238ce788
  Louis Feng authored 6 years ago
```
* hacking to support dot of 3 by 2 inputs with gemm_batch.

* clean up.
```
- nbench cleanup (#1183) · 9d09c7e5
  Robert Kimball authored 6 years ago
```
* nbench cleanup

* update style
```
02 Jul, 2018 3 commits

move sigmoid to core fusion (#1132) · d05b5e39

Sandeep authored 6 years ago

* declare sigmoid for core fusion

* add simple test for sigmoid

* info fusion status

* cp op as main op

* builds as expected

* move sigmoid fusion code

* add reference kernel

* sigmoid bprop reference kernel and clang-format

* add delta to bprop

* fprop called

* compiles bprop

* move tests

* serializer support

* address comments in code

* add doc

* naming similar to core ops

* fix failing test

* fix failing test

* address clang issue

* more changes

* change test macro

MKLDNN BoundedRelu implementation for Relu6 (#1179) · eaa6091c

Pruthvi authored 6 years ago

* 1. Added MKLDNN BoundedRelu op support for Relu6
2. CpuLayout && CPU assignment pass for BoundedRelu Op
3. Unit test comparing interpreter vs. CPU for BoundedReluOp
4. MKLDNN and default emitter code for BoundedReluOp

* Removed Debug prints

* 1. Added support for boundedrelu to work on any constant literal
2. unit test case for rank2, rank3, rank4 for bounded relu without serialized graph

* Removed is_six() method

Conv+bias shape check for better error detection (#1176) · e42e5815

Louis Feng authored 6 years ago

* Reshape bias to 1D for conv + bias bprop fusion

* Reshape goe2 back to 2D before replacing

* added shape checks to validate conv+bias op.

* removed conv+bias backprop merge for separate PR review.

* fixed conv_bias_bprop test.

* minor changes to error messages.

30 Jun, 2018 2 commits

Pruthvi/fix rnn output (#1135) · c4c24cb0

Pruthvi authored 6 years ago

* - Fixed replace output for the multi layer recurrent cell state tensor output
- Modified rnn add_output to consider direction and n_layer while calculating the output size for mkldnn dst_layer and dst_iter

* fix unit test failure

LoopKernel Collector (#1128) · 784735d6

Nick Korovaiko authored 6 years ago

* collector

* keeping track of inputs; simplifying a merging strategy; adding LKGraph

* LoopKernel Collector

* address feedback

* address feedback 2

* address feedback 3

28 Jun, 2018 2 commits
- Support dimshuffle/transpose with MKLDNN (#1129) · 846f6bfe
  Nishant Patel authored 6 years ago
```
* Reshape 4d

* Support dimshuffles/transpose with MKLDNN

* Addressing PR Feedback

* Use Eigen for 3D dimshuffles
```
- constant broadcast folding (#1139) · 35b04e6a
  Adam Straw authored 6 years ago
```
* constant broadcast folding

* code review feedback
```
26 Jun, 2018 3 commits

remove unused file (#1159) · e4db82ec
Robert Kimball authored 6 years ago

Convolution sum fusion (#1146) · 82ee0a77

Jayaram Bobba authored 6 years ago

* inplace compute

* fix warnings

* Initial support for convolution sum fusion

* Added in-place support for conv sum fusion and test cases

* reverting spurious changes

* Bug fix to account for inplace input in conv sum fusion

* fix compilation error

* Addressed PR feedback

OS X support (#1098) · 5395a378

Igor Kaplounenko authored 6 years ago

* updated to work with llvm 8.1 that tensorflow is built with

* sane extensions on the mac

* not doing rpath on apple

* apply style

25 Jun, 2018 2 commits

inplace compute (#1141) · 88aa9e9c

Nick Korovaiko authored 6 years ago

* inplace compute

* fix warnings

* address bob's feedback

* bob's feedback 2

* bob's feedback 3

* address bob's feedback 4

Fix build for MacOS (#1112) · e2e814e3

Robert Kimball authored 6 years ago

* remove reference to ngraph core code from codegen. add stand-alone implementations of needed functions

* fixed potential pointer leak

* clean up file_util

* more file util cleanup, removing unused functions

* interpreter works on mac

* CPU and INTERPRETER build and pass unit tests on macos

* move get_directory to file_util

* cleanup

22 Jun, 2018 1 commit
- refactor cache_prop to reuse bprop inputs (#1134) · 3b49dd1a
  Matthew Brookhart authored 6 years ago
  