Commits · afcc4ca89f0b638061ac676eb18f3d84c29cdc2a · submodule / ngraph

11 Feb, 2019 1 commit

Mixed-precision fusions (#2401) · 13b4966b

Jayaram Bobba authored 6 years ago

* CPUQuantFusion pass and some fusions for converting mixed precision sub-graphs to int8 fused ops

* - Added unit tests and misc bug fixes for mixed-precision fusions
- Adjust fused sum_scale in quantization builders instead of mkldnn
  primitive creation

13b4966b

04 Feb, 2019 1 commit

Windows support. (#2394) · 45a0fb47

Robert Kimball authored 6 years ago

* fix windows build

* wip

* mkldnn seems to build

* address various errors building cpu backend with MSVC

* wip

* wip

* Windows support.

    * Delete dependency of LLVM when building with MSVC.

* Define EIGEN_HAS_CONSTEXPR when using MSVS.

* Fix MSVC build errors.

    * Incorrect argument to 'decltype'. This is a VC bug. Work around the
    error by renaming the function.

    * MINMAX issue in matmul_bias.cpp.

    * Correct TBB_LINK_LIBS on Windows.

* Fix MSVC link errors.

    1. Redefinition problems in cpu_builder.obj and convert_layout.obj,
    because cpu_builder.hpp contains an implicit implementation of the
    function runtime::cpu::Builder::build for cpu::op::ConvertLayout.
    The fix is deleting the registration item in cpu_builder.cpp and
    using REGISTER_CPU_OP_BUILDER in convert_layout.cpp.

    2. Fix the dependent library paths on Windows. They should be *.lib,
    not *.dll, when linking these libraries.

* Set visibility for CPU backend to fix the MSVC linker error.

    MSVC complains that the .def file exceeds the size limitation
    when using CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS. All the functions
    with CPU_BACKEND_API are used by unit tests or nbench.

* Fix unit test build errors on Windows.

    * backend_unary_elementwise.in.cpp: Use all_close_f to test case
    BACKEND sqrt

    * cpu_fusion.cpp: Fix 'NUM_STEPS' cannot be implicitly
    captured because no default capture mode has been specified

    * cpu_test.cpp: Use portable setenv and unsetenv from misc.hpp.

    * tools.cpp: Use portable fpopen from misc.hpp.

    * misc.hpp/misc.cpp: Add new files to host misc functions for which
    Linux and Windows use different implementations.

* Make Debug mode work with MSVC.

* style

* fix line ending

45a0fb47
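
The 'NUM_STEPS' capture error mentioned in the commit above is the kind of diagnostic MSVC can emit when a lambda with no default capture mode reads a local constant. The following is a minimal sketch of one common fix, naming the constant in the capture list. The function `make_steps` and the loop body are hypothetical and are not taken from the nGraph test; only the name `NUM_STEPS` comes from the commit message, and the commit may have resolved the error differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not code from the nGraph test suite.
std::vector<int> make_steps()
{
    const std::size_t NUM_STEPS = 4;
    std::vector<int> steps;

    // Listing NUM_STEPS in the capture list avoids relying on the rule that a
    // constant-initialized const integer may be read without being captured,
    // which some MSVC versions reject.
    auto fill = [&steps, NUM_STEPS]() {
        for (std::size_t i = 0; i < NUM_STEPS; ++i)
        {
            steps.push_back(static_cast<int>(i));
        }
    };
    fill();

    assert(steps.size() == NUM_STEPS);
    return steps;
}
```

Other compilers accept the explicit capture as well, although some may warn that it is not strictly required.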

02 Feb, 2019 1 commit

Pruthvi/fix input matrix fusion (#2381) · 917efb94

Pruthvi authored 6 years ago

* - check to verify that the data_slices share the same weights

* add the serialized graph

* - explicitly fuse the data slices, so all the parameters partitioned by slices are in contiguous memory locations
- fixes all the failing test cases

917efb94

18 Jan, 2019 1 commit

Adds backprop to BatchDot op, allows fusion in training. (#2297) · ef778693

Louis Feng authored 6 years ago

* batch dot bprop WIP.

* WIP.

* testing.

* clean up debug code.

* comments and var name change.

* clean up.

* format style, batch dot differentiable pass.

* removed debug output.

* added unit test to autodiff, refactored make_function -> make_function_from_file.

* fixed build warning.

* fixed gpu build error.

* clang format fix.

* all test_tools.cpp to find SERIALIZED_ZOO

* remove cmake redef.

* fix unused macro.

* making test cpu only.

* testing build var

* macro test

* verbose makefile test

* style fix

* verbose make

* test/util needs test/models.

* removed debug output.

* refactor fusion type.

* refactor fusion type.

ef778693

03 Jan, 2019 1 commit

update licenses for 2019 (#2275) · ba299b93

Robert Kimball authored 6 years ago

* update licenses for 2019

* style

ba299b93

19 Dec, 2018 1 commit

Make explicit compile call in unit tests (#2224) · 7693f74e

Robert Kimball authored 6 years ago

* make validate public

* move compile call outside of call for unit tests

* fix compile error

* one more error

7693f74e

07 Dec, 2018 2 commits

Update slice kernels (#2180) · a16c4961

Jayaram Bobba authored 6 years ago

* initial commit for update slice op

* Finished up update_slice fusion and added codegen support

* style fixes

* Added unit test for in-place update-slice strided

* change pattern name

a16c4961

Backend API change pre-work (#2064) · e0933553

Robert Kimball authored 6 years ago

* change compile call to return Handle

* make CPU require compile() before call()

* fix unit tests to call compile() before call()

* fix failing ops

* update unit test

* revert some changes

* more fixups

* more diff cleanup

* a few more issues addressed

* more fixes

* update API

* more updates

* fix test_ops.py

* fix

* another attempt to fix

* fix unit test

* fix test error

e0933553

06 Dec, 2018 2 commits

DEX Loop Kernel (updated) (#2156) · 8fc481a3

Nick Korovaiko authored 6 years ago

* one output

passing tests

clean up

fix build breaks

* move generators into a separate file

8fc481a3

Pruthvi/fix rnn precision (#1874) · 73da681a

Pruthvi authored 6 years ago

* - Added reorder support for rnn weights_layer/iter

* i) fixed compilation issues ii) working but still observing precision error

* i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter

* i) added support for src reorder to TNC from NTC

* reorder support for rnn output from NTC to TNC

* - added support for rnn weight reorder ldgoi -> ldigo
- code refactor for lstm/rnn kernel in mkldnn emitter

* - refactor rnn mkldnn kernel, change variable names

* fix RNN codegen kernel

* disable layer rnn fusion pass, to test CI

* method to validate recurrent rnn inputs

* add correlated matches for Recurrent RNN PM

* - simplify reorder logic for rnn_weights
- fix graph pattern for fusing rnn cell across time steps

* do weights reorders in rnn timesteps fusion

* refactored LSTM graph pass

* - Bug fix for finding the lstm inputs deterministically
- Refactored LSTM graph pass to single pass
- made changes to LSTM RNN time step fusion graph pass

* - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass

* fix compilation error

* Fix GNMT rnn fusion

* check if the node is in use before replacing in RNN graph passes

*  i) fix style ii) fix topo sort issue in RNN graph pass

* style fix

* fix bug in simplify_concat pass

* replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2

* cse for convert layout

* addressed PR comments

* - optimization pass to remove  Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
- conditional fusing of LSTM cells only for the decoder

* made changes to multi layer RNN fusion callback

* fix asserts in RNN op

* - added support to fuse layers when slc=dlc for RNN cells
- bug fix on the sanity checks for RNN Op

* - support RNN layer fusion till slc = dlc
- bug fixes in multi layer rnn fusion call back

* capture reshape in the RNN weights

* Addressed PR comments

* - added comments in multi layer PM call back
- fuse only if slc == DLC across layers

* restore deleted 3_lstm_cell_forward.json file

* fix typo

* fix failing unit tests

* When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input.

* Address PR feedback: process in place slice after propagating in place input.

* Set INTERMEDIATE role before propagating in place input.

* Do not add temporaries to the variable name map before propagating in place input in codegen.

* Fix a bug in codegen.

* Fix a bug in codegen slice.

* reenable disabled rnn unit test

* fix compiler error

* - bug fix in the slicing logic for the layer fused rnn cell
- fix failing rnn unit test

* - Addressed PR comments
- removed redundant checks from the rnn graph pass
- simplified rnn call back replace node logic

* - added new multilayer rnn *.json file
- fix test case

* [PRIVATE BRANCH] Style fixes (#2080)

* Style fixes

* change order of lstm gates

* [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)

* Style fixes for single-layer RNN fusion

* Style fixes to multi-layer RNN

* style fix

* disable GPU test

73da681a

05 Dec, 2018 1 commit

Support for 5D batchnorm (#2055) · d4f8bfdc

Pruthvi authored 6 years ago

* - modified cpu_assignment pass to support bn with input 5D
- added test cases for 5D bn and 5D bn+relu

* - Address PR comments
- used mkldnn_utils to validate bn for mkldnn

* fix compilation error

* Addressed PR comments
- added helpers in mkldnn_utils for assigning ngraph Op as MKLDNN op
- helper function for bn mkldnn assignment

* fix clang error

d4f8bfdc

28 Nov, 2018 1 commit

Cyphers/bnorm back (#2129) · 403a09ce

Scott Cyphers authored 6 years ago

* Fix batchnorm argument order, cleanup some comments, fix backprop

* Merge error

* Clean up training function, organize inference test

* BatchNormInference tests

* Training case

* Training test

* Fix autodiff BatchNorm test

* Cleanup

* Move file to doc checkout

* Update disabled test name in igpu manifest
Fix unused variable

* Unit tests disables

* Review comments

403a09ce

21 Nov, 2018 1 commit

Adding leaky relu (#2096) · 587b96e5

Jayaram Bobba authored 6 years ago

* Adding leaky relu

* Silence compiler warning around fp compares

* Fix copy-paste error and enable in-place for relu mkldnn kernels

587b96e5

16 Nov, 2018 1 commit

Move ParameterVector and ResultVector to the ngraph namespace (#2054) · 803c38aa

Robert Kimball authored 6 years ago

* Move ParameterVector and ResultVector to the ngraph namespace where they belong

* update python wrapper

* more python fixes

* style

* Update setup.py

* fix some new code

803c38aa

11 Nov, 2018 1 commit

add isfinite check for all_close (#2028) · 702d465a

Fenglei authored 6 years ago

* add isfinite check

* style

* output 5 diff and total diff

* output limit of diff for all_close_f

* fix bug

* disable tests

* remove failing unit test that does not make sense.

702d465a
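
A finiteness check matters in a closeness test because every ordered comparison involving NaN is false, so a tolerance test written as "difference greater than tolerance" never fires for NaN values. The following is a minimal sketch of that idea. The function `all_close_sketch`, its tolerance formula, and its default tolerances are assumptions made for illustration and are not nGraph's actual `all_close` implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch; the name, signature, and tolerance formula are
// assumptions, not the nGraph test utility.
bool all_close_sketch(const std::vector<float>& a,
                      const std::vector<float>& b,
                      float rtol = 1e-5f,
                      float atol = 1e-8f)
{
    assert(rtol >= 0.0f && atol >= 0.0f);
    if (a.size() != b.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        // Reject NaN and infinities up front: without this check,
        // |a - b| > tolerance evaluates to false for NaN and the
        // mismatch would pass silently.
        if (!std::isfinite(a[i]) || !std::isfinite(b[i]))
        {
            return false;
        }
        if (std::abs(a[i] - b[i]) > atol + rtol * std::abs(b[i]))
        {
            return false;
        }
    }
    return true;
}
```

With this sketch, two vectors that both contain NaN at the same position compare as not close, which is the stricter behavior a finiteness check is meant to provide.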

07 Nov, 2018 1 commit

Do not fuse nodes if one node is predecessor of another node in horiz… (#1928) · 2a26558a

Amy Zhuang authored 6 years ago

* Do not fuse nodes if one node is predecessor of another node in horizontal fusion.

* Add dead node check and remove predecessor check in horizontal fusion.

2a26558a

31 Oct, 2018 1 commit

Change Backend::create to return std::unique_ptr<Backend> (#1909) · 05a404a8

Robert Kimball authored 6 years ago

* create unique_ptr backend

* unit test cleanup

* address more code that was recently added

* change from reference to pointer when passing backend to reduce the number of lines changed.

* fix build error

* fix python wrapper

* style

* more specific treatment for unique_ptr

05a404a8

30 Oct, 2018 1 commit

Gauri/groupconv batchnorm (#1900) · c637d629

gaurides authored 6 years ago

* Initial implementation of GroupConv+BatchNorm fusion

* Added GroupConv+BatchNorm with Relu fusion

* Added changes to fuse with BoundedRelu

* Changed BoundedRelu to Relu

* Added test; Code cleanup

* Code formatting

* Removed dead code

* Added test cases and other misc

* Bug fix in group conv callback and general cleanup

* Address PR feedback

* Minor edit to comment. MKLDNN divides both input and output channels by groups

* Style fixes and PR feedback

c637d629

22 Oct, 2018 1 commit

BatchNorm splitting into ops (2nd try) (#1828) · 1beec46b

Nick Korovaiko authored 6 years ago

* split bn into bn_inference bn_training

* fix warnings

* Add GPU support for the new BN ops (#1569)

* Add GPU support and change batchnorm_globalstats test to use BNInference.

* Changed test back to using BNTraining for global stats and updated cudnn backend to account for it.

* Fix issues in merge with master.

* Formatting.

* CPU fixes

* remove 5-arg training BN for now

* more fixes

* python batchnorm changes

* fix onnx_import

* fix a call BatchNormInference c-tor

* yet another fix to BatchNormInference c-tor

* AND yet another fix to batchnorm_inference c-tor

* ops.py

* address adam's feedback

* Remove unnecessary parameter/argument.

* remove batch_norm_training_relu_with_global_stats

* remove bn_relu (training)

1beec46b

15 Oct, 2018 1 commit

mat_fusion 4d test case (#1809) · fe4b0f49

Nick Korovaiko authored 6 years ago

fe4b0f49

12 Oct, 2018 1 commit

Modify slice layout. (#1788) · d19d1271

Amy Zhuang authored 6 years ago

d19d1271

08 Oct, 2018 1 commit

IAT: More convolution folding optimizations (#1712) · 00b4453d

Jayaram Bobba authored 6 years ago

* Check output shape when setting memory layout for slice op.

* Miscellaneous fusion and other optimizations for inception-resnetv2
- ConvBias Batchnorm folding
- ConvBias Affine folding
- Check if MKLDNN can slice a given layout and select layouts
  appropriately

* Fixed unit test and bug in conv bias pattern

* Addressed PR feedback

* Addressed PR feedback

00b4453d

05 Oct, 2018 1 commit

CPU: Make DEX mode the default (#1755) · c8858ef2

Jaikrishnan Menon authored 6 years ago

c8858ef2

02 Oct, 2018 1 commit

Pruthvi/rnn fusion (#1677) · 18e41513

Pruthvi authored 6 years ago

* WIP input * weights rnn optimization

* concat + slicing + replacing new node works

* WIP unit test case of fusing rnn inputs

* - Added unit test case for fusing rnn input weights
- registered CPURnnMatFusion_v1/v2 in codegen and DEX

* fixed redeclaration of a variable

* Refactored rnn input transformation passes into a single pass

* Refactored CPURnnMatFusion call back functions

* change random generator range to include -ve values in unit test

* address PR comments

* don't fuse if the shapes of the data slices don't match

18e41513

29 Sep, 2018 1 commit

Rename runtime::TensorView to runtime::Tensor (#1699) · 5fc7cf65

Robert Kimball authored 6 years ago

* rename files

* rename runtime TensorView to Tensor

* rename HostTensorView to HostTensor

5fc7cf65

21 Sep, 2018 1 commit

Add CPU horizontal fusion pass for inception. (#1577) · 2d2b3b2f

Amy Zhuang authored 6 years ago

* Add CPU horizontal fusion pass for inception.

* Name change.

* Move horizontal fusion to cpu_fusion.

* Change horizontal fusion pass for inception to a general horizontal fusion pass.
Add a unit test conv_horizontal_fusion to cpu_fusion.

* Rename files.

* Correct cpu_fusion.hpp.

* Add NGRAPH_DEBUG.

* Set native layout when input format of slice is nChw16c or nChw8c and lower bound of
channels is not a multiple of 16 or 8.

2d2b3b2f

14 Sep, 2018 1 commit

Cyphers/layout (#1602) · 2f79f707

Scott Cyphers authored 6 years ago

* Remove "view"
Simplify layout

* Fix merge error

* fix build error

* PR1602. IntelGPU backend. Compilation fixed.

2f79f707

13 Sep, 2018 1 commit

Handle unsupported op in nbench (#1531) · fe676f72

Robert Kimball authored 6 years ago

* add unsupported_op exception

* unsupported_op test

* add printout of unsupported op in model

* fix GPU dispatcher check

* fix test designation

* catch exceptions on single file runs too

* add unsupported_op exception where needed

* remove unsupported_op class

* add unassigned op exception

* add unit test

* catch unsupported op in nbench

* add cpu test back

* update all latest merges

* mode change

fe676f72

11 Sep, 2018 1 commit

Add conv add fusion (#1526) · 37174c90

gaurides authored 6 years ago

* Add conv add fusion

* Updated file permissions and cpu_fusion order

* Formatted code using maint/apply-code-format.sh

* Fixed minor review comments

* Use NODE_VALIDATION_ASSERT instead of throw ngraph_error; upgrade baseline and fix issues

* Some more fixes

37174c90

29 Aug, 2018 2 commits

Change license header to use single-line comment (#1508) · a17ec605

Robert Kimball authored 6 years ago

* use line comments instead of multiline comments for license header

* update more

* update new files

* more header updates

* style

a17ec605

disabled RNN test to workaround RNN unit test failure on MAC due to bug in… · 33cd386b

Pruthvi authored 6 years ago

disabled RNN test to workaround RNN unit test failure on MAC due to bug in MKLDNN scratchpad creation (#1502)

33cd386b

27 Aug, 2018 1 commit

normalize comments (#1492) · 9c48c327

Robert Kimball authored 6 years ago

* normalize comments

* address review comments

9c48c327

15 Aug, 2018 1 commit

Fold affine transformations on 4d convolution (#1347) · 1a8b1f97

Jayaram Bobba authored 6 years ago

* Fold affine transformations on 4d convolution

* Handle more cases for affine parameters

* Style fix

1a8b1f97

13 Aug, 2018 1 commit

enable parameter validation for all unit tests (#1385) · 24b41844

Robert Kimball authored 6 years ago

* enable parameter validation for all unit tests

24b41844

10 Aug, 2018 1 commit

Dex non-mkldnn version of clipped relu (#1376) · 1f1ab184

Jayaram Bobba authored 6 years ago

* Dex non-mkldnn version of clipped relu

* Change to static_cast

1f1ab184

07 Aug, 2018 1 commit

Switch to using more expressive layout descriptors instead of numeric layout names (#1278) · 69c51c27

Jayaram Bobba authored 6 years ago

* Switch to using mkldnn memory descriptors for layout

* More changes for using mkldnn descriptor instead of format

* Removed mkldnn format from cpu layout descriptor. TODO - shuffle folding

* Rotate mkldnn layouts on transpose

* Modifications to builder reshape to skip rotated layouts

* More fixes to layouts and removes axis order from cpu layout descriptor

* Code cleanup

* Removed shuffle folding pass since the functionality is subsumed by the layout pass

* Canonicalize a few more formats to keep MKLDNN happy.

* Style fixes

* Style fixes

* Style fixes

* Addressed PR feedback and added reshape passthrough for non-transpose cases

* Adjust named formats for weights tensors to keep MKLDNN happy

* Style fixes

* resolved merge issues

69c51c27

18 Jul, 2018 1 commit

CPU Loop Kernel Fusion optimization (#1190) · e3ad1b31

Nick Korovaiko authored 6 years ago

* cpu loop kernel fusion pass

*  remove extra code

* bounded relu test

* address scotts feedback

e3ad1b31

17 Jul, 2018 1 commit

Added more convolution variants to DEX (#1223) · 9bb0b653

Jayaram Bobba authored 6 years ago

* CPU Direct Execution: Implement ConvertLayout and refactor

* CPU Direct Execution: Implement Convolution

* 1) Adds computation reuse to direct execution
2) Add avg_pool, broadcast and convolution_bias to direct execution
3) Moved some computation reuse utility functions to graph_utils

* Use lists instead of vectors to avoid reallocation overheads

* - Added convolution variants to direct execution
- Removed ConvolutionBiasRelu, use ConvolutionBias instead
- Reduced code duplication by moving functionality to mkldnn_emitter
  from cpu_emitter

* Style fix

* Moved mkldnn build_convolution to a templated method

* Style fix

* refactored mkldnn conv bprop builders

* Style fix

9bb0b653

11 Jul, 2018 1 commit

Disabled RNN fusion pass in IA transformer (#1217) · 4cd2c602

Pruthvi authored 6 years ago

4cd2c602

03 Jul, 2018 1 commit

Batch dot operation for rank 3 multiply with rank 2 tensors (#1180) · 238ce788

Louis Feng authored 6 years ago

* hacking to support dot of 3 by 2 inputs with gemm_batch.

* clean up.

238ce788