1. 12 Dec, 2018 1 commit
    • "Any" and "All" ops (#2217) · fc216f39
      Adam Procter authored
      * Skip --exclude-libs linker flag on macOS
      
      * Change test to if(LINUX)
      
      * Add "Any" op and AnyAllReplacement pass
      
      * Add AnyAllReplacement to IGPU backend
      
      * Stub (error-out) handlers for GPU and INTELGPU
      
      * Add 'All' op
      
      * Add AnyAllInsertion pass, deprecate deprecable ops, add stubs for INTELGPU
      
      * Add failing unit tests to INTELGPU manifest
      
      * Reduce boilerplate
      
      * Reduce more boilerplate
      
      * Add static keywords
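      A note on semantics, since the log does not show the op signatures: Any is a logical-OR reduction and All a logical-AND reduction over a set of axes. Below is a minimal host-side C++ sketch of just that semantics (illustrative only, not the nGraph op API); a plausible reason a replacement/insertion pass exists is to lower Any/All to Max/Min reductions on backends without native support, though the log does not confirm that.

        #include <vector>

        // Illustrative helpers over a flat boolean vector; the real ops
        // reduce a tensor over an AxisSet.
        bool any_reduce(const std::vector<bool>& values)
        {
            bool result = false;
            for (bool v : values)
            {
                result = result || v; // "Any": true if at least one element is true
            }
            return result;
        }

        bool all_reduce(const std::vector<bool>& values)
        {
            bool result = true;
            for (bool v : values)
            {
                result = result && v; // "All": true only if every element is true
            }
            return result;
        }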
  2. 11 Dec, 2018 13 commits
    • Embedding fprop (#2053) · 16d88a7f
      Nick Korovaiko authored
      * embedding fprop
      
      * add a new line
      
      * type prop tests
      
      * rename
      
      * add a stub handler for embeddinglookup on intelgpu
      
      * rename embedding.* to embedding_lookup
      
      * rename tests in manifest files
      
      * move embeddinglookup to catchall case
      
      * fix test case breaks after merge
      
      * add a negative test, pull up an assertion
      
      * fix test failures
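      For context, the forward pass of an embedding lookup just gathers rows of a weight table by integer index. A minimal C++ sketch of that semantics (illustrative; the names below are not the nGraph EmbeddingLookup API):

        #include <cstddef>
        #include <vector>

        // table is row-major [vocab_size x embedding_dim]; indices picks one row per lookup.
        std::vector<float> embedding_lookup(const std::vector<float>& table,
                                            size_t embedding_dim,
                                            const std::vector<size_t>& indices)
        {
            std::vector<float> out;
            out.reserve(indices.size() * embedding_dim);
            for (size_t idx : indices)
            {
                const float* row = table.data() + idx * embedding_dim;
                out.insert(out.end(), row, row + embedding_dim); // copy the selected row
            }
            return out;
        }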
    • Framework for Hybrid GPU backend (#2196) · af2c4c7d
      Robert Kimball authored
      * add empty framework for hybrid GPU, or GPUH
      
      * move placement to the runtime directory
      
      * wip
      
      * skeleton for hybrid GPU backend. most unit tests pass.
      
      * cleanup
      
      * move hybrid code into hybrid dir/namespace
      
      * move hybrid functions
      
      * move more hybrid functions to hybrid directory
      
      * fix placement after compile. All unit tests passing
      
      * fix gpu backend ctor
    • Windows build support (#2177) · 9234cc69
      Robert Kimball authored
      * files pulled from bob/winbuild
      
      * fix compile problems
      
      * fix a few windows build errors
      
      * add windows file to exclude from git
      
      * add comment why change was made
      
      * revert obsolete change
      
      * more cleanup
      
      * building interpreter and unit test on windows with DLLs
      
      * Add flag for windows to export all symbols. Short term fix.
      
      * enable MD build
      
      * address warnings
      
      * dump all windows build results to a single directory
      
      * fix windows backend dll open issue
      
      * remove debug
      
      * fix file iterator for windows
      
      * fix merge error
      
      * fix test failure
      
      * change header from h to hpp in hopes of making python happy
      
      * address more linux build issues
      
      * fix visibility enable
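      The "export all symbols" flag (likely CMake's WINDOWS_EXPORT_ALL_SYMBOLS property) and the later visibility fix point at the usual DLL-boundary problem on Windows. A common per-platform export macro, sketched here with hypothetical names rather than nGraph's actual ones:

        // Hypothetical macro names; nGraph's real export/visibility macros may differ.
        #if defined(_WIN32)
          #ifdef BUILDING_BACKEND_DLL
            #define BACKEND_API __declspec(dllexport) // building the DLL: export the symbol
          #else
            #define BACKEND_API __declspec(dllimport) // consuming the DLL: import the symbol
          #endif
        #else
          #define BACKEND_API __attribute__((visibility("default"))) // non-Windows default visibility
        #endif

        class BACKEND_API Backend
        {
        public:
            void compile();
        };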
    • nvgpu cuda softmax optimization (#2101) · a3133482
      Fenglei authored
      * add some helper function
      
      * update with new helper function
      
      * update reduce to nd with new helper function
      
      * update float sum to stable sum
      
      * fix bug
      
      * update all reduce to stable sum for float
      
      * fix bug and pass the sum stable test
      
      * remove debug info
      
      * style
      
      * update with shape
      
      * fix bug
      
      * add host parameters to cuda_emitter
      
      * clang format
      
      * fix bugs
      
      * add element::type support
      
      * format
      
      * add a cached value with datatype name
      
      * add init_reduce_value
      
      * unroll loop
      
      * optimization
      
      * remove the need for init_value
      
      * add memset kernel
      
      * add memcpy
      
      * working version
      
      * remove debug info
      
      * add comments, clean up code.
      
      * change in_idx to input_idx
      
      * fix bug
      
      * change args name for memset in emitter
      
      * pass element::Type instead of string
      
      * the op::reduce comes with an init value, add support
      
      * resolve codacy-bot comment
      
      * fix bug
      
      * resolve codacy-bot comment
      
      * add soft_max_block_reduce kernel
      
      * fix bugs
      
      * add softmax_block_reduce to cuda_emitter
      
      * compiling ok, result wrong
      
      * fix bug in kernel
      
      * working version
      
      * removed unused code
      
      * remove unused comments, resolve comments
      
      * cuda reduce for max, min, mul, reduce op init value, format
      
      * use type::info
      
      * use type info for numeric_limits
      
      * remove code from gpu_host_parameters
      
      * header
      
      * remove outdated comments
      
      * add helper to check if stable sum is needed
      
      * add stable sum test for double
      
      * remove extra line
      
      * consolidate helper functions
      
      * no need for list now.
      
      * remove extra ;
      
      * clang format
      
      * style
      
      * add skip test for cpu and intelGPU side
      
      * resolve more conflict
      
      * update comment
      
      * fix a warning
      
      * Update src/ngraph/runtime/gpu/gpu_cuda_kernel_builder.cpp
      
      using load.
      Co-Authored-By: fengleitian <35274053+fengleitian@users.noreply.github.com>
      
      * using WARPSIZE instead of 32, using lambda
      
      * more WARPSIZE instead of 32
      
      * fix block_size_x bug
      
      * using __expf
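      The last few messages (WARPSIZE, block reduce, __expf) suggest a warp/block-level reduction inside the generated softmax kernel. A simplified CUDA sketch of that idea follows; it is not the emitted kernel from gpu_cuda_kernel_builder.cpp and it omits the usual max-subtraction for numerical stability:

        #define WARPSIZE 32

        // Warp-wide sum via shuffle-down; after the loop, lane 0 holds the total.
        __device__ float warp_reduce_sum(float val)
        {
            for (int offset = WARPSIZE / 2; offset > 0; offset /= 2)
            {
                val += __shfl_down_sync(0xffffffff, val, offset);
            }
            return val;
        }

        // Toy 1-D softmax: one 32-thread warp normalizes one row of length n.
        __global__ void softmax_row(const float* in, float* out, int n)
        {
            float local = 0.0f;
            for (int i = threadIdx.x; i < n; i += WARPSIZE)
            {
                local += __expf(in[i]); // fast intrinsic exponential
            }
            float sum = warp_reduce_sum(local);
            sum = __shfl_sync(0xffffffff, sum, 0); // broadcast lane 0's total to the warp
            for (int i = threadIdx.x; i < n; i += WARPSIZE)
            {
                out[i] = __expf(in[i]) / sum;
            }
        }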
    • fix crash in ReshapeConvertLayout (#2205) · 6584306c
      gaurides authored
      * fix crash in ngraph-tf test conv_ops_test.Conv2DTest.testConv2DKernelSmallerThanStrideSame
      
      * fix file perms
      
      * correct checks
    • 24bd105f
      Sergey Shalnov authored
    • Bind cuda context to thread prior to compilation (#2199) · 31210402
      Chris Sullivan authored
      * Bind cuda context to thread prior to compilation. Small refactoring.
      
      * bind_cuda_context_to_thread in source
      
      * bind_cuda_context_to_thread header
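      The fix corresponds to the CUDA driver API call that attaches a context to the calling host thread, so compilation done on a worker thread targets the intended device. A minimal sketch (assumes the driver API is already initialized and ctx was created elsewhere):

        #include <cuda.h>

        // Make an existing context current on the calling thread before compiling/launching.
        void bind_cuda_context_to_thread(CUcontext ctx)
        {
            CUresult rc = cuCtxSetCurrent(ctx);
            // real code would check rc != CUDA_SUCCESS and report the error
            (void)rc;
        }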
    • [Py]Add version to ngraph python (#2193) · ec0a3f5c
      tsocha authored
      * [Py]Add version to ngraph python
      
      * FIX
    • Reshape SoftMax Reshape (#2188) · b77fd922
      Nick Korovaiko authored
      * reshape softmax reshape
      
      * add new line
      
      * add new line
      
      * fix style errors
    • Matcher skip (#2169) · c8bc3edc
      Nick Korovaiko authored
      * Update cpu_external_function.cpp
      
      * fix test case failures
      
      * env var to abort matching
      
      * Update matcher.cpp
      
      * Update matcher.cpp
      
      * add a comment
      
      * give an env var a better name
    • Fix setup.py for CentOS (#2163) · f46e56ec
      Adam Rogowiec authored
      * Fix installing numpy dependency on CentOS.
      
      * Check whether nGraph library directory exists.
    • Fix TF test failures on Mac. (#2210) · 1640d21e
      Amy Zhuang authored
      * Bug fixes to unordered map checks
      
      * No in-place slice for non-native MKLDNN layouts
      
      * is_op
    • is_op (#2203) · c9eef901
      Nick Korovaiko authored
  3. 10 Dec, 2018 1 commit
    • Harryk remove winml ref (#2204) · 90aa7336
      harryskim authored
      * Removed winml from stack diagram
      
      * Removed winml from full stack diagram
      
      * Update README.md
      
      * update the diagram without winml
      
      * Changed sentence about WinML
      
      * Removed duplication
  4. 08 Dec, 2018 4 commits
  5. 07 Dec, 2018 6 commits
  6. 06 Dec, 2018 14 commits
    • QCBiasAdd and QCBiasSignedAdd for mkldnn (#2062) · 1f40160d
      Nishant Patel authored
      * Quantize the bias to int32
      
      * Bias scale fix
      
      * mnist works
      
      * Quantize Bias
      
      * Introduce Quantize op in the graph to quantize bias & feedback
      
      * Add QuantizedConvBiasAdd
      
      * Comments and some refactoring
      
      * Add test case with float bias and enable int32 as quantized type in ngraph
      
      * Change shape of scale from Shape{} to Shape{1} in the backend
      
      * Add QuantizedConvBiasSignedAdd
      
      * Fix Layouts, clean up and a test case for QCBA
      
      * Test case for QCBSA
      
      * cleanup mkldnn_emitter.hpp
      
      * fix build error
      
      * Constant fold
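      "Quantize the bias to int32" normally means rescaling the float bias by the product of the input and filter scales so it can be added directly to the int32 accumulator of an int8 convolution. The convention below (scale = real value per quantized unit) is the common MKL-DNN-style one and is an assumption, not necessarily what this PR implements:

        #include <cmath>
        #include <cstdint>
        #include <vector>

        // Hypothetical helper: requantize a float bias into the int32 accumulator domain.
        std::vector<int32_t> quantize_bias(const std::vector<float>& bias,
                                           float input_scale,
                                           float filter_scale)
        {
            const float bias_scale = input_scale * filter_scale; // assumed scale convention
            std::vector<int32_t> out(bias.size());
            for (size_t i = 0; i < bias.size(); ++i)
            {
                out[i] = static_cast<int32_t>(std::lround(bias[i] / bias_scale));
            }
            return out;
        }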
    • Sergey Shalnov authored
    • DEX Loop Kernel (updated) (#2156) · 8fc481a3
      Nick Korovaiko authored
      * one output
      
      passing tests
      
      clean up
      
      fix build breaks
      
      * move generators into a separate file
    • 56980738
      Nick Korovaiko authored
    • an env var to disable individual fusions (#2185) · 504e78f8
      Nick Korovaiko authored
      * an env var to disable individual fusions
      
      * fix env var name
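      A plausible shape for such a switch is a getenv check at the top of each named fusion callback (which dovetails with "Give Fusions Names" below); the variable name and list format here are assumptions, not necessarily what the PR settled on:

        #include <cstdlib>
        #include <string>

        // Hypothetical: true if the named fusion appears in a separated disable list
        // taken from the environment, e.g. NGRAPH_DISABLED_FUSIONS="LSTMFusion;ConvBias".
        static bool fusion_disabled(const std::string& fusion_name)
        {
            const char* list = std::getenv("NGRAPH_DISABLED_FUSIONS"); // assumed variable name
            return list != nullptr && std::string(list).find(fusion_name) != std::string::npos;
        }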
    • Give Fusions Names (#2178) · a09d5f88
      Nick Korovaiko authored
      * give fusions names
      
      * fix build breaks
      
      * fix perms
    • Abort messages in Matcher to better understand cases where we fail to match (#2179) · 06916cbc
      Nick Korovaiko authored
      *  abort messages in matcher.cpp
      
      * style fixes
    • Graph comparison - isolated per op testing (#2144) · 1feb49f1
      gcwenger authored
      * Isolated per op testing when comparing graphs for better determination of source of accuracy divergence.
      
      * Improve clarity of comment
    • [Py] Update README for PyPI (#2151) · 8a9cf8aa
      Michał Karzyński authored
      * Update README for PyPI
      
      * Update README for PyPI
      
      * Remove redundant newlines
      
      * Fix links
    • [Py] setup.py code style formatting. (#2164) · 8249bf9f
      Adam Rogowiec authored
      * Uniform quotes style.
      
      * Fix comment style.
      
      * Check setup.py with flake8.
      
      - Fix flake8 errors.
      
      * Move function out of class scope.
      
      * Fix function parameter list
      
      * Fix formatting.
    • nvgpu cuda reduce with stable sum (#2076) · 606f3f93
      Fenglei authored
      * add some helper function
      
      * update with new helper function
      
      * update reduce to nd with new helper function
      
      * update float sum to stable sum
      
      * fix bug
      
      * update all reduce to stable sum for float
      
      * fix bug and pass the sum stable test
      
      * remove debug info
      
      * style
      
      * update with shape
      
      * fix bug
      
      * add host parameters to cuda_emitter
      
      * clang format
      
      * fix bugs
      
      * add element::type support
      
      * format
      
      * add a cached value with datatype name
      
      * add init_reduce_value
      
      * unroll loop
      
      * optimization
      
      * remove the need for init_value
      
      * add memset kernel
      
      * add memcpy
      
      * working version
      
      * remove debug info
      
      * add comments, clean up code.
      
      * change in_idx to input_idx
      
      * fix bug
      
      * change args name for memset in emitter
      
      * pass element::Type instead of string
      
      * the op::reduce comes with an init value, add support
      
      * resolve codacy-bot comment
      
      * fix bug
      
      * resolve codacy-bot comment
      
      * remove unused comments, resolve comments
      
      * cuda reduce for max, min, mul, reduce op init value, format
      
      * use type::info
      
      * use type info for numeric_limits
      
      * remove code from gpu_host_parameters
      
      * header
      
      * remove outdated comments
      
      * add helper to check if stable sum is needed
      
      * add stable sum test for double
      
      * remove extra line
      
      * consolidate helper functions
      
      * no need for list now.
      
      * remove extra ;
      
      * clang format
      
      * style
      
      * add skip test for cpu and intelGPU side
      
      * add line between groups of headers
      
      * add two simple stable sum test for float and double
      
      * skip test for intelGPU
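      Neither this PR nor #2101 spells out what "stable sum" means; the standard way to make a float reduction numerically robust without widening the accumulator is compensated (Kahan) summation, sketched below in plain C++ as an assumption about the approach:

        // Compensated (Kahan) summation: carries the rounding error forward so long
        // float reductions lose far less precision than a naive accumulation loop.
        float stable_sum(const float* data, int n)
        {
            float sum = 0.0f;
            float c = 0.0f; // running compensation for lost low-order bits
            for (int i = 0; i < n; i++)
            {
                float y = data[i] - c;
                float t = sum + y;
                c = (t - sum) - y;
                sum = t;
            }
            return sum;
        }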
    • Fix compiler error with GCC 7.1 (#2155) · 4b0445d1
      Fabian Boemer authored
    • Pruthvi/fix rnn precision (#1874) · 73da681a
      Pruthvi authored
      * - Added reorder support for rnn weights_layer/iter
      
      * i) fixed compilation issues ii) working but still observing precision error
      
      * i) fixed failing rnn unit test for DEX ii) refactored workspace in RNN mkldnn emitter
      
      * i) added support for src reorder to TNC from NTC
      
      * reorder support for rnn output from NTC to TNC
      
      * - added support for rnn weight reorder ldgoi -> ldigo
      - code refactor for lstm/rnn kernel in mkldnn emitter
      
      * - refactor rnn mkldnn kernel, change variable names
      
      * fix RNN codegen kernel
      
      * disable layer rnn fusion pass, to test CI
      
      * method to validate recurrent rnn inputs
      
      * add correlated matches for Recurrent RNN PM
      
      * - simplify reorder logic for rnn_weights
      - fix graph pattern for fusing rnn cell across time steps
      
      * do weights reorders in rnn timesteps fusion
      
      * refactored LSTM graph pass
      
      * - Bug fix for finding the lstm inputs deterministically
      - Refactored LSTM graph pass to single pass
      - made changes to LSTM RNN time step fusion graph pass
      
      * - use replace_node instead of replace_output in Lstm_step_wise fusion graph pass
      
      * fix compilation error
      
      * Fix GNMT rnn fusion
      
      * check if the node is in use before replacing in RNN graph passes
      
      *  i) fix style ii) fix topo sort issue in RNN graph pass
      
      * style fix
      
      * fix bug in simplify_concat pass
      
      * replaces Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Concat -> Lstm2 with Lstm1 -> Lstm2
      
      * cse for convert layout
      
      * addressed PR comments
      
      * - optimization pass to remove  Lstm1 -> {GOE1, GOE2} -> {Slice1, Slice2} -> Lstm2
      - conditional fusing of LSTM cells only for the decoder
      
      * made changes to multi layer RNN fusion callback
      
      * fix asserts in RNN op
      
      * - added support to fuse layers when slc=dlc for RNN cells
      - bug fix on the sanity checks for RNN Op
      
      * - support RNN layer fusion till slc = dlc
      - bug fixes in multi layer rnn fusion call back
      
      * capture reshape in the RNN weights
      
      * Addressed PR comments
      
      * - added comments in multi layer PM call back
      - fuse only if slc == DLC across layers
      
      * restore deleted 3_lstm_cell_forward.json file
      
      * fix typo
      
      * fix failing unit tests
      
      * When processing in place slice, do not change the offset of the slice node if the argument pointer comes from function input.
      
      * Address PR feedback: process in place slice after propagating in place input.
      
      * Set INTERMEDIATE role before propagating in place input.
      
      * Do not add temporaries to the variable name map before propagating in place input in codegen.
      
      * Fix a bug in codegen.
      
      * Fix a bug in codegen slice.
      
      * reenable disabled rnn unit test
      
      * fix compiler error
      
      * - bug fix in the slicing logic for the layer fused rnn cell
      - fix failing rnn unit test
      
      * - Addressed PR comments
      - removed redundant checks from the rnn graph pass
      - simplified rnn call back replace node logic
      
      * - added new multilayer rnn *.json file
      - fix test case
      
      * [PRIVATE BRANCH] Style fixes (#2080)
      
      * Style fixes
      
      * change order of lstm gates
      
      * [PRIVATE BRANCH] Jbobba/rnn fusion review (#2113)
      
      * Style fixes for single-layer RNN fusion
      
      * Style fixes to multi-layer RNN
      
      * style fix
      
      * disable GPU test
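      Several of the messages above concern pure layout reorders: RNN activations between NTC (batch, time, feature) and TNC (time, batch, feature) and weights between ldgoi and ldigo, so the tensors match what the MKL-DNN RNN primitive expects. A small illustrative reorder in plain C++ (the backend itself would use an mkldnn reorder primitive rather than this loop):

        #include <cstddef>
        #include <vector>

        // Transpose a sequence tensor from NTC [N][T][C] to TNC [T][N][C] layout.
        std::vector<float> ntc_to_tnc(const std::vector<float>& src, size_t N, size_t T, size_t C)
        {
            std::vector<float> dst(src.size());
            for (size_t n = 0; n < N; n++)
            {
                for (size_t t = 0; t < T; t++)
                {
                    for (size_t c = 0; c < C; c++)
                    {
                        dst[(t * N + n) * C + c] = src[(n * T + t) * C + c];
                    }
                }
            }
            return dst;
        }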
    • fix failing bn test (#2175) · 86b783c6
      Pruthvi authored
      * fix failing bn test
      
      * fix style
  7. 05 Dec, 2018 1 commit