1. 26 Oct, 2018 2 commits
      Add builder for {de}quantize to make APIs consistent and support {de}quantize with mkldnn (#1839) · 6b36a480
      Nishant Patel authored
      * Add builder for {de}quantize
      
      * Add declaration in header
      
      * Add mkldnn support for {de}quantize
      
      * Add support for {de}quantize with mkldnn
      
      * Add Dex support
      
      * Generalizing some APIs and adding a test case for DQ in backend_test.in.cpp
      
      * Unify scale between ngraph and mkldnn
      
      * Check for nullptrs
      
      * PR feedback
      
      * fix unit test failure
      
      * Adding tests for builder and deleting the backend tests
      
      * curly braces
      
      * test rename
      DEX Debugger (#1798) · fc5842d9
      Nick Korovaiko authored
      * gdb-like interface + tests
      
      * fix not being able to run call twice without call
      
      * fix continue bug
      
      * fix enables; rename kontinue to resume
      
      * switch from lists of functors, enables to vectors
      
      * address scott's feedback
      
      * adding a debugger object
      
      * address jayaram's feedback
  2. 23 Oct, 2018 1 commit
      hybrid at core (#1821) · 2e88d948
      Sandeep authored
      * skeleton backend
      
      * Code owner from if conditioning
      
      * add simple placement for interpreter and register pass in hybrid
      
      * placement policy applied
      
      * clone the function if needed
      
      * split the function
      
      * Compile subfunctions in corresponding backends
      
      * hybrid backend works as-is for abc test
      
      * cleanup
      
      * add placement policy for CPU
      
      * cleanup a little
      
      * add simple op cost method to backend
      
      * enable CPU pass via flag
      
      * address clang-format PR issue
      
      * resolve build
      
      * clean-up
      
      * update manifest
      
      * disable HYBRID as default build
      
      * style
      
      * addressing offline discussion
      
      * more offline discussion
  3. 22 Oct, 2018 1 commit
  4. 19 Oct, 2018 2 commits
  5. 18 Oct, 2018 1 commit
  6. 13 Oct, 2018 1 commit
  7. 12 Oct, 2018 2 commits
  8. 10 Oct, 2018 1 commit
      Reshape Sinking (#1701) · f642bc4c
      Nick Korovaiko authored
      * reshape sinking working on mnist_conv
      
      * forgot to add reshape_sinking files
      
      * refactoring of binary case
      
      * Quantize/Dequantize case, fix add case, add assert
      
      * address bob and scott's feedback
      
      * debug
      
      * fix a bug where reshapes are removed too early
  9. 05 Oct, 2018 1 commit
      RNN fusion (inference) (#1459) · 4df5ea8b
      Chris Sullivan authored
      * Add op::Sigmoid to nvgpu.
      
      * Bring rnn fusion and concat passes over into GPU from IA. This is a temporary move until generalization and gpu specification can occur.
      
      * Add LSTM fusion and cudnn inference kernel. Next need recurrent fusion and layer fusion.
      
      * Formatting
      
      * Removed unnecessary extra output from LSTM op (rnn with seq. length = 1, so y = hy).
      
      * Add RNN fusion of LSTM cells within a recurrent layer.
      
      * Formatting.
      
      * Add fusion across RNN layers.
      
      * Formatting.
      
      * Add algebraic simplification.
      
      * Added rnn fusion tests.
      
      * Updated conditional on LSTM fusion to better distinguish bound nodes as ht vs xt.
      
      * Formatting.
      
      * Removed print statements.
      
      * Formatting.
      
      * Committing missing file.
      
      * Remove concat inputs pass and mkldnn references.
      
      * fix cmake paths
      
      * conflict resolution with merge from master.
      
      * remove explicit lstm op support. bare LSTM ops are converted to RNN ops for emission.
      
      * Formatting.
      
      * Use NGRAPH_ASSERT. Formatting of intel copyright.
      
      * Add check on the feature size (shape) of the recurrent (hidden) input and cell state, to ensure they are the same size.
      
      * fix wrong rnn header
      
      * Formatting.
      
      * Add back lstm op to dispatch table.
      
      * Added RNN test which shows cudnn rnn kernel is not producing correct results.
      
      * With update to AlgSimpl. to simplify concat-reshape-slice, the check modified in this commit needed to be relaxed.
      
      * Bug fix in parameter tensor packing.
      
      * Alias third output element of RNN for cell state (bug fix).
      
      * Resolve numerical correctness issue with negative values in RNN (bug fix).
      Add minimal test to evaluate LSTM and compare with values calculated by hand.
      
      * Add tensor parameter sizes to kernel hash as
      they are kernel-specific.
      
      * Add 2 layer lstm fusion test against by-hand solution.
      
      * Export param concatenation to graph for cudnn kernel at both the single rnn layer and multi-layer.
      
      * Formatting.
      
      * Finishing touches after merge: add support for macro-expanded dispatch via op_tbl.
      
      * Simplify macro support for gpu ops.
      
      * Add CUDNN_VERSION >= 7200 defguards for RNN fusion.
      Need to decide how to notify user of increased performance with >= 7200.
      
      * Revert lstm_analytic test to explicitly copy data to tensor params.
      
      * Removed namespace arg from NGRAPH_GPU_OP.
      
      * Refactored macros to different header so op_tbl only contains op list.
      
      * Defguard on cudnn_descriptor<cudnnRNNDataDescriptor_t>.
      
      * doubles -> floats
      
      * Reorg. pass asserts, prepare to replace with non-throwing pass failures.
      
      * Remove Lstm op and replace it with Rnn.
      
      * Format
      
      * Utilize RETURN_IF_FALSE in rnn pass to avoid any RT asserts.
      Note that falling back to raw (no passes) graph for 2rnn_3lstm json from mxnet models
      results in a double free inside of the memory layout pass. Appears to be a bug
      in Reshape pass through.
      
      * Removed print statements. Add check on input data and recurrent data.
      
      * Don't reuse memory for non-destructive ops.
      
      * Add back Rnn test.
      
      * Formatting.
      
      * Clean up comments.
      
      * Update test per review comments.
  10. 26 Sep, 2018 1 commit
  11. 15 Sep, 2018 1 commit
  12. 13 Sep, 2018 1 commit
      Control dependencies (#1445) · 58f9af01
      Nick Korovaiko authored
      * topological sort with cdeps
      
      * add control deps API, fix unit tests
      
      * rollback adjoints changes
      
      * fix test failures,add more tests
      
      * remove dead code
      
      * address scott's feedback
  13. 07 Sep, 2018 1 commit
  14. 06 Sep, 2018 1 commit
  15. 30 Aug, 2018 1 commit
  16. 24 Aug, 2018 1 commit
  17. 22 Aug, 2018 1 commit
  18. 21 Aug, 2018 1 commit
      Statically link cpu backend into ngraph shared library (#1444) · 5ab5a129
      Robert Kimball authored
      * static link cpu library to ngraph
      
      * remove debug
      
      * link ngraph and cpu backend into a single shared object
      
      * add -fPIC and whole-archive for CPU backend
      
      * Added conditional for --whole-archive for Mac OS.
      
      * Added more conditionals for macOS.
      
      * fix linking problem and unit test failures caused by multiple copies of the same function in CPU backend and INTERPRETER
      
      * fix nbench build
      
      * add nbench to unit test build
      
      * add version number to libngraph
  19. 17 Aug, 2018 1 commit
      Enable DEX only build of ngraph (#1424) · 64ac3775
      Jayaram Bobba authored
      * Optionally get rid of codegen from the CPU backend
      
      * Rename option variable
      
      * Merge fixes
      
      * Merge
      
      * Remove extra changes
      
      * remove dex only exclusions (#1429)
      
      * Unconditionally pick  m_direct_execution if NGRAPH_DEX_ONLY
      
      * Style fix
  20. 12 Aug, 2018 1 commit
  21. 11 Aug, 2018 1 commit
  22. 10 Aug, 2018 1 commit
  23. 27 Jul, 2018 1 commit
      Add some convenience macros/classes for error messages (#1258) · deacf29a
      Adam Procter authored
      * Testing out some ideas for better error messages on AvgPool
      
      * Add uncaught_exception() check to ConstructionAssertLogger dtor
      
      * More general assertion class, not homed inside Node
      
      * Minor formatting change
      
      * NODE_ASSERT for type prop failure
      
      * Produce lighter-weight DummyAssertionHandler when assertion succeeds
      
      * New ctor for AssertionHelper that takes a single location arg; more const&-ness for the constructors
      
      * Remove move constructor for AssertionHelper; fix broken test in assertion.cpp
      
      * Miscellaneous improvements
      
      * Templatized AssertionHelper so different exception classes can be used; implemented TYPE_CHECK_ASSERT around this
      * Changed from a "stack" of locations to a single location (the stack was too complicated)
      * Added "FAIL" classes/macros which do not take a condition
      
      * Rename a helper function
      
      * Cleanup, cruft removal
      
      * Add test to make sure the assert helper has the lifetime we expect
      
      * Missing includes
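The commits above describe an assertion helper that accumulates an error message via streaming and reports it once the full statement has run. A minimal sketch of that pattern follows; the names are hypothetical and simplified (the real ngraph macros are NODE_ASSERT/TYPE_CHECK_ASSERT, carry source locations, and return a lighter-weight dummy handler on success):

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical, simplified assertion helper: on a failed condition it
// collects a message via operator<< and throws from its destructor, so
// callers can write:  MY_ASSERT(cond) << "details " << value;
class AssertionHelper
{
public:
    explicit AssertionHelper(bool failed)
        : m_failed(failed)
    {
    }
    ~AssertionHelper() noexcept(false)
    {
        // Throw only if the assertion failed and we are not already
        // unwinding from another exception (C++17 counterpart of the
        // uncaught_exception() check mentioned in the commits).
        if (m_failed && std::uncaught_exceptions() == 0)
        {
            throw std::runtime_error(m_stream.str());
        }
    }
    template <typename T>
    AssertionHelper& operator<<(const T& value)
    {
        if (m_failed) m_stream << value;
        return *this;
    }

private:
    bool m_failed;
    std::ostringstream m_stream;
};

#define MY_ASSERT(cond) AssertionHelper(!(cond))
```

Throwing from the destructor is safe here only because the helper is a temporary destroyed at the end of the full expression, after all the streamed operands have been appended; the uncaught-exception check keeps the destructor from terminating the program during stack unwinding.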
  24. 14 Jul, 2018 1 commit
  25. 21 Jun, 2018 1 commit
      Constant folding for Reshapes (#1130) · b9a77a9d
      Adam Straw authored
      * adding constant propagation pass
      
      * adding test/constant_propagation.cpp
      
      * template make_constant_reshape function
      
      * code review feedback
      
      * add missing files
  26. 19 Jun, 2018 1 commit
      Bob/cmake (#1118) · 4847b2de
      Robert Kimball authored
      * fix mkldnn rpath
      
      * fix compile warning
      
      * close backends when exiting
      
      * set backend output directory of backends to the ngraph output directory
      
      * Aprocter/patch patch (#1119)
      
      * Move more rpath stuff inside if(NOT APPLE)
      
      * fix repatch problem with mkldnn library
      
      * add updated patch command for older versions of cmake
  27. 13 Jun, 2018 1 commit
  28. 04 Jun, 2018 1 commit
      Modernize cmake usage (#1032) · eef750df
      Robert Kimball authored
      * Update cmake files to more modern approach
      
      * disable building libraries that are not required
      
      * handle more build cases
      
      * add versions to backend libs. add start of package target.
      
      * add create_backend to backends
      
      * temporary workaround to tbb not linking correctly with gcc
      
      * install codegen lib
      
      * force tbb to link to the cpu backend so that it is available for codegen
      
      * fix clang build error
      
      * fix warning for codegen build
      
      * update cuda header paths
      
      * change error message for opening backend shared library
      
      * set lib path
  29. 02 Jun, 2018 1 commit
  30. 29 May, 2018 1 commit
      [CS:GPU::Part 1] Add GPUShape type, conversion operators, and generalized shape helpers. (#1031) · d051f5fa
      Chris Sullivan authored
      * Added GPUShape and reworked Shape helpers to be
      compatible with different shape types.
Shape is now implicitly convertible to GPUShape.
      
      * Updated shape helpers signature and add conversion operators/constructors for GPUShape.
      
      * Adjust row_major_strides to avoid reversed-copy.
      
      * Moved declaration out of loop for clang.
      
      * Moved gpu_shape to gpu transformer.
      
      * Removed no longer necessary headers.
      
      * Added stdexcept header to gpu_shape.hpp
      
      * Changed check on 64bit shape to check if high bits are set.
      
      * Added spacing between functions in GPUShape and boolean operators in shape.hpp.
      
      * Template parameters are UPPER_SNAKE_CASE.
      
      * Return type of shape_size should be large enough to encapsulate the full stride of the tensor.
      This should be 64bits wide regardless of the underlying value_type of the ShapeType.
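The point about the return type can be illustrated with a minimal, hypothetical shape_size: even when every dimension fits in 32 bits, their product may not, so the accumulation must be done in 64 bits:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch: a GPU shape may store 32-bit dimensions, but the
// element count (product of dims) can exceed 32 bits, so accumulate in
// a 64-bit type regardless of the shape's value_type.
uint64_t shape_size(const std::vector<uint32_t>& shape)
{
    uint64_t size = 1;
    for (uint32_t d : shape)
    {
        size *= static_cast<uint64_t>(d);
    }
    return size;
}
```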
      
      * [CS:GPU::Part 2] Add GPUMemoryManager, GPUAllocator, and memory primitives. (#1034)
      
      This is a big PR which introduces the GPUMemoryManager, GPUAllocator, and the concept of memory primitives.
      
      A memory primitive is a closure which yields the device memory address for a reserved memory space. When a memory reservation is made, the request is recorded along with the data that should be copied (for kernel arguments, but not for workspace memory). The reservation does not yield an address eagerly but instead does so lazily by returning an index which can be used to look up the memory_primitive at runtime. This allows the GPUMemoryManager to delay resolution of the memory address until all reservations have been made. 
      
      Ideally, the temporary allocations needed by each kernel could be captured by the liveness lists in the GPU_External_Function. This way the pass::MemoryManager would capture these allocations along with the needed tensor allocations.
      
      For now, rather than rearchitect the gpu_emitter and external function, we utilize the GPUMemoryManager, which maintains its own internal pass::MemoryManager, and the GPUAllocator. Liveness is handled by the GPUAllocator: all workspace allocations/reservations created in the same (or sub)scope as the GPUAllocator persist until the GPUAllocator goes out of scope and is destroyed. At that time, the GPUAllocator marks the requested temporary buffers as free, effectively ending their liveness. That way the next kernels that construct a GPUAllocator can reuse the workspace memory that was needed for previous kernels.
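The reservation-then-lazy-resolution scheme described above can be sketched as follows. The names are hypothetical and heavily simplified: there are no CUDA calls, and a plain byte pool stands in for device memory.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sketch: a memory primitive is a closure that yields the
// address of a reservation. reserve() returns an index, not a pointer;
// the base address is resolved only after all reservations are made.
class MemoryManager
{
public:
    // Record a reservation; no address is available yet.
    size_t reserve(size_t bytes)
    {
        size_t index = m_primitives.size();
        size_t offset = m_size;
        m_size += bytes;
        m_primitives.push_back([this, offset]() { return m_base + offset; });
        return index;
    }
    // Called once after all reservations have been made, e.g. after a
    // single backing allocation (cudaMalloc on a real GPU).
    void allocate(char* base) { m_base = base; }
    // Invoke a memory primitive at runtime to resolve its address.
    char* address(size_t index) const { return m_primitives[index](); }
    size_t size() const { return m_size; }

private:
    char* m_base = nullptr;
    size_t m_size = 0;
    std::vector<std::function<char*()>> m_primitives;
};
```

Because reserve() hands back an index rather than an address, compiled kernels can bake the index in at compile time and resolve the actual device address only after the single backing allocation exists, which is the delayed resolution the paragraph above describes.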
      
      Additional notes:
      * This PR updates the CUDAEmitter to exclusively utilize GPUShape instead of Shape.
      
         Commits:
         * Added GPUMemoryManager for aggregating memory allocations and copies into a single operation for kernel arguments, and a reusable memory space for workspace allocations.
      
         * Added GPUShape and reworked Shape helpers to be
      compatible with different shape types.
      
        * Removed several unnecessary static_casts now that GPUShape is utilized. GPUTensorViewWrapper had a few functions returning std::vector<size_t> instead of Shape/Strides. These were updated as well to take advantage of GPUShape conversion operators.
      
         * Coordinate->GPUShape
      
         * Refactored replace_slice into CudaKernelBuilder. Simplified allocations using new GPUAllocator and GPUMemoryManager.
      
        * Refactor allocations to make use of primitive emitter. Now memory primitives are registered at compile time and the gpu memory address is resolved at runtime by invoking the primitive.
      
         * Added const qualifier to data being copied in GPUAllocator::reserve_argspace
      
         * Removed more replace_slice diffs.
      
         * Added unit tests for GPUMemoryManager and added checks that ensure the
      device memory is allocated prior to address resolution by the memory_primitives.
      Also exposed the allocation size of the memory manager.
      
         * Added explicit function for queueing kernel argument data rather than inline in the reservation function per @fengleitian recommendation.
      
      [CS:GPU::Part 3] Refactoring of several ops to use GPUMemoryManager (#1035)
      
      This PR implements the new GPUMemoryManager and allocator for all the ops which were previously implemented but required allocations and copies for kernel arguments at runtime. 
      
      Limitations:
      The convolution workspaces could not be added because the relevant descriptors were not available at compile time due to the codegen. If convolution is later added to the CUDNN emitter, the GPUAllocator can be used to avoid workspace allocation at runtime.
      
         Commits:
         * Replaced runtime host to device memcpys with GPUAllocator reservations in order to move them to compile time.
      
         * Forgot to remove no longer necessary buffer freeing from op emitters.
      
      [CS:GPU::Part4] Added op::ReplaceSlice and enabled respective tests. (#999)
      
       This PR implements ReplaceSlice using the coordinate transformation strategy. A thread for each tensor element of the input tensor is chosen and its position in the source tensor coordinate system is calculated. If it is within the source slice, the source is loaded and written out; otherwise the input tensor is loaded.
      
      * Relevant tests are enabled.
      
      * This op was refactored to utilize the new GPUAllocator and memory manager.
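The coordinate-transformation strategy described above can be sketched on the host for the 1-D case. This is a hypothetical helper for illustration; the real kernel assigns one GPU thread per output element and handles arbitrary rank.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side sketch of the per-element ReplaceSlice logic
// (1-D case): each output element checks whether its coordinate falls
// inside the strided slice [lower, upper); if so it reads from the
// replacement tensor, otherwise it passes the input value through.
std::vector<float> replace_slice_1d(const std::vector<float>& input,
                                    const std::vector<float>& repl,
                                    size_t lower,
                                    size_t upper,
                                    size_t stride)
{
    std::vector<float> out(input.size());
    for (size_t i = 0; i < input.size(); ++i)
    {
        bool in_slice = i >= lower && i < upper && (i - lower) % stride == 0;
        out[i] = in_slice ? repl[(i - lower) / stride] : input[i];
    }
    return out;
}
```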
      
         Commits: 
      
         * Updated replace_slice op to utilize GPUShape and GPUMemoryManager.
      
         * Added back missing changes after timeline resolution.
      
      * Fixed clang warnings and bug. The cudnn_handle was not initialized ahead of emission time and so any eager cudnn calls would fail.
      To fix this, the cudnn and cublas handle creation was moved to the external function constructor.
      
      * Changed row_major_strides to always return vector<size_t> to avoid overflow for tensors with many dimensions. Handle the conversion to 32 bits for GPU shapes with an explicit conversion constructor from vector<size_t>.
      
      * During merge the allocation line from external_function was left out. Adding it back.
  31. 14 May, 2018 1 commit
  32. 11 May, 2018 1 commit
  33. 10 May, 2018 2 commits
  34. 09 May, 2018 1 commit
  35. 07 May, 2018 1 commit
  36. 28 Apr, 2018 1 commit