Commits · 981dabeffaced15f685d2f105edf2559067fa67d · submodule / ngraph

17 Jan, 2018 4 commits

remove a node from users (#379) · 981dabef
varun-intel authored Jan 17, 2018
```
* remove a node from users

* style
```
Add mxnet seq2seq serialized model for benchmarking (#385) · 5ad1de22
Robert Kimball authored Jan 17, 2018
```
* add mxnet seq2seq forward and backward

* add benchmarks for seq2seq forward and backward
```
Numerically stable sum so we can pass mxnet unit tests (#381) · b6c98de1
Matthew Brookhart authored Jan 17, 2018
```
* Numerically stable sum so we can pass mxnet unit tests

* Add a small initial residual
```

Drwebb/gpu external function (#367) · c5549682

Tristan Webb authored Jan 17, 2018

* Initial GPU_ExternalFunction implementation

Other changes:

Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled

Initial passing (a+b)*c test

Properly link cuda libraries

Simple GPUTensorView implementation

Initial GPU emitter

GPU codegen initial function gen, no kernels yet

Rename GPU emitter and tensor_view_wrapper to match naming convention

* GPU external function based on BASE

* Fix stray base -> gpu

* TensorViewWrapper -> GPU_TensorViewWrapper

* Copy over emitter from base transformer

* Fix for naming dense layout

* Copy kernel emitters from base -> gpu and strip out kernel_utils

* Add aliases to GPU_TensorViewWrappers

* More fixes for naming descriptor::TensorViews

* Move in call_frame implementation from base -> gpu

* apply code format

* GPU codegen running A+B*C

gpu emitters
gpu ctx setup cuda_module kernels
Remove GPU_CF perf counters
Use gpu kernels in external function
Add GPU 1d dot test

Review Changes:
* Remove CPU specific kernel emitting method bodies

* Use copy_data from test/util.cpp, uncomment compileTest

* Use test_utils copy_data function

* Grab function name from pass manager for def, clean up indentation

16 Jan, 2018 3 commits
- Add a few more openmp ops (#374) · e433e55a
  Matthew Brookhart authored Jan 16, 2018
```
* Add a few more openmp ops

* fix a warning

* fix merge error
```
- Implement select-and-scatter (#364) · 29231e11
  Adam Procter authored Jan 16, 2018
  
- Yixing/argon install (#370) · d2b081c8
  Yixing Lao authored Jan 16, 2018
```
* bump argon version

* ask argon to install itself

* bump version again

* argon lib dir

* installs argon to ngraph_dist

* fix path

* upgrade argon version
```
14 Jan, 2018 2 commits
- Fix error where compiler's result is properly set to nullptr if compile fails (#375) · 5e80b771
  Robert Kimball authored Jan 14, 2018
```
Add support for reinitializing the compiler if a compile fails, allowing subsequent compiles to succeed
```
- make CPU emit functions static so they can be called by other backends (#376) · 2775b0bf
  Robert Kimball authored Jan 14, 2018
  
12 Jan, 2018 1 commit
- Image batch dilation for convolution (#363) · c682fbf4
  Adam Procter authored Jan 12, 2018
```
Sub-PR: image dilation tests (#362) via @adstraw 
```
11 Jan, 2018 2 commits
- add interpreter nan check option (#368) · 74850150
  Robert Kimball authored Jan 11, 2018
```
* add interpreter nan check option

* add unit test
```
- Better error message from runtime::Manager. · a2d97200
  Christian Convey authored Jan 11, 2018
  
10 Jan, 2018 4 commits

Pattern matching for sum (#293) · 4345e39d

Nick Korovaiko authored Jan 10, 2018

* the first stab at pattern for sum

test refactoring, debug msg clean up, formatting fixes

removing v1 and cleaning up v2 + formatting

rollback the changes in reduce_ops

rename v2 -> sum_pred

remove unused funcs

switch to new c-tors

remove TensorViewType

removing an assert

fix a docstring to match a c-tor

* fixes after rebase

Implement reduce-window in interpreter and CPU (#359) · c5ffe8e9
Adam Procter authored Jan 10, 2018

fix some is_functionally_identical methods (#365) · 7b1dc3e3
Robert Kimball authored Jan 10, 2018

Switch from Eigen to OpenMP for loops for DS2 kernels (#345) · 7df687c1

Matthew Brookhart authored Jan 10, 2018

* speed up reduceslice with kernel emitter

* const-ify and fix a clang warning

* add elementwise ops, slice to for loops

* add broadcast codegen

* add Exp

* fix bugs introduced in eigen kernels

* fix another introduced bug in Eigen

* Fix an Atomic Bug with Sum, do some cleanup

* unit tests pass

* Add Reshape Op, passes Tests

* rewrite sum to correctly handle multi-threading

* Code Cleanup

* add some extra unary ops

* Address review comments

* fix an error in the review comment refactor

* Add Power op

* Add (most) of the Logic Ops

* Make Concat default to OpenMP kernel

* fix n-D reshape issue

09 Jan, 2018 3 commits

Remove an optimization for caching a list of ordered ops (#360) · 7e89f1bb

Nick Korovaiko authored Jan 09, 2018

* remove caching of ordered_ops

* graph_util logging msgs

* small cleanup

* remove files for the TopologicalSort pass

* remove NGRAPH_DEBUG from graph_util.hpp

Fixes minor bugs in XLA-specific code. (#361) · 8627c495
Christian Convey authored Jan 09, 2018

Optimizations to reduce compile time (#357) · 7f3dc2d7

Robert Kimball authored Jan 09, 2018

* much faster compile time
* Remove all variables and just directly access inputs, output, and temps.
* compare layouts when checking if two ops are equal
* make performance counters available to all backends

08 Jan, 2018 2 commits
- Definitions of XLA ConvNet MNIST ops (#324) · 524d04fc
  Adam Procter authored Jan 08, 2018
  
- Optimize the Coordinate class to prevent copies (#358) · 686ee9ab
  Robert Kimball authored Jan 08, 2018
  
06 Jan, 2018 1 commit
- fix boolean ops to return the input element::type instead of float32 (#356) · 07ba1bef
  Matthew Brookhart authored Jan 06, 2018
  
05 Jan, 2018 4 commits

Zero padding for convolution (#352) · 8c4ae5ea
Adam Procter authored Jan 05, 2018

Remove descriptor::Value and runtime::Value (#355) · 06f9efd9
Robert Kimball authored Jan 05, 2018
```
* general cleanup

* remove runtime::Value

* more cleanup

* more cleanup
```
Remove unused args from Input (#353) · f4bb3e46
Robert Kimball authored Jan 05, 2018
```
* cleanup

* remove arg_index

* remove argno from Input

* uncleanup
```

Drwebb/gpu runtime boilerplate (#314) · feab44b5

Tristan Webb authored Jan 05, 2018

* Simple boilerplate for GPU runtime files

  - GPUBackend
  - GPU ExternalFunction
  - GPUManager
  - GPUCallFrame

* Test for constructing all GPU runtime classes

* Comment out calls, constructors haven't been defined

* Clang CUDA source example to later test compiling

Clang cuda example from:
https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4

* Initial nvptx compiler copied from CPU compiler sources

* Define FunctionMap and Instruction for gpu external function

* Rename Compiler -> NVPTXCompiler for gpu compile. Add call to compile for test

* Rename StaticCompiler -> NVPTXStaticCompiler for GPU code gen

* Add nvptx_compiler and nvptx_execution_engine to gpu sources

* Compiling source unit test using hardcoded PTX

* (a+b)*c test for GPU

* WIP Fix compile

* Remove accidentally included file

* Fix compile and LLVM link errors from nvptx_compiler.cpp

* Stub out parts needed for GPU manager

* Test GPU runtime method stubs

* Cleanup

* Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled

* Kill reflexive assertion

* change GPU naming convention to match CPU

* Snake case functions and identifiers in test case

* Change element type to match changes in master

* Make CUDA headers accessible for codegen with GPU transformer

* clang-format

* apply-code-format

04 Jan, 2018 2 commits
- add missing ops to serializer (#351) · 2218cf9f
  Robert Kimball authored Jan 04, 2018
  
- prerequisites for html build during docs development (#349) · fe33af85
  DawnStone authored Jan 04, 2018
```
* updated the sphinx version using pip install in Dockerfile.ngraph_cpp

added a make target to build the docs to the contrib/docker/Makefile

* avoid upgrade pip message during build
```
03 Jan, 2018 1 commit
- bump argon version (#348) · c6bfa697
  Yixing Lao authored Jan 03, 2018
  
02 Jan, 2018 1 commit
- Fix a logic bug introduced by #325 (#347) · 2f0a262e
  Matthew Brookhart authored Jan 02, 2018
  
30 Dec, 2017 2 commits

Forward prop for max pooling (#305) · d901282e

Adam Procter authored Dec 30, 2017

* Definition and type checking for max pool

* Implement kernel, integrate into INTERPRETER, add a few unit tests, make function result type mismatch error message more informative (still need to update tests to reflect that)

* Temporarily delete unit tests to ease merge

* Temporarily delete unit tests to ease merge

* Restore deleted unit tests

* Fix a broken error message check in the unit tests

* Update to handle various TensorViewType-related things going away; add NGVM support

* Add codegen case

* Change various get_blah_shape methods to return const refs, and while we're here, make a similar change where it should have been done in convolution

* Use NDArray for max-pool tests

recreate ops (#325) · 66d06693

varun-intel authored Dec 30, 2017

* recreate ops

* style

* recompute ops

* style

* fix

* recreate ops

* style

* recompute ops

* style

* fix

* some

* more

* style

* remove a line

* const

* style

* NodeMap was using non-standard operator[] behavior.

* Missing include

29 Dec, 2017 2 commits

Get value types out of public API, multi-values from Function (#340) · d092cb91

Scott Cyphers authored Dec 29, 2017

* Function can have multiple results
Remove external use of ValueType, TupleType, Tuple
Remove many external uses of Output and Input

* corresponding CPU backend changes

* Update master changes.

* Remove type arg from Function, add changes.md

* Merge changes.

* Move bodies to .cpp, add brief doc

* Merge CPU changes.

* Remove xla includes from non-xla files

* Remove xla from tests

* First part of xla tuple support

* change fprop_cache to assume multi-output bprop functions

* New wrappers for handling tuples with XLA

* Review comments

* remove old xla files

* fix merge errors

* hand edit models to use multi output instead of tuples

Remove LLVM/Clang dependency in headers (#341) · 7c59ca2e
Yixing Lao authored Dec 29, 2017
```
* remove llvm/clang dependency in headers

* copy elision
```

28 Dec, 2017 6 commits

support build from ngraph repo with argon as external · 1c5abc19
Yixing Lao authored Dec 27, 2017

Add bigger models to performance benchmarks (#342) · 2d2fc8c2
Robert Kimball authored Dec 28, 2017
```
* add larger test models
```
Move header resource to .rodata (#344) · 19a10d79
Jai Menon authored Dec 28, 2017
```
This avoids bloating .data and clears the path
for code model fixes later
```
Rewrite the way constants are emitted in the CPU backend (#332) · 603a7d1a
Robert Kimball authored Dec 28, 2017
```
* wip

* constants as globals

* const emitter rewrite
```

Build and execute TBB flow graphs in the CPU backend (#304) · c2c33748

Jai Menon authored Dec 28, 2017

* CMake: TBB integration placeholder

* CMake: Integrate TBB

* CMake: Indent

* CMake: Rewrite TBB integration

* CMake: More TBB integration changes

* CMake: Install TBB headers and DSOs

* CMake: Don't install the TBB debug DSO

* CMake: Propagate ngraph's configured compiler setting over to MKL-DNN

* CMake: Restore TBB debug DSO installation

* CMake: Add installed headers to search path.
This needs to be cleaned up along with other header search cleanup

* CPU: Build and execute TBB flowgraphs

* CPU: TBB fixes

* CPU: More TBB fixes

* CPU: Allow both TBB and serial codegen for now

* TBB: get_arguments -> get_input_ops

* CPU: Use node methods

* CPU: Add TBB headers in the build directory to the search path

* TBB: Incorporate various changes from master

* CMake: Indentation fix

* CMake: Indentation fix

* CMake: TBB is mandatory so remove additional predicates

* TBB: Add a test

* CMake: Fix linker flags with GCC

Fprop Cache Util Function (#312) · bc63f7bb

Matthew Brookhart authored Dec 28, 2017

* in progress

* working cache_fprop, no tests

* style fix

* all inputs to bprop (except adjoints) are cached from fprop

* fix typos, make sure to check count == 0

* fix code format
