- 17 Jan, 2018 4 commits
-
-
varun-intel authored
* remove a node from users
* style
-
Robert Kimball authored
* add mxnet seq2seq forward and backward
* add benchmarks for seq2seq forward and backward
-
Matthew Brookhart authored
* Numerically stable sum so we can pass mxnet unit tests
* Add a small initial residual
-
Tristan Webb authored
* Initial GPU_ExternalFunction implementation
  Other changes:
  Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
  Initial passing (a+b)*c test
  Properly link cuda libraries
  Simple GPUTensorView implementation
  Initial GPU emitter
  GPU codegen initial function gen, no kernels yet
  Rename GPU emitter and tensor_view_wrapper to match naming convention
* GPU external function based on BASE
* Fix stray base -> gpu
* TensorViewWrapper -> GPU_TensorViewWrapper
* Copy over emitter from base transformer
* Fix for naming dense layout
* Copy kernel emitters from base -> gpu and strip out kernel_utils
* Add aliases to GPU_TensorViewWrappers
* More fixes for naming descriptor::TensorViews
* Move in call_frame implementation from base -> gpu
* apply code format
* GPU codegen running A+B*C
  gpu emitters gpu ctx setup cuda_module kernels
  Remove GPU_CF perf counters
  Use gpu kernels in external function
  Add GPU 1d dot test
Review Changes:
* Remove CPU specific kernel emitting method bodies
* Use copy_data from test/util.cpp, uncomment compileTest
* Use test_utils copy_data function
* Grab function name from pass manager for def, clean up indentation
-
- 16 Jan, 2018 3 commits
-
-
Matthew Brookhart authored
* Add a few more openmp ops
* fix a warning
* fix merge error
-
Adam Procter authored
-
Yixing Lao authored
* bump argon version
* ask argon to install itself
* bump version again
* argon lib dir
* installs argon to ngraph_dist
* fix path
* upgrade argon version
-
- 14 Jan, 2018 2 commits
-
-
Robert Kimball authored
Add support for reinitializing the compiler if a compile fails, allowing subsequent compiles to succeed
-
Robert Kimball authored
-
- 12 Jan, 2018 1 commit
-
-
Adam Procter authored
Sub-PR: image dilation tests (#362) via @adstraw
-
- 11 Jan, 2018 2 commits
-
-
Robert Kimball authored
* add interpreter nan check option
* add unit test
-
Christian Convey authored
-
- 10 Jan, 2018 4 commits
-
-
Nick Korovaiko authored
* the first stab at pattern for sum test refactoring, debug msg clean up, formatting fixes removing v1 and cleaning up v2 + formatting rollback the changes in reduce_ops rename v2 -> sum_pred remove unused funcs switch to new c-tors remove TensorViewType removing an assert fix a docstring to match a c-tor
* fixes after rebase
-
Adam Procter authored
-
Robert Kimball authored
-
Matthew Brookhart authored
* speed up reduceslice with kernel emitter
* const-ify and fix a clang warning
* add elementwise ops, slice to for loops
* add broadcast codegen
* add Exp
* fix bugs introduced in eigen kernels
* fix another introduced bug in Eigen
* Fix an Atomic Bug with Sum, do some cleanup
* unit tests pass
* Add Reshape Op, passes Tests
* rewrite sum to correctly handle multi-threading
* Code Cleanup
* add some extra unary ops
* Address review comments
* fix an error in the review comment refactor
* Add Power op
* Add (most) of the Logic Ops
* Make Concat default to OpenMP kernel
* fix n-D reshape issue
-
- 09 Jan, 2018 3 commits
-
-
Nick Korovaiko authored
* remove caching of ordered_ops
* graph_util logging msgs
* small cleanup
* remove files for the TopologicalSort pass
* remove NGRAPH_DEBUG from graph_util.hpp
-
Christian Convey authored
-
Robert Kimball authored
* much faster compile time
* Remove all variables and just directly access inputs, output, and temps.
* compare layouts when checking if two ops are equal
* make performance counters available to all backends
-
- 08 Jan, 2018 2 commits
-
-
Adam Procter authored
-
Robert Kimball authored
-
- 06 Jan, 2018 1 commit
-
-
Matthew Brookhart authored
-
- 05 Jan, 2018 4 commits
-
-
Adam Procter authored
-
Robert Kimball authored
* general cleanup
* remove runtime::Value
* more cleanup
* more cleanup
-
Robert Kimball authored
* cleanup
* remove arg_index
* remove argno from Input
* uncleanup
-
Tristan Webb authored
* Simple boilerplate for GPU runtime files
  - GPUBackend
  - GPU ExternalFunction
  - GPUManager
  - GPUCallFrame
* Test for construction all GPU runtime classes
* Comment out calls, constructors haven't been defined
* Clang CUDA source example to later test compiling
  Clang cuda example from: https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4
* Initial nvptx compiler copied from CPU compiler sources
* Define FunctionMap and Instruction for gpu external function
* Rename Compiler -> NVPTXCompiler for gpu compile. Add call to compile for test
* Rename StaticCompiler -> NVPTXStaticCompiler for GPU code gen
* Add nvptx_compiler and nvptx_execution_engine to gpu sources
* Compiling source unit test using hardcoded PTX
* (a+b)*c test for GPU
* WIP Fix compile
* rmed accidentally included file
* Fix compile, and LLVM link errors from nvptx_compiler.cpp
* Stub out parts needed for GPU manager
* Test GPU runtime method stubs
* Cleanup
* Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
* Kill reflexive assertion
* change GPU naming convention to match CPU
* Snake case functions and identifiers in test case
* Change element type to match changes in master
* Make CUDA headers accessible for codegen with GPU transformer
* clang-format
* apply-code-format
-
- 04 Jan, 2018 2 commits
-
-
Robert Kimball authored
-
DawnStone authored
* updated the sphinx version using pip install in Dockerfile.ngraph_cpp
  added a make target to build the docs to the contrib/docker/Makefile
* avoid upgrade pip message during build
-
- 03 Jan, 2018 1 commit
-
-
Yixing Lao authored
-
- 02 Jan, 2018 1 commit
-
-
Matthew Brookhart authored
-
- 30 Dec, 2017 2 commits
-
-
Adam Procter authored
* Definition and type checking for max pool
* Implement kernel, integrate into INTERPRETER, add a few unit tests, make function result type mismatch error message more informative (still need to update tests to reflect that)
* Temporarily delete unit tests to ease merge
* Temporarily delete unit tests to ease merge
* Restore deleted unit tests
* Fix a broken error message check in the unit tests
* Update to handle various TensorViewType-related things going away; add NGVM support
* Add codegen case
* Change various get_blah_shape methods to return const refs, and while we're here, make a similar change where it should have been done in convolution
* Use NDArray for max-pool tests
-
varun-intel authored
* recreate ops
* style
* recompute ops
* style
* fix
* recreate ops
* style
* recompute ops
* style
* fix
* some
* more
* style
* remove a line
* const
* style
* NodeMap was using non-standard operator[] behavior.
* Missing include
-
- 29 Dec, 2017 2 commits
-
-
Scott Cyphers authored
* Function can have multiple results
  Remove external use of ValueType, TupleType, Tuple
  Remove many external uses of Output and Input
* corresponding CPU backend changes
* Update master changes.
* Remove type arg from Function, add changes.md
* Merge changes.
* Move bodies to .cpp, add brief doc
* Merge CPU changes.
* Remove xla includes from non-xla files
* Remove xla from tests
* First part of xla tuple support
* change fprop_cache to assume multi-output bprop functions
* New wrappers for handling tuples with XLA
* Review comments
* remove old xla files
* fix merge errors
* hand edit models to use multi output instead of tuples
-
Yixing Lao authored
* remove llvm/clang dependency in headers
* copy elision
-
- 28 Dec, 2017 6 commits
-
-
Yixing Lao authored
-
Robert Kimball authored
* add larger test models
-
Jai Menon authored
This avoids bloating .data and clears the path for code model fixes later
-
Robert Kimball authored
* wip
* constants as globals
* const emitter rewrite
-
Jai Menon authored
* CMake: TBB integration placeholder
* CMake: Integrate TBB
* CMake: Indent
* CMake: Rewrite TBB integration
* CMake: More TBB integration changes
* CMake: Install TBB headers and DSOs
* CMake: Don't install the TBB debug DSO
* CMake: Propagate ngraph's configured compiler setting over to MKL-DNN
* CMake: Restore TBB debug DSO installation
* CMake: Add installed headers to search path. This needs to be cleaned up along with other header search cleanup
* CPU: Build and execute TBB flowgraphs
* CPU: TBB fixes
* CPU: More TBB fixes
* CPU: Allow both TBB and serial codegen for now
* TBB: get_arguments -> get_input_ops
* CPU: Use node methods
* CPU: Add TBB headers in the build directory to the search path
* TBB: Incorporate various changes from master
* CMake: Indentation fix
* CMake: Indentation fix
* CMake: TBB is mandatory so remove additional predicates
* TBB: Add a test
* CMake: Fix linker flags with GCC
-
Matthew Brookhart authored
* in progress
* working cache_fprop, no tests
* style fix
* all inputs to bprop (except adjoints) are cached from fprop
* fix typos, make sure to check count == 0
* fix code format
-