- 10 Jan, 2018 3 commits
-
-
Adam Procter authored
-
Robert Kimball authored
-
Matthew Brookhart authored
* speed up reduceslice with kernel emitter
* const-ify and fix a clang warning
* add elementwise ops, slice to for loops
* add broadcast codegen
* add Exp
* fix bugs introduced in eigen kernels
* fix another introduced bug in Eigen
* Fix an Atomic Bug with Sum, do some cleanup
* unit tests pass
* Add Reshape Op, passes Tests
* rewrite sum to correctly handle multi-threading
* Code Cleanup
* add some extra unary ops
* Address review comments
* fix an error in the review comment refactor
* Add Power op
* Add (most) of the Logic Ops
* Make Concat default to OpenMP kernel
* fix n-D reshape issue
-
- 09 Jan, 2018 3 commits
-
-
Nick Korovaiko authored
* remove caching of ordered_ops * graph_util logging msgs * small cleanup * remove files for the TopologicalSort pass * remove NGRAPH_DEBUG from graph_util.hpp
-
Christian Convey authored
-
Robert Kimball authored
* much faster compile time * Remove all variables and just directly access inputs, output, and temps. * compare layouts when checking if two ops are equal * make performance counters available to all backends
-
- 08 Jan, 2018 2 commits
-
-
Adam Procter authored
-
Robert Kimball authored
-
- 06 Jan, 2018 1 commit
-
-
Matthew Brookhart authored
-
- 05 Jan, 2018 4 commits
-
-
Adam Procter authored
-
Robert Kimball authored
* general cleanup * remove runtime::Value * more cleanup * more cleanup
-
Robert Kimball authored
* cleanup * remove arg_index * remove argno from Input * uncleanup
-
Tristan Webb authored
* Simple boilerplate for GPU runtime files
  - GPUBackend
  - GPU ExternalFunction
  - GPUManager
  - GPUCallFrame
* Test for construction of all GPU runtime classes
* Comment out calls, constructors haven't been defined
* Clang CUDA source example to later test compiling. Clang CUDA example from: https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4
* Initial nvptx compiler copied from CPU compiler sources
* Define FunctionMap and Instruction for gpu external function
* Rename Compiler -> NVPTXCompiler for gpu compile. Add call to compile for test
* Rename StaticCompiler -> NVPTXStaticCompiler for GPU code gen
* Add nvptx_compiler and nvptx_execution_engine to gpu sources
* Compiling source unit test using hardcoded PTX
* (a+b)*c test for GPU
* WIP Fix compile
* Remove accidentally included file
* Fix compile, and LLVM link errors from nvptx_compiler.cpp
* Stub out parts needed for GPU manager
* Test GPU runtime method stubs
* Cleanup
* Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
* Kill reflexive assertion
* change GPU naming convention to match CPU
* Snake case functions and identifiers in test case
* Change element type to match changes in master
* Make CUDA headers accessible for codegen with GPU transformer
* clang-format
* apply-code-format
-
- 04 Jan, 2018 2 commits
-
-
Robert Kimball authored
-
DawnStone authored
* updated the sphinx version using pip install in Dockerfile.ngraph_cpp; added a make target to build the docs to contrib/docker/Makefile * avoid upgrade pip message during build
-
- 03 Jan, 2018 1 commit
-
-
Yixing Lao authored
-
- 02 Jan, 2018 1 commit
-
-
Matthew Brookhart authored
-
- 30 Dec, 2017 2 commits
-
-
Adam Procter authored
* Definition and type checking for max pool
* Implement kernel, integrate into INTERPRETER, add a few unit tests, make function result type mismatch error message more informative (still need to update tests to reflect that)
* Temporarily delete unit tests to ease merge
* Temporarily delete unit tests to ease merge
* Restore deleted unit tests
* Fix a broken error message check in the unit tests
* Update to handle various TensorViewType-related things going away; add NGVM support
* Add codegen case
* Change various get_blah_shape methods to return const refs, and while we're here, make a similar change where it should have been done in convolution
* Use NDArray for max-pool tests
-
varun-intel authored
* recreate ops * style * recompute ops * style * fix * recreate ops * style * recompute ops * style * fix * some * more * style * remove a line * const * style * NodeMap was using non-standard operator[] behavior. * Missing include
-
- 29 Dec, 2017 2 commits
-
-
Scott Cyphers authored
* Function can have multiple results. Remove external use of ValueType, TupleType, Tuple. Remove many external uses of Output and Input.
* corresponding CPU backend changes
* Update master changes.
* Remove type arg from Function, add changes.md
* Merge changes.
* Move bodies to .cpp, add brief doc
* Merge CPU changes.
* Remove xla includes from non-xla files
* Remove xla from tests
* First part of xla tuple support
* change fprop_cache to assume multi-output bprop functions
* New wrappers for handling tuples with XLA
* Review comments
* remove old xla files
* fix merge errors
* hand edit models to use multi output instead of tuples
-
Yixing Lao authored
* remove llvm/clang dependency in headers * copy elision
-
- 28 Dec, 2017 6 commits
-
-
Yixing Lao authored
-
Robert Kimball authored
* add larger test models
-
Jai Menon authored
This avoids bloating .data and clears the path for code model fixes later
-
Robert Kimball authored
* wip * constants as globals * const emitter rewrite
-
Jai Menon authored
* CMake: TBB integration placeholder
* CMake: Integrate TBB
* CMake: Indent
* CMake: Rewrite TBB integration
* CMake: More TBB integration changes
* CMake: Install TBB headers and DSOs
* CMake: Don't install the TBB debug DSO
* CMake: Propagate ngraph's configured compiler setting over to MKL-DNN
* CMake: Restore TBB debug DSO installation
* CMake: Add installed headers to search path. This needs to be cleaned up along with other header search cleanup
* CPU: Build and execute TBB flowgraphs
* CPU: TBB fixes
* CPU: More TBB fixes
* CPU: Allow both TBB and serial codegen for now
* TBB: get_arguments -> get_input_ops
* CPU: Use node methods
* CPU: Add TBB headers in the build directory to the search path
* TBB: Incorporate various changes from master
* CMake: Indentation fix
* CMake: Indentation fix
* CMake: TBB is mandatory so remove additional predicates
* TBB: Add a test
* CMake: Fix linker flags with GCC
-
Matthew Brookhart authored
* in progress * working cache_fprop, no tests * style fix * all inputs to bprop (except adjoints) are cached from fprop * fix typos, make sure to check count == 0 * fix code format
-
- 27 Dec, 2017 5 commits
-
-
Robert Kimball authored
* cleanup * cleanup * expand * wip * undo
-
Robert Kimball authored
-
Robert Kimball authored
* enable -O3 optimization * add flags to support release/debug builds
-
Robert Kimball authored
* nan unit test * fix NAN issue * add INFINITY support
-
Christian Convey authored
This reverts commit 39383029. It looks like the commit actually suppressed parallel makes of MKL-DNN, at least in the case where ngraph itself was being built with parallel make. It also introduced problems with make jobserver warnings.
-
- 26 Dec, 2017 1 commit
-
-
Robert Kimball authored
* add resource file generator and store all headers used by codegen in memory.
-
- 22 Dec, 2017 2 commits
-
-
Robert Kimball authored
* cleanup * cleanup * update serializer to emit small, simple element_type. backwards compatible. * allow for selecting indenting when serializing
-
Jai Menon authored
-
- 21 Dec, 2017 5 commits
-
-
Robert Kimball authored
* remove ngvm * remove NGVM from cmake
-
Robert Kimball authored
* fix autodiff on non-NGVM backends. NGVM initializes all tensors to zero on allocation while the other backends do not. Had to initialize vector before use. * change autodiff tests to use INTERPRETER
-
Robert Kimball authored
set code model back to default as medium is causing the CPU.divide_by_zero_int32 unit test to segfault when it throws an exception from the generated code (#328)
-
Yixing Lao authored
-
Jai Menon authored
* CPU: Optimize Eigen based rowwise vector broadcast * CPU: Remove the need for transposing the broadcast vector * CPU: Optimize to a replicate expression * CPU: Change code model to medium and compile for the host CPU instead of hardcoding BDW
-