1. 24 Jan, 2018 3 commits
    • Drwebb/gpu backend dot op (#413) · 94d80ffa
      Tristan Webb authored
      * Drwebb/gpu backend dot op (#387)
      
      * GPU Dot prod emitter switch statement
      
      * cuBLAS dot kernel call
      
      * Flesh out arg substitution into gpu dot kernel call
      
      * Drwebb/gpu backend dot op (#392)
      
      * Take in CodeWriter into gpu op emitters
      
      * Introduce GPU function gen based on pass functions
      
      * Additional gpu emitter stubs
      
      * link cublas in to unit test and ngraph
      
      * Use static code gen methods for GPU, add new GPU op stubs
      
      * use pass manager to declare functions / cublas Updates
      
      * Prune down gpu_external_function wip
      
      * Switch back to GPU tensor views in GPU backend
      
      * Pass in cublas handle to GPU external function
      
      * cuMalloc memory in gpu tensor view
      
      * Use cuda runtime malloc and free for tensor view management
      
      * change GPU tensor view init, and use GPU tensor view for GPU call frame
      
      * include headers as system dirs
      
      * GPU tensor printing utility function
      
      * cublasSetPointer to device mode / Fix copyright notification lowercasing
      
      * Passing GPU dot product test using cuBLAS
      
      Clean up
      
      * Changes from review
    • Adam Procter's avatar
      2b0a5489
    • Remove TupleType, ValueType (#411) · d87b0065
      Scott Cyphers authored
      * Remove TupleType, ValueType
      
      * Fix compile error.
  2. 23 Jan, 2018 1 commit
    • convolution backprop (#404) · 72a2ce72
      adstraw authored
      * fix convolution reference script
      
      * convolution backprop
      
      * cleanup
      
      * fix build warnings
      
      * Missing include
      
      * fix build warning part 2
      
      * move numeric_compare to its own header
      code review feedback
      
      * fix build warnings 3
      
      * fix build warnings 4
      
      * clang-format
      
      * cast to avoid implicit cast warning
  3. 20 Jan, 2018 3 commits
  4. 19 Jan, 2018 5 commits
    • Negative convolution padding (#396) · c5144d48
      Adam Procter authored
    • Generalized constant-padding op (#383) · 68ef3faa
      Adam Procter authored
    • Add flag to enable memory sanitizer (#393) · 0f836183
      Robert Kimball authored
      * cleanup in-memory header files
      
      * add switch to enable memory sanitizer (works like valgrind)
      
      * removed header file cleanup as it was causing a segfault on program termination
    • Drwebb/gpu doc (#386) · 408f3b25
      Tristan Webb authored
      * Add mention of blob ref of original file from caffe2
      
      * Mention location of source listing originally from LLVM project
    • Forward prop for average pooling (#380) · 0931b83b
      Adam Procter authored
      * Average pool type checking and kernel; type checking tests
      
      * Fix and enable average-pool tests
      
      * Docstring fix
      
      * Extend AvgPool op type checking to support padding
      
      * Untested code for padded avg-pool
      
      * Unit tests for padded avg-pool
      
      * Add CPU implementation
      
      * Temp delete
      
      * Docstring fix
      
      * Docstring fix
      
      * Add tests mixing padding and stride
      
      * Temporary cut to ease merge
      
      * Restore temporary cut for merge
      
      * Empty commit to try to force CI to wake up
  5. 18 Jan, 2018 3 commits
  6. 17 Jan, 2018 3 commits
    • Add mxnet seq2seq serialized model for benchmarking (#385) · 5ad1de22
      Robert Kimball authored
      * add mxnet seq2seq forward and backward
      
      * add benchmarks for seq2seq forward and backward
    • Numerically stable sum so we can pass mxnet unit tests (#381) · b6c98de1
      Matthew Brookhart authored
      * Numerically stable sum so we can pass mxnet unit tests
      
      * Add a small initial residual
    • Drwebb/gpu external function (#367) · c5549682
      Tristan Webb authored
      * Initial GPU_ExternalFunction implementation
      
      Other changes:
      
      Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
      
      Initial passing (a+b)*c test
      
      Properly link cuda libraries
      
      Simple GPUTensorView implementation
      
      Initial GPU emitter
      
      GPU codegen initial function gen, no kernels yet
      
      Rename GPU emitter and tensor_view_wrapper to match naming convention
      
      * GPU external function based on BASE
      
      * Fix stray base -> gpu
      
      * TensorViewWrapper -> GPU_TensorViewWrapper
      
      * Copy over emitter from base transformer
      
      * Fix for naming dense layout
      
      * Copy kernel emitters from base -> gpu and strip out kernel_utils
      
      * Add aliases to GPU_TensorViewWrappers
      
      * More fixes for naming descriptor::TensorViews
      
      * Move in call_frame implementation from base -> gpu
      
      * apply code format
      
      * GPU codegen running A+B*C
      
      gpu emitters
      gpu ctx setup cuda_module kernels
      Remove GPU_CF perf counters
      Use gpu kernels in external function
      Add GPU 1d dot test
      
      Review Changes:
      * Remove CPU specific kernel emitting method bodies
      
      * Use copy_data from test/util.cpp, uncomment compileTest
      
      * Use test_utils copy_data function
      
      * Grab function name from pass manager for def, clean up indentation
  7. 16 Jan, 2018 1 commit
  8. 12 Jan, 2018 1 commit
  9. 11 Jan, 2018 2 commits
  10. 10 Jan, 2018 3 commits
    • Pattern matching for sum (#293) · 4345e39d
      Nick Korovaiko authored
      * the first stab at pattern for sum
      
      test refactoring, debug msg clean up, formatting fixes
      
      removing v1 and cleaning up v2 + formatting
      
      rollback the changes in reduce_ops
      
      rename v2 -> sum_pred
      
      remove unused funcs
      
      switch to new c-tors
      
      remove TensorViewType
      
      removing an assert
      
      fix a docstring to match a c-tor
      
      * fixes after rebase
    • Adam Procter's avatar
      c5ffe8e9
    • Switch from Eigen to OpenMP for loops for DS2 kernels (#345) · 7df687c1
      Matthew Brookhart authored
      * speed up reduceslice with kernel emitter
      
      * const-ify and fix a clang warning
      
      * add elementwise ops, slice to for loops
      
      * add broadcast codegen
      
      * add Exp
      
      * fix bugs introduced in eigen kernels
      
      * fix another introduced bug in Eigen
      
      * Fix an Atomic Bug with Sum, do some cleanup
      
      * unit tests pass
      
      * Add Reshape Op, passes Tests
      
      * rewrite sum to correctly handle multi-threading
      
      * Code Cleanup
      
      * add some extra unary ops
      
      * Address review comments
      
      * fix an error in the review comment refactor
      
      * Add Power op
      
      * Add (most) of the Logic Ops
      
      * Make Concat default to OpenMP kernel
      
      * fix n-D reshape issue
  11. 09 Jan, 2018 2 commits
  12. 08 Jan, 2018 1 commit
  13. 05 Jan, 2018 3 commits
    • Zero padding for convolution (#352) · 8c4ae5ea
      Adam Procter authored
    • Remove descriptor::Value and runtime::Value (#355) · 06f9efd9
      Robert Kimball authored
      * general cleanup
      
      * remove runtime::Value
      
      * more cleanup
      
      * more cleanup
    • Drwebb/gpu runtime boilerplate (#314) · feab44b5
      Tristan Webb authored
      * Simple boilerplate for GPU runtime files
      
        - GPUBackend
        - GPU ExternalFunction
        - GPUManager
        - GPUCallFrame
      
      * Test for constructing all GPU runtime classes
      
      * Comment out calls, constructors haven't been defined
      
      * Clang CUDA source example to later test compiling
      
      Clang cuda example from:
      https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4
      
      * Initial nvptx compiler copied from CPU compiler sources
      
      * Define FunctionMap and Instruction for gpu external function
      
      * Rename Compiler -> NVPTXCompiler for gpu compile. Add call to compile for test
      
      * Rename StaticCompiler -> NVPTXStaticCompiler for GPU code gen
      
      * Add nvptx_compiler and nvptx_execution_engine to gpu sources
      
      * Compiling source unit test using hardcoded PTX
      
      * (a+b)*c test for GPU
      
      * WIP Fix compile
      
      * Removed accidentally included file
      
      * Fix compile and LLVM link errors from nvptx_compiler.cpp
      
      * Stub out parts needed for GPU manager
      
      * Test GPU runtime method stubs
      
      * Cleanup
      
      * Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
      
      * Kill reflexive assertion
      
      * change GPU naming convention to match CPU
      
      * Snake case functions and identifiers in test case
      
      * Change element type to match changes in master
      
      * Make CUDA headers accessible for codegen with GPU transformer
      
      * clang-format
      
      * apply-code-format
  14. 30 Dec, 2017 2 commits
    • Forward prop for max pooling (#305) · d901282e
      Adam Procter authored
      * Definition and type checking for max pool
      
      * Implement kernel, integrate into INTERPRETER, add a few unit tests, make function result type mismatch error message more informative (still need to update tests to reflect that)
      
      * Temporarily delete unit tests to ease merge
      
      * Temporarily delete unit tests to ease merge
      
      * Restore deleted unit tests
      
      * Fix a broken error message check in the unit tests
      
      * Update to handle various TensorViewType-related things going away; add NGVM support
      
      * Add codegen case
      
      * Change various get_blah_shape methods to return const refs, and while we're here, make a similar change where it should have been done in convolution
      
      * Use NDArray for max-pool tests
    • recreate ops (#325) · 66d06693
      varun-intel authored
      * recreate ops
      
      * style
      
      * recompute ops
      
      * style
      
      * fix
      
      * recreate ops
      
      * style
      
      * recompute ops
      
      * style
      
      * fix
      
      * some
      
      * more
      
      * style
      
      * remove a line
      
      * const
      
      * style
      
      * NodeMap was using non-standard operator[] behavior.
      
      * Missing include
  15. 29 Dec, 2017 1 commit
    • Get value types out of public API, multi-values from Function (#340) · d092cb91
      Scott Cyphers authored
      * Function can have multiple results
      Remove external use of ValueType, TupleType, Tuple
      Remove many external uses of Output and Input
      
      * corresponding CPU backend changes
      
      * Update master changes.
      
      * Remove type arg from Function, add changes.md
      
      * Merge changes.
      
      * Move bodies to .cpp, add brief doc
      
      * Merge CPU changes.
      
      * Remove xla includes from non-xla files
      
      * Remove xla from tests
      
      * First part of xla tuple support
      
      * change fprop_cache to assume multi-output bprop functions
      
      * New wrappers for handling tuples with XLA
      
      * Review comments
      
      * remove old xla files
      
      * fix merge errors
      
      * hand edit models to use multi output instead of tuples
  16. 28 Dec, 2017 4 commits
    • Yixing Lao's avatar
      1c5abc19
    • Add bigger models to performance benchmarks (#342) · 2d2fc8c2
      Robert Kimball authored
      * add larger test models
    • Build and execute TBB flow graphs in the CPU backend (#304) · c2c33748
      Jai Menon authored
      * CMake: TBB integration placeholder
      
      * CMake: Integrate TBB
      
      * CMake: Indent
      
      * CMake: Rewrite TBB integration
      
      * CMake: More TBB integration changes
      
      * CMake: Install TBB headers and DSOs
      
      * CMake: Don't install the TBB debug DSO
      
      * CMake: Propagate ngraph's configured compiler setting over to MKL-DNN
      
      * CMake: Restore TBB debug DSO installation
      
      * CMake: Add installed headers to search path.
      This needs to be cleaned up along with other header search cleanup
      
      * CPU: Build and execute TBB flowgraphs
      
      * CPU: TBB fixes
      
      * CPU: More TBB fixes
      
      * CPU: Allow both TBB and serial codegen for now
      
      * TBB: get_arguments -> get_input_ops
      
      * CPU: Use node methods
      
      * CPU: Add TBB headers in the build directory to the search path
      
      * TBB: Incorporate various changes from master
      
      * CMake: Indentation fix
      
      * CMake: Indentation fix
      
      * CMake: TBB is mandatory so remove additional predicates
      
      * TBB: Add a test
      
      * CMake: Fix linker flags with GCC
    • Fprop Cache Util Function (#312) · bc63f7bb
      Matthew Brookhart authored
      * in progress
      
      * working cache_fprop, no tests
      
      * style fix
      
      * all inputs to bprop (except adjoints) are cached from fprop
      
      * fix typos, make sure to check count == 0
      
      * fix code format
  17. 27 Dec, 2017 2 commits