  1. 26 Feb, 2018 1 commit
    • Initial support for hybrid transformer (#526) · 7f08b97b
      Yixing Lao authored
      * initial support for hybrid transformer
      
      * add broadcast_vector_rowwise_reversed for hybrid test
      
      * header
      
      * get function placement fix
      
      * conv ref test generator graph node in lambda function
      
      * rename map_parameter_to_source_node
      
      * type change map_parameter_to_source_node
      
      * use interpreter for numerical derivative
      
      * better comments
  2. 21 Feb, 2018 1 commit
  3. 20 Feb, 2018 1 commit
  4. 14 Feb, 2018 3 commits
  5. 13 Feb, 2018 3 commits
  6. 09 Feb, 2018 4 commits
  7. 08 Feb, 2018 2 commits
  8. 07 Feb, 2018 1 commit
  9. 06 Feb, 2018 1 commit
  10. 05 Feb, 2018 1 commit
  11. 02 Feb, 2018 1 commit
    • GPU kernels for reshape, GEMM, EW ADD/Mult, Maximum · 1f6284ff
      Tristan Webb authored
      GPU ew add and mult cuBLAS calls
      
      GPU (A + B) * C with cuBLAS
      
      Additional gemm and gemv calls
      
      cmake updates for cuDNN calls
      
      kernels WIP
      
      params for dot gemm
      
      more kernel WIP
      
      memcpy wrappers
      
      aliased outputs, parameter, constant tensor memcopy
      
      comment cleanup
      
      remove cruft
      
      gpu faster gemm
      
      MNIST WIP
      
      Cleanup
  12. 01 Feb, 2018 1 commit
  13. 30 Jan, 2018 1 commit
    • fuse dot(a,b) + c (#418) · ea29c6e3
      Nick Korovaiko authored
      cblas_gemm working on mlp
      
      rebase & small fixes
      
      enable debug output
      
      support replacing function's outputs
      
      productizing CPUFusion
      
      addressing Bob and Jayaram's feedback
      
      removing json used for simplification tests
      
      adding comments
      
      fixing formatting errors and removing dead code
      
      TODO msg
      
      removing serializer changes
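      The fusion named in the commit title, dot(a,b) + c, can be sketched as a single pass. This is only an illustration, assuming row-major float matrices; the function name `dot_add_fused` is hypothetical, and plain loops stand in for the cblas_gemm call the commit mentions so that the example has no external dependency. In BLAS terms it corresponds to a GEMM with alpha = 1 and beta = 1 whose output buffer is preloaded with c.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: computes dot(a, b) + c in one pass.
// a is m x k, b is k x n, c is m x n, all row-major.
std::vector<float> dot_add_fused(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 const std::vector<float>& c,
                                 std::size_t m,
                                 std::size_t k,
                                 std::size_t n)
{
    // Start from c instead of zeros, which is what GEMM does when beta = 1.
    std::vector<float> out(c);
    for (std::size_t i = 0; i < m; i++)
    {
        for (std::size_t p = 0; p < k; p++)
        {
            for (std::size_t j = 0; j < n; j++)
            {
                out[i * n + j] += a[i * k + p] * b[p * n + j];
            }
        }
    }
    return out;
}
```

      The point of the fusion is that the separate add node, and the temporary holding dot(a,b), both disappear; the accumulation simply begins from c.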
  14. 24 Jan, 2018 1 commit
    • Drwebb/gpu backend dot op (#413) · 94d80ffa
      Tristan Webb authored
      * Drwebb/gpu backend dot op (#387)
      
      * GPU Dot prod emitter switch statement
      
      * cuBLAS dot kernel call
      
      * Flesh out arg substitution into gpu dot kernel call
      
      * Drwebb/gpu backend dot op (#392)
      
      * Take in CodeWriter into gpu op emitters
      
      * Introduce GPU function gen based on pass functions
      
      * Additional gpu emitter stubs
      
      * link cublas in to unit test and ngraph
      
      * Use static code gen methods for GPU, add new GPU op stubs
      
      * use pass manager to declare functions / cublas Updates
      
      * Prune down gpu_external_function wip
      
      * Switch back to GPU tensor views in GPU backend
      
      * Pass in cublas handle to GPU external function
      
      * cuMalloc memory in gpu tensor view
      
      * Use cuda runtime malloc and free for tensor view management
      
      * change GPU tensor view init, and use GPU tensor view for GPU call frame
      
      * include headers as system dirs
      
      * GPU tensor printing utility function
      
      * cublasSetPointer to device mode / Fix copyright notification lowercasing
      
      * Passing GPU dot product test using cuBLAS
      
      Clean up
      
      * Changes from review
  15. 19 Jan, 2018 1 commit
  16. 11 Jan, 2018 2 commits
  17. 09 Jan, 2018 1 commit
  18. 05 Jan, 2018 1 commit
    • Drwebb/gpu runtime boilerplate (#314) · feab44b5
      Tristan Webb authored
      * Simple boilerplate for GPU runtime files
      
        - GPUBackend
        - GPU ExternalFunction
        - GPUManager
        - GPUCallFrame
      
      * Test for constructing all GPU runtime classes
      
      * Comment out calls, constructors haven't been defined
      
      * Clang CUDA source example to later test compiling
      
      Clang cuda example from:
      https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4
      
      * Initial nvptx compiler copied from CPU compiler sources
      
      * Define FunctionMap and Instruction for gpu external function
      
      * Rename Compiler -> NVPTXCompiler for gpu compile. Add call to compile for test
      
      * Rename StaticCompiler -> NVPTXStaticCompiler for GPU code gen
      
      * Add nvptx_compiler and nvptx_execution_engine to gpu sources
      
      * Compiling source unit test using hardcoded PTX
      
      * (a+b)*c test for GPU
      
      * WIP Fix compile
      
      * Removed accidentally included file
      
      * Fix compile, and LLVM link errors from nvptx_compiler.cpp
      
      * Stub out parts needed for GPU manager
      
      * Test GPU runtime method stubs
      
      * Cleanup
      
      * Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
      
      * Kill reflexive assertion
      
      * change GPU naming convention to match CPU
      
      * Snake case functions and identifiers in test case
      
      * Change element type to match changes in master
      
      * Make CUDA headers accessible for codegen with GPU transformer
      
      * clang-format
      
      * apply-code-format
  19. 29 Dec, 2017 1 commit
    • Get value types out of public API, multi-values from Function (#340) · d092cb91
      Scott Cyphers authored
      * Function can have multiple results
      Remove external use of ValueType, TupleType, Tuple
      Remove many external uses of Output and Input
      
      * corresponding CPU backend changes
      
      * Update master changes.
      
      * Remove type arg from Function, add changes.md
      
      * Merge changes.
      
      * Move bodies to .cpp, add brief doc
      
      * Merge CPU changes.
      
      * Remove xla includes from non-xla files
      
      * Remove xla from tests
      
      * First part of xla tuple support
      
      * change fprop_cache to assume multi-output bprop functions
      
      * New wrappers for handling tuples with XLA
      
      * Review comments
      
      * remove old xla files
      
      * fix merge errors
      
      * hand edit models to use multi output instead of tuples
  20. 28 Dec, 2017 1 commit
  21. 21 Dec, 2017 2 commits
  22. 18 Dec, 2017 1 commit
    • Convolution forward prop (#294) · 122db5ff
      Adam Procter authored
      * Test GitHub-JIRA integration, nothing useful in this commit
      
      NGTF-388 #comment Testing JIRA integration
      
      * WIP on convolution
      
      * Type checking for convolution
      
      * Docstrings for convolution
      
      * Add convolution reference kernel; it works on some unit tests copied and pasted from my old branch.
      
      * Bugfix for dilated conv, and improvement to conv test generation
      
      * Remove get_arguments calls from convolution stuff
      
      * Add convolution to CPU; also a few fixes to the test generation stuff
      
      * Add copyright header to convolution ref script
      
      * Move copyright header to the correct place
      
      * A few more tests
      
      * Remove fallback behavior of blanking out the convolution ref file, since we're not generating it from the build system anymore
      
      * Delete stale comment
      
      * Merge stuff for the convolution ref script
      
      * Clean up rebase mess
      
      * Review comments
      
      * Review comment (n_foo -> foo_count)
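      For readers unfamiliar with what a convolution reference kernel with dilation computes, here is a minimal 1-D sketch. It is not the kernel from this commit: the name `conv1d_ref`, the 1-D restriction, the absence of padding, and the cross-correlation (no filter flip) convention are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 1-D reference kernel: "valid" cross-correlation with a
// window stride and a filter dilation. Returns an empty vector when the
// arguments admit no output position.
std::vector<float> conv1d_ref(const std::vector<float>& in,
                              const std::vector<float>& filt,
                              std::size_t stride,
                              std::size_t dilation)
{
    if (filt.empty() || stride == 0 || dilation == 0)
    {
        return {};
    }
    // Number of input elements spanned by the dilated filter.
    std::size_t span = (filt.size() - 1) * dilation + 1;
    if (in.size() < span)
    {
        return {};
    }
    std::size_t out_count = (in.size() - span) / stride + 1;
    std::vector<float> out(out_count, 0.0f);
    for (std::size_t o = 0; o < out_count; o++)
    {
        for (std::size_t f = 0; f < filt.size(); f++)
        {
            // Dilation spreads the filter taps apart in the input.
            out[o] += in[o * stride + f * dilation] * filt[f];
        }
    }
    return out;
}
```

      With input {1, 2, 3, 4, 5} and filter {1, 1}, a dilation of 2 pairs each element with the one two positions later, giving {4, 6, 8}; a stride of 2 with dilation 1 gives {3, 7}.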
  23. 13 Dec, 2017 1 commit
  24. 12 Dec, 2017 1 commit
  25. 05 Dec, 2017 1 commit
    • New Interpreter backend (#287) · 025a1b92
      Robert Kimball authored
      * New Interpreter backend
      
      * PR review comments
      
      * More PR fixes
      
      * oops
      
      * make autodiff tests backend aware
      
      * wip
      
      * wip
      
      * more ops
      
      * wip
      
      * fix merge error
      
      * merge fixes
  26. 04 Dec, 2017 1 commit
    • Finish de-Eigenization (#282) · 7b305e3e
      Adam Procter authored
      * Simpler kernel for broadcast
      
      * Fixed behavior for integer divide-by-zero, added unit tests
      
      * Strided and higher-dimensional slice (just tested to 3D)
      
      * Higher-dimensional sum
      
      * Replace-slice de-Eigenized; NOT TESTED AT HIGHER DIMENSIONS YET
      
      * Correct sum behavior when eliminating zero-length axes; add unit tests; also, add higher-dim unit tests for replace-slice
      
      * Higher-dimensional reduce, 'cause hey, why not?
      
      * Remove BroadcastScalarInstruction
      
      * Adding test for an observed failure at trivial sum on 5-tensors
      
      * De-Eigenized and higher-dimmified concat
      
      * Replace 'auto' in the kernels
      
      * temporary delete to ease merge
      
      * Re-insert tests that were deleted to ease merge
      
      * Refactor view-iteration
      
      * De-Eigenize reshape
      
      * Rework divide kernel to use std::enable_if to distinguish between floating and non-floating types
      
      * Update docs to reflect newly implemented cases in several ops
      
      * Rename parameters to View for more clarity; remove axis_walk_order (it's redundant)
      
      * Formatting
      
      * More terminological rejiggering
      
      * De-Eigenize scalar-tensor product
      
      * De-Eigenize dot
      
      * Update docstrings
      
      * Remove 'implementation status' tables from docstrings
      
      * Change step -> strides everywhere for consistent terminology
      
      * Formatting
      
      * Replace asserts in view.cpp with exceptions
      
      * Fix typo
      
      * Fix incorrect result type in dot1d test (ouch...)
      
      * Add missing support for Float64 to ngvm/external_function
      
      * Add int16 and uint16 (how was this missing?)
      
      * A few more additions relative to the missing element types
      
      * Disable tests that will not pass on CPU; they can still be run with test/unit-test --gtest_also_run_disabled_tests --gtest_filter='DISABLED_NGVM.*'
      
      * Move project_ and inject_ functions to common.[ch]pp, not view.[ch]pp
      
      * Rename View to CoordinateTransform
      
      * Add prefix ++ and += to CoordinateIterator
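      Two items in the commit above, the std::enable_if rework of the divide kernel and the integer divide-by-zero fix, can be illustrated together. This is a minimal sketch, not the project's code: the name `divide` and its pointer-plus-count signature are hypothetical, and throwing std::domain_error on a zero integer divisor is an assumption, since the commit message does not say what the fixed behavior is.

```cpp
#include <cstddef>
#include <stdexcept>
#include <type_traits>

// Floating-point overload: plain element-wise division. On IEEE 754
// platforms a zero divisor produces inf or NaN rather than a trap.
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value>::type
    divide(const T* arg0, const T* arg1, T* out, std::size_t count)
{
    for (std::size_t i = 0; i < count; i++)
    {
        out[i] = arg0[i] / arg1[i];
    }
}

// Non-floating overload: integer division by zero is undefined behavior
// in C++, so check the divisor first (assumed policy: throw).
template <typename T>
typename std::enable_if<!std::is_floating_point<T>::value>::type
    divide(const T* arg0, const T* arg1, T* out, std::size_t count)
{
    for (std::size_t i = 0; i < count; i++)
    {
        if (arg1[i] == 0)
        {
            throw std::domain_error("integer division by zero");
        }
        out[i] = arg0[i] / arg1[i];
    }
}
```

      Because the two enable_if conditions are mutually exclusive, exactly one overload is viable for any element type, and the choice is made at compile time with no runtime branch on the type.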
  27. 30 Nov, 2017 2 commits
  28. 29 Nov, 2017 1 commit
  29. 28 Nov, 2017 1 commit
    • REBASE: graph pattern matcher half I/O half arguments/users (#269) · 3e68842b
      Nick Korovaiko authored
      * Start of pattern matcher
      
      recursive graph matcher, pattern node
      
      add matcher.cpp
      
      add files for matcher, graph_rewrite
      
      add const to on_match_class
      
      fix comp errors
      
      reshuffle pattern matching code across corresponding files
      
      fix comment
      
      run clang-format
      
      graph_rewrite replace_node
      
      getting simple test cases to work
      
      op/pattern.cpp
      
      toward graph_rewrite tests
      
      older matcher API
      
      before clean up tests
      
      before rebase
      
      build breaks
      
      more tests
      
      clean up
      
      more clean-up
      
      more cleanup 2
      
      more clean up 3
      
      clean up 4
      
      clang errors
      
      clang errors2
      
      apply code format
      
      move match_class to matcher
      
      major clean up after moving match_class to matcher.cpp
      
      removing tracing changes
      
      rebased as of 11/8
      
      make matcher use i/o descs to traverse the graph; change replace_io
      
      switching to io tds
      
      graph_rewrite tests fail
      
      all tests pass
      
      formatting
      
      unhandle outputs explicitly for now
      
      reset permissions back to 0644; bad bad windows
      
      fixes after rebase
      
      * fixes
      
      * addressing Scott's feedback