1. 09 Feb, 2018 1 commit
      GPU kernels for reshape, GEMM, EW ADD/Mult, Maximum (#440) · da50410b
      Tristan Webb authored
      * GPU kernels for reshape, GEMM, EW ADD/Mult, Maximum
      
      (A + B) * C test now with cuBLAS
      Additional gemm and gemv calls
      cmake updates for cuDNN calls
      memcpy wrappers in gpu_util
      
      Additional passing tests:
      aliased outputs, parameter, constant tensor memcopy
  2. 08 Feb, 2018 1 commit
  3. 02 Feb, 2018 2 commits
  4. 24 Jan, 2018 1 commit
      Drwebb/gpu backend dot op (#413) · 94d80ffa
      Tristan Webb authored
      * Drwebb/gpu backend dot op (#387)
      
      * GPU Dot prod emitter switch statement
      
      * cuBLAS dot kernel call
      
      * Flesh out arg substitution into gpu dot kernel call
      
      * Drwebb/gpu backend dot op (#392)
      
      * Take in CodeWriter into gpu op emitters
      
      * Introduce GPU function gen based on pass functions
      
      * Additional gpu emitter stubs
      
      * link cublas in to unit test and ngraph
      
      * Use static code gen methods for GPU, add new GPU op stubs
      
      * use pass manager to declare functions / cublas Updates
      
      * Prune down gpu_external_function wip
      
      * Switch back to GPU tensor views in GPU backend
      
      * Pass in cublas handle to GPU external function
      
      * cuMalloc memory in gpu tensor view
      
      * Use CUDA runtime malloc and free for tensor view management
      
      * change GPU tensor view init, and use GPU tensor view for GPU call frame
      
      * include headers as system dirs
      
      * GPU tensor printing utility function
      
      * Set cuBLAS pointer mode (cublasSetPointerMode) to device mode / Fix copyright notice lowercasing
      
      * Passing GPU dot product test using cuBLAS
      
      Clean up
      
      * Changes from review
  5. 20 Jan, 2018 1 commit
  6. 19 Jan, 2018 1 commit
      Drwebb/gpu doc (#386) · 408f3b25
      Tristan Webb authored
      * Add mention of blob ref of original file from caffe2
      
      * Mention location of source listing originally from LLVM project
  7. 17 Jan, 2018 1 commit
      Drwebb/gpu external function (#367) · c5549682
      Tristan Webb authored
      * Initial GPU_ExternalFunction implementation
      
      Other changes:
      
      Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
      
      Initial passing (a+b)*c test
      
      Properly link cuda libraries
      
      Simple GPUTensorView implementation
      
      Initial GPU emitter
      
      GPU codegen initial function gen, no kernels yet
      
      Rename GPU emitter and tensor_view_wrapper to match naming convention
      
      * GPU external function based on BASE
      
      * Fix stray base -> gpu
      
      * TensorViewWrapper -> GPU_TensorViewWrapper
      
      * Copy over emitter from base transformer
      
      * Fix for naming dense layout
      
      * Copy kernel emitters from base -> gpu and strip out kernel_utils
      
      * Add aliases to GPU_TensorViewWrappers
      
      * More fixes for naming descriptor::TensorViews
      
      * Move in call_frame implementation from base -> gpu
      
      * apply code format
      
      * GPU codegen running A+B*C
      
      gpu emitters
      gpu ctx setup cuda_module kernels
      Remove GPU_CF perf counters
      Use gpu kernels in external function
      Add GPU 1d dot test
      
      Review Changes:
      * Remove CPU specific kernel emitting method bodies
      
      * Use copy_data from test/util.cpp, uncomment compileTest
      
      * Use test_utils copy_data function
      
      * Grab function name from pass manager for def, clean up indentation
  8. 05 Jan, 2018 1 commit
      Drwebb/gpu runtime boilerplate (#314) · feab44b5
      Tristan Webb authored
      * Simple boilerplate for GPU runtime files
      
        - GPUBackend
        - GPU ExternalFunction
        - GPUManager
        - GPUCallFrame
      
      * Test for constructing all GPU runtime classes
      
      * Comment out calls, constructors haven't been defined
      
      * Clang CUDA source example to later test compiling
      
      Clang cuda example from:
      https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4
      
      * Initial nvptx compiler copied from CPU compiler sources
      
      * Define FunctionMap and Instruction for gpu external function
      
      * Rename Compiler -> NVPTXCompiler for gpu compile. Add call to compile for test
      
      * Rename StaticCompiler -> NVPTXStaticCompiler for GPU code gen
      
      * Add nvptx_compiler and nvptx_execution_engine to gpu sources
      
      * Compiling source unit test using hardcoded PTX
      
      * (a+b)*c test for GPU
      
      * WIP Fix compile
      
      * Removed accidentally included file
      
      * Fix compile and LLVM link errors from nvptx_compiler.cpp
      
      * Stub out parts needed for GPU manager
      
      * Test GPU runtime method stubs
      
      * Cleanup
      
      * Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
      
      * Kill reflexive assertion
      
      * change GPU naming convention to match CPU
      
      * Snake case functions and identifiers in test case
      
      * Change element type to match changes in master
      
      * Make CUDA headers accessible for codegen with GPU transformer
      
      * clang-format
      
      * apply-code-format
  9. 21 Nov, 2017 4 commits