- 08 Feb, 2018 1 commit
-
-
Jennifer Myers authored
-
- 24 Jan, 2018 1 commit
-
-
Tristan Webb authored
* Drwebb/gpu backend dot op (#387)
* GPU Dot prod emitter switch statement
* cuBLAS dot kernel call
* Flush out arg substitution into gpu dot kernel call
* Drwebb/gpu backend dot op (#392)
* Take in CodeWriter into gpu op emitters
* Introduce GPU function gen based on pass functions
* Additional gpu emitter stubs
* link cublas in to unit test and ngraph
* Use static code gen methods for GPU, add new GPU op stubs
* use pass manager to declare functions / cublas Updates
* Prune down gpu_external_function wip
* Switch back to GPU tensor views in GPU backend
* Pass in cublas handle to GPU external function
* cuMalloc memory in gpu tensor view
* Use cuda runtime malloc and free for tensor view management c
* change GPU tensor view init, and use GPU tensor view for GPU call frame
* include headers as system dirs
* GPU tensor printing utility function
* cublasSetPointer to device mode / Fix copyright notification lowercasing
* Passing GPU dot product test using cuBLAS Clean up
* Changes from review
-
- 20 Jan, 2018 1 commit
-
-
Robert Kimball authored
* wip
* wip
* remove get_vector from runtime::TensorView class as it was for unit test only
* cleanup
* move writing vector to runtime::TensorView to the unit test dir
* merge fix
* PR review change
* update from PR comment
* update changes file
-
- 19 Jan, 2018 1 commit
-
-
Tristan Webb authored
* Add mention of blob ref of original file from caffe2
* Mention location of source listing originally from LLVM project
-
- 17 Jan, 2018 1 commit
-
-
Tristan Webb authored
* Initial GPU_ExternalFunction implementation
Other changes:
Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
Initial passing (a+b)*c test
Properly link cuda libraries
Simple GPUTensorView implementation
Initial GPU emitter
GPU codegen initial function gen, no kernels yet
Rename GPU emitter and tensor_view_wrapper to match naming convention
* GPU external function based on BASE
* Fix stray base -> gpu
* TensorViewWrapper -> GPU_TensorViewWrapper
* Copy over emitter from base transformer
* Fix for naming dense layout
* Copy kernel emitters from base -> gpu and strip out kernel_utils
* Add aliases to GPU_TensorViewWrappers
* More fixes for naming descriptor::TensorViews
* Move in call_frame implementation from base -> gpu
* apply code format
* GPU codegen running A+B*C
gpu emitters gpu ctx setup cuda_module kernels
Remove GPU_CF perf counters
Use gpu kernels in external function
Add GPU 1d dot test
Review Changes:
* Remove CPU specific kernel emitting method bodies
* Use copy_data from test/util.cpp, uncomment compileTest
* Use test_utils copy_data function
* Grab function name from pass manager for def, clean up indentation
-
- 05 Jan, 2018 1 commit
-
-
Tristan Webb authored
* Simple boilerplate for GPU runtime files
  - GPUBackend
  - GPU ExternalFunction
  - GPUManager
  - GPUCallFrame
* Test for construction of all GPU runtime classes
* Comment out calls, constructors haven't been defined
* Clang CUDA source example to later test compiling
  Clang cuda example from: https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4
* Initial nvptx compiler copied from CPU compiler sources
* Define FunctionMap and Instruction for gpu external function
* Rename Compiler -> NVPTXCompiler for gpu compile. Add call to compile for test
* Rename StaticCompiler -> NVPTXStaticCompiler for GPU code gen
* Add nvptx_compiler and nvptx_execution_engine to gpu sources
* Compiling source unit test using hardcoded PTX
* (a+b)*c test for GPU
* WIP Fix compile
* Removed accidentally included file
* Fix compile, and LLVM link errors from nvptx_compiler.cpp
* Stub out parts needed for GPU manager
* Test GPU runtime method stubs
* Cleanup
* Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled
* Kill reflexive assertion
* change GPU naming convention to match CPU
* Snake case functions and identifiers in test case
* Change element type to match changes in master
* Make CUDA headers accessible for codegen with GPU transformer
* clang-format
* apply-code-format
-
- 21 Nov, 2017 4 commits
-
-
Tristan Webb authored
Clang cuda example from: https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4
-
Tristan Webb authored
-
Tristan Webb authored
-
Tristan Webb authored
-