Commits · c428a97b103673e047de633ffc8cb1ef0dd672de · submodule / ngraph

09 Feb, 2018 1 commit

GPU kernels for reshape, GEMM, EW ADD/Mult, Maximum (#440) · da50410b

Tristan Webb authored 6 years ago

* GPU kernels for reshape, GEMM, EW ADD/Mult, Maximum

(A + B) * C test now with cuBLAS
Additional gemm and gemv calls
cmake updates for cuDNN calls
memcpy wrappers in gpu_util

Additional passing tests:
aliased outputs, parameter, constant tensor memcopy

da50410b

08 Feb, 2018 1 commit

Add LICENSE and switch to Intel Copyright (#466) · d9a9d2d7

Jennifer Myers authored 6 years ago

d9a9d2d7

02 Feb, 2018 2 commits

Remove kernel tests in cudnn test file (duplicated in backend_tests.in) · 8bf6b3ff
Tristan Webb authored 6 years ago

8bf6b3ff

GPU kernels for reshape, GEMM, EW ADD/Mult, Maximum · 1f6284ff

Tristan Webb authored 6 years ago

GPU ew add and mult cuBLAS calls

GPU (A + B) * C with cuBLAS

Additional gemm and gemv calls

cmake updates for cuDNN calls

kernels WIP

params for dot gemm

more kernel WIP

memcpy wrappers

aliased outputs, parameter, constant tensor memcopy

comment cleanup

remove cruft

gpu faster gemm

MNIST WIP

Cleanup

1f6284ff

24 Jan, 2018 1 commit

Drwebb/gpu backend dot op (#413) · 94d80ffa

Tristan Webb authored 6 years ago

* Drwebb/gpu backend dot op (#387)

* GPU Dot prod emitter switch statement

* cuBLAS dot kernel call

* Flesh out arg substitution into gpu dot kernel call

* Drwebb/gpu backend dot op (#392)

* Take in CodeWriter into gpu op emitters

* Introduce GPU function gen based on pass functions

* Additional gpu emitter stubs

* link cublas in to unit test and ngraph

* Use static code gen methods for GPU, add new GPU op stubs

* use pass manager to declare functions / cublas Updates

* Prune down gpu_external_function wip

* Switch back to GPU tensor views in GPU backend

* Pass in cublas handle to GPU external function

* cuMalloc memory in gpu tensor view

* Use cuda runtime malloc and free for tensor view management

* change GPU tensor view init, and use GPU tensor view for GPU call frame

* include headers as system dirs

* GPU tensor printing utility function

* cublasSetPointer to device mode / Fix copyright notification lowercasing

* Passing GPU dot product test using cuBLAS

Clean up

* Changes from review

94d80ffa

20 Jan, 2018 1 commit

Move std::vector read/write from runtime::TensorView to unit test directory (#397) · bd01bf2c

Robert Kimball authored 7 years ago

* wip

* wip

* remove get_vector from runtime::TensorView class as it was for unit test only

* cleanup

* move writing vector to runtime::TensorView to the unit test dir

* merge fix

* PR review change

* update from PR comment

* update changes file

bd01bf2c

19 Jan, 2018 1 commit

Drwebb/gpu doc (#386) · 408f3b25

Tristan Webb authored 7 years ago

* Add mention of blob ref of original file from caffe2

* Mention location of source listing originally from LLVM project

408f3b25

17 Jan, 2018 1 commit

Drwebb/gpu external function (#367) · c5549682

Tristan Webb authored 7 years ago

* Initial GPU_ExternalFunction implementation

Other changes:

Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled

Initial passing (a+b)*c test

Properly link cuda libraries

Simple GPUTensorView implementation

Initial GPU emitter

GPU codegen initial function gen, no kernels yet

Rename GPU emitter and tensor_view_wrapper to match naming convention

* GPU external function based on BASE

* Fix stray base -> gpu

* TensorViewWrapper -> GPU_TensorViewWrapper

* Copy over emitter from base transformer

* Fix for naming dense layout

* Copy kernel emitters from base -> gpu and strip out kernel_utils

* Add aliases to GPU_TensorViewWrappers

* More fixes for naming descriptor::TensorViews

* Move in call_frame implementation from base -> gpu

* apply code format

* GPU codegen running A+B*C

gpu emitters
gpu ctx setup cuda_module kernels
Remove GPU_CF perf counters
Use gpu kernels in external function
Add GPU 1d dot test

Review Changes:
* Remove CPU specific kernel emitting method bodies

* Use copy_data from test/util.cpp, uncomment compileTest

* Use test_utils copy_data function

* Grab function name from pass manager for def, clean up indentation

c5549682

05 Jan, 2018 1 commit

Drwebb/gpu runtime boilerplate (#314) · feab44b5

Tristan Webb authored 7 years ago

* Simple boilerplate for GPU runtime files

  - GPUBackend
  - GPU ExternalFunction
  - GPUManager
  - GPUCallFrame

* Test for construction of all GPU runtime classes

* Comment out calls, constructors haven't been defined

* Clang CUDA source example to later test compiling

Clang cuda example from:
https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4

* Initial nvptx compiler copied from CPU compiler sources

* Define FunctionMap and Instruction for gpu external function

* Rename Compiler -> NVPTXCompiler for gpu compile. Add call to compile for test

* Rename StaticCompiler -> NVPTXStaticCompiler for GPU code gen

* Add nvptx_compiler and nvptx_execution_engine to gpu sources

* Compiling source unit test using hardcoded PTX

* (a+b)*c test for GPU

* WIP Fix compile

* Removed accidentally included file

* Fix compile and LLVM link errors from nvptx_compiler.cpp

* Stub out parts needed for GPU manager

* Test GPU runtime method stubs

* Cleanup

* Add GPU runtime to same cmake block as GPU, include CUDA headers if GPU enabled

* Kill reflexive assertion

* change GPU naming convention to match CPU

* Snake case functions and identifiers in test case

* Change element type to match changes in master

* Make CUDA headers accessible for codegen with GPU transformer

* clang-format

* apply-code-format

feab44b5

21 Nov, 2017 4 commits

Link in LLVM to cuda tests / source build · 17ac2f3e

Tristan Webb authored 7 years ago

Clang cuda example from:
https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4

17ac2f3e

change test name · f8c6734a

Tristan Webb authored 7 years ago

f8c6734a

Add Licenses · 1bf1be99

Tristan Webb authored 7 years ago

1bf1be99

Test for cuDNN version using cmake identifier and cudnn shared lib · caa7ec72

Tristan Webb authored 7 years ago

caa7ec72