1. 18 Apr, 2018 3 commits
    • GPU Padding - add support for custom pad value and interior padding (#860) · 0be581c0
      Chris Sullivan authored
      * cuda_emitter::build_pad now utilizes pad_value.
      
      * Added TypeInfo class for dispatching c-type information from the underlying ngraph element::Type.
        Adjusted test to use all_close when comparing floating point values (max_pool_2d_1channel_1image_overpadded).
      
      * Refactored max_pool_1d into cuda_emitter so that numeric_limits<c_type>::lowest() could be used for initial max value.
      Test max_pool_2d_1channel_1image_padded_negative_values now enabled and passes.
      
      * Removed old function and switch to size_t to match ngraph.
      
      * Added virtual dtor.
      
      * Adding support for interior padding. All op::Pad functionality is now included.
      
      * More info in runtime_error for checking of tensor dimensions. Removed commented code.
      0be581c0
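The padding and max-pool behaviour described in this commit can be illustrated with a minimal 1-D sketch. This is not the CUDA kernel from the commit: the function names `pad_1d` and `max_pool_1d` and their parameters are hypothetical, and `float("-inf")` stands in for `numeric_limits<c_type>::lowest()`.

```python
def pad_1d(data, pad_value, below, above, interior):
    """Pad a 1-D sequence (illustrative only, not the nGraph kernel).

    below / above: count of pad_value elements added before / after the data.
    interior: count of pad_value elements inserted between neighbouring elements.
    """
    out = [pad_value] * below
    for i, x in enumerate(data):
        if i > 0:
            out.extend([pad_value] * interior)
        out.append(x)
    out.extend([pad_value] * above)
    return out


def max_pool_1d(data, window, stride=1):
    """Max pooling that starts each window from the lowest possible value."""
    lowest = float("-inf")  # assumption: stand-in for numeric_limits<c_type>::lowest()
    out = []
    for start in range(0, len(data) - window + 1, stride):
        best = lowest
        for x in data[start:start + window]:
            if x > best:
                best = x
        out.append(best)
    return out


padded = pad_1d([1, 2, 3], pad_value=0, below=1, above=2, interior=1)
pooled = max_pool_1d([-3.0, -1.0, -2.0, -5.0], window=2)
print(padded)
print(pooled)
```

Starting each window from the lowest representable value rather than from zero is what lets an all-negative input pool correctly, which is presumably why the negative-values test could be enabled.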
    • Weight Fusion (#853) · 8cb48d37
      Nick Korovaiko authored
      * CPU weight fusion initial version
      
      * add tests for weight_fusion
      
      * address @jbobba's feedback
      
      * before cleaning up convolution_weight_optimization.cpp
      
      * clean up, rename, fix perms, fix format
      8cb48d37
  2. 17 Apr, 2018 3 commits
  3. 16 Apr, 2018 8 commits
  4. 13 Apr, 2018 7 commits
    • Remove legacy Backend API (#848) · ec501913
      Robert Kimball authored
      * remove deprecated
      
      * remove all legacy Backend API usage
      
      remove deprecated files
      
      * pull in changes from master
      
      * fix GPU calls
      
      * disable tests in convolution generator
      
      * update per PR comments. Enable performance counter feature.
      
      * update per PR comments
      
      * fix build error
      
      * fix conditionally compiled test :(
      ec501913
    • BatchNorm documentation (#856) · 1e091f6f
      Scott Cyphers authored
      * BatchNorm documentation
      
      * Fix typo, install URL
      
      * Switch to desired BatchNorm
      1e091f6f
    • added the reference OS marker to the image name defined in the contrib/docker/Makefile (#841) · 638f36ee
      DawnStone authored
      fixed variable settings in contrib/docker/make-dimage.sh script
      638f36ee
    • [Py] Add python wrapper for nGraph Reduce operation. (#827) · c80a1076
      arogowie-intel authored
      * Add python wrapper for nGraph Reduce operation.
      
      - Add UT.
      
      * Refactoring.
      
      - Add UT case with default reduction on all axes.
      
      * Extend `reduce` operation signature to also accept `Function` object.
      
      - Add UT case.
      
      * Fix formatting errors.
      c80a1076
    • Robert Kimball's avatar
      e7cf2662
    • Add GPURuntimeContext and GPUPrimitiveEmitter to the gpu transformer (#837) · 026bede0
      Chris Sullivan authored
      * Begin prototype of cudnn_emitter.
      
      * Added GPURuntimeContext to gpu_external_function for passing through to JIT functions.
      
      * gpu_emitters now utilize gpu runtime context.
      
      * Moved cublas and cudnn handles into GPURuntimeContext pointer and out of callframe EntryPoint.
      
      * Added CUDNNEmitter, comparable to MKLDNNEmitter,
      which allows for cudnn kernels to be defined via
      lambda primitives that are emitted and
      subsequently called during graph execution.
      An example implementation is provided for op::Sum.
      
      * GPURuntimeContext should be stored as unique_ptr in external function.
      
      * Extract raw pointer from unique for cudnn_emitter.
      
      * Removing unrelated code from PR.
      
      * GPURuntimeContext needs to be a strict C interface in case
      the native compiler and clang are utilizing different glibc ABIs.
      Updated to reflect this.
      
      * Added cudnn::primitive typedef for better readability.
      
      * Moved allocation of CudaFunctionPool to external function
      so that it is available during gpu emission.
      
      * Fixed too-late initialization of cudart.
      
      * CUDNNEmitter moved into superset class GPUPrimitiveEmitter.
      The GPUPrimitiveEmitter handles the emission of all gpu primitives,
      including cudnn, cuda, and cublas. CUBLASEmitter support not yet included.
      
      * Added unordered_map for caching primitives in the gpu_emitter.
      
      * Added dtor to GPUPrimitiveEmitter to cleanup compiled functions.
      
      * Adding back a serialized model graph that was accidentally removed.

      * Added a few additional helpers to use ngraph::row_major_strides.
      
      * added whitespace per @fengleitian's comment
      
      * Remove implicit type conversions from size_t to int.
      
      * Add op::MaxPool, op::MaxPoolBackprop and op::Pad to GPU transformer (#817)
      
      * Added pooling for 1 and 2 dimensions. 1d uses a cuda kernel and 2d utilizes cudnn.
      Padding is not yet supported.
      
      * Normalized call signature on gpu emission for 1d max pool. Added a few comments.
      
      * Max pool backprop impl. in progress. Amend this commit.
      
      * Max pool backprop implemented. Note that cuDNN
      requests the output tensor for the maxpool operation but it is not required for computation.
      
      * Formatting and invocation for maxpool changed.
      
      * Fixed too-late initialization of cudart.
      
      * Added padding kernel that is used with maxpool. Need to investigate remaining tests.
      
      * Changed dimensionality check to correctly
      determine if data is 1d or not.
      
      * Added 3d MaxPooling (forward), verified by forcing 2d case to use Nd pooling routines.
      
      * Added 3d MaxPooling (backward), verified by forcing 2d case to use Nd pooling routines.
      
      * Moved cudnn prologues for maxpool into ngraph runtime and out of primitive so
      that the only execution occurring on the JIT runtime is the evaluation of the op kernel.
      
      * Refactored forward and backward pooling into single CUDNNEmitter::build_pooling interface
      with a runtime switch to determine if the op is forward or backward propagation.
      
      * Cache preconstructed cudnn kernel for maxpool if it has already been constructed.
      
      * Forgot to add padding arrays back into cudnn kernel for MaxPool in the 2d case.
      
      * Fixed namespace issues and use join(...,'_')
      
      * Refactored 4d/Nd tensor descriptor builder into single function.
      
      * Changed conditionals and comments. Now throws if MaxPool on more than 3 spatial dimensions is requested.
      
      * Fixed forward declare for GPURuntimeContext (class -> struct).
      
      * Clang complains about missing braces on brace-initializer. Fixed implicit conversions.
      
      * Fixed implicit conversions (clang).
      
      * Reverting changes on autodiff test for maxpool. @Krovatkin will update later.
      026bede0
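The primitive caching mentioned in this commit (an unordered_map of already-built primitives, with keys assembled via join) might look roughly like the sketch below. The class `PrimitiveCache`, its method names, and the key layout are all hypothetical and are not taken from the nGraph sources.

```python
class PrimitiveCache:
    """Hypothetical cache: build each primitive once, then reuse it by key."""

    def __init__(self):
        self._primitives = []      # index -> callable primitive
        self._index_by_key = {}    # key string -> index into _primitives
        self.builds = 0            # how many times a primitive was actually built

    def lookup_or_build(self, op_name, dtype, shape, build):
        # Assumed key layout: op name, element type and shape joined by '_'.
        key = "_".join([op_name, dtype] + [str(d) for d in shape])
        index = self._index_by_key.get(key)
        if index is None:
            self._primitives.append(build())
            self.builds += 1
            index = len(self._primitives) - 1
            self._index_by_key[key] = index
        return index

    def invoke(self, index, *args):
        return self._primitives[index](*args)


cache = PrimitiveCache()
make_max = lambda: (lambda xs: max(xs))  # stand-in for building a real kernel

first = cache.lookup_or_build("max_pool", "float", (1, 1, 4), make_max)
second = cache.lookup_or_build("max_pool", "float", (1, 1, 4), make_max)
third = cache.lookup_or_build("max_pool", "float", (1, 1, 8), make_max)
result = cache.invoke(first, [1, 5, 2])
```

Returning an index rather than the primitive itself keeps the emitted call site small: generated code only needs an integer to find the prebuilt kernel at execution time.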
  5. 12 Apr, 2018 6 commits
    • Jaikrishnan Menon's avatar
      dfae57c1
    • gpu slice (#843) · 041dd524
      Fenglei authored
      * add slice op, first version
      
      * change size to output size
      
      * fix bugs
      
      * working version
      
      * using existing functions for join and strides
      
      * clang format
      
      * revert accidental change
      041dd524
    • RecurrentGraphRewrite + tests (#833) · b14d5665
      Nick Korovaiko authored
      * add a getter for root node
      
      * recurrent graph rewrite
      
      * fix perms, rename match_root -> get_match_root
      
      * fix comp errors
      
      * make match_root return the topmost match; fix tests
      b14d5665
    • gpu convolution support nd(n<4) (#824) · b9b7845c
      Fenglei authored
      * add convolution in progress
      
      * enable 1 test
      
      * convolution in progress
      
      * use filter descriptor

      * filter descriptor bug fix
      
      * tensor format
      
      * add missed dimension calculator
      
      * forward convolution 4d without dilation and padding working
      
      * data dilation(deconvolution) and enable some test
      
      * add backprop convolution data and filter
      
      * backprop can compile
      
      * pass unit test, but still have problem on padding
      
      * 2d, symmetric padding, no data dilation works now
      
      * clean up code
      
      * extend gpu convolution to nd
      
      * fix some bugs
      
      * working version for up to 3d convolution, code format.
      
      * remove unnecessary changes
      
      * add restriction for data dilation and asymmetric padding
      
      * clang format
      
      * support up to 3D convolution for now
      
      * change comments to not implemented
      
      * add query for additional GPU workspace for convolution
      
      * clang format
      
      * code format
      
      * using row_major_strides
      
      * using join
      
      * fix bug for join
      
      * refactor dimension calculation
      b9b7845c
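Several entries in this commit refer to row-major strides. As a rough illustration of what such a helper computes, here is a minimal sketch; it is an independent Python rendering and not the `ngraph::row_major_strides` implementation, and `flat_index` is a hypothetical companion added only for the example.

```python
def row_major_strides(shape):
    """Element strides for a densely packed row-major tensor of the given shape."""
    strides = [1] * len(shape)
    # The last axis is contiguous; each earlier axis steps over all later axes.
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def flat_index(coord, strides):
    """Hypothetical helper: linear offset of a coordinate given its strides."""
    return sum(c * s for c, s in zip(coord, strides))


strides = row_major_strides([2, 3, 4])
offset = flat_index((1, 2, 3), strides)
print(strides, offset)
```

Tensor descriptors for N-d convolution typically need both the dimensions and the strides, so deriving the strides from the shape in one place avoids repeating this arithmetic per op.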
    • [Py] Enable ngraph-cpp ops in Python API (#820) · 9ffb5145
      tsocha authored
      * Enable BatchNorm op
      
      * Enable function call op
      
      * Enable get output element op
      9ffb5145
    • CPU: Eliminate slices (#849) · eec19220
      Jaikrishnan Menon authored
      eec19220
  6. 10 Apr, 2018 6 commits
  7. 09 Apr, 2018 7 commits
    • remove parameter check from Function::get_ops() (#834) · 877ac969
      Robert Kimball authored
      * remove parameter check from Function::get_ops()
      
      * create validate pass to hold parameter validation
      877ac969
    • Becky/enable more python gpu tests (#830) · e5c3769d
      raramer01 authored
      * unskipping passing gpu tests
      
      * skipping failing gpu tests
      
      * import pytest as needed
      
      * fix style issues
      
      * unskip passing test
      
      * add additional skip reason, unable to compile
      e5c3769d
    • Repackaging match_recurring_pattern into RecurrentMatcher (#832) · 10ef07e6
      Nick Korovaiko authored
      * repacking recurrent matching as a standalone class
      
      * RecurrentMatcher
      
      * add a getter for root node
      
      * address Scott's feedback
      10ef07e6
    • Editing so far for review and feedback (#813) · a2ab7b50
      L.S. Cook authored
      * WIP editing so far for review and feedback
      
      * Add missing env var export for neon install new process
      
      * Add modified venv setup for TF
      
      * More edits for FW integration and landpage
      
      * Revise from PR feedback
      
      * More PR feedback and editing for clarity
      
      * Minor rewording, clearer explanation
      
      * Final pass edit
      
      * more editing
      a2ab7b50
    • add GPU backend support for contrib/docker make process (#814) · 24afb41e
      DawnStone authored
      * adding support for GPU backend to contrib/docker
      
      added gpu dockerfiles
      
      renamed Dockerfile for centos74
      
      fixed NGRAPH_GPU_ENABLE cmake flag name
      
      * Check for GPU support on the host system and fall back to CPU if not present
      
      * removed double option for PREBUILT_LLVM
      
      * updated README.md with additional references for GPU support
      
      * added clarifying comments
      
      cleaned up duplicate settings
      
      * removed deprecated targets from the contrib/docker/Makefile
      
      * resolved absolute vs. conditional assignment for variables based on reference OS
      
      * removed example using a custom DOCKERFILE from README file
      24afb41e
    • New backend/transformer API (#739) · 777600c6
      Robert Kimball authored
      * force backend compile() to make a copy of the graph
      
      fix copy_with_new_args on ops that have function pointers internal
      
      update unit test for new backend API
      
      add unit test for multiple simultaneous backends
      
      * move get_subdevices virtual method to Manager class
      
      * update GPU to latest
      
      * update call methods
      
      * add remove_compiled_function()
      777600c6
    • Merge pull request #818 from NervanaSystems/tsocha/tox-update · ca4a83ea
      Michał Karzyński authored
      [Py] Change Python version for tox
      ca4a83ea