  1. 26 Oct, 2018 2 commits
    • nvgpu concat split (#1894) · 58bd00de
      Fenglei authored
      * add split concat
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * add test
      
      * fix test bug
      
      * add comments
      
      * format
      
      * return instead of check processed
      
      * remove .back() since it's not vector anymore.
      
      * format
      
      * change to parameterized tests based on Geoff's comments
      
      * types -> type
      
      * change split size to 256
    • Add builder for {de}quantize to make API's consistent and support {de}quantize with mkldnn (#1839) · 6b36a480
      Nishant Patel authored
      * Add builder for {de}quantize
      
      * Add declaration in header
      
      * Add mkldnn support for {de}quantize
      
      * Add support for {de}quantize with mkldnn
      
      * Add Dex support
      
      * Generalizing some APIs and adding a test case for DQ in backend_test.in.cpp
      
      * Unify scale between ngraph and mkldnn
      
      * Check for nullptrs
      
      * PR feedback
      
      * fix unit test failure
      
      * Adding tests for builder and deleting the backend tests
      
      * curly braces
      
      * test rename
  2. 24 Oct, 2018 2 commits
    • ArgReduce 64 bit indices (#1862) · 9f0589a8
      Chris Sullivan authored
      * Update ArgReduce to handle i64 indices.
      
      * Formatting.
      
      * Add throw for output types other than int32/64.
      
      * Add output type to hash.
      
      * Add type to throw.
      
      * Disable this test on INTERPRETER, which doesn't currently support 64-bit output indices for argmin/max [JIRA:NGRAPH-3183].
    • Cache and use fprop stats in cudnn batchnorm bprop (#1841) · fbc3a940
      Chris Sullivan authored
      * Temp bn update commit.
      
      * Add CUDNNBatchNorm, which adds two additional outputs to batchnorm: the batch mean and the batch inverse variance.
      The batch mean is the same as the output mean if the cumulative average factor is 1.0. Add BatchNormCache pass, which replaces all BatchNorm ops that are inputs to BatchNormBackprop
      with CUDNNBatchNorm, which outputs the saved batch statistics directly to the backprop step.
      
      * Updated bn cache pass, removed extra tests, added test checking that provided stats are used in bprop instead of batch stats.
      This test was disabled for interpreter as the reference kernel needs to be updated to use provided statistics.
      
      * Formatting.
      
      * Update to new batch norm API.
      
      * CUDNNBatchNorm -> BatchNormTrainingWithStats
      
      * new line
      
      * Preprocess input variance into BN denominator for cudnn (#1885)
      
      * Add explicit cuda kernel to calculate what cuDNN describes as the inverse
      variance. In reality, the backward cudnn kernel for BN requires 1.0f / sqrt(variance + eps),
      which is the batchnorm denominator for each channel (a numerically stable inverse stddev).
      
      This introduces op annotations for batch norm backprop and updates the cudnn_emitter to support the insertion of this cuda kernel when required.
      
      * Disable second test on INTERPRETER.
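
The preprocessing step described in this commit (turning a per-channel variance into the batchnorm denominator 1.0f / sqrt(variance + eps)) can be sketched on the host side as follows. This is only an illustration under stated assumptions: the commit adds a CUDA kernel, whereas this is plain C++, and the function name `inv_stddev_from_variance` is hypothetical and not part of nGraph or cuDNN.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not nGraph or cuDNN API): converts a per-channel
// variance into the batchnorm denominator 1 / sqrt(variance + eps),
// i.e. a numerically stable inverse standard deviation.
std::vector<float> inv_stddev_from_variance(const std::vector<float>& variance, float eps)
{
    std::vector<float> out(variance.size());
    for (std::size_t c = 0; c < variance.size(); ++c)
    {
        // eps keeps the denominator finite when a channel's variance is zero.
        out[c] = 1.0f / std::sqrt(variance[c] + eps);
    }
    return out;
}
```

For example, a channel with variance 3.0 and eps 1.0 yields 1 / sqrt(4) = 0.5. A real eps would be far smaller; the value here is chosen only to keep the arithmetic easy to follow.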
  3. 22 Oct, 2018 3 commits
    • add support for Quantize round mode (#1859) · 51104813
      Adam Straw authored
      * added half_toward_zero; all previous tests passing
      
      * all rounding modes added with unit tests
      
      * fix cpu emitter
      
      * round mode doc
      
      * round out round modes
      
      * doc typo
      
      * using names for round modes
      
      * use ceil/floor for rounding functions instead of round/nearbyint
      
      * clean up doc
      
      * equidistant
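
The idea of expressing round modes with ceil/floor rather than round/nearbyint, as mentioned in this commit, can be sketched as below. This is a minimal illustration, not the code from the PR: the function names are hypothetical, only four tie-breaking modes are shown, and inputs near the limits of double precision are not handled.

```cpp
#include <cmath>

// Hypothetical names; each function rounds to the nearest integer and
// differs only in how equidistant values (x.5) are resolved.

// Ties go toward positive infinity: 2.5 -> 3, -2.5 -> -2.
double round_half_up(double x)
{
    return std::floor(x + 0.5);
}

// Ties go toward negative infinity: 2.5 -> 2, -2.5 -> -3.
double round_half_down(double x)
{
    return std::ceil(x - 0.5);
}

// Ties go toward zero: 2.5 -> 2, -2.5 -> -2.
double round_half_toward_zero(double x)
{
    return x >= 0.0 ? std::ceil(x - 0.5) : std::floor(x + 0.5);
}

// Ties go away from zero: 2.5 -> 3, -2.5 -> -3.
double round_half_away_from_zero(double x)
{
    return x >= 0.0 ? std::floor(x + 0.5) : std::ceil(x - 0.5);
}
```

Building every mode from floor and ceil makes the tie-breaking rule explicit in the code and independent of the current floating-point rounding mode, which nearbyint depends on.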
    • BatchNorm splitting into ops (2nd try) (#1828) · 1beec46b
      Nick Korovaiko authored
      * split bn into bn_inference bn_training
      
      * fix warnings
      
      * Add GPU support for the new BN ops (#1569)
      
      * Add GPU support and change batchnorm_globalstats test to use BNInference.
      
      * Changed test back to using BNTraining for global stats and updated cudnn backend to account for it.
      
      * Fix issues in merge with master.
      
      * Formatting.
      
      * CPU fixes
      
      * remove 5-arg training BN for now
      
      * more fixes
      
      * python batchnorm changes
      
      * fix onnx_import
      
      * fix a call to the BatchNormInference c-tor
      
      * yet another fix to BatchNormInference c-tor
      
      * AND yet another fix to batchnorm_inference c-tor
      
      * ops.py
      
      * address adam's feedback
      
      * Remove unnecessary parameter/argument.
      
      * remove batch_norm_training_relu_with_global_stats
      
      * remove bn_relu (training)
    • Robert Kimball's avatar
      e07147f8
  4. 19 Oct, 2018 1 commit
  5. 14 Oct, 2018 1 commit
  6. 12 Oct, 2018 1 commit
    • Support ArgMin and ArgMax for NVGPU Backend (#1737) · 6f30b32b
      Ayan Moitra authored
      * Project initialization commit
      
      * Added unit tests for 3D tensors for argmax
      
      * Refactored reduce to be used by argmax argmin. argmax argmin still has some issues. WIP
      
      * [WIP] First working version of ArgMax ArgMin
      
      * added reduce buffer for the cudnn api calls
      
      * added reduce buffer for the cudnn api calls
      
      * Further modifications. Using rvalues to pass enums to build reduce method
      
      * more unit tests added
      
      * Incorporate Fenglei's comments
      
      * Incorporating Chris's first set of comments
      
      * small change to test file
      
      * Resolving clang issue that was causing argmin test to fail
      
      * Incorporate Chris's comments
      
      * clang format issue
  7. 09 Oct, 2018 1 commit
  8. 08 Oct, 2018 3 commits
  9. 04 Oct, 2018 1 commit
    • nvgpu maxpool bug fix (#1741) · 0051f201
      Fenglei authored
      * add a test failed on gpu, pass on cpu
      
      * fixed bug
      
      * get datatype size
      
      * add description for test
      
      * update comment
      
      * update comments and name
  10. 02 Oct, 2018 1 commit
  11. 29 Sep, 2018 1 commit
  12. 28 Sep, 2018 3 commits
  13. 26 Sep, 2018 1 commit
    • add nGraph quantize op (#1661) · d640fac3
      Adam Straw authored
      * adding nGraph Quantize op
      
      * unit test failing for floating point exception
      
      * unit test working in float
      
      * unit test working in uint8
      
      * improved type checking and polished unit test - passing
      
      * quantized axes working
      
      * inclusive project method
      
      * add round mode
      
      * TODO cleanup
      
      * code format
      
      * adding serializer support - fails build
      
      * add serializer support
      
      * make CPU quantize op work; new tests for int8, clamp
      
      * fix build failure
      
      * fix GPU build issue
      
      * fix GPU unit test manifest
      
      * use quantized offset
      
      * add is_quantized field to element::Type
      
      * add reduce function to coordinate.hpp
  14. 18 Sep, 2018 2 commits
  15. 13 Sep, 2018 1 commit
    • Handle unsupported op in nbench (#1531) · fe676f72
      Robert Kimball authored
      * add unsupported_op exception
      
      * unsupported_op test
      
      * add printout of unsupported op in model
      
      * fix GPU dispatcher check
      
      * fix test designation
      
      * catch exceptions on single file runs too
      
      * add unsupported_op exception where needed
      
      * remove unsupported_op class
      
      * add unassigned op exception
      
      * add unit test
      
      * catch unsupported op in nbench
      
      * add cpu test back
      
      * update all latest merges
      
      * mode change
  16. 12 Sep, 2018 1 commit
    • Add in_place support for ReplaceSlice (#1559) · bb6de284
      gaurides authored
      * Add in_place support for ReplaceSlice
      
      * Add emit_replace_slice_inplace kernel
      
      * changed file permissions to original
      
      * Formatted code using maint/apply-code-format.sh
      
      * Removed data type check and removed dead code
      
      * Removed setting mkldnn_op(true). ReplaceSlice is not mkldnn op
  17. 07 Sep, 2018 1 commit
  18. 06 Sep, 2018 1 commit
    • TopK (w/ArgMax, ArgMin python wrapper) (#1560) · 3548772b
      Sang Ik Lee authored
      * Implement TopK.
      
      * Update python wrappers for TopK, ArgMin and ArgMax.
      
      * Address some reviewer comments.
      
      * Add type property check tests for TopK.
      Set correct TopK behavior for K==0.
      
      * TopK: Add 1d and 3d unit tests.
      
      * Address more reviewer comments.
      
      * Apply code style.
  19. 04 Sep, 2018 2 commits
    • nvgpu reduce to scalar optimization (#1491) · 5f40d957
      Fenglei authored
      * add cuda reduce
      
      * clang format
      
      * fix bugs
      
      * fix bug
      
      * add 1d reduce
      
      * clang format
      
      * fix bugs
      
      * unroll loop
      
      * remove debug info
      
      * revert tests
      
      * unroll 1D reduce op
      
      * add comments
      
      * using cudnn for nd to scalar reduction
      
      * remove cuda 1d reduction since cudnn version is faster
      
      * remove 1D kernel
      
      * fix bugs
      
      * 1d multi block size
      
      * remove debug
      
      * change kernel name
      
      * add reduce to scalar optimization, add test
      
      * fix bugs and tune parameters
      
      * clang format
      
      * update comments
      
      * update comments
      
      * update comments
      
      * clang format
      
      * update comments
      
      * remove wrong comments, apply clang format
      
      * resolve Bob's comment
      
      * clang format
      
      * pass shared mem size from cuLaunchKernel, set unroll loop size through host code
      
      * remove unused code. clang format
      
      * change reduce to thread with shfl for each warp first
      
      * add seed
      
      * unroll size
    • IntelGPU backend: Sum operation optimization (#1545) · ed22bf6c
      shssf authored
      * IntelGPU backend: Sum operation optimization
      
      * PR1545. Comments addressed. Test added. Helper function refactored.
  20. 03 Sep, 2018 1 commit
  21. 29 Aug, 2018 1 commit
  22. 27 Aug, 2018 1 commit
  23. 22 Aug, 2018 1 commit
  24. 21 Aug, 2018 1 commit
    • ArgMin (#1435) · 951e77b4
      Nick Korovaiko authored
      * argmin
      
      * address argmin feedback
      
      * add new lines
      
      * add new lines
      
      * address adam's nitpicks
      
      * scott's feedback
      
      * fix unit tests
  25. 13 Aug, 2018 2 commits
  26. 08 Aug, 2018 1 commit
  27. 02 Aug, 2018 1 commit
    • LRN (#1282) · 237c4803
      Nick Korovaiko authored
      * lrn init
      
      * fix comment
      
      * mkldnn lrn (#1295)
      
      * add serializer + fix compiler warnings
  28. 26 Jul, 2018 1 commit
  29. 18 Jul, 2018 1 commit