  1. 14 Mar, 2018 3 commits
  2. 13 Mar, 2018 2 commits
    • GPU elementwise emitters now respect input and output tensor types. (#633) · d3ea93e2
      Chris Sullivan authored
      * GPU elementwise emitters now respect input and output tensor types.
      This enables the use of binary comparison ops and op::Convert.
      
      * Removed comments.
      
      * All kernels now carry a type signature,
      even when the input and output tensors have the same type, so that
      kernels for specific tensor types are unique.
      
      NGMX-391 #close 
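The type-signature bullet above can be illustrated with a small sketch. It assumes compiled kernels are cached by name, so the name has to encode every input and output type. The helper `make_kernel_name` and its naming scheme are hypothetical and are not taken from the nGraph sources.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds a cache key for an elementwise kernel from the
// op name plus the type names of every input and of the output.
// The "cuda_ew_" prefix and the underscore separators are assumptions made
// for illustration only.
std::string make_kernel_name(const std::string& op,
                             const std::vector<std::string>& input_types,
                             const std::string& output_type)
{
    assert(!op.empty());
    std::string name = "cuda_ew_" + op;
    for (const std::string& t : input_types)
    {
        name += "_" + t;
    }
    name += "_" + output_type;
    return name;
}
```

With such a scheme, a comparison op that reads two `float` tensors and writes a `char` tensor gets a different key from the same op on `double` inputs, and an op whose input and output types match still carries the full signature.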
    • Pruthvi/sigmoid bprop (#630) · 490e4e63
      Pruthvi authored
      * - Added pattern matcher for bprop sigmoid
      - mkldnn emitter code for sigmoid bprop
      - Fusion pass unit test for sigmoid bprop
      - style fix
      
      * Added test case for bprop sigmoid
      
      * fixed sigmoid bprop test case failure
      
      * fixed bprop unit test values for sigmoid
      
      * style fix
      
      * fix typo
      
      * Addressed PR comments
      - added layout assignment pass to ensure delta and input have same layout for SigmoidBprop
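For reference, the quantity a sigmoid backprop kernel computes can be written as a plain loop. This is a sketch of the standard formula only, not the MKL-DNN emitter code from the commit; `sigmoid_bprop_ref` is a hypothetical name. Because it pairs `input[i]` with `delta[i]` element by element, it also shows why the two tensors need the same layout.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference sigmoid backprop (illustrative, not the commit's implementation):
//   out[i] = delta[i] * s * (1 - s), where s = 1 / (1 + exp(-input[i])).
// Assumes input and delta are flat buffers with identical element order.
std::vector<float> sigmoid_bprop_ref(const std::vector<float>& input,
                                     const std::vector<float>& delta)
{
    assert(input.size() == delta.size());
    std::vector<float> out(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        const float s = 1.0f / (1.0f + std::exp(-input[i]));
        out[i] = delta[i] * s * (1.0f - s);
    }
    return out;
}
```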
  3. 11 Mar, 2018 3 commits
  4. 09 Mar, 2018 5 commits
    • Adding support for GPU elementwise ops for arbitrarily many inputs (#618) · 89da71d3
      Chris Sullivan authored
      * Refactored unary elementwise ops into a single interface
      that is adaptable to elementwise ops with arbitrary number of inputs.
      
      * Renamed EmitUnaryElementwise -> EmitElementwise.
      Implemented first binary elementwise op (Power).
      
      * Refactored some of the boilerplate code for emitting CUDA kernels to NVRTC
      out of the emit functions and into the CudaFunctionPool static singleton.
      CodeWriter now saves CUDA kernels to ./gpu_codegen.
      
      * Added ops Divide, Subtract & Sign to the GPU transformer.
      Subtract and Sign both use custom device helper functions which
      have math kernels defined for the op in gpu_cuda_kernel_ops.hpp,
      and which are built by a new get_device_helper function.
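The idea of one emitter serving elementwise ops with any number of inputs can be sketched as a source generator that writes one pointer parameter per input. The function `emit_elementwise_source`, the fixed `float` element type, and the kernel layout are assumptions for illustration; the actual emitter and CudaFunctionPool interfaces are not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical generator: returns CUDA source text for a kernel that applies
// the device function `op` to `num_inputs` float inputs, one element per thread.
std::string emit_elementwise_source(const std::string& kernel_name,
                                    const std::string& op,
                                    std::size_t num_inputs)
{
    assert(num_inputs > 0);
    std::ostringstream src;
    src << "extern \"C\" __global__ void " << kernel_name << "(";
    for (std::size_t i = 0; i < num_inputs; ++i)
    {
        src << "float* in" << i << ", ";
    }
    src << "float* out, unsigned int n)\n{\n";
    src << "    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;\n";
    src << "    if (tid < n)\n    {\n";
    src << "        out[tid] = " << op << "(";
    for (std::size_t i = 0; i < num_inputs; ++i)
    {
        src << (i == 0 ? "" : ", ") << "in" << i << "[tid]";
    }
    src << ");\n    }\n}\n";
    return src.str();
}
```

Calling `emit_elementwise_source("cuda_Power", "powf", 2)` yields a two-input kernel whose body calls `powf(in0[tid], in1[tid])`. Ops without a matching CUDA math function, such as Sign, would need a small device helper emitted ahead of the kernel.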
    • clang format · 362bb996
      Louis Feng authored
    • fenglei.tian's avatar
      9fd64b6f
    • Nick Korovaiko's avatar
      b3d2ff59
    • Pruthvi/sigmoid (#614) · 5885c09a
      Pruthvi authored
      * - Added sigmoid fusion pass
      - added mkldnn emitter code for sigmoid
      
      * - corrected sigmoid expected values
      - add layout assignment for sigmoid op
      
      * - added asserts in cpu fusion for sigmoid
      - style fix
      
      * remove debug prints
      
      * NGMX-371 #comment addressed PR comments - i) Added sigmoid unit test case with 3D input ii) support in cpu_emitter for sigmoid to handle all input shapes
      
      * NGMX-371 #comment use shape_size() to calculate the 1d input size
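The last bullet flattens an input of any rank to a 1D element count. A minimal sketch of that computation follows; `shape_size_sketch` is a hypothetical stand-in written for this note, not nGraph's `shape_size()` itself.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Illustrative stand-in: the number of elements in a tensor is the product of
// its dimensions, so an elementwise op such as sigmoid can treat any shape
// as a 1D buffer of that length. An empty shape (a scalar) has one element.
std::size_t shape_size_sketch(const std::vector<std::size_t>& shape)
{
    return std::accumulate(shape.begin(),
                           shape.end(),
                           std::size_t{1},
                           std::multiplies<std::size_t>());
}
```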
  5. 08 Mar, 2018 5 commits
    • enable supported backward tests · dd5c77e0
      fenglei.tian authored
    • add sign op, fix constant bug · dd5a6769
      fenglei.tian authored
    • Optimize Broadcast in MatMulBias (#604) · 9cca4073
      Nick Korovaiko authored
      * remove broadcast from matmulbias
      
      * fix comments
      
      * working gemm-based broadcast
      
      * fix clang warning
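The commit does not spell out what "gemm-based broadcast" means. One common way to express a row broadcast as a matrix product is to multiply a column of ones by the bias row, as sketched below. This is an assumption about the technique, and `gemm_ref` and `broadcast_bias_via_gemm` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive row-major GEMM: C (m x n) = A (m x k) * B (k x n).
std::vector<float> gemm_ref(const std::vector<float>& a,
                            const std::vector<float>& b,
                            std::size_t m,
                            std::size_t k,
                            std::size_t n)
{
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i)
    {
        for (std::size_t p = 0; p < k; ++p)
        {
            for (std::size_t j = 0; j < n; ++j)
            {
                c[i * n + j] += a[i * k + p] * b[p * n + j];
            }
        }
    }
    return c;
}

// Broadcasts a bias row of length n to a (rows x n) matrix by computing
// ones (rows x 1) * bias (1 x n).
std::vector<float> broadcast_bias_via_gemm(const std::vector<float>& bias, std::size_t rows)
{
    const std::vector<float> ones(rows, 1.0f);
    return gemm_ref(ones, bias, rows, 1, bias.size());
}
```

In a fused MatMulBias the same product could be accumulated into the matmul output by a GEMM library instead of materializing the broadcast tensor first; whether the commit does exactly that is not stated here.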
    • Abstraction for GPU unary elementwise ops (#587) · 529362b5
      Chris Sullivan authored
      * straightforward gpu.cos implementation following previous patterns prior to refactor
      
      * Generalized the unary elementwise GPU op implementation. New unary elementwise ops can
      be added to the type annotations in gpu_cuda_kernel_ops.hpp. Next step
      is to refactor the llvm interface in gpu_emitters.hpp for similar generality.
      
      * Added gpu_emitter.hpp:EmitUnaryElementwise.
      
      Function adds cuda kernel based on ngraph::op::op_type::description.
      This can service all unary elementwise ops run on the gpu.
      
      * The following elementwise unary ops now use the EmitUnaryElementwise emitter:
      * GPU.abs
      * GPU.acos
      * GPU.asin
      * GPU.atan
      * GPU.ceiling
      * GPU.cos
      * GPU.cosh
      * GPU.exp
      * GPU.floor
      * GPU.log
      * GPU.not
      * GPU.sign
      * GPU.sin
      * GPU.sinh
      * GPU.tan
      * GPU.tanh
      Unary elementwise ops Sign and Not need extra consideration.
      
      * tanh test changed to test::all_close for fp comparison (also done for tan in commit 65fa7c6de34c8277fe2a4801644f6bb64574f4ff).
      
      * GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.
      
      * code format update
      
      * changed cuda builder interface names to unary/binary/arbitrary, added impl. note to gpu_cuda_kernel_ops, cleaned code format
      
      * updated ngraph-cpp reference
      
      * Fixing incorrect github conflict resolution.
      
      * Added GPU emitter for op::Result.
      For now it simply copies the output tensor.
      
      All but 3 tests now pass. The remaining
      failing tests are:
      * GPU.dot_0_0
      * GPU.dot_matrix_2x0_0x2
      * GPU.dot_2x0_0
      
      * Removed call to handle memory aliasing in gpu_external_function.
      
      * fix gpu emitter bug that will return in the middle of function
      
      * Merge pull request #609 from NervanaSystems/tfl/fix_return_bug
      
      fix gpu emitter bug that will return in the middle of function
      
      * GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.
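The tanh and tan tests in this commit moved to a tolerance-based comparison because GPU and reference results can differ in the last bits. A minimal sketch of such a check follows; the formula and the default tolerances are assumptions, and `all_close_sketch` is a hypothetical stand-in rather than the definition of `test::all_close`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative tolerance check: every pair must satisfy
//   |a[i] - b[i]| <= atol + rtol * |b[i]|.
// Buffers of different length never compare as close.
bool all_close_sketch(const std::vector<float>& a,
                      const std::vector<float>& b,
                      float rtol = 1e-5f,
                      float atol = 1e-8f)
{
    if (a.size() != b.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (std::fabs(a[i] - b[i]) > atol + rtol * std::fabs(b[i]))
        {
            return false;
        }
    }
    return true;
}
```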
    • GPU op::Result implementation (#611) · 905cafd2
      Chris Sullivan authored
      * Added GPU emitter for op::Result.
      For now it simply copies the output tensor.
      
      All but 3 tests now pass. The remaining
      failing tests are:
      * GPU.dot_0_0
      * GPU.dot_matrix_2x0_0x2
      * GPU.dot_2x0_0
      
      * Removed call to handle memory aliasing in gpu_external_function.
      
      * fix gpu emitter bug that will return in the middle of function
      
      * Merge pull request #609 from NervanaSystems/tfl/fix_return_bug
      
      fix gpu emitter bug that will return in the middle of function
      
      * GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.
  6. 07 Mar, 2018 7 commits
    • bn fprop mkldnn optimized implementation (#581) · 9db548c6
      Pruthvi authored
      * - Added support for optimized bn mkldnn implementation in cpu emitter
      - modified bn unit_test to support new implementation
      - added layout assignment for bn op
      - Style Fix
      
      (cherry picked from commit 7747a40806d62c126059d5c873adcd2e61a0adb0)
      
      * modified value initialization in cpu_fusion to be float explicit
      
      (cherry picked from commit 03499d380073d0197ab8cbd154eb03f63b042a48)
      
      * fix compilation issue
      
      * Addressed PR comments
      - added exception if gamma and beta layout is not equal to memory::format::x
      - throw exception if bn Op is not mkldnn op
      
      * fix compilation issue
      
      * added support to handle multiple outputs in fprop bn fusion
      
      * - Removed layout pass for bn
      - fixed autodiff bug in bn
      - added "Add" for the dispatcher in cpu-layout pass
      
      * style fix
      
      * Fix bprop batchnorm test with get_output_elements
      
      * Style fix
    • Scott Cyphers's avatar
      f2e6b48b
    • clang format. · d37b30ad
      Louis Feng authored
    • clean up. · 338b9622
      Louis Feng authored
    • simplify convbias test. · 812a699a
      Louis Feng authored
    • refactor and clean up. · 8b7f042d
      Louis Feng authored
    • more tests. · 97c2ce20
      Louis Feng authored
  7. 06 Mar, 2018 5 commits
    • Zero-padded convolution fusion (#596) · ad58cb29
      Jai Menon authored
      * CPU: Padded Convolution fusion
      
      * CPU: Non-reshaped fusion pattern for zero-padded convolutions
      
      * CPU: Refactor consistency checks
      
      * CPU: Rewrite hoisted reshape expression and add tests
      
      * CPU: Merge leftovers
    • Generalize MatMulBias (2nd attempt) (#597) · 55d11bb4
      Nick Korovaiko authored
      * generalize matmulbias
      
      fixes
      
      disable logging
      
      * unit-test failures
    • op::Result ver3 (#594) · 5c7e9844
      Nick Korovaiko authored
      * the first stab at op::Result
      
      format fixes
      
      disabling logging
      
      op::Result, 2nd attempt
      
      purge stale code
      
      disable logging
      
      fix copyright header
      
      * initial cleanup
      
      * cleanup2
      
      * remove dead code
      
      * result.cpp, fix comments
      
      * fix comment
    • test wip. · 81fe53cd
      Louis Feng authored
    • gpu broadcast (#576) · 41268068
      Fenglei authored
      * add gpu broadcast
      
      * add broadcast kernel
      
      * fix bug for cuMemcpyDtoD usage in gpu_external_function.cpp
  8. 05 Mar, 2018 1 commit
    • Include cleanup (#583) · cec89708
      Robert Kimball authored
      * cleanup
      
      * cleanup
      
      * fix all headers to be standalone as far as includes go
      
      * include cleanup
      
      * cleanup includes
      
      * cleanup
      
      * include tester
      
      * wip
      
      * cleanup
      
      * cleanup
      
      * cleanup
  9. 02 Mar, 2018 6 commits
  10. 01 Mar, 2018 2 commits
  11. 28 Feb, 2018 1 commit