- 05 Oct, 2018 8 commits
-
-
Chris Sullivan authored
* Add op::Sigmoid to nvgpu.
* Bring rnn fusion and concat passes over into GPU from IA. This is a temporary move until generalization and gpu specification can occur.
* Add LSTM fusion and cudnn inference kernel. Next need recurrent fusion and layer fusion.
* Formatting
* Removed unnecessary extra output from LSTM op (rnn with seq. length = 1, so y = hy).
* Add RNN fusion of LSTM cells within a recurrent layer.
* Formatting.
* Add fusion across RNN layers.
* Formatting.
* Add algebraic simplification.
* Added rnn fusion tests.
* Updated conditional on LSTM fusion to better distinguish bound nodes as ht vs xt.
* Formatting.
* Removed print statements.
* Formatting.
* Committing missing file.
* Remove concat inputs pass and mkldnn references.
* fix cmake paths
* conflict resolution with merge from master.
* remove explicit lstm op support. bare LSTM ops are converted to RNN ops for emission.
* Formatting.
* Use NGRAPH_ASSERT. Formatting of intel copyright.
* Add check on the feature size (shape) of the recurrent (hidden) input and cell state, to ensure they are the same size.
* fix wrong rnn header
* Formatting.
* Add back lstm op to dispatch table.
* Added RNN test which shows cudnn rnn kernel is not producing correct results.
* With update to AlgSimpl. to simplify concat-reshape-slice, the check modified in this commit needed to be relaxed.
* Bug fix in parameter tensor packing.
* Alias third output element of RNN for cell state (bug fix).
* Resolve numerical correctness issue with negative values in RNN (bug fix). Add minimal test to evaluate LSTM and compare with values calculated by hand.
* Add tensor parameter sizes to kernel hash as they are kernel-specific.
* Add 2 layer lstm fusion test against by-hand solution.
* Export param concatenation to graph for cudnn kernel at both the single rnn layer and multi-layer.
* Formatting.
* Finishing touches after merge: add support for macro expanded dispatch via op_tbl.
* Simplify macro support for gpu ops.
* Add CUDNN_VERSION >= 7200 defguards for RNN fusion. Need to decide how to notify user of increased performance with >= 7200.
* Revert lstm_analytic test to explicitly copy data to tensor params.
* Removed namespace arg from NGRAPH_GPU_OP.
* Refactored macros to different header so op_tbl only contains op list.
* Defguard on cudnn_descriptor<cudnnRNNDataDescriptor_t>.
* doubles -> floats
* Reorg. pass asserts, prepare to replace with non-throwing pass failures.
* Remove Lstm op and replace it with Rnn.
* Format
* Utilize RETURN_IF_FALSE in rnn pass to avoid any RT asserts. Note that falling back to raw (no passes) graph for 2rnn_3lstm json from mxnet models results in a double free inside of the memory layout pass. Appears to be a bug in Reshape pass through.
* Removed print statements. Add check on input data and recurrent data.
* Don't reuse memory for non-destructive ops.
* Add back Rnn test.
* Formatting.
* Clean up comments.
* Update test per review comments.
-
Adam Procter authored
* Add some asserts to make sure we don't overshoot certain iterators in the reference kernels
* Add missing assertion.hpp include
-
dmyershov authored
IntelGPU backend: Broadcast bug fix: (output_shape.at(0) == 1) doesn't mean that it is scalar (#1754)
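
The pitfall named in this commit message can be illustrated with a minimal sketch. `Shape`, `looks_scalar_buggy`, and `holds_single_value` are hypothetical names chosen for illustration and are not taken from the IntelGPU backend; the sketch only assumes a shape is a list of axis lengths.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor shape: one entry per axis.
using Shape = std::vector<std::size_t>;

// Total number of elements described by a shape (1 for a rank-0 shape).
std::size_t element_count(const Shape& shape)
{
    std::size_t count = 1;
    for (std::size_t dim : shape)
    {
        count *= dim;
    }
    return count;
}

// Misleading check: inspects only the first axis.
// Note that at(0) throws std::out_of_range for a rank-0 shape.
bool looks_scalar_buggy(const Shape& shape)
{
    return shape.at(0) == 1;
}

// Safer check: a shape holds a single value only if the product of all
// axis lengths is 1 (this includes the rank-0 case).
bool holds_single_value(const Shape& shape)
{
    return element_count(shape) == 1;
}
```

With these definitions, a shape such as `{1, 3}` passes the first-axis check even though it describes three elements, which is the kind of case the fix addresses.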
-
Chris Sullivan authored
* global stats fix
* Formatting.
-
Robert Kimball authored
* address klocwork number overflow issue
* one more issue
-
Robert Kimball authored
-
Robert Kimball authored
-
Adam Procter authored
* Adapt Tensor class to have partial shapes
* Add PartialShapes to Input, Output, Function, Node classes
* Terminological cleanup
-
- 04 Oct, 2018 8 commits
-
-
Nishant Patel authored
* Add conv+bias
* Add test case for QuantizedConv2DWithBiasAndRelu and address feedback
-
Robert Kimball authored
-
Fenglei authored
* add a test that fails on gpu but passes on cpu
* fixed bug
* get datatype size
* add description for test
* update comment
* update comments and name
-
Nick Korovaiko authored
* show types in visualize_tree
* fix a warning
* address Bob's feedback
-
Robert Kimball authored
-
Pruthvi authored
-
Fenglei authored
-
Nick Korovaiko authored
-
- 03 Oct, 2018 3 commits
-
-
L.S. Cook authored
* add doctools js from basic theme sphinx repo
* fixes from PR 672 RTD theme regarding sphinx build
-
shssf authored
* IntelGPU backend: Operation Reduce implemented
* PR1736. Style fixed
-
Ayan Moitra authored
* cublas emitter
* clang format fixes
* Initial comment incorporation from Chris
* Chris's If-else change comment incorporation
* incorporating Bob's comments phase 1
* Remove unnecessary headers in cublas emitter hpp & cpp (as per Bob's comments)
* clang format on previous commit
* incorporate fenglei's refactoring comment
* incorporating comments
* Incorporate Chris's final comment
* All comments resolved
* Resolve Geoff's comments
* Change cache_primitive to register_primitive
-
- 02 Oct, 2018 9 commits
-
-
shssf authored
-
Robert Kimball authored
-
Adam Procter authored
Partial Shapes, Part 1: Classes for partially known shapes, possibly unknown dimensions
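
As a rough illustration of what a partially known shape could look like, the sketch below models a dimension that is either a known length or unknown, and a shape built from such dimensions. `Dim` and `PartialShapeSketch` are hypothetical names and do not reproduce the classes added in this commit; the sketch also assumes the rank is known, which the real classes may not require.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: a dimension that is either a known length or unknown.
class Dim
{
public:
    Dim() : m_known(false), m_length(0) {}                        // unknown
    Dim(std::size_t length) : m_known(true), m_length(length) {}  // known

    bool is_known() const { return m_known; }

    // Only meaningful for known dimensions; asserts otherwise.
    std::size_t length() const
    {
        assert(m_known);
        return m_length;
    }

private:
    bool m_known;
    std::size_t m_length;
};

// Hypothetical sketch: a shape of known rank whose axes may be unknown.
struct PartialShapeSketch
{
    std::vector<Dim> dims;

    // True only when every axis length is known.
    bool is_complete() const
    {
        for (const Dim& d : dims)
        {
            if (!d.is_known())
            {
                return false;
            }
        }
        return true;
    }
};
```

A shape written as `{Dim(2), Dim(), Dim(3)}` would then describe a rank-3 tensor whose middle axis is not yet known.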
-
Adam Procter authored
-
Adam Procter authored
Pretty sure at this point that I was reading the docs correctly.
-
Adam Procter authored
-
Adam Procter authored
-
Pruthvi authored
* WIP input * weights rnn optimization
* concat + slicing + replacing new node works
* WIP unit test case of fusing rnn inputs
* Added unit test case for fusing rnn input weights; registered CPURnnMatFusion_v1/v2 in codegen and DEX
* fixed redeclaration of a variable
* Refactored rnn input transformation passes into a single pass
* Refactored CPURnnMatFusion callback functions
* change random generator range to include negative values in unit test
* address PR comments
* don't fuse if the shape of the data slices don't match
-
Adam Procter authored
-
- 01 Oct, 2018 7 commits
-
-
Robert Kimball authored
* rename GPU_TensorView to GPUTensor and GPUTensorViewWrapper to GPUTensorWrapper
* undo bad search/replace
* revert change
-
Adam Procter authored
-
Robert Kimball authored
* cleanup
* cleanup header includes
* cleanup
* cleanup TensorMemoryReservation pass
* include cleanup
* more cleanup
* more header cleanup
* style
* Remove obsolete comments
-
Fenglei authored
-
Adam Procter authored
-
Scott Cyphers authored
* Sigmoid and tanh doc, edit for abs
* Add equation for sigmoid.
-
Adam Procter authored
* Add CODEOWNERS file (will have no effect until enabled in GitHub settings)
* Review comments, and fix a username
* Tabs -> spaces
* Review comments
* /maint/ to @cconvey
* /maint/ back to @diyessi by default
-
- 30 Sep, 2018 1 commit
-
-
Robert Kimball authored
-
- 29 Sep, 2018 2 commits
-
-
Robert Kimball authored
* rename files
* rename runtime TensorView to Tensor
* rename HostTensorView to HostTensor
-
Robert Kimball authored
* Address deferred comments from PR 1676
* use dynamic pointer cast for added error checking
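
The idea of using a dynamic pointer cast for error checking can be sketched as follows. The types `Tensor`, `HostTensor`, and `OtherTensor` and the helper `as_host_tensor` are hypothetical stand-ins for illustration, not the classes or call sites changed in this commit.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins for a backend-neutral tensor and two backend tensors.
struct Tensor
{
    virtual ~Tensor() = default;
};

struct HostTensor : Tensor
{
    int host_only_field = 42;
};

struct OtherTensor : Tensor
{
};

// Downcast with a runtime check: std::dynamic_pointer_cast yields nullptr when
// the object is not actually a HostTensor, so a mismatch can be reported.
// std::static_pointer_cast performs no such check.
std::shared_ptr<HostTensor> as_host_tensor(const std::shared_ptr<Tensor>& t)
{
    auto host = std::dynamic_pointer_cast<HostTensor>(t);
    if (!host)
    {
        throw std::invalid_argument("tensor is not a HostTensor");
    }
    return host;
}
```

Passing a tensor of the wrong concrete type then raises an exception at the cast site instead of producing undefined behavior later.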
-
- 28 Sep, 2018 2 commits
-
-
Nick Korovaiko authored
* set_output_size fix
* add assert
* don't run get_loop_kernels twice
-
shssf authored
-