- 08 Oct, 2018 5 commits
-
-
Adam Procter authored
-
Robert Kimball authored
-
Chris Sullivan authored
* Add pad with fill operator using the outward-in index pattern. * Remove static pad and rename build_pad_dynamic -> build_pad. Update maxpool 1d padding. * Formatting. * Split build_pad_dynamic into build_pad and build_pad_fill. * Add test coverage for fixed bug in op::Pad for gpu.
-
Jayaram Bobba authored
* Reshape optimizations for when unit-sized dimensions are added/removed from tensors * Added unit tests for eliminating squeeze and expand_dims operations * Bug fix to expand dims layout * Style fix
-
Jayaram Bobba authored
* Check output shape when setting memory layout for slice op. * Miscellaneous fusion and other optimizations for inception-resnetv2 - ConvBias Batchnorm folding - ConvBias Affine folding - Check if MKLDNN can slice a given layout and select layouts appropriately * Fixed unit test and bug in conv bias pattern * Addressed PR feedback * Addressed PR feedback
-
- 06 Oct, 2018 2 commits
-
-
gcwenger authored
* Eliminated two warnings introduced in #1459 * Removed unnecessary call to reserve_workspace.
-
VINOD KUMAR DEVARAMPATI authored
* added constant folding for dequantize * modified as per review comments
-
- 05 Oct, 2018 13 commits
-
-
gcwenger authored
* LRN WIP * Explicit lambda captures. * Switched to Ayan's new caching routine. * Remove commented out lrn from manifest. * Fixed clang 3.9 error. * Corrected lrn hash. Only call cudnnSetLRNDescriptor once. * Simplified lrn hash. Removed redundant parameters. No longer passing CUDNN_LRN_CROSS_CHANNEL_DIM1 as parameter because it's the only choice for cudnnLRNCrossChannelForward.
-
Jaikrishnan Menon authored
-
Scott Cyphers authored
* More op doc, fix formatting * sqrt, tan * Formatting.
-
Robert Kimball authored
-
Robert Kimball authored
* address klocwork issue * move class init * more klocwork * more klocwork * more klocwork * comment on where the magic number is from * address review comments * address review comments
-
Chris Sullivan authored
* Add op::Sigmoid to nvgpu. * Bring rnn fusion and concat passes over into GPU from IA. This is a temporary move until generalization and gpu specification can occur. * Add LSTM fusion and cudnn inference kernel. Next need recurrent fusion and layer fusion. * Formatting * Removed unnecessary extra output from LSTM op (rnn with seq. length = 1, so y = hy). * Add RNN fusion of LSTM cells within a recurrent layer. * Formatting. * Add fusion across RNN layers. * Formatting. * Add algebraic simplification. * Added rnn fusion tests. * Updated conditional on LSTM fusion to better distinguish bound nodes as ht vs xt. * Formatting. * Removed print statements. * Formatting. * Committing missing file. * Remove concat inputs pass and mkldnn references. * fix cmake paths * conflict resolution with merge from master. * remove explicit lstm op support. bare LSTM ops are converted to RNN ops for emission. * Formatting. * Use NGRAPH_ASSERT. Formatting of intel copyright. * Add check on the feature size (shape) of the recurrent (hidden) input and cell state, to ensure they are the same size. * fix wrong rnn header * Formatting. * Add back lstm op to dispatch table. * Added RNN test which shows cudnn rnn kernel is not producing correct results. * With update to AlgSimpl. to simplify concat-reshape-slice, the check modified in this commit needed to be relaxed. * Bug fix in parameter tensor packing. * Alias third output element of RNN for cell state (bug fix). * Resolve numerical correctness issue with negative values in RNN (bug fix). Add minimal test to evaluate LSTM and compare with values calculated by hand. * Add tensor parameter sizes to kernel hash as they are kernel-specific. * Add 2 layer lstm fusion test against by-hand solution. * Export param concatenation to graph for cudnn kernel at both the single rnn layer and multi-layer. * Formatting. * Finishing touches after merge: add support for macro-expanded dispatch via op_tbl. * Simplify macro support for gpu ops.
* Add CUDNN_VERSION >= 7200 defguards for RNN fusion. Need to decide how to notify user of increased performance with >= 7200. * Revert lstm_analytic test to explicitly copy data to tensor params. * Removed namespace arg from NGRAPH_GPU_OP. * Refactored macros to different header so op_tbl only contains op list. * Defguard on cudnn_descriptor<cudnnRNNDataDescriptor_t>. * doubles -> floats * Reorg. pass asserts, prepare to replace with non-throwing pass failures. * Remove Lstm op and replace it with Rnn. * Format * Utilize RETURN_IF_FALSE in rnn pass to avoid any RT asserts. Note that falling back to raw (no passes) graph for 2rnn_3lstm json from mxnet models results in a double free inside of the memory layout pass. Appears to be a bug in Reshape pass through. * Removed print statements. Add check on input data and recurrent data. * Don't reuse memory for non-destructive ops. * Add back Rnn test. * Formatting. * Clean up comments. * Update test per review comments.
-
Adam Procter authored
* Add some asserts to make sure we don't overshoot certain iterators in the reference kernels * Add missing assertion.hpp include
-
dmyershov authored
IntelGPU backend: Broadcast bug fix: (output_shape.at(0) == 1) doesn't mean that it is scalar (#1754)
-
Chris Sullivan authored
* global stats fix * Formatting.
-
Robert Kimball authored
* address klocwork number overflow issue * one more issue
-
Robert Kimball authored
-
Robert Kimball authored
-
Adam Procter authored
* Adapt Tensor class to have partial shapes * Add PartialShapes to Input, Output, Function, Node classes * Terminological cleanup
-
- 04 Oct, 2018 8 commits
-
-
Nishant Patel authored
* Add conv+bias * Add test case for QuantizedConv2DWithBiasAndRelu and address feedback
-
Robert Kimball authored
-
Fenglei authored
* add a test that fails on gpu, passes on cpu * fixed bug * get datatype size * add description for test * update comment * update comments and name
-
Nick Korovaiko authored
* show types in visualize_tree * fix a warning * address Bob's feedback
-
Robert Kimball authored
-
Pruthvi authored
-
Fenglei authored
-
Nick Korovaiko authored
-
- 03 Oct, 2018 3 commits
-
-
L.S. Cook authored
* add doctools js from basic theme sphinx repo * fixes from PR 672 RTD theme regarding sphinx build
-
shssf authored
* IntelGPU backend: Operation Reduce implemented * PR1736. Style fixed
-
Ayan Moitra authored
* cublas emitter * clang format fixes * Initial comment incorporation from Chris * Chris's If-else change comment incorporation * incorporating Bob's comments phase 1 * Remove unnecessary headers in cublas emitter hpp & cpp (as per Bob's comments) * clang format on previous commit * incorporate fenglei's refactoring comment * incorporating comments * Incorporate Chris's final comment * All comments resolved * Resolve Geoff's comments * Change cache_primitive to register_primitive
-
- 02 Oct, 2018 9 commits
-
-
shssf authored
-
Robert Kimball authored
-
Adam Procter authored
Partial Shapes, Part 1: Classes for partially known shapes, possibly unknown dimensions
-
Adam Procter authored
-
Adam Procter authored
Pretty sure at this point that I was reading the docs correctly.
-
Adam Procter authored
-
Adam Procter authored
-
Pruthvi authored
* WIP input * weights rnn optimization * concat + slicing + replacing new node works * WIP unit test case of fusing rnn inputs * - Added unit test case for fusing rnn input weights - registered CPURnnMatFusion_v1/v2 in codegen and DEX * fixed redeclaration of a variable * Refactored rnn input transformation passes into a single pass * Refactored CPURnnMatFusion callback functions * change random generator range to include negative values in unit test * address PR comments * don't fuse if the shapes of the data slices don't match
-
Adam Procter authored
-