Commit fd94811e authored by Leona C, committed by Scott Cyphers

Update ToC to better match docplan spreadsheet (#3846)

* New ToC

* Working on docplan

* Clean up for toc

* Link to existing APIs on quantization doc

* Better align topics with docplan ToC; add section for dyn shapes

* Title casing to be consistent

* PR reviews

* New build preview

* Add default opset version, new versioning schema

* Remove duplicate file causing doc build warning
parent a1a8a7e3
@@ -1535,7 +1535,7 @@ a:hover {
}
a:visited {
  color: #DC700D;
}
html {
@@ -33,7 +33,7 @@ Finally, to run the training using two nGraph devices, invoke

.. code-block:: console

   mpirun

To deploy data-parallel training, the ``AllReduce`` op should be added after the
steps needed to complete the :doc:`backpropagation <../constructing-graphs/derive-for-training>`;

@@ -48,7 +48,7 @@ See the `full code`_ in the ``examples`` folder ``/doc/examples/mnist_mlp/dist_m

.. code-block:: console

   mpirun -np 2 dist_mnist_mlp

.. _Intel MLSL: https://github.com/intel/MLSL/releases
.. core/overview.rst:

Basic Concepts
==============

.. figure:: ../graphics/nGraphCompilerstack.png
.. core/quantization.rst:
.. _quantization:
Quantization
============
:term:`Quantization` refers to the process of reducing the number of bits that
represent a number. In a :abbr:`DL (Deep Learning)` context, weights and
activations can be represented using 8-bit integers (INT8) to compress the
model size of a trained neural network without any significant loss in model
accuracy. INT8 is one kind of quantization. Compared with 32-bit floating point
(FP32), arithmetic with lower precision, such as INT8, requires less memory to
calculate weights and activations.
Implementing a quantized model with nGraph
------------------------------------------
To implement a quantized model with nGraph, provide a partially (or fully)
quantized model (where the convolution layer in the model is replaced
with a quantized convolution, for example) to the nGraph Library along with
quantized parameters: weights, activations, scale, and zero point.
.. note:: As of version |version|, only quantization for inference is supported.
nGraph Quantized Operators (Ops)
--------------------------------
nGraph uses scale and zero point (also used by ONNX) to map real values to
quantized values. All quantized ops use scale and zero point
and can be used just like any other nGraph op.
**Scale**: the quantization scale of the tensor
**Zero point**: the zero point of the tensor
**Round mode**: used in combination with scale and zero point to round real
values to quantized values
.. table:: Quantization Ops
+-----------------------------------------------------------------+------------------------------------------------+
| Op | Description |
+=================================================================+================================================+
| :doc:`Quantize <../ops/quantize>` | Maps real values (r) to quantized values (q) |
| | using scale (s), zero point (z), |
| | and round mode; produces a quantized tensor. |
+-----------------------------------------------------------------+------------------------------------------------+
| :doc:`Dequantize <../ops/dequantize>` | Maps quantized values (q) to real values (r) |
| | using scale (s) and zero point (z); converts |
| | a quantized tensor to a floating-point tensor. |
+-----------------------------------------------------------------+------------------------------------------------+
| :mod:`FakeQuantize <ngraph.ops.fake_quantize>` | Performs element-wise linear quantization. |
+-----------------------------------------------------------------+------------------------------------------------+
| :mod:`QuantizedConvolution <ngraph.ops.quantized_convolution>` | Performs 8-bit convolution. |
+-----------------------------------------------------------------+------------------------------------------------+
| :mod:`QuantizedDot <ngraph.ops.quantized_dot>` | Performs 8-bit dot. |
+-----------------------------------------------------------------+------------------------------------------------+
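The mapping that ``Quantize`` and ``Dequantize`` perform can be illustrated
with plain NumPy. The sketch below shows the standard scale/zero-point
arithmetic described above; it is a minimal illustration of the math, not the
nGraph implementation, and the parameter values are arbitrary examples.

.. code-block:: python

   import numpy as np

   def quantize(r, scale, zero_point, qmin=0, qmax=255):
       """q = round(r / s) + z, clipped to the quantized type's range.

       np.round is one choice of round mode; others are possible.
       """
       q = np.round(r / scale) + zero_point
       return np.clip(q, qmin, qmax).astype(np.uint8)

   def dequantize(q, scale, zero_point):
       """r = s * (q - z); recovers real values up to rounding error."""
       return scale * (q.astype(np.float32) - zero_point)

   r = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
   scale, zero_point = 2.0 / 255, 128            # hypothetical parameters
   q = quantize(r, scale, zero_point)            # quantized (uint8) tensor
   r_hat = dequantize(q, scale, zero_point)      # approximate reconstruction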
Some frameworks such as TensorFlow\* have fused ops. nGraph provides optional
operations to help users easily translate (map) any quantized model created from
frameworks with fused ops to nGraph. Unlike builders, experimental ops take
scale and zero point instead of min and max.
.. table:: Experimental Quantized Ops (optional)
+-----------------------------------+-------------------------------------+
| Operator | Description |
+===================================+=====================================+
| QuantizedConvolutionBias | This experimental op can be |
| | fused with a ReLU op. |
+-----------------------------------+-------------------------------------+
| QuantizedConvolutionBiasAdd       | This experimental op constructs a   |
|                                   | quantized convolution with bias and |
|                                   | optional ReLU, then takes an input  |
|                                   | for the add operation.              |
+-----------------------------------+-------------------------------------+
| QuantizedConvolutionBiasSignedAdd | Same as QuantizedConvolutionBiasAdd |
| | but with signed add. |
+-----------------------------------+-------------------------------------+
| QuantizedConvolutionRelu | This experimental op is designed |
| | for a particular use case that |
| | would require convolution |
| | and ReLU to be combined. |
+-----------------------------------+-------------------------------------+
| QuantizedDotBias | This experimental op can be fused |
| | with a ReLU op. |
+-----------------------------------+-------------------------------------+
nGraph Quantization Design
--------------------------
The goal of nGraph quantization is to flexibly support a wide variety of
frameworks and users. The use of scale and zero point as well as quantized
builders in the nGraph design helps to achieve this goal.
Scale and Zero Point
~~~~~~~~~~~~~~~~~~~~
Using scale and zero point allows nGraph to be framework agnostic (i.e., it
can equally support all deep learning frameworks). nGraph Bridges will
automatically convert min and max (provided by a DL framework) to scale and zero
point as needed. Quantized builders are available to help the bridges perform
this calculation. However, if users are directly using nGraph (and not using a
bridge), they are required to provide scale and zero point for quantized ops.
Another advantage of using scale and zero point to express quantization
parameters is that users can flexibly implement quantized ops into various
nGraph backends. When implementing quantized ops, all current nGraph backends
will directly use scale and zero point (and not min and max) to perform
the quantized computation.
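To make that conversion concrete, the following sketch derives scale and zero
point from a min/max range using a common asymmetric-quantization formula for
an unsigned 8-bit target. The exact calculation performed by the bridges and
quantized builders depends on the quantization mode reported by the framework,
so treat this as an assumption-laden example rather than nGraph's own code.

.. code-block:: python

   def min_max_to_scale_zero_point(rmin, rmax, qmin=0, qmax=255):
       # Most schemes require the representable range to include zero.
       rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
       scale = (rmax - rmin) / (qmax - qmin)
       zero_point = int(round(qmin - rmin / scale))
       return scale, zero_point

   # Example: activations observed in the range [-2.5, 6.0]
   scale, zero_point = min_max_to_scale_zero_point(-2.5, 6.0)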
Quantized Builders
~~~~~~~~~~~~~~~~~~
Quantized builders are helper utilities that assist framework integrators in
enabling quantized models with nGraph. They serve as an API (interface) between
framework bridges and nGraph, allowing framework bridges to directly construct
ops in the nGraph Abstraction Layer.
Quantized builders help nGraph framework bridges by:
* Breaking down a fused quantized operator in the framework into a subgraph (of
  quantized and non-quantized operators) in the nGraph core IR
* Converting from min and max to scale and zero point based on the quantization
mode described by the DL framework
.. note:: Fused ops and quantized builders serve the same purpose.
In the future, fused ops will replace quantized builders.
.. table:: nGraph Quantized Builders
+--------------------------+-----------------------------------+-----------------------------------------+
| Category | Builder | Description |
+==========================+===================================+=========================================+
| Scaled Mode | ScaledQuantize | Converts min and max to scale |
| Min / Max Builders | | and zero point using a scaled mode |
| | | calculation and then constructs and |
| | | returns an nGraph Quantize operator. |
| +-----------------------------------+-----------------------------------------+
| | ScaledDequantize | Converts min and max to scale |
| | | and zero point using a scaled mode |
| | | calculation and then constructs and |
| | | returns an nGraph Dequantize operator. |
+--------------------------+-----------------------------------+-----------------------------------------+
| Quantized Convolution | ScaledQuantizedConvolution | Constructs a quantized convolution |
| and Variants | | with an optional ReLU. |
| +-----------------------------------+-----------------------------------------+
| | ScaledQuantizedConvolutionBias | Constructs a quantized convolution |
| | | with bias and an optional ReLU. |
| +-----------------------------------+-----------------------------------------+
| | ScaledQuantizedConvolutionBiasAdd | Constructs a quantized convolution |
| | | with bias and an optional ReLU, where |
| | | the output is added to the output |
| | | of another convolution (sum_input). |
+--------------------------+-----------------------------------+-----------------------------------------+
| Quantized Dot (Matmul) | ScaledQuantizedDot | Constructs a quantized dot (Matmul) |
| and Variants | | with an optional ReLU. |
| +-----------------------------------+-----------------------------------------+
| | ScaledQuantizedDotBias | Constructs a quantized dot (Matmul) |
| | | with bias and an optional ReLU. |
+--------------------------+-----------------------------------+-----------------------------------------+
| Quantized Concat | ScaledQuantizedConcat | Constructs a quantized concatenation. |
+--------------------------+-----------------------------------+-----------------------------------------+
.. dynamic/index.rst:

Dynamic Shapes
==============

.. toctree::
   :maxdepth: 1
.. frameworks/index.rst

Working with Frameworks
.. frameworks/onnx_integ.rst:

ONNX
====

nGraph is able to import and execute ONNX models. Models are converted to
nGraph's :abbr:`Intermediate Representation (IR)` and converted to ``Function``
.. frameworks/other/index.rst:

.. _fw_other:

.. contents::

Integrating other frameworks
============================

This section details some of the *configuration options* and some of the
*environment variables* that can be used to tune for optimal performance when
your system already has a version of nGraph installed with one or more of our
supported :doc:`../../backends/index`.

Regardless of the framework, after the :doc:`../../buildlb` step, a good place
to start usually involves making the libraries available to the framework. On
Linux\* systems built on Intel® Architecture, that command tends to look
something like:

@@ -24,7 +26,7 @@ something like:

Find or display version
-----------------------

If you're working with the :doc:`../../python_api/index`, the following command
may be useful:

.. code-block:: console
@@ -92,10 +94,10 @@ Training Deep Neural Networks
-----------------------------

Before tweaking various environment variables, be aware that how the computation
gets executed depends on the data layout that the model is using. ``NHWC`` and
``NCHW`` are common layouts in Deep Learning models. Your ultimate
runtime can vary greatly -- even when all other factors are exactly the same --
when this detail is overlooked.

For CPU (and most cuDNN) backends, the preferred layout is currently ``NCHW``.

@@ -110,7 +112,7 @@ Intel® Math Kernel Library for Deep Neural Networks
---------------------------------------------------

.. important:: Intel® MKL-DNN is automatically enabled as part of an
   nGraph default :doc:`build <../../buildlb>`; you do *not* need to add it
   separately or as an additional component to be able to use these
   configuration settings.
@@ -229,4 +231,3 @@ thus can make more efficient use of the underlying hardware.

.. _BUILDING.md: https://github.com/NervanaSystems/ngraph/blob/master/python/BUILDING.md
.. _GCC wiki for details: https://gcc.gnu.org/wiki/FunctionMultiVersioning
.. _following article may be helpful: https://clearlinux.org/documentation/clear-linux/tutorials/fmv

@@ -16,8 +16,9 @@ the nGraph Compiler, it helps to familiarize yourself with some basic concepts.

We use the term :term:`bridge` to describe code that connects to any nGraph
device backend(s) while maintaining the framework's programmatic or user
interface. We have a `bridge for the TensorFlow framework`_. We also have a
:doc:`paddle_integ` bridge. Intel previously :doc:`contributed work to an MXNet
bridge <../project/extras/testing_latency>`; however, support for the MXNet
bridge is no longer active.

`ONNX`_ on its own is not a framework; it can be used with nGraph's
:doc:`../python_api/index` to import and execute ONNX models.
.. frameworks/paddle_integ.rst:

PaddlePaddle\*
==============

PaddlePaddle is an open source deep learning framework developed by Baidu. It
aims to enable performant large-scale distributed computation for deep learning.

.. frameworks/tensorflow_connect.rst:

TensorFlow\*
============

See the `README`_ on the `ngraph_bridge repo`_ for the many ways to connect
@@ -39,38 +39,58 @@ nGraph Compiler Stack Documentation

   introduction.rst

.. toctree::
   :maxdepth: 1
   :caption: Framework Support

   frameworks/overview.rst
   frameworks/tensorflow_connect.rst
   frameworks/onnx_integ.rst
   frameworks/paddle_integ.rst
   frameworks/other/index.rst

.. toctree::
   :maxdepth: 1
   :caption: nGraph Core

   core/overview.rst
   buildlb.rst
   core/constructing-graphs/index.rst
   core/passes/passes.rst
   core/fusion/index.rst
   nGraph Core Ops <ops/index.rst>
   provenance/index.rst
   Graph Execution API <backends/executable-api/index.rst>
   core/quantization.rst
   dynamic/index.rst

.. toctree::
   :maxdepth: 1
   :caption: Backend Support

   Basic Concepts <backends/index.rst>
   backends/plaidml-ng-api/index.rst
   Integrating Other Backends <backends/cpp-api.rst>

.. toctree::
   :maxdepth: 1
   :caption: Training

   training/index.rst
   training/qat.rst

.. toctree::
   :maxdepth: 1
   :caption: Validated Workloads

   frameworks/validated/list.rst

.. toctree::
   :maxdepth: 1
   :caption: Debugging Graphs

   inspection/index.rst
@@ -58,6 +58,6 @@ Backprop

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Add
   :project: ngraph
   :members:

@@ -50,6 +50,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::And
   :project: ngraph
   :members:

@@ -91,6 +91,6 @@ Backprop

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Broadcast
   :project: ngraph
   :members:

@@ -56,6 +56,6 @@ Backprop

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Divide
   :project: ngraph
   :members:

@@ -49,6 +49,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Equal
   :project: ngraph
   :members:

@@ -48,6 +48,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Greater
   :project: ngraph
   :members:

@@ -48,6 +48,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::GreaterEq
   :project: ngraph
   :members:

@@ -48,6 +48,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Less
   :project: ngraph
   :members:

@@ -48,6 +48,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::LessEq
   :project: ngraph
   :members:

@@ -56,6 +56,6 @@ Backprop

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Maximum
   :project: ngraph
   :members:

@@ -56,6 +56,6 @@ Backprop

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Minimum
   :project: ngraph
   :members:

@@ -56,6 +56,6 @@ Backprop

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Multiply
   :project: ngraph
   :members:

@@ -46,6 +46,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Not
   :project: ngraph
   :members:

@@ -49,6 +49,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::NotEqual
   :project: ngraph
   :members:

@@ -50,6 +50,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Or
   :project: ngraph
   :members:

@@ -54,6 +54,6 @@ Backprop

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Power
   :project: ngraph
   :members:

@@ -65,7 +65,7 @@ where :math:`I=I_1, I_2, \ldots, I_n` is a coordinate of the output.

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Slice
   :project: ngraph
   :members:

@@ -50,6 +50,6 @@ Mathematical Definition

C++ Interface
=============

.. doxygenclass:: ngraph::op::v0::Xor
   :project: ngraph
   :members:
@@ -21,6 +21,27 @@ We are pleased to announce the release of version |version|.

Core updates for |version|
--------------------------

+ New ops
+ Provenance improvements from 0.25.1
+ More dynamic shape ops
+ More informative errors

Latest documentation updates
----------------------------

+ Additional details on quantization
+ Index updates
+ API updates

.. important:: Pre-releases (``-rc-0.*``) have newer features, and are less stable.

Changelog on Previous Releases
==============================

+ All ops support ``Output<Node>`` arguments
+ Additional ops
+ ONNX handling unknown domains

@@ -31,21 +52,16 @@ Core updates for |version|

+ Negative indices/axes fixes
+ Better support for MKL-DNN 1.0 (DNNL)
+ Additional constant element types
+ Add new Sphinx-friendly theme (can be built natively for an alternative to ngraph.ai docs).
+ Update PaddlePaddle documentation to reflect demo directories instead of example directory.
+ Update doc regarding the validation of ``Sum`` op.

0.26.1
------

+ Performance increase for ``ConstantFolding`` pass

0.25.1
------

@@ -155,6 +171,7 @@ Changelog on Previous Releases

pre-0.20
--------

+ More dynamic shape preparation
+ Distributed interface factored out
+ fp16 and bfloat16 types

@@ -168,8 +185,6 @@ pre-0.20

+ Additional ONNX ops
+ Add graph visualization tools to doc
+ Update doxygen to be friendlier to frontends
+ Python formatting issue
+ mkl-dnn work-around
+ Event tracing improvements

@@ -177,16 +192,12 @@ pre-0.20

+ Begin tracking framework node names
+ ONNX quantization
+ More fusions
+ Allow negative padding in more places
+ Add code generation for some quantized ops
+ Preliminary dynamic shape support
+ Initial distributed ops
+ Pad op takes CoordinateDiff instead of Shape pad values to allow for negative
  padding.
+ NodeInput and NodeOutput classes prepare for simplifications of Node
+ Test improvements
+ Additional quantization ops
@@ -3,8 +3,10 @@

Provenance
##########

.. include:: overview.rst

.. toctree::
   :maxdepth: 1

   overview.rst

.. other.rst
.. training/data_ingest.rst:
Data Ingestion
##############
Using TensorFlow
----------------
.. include:: tf_dist.rst
Using PaddlePaddle
------------------
.. include:: paddle_dist.rst
Using a custom framework
------------------------
.. include:: ../core/constructing-graphs/distribute-train.rst
The essential operation for synchronizing gradients across all workers in
data-parallel training is ``allreduce``, favored for its simplicity and
scalability over parameter servers. The AllReduce op is one of the nGraph
Library's core ops. To enable gradient synchronization for a network, we simply
inject the AllReduce op into the computation graph, connecting the graph for
the autodiff computation and optimizer update (which then becomes part of the
nGraph graph). The nGraph Backend handles the rest.
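The snippet below illustrates the effect of that synchronization step using
``mpi4py`` and NumPy rather than the nGraph API: each worker computes local
gradients, an ``Allreduce`` sums them across all workers, and every worker
applies the same averaged gradient. This is a conceptual sketch of what the
injected AllReduce op does, not nGraph code.

.. code-block:: python

   from mpi4py import MPI
   import numpy as np

   comm = MPI.COMM_WORLD

   # Each worker computes gradients on its own shard of the data.
   local_grad = np.random.rand(4)

   # Sum the gradients across all workers, then average them.
   synced_grad = np.empty_like(local_grad)
   comm.Allreduce(local_grad, synced_grad, op=MPI.SUM)
   synced_grad /= comm.Get_size()

   # Every worker now holds the same gradient and applies the same update.

Such a script would be launched with, for example,
``mpirun -np 2 python allreduce_sketch.py``.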
.. training/index.rst:
Distributed Training
####################
.. toctree::
:maxdepth: 1
overview.rst
data_ingest.rst
.. training/overview.rst:
Basic Concepts
==============
.. important:: Distributed training is not officially supported as of version
|version|; however, some configuration options have worked for nGraph
devices in testing environments.
Data scientists with locally-scalable rack or cloud-based resources will likely
find it worthwhile to experiment with different modes or variations of
distributed training. Deployments using nGraph Library with supported backends
can be configured to train with data parallelism and will soon work with model
parallelism. Distributing workloads is increasingly important as data and
models grow; the ability to :doc:`../core/constructing-graphs/distribute-train`
makes it possible to work with larger and larger datasets, or with models
having many layers that aren't designed to fit on a single device.
Distributed training with data parallelism splits the data across worker nodes,
each of which holds the same model; during each iteration, the gradients are
aggregated across all workers with an op that performs "allreduce" and then
applied to update the weights.
Using multiple machines helps to scale and speed up deep learning. With large
mini-batch training, one could train ResNet-50 with ImageNet-1k data to
*Top 5* classifier accuracy in minutes using thousands of CPU nodes. See
`arxiv.org/abs/1709.05011`_.

.. _arxiv.org/abs/1709.05011: https://arxiv.org/abs/1709.05011
.. training/paddle_dist.rst:
Distributed Training with PaddlePaddle
======================================
.. training/qat.rst:
Quantization-Aware Training
===========================
:abbr:`Quantization-Aware Training (QAT)` is a technique used to
quantize models during the training process. The main idea is that
the quantization is emulated in the forward path by inserting
"Quantization" and "De-Quantization" nodes (Q-DQ) at several places in
the network to emulate the inference quantization noise. The
expectation is that backward propagation will alter the weights so
that they adapt to this noise, and the resulting loss will be much
better than with traditional post-training quantization.
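A Q-DQ pair amounts to quantizing a tensor and immediately dequantizing it in
the forward path, so the network trains against the rounding noise it will see
at inference time. Below is a minimal NumPy sketch of that emulation under the
usual scale/zero-point assumptions; it is illustrative only and not tied to any
particular framework's QAT implementation.

.. code-block:: python

   import numpy as np

   def fake_quantize(x, scale, zero_point, qmin=0, qmax=255):
       """Forward-path Q-DQ emulation: quantize, clip, then dequantize."""
       q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
       return scale * (q - zero_point)

   x = np.linspace(-1.0, 1.0, 5, dtype=np.float32)
   x_noisy = fake_quantize(x, scale=2.0 / 255, zero_point=128)
   # x_noisy carries the quantization error that the weights learn to absorb.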
For the weights, it is also common to use quantization functions that
cut off outliers. Some examples are available in the
`Distiller guide`_. Distiller is an open-source Python package for
neural network compression research. Network compression can reduce
the footprint of a neural network, increase its inference speed, and
save energy. Distiller also provides a framework for pruning,
regularization, and quantization algorithms, together with a set of
tools for analyzing and evaluating compression performance against
previously known state-of-the-art (SotA) algorithms.
When using :abbr:`QAT (Quantization-Aware Training)` techniques, the
positions in which the Q-DQ ops are placed need to align with the
fusions the hardware performs for inference.
.. _Distiller guide: https://nervanasystems.github.io/distiller/algo_quantization.html#quantization-aware-training
.. training/tf_dist.rst:
Distributed Training with TensorFlow
====================================