Commit dcd71a59 authored by L.S. Cook, committed by Scott Cyphers

Re-align docs with code example for mnist (#1364)

* Re-align docs with code example for mnist

* Also fix dist_mnist and add context highlight
parent 9a4125ef
...@@ -4,13 +4,9 @@

Derive a trainable model
#########################

Documentation in this section describes one of the possible ways to turn a
:abbr:`DL (Deep Learning)` model for inference into one that can be used
for training.
Additionally, and to provide a more complete walk-through that *also* trains the
model, our example includes the use of a simple data loader for uncompressed
MNIST data.

...@@ -25,19 +21,26 @@
- :ref:`update`

.. _automating_graph_construction:

Automating graph construction
==============================

In a :abbr:`Machine Learning (ML)` ecosystem, it makes sense to use automation
and abstraction wherever possible. nGraph was designed to automatically use
the "ops" of tensors provided by a framework when constructing graphs. However,
nGraph's graph-construction API operates at a fundamentally lower level than a
typical framework's API, and writing a model directly in nGraph would be
somewhat akin to programming in assembly language: not impossible, but not the
easiest thing for humans to do.
To make the task easier for developers who need to customize the "automatic"
construction of graphs, we've provided some demonstration code for how this
could be done. We know, for example, that a trainable model can be derived from
any graph that has been constructed with weight-based updates.

The following example, named ``mnist_mlp.cpp``, represents a hand-designed
inference model being converted to a model that can be trained with nGraph.

.. _model_overview:
...@@ -74,20 +77,25 @@

Inference
---------

We begin by building the graph, starting with the input parameter
``X``. We also define a fully-connected layer, including parameters for
weights and bias:

.. literalinclude:: ../../../examples/mnist_mlp/mnist_mlp.cpp
   :language: cpp
   :lines: 127-139
Repeat the process for the next layer,

.. literalinclude:: ../../../examples/mnist_mlp/mnist_mlp.cpp
   :language: cpp
   :lines: 141-149

and normalize everything with a ``softmax``.

.. literalinclude:: ../../../examples/mnist_mlp/mnist_mlp.cpp
   :language: cpp
   :lines: 151-153
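For readers who want the arithmetic behind the op, here is a framework-free
sketch of a numerically stable softmax in plain C++. This is not the nGraph
implementation; the max-subtraction step is a standard convention to keep
large logits from overflowing ``exp``:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Numerically stable softmax: subtract the max before exponentiating so
// large logits cannot overflow. The outputs are positive and sum to 1.
std::vector<double> softmax(const std::vector<double>& z) {
    double m = z[0];
    for (double v : z) m = std::max(m, v);
    double sum = 0.0;
    std::vector<double> out(z.size());
    for (size_t i = 0; i < z.size(); ++i) {
        out[i] = std::exp(z[i] - m);
        sum += out[i];
    }
    for (double& v : out) v /= sum;
    return out;
}
```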
.. _loss:

...@@ -95,13 +103,13 @@

Loss
----

We use cross-entropy to compute the loss. nGraph does not currently have a core
op for cross-entropy, so we implement it directly, adding clipping to prevent
underflow.

.. literalinclude:: ../../../examples/mnist_mlp/mnist_mlp.cpp
   :language: cpp
   :lines: 154-169
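The role of the clipping step can be illustrated with a small, framework-free
sketch in plain C++ (not the nGraph op graph from the example; the epsilon
value here is an illustrative assumption):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Cross-entropy loss with clipping: predicted probabilities are clamped
// away from zero so that log() cannot produce -inf, mirroring the
// clipping described in the text. The epsilon value is illustrative.
double cross_entropy(const std::vector<double>& predicted,
                     const std::vector<double>& target,
                     double eps = 1e-7) {
    double loss = 0.0;
    for (size_t i = 0; i < predicted.size(); ++i) {
        double p = std::max(predicted[i], eps);  // clip to prevent log(0)
        loss -= target[i] * std::log(p);
    }
    return loss;
}
```

Without the clip, a predicted probability of exactly zero for the true class
would make the loss infinite; with it, the loss stays large but finite.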
.. _backprop:

...@@ -109,16 +117,15 @@

Backprop
--------

We want to reduce the loss by adjusting the weights. We compute the adjustments
using the reverse-mode autodiff algorithm, commonly referred to as "backprop"
because of the way it is implemented in interpreted frameworks. In nGraph, we
augment the loss computation with computations for the weight adjustments. This
allows the calculations for the adjustments to be further optimized.

.. literalinclude:: ../../../examples/mnist_mlp/mnist_mlp.cpp
   :language: cpp
   :lines: 171-175
...@@ -129,7 +136,8 @@

For any node ``N``, if the update for ``loss`` is ``delta``, the
update computation for ``N`` will be given by the node

    auto update = loss->backprop_node(N, delta);

The different update nodes will share intermediate computations. So to
get the updated values for the weights as computed with the specified
:doc:`backend <../programmable/index>`,

.. literalinclude:: ../../../examples/mnist_mlp/mnist_mlp.cpp
   :language: cpp
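Once backprop has produced an adjustment for each weight, applying it amounts
to an ordinary gradient-descent step. A minimal sketch in plain C++, assuming
a simple fixed learning rate (the rate is illustrative, not taken from the
example):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Gradient-descent update: subtract a scaled gradient from each weight.
// In the nGraph example this subtraction is itself expressed as graph
// nodes; here it is written out directly for clarity.
void apply_update(std::vector<double>& weights,
                  const std::vector<double>& grads,
                  double rate) {
    for (size_t i = 0; i < weights.size(); ++i)
        weights[i] -= rate * grads[i];
}
```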
...@@ -153,5 +161,5 @@ compile clones of the nodes.

.. literalinclude:: ../../../examples/mnist_mlp/mnist_mlp.cpp
   :language: cpp
   :lines: 221-226
...@@ -19,18 +19,12 @@ should be added after the steps needed to complete the

.. literalinclude:: ../../../examples/mnist_mlp/dist_mnist_mlp.cpp
   :language: cpp
   :lines: 180-196
   :emphasize-lines: 9-12
Also, since we are using OpenMPI in this example, we need to initialize and
finalize MPI with ``MPI::Init();`` and ``MPI::Finalize();`` at the beginning
and the end of the code used to deploy to devices; see the `full raw code`_.
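Conceptually, the distributed version combines the gradients computed by each
worker with an allreduce. A minimal in-process simulation of that reduction in
plain C++ (no MPI here; sum-then-average is one common data-parallel
convention, shown for illustration rather than as nGraph's exact semantics):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Simulated allreduce over gradients: every worker contributes its local
// gradient vector, and all workers end up with the element-wise average.
// Real distributed code would use an MPI collective for this step.
std::vector<double> allreduce_average(
    const std::vector<std::vector<double>>& per_worker_grads) {
    std::vector<double> avg(per_worker_grads[0].size(), 0.0);
    for (const auto& g : per_worker_grads)
        for (size_t i = 0; i < g.size(); ++i)
            avg[i] += g[i];
    for (double& v : avg) v /= per_worker_grads.size();
    return avg;
}
```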
Finally, to run the training on two nGraph devices, invoke :command:`mpirun`.
This will run on a single machine and launch two processes.
...@@ -41,7 +35,5 @@

   $ mpirun -np 2 dist_mnist_mlp

.. _OpenMPI: https://www.open-mpi.org/software/ompi/v3.1
.. _full raw code: https://github.com/NervanaSystems/ngraph/blob/master/doc/examples/mnist_mlp/dist_mnist_mlp.cpp