Commit 1e091f6f authored by Scott Cyphers's avatar Scott Cyphers Committed by Nick Korovaiko

BatchNorm documentation (#856)

* BatchNorm documentation

* Fix typo, install URL

* Switch to desired BatchNorm
parent b32b5c23
......@@ -70,7 +70,7 @@ and results in a tensor with the same element type and shape:
Here, :math:`X_I` means the value of a coordinate :math:`I` for the tensor
:math:`X`. So the sum of two tensors is a tensor whose value at a
coordinate is the sum of the elements at that coordinate for the two inputs.
-Unlike many frameowrks, it says nothing about storage or arrays.
+Unlike many frameworks, it says nothing about storage or arrays.
An ``Add`` op is used to represent an elementwise tensor sum. To
construct an Add op, each of the two inputs of the ``Add`` must be
......
......@@ -67,7 +67,7 @@ The process documented here will work on Ubuntu\* 16.04 (LTS)
.. code-block:: console
-   $ git clone git@github.com:NervanaSystems/ngraph.git
+   $ git clone https://github.com/NervanaSystems/ngraph.git
$ cd ngraph
#. Create a build directory outside of the ``ngraph/src`` directory
......@@ -141,7 +141,7 @@ The process documented here will work on CentOS 7.4.
.. code-block:: console
$ cd /opt/libraries
-   $ git clone git@github.com:NervanaSystems/ngraph.git
+   $ git clone https://github.com/NervanaSystems/ngraph.git
$ cd ngraph && mkdir build && cd build
$ cmake ../
$ make && sudo make install
......@@ -219,4 +219,4 @@ be updated frequently in the coming months. Stay tuned!
.. _NervanaSystems: https://github.com/NervanaSystems/ngraph/blob/master/README.md
.. _googletest framework: https://github.com/google/googletest.git
.. _ONNX: http://onnx.ai
-.. _website docs: http://ngraph.nervanasys.com/docs/latest/
\ No newline at end of file
+.. _website docs: http://ngraph.nervanasys.com/docs/latest/
.. batch_norm.rst:

#########
BatchNorm
#########

NOTE: This describes what the BatchNorm op should look like. The current version
will be made a CPU transformer op.

.. code-block:: cpp

   BatchNorm  // Produces a normalized output

Description
===========

Produces a normalized output.

Inputs
------
+---------------------+-------------------------+--------------------------------+
| Name | Element Type | Shape |
+=====================+=========================+================================+
| ``input`` | same as ``gamma`` | \(..., C, ...\) |
+---------------------+-------------------------+--------------------------------+
| ``gamma`` | any | \(C\) |
+---------------------+-------------------------+--------------------------------+
| ``beta`` | same as ``gamma`` | \(C\) |
+---------------------+-------------------------+--------------------------------+
| ``global_mean`` | same as ``gamma`` | \(C\) |
+---------------------+-------------------------+--------------------------------+
| ``global_variance`` | same as ``gamma`` | \(C\) |
+---------------------+-------------------------+--------------------------------+
| ``use_global`` | ``bool`` | \(\) |
+---------------------+-------------------------+--------------------------------+
Attributes
----------

+------------------+--------------------+----------------------+
| Name             | Type               | Notes                |
+==================+====================+======================+
| ``epsilon``      | same as ``input``  | Bias for variance    |
+------------------+--------------------+----------------------+
| ``channel_axis`` | ``size_t``         | Channel axis         |
+------------------+--------------------+----------------------+
Outputs
-------
+---------------------+-------------------------+--------------------------------+
| Name | Element Type | Shape |
+=====================+=========================+================================+
| ``normalized`` | same as ``gamma`` | same as ``input`` |
+---------------------+-------------------------+--------------------------------+
| ``batch_mean`` | same as ``gamma`` | \(C\) |
+---------------------+-------------------------+--------------------------------+
| ``batch_variance`` | same as ``gamma`` | \(C\) |
+---------------------+-------------------------+--------------------------------+
The ``batch_mean`` and ``batch_variance`` outputs are computed per channel from
``input``. They need to be computed only when ``use_global`` is ``false`` or
when those outputs are used.
Mathematical Definition
=======================
The axes of the input fall into two categories, positional and
channel, with channel being axis 1. For each position, there are
:math:`C` channel values, each normalized independently.
Normalization of a channel sample is controlled by two values, the
mean :math:`\mu`, and the variance :math:`\sigma^2`, and two scaling
attributes, :math:`\gamma` and :math:`\beta`. The values for :math:`\mu`
and :math:`\sigma^2` come from either computing the mean and variance of
``input`` or from ``global_mean`` and ``global_variance``, depending on
the value of ``use_global``.
.. math::

   y_c = \frac{x_c-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\gamma_c+\beta_c
The mean and variance can be arguments or computed for each channel of
``input`` over the positional axes. When computed from ``input``, the
mean and variance per channel are available as outputs.
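The per-channel computation above can be sketched in plain C++. This is a
minimal standalone sketch of the math only, not the ``ngraph::op::BatchNorm``
API; the flattened ``(N, C, P)`` layout (``P`` = product of the positional
axes after the channel axis) and the ``batch_norm`` helper name are
assumptions for illustration:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: normalize each channel of a (N, C, P) tensor using
// the batch mean and variance of that channel, then scale and shift.
std::vector<float> batch_norm(const std::vector<float>& input,
                              const std::vector<float>& gamma,
                              const std::vector<float>& beta,
                              std::size_t N, std::size_t C, std::size_t P,
                              float epsilon)
{
    std::vector<float> out(input.size());
    for (std::size_t c = 0; c < C; ++c)
    {
        // batch_mean: average over all positional samples of channel c
        double mean = 0.0;
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t p = 0; p < P; ++p)
                mean += input[(n * C + c) * P + p];
        mean /= double(N * P);

        // batch_variance: mean squared deviation for channel c
        double var = 0.0;
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t p = 0; p < P; ++p)
            {
                double d = input[(n * C + c) * P + p] - mean;
                var += d * d;
            }
        var /= double(N * P);

        // y_c = (x_c - mu_c) / sqrt(sigma^2_c + epsilon) * gamma_c + beta_c
        for (std::size_t n = 0; n < N; ++n)
            for (std::size_t p = 0; p < P; ++p)
            {
                std::size_t i = (n * C + c) * P + p;
                out[i] = float((input[i] - mean) / std::sqrt(var + epsilon)
                               * gamma[c] + beta[c]);
            }
    }
    return out;
}
```

When ``use_global`` is ``true``, the op would instead substitute
``global_mean`` and ``global_variance`` for the ``mean`` and ``var`` computed
above; the normalization step is otherwise identical.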
Backprop
========
C++ Interface
=============
.. doxygenclass:: ngraph::op::BatchNorm
:project: ngraph
:members:
......@@ -57,6 +57,7 @@ Not currently a comprehensive list.
atan.rst
avg_pool.rst
avg_pool_backprop.rst
batch_norm.rst
broadcast.rst
ceiling.rst
concat.rst
......