Commit 1c53fd36 authored by L.S. Cook, committed by Scott Cyphers

Documentation update for BatchNorm Ops (#1927)

parent fc5842d9
@@ -4,9 +4,6 @@
 BatchNorm
 #########
 
-NOTE: This describes what the ``BatchNorm`` op should look like. The current
-version will be made a CPU transformer op.
-
 .. code-block:: cpp
 
    BatchNorm // Produces a normalized output
@@ -50,6 +47,7 @@ Attributes
 Outputs
 -------
 +---------------------+-------------------------+-----------------------------+
 | Name                | Element Type            | Shape                       |
 +=====================+=========================+=============================+
@@ -60,38 +58,48 @@ Outputs
 | ``batch_variance``  | same as ``gamma``       | \(C\)                       |
 +---------------------+-------------------------+-----------------------------+
-The ``batch_mean`` and ``batch_variance`` are computed per-channel from ``input``.
-The values only need to be computed if ``use_global`` is ``false`` or they are used.
+The ``batch_mean`` and ``batch_variance`` outputs are computed per-channel from
+``input``. The values only need to be computed if ``use_global`` is ``false``,
+or if they are used.
 Mathematical Definition
 =======================
 
-The axes of the input fall into two categories, positional and
-channel, with channel being axis 1. For each position, there are
-:math:`C` channel values, each normalized independently.
+The axes of the input fall into two categories: positional and channel, with
+channel being axis 1. For each position, there are :math:`C` channel values,
+each normalized independently.
+
+Normalization of a channel sample is controlled by two values:
 
-Normalization of a channel sample is controlled by two values, the
-mean :math:`\mu`, and the variance :math:`\sigma^2`, and two scaling
-attributes, :math:`\gamma` and :math:`\beta`. The values for :math:`\mu`
-and :math:`\sigma^2` come from either compuing the mean and variance of
-``input`` or from ``global_mean`` and ``global_variance``, depending on
-the value of ``use_global``.
+* the mean :math:`\mu`, and
+* the variance :math:`\sigma^2`;
+
+and by two scaling attributes: :math:`\gamma` and :math:`\beta`.
+
+The values for :math:`\mu` and :math:`\sigma^2` come either from computing the
+mean and variance of ``input``, or from ``global_mean`` and ``global_variance``,
+depending on the value of ``use_global``.
 .. math::
 
    y_c = \frac{x_c-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\gamma_c+\beta_c
 
-The mean and variance can be arguments or computed for each channel of
-``input`` over the positional axes. When computed from ``input``, the
-mean and variance per channel are available as outputs.
+The mean and variance can be arguments, or they may be computed for each channel
+of ``input`` over the positional axes. When computed from ``input``, the mean
+and variance per-channel are available as outputs.
Backprop
========
C++ Interface C++ Interface
============= ==============
.. doxygenclass:: ngraph::op::BatchNorm .. doxygenclass:: ngraph::op::BatchNormTraining
:project: ngraph :project: ngraph
:members: :members:
.. doxygenclass:: ngraph::op::BatchNormInference
:project: ngraph
:members:
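
Reviewer note: the formula in the Mathematical Definition section can be illustrated with a small standalone sketch. This is not the nGraph implementation. The names ``ChannelStats``, ``batch_stats`` and ``batch_norm`` are hypothetical, the input is assumed to be a flat buffer laid out as ``[N, C, S]`` (channel on axis 1, with ``S`` the product of the remaining positional dimensions), and the variance is assumed to be the population variance, which the documentation above does not state explicitly.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration; not part of the nGraph API.
struct ChannelStats {
    std::vector<float> mean;      // one value per channel, shape (C)
    std::vector<float> variance;  // one value per channel, shape (C)
};

// Per-channel mean and variance over every non-channel (positional) axis.
// Assumes x is laid out as [N, C, S] and uses the population variance
// (divide by the element count); both are assumptions of this sketch.
ChannelStats batch_stats(const std::vector<float>& x,
                         std::size_t n, std::size_t c, std::size_t s) {
    assert(x.size() == n * c * s);
    ChannelStats stats{std::vector<float>(c, 0.0f), std::vector<float>(c, 0.0f)};
    const float count = static_cast<float>(n * s);
    for (std::size_t ch = 0; ch < c; ++ch) {
        float sum = 0.0f;
        for (std::size_t b = 0; b < n; ++b)
            for (std::size_t p = 0; p < s; ++p)
                sum += x[(b * c + ch) * s + p];
        const float mu = sum / count;

        float sq = 0.0f;
        for (std::size_t b = 0; b < n; ++b)
            for (std::size_t p = 0; p < s; ++p) {
                const float d = x[(b * c + ch) * s + p] - mu;
                sq += d * d;
            }
        stats.mean[ch] = mu;
        stats.variance[ch] = sq / count;
    }
    return stats;
}

// Applies y_c = (x_c - mu_c) / sqrt(sigma^2_c + epsilon) * gamma_c + beta_c
// to every element, selecting mu, sigma^2, gamma and beta by channel.
std::vector<float> batch_norm(const std::vector<float>& x,
                              std::size_t n, std::size_t c, std::size_t s,
                              const std::vector<float>& gamma,
                              const std::vector<float>& beta,
                              const std::vector<float>& mean,
                              const std::vector<float>& variance,
                              float epsilon) {
    assert(x.size() == n * c * s);
    assert(gamma.size() == c && beta.size() == c);
    assert(mean.size() == c && variance.size() == c);
    std::vector<float> y(x.size());
    for (std::size_t b = 0; b < n; ++b)
        for (std::size_t ch = 0; ch < c; ++ch) {
            const float scale = gamma[ch] / std::sqrt(variance[ch] + epsilon);
            for (std::size_t p = 0; p < s; ++p) {
                const std::size_t i = (b * c + ch) * s + p;
                y[i] = (x[i] - mean[ch]) * scale + beta[ch];
            }
        }
    return y;
}
```

In this sketch, the ``use_global = false`` case corresponds to passing the result of ``batch_stats`` as ``mean`` and ``variance``, while the ``use_global = true`` case corresponds to passing ``global_mean`` and ``global_variance`` directly and skipping ``batch_stats`` altogether.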