.. batch_norm.rst:

#########
BatchNorm
#########

.. code-block:: cpp

   BatchNorm  // Produces a normalized output


Description
===========

Normalizes ``input`` independently for each channel, then scales the result by
``gamma`` and offsets it by ``beta``.

Inputs
------

+---------------------+-------------------------+-----------------------------+
| Name                | Element Type            | Shape                       |
+=====================+=========================+=============================+
| ``input``           | same as ``gamma``       | \(..., C, ...\)             |
+---------------------+-------------------------+-----------------------------+
| ``gamma``           | any                     | \(C\)                       |
+---------------------+-------------------------+-----------------------------+
| ``beta``            | same as ``gamma``       | \(C\)                       |
+---------------------+-------------------------+-----------------------------+
| ``global_mean``     | same as ``gamma``       | \(C\)                       |
+---------------------+-------------------------+-----------------------------+
| ``global_variance`` | same as ``gamma``       | \(C\)                       |
+---------------------+-------------------------+-----------------------------+
| ``use_global``      | ``bool``                | \(\)                        |
+---------------------+-------------------------+-----------------------------+


Attributes
----------

+------------------+--------------------+---------------------+
| Name             | Type               | Notes               |
+==================+====================+=====================+
| ``epsilon``      | same as ``input``  | Added to variance   |
+------------------+--------------------+---------------------+
| ``channel_axis`` | size_t             | Channel axis        |
+------------------+--------------------+---------------------+

Outputs
-------

+---------------------+-------------------------+-----------------------------+
| Name                | Element Type            | Shape                       |
+=====================+=========================+=============================+
| ``normalized``      | same as ``gamma``       | same as ``input``           |
+---------------------+-------------------------+-----------------------------+
| ``batch_mean``      | same as ``gamma``       | \(C\)                       |
+---------------------+-------------------------+-----------------------------+
| ``batch_variance``  | same as ``gamma``       | \(C\)                       |
+---------------------+-------------------------+-----------------------------+

The ``batch_mean`` and ``batch_variance`` outputs are computed per channel from
``input``. They need to be computed only if ``use_global`` is ``false`` or if
the outputs themselves are used.


Mathematical Definition
=======================

The axes of the input fall into two categories: positional and channel, with 
channel being axis 1. For each position, there are :math:`C` channel values, 
each normalized independently.

Normalization of a channel sample is controlled by two values:

*  the mean :math:`\mu`, and 
*  the variance :math:`\sigma^2`; 

and by two per-channel inputs: the scale :math:`\gamma` and the offset :math:`\beta`.

The values for :math:`\mu` and :math:`\sigma^2` are taken from ``global_mean``
and ``global_variance`` when ``use_global`` is ``true``; otherwise they are
computed as the per-channel mean and variance of ``input``.

.. math::

   y_c = \frac{x_c-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\gamma_c+\beta_c
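
The formula can be illustrated with a minimal sketch. It assumes a row-major
``float`` buffer of shape :math:`(N, C, S)`, with the channel on axis 1 and all
remaining positional axes flattened into :math:`S`. The function name
``normalize_channels`` and the flat-buffer layout are illustrative assumptions
and are not part of the nGraph API.

.. code-block:: cpp

   #include <cassert>
   #include <cmath>
   #include <cstddef>
   #include <vector>

   // Hypothetical helper, not an nGraph function. Applies
   //   y = (x - mean[c]) / sqrt(variance[c] + epsilon) * gamma[c] + beta[c]
   // to a row-major buffer of shape (n, c, s) with the channel on axis 1.
   std::vector<float> normalize_channels(const std::vector<float>& input,
                                         std::size_t n,
                                         std::size_t c,
                                         std::size_t s,
                                         const std::vector<float>& gamma,
                                         const std::vector<float>& beta,
                                         const std::vector<float>& mean,
                                         const std::vector<float>& variance,
                                         float epsilon)
   {
       assert(input.size() == n * c * s);
       assert(gamma.size() == c && beta.size() == c);
       assert(mean.size() == c && variance.size() == c);

       std::vector<float> output(input.size());
       for (std::size_t batch = 0; batch < n; ++batch)
       {
           for (std::size_t ch = 0; ch < c; ++ch)
           {
               // The inverse standard deviation is constant within a channel.
               const float inv_std = 1.0f / std::sqrt(variance[ch] + epsilon);
               for (std::size_t pos = 0; pos < s; ++pos)
               {
                   const std::size_t i = (batch * c + ch) * s + pos;
                   output[i] = (input[i] - mean[ch]) * inv_std * gamma[ch] + beta[ch];
               }
           }
       }
       return output;
   }

The ``mean`` and ``variance`` arguments stand in for whichever source
``use_global`` selects, so the same loop would serve both cases.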

The mean and variance can be supplied as arguments, or they can be computed for
each channel of ``input`` over the positional axes. When computed from
``input``, the per-channel mean and variance are available as the
``batch_mean`` and ``batch_variance`` outputs.
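
Computing the statistics from ``input`` can be sketched as follows, under the
same assumed :math:`(N, C, S)` row-major layout with the channel on axis 1. This
page does not say whether the variance is the population or the sample
variance; the sketch assumes the population form (dividing by the number of
values per channel). ``ChannelStats`` and ``batch_statistics`` are hypothetical
names used only for illustration.

.. code-block:: cpp

   #include <cassert>
   #include <cstddef>
   #include <vector>

   // Hypothetical result type: one mean and one variance per channel.
   struct ChannelStats
   {
       std::vector<float> mean;
       std::vector<float> variance;
   };

   // Hypothetical helper, not an nGraph function. Reduces a row-major
   // (n, c, s) buffer over every axis except the channel axis (axis 1).
   ChannelStats batch_statistics(const std::vector<float>& input,
                                 std::size_t n,
                                 std::size_t c,
                                 std::size_t s)
   {
       assert(n * s > 0);
       assert(input.size() == n * c * s);

       const float count = static_cast<float>(n * s);
       ChannelStats stats{std::vector<float>(c, 0.0f), std::vector<float>(c, 0.0f)};

       // First pass: accumulate the per-channel sum, then divide for the mean.
       for (std::size_t batch = 0; batch < n; ++batch)
           for (std::size_t ch = 0; ch < c; ++ch)
               for (std::size_t pos = 0; pos < s; ++pos)
                   stats.mean[ch] += input[(batch * c + ch) * s + pos];
       for (std::size_t ch = 0; ch < c; ++ch)
           stats.mean[ch] /= count;

       // Second pass: mean squared deviation (assumed population variance).
       for (std::size_t batch = 0; batch < n; ++batch)
           for (std::size_t ch = 0; ch < c; ++ch)
               for (std::size_t pos = 0; pos < s; ++pos)
               {
                   const float d = input[(batch * c + ch) * s + pos] - stats.mean[ch];
                   stats.variance[ch] += d * d;
               }
       for (std::size_t ch = 0; ch < c; ++ch)
           stats.variance[ch] /= count;

       return stats;
   }

The two-pass form is chosen for readability; a single-pass accumulation would
also work but is more sensitive to rounding.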


C++ Interface
==============

.. doxygenclass:: ngraph::op::BatchNormTraining
   :project: ngraph
   :members:


.. doxygenclass:: ngraph::op::BatchNormInference
   :project: ngraph
   :members: