Skip to content
Projects
Groups
Snippets
Help
Loading...
Sign in / Register
Toggle navigation
N
ngraph
Project
Project
Details
Activity
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
0
Issues
0
List
Board
Labels
Milestones
Merge Requests
0
Merge Requests
0
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Packages
Packages
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
submodule
ngraph
Commits
220288e3
Unverified
Commit
220288e3
authored
Dec 01, 2018
by
Scott Cyphers
Committed by
GitHub
Dec 01, 2018
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Doc for Batchnorm (#2143)
parent
0c6590e7
Show whitespace changes
Inline
Side-by-side
Showing
5 changed files
with
233 additions
and
48 deletions
+233
-48
ngraph.doxyfile
doc/sphinx/ngraph.doxyfile
+1
-1
batch_norm_inference.rst
doc/sphinx/source/ops/batch_norm_inference.rst
+80
-0
batch_norm_training.rst
doc/sphinx/source/ops/batch_norm_training.rst
+38
-45
batch_norm_training_backprop.rst
doc/sphinx/source/ops/batch_norm_training_backprop.rst
+108
-0
index.rst
doc/sphinx/source/ops/index.rst
+6
-2
No files found.
doc/sphinx/ngraph.doxyfile
View file @
220288e3
...
...
@@ -1807,7 +1807,7 @@ SEARCH_INCLUDES = YES
# preprocessor.
# This tag requires that the tag SEARCH_INCLUDES is set to YES.
INCLUDE_PATH =
INCLUDE_PATH =
../../src
# You can use the INCLUDE_FILE_PATTERNS tag to specify one or more wildcard
# patterns (like *.h and *.hpp) to filter out the header-files in the
...
...
doc/sphinx/source/ops/batch_norm_inference.rst
0 → 100644
View file @
220288e3
.. batch_norm_inference.rst:
##################
BatchNormInference
##################
.. code-block:: cpp
BatchNormInference // Adjust input for mean and variance
Description
===========
Inputs
------
+---------------------+-------------------------+------------------------------+
| Name | Element Type | Shape |
+=====================+=========================+==============================+
| ``input`` | real | :math:`(\bullet, C, \ldots)` |
+---------------------+-------------------------+------------------------------+
| ``gamma`` | same as ``input`` | :math:`(C)` |
+---------------------+-------------------------+------------------------------+
| ``beta`` | same as ``input`` | :math:`(C)` |
+---------------------+-------------------------+------------------------------+
| ``mean`` | same as ``input`` | :math:`(C)` |
+---------------------+-------------------------+------------------------------+
| ``variances`` | same as ``input`` | :math:`(C)` |
+---------------------+-------------------------+------------------------------+
Attributes
----------
+------------------+--------------------+--------------------------------------------------------+
| Name | Type | Notes |
+==================+====================+========================================================+
| ``epsilon`` | ``double`` | Small bias added to variance to avoid division by 0. |
+------------------+--------------------+--------------------------------------------------------+
Outputs
-------
+---------------------+-------------------------+-----------------------------+
| Name | Element Type | Shape |
+=====================+=========================+=============================+
| ``normalized`` | same as ``gamma`` | Same as ``input`` |
+---------------------+-------------------------+-----------------------------+
Mathematical Definition
=======================
The axes of the input fall into two categories: positional and channel, with
channel being axis 1. For each position, there are :math:`C` channel values,
each normalized independently.
Normalization of a channel sample is controlled by two values:
* the `mean` :math:`\mu`, and
* the `variance` :math:`\sigma^2`;
and by two scaling attributes: :math:`\gamma` and :math:`\beta`.
.. math::
\mathtt{normalized}_{\bullet, c, \ldots} = \frac{\mathtt{input}_{\bullet, c, \ldots}-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\gamma_c+\beta_c
C++ Interface
==============
.. doxygenclass:: ngraph::op::BatchNormInference
:project: ngraph
:members:
doc/sphinx/source/ops/batch_norm.rst
→
doc/sphinx/source/ops/batch_norm
_training
.rst
View file @
220288e3
.. batch_norm.rst:
.. batch_norm
_training
.rst:
#########
BatchNorm
#########
#########
########
BatchNorm
Training
#########
########
.. code-block:: cpp
BatchNorm
// Produces a normalized output
BatchNorm
Training // Compute mean and variance from the input.
Description
===========
Produces a normalized output.
Inputs
------
+---------------------+-------------------------+-----------------------------+
+---------------------+-------------------------+-----------------------------
-
+
| Name | Element Type | Shape |
+=====================+=========================+=============================+
| ``input`` | same as ``gamma`` | \(..., C, ...\) |
+---------------------+-------------------------+-----------------------------+
| ``gamma`` | any | \(C\) |
+---------------------+-------------------------+-----------------------------+
| ``beta`` | same as ``gamma`` | \(C\) |
+---------------------+-------------------------+-----------------------------+
| ``global_mean`` | same as ``gamma`` | \(C\) |
+---------------------+-------------------------+-----------------------------+
| ``global_variance`` | same as ``gamma`` | \(C\) |
+---------------------+-------------------------+-----------------------------+
| ``use_global`` | ``bool`` | \(\) |
+---------------------+-------------------------+-----------------------------+
+=====================+=========================+==============================+
| ``input`` | real | :math:`(\bullet, C, \ldots)` |
+---------------------+-------------------------+------------------------------+
| ``gamma`` | same as ``input`` | :math:`(C)` |
+---------------------+-------------------------+------------------------------+
| ``beta`` | same as ``input`` | :math:`(C)` |
+---------------------+-------------------------+------------------------------+
Attributes
----------
+------------------+--------------------+---------------------+
+------------------+--------------------+---------------------
-----------------------------------
+
| Name | Type | Notes |
+==================+====================+=====================+
| ``epsilon`` | same as ``input`` | Bias for variance |
+------------------+--------------------+---------------------+
| ``channel_axis`` | size_t | Channel axis |
+------------------+--------------------+---------------------+
+==================+====================+========================================================+
| ``epsilon`` | ``double`` | Small bias added to variance to avoid division by 0. |
+------------------+--------------------+--------------------------------------------------------+
Outputs
-------
...
...
@@ -51,16 +43,15 @@ Outputs
+---------------------+-------------------------+-----------------------------+
| Name | Element Type | Shape |
+=====================+=========================+=============================+
| ``normalized`` | same as ``gamma`` |
s
ame as ``input`` |
| ``normalized`` | same as ``gamma`` |
S
ame as ``input`` |
+---------------------+-------------------------+-----------------------------+
| ``batch_mean`` | same as ``gamma`` |
\(C\)
|
| ``batch_mean`` | same as ``gamma`` |
:math:`(C)`
|
+---------------------+-------------------------+-----------------------------+
| ``batch_variance`` | same as ``gamma`` |
\(C\)
|
| ``batch_variance`` | same as ``gamma`` |
:math:`(C)`
|
+---------------------+-------------------------+-----------------------------+
The ``batch_mean`` and ``batch_variance`` outputs are computed per-channel from
``input``. The values only need to be computed if ``use_global`` is ``false``,
or if they are used.
``input``.
Mathematical Definition
...
...
@@ -72,22 +63,29 @@ each normalized independently.
Normalization of a channel sample is controlled by two values:
* the mean :math:`\mu`, and
* the variance :math:`\sigma^2`;
* the `batch_mean` :math:`\mu`, and
* the `batch_variance` :math:`\sigma^2`;
and by two scaling attributes: :math:`\gamma` and :math:`\beta`.
The values for :math:`\mu` and :math:`\sigma^2` come either from computing the
mean and variance of ``input``, or from ``global_mean`` and ``global_variance``,
depending on the value of ``use_global``.
The values for :math:`\mu` and :math:`\sigma^2` come from computing the
mean and variance of ``input``.
.. math::
\mu_c &= \mathop{\mathbb{E}}\left(\mathtt{input}_{\bullet, c, \ldots}\right)\\
\sigma^2_c &= \mathop{\mathtt{Var}}\left(\mathtt{input}_{\bullet, c, \ldots}\right)\\
\mathtt{normlized}_{\bullet, c, \ldots} &= \frac{\mathtt{input}_{\bullet, c, \ldots}-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\gamma_c+\beta_c
Backprop
========
.. math::
y_c = \frac{x_c-\mu_c}{\sqrt{\sigma^2_c+\epsilon}}\gamma_c+\beta_c
[\overline{\texttt{input}}, \overline{\texttt{gamma}}, \overline{\texttt{beta}}]=\\
\mathop{\texttt{BatchNormTrainingBackprop}}(\texttt{input},\texttt{gamma},\texttt{beta},\texttt{mean},\texttt{variance},\overline{\texttt{normed_input}}).
The mean and variance can be arguments, or they may be computed for each channel
of ``input`` over the positional axes. When computed from ``input``, the mean
and variance per-channel are available as outputs.
C++ Interface
...
...
@@ -98,8 +96,3 @@ C++ Interface
:members:
.. doxygenclass:: ngraph::op::BatchNormInference
:project: ngraph
:members:
doc/sphinx/source/ops/batch_norm_training_backprop.rst
0 → 100644
View file @
220288e3
.. batch_norm_training_backprop.rst:
#########################
BatchNormTrainingBackprop
#########################
.. code-block:: cpp
BatchNormTrainingBackprop // Compute mean and variance backprop from the input.
Description
===========
Computes the ``input``, ``gamma`` and ``beta`` backprop increments.
Inputs
------
+----------------------+-------------------------+------------------------------+
| Name | Element Type | Shape |
+======================+=========================+==============================+
| ``input`` | real | :math:`(\bullet, C, \ldots)` |
+----------------------+-------------------------+------------------------------+
| ``gamma`` | same as ``input`` | :math:`(C)` |
+----------------------+-------------------------+------------------------------+
| ``beta`` | same as ``input`` | :math:`(C)` |
+----------------------+-------------------------+------------------------------+
| ``mean`` | same as ``input`` | :math:`(C)` |
+----------------------+-------------------------+------------------------------+
| ``variance`` | same as ``input`` | :math:`(C)` |
+----------------------+-------------------------+------------------------------+
| ``normalized_delta`` | same as ``input`` | same as ``input`` |
+----------------------+-------------------------+------------------------------+
Attributes
----------
+------------------+--------------------+--------------------------------------------------------+
| Name | Type | Notes |
+==================+====================+========================================================+
| ``epsilon`` | ``double`` | Small bias added to variance to avoid division by 0. |
+------------------+--------------------+--------------------------------------------------------+
Outputs
-------
+---------------------+-------------------------+-----------------------------+
| Name | Element Type | Shape |
+=====================+=========================+=============================+
| ``input_delta`` | same as ``input`` | Same as ``input`` |
+---------------------+-------------------------+-----------------------------+
| ``gamma_delta`` | same as ``gamma`` | :math:`(C)` |
+---------------------+-------------------------+-----------------------------+
| ``beta_delta`` | same as ``beta`` | :math:`(C)` |
+---------------------+-------------------------+-----------------------------+
Mathematical Definition
=======================
It is easiest to simplify by looking at a single channel and flattening the
remaining axes into a vector; so ``gamma`` and ``beta`` are scalars, and ``input`` is an
:math:`N`-element vector.
The step by step forward training computation is
.. math::
\mathtt{mean} &= \frac{\sum{\mathtt{input}_i}}{N}\\
\mathtt{centered}_i &= \mathtt{input}_i - \mathtt{mean}\\
\mathtt{square}_i &= \mathtt{centered}_i^2\\
\mathtt{variance} &= \frac{\sum \mathtt{square}_i}{N}\\
\mathtt{invsqrt} &= \frac{1}{\sqrt{\mathtt{variance}+\epsilon}}\\
\mathtt{gmul} &= \texttt{gamma}\cdot \mathtt{invsqrt}\\
\mathtt{normed}_i &= \mathtt{centered}_i\mathtt{gmul}+\texttt{beta}
Using the notation :math:`\overline{\texttt{name}}` for :math:`\texttt{name_delta}`
and :math:`\overline{x} \leftarrow y`
to mean the backprop value for :math:`\texttt{x_delta}` is a sum that includes :math:`y`.
We work backwards
.. math::
\overline{\texttt{beta}}&\leftarrow \overline{\texttt{normed}}\\
\overline{\texttt{gmul}}&\leftarrow \sum \overline{\texttt{normed}}_i\\
\overline{\texttt{centered}}_i&\leftarrow\overline{\texttt{normed}}_i\texttt{gmul}\\
\overline{\texttt{gamma}}&\leftarrow \overline{\texttt{gmul}}\cdot\texttt{invsqrt}\\
\overline{\texttt{invsqrt}}&\leftarrow\texttt{gamma}\cdot\overline{\texttt{gmul}}\\
\overline{\texttt{variance}}&\leftarrow -\frac{\overline{\texttt{invsqrt}}\cdot\texttt{invsqrt}}{2\cdot(\texttt{variance}+\epsilon)}\\
\overline{\texttt{square}}_i&\leftarrow\frac{\overline{\texttt{variance}}}{N}\\
\overline{\texttt{centered}}_i&\leftarrow 2\cdot\texttt{centered}_i\cdot\overline{\texttt{square}}_i\\
\overline{\texttt{input}}_i&\leftarrow\overline{\texttt{centered}}_i\\
\overline{\texttt{mean}}&\leftarrow\sum\overline{\texttt{centered}}_i\\
\overline{\texttt{input}}_i&\leftarrow\frac{\overline{\texttt{mean}}}{N}
C++ Interface
==============
.. doxygenclass:: ngraph::op::BatchNormTrainingBackprop
:project: ngraph
:members:
doc/sphinx/source/ops/index.rst
View file @
220288e3
...
...
@@ -56,7 +56,9 @@ Not currently a comprehensive list.
* :doc:`atan`
* :doc:`avg_pool`
* :doc:`avg_pool_backprop`
* :doc:`batch_norm`
* :doc:`batch_norm_inference`
* :doc:`batch_norm_training`
* :doc:`batch_norm_training_backprop`
* :doc:`broadcast`
* :doc:`ceiling`
* :doc:`concat`
...
...
@@ -123,7 +125,9 @@ Not currently a comprehensive list.
atan.rst
avg_pool.rst
avg_pool_backprop.rst
batch_norm.rst
batch_norm_inference.rst
batch_norm_training.rst
batch_norm_training_backprop.rst
broadcast.rst
ceiling.rst
concat.rst
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment