submodule / ngraph
Commit 1ee02b23, authored Mar 01, 2018 by Leona C
Convert code snippets to literal includes for documentation maintenance and other misc edits
Parent: b9841863
Showing 2 changed files with 81 additions and 131 deletions (+81 -131)
doc/sphinx/ngraph_theme/static/css/theme.css (+5 -4)
doc/sphinx/source/howto/execute.rst (+76 -127)
doc/sphinx/ngraph_theme/static/css/theme.css
@@ -1837,18 +1837,19 @@ div[class^='highlight'] td.code {
 }
 
 code, p.caption, caption-text {
-    font-family: RobotoSlab, sans, monospace;
+    font-family: Inconsolata, sans, monospace;
     color: #A79992;
-    font-size: 0.95em;
-    line-height: 1.11em;
+    font-size: 0.99em;
+    line-height: 1.39em;
 }
 
 .code-block-caption {
     font-variant: small-caps;
     font-size: 0.88em;
-    background-color: #c3d5d5;
+    background-color: #d0dfdf;
     padding-right: 0.43em;
     padding-top: 0.23em;
+    padding-left: 0.11em;
     padding-bottom: 0.23em;
     text-align: right;
 }
doc/sphinx/source/howto/execute.rst
@@ -7,12 +7,13 @@ Executing a Computation
 This section explains how to manually perform the steps that would normally be
 performed by a framework :term:`bridge` to execute a computation. Intel® nGraph
 library is targeted toward automatic construction; it is far easier for a
-processing unit (GPU, CPU, or custom silicon) to run a computation than it is
-for a user to map out how that computation happens.
+processing unit (GPU, CPU, or NNP) to run a computation than it is for a user
+to map out how that computation happens. Unfortunately, things that make by-hand
+graph construction simpler tend to make automatic construction more difficult,
+and vice versa.
 
-Here we will do all the bridge steps manually. Unfortunately, things that make
-by-hand graph construction simpler tend to make automatic construction more
-difficult, and vice versa.
+Here we will do all the bridge steps manually. The model description we'll write
+is based on the :file:`abc.cpp` file in the ``/doc/examples/`` directory.
 
 In order to successfully run a computation, the entity (framework or user) must
 be able to do all of these things:
@@ -37,13 +38,14 @@ from a framework's representation of the computation, graph construction can be
 somewhat more tedious for users. To a user, who is usually interested in
 specific nodes (vertices) or edges of a computation that reveal "what is
 happening where", it can be helpful to think of a computation as a zoomed-out
-and *stateless* dataflow graph where all of the nodes are basic well-defined
-tensor operations and all of the edges denote use of an output from
-one operation as an input for another operation.
+and *stateless* dataflow graph where all of the nodes are well-defined tensor
+operations and all of the edges denote use of an output from one operation as
+an input for another operation.
 
 .. TODO
 
-.. image for representing nodes and edges
+.. image for representing nodes and edges of (a+b)*c
 
 Most of the public portion of the nGraph API is in the ``ngraph`` namespace, so
 we will omit the namespace. Use of namespaces other than ``std`` will be
@@ -63,47 +65,40 @@ pointers is given in the glossary.
 Every node has zero or more *inputs*, zero or more *outputs*, and zero or more
 *attributes*. For our purpose to :ref:`define_cmp`, nodes should be thought of
 as essentially immutable; that is, when constructing a node, we need to supply
 all of its inputs. We get this process started with ops that have no
-inputs,
-since any op with no inputs is going to first need some inputs.
+inputs, since any op with inputs is going to first need some inputs.
 
 ``op::Parameter`` specifes the tensors that will be passed to the computation.
 They receive their values from outside of the graph, so they have no inputs.
 They have attributes for the element type and the shape of the tensor that will
 be passed to them.
 
-.. code-block:: cpp
-
-   Shape s{2, 3};
-   auto a = std::make_shared<op::Parameter>(element::f32, s);
-   auto b = std::make_shared<op::Parameter>(element::f32, s);
-   auto c = std::make_shared<op::Parameter>(element::f32, s);
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 11-13
 
 Here we have made three parameter nodes, each a 32-bit float of shape
 ``(2, 3)`` using a row-major element layout.
 
-We can create a graph for ``(a+b)*c`` by creating an ``op::Add`` node
-with inputs from ``a`` and ``b``, and an ``op::Multiply`` node from
-the add node and ``c``:
+We can create a graph for ``(a+b)*c`` by creating an ``op::Add`` node with inputs
+from ``a`` and ``b``, and an ``op::Multiply`` node from the add node and ``c``:
 
-.. code-block:: cpp
-
-   auto t0 = std::make_shared<op::Add>(a, b);
-   auto t1 = std::make_shared<op::Multiply(t0, c);
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 15-16
 
-When the ``op::Add`` op is constructed, it will check that the element
-types and shapes of its inputs match; to support multiple frameworks,
-ngraph does not do automatic type conversion or broadcasting. In this
-case, they match, and the shape of the unique output of ``t0`` will be
-a 32-bit float with shape ``(2, 3)``. Similarly, ``op::Multiply``
-checks that its inputs match and sets the element type and shape of
-its unique output.
+When the ``op::Add`` op is constructed, it will check that the element types and
+shapes of its inputs match; to support multiple frameworks, ngraph does not do
+automatic type conversion or broadcasting. In this case, they match, and the
+shape of the unique output of ``t0`` will be a 32-bit float with shape ``(2, 3)``.
+Similarly, ``op::Multiply``checks that its inputs match and sets the element
+type and shape of its unique output.
 
 Once the graph is built, we need to package it in a ``Function``:
 
-.. code-block:: cpp
-
-   auto f = make_shared<Function>(NodeVector{t1}, ParameterVector{a, b, c});
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 19
 
 The first argument to the constuctor specifies the nodes that the function will
 return; in this case, the product. A ``NodeVector`` is a vector of shared
@@ -111,7 +106,7 @@ pointers of ``op::Node``. The second argument specifies the parameters of the
 function, in the order they are to be passed to the compiled function. A
 ``ParameterVector`` is a vector of shared pointers to ``op::Parameter``.
 
-.. important:: The parameter vector must include ***every** parameter used in
+.. important:: The parameter vector must include **every** parameter used in
    the computation of the results.
@@ -121,43 +116,44 @@ Specify the backend upon which to run the computation
 =====================================================
 
 For a framework bridge, a *backend* is the environment that can perform the
-computations; it can be done with a CPU, GPU, or an NNP. A *transformer* can
-compile computations for a backend, allocate and deallocate tensors, and invoke
-computations.
+computations; it can be done with a CPU, GPU, or an Intel Nervana NNP. A
+*transformer* can compile computations for a backend, allocate and deallocate
+tensors, and invoke computations.
 
 Factory-like managers for classes of backend managers can compile a ``Function``
 and allocate backends. A backend is somewhat analogous to a multi-threaded
 process.
 
-There are two backends for the CPU, the optimized "CPU" backend, which
-makes use of mkl-dnn, and the "INTERPRETER" backend which runs
-reference versions of kernels where implementation clarity is favored
-over speed. The "INTERPRETER" backend is mainly used for testing.
+There are two backends for the CPU: the optimized ``"CPU"`` backend, which uses
+the `Intel MKL-DNN`_, and the ``"INTERPRETER"`` backend, which runs reference
+versions of kernels that favor implementation clarity over speed. The
+``"INTERPRETER"`` backend can be slow, and is primarily intended for testing.
 
-To select the "CPU" backend,
+To select the ``"CPU"`` backend,
 
-.. code-block:: cpp
-
-   auto manager = runtime::Manager::get("CPU");
-   auto backend = manager->allocate_backend();
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 22-23
 
 .. _compile_cmp:
 
 Compile the computation
 =======================
 
-Compilation produces something misnamed an ``ExternalFunction``, which
-is a factory for producing a ``CallFrame``, a function and associated
-state that can run in a single thread at a time. A ``CallFrame`` may
-be reused, but any particular ``CallFrame`` must only be running in
-one thread at any time. If more than one thread needs to execute the
-function at the same time, create multiple ``CallFrame`` objects from
-the ``ExternalFunction``.
+Compilation triggers something that can be used as a factory for producing a
+``CallFrame`` which is a *function* and its associated *state* that can run
+in a single thread at a time. A ``CallFrame`` may be reused, but any particular
+``CallFrame`` must only be running in one thread at any time. If more than one
+thread needs to execute the function at the same time, create multiple
+``CallFrame`` objects from the ``ExternalFunction``.
 
 .. code-block:: cpp
 
-   auto external = manager->compile(f);
-   auto cf = backend->make_call_frame(external);
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 24-28
 
 .. _allocate_bkd_storage:
@@ -178,12 +174,9 @@ Backends are responsible for managing storage. If the storage is off-CPU, caches
 are used to minimize copying between device and CPU. We can allocate storage for
 the three parameters and return value as follows:
 
-.. code-block:: cpp
-
-   auto t_a = backend->make_primary_tensor_view(element::f32, shape);
-   auto t_b = backend->make_primary_tensor_view(element::f32, shape);
-   auto t_c = backend->make_primary_tensor_view(element::f32, shape);
-   auto t_result = backend->make_primary_tensor_view(element::f32, shape);
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 30-33
 
 Each tensor is a shared pointer to a ``runtime::TensorView``, the interface
 backends implement for tensor use. When there are no more references to the
@@ -194,18 +187,17 @@ tensor view, it will be freed when convenient for the backend.
 Initialize the inputs
 =====================
 
-Normally the framework bridge reads/writes bytes to the tensor, assuming a
+Normally the framework bridge reads and writes bytes to the tensor, assuming a
 row-major element layout. To simplify writing unit tests, we have developed a
 class for making tensor literals. We can use these to initialize our tensors:
 
-.. code-block:: cpp
-
-   copy_data(t_a, test::NDArray<float, 2>({{1, 2, 3}, {4, 5, 6}}).get_vector());
-   copy_data(t_b, test::NDArray<float, 2>({{7, 8, 9}, {10, 11, 12}}).get_vector());
-   copy_data(t_c, test::NDArray<float, 2>({{1, 0, -1}, {-1, 1, 2}}).get_vector());
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 36-38
 
-The ``test::NDArray`` needs to know the element type (``float``) and rank (``2``)
-of the tensors, and figures out the shape during template expansion.
+The ``test::NDArray`` needs to know the element type (``float`` for this
+example) and rank (``2`` for this example) of the tensors; it will then
+populate the shape during template expansion.
 
 The ``runtime::TensorView`` interface has ``write`` and ``read`` methods for
 copying data to/from the tensor.
@@ -215,12 +207,13 @@ copying data to/from the tensor.
 Invoke the computation
 ======================
 
-To invoke the function, we simply pass argument and result tensors to
-the call frame:
+To invoke the function, we simply pass argument and result ant tensors to the
+call frame:
 
-.. code-block:: cpp
-
-   cf->call({t_a, t_b, t_c}, {t_result});
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 41
 
 .. _access_outputs:
@@ -229,61 +222,17 @@ Access the outputs
 We can use the ``read`` method to access the result:
 
-.. code-block:: cpp
-
-   float r[2,3];
-   t_result->read(&r, 0, sizeof(r));
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 44-45
 
 .. _all_together:
 
 Putting it all together
 =======================
 
-.. code-block:: cpp
-
-   #include <iostream>
-
-   #include <ngraph.hpp>
-
-   using namespace ngraph;
-
-   void main()
-   {
-       // Build the graph
-       Shape s{2, 3};
-       auto a = std::make_shared<op::Parameter>(element::f32, s);
-       auto b = std::make_shared<op::Parameter>(element::f32, s);
-       auto c = std::make_shared<op::Parameter>(element::f32, s);
-       auto t0 = std::make_shared<op::Add>(a, b);
-       auto t1 = std::make_shared < op::Multiply(t0, c);
-
-       // Make the function
-       auto f = make_shared<Function>(NodeVector{t1}, ParameterVector{a, b, c});
-
-       // Get the backend
-       auto manager = runtime::Manager::get("CPU");
-       auto backend = manager->allocate_backend();
-       auto external = manager->compile(f);
-
-       // Compile the function
-       auto cf = backend->make_call_frame(external);
-
-       // Allocate tensors
-       auto t_a = backend->make_primary_tensor_view(element::f32, shape);
-       auto t_b = backend->make_primary_tensor_view(element::f32, shape);
-       auto t_c = backend->make_primary_tensor_view(element::f32, shape);
-       auto t_result = backend->make_primary_tensor_view(element::f32, shape);
-
-       // Initialize tensors
-       copy_data(t_a, test::NDArray<float, 2>({{1, 2, 3}, {4, 5, 6}}).get_vector());
-       copy_data(t_b, test::NDArray<float, 2>({{7, 8, 9}, {10, 11, 12}}).get_vector());
-       copy_data(t_c, test::NDArray<float, 2>({{1, 0, -1}, {-1, 1, 2}}).get_vector());
-
-       // Invoke the function
-       cf->call({t_a, t_b, t_c}, {t_result});
-
-       // Get the result
-       float r[2, 3];
-       t_result->read(&r, 0, sizeof(r));
-   }
+.. literalinclude:: ../../../examples/abc.cpp
+   :language: cpp
+   :lines: 1-46
+   :caption: "The (a + b) * c example for executing a computation on nGraph"
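
As a sanity check on the example's data, the values that ``t_result`` would be expected to hold can be worked out without nGraph. The sketch below is plain C++ and is not part of this commit or of the nGraph API; ``add_then_multiply`` and ``expected_abc_result`` are hypothetical names introduced here. It assumes the tensors are flattened in row-major order, as the documentation states, and computes ``(a + b) * c`` element by element for the three input literals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an nGraph call): element-wise (a + b) * c over
// flat row-major buffers. Like op::Add and op::Multiply in the text, it
// requires matching sizes and does no broadcasting.
std::vector<float> add_then_multiply(const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     const std::vector<float>& c)
{
    assert(a.size() == b.size() && b.size() == c.size());
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        out[i] = (a[i] + b[i]) * c[i];
    }
    return out;
}

// The three 2 x 3 input literals from the example, flattened row by row.
std::vector<float> expected_abc_result()
{
    const std::vector<float> a{1, 2, 3, 4, 5, 6};
    const std::vector<float> b{7, 8, 9, 10, 11, 12};
    const std::vector<float> c{1, 0, -1, -1, 1, 2};
    return add_then_multiply(a, b, c); // {8, 0, -12, -14, 16, 36}
}
```

Read back as a ``(2, 3)`` tensor, that is ``{{8, 0, -12}, {-14, 16, 36}}``, which is what a correct backend run of the example should leave in ``t_result``.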