Commit b9841863 authored by Scott Cyphers's avatar Scott Cyphers

Add example file

parent a55800f5
#include <iostream>
#include <ngraph.hpp>
using namespace ngraph;
void main()
{
// Build the graph
Shape s{2, 3};
auto a = std::make_shared<op::Parameter>(element::f32, s);
auto b = std::make_shared<op::Parameter>(element::f32, s);
auto c = std::make_shared<op::Parameter>(element::f32, s);
auto t0 = std::make_shared<op::Add>(a, b);
auto t1 = std::make_shared < op::Multiply(t0, c);
// Make the function
auto f = make_shared<Function>(NodeVector{t1}, ParameterVector{a, b, c});
// Get the backend
auto manager = runtime::Manager::get("CPU");
auto backend = manager->allocate_backend();
auto external = manager->compile(f);
// Compile the function
auto cf = backend->make_call_frame(external);
// Allocate tensors
auto t_a = backend->make_primary_tensor_view(element::f32, shape);
auto t_b = backend->make_primary_tensor_view(element::f32, shape);
auto t_c = backend->make_primary_tensor_view(element::f32, shape);
auto t_result = backend->make_primary_tensor_view(element::f32, shape);
// Initialize tensors
copy_data(t_a, test::NDArray<float, 2>({{1, 2, 3}, {4, 5, 6}}).get_vector());
copy_data(t_b, test::NDArray<float, 2>({{7, 8, 9}, {10, 11, 12}}).get_vector());
copy_data(t_c, test::NDArray<float, 2>({{1, 0, -1}, {-1, 1, 2}}).get_vector());
// Invoke the function
cf->call({t_a, t_b, t_c}, {t_result});
// Get the result
float r[2, 3];
t_result->read(&r, 0, sizeof(r));
}
\ No newline at end of file
...@@ -6,6 +6,7 @@ Glossary ...@@ -6,6 +6,7 @@ Glossary
======== ========
.. glossary:: .. glossary::
:sorted:
backend backend
...@@ -50,6 +51,36 @@ Glossary ...@@ -50,6 +51,36 @@ Glossary
The shape of a tensor is a tuple of non-negative integers that The shape of a tensor is a tuple of non-negative integers that
represents an exclusive upper bound for coordinate values. represents an exclusive upper bound for coordinate values.
shared pointer
The C++ standard template library has the template
``std::shared_ptr<X>``. A shared pointer is used like an ``X*``
pointer, but maintains a reference count to the underlying
object. Each new shared pointer to the object increases the
count. When a shared pointer goes out of scope, the reference
count is decremented, and, when the count reaches 0, the
underlying object is deleted. The function template
``std::make_shared<X>(...)`` can be used similarly to ``new
X(...)``, except it returns a ``std::shared_ptr<X>`` instead of
an ``X*``.
If there is a chain of shared pointers from an object back to
itself, every object in the chain is referenced, so the
reference counts will never reach 0 and the objects will never
be deleted.
If ``a`` referenced ``b`` and ``b`` wanted to track all
references to itself and shared pointers were used both
directions, there would be a chain of pointers form ``a`` to
itself. We avoid this by using shared pointers in only one
direction, and raw pointers for the inverse
direction. ``std::enabled_shared_from_this`` is a class template
that defines a method ``shared_from_this`` that provides a
shared pointer from a raw pointer.
nGraph makes use of shared pointers for objects whose lifetime
is hard to determine when they are allocated.
step step
An abstract "action" that produces zero or more tensor outputs An abstract "action" that produces zero or more tensor outputs
......
...@@ -24,7 +24,7 @@ be able to do all of these things: ...@@ -24,7 +24,7 @@ be able to do all of these things:
* :ref:`initialize_inputs` * :ref:`initialize_inputs`
* :ref:`invoke_cmp` * :ref:`invoke_cmp`
* :ref:`access_outputs` * :ref:`access_outputs`
* :ref:`all_together`
.. _define_cmp: .. _define_cmp:
...@@ -37,8 +37,9 @@ from a framework's representation of the computation, graph construction can be ...@@ -37,8 +37,9 @@ from a framework's representation of the computation, graph construction can be
somewhat more tedious for users. To a user, who is usually interested in somewhat more tedious for users. To a user, who is usually interested in
specific nodes (vertices) or edges of a computation that reveal "what is specific nodes (vertices) or edges of a computation that reveal "what is
happening where", it can be helpful to think of a computation as a zoomed-out happening where", it can be helpful to think of a computation as a zoomed-out
and *stateless* dataflow graph where all of the nodes are well-defined and all and *stateless* dataflow graph where all of the nodes are basic well-defined
of the edges are possible routes for executing a computation. tensor operations and all of the edges denote use of an output from
one operation as an input for another operation.
.. TODO .. TODO
...@@ -54,24 +55,16 @@ A computation's graph is constructed from ops; each is a member of a subclass of ...@@ -54,24 +55,16 @@ A computation's graph is constructed from ops; each is a member of a subclass of
computation, but all graphs are composed entirely of instances of ``Node``. computation, but all graphs are composed entirely of instances of ``Node``.
Computation graphs contain only ``op::Op`` nodes. Computation graphs contain only ``op::Op`` nodes.
We mostly use shared pointers for nodes. In the nGraph APIs for :doc:`ops/index`, We mostly use :term:`shared pointers<shared pointer>` for nodes, i.e.
this is presented as ``std::shared_ptr<Node>``, which allows for the ``std::shared_ptr<Node>`` so that they will be automatically
de-allocation of unreferenced nodes when they are no longer referenced. The one deallocated when they are no longer needed. A brief summary of shared
exception to this rule is that there can not be a circular path through shared pointers is given in the glossary.
pointers, as this would prevent the reference counts from ever getting to 0. In
other words, it would be impossible to entirely deallocate or unreference a node.
Every node has zero or more *inputs*, zero or more *outputs*, and zero or more Every node has zero or more *inputs*, zero or more *outputs*, and zero or more
*attributes*. For our purpose to :ref:`define_cmp`, nodes should be thought of *attributes*. For our purpose to :ref:`define_cmp`, nodes should be thought of
as essentially immutable; that is, when constructing a node, we need to supply as essentially immutable; that is, when constructing a node, we need to supply
all of its inputs. To get this process started, we need some nodes that have no all of its inputs. We get this process started with ops that have no
inputs. inputs, since any op with inputs is going to first need some inputs.
For a more concrete example, we can use the analogy of a baton relay. It can be
drawn ahead of time that the runner must hand off the baton to the next runner
at the specific place (node). The runner waiting for the baton is essentially at
an empty node until he receives the baton. The runner who surrenders the baton
to the next runner (node) is de-allocated of the baton.
``op::Parameter`` specifes the tensors that will be passed to the computation. ``op::Parameter`` specifes the tensors that will be passed to the computation.
They receive their values from outside of the graph, so they have no inputs. They receive their values from outside of the graph, so they have no inputs.
...@@ -89,7 +82,7 @@ be passed to them. ...@@ -89,7 +82,7 @@ be passed to them.
Here we have made three parameter nodes, each a 32-bit float of shape Here we have made three parameter nodes, each a 32-bit float of shape
``(2, 3)`` using a row-major element layout. ``(2, 3)`` using a row-major element layout.
We can create a graph for ``(a+b)*c)`` by creating an ``op::Add`` node We can create a graph for ``(a+b)*c`` by creating an ``op::Add`` node
with inputs from ``a`` and ``b``, and an ``op::Multiply`` node from with inputs from ``a`` and ``b``, and an ``op::Multiply`` node from
the add node and ``c``: the add node and ``c``:
...@@ -118,7 +111,7 @@ pointers of ``op::Node``. The second argument specifies the parameters of the ...@@ -118,7 +111,7 @@ pointers of ``op::Node``. The second argument specifies the parameters of the
function, in the order they are to be passed to the compiled function. A function, in the order they are to be passed to the compiled function. A
``ParameterVector`` is a vector of shared pointers to ``op::Parameter``. ``ParameterVector`` is a vector of shared pointers to ``op::Parameter``.
.. important:: The parameter vector must include* **every** *parameter used in .. important:: The parameter vector must include* **every** parameter used in
the computation of the results. the computation of the results.
...@@ -137,9 +130,9 @@ and allocate backends. A backend is somewhat analogous to a multi-threaded ...@@ -137,9 +130,9 @@ and allocate backends. A backend is somewhat analogous to a multi-threaded
process. process.
There are two backends for the CPU, the optimized "CPU" backend, which There are two backends for the CPU, the optimized "CPU" backend, which
makes use of mkl-dnn, and the "INTERPRETER" backend which runs reference makes use of mkl-dnn, and the "INTERPRETER" backend which runs
versions of kernels where implementation clarity is favored over speed. The reference versions of kernels where implementation clarity is favored
"INTERPRETER" backend is mainly used for testing. over speed. The "INTERPRETER" backend is mainly used for testing.
To select the "CPU" backend, To select the "CPU" backend,
...@@ -241,5 +234,56 @@ We can use the ``read`` method to access the result: ...@@ -241,5 +234,56 @@ We can use the ``read`` method to access the result:
float r[2,3]; float r[2,3];
t_result->read(&r, 0, sizeof(r)); t_result->read(&r, 0, sizeof(r));
.. _all_together:
Putting it all together
=======================
.. code-block:: cpp
#include <iostream>
#include <ngraph.hpp>
using namespace ngraph;
void main()
{
// Build the graph
Shape s{2, 3};
auto a = std::make_shared<op::Parameter>(element::f32, s);
auto b = std::make_shared<op::Parameter>(element::f32, s);
auto c = std::make_shared<op::Parameter>(element::f32, s);
auto t0 = std::make_shared<op::Add>(a, b);
auto t1 = std::make_shared < op::Multiply(t0, c);
// Make the function
auto f = make_shared<Function>(NodeVector{t1}, ParameterVector{a, b, c});
// Get the backend
auto manager = runtime::Manager::get("CPU");
auto backend = manager->allocate_backend();
auto external = manager->compile(f);
// Compile the function
auto cf = backend->make_call_frame(external);
// Allocate tensors
auto t_a = backend->make_primary_tensor_view(element::f32, shape);
auto t_b = backend->make_primary_tensor_view(element::f32, shape);
auto t_c = backend->make_primary_tensor_view(element::f32, shape);
auto t_result = backend->make_primary_tensor_view(element::f32, shape);
// Initialize tensors
copy_data(t_a, test::NDArray<float, 2>({{1, 2, 3}, {4, 5, 6}}).get_vector());
copy_data(t_b, test::NDArray<float, 2>({{7, 8, 9}, {10, 11, 12}}).get_vector());
copy_data(t_c, test::NDArray<float, 2>({{1, 0, -1}, {-1, 1, 2}}).get_vector());
// Invoke the function
cf->call({t_a, t_b, t_c}, {t_result});
// Get the result
float r[2, 3];
t_result->read(&r, 0, sizeof(r));
}
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment