Commit 910743cd authored by Jaikrishnan Menon

Merge branch 'master' into cpu_layout2

parents 2d5a886d 9a7ddbc0
......@@ -3,52 +3,45 @@
Graph Basics
============
This section describes the basic concepts you need to know when constructing
a graph.
Tensors
-------
*Tensors* are maps from coordinates to scalar values, all of the same type,
called the *element type* of the tensor. Coordinates are tuples of non-negative
integers; all the coordinates for a tensor have the same length, called the
*rank* of the tensor. We often use :math:`n`-tensor for tensors with rank
:math:`n`. An :math:`n`-dimensional array is a common implementation of a
tensor, and the two terms are often used interchangeably. However, a tensor
could just as easily be a function that returns 0 for every coordinate.
*Tensors* are maps from coordinates to scalar values, all of the same
type, called the *element type* of the tensor. Coordinates are tuples
of non-negative integers; all the coordinates for a tensor have the
same length, called the *rank* of the tensor. We often use
:math:`n`-tensor for tensors with rank :math:`n`.
The :term:`shape` of a tensor is a tuple of non-negative integers that
represents an exclusive upper bound for coordinate values. A tensor has an
element for every coordinate less than the shape, so the *size* of the tensor
is the product of the values in the shape.
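The relationship between shape, valid coordinates, and size can be sketched in a few lines. This is a minimal illustration only: the `Shape` and `Coordinate` aliases and the `shape_size` and `in_bounds` helpers are hypothetical stand-ins defined here, not the library's own classes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-ins: one exclusive upper bound per axis, and one
// index per axis.
using Shape = std::vector<std::size_t>;
using Coordinate = std::vector<std::size_t>;

// Size of a tensor: the product of the values in its shape. A rank-0
// shape yields 1 (a single scalar element).
std::size_t shape_size(const Shape& shape)
{
    return std::accumulate(shape.begin(), shape.end(), std::size_t{1},
                           std::multiplies<std::size_t>());
}

// True when every coordinate value is below the corresponding bound.
// Precondition: the coordinate has the same rank as the shape.
bool in_bounds(const Shape& shape, const Coordinate& coordinate)
{
    assert(shape.size() == coordinate.size());
    for (std::size_t axis = 0; axis < shape.size(); ++axis)
    {
        if (coordinate[axis] >= shape[axis])
        {
            return false;
        }
    }
    return true;
}
```

With these definitions, a shape of `{32, 32}` has 1024 elements, and the coordinate `{31, 31}` is in bounds while `{32, 0}` is not.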
An :math:`n`-dimensional array is a common implementation of a tensor, and the
two terms are often used interchangeably, but a tensor could just as easily be
a function that returns 0 for every coordinate.
In the graph, every op input must be associated with an op output, and every op
output must have a constant element type and shape that will correspond to the
tensors used in the computation.
An :math:`n`-dimensional array is the usual implementation for a
tensor, and the two terms are often used interchangeably, but a tensor
could just as easily be represented by a function that returns 0 for
every coordinate or a function that adds the elements of two other
tensors at the same coordinate and returns that sum.
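To make the "tensor as a function" view concrete, the sketch below models a tensor as any callable from a coordinate to a value. The `Tensor` alias and the `zeros`, `coordinate_sum`, and `add` helpers are assumptions made for illustration; they are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Coordinate = std::vector<std::size_t>;

// Hypothetical model: a float tensor is a map from coordinates to values.
using Tensor = std::function<float(const Coordinate&)>;

// A tensor that returns 0 for every coordinate; no array is stored.
Tensor zeros()
{
    return [](const Coordinate&) { return 0.0f; };
}

// A tensor whose element is the sum of its coordinate values.
Tensor coordinate_sum()
{
    return [](const Coordinate& c) {
        float total = 0.0f;
        for (std::size_t index : c)
        {
            total += static_cast<float>(index);
        }
        return total;
    };
}

// A tensor that adds the elements of two other tensors at the same
// coordinate and returns that sum.
Tensor add(Tensor a, Tensor b)
{
    assert(a && b); // both operands must be callable
    return [a = std::move(a), b = std::move(b)](const Coordinate& c) {
        return a(c) + b(c);
    };
}
```

Nothing here allocates element storage; each element is computed only when a coordinate is requested.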
Ops
---
The graph is a composition of tensor computations, called ``ops``, which are
nodes in the graph. In the graph, every :term:`op` *input* must be associated
with an op *output*, and every op output must have a constant element type and
shape to correspond with the tensors used in the computation. Every op has:
* zero or more inputs, and
* zero or more outputs;
these represent tensors that will be provided during execution. Ops may also
have additional attributes that do not change during execution.
A computation graph is a composition of tensor computations, called
``ops``, which are nodes in the graph. In the graph, every :term:`op`
*input* must be associated with an op *output*, and every op output
must have a fixed element type and shape to correspond with the
tensors used in the computation. Every op has zero or more inputs and
zero or more outputs. The outputs represent tensors that will be
provided during execution. Ops may also have additional attributes
that do not change during execution.
Graph function
---------------
Function definition begins with creating one or more ``Parameter`` ops,
which represent the tensors that will be supplied as arguments to the function.
Parameters have no inputs, but have attributes for the element type and shape of the
tensor that will be provided as an argument. The unique output of the
``Parameter`` will have the provided element type and shape.
Every `op` is a `Node`, but not all nodes are ops. This is because
pattern graphs are another kind of graph that includes ops combined
with nodes that describe how to match subgraphs during graph
optimization.
Constructed ops have element types and shapes for each of their outputs, which
are determined during op construction from the element types and shapes
......@@ -65,189 +58,130 @@ Here, :math:`X_I` means the value of a coordinate :math:`I` for the tensor
coordinate is the sum of the elements at that coordinate for the two inputs.
Unlike many frameworks, it says nothing about storage or arrays.
An ``Add`` op is used to represent a tensor sum. To construct an Add op, each of
the two inputs of the ``Add`` must be associated with some output of some
already-created op. All outputs of constructed ops have element types and shapes,
so when the Add is constructed, it verifies that the two outputs associated with
its two inputs have the same element type and shape and sets its output to have
the same element type and shape.
An ``Add`` op is used to represent an elementwise tensor sum. To
construct an Add op, each of the two inputs of the ``Add`` must be
assigned some output of some already-created op. All outputs of
constructed ops have element types and shapes, so when the Add is
constructed, it verifies that the two input tensors have the same
element type and shape and then sets its output to have the same
element type and shape.
Since all nodes supplying outputs for inputs to a new node must exist before the
new node can be created, it is impossible to construct a cyclic graph.
Furthermore, type-checking can be performed as the ops are constructed.
Since all nodes supplying outputs for inputs to a new node must exist
before the new node can be created, it is impossible to construct a
cyclic graph. Furthermore, type-checking is performed as the ops are
constructed.
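The two properties above (type-checking at construction, and no cycles) can be seen in a heavily simplified model. The `Node` struct and the `make_parameter` and `make_add` helpers below are hypothetical and far smaller than the real classes; element types are modeled as plain strings purely for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Shape = std::vector<std::size_t>;

// Hypothetical node with a single output of fixed element type and shape.
struct Node
{
    std::string element_type;
    Shape shape;
    std::vector<std::shared_ptr<const Node>> inputs;
};

// A parameter has no inputs; its output takes the given type and shape.
std::shared_ptr<const Node> make_parameter(std::string element_type, Shape shape)
{
    return std::make_shared<Node>(
        Node{std::move(element_type), std::move(shape), {}});
}

// Add checks its inputs when it is constructed and copies their element
// type and shape to its own output.
std::shared_ptr<const Node> make_add(std::shared_ptr<const Node> a,
                                     std::shared_ptr<const Node> b)
{
    assert(a && b); // inputs must already exist
    if (a->element_type != b->element_type || a->shape != b->shape)
    {
        throw std::invalid_argument("Add: inputs differ in element type or shape");
    }
    return std::make_shared<Node>(Node{a->element_type, a->shape, {a, b}});
}
```

Because a node can only refer to nodes that already exist and is immutable once created (every handle points to `const Node`), no sequence of calls in this sketch can produce a cycle.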
Functions
---------
Ops are grouped together in an ``ExternalFunction``, which describes a
Ops are grouped together in a ``Function``, which describes a
computation that can be invoked on tensor arguments to compute tensor
results. The caller provides tensors in the form of row-major arrays
for each argument and each computed result. The same array can be used
for more than one argument, but each result must use a distinct array,
and argument arrays cannot be used as result arrays.
The ``ExternalFunction`` has ``Parameter``, a vector of ``Parameter`` ops,
where no ``Parameter`` op may appear more than once in the vector.
Each ``Parameter`` op has attributes for its shape and element type;
arrays passed to the function must have the same shape and element type.
The ``ExternalFunction`` also has ``Nodes``, a vector of ops that
are the results being computed (Note: We may require the results to
be ``Result`` ops in the future. A ``Result`` op would have a single
input and no outputs, and complement the zero input single output
``Parameter`` op.)
results. When called by a bridge, the bridge provides tensors in the
form of row-major arrays for each argument and each computed
result. The same array can be used for more than one argument, but
each result must use a distinct array, and argument arrays cannot be
used as result arrays.
Function definition begins with creating one or more ``Parameter``
ops, which represent the tensors that will be supplied as arguments to
the function. Parameters have no inputs, but have attributes for the
element type and shape of the tensor that will be provided as an
argument. The unique output of the ``Parameter`` will have the
provided element type and shape.
A ``Function`` has ``Parameters``, a vector of ``Parameter`` ops,
where no ``Parameter`` op may appear more than once in the vector. A
``Parameter`` op has no inputs, but has attributes for its shape and
element type; arrays passed to the function must have the same shape
and element type as the corresponding parameter. The ``Function``
also has ``Nodes``, a vector of ops that are the results being
computed.
During execution, the output of the nth ``Parameter`` op will be the tensor
corresponding to the array provided as the nth argument, and the outputs
of all result ops will be written into the result arrays in row-major
order.
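For readers unfamiliar with the term, row-major order means the last axis varies fastest in memory. The sketch below shows the usual offset computation; the `Shape` and `Coordinate` aliases and the `row_major_offset` helper are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;
using Coordinate = std::vector<std::size_t>;

// Offset of a coordinate within a row-major array of the given shape.
// Preconditions: same rank, and every coordinate value below its bound.
std::size_t row_major_offset(const Shape& shape, const Coordinate& coordinate)
{
    assert(shape.size() == coordinate.size());
    std::size_t offset = 0;
    for (std::size_t axis = 0; axis < shape.size(); ++axis)
    {
        assert(coordinate[axis] < shape[axis]);
        // Scale what has been accumulated so far by this axis length,
        // then step along this axis.
        offset = offset * shape[axis] + coordinate[axis];
    }
    return offset;
}
```

For a shape of `{2, 3}`, the coordinate `{1, 2}` maps to offset 5, the last element of the six-element array.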
.. important:: During graph building, most of the storage associated
with values is *implicit*. During compilation, *explicit* storage
will be assigned in the form of *value descriptors*; this storage will
be referred to as the inputs and outputs of those calls.
Sources of values
-----------------
.. note:: The nGraph library includes a number of *built-in ops*. A
:ref:`built-in op` is like a templated function in C++, in that it
can be used with a variety of argument types. Similarly, when the
types of each argument are known in a call, the op must be able to
verify that the arguments are compatible, and it must be able to
determine the ``type`` of the returned value.
The function graph is strongly typed. Every source of a value in the graph
must be associated with a type. In a graph, values come from many possible
sources: *literals*, *calls* to ops (built-in ops or user-defined ops AKA
*functions*), and *parameters* of user-defined functions.
#. *Literals* A value type is associated with each literal, and must be
consistent with the literal's value.
#. *Calls* to **ops**. When called with appropriate arguments, an *op*
produces a return value. All arguments not fixed at compile time
must be values. In the nGraph API, the term :term:`parameter` refers
to what "stands in" for an argument in an ``op`` definition, and :term:`result`
refers to what "stands in" for the returned *value*.
For example, the ``add`` **op** is a built-in op with two run-time
parameters that **must have the same value type**. It produces a
result with the same value type as its parameters.
Another example of a built-in **op** is the ``tuple`` **op**, which has
zero or more run-time parameters of *arbitrary* value types and a result
whose type is the tuple type of the types of the parameters.
An Example
----------
#. **Functions** are user-defined ops.
- A user-defined function is "external" if it can be called externally.
- The result is a graph node that depends only on parameters.
- The result type of a call to a function is determined from the types of the arguments.
- Any external function interacting with the graph at the level of user-defined op must specify a type for each of its parameters.
#. *Parameters* of user-defined *functions* may also be a source of a graph's
values. Externally-callable functions must specify a type for each parameter.
Building a Graph
================
The function graph is composed of instances of the class ``Node``. Nodes are
created by helpers described below.
::
.. note:: method ``dependents()`` is a vector of nodes that must be computed
before the result of ``Node`` can be used.
#include <memory>
#include <ngraph.hpp>
User-defined functions
----------------------
using namespace ngraph;
When building a function graph with values derived from "custom" or user-defined
functions, use the following syntax to:
// f(a, b, c) = (a + b) * c
void make_function()
{
* create a user-defined function: ``make_shared<Function>()``
// First construct the graph
Shape shape{32, 32};
auto a = std::make_shared<op::Parameter>(element::f32, shape);
auto b = std::make_shared<op::Parameter>(element::f32, shape);
auto c = std::make_shared<op::Parameter>(element::f32, shape);
auto t0 = std::make_shared<op::Add>(a, b);
auto t1 = std::make_shared<op::Multiply>(t0, c);
* get the specified parameter of the function: \* method:``parameter(index)``
auto f = std::make_shared<Function>(Nodes{t1}, Parameters{a, b, c});
}
* return the type: \* method ``type()``
We use shared pointers for all ops. For each parameter, we need to
supply element type and shape attributes. When the function is called, each
argument must conform to the corresponding parameter element type and
shape.
* set the type to `t`: \* method ``type(ValueType t)``
During typical graph construction, all ops have one output and some
number of inputs, which makes it easy to construct the graph by
assigning each unique output of a constructor argument node to an
input of the op being constructed. For example, `Add` needs a node
output for each of its two inputs, which we supply from the
unique outputs of the parameters `a` and `b`.
* set the type to a ``TensorViewType``: \* method ``type(ElementType element_type, Shape shape)``
We do not perform any implicit element type coercion or shape
conversion (such as broadcasts) since these can be
framework-dependent, so all the shapes for the add and multiply must
be the same. If there is a mismatch, the constructor will throw an
exception.
* get the function's result: \* method ``result()``
After the graph is constructed, we create the function, passing the
`Function` constructor the nodes that are results and the parameters
that are arguments.
* return the node providing the value: \* method ``value()``
* set the node that will provide the value: \* method ``value(Node node)``
Defining ops
============
Type methods are available as with parameters. A user-defined function is
callable, and can be used to add a call to it in the graph.
A framework bridge constructs a function which is compiled/optimized
by a sequence of graph transformations that replace subgraphs of the
computation with more optimal subgraphs. Throughout this process, ops
represent tensor operations.
*Core ops* are ops that are available and generally useful to all
framework bridges and that can be compiled by all transformers. A
framework bridge may define framework-specific ops to simplify graph
construction, provided that the bridge can enable every transformer to
replace all such ops with equivalent subgraphs composed of core
ops. Similarly, transformers may define transformer-specific ops to
represent kernels or other intermediate operations. If a framework
supports extending the set of ops it offers, a bridge may even expose
transformer-specific ops to the framework user.
Built-in Ops
------------
It is easiest to define a new op by adapting an existing op. Some of
the tasks that must be performed are:
Calls to built-in ops are created with helper functions generally in the
``op`` namespace. Ops are generally callable singletons that build
calls. When building a function graph with built-in ops,
- Op constructor:
- ``op::tuple()`` produces an empty tuple
- to add a value to a tuple, use the overload ``Tuple(list<Value>)``
* to add a value to the tuple operation: \* method ``push_back(value)``
* to return the specified component, call \* method ``get(index)``
- where ``index`` is a compile-time value.
* Checking type-consistency of arguments
* Specifying the result type for a call
- Serializer/Deserializer
Example
-------
- Transformer handlers:
::
* Interpreter (reference) implementation of behavior. The
implementation should favor clarity over efficiency.
// Function with 4 parameters
auto cluster_0 = make_shared<Function>(4);
cluster_0->result()->type(element_type_float, Shape {32, 3});
cluster_0->parameter(0)->type(element_type_float, Shape {Shape {7, 3}});
cluster_0->parameter(1)->type(element_type_float, Shape {Shape {3}});
cluster_0->parameter(2)->type(element_type_float, Shape {Shape {32, 7}});
cluster_0->parameter(3)->type(element_type_float, Shape {Shape {32, 7}});
auto arg3 = cluster_0->parameter(3);
// call broadcast op on arg3, broadcasting on axis 1.
auto broadcast_1 = op::broadcast(arg3, 1);
auto arg2 = cluster_0->parameter(2);
auto arg0 = cluster_0->parameter(0);
// call dot op
auto dot = op::dot(arg2, arg0);
// Function returns tuple of dot and broadcast_1.
cluster_0->result()->value(dot);
Defining built-in ops
=====================
This section is WIP.
Built-in ops are used for several purposes:
- Constructing call nodes in the graph.
* Checking type-consistency of arguments
* Specifying the result type for a call
- Indicating preliminary tensor needs
* Index operations are aliased views
* Tuples are unboxed into tensor views
* Remaining ops given vectors of inputs and outputs
- Constructing patterns that will match sub-graphs
- Pre-transformer code generation
- Debug streaming of call descriptions
The general ``Node`` class provides for dependents and node type. The
class ``Call`` subclasses ``Node``. Built-in op implementations can
subclass ``Call`` to provide storage for compile-time parameters, such
as broadcast indices.
The plan is that the abstract class ``Op`` will have methods to be
implemented by built-in ops. Each built-in op corresponds to a callable
singleton (in the ``ngraph::op`` namespace) that constructs the
appropriate ``Call``. As a singleton, the op can conveniently be used as
a constant in patterns. Call objects will be able to find their related
op.
......@@ -201,6 +201,10 @@ op::AvgPool::AvgPool(const std::shared_ptr<Node>& arg, const Shape& window_shape
bool op::AvgPool::is_functionally_identical(const Node& other) const
{
// TODO: temporary workaround for MKLDNN issue
// remove 'return false' and uncomment below when fixed
return false;
/*
bool rc = true;
if (Node::is_functionally_identical(other))
{
......@@ -215,6 +219,7 @@ bool op::AvgPool::is_functionally_identical(const Node& other) const
rc = false;
}
return rc;
*/
}
op::AvgPoolBprop::AvgPoolBprop(const std::shared_ptr<Node>& arg,
......
......@@ -371,6 +371,10 @@ std::shared_ptr<Node>
bool op::Convolution::is_functionally_identical(const Node& other) const
{
// TODO: temporary workaround for MKLDNN issue
// remove 'return false' and uncomment below when fixed
return false;
/*
bool rc = true;
if (Node::test_identical(other))
{
......@@ -386,6 +390,7 @@ bool op::Convolution::is_functionally_identical(const Node& other) const
rc = false;
}
return rc;
*/
}
void op::Convolution::generate_adjoints(autodiff::Adjoints& adjoints,
......
......@@ -159,6 +159,10 @@ op::MaxPool::MaxPool(const std::shared_ptr<Node>& arg, const Shape& window_shape
bool op::MaxPool::is_functionally_identical(const Node& other) const
{
// TODO: temporary workaround for MKLDNN issue
// remove 'return false' and uncomment below when fixed
return false;
/*
bool rc = true;
if (Node::test_identical(other))
{
......@@ -171,6 +175,7 @@ bool op::MaxPool::is_functionally_identical(const Node& other) const
rc = false;
}
return rc;
*/
}
void op::MaxPool::generate_adjoints(autodiff::Adjoints& adjoints,
......