Commit 9940123b authored by L.S. Cook, committed by Robert Kimball

Doc 0.11 (#2219)

* editing docs

* more doc updates

* Cleanup theme, update backends for PlaidML, remove stale font

* Add PlaidML description and doc update that should have been added with PR 1888

* Latest release doc updates

* Add PlaidML description and doc update for PR 1888
* Update glossary with tensor description and quantization def
* Refactor landpage with QuickStart guides
* Add better details about nGraph features and roadmap

* Placeholder detail for comparison section

* Add section link

* order sections alphabetically for now

* update compiler illustration

* Address feedback from doc review

* Update illustration wording

* Formatting and final edits

* keep tables consistent

* Clarify doc on bridge and compiler docs

* yay for more feedback and improvements

* edit with built doc

* Fix typo

* Another phase of PR review editing

* Final review comment resolved

* note grammatically-correct wording preferred as often as possible.

* First iteration of shared subgraphs with onnx doc

* Update onnx wheel install instructions with latest ngraph-0.9.0 versioning

* Update section on subgraphs and shared subgraph docs

* Finalize edit of mxnet tutorial given status of our PR

* Make sure latest conf py is being used

* Update to latest index

* Add link to design doc mentioned by Ashoke and update about for consistent headings

* Update with PR feedback

* Update with new pip install instructions

* add more testing

* Further feedback review included

* Improve descriptions, given the new pip pkg install options

* Add note to onnx_ssg_tutorial

* Links updated to latest correct url

* Improve docs for Beta

* Reorganize TOC

* Better org in sections

* Make heading style consistent across indexes

* Update intro to framework builders

* Add feedback from reviewers

* Minor fixes to ToC and editing

* Add section on FMV for miscellaneous use cases

* Update notice on README

* Update link to howto index

* fix typo

* fix note

* Update glossary
parent db7ecdcc
FAQs
----
### Why nGraph?
We developed nGraph to simplify the realization of optimized deep learning
performance across frameworks and hardware platforms. The value we're offering
to the developer community is empowerment: we are confident that Intel®
Architecture already provides the best computational resources available
for the breadth of ML/DL tasks.
### How do I connect a framework?
The nGraph Library manages framework bridges for some of the more widely-known
frameworks. A bridge acts as an intermediary between the nGraph core and the
framework, and the result is a function that can be compiled from a framework.
A fully-compiled function that makes use of bridge code thus becomes a
"function graph", or what we sometimes call an **nGraph graph**.
Low-level nGraph APIs are not accessible *dynamically* via bridge code; this
is the nature of stateless graphs. However, do note that a graph with a
"saved" checkpoint can be "continued" to run from a previously-applied checkpoint,
or it can be loaded as a static graph for further inspection.
For a more detailed dive into how custom bridge code can be implemented, see our
documentation on [Working with other frameworks]. To learn how TensorFlow and MXNet
currently make use of custom bridge code, see [Integrate supported frameworks].
<img src="doc/sphinx/source/graphics/bridge-to-graph-compiler.png" alt="JiT Compiling for computation" width="733" />
Although we only directly support a few frameworks at this time, we provide
documentation to help developers and engineers create custom solutions.
### How do I run an inference model?
Framework bridge code is *not* the only way to connect a model (function graph) to
nGraph ops. We've also built an importer for models that have been
exported from a framework and saved as a serialized file, such as ONNX. To learn
how to convert such serialized files to an nGraph model, please see the "How to"
documentation.
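For a sense of what that flow looks like in practice, here is a minimal Python sketch modeled on the ngraph-onnx README for this release line. The import path, `import_onnx_model`, the `'output'`/`'inputs'` dict keys, and the `resnet50.onnx` filename are assumptions that may vary between versions:

```python
import numpy as np
import onnx
import ngraph as ng
from ngraph_onnx.onnx_importer.importer import import_onnx_model

# Import a model that was exported from a framework and serialized as ONNX
model = import_onnx_model(onnx.load('resnet50.onnx'))[0]

# Compile the imported function graph for the CPU backend and run inference
runtime = ng.runtime(backend_name='CPU')
infer = runtime.computation(model['output'], *model['inputs'])

batch = np.zeros((1, 3, 224, 224), dtype=np.float32)  # assumed input shape
print(infer(batch))
```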
### What's next?
The Gold release is targeted for April 2019; it will feature broader workload
coverage, including support for quantized graphs, and more detail on our
advanced support for ``int8``. We developed nGraph to simplify the realization
of optimized deep learning performance across frameworks and hardware platforms.
You can read more about design decisions and what is tentatively in the pipeline
for development in our [arXiv paper](https://arxiv.org/pdf/1801.08058.pdf) from
the 2018 SysML conference.
[Working with other frameworks]: http://ngraph.nervanasys.com/docs/latest/frameworks/index.html
[Integrate supported frameworks]: http://ngraph.nervanasys.com/docs/latest/framework-integration-guides.html
Tested Platforms:
- Ubuntu 16.04 and 18.04
- CentOS 7.4
Our latest instructions for how to build the library are available
[in the documentation](https://ngraph.nervanasys.com/docs/latest/buildlb.html).
Use `cmake -LH` after cloning the repo to see the currently-supported
build options. We recommend using, at the least, something like:
$ cmake ../ -DCMAKE_INSTALL_PREFIX=~/ngraph_dist -DNGRAPH_USE_PREBUILT_LLVM=TRUE \
  -DNGRAPH_ONNX_IMPORT_ENABLE=ON
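Assuming the configure step succeeds, the typical follow-up with the default `make` generator looks like this (install prefix as chosen above):

```
$ make -j $(nproc)
$ make install
```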
@@ -20,11 +20,17 @@ workloads on CPU for inference, please refer to the links below.
| MXNet* 1.3 | [Pip install](https://github.com/NervanaSystems/ngraph-mxnet#Installation) or [Build from source](https://github.com/NervanaSystems/ngraph-mxnet#building-with-ngraph-support) | 18 [Validated workloads] |
| ONNX 1.3 | [Pip install](https://github.com/NervanaSystems/ngraph-onnx#installation) | 14 [Validated workloads] |
:exclamation: :exclamation: :exclamation: Note that the ``pip`` package option
works only with Ubuntu 16.04 or greater and Intel® Xeon® CPUs. CPUs without
Intel® Advanced Vector Extensions 512 (Intel® AVX-512) will not run these
packages; the alternative is to build from source. Wider support for other
CPUs will be offered starting in early 2019 :exclamation: :exclamation: :exclamation:
Frameworks using nGraph Compiler stack to execute workloads have shown
[**up to 45X**](https://ai.intel.com/ngraph-compiler-stack-beta-release/)
performance boost when compared to native framework implementations. We've also
seen performance boosts running workloads that are not included on the list of
[Validated workloads], thanks to our powerful subgraph pattern matching.
Additional work is also being done via [PlaidML] which will feature running
compute for Deep Learning with GPU acceleration. See our
@@ -84,7 +90,7 @@ to improve it:
[Documentation]: https://ngraph.nervanasys.com/docs/latest
[build the Library]: https://ngraph.nervanasys.com/docs/latest/buildlb.html
[Getting Started Guides]: Getting-started-guides
[Validated workloads]: https://ngraph.nervanasys.com/docs/latest/frameworks/validation.html
[Functional]: https://github.com/NervanaSystems/ngraph-onnx/
[How to contribute]: How-to-contribute
[framework integration guides]: http://ngraph.nervanasys.com/docs/latest/framework-integration-guides.html

@@ -104,5 +110,6 @@ to improve it:

[nGraph-ONNX]: https://github.com/NervanaSystems/ngraph-onnx/blob/master/README.md
[nGraph-ONNX adaptable]: https://ai.intel.com/adaptable-deep-learning-solutions-with-ngraph-compiler-and-onnx/
[nGraph for PyTorch developers]: https://ai.intel.com/investing-in-the-pytorch-developer-community
[Validated workloads]: https://ngraph.nervanasys.com/docs/latest/frameworks/genre-validation.html
# ******************************************************************************
# Copyright 2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ******************************************************************************
import mxnet as mx
# Convert gluon model to a static model
from mxnet.gluon.model_zoo import vision
import time
batch_shape = (1, 3, 224, 224)
input_data = mx.nd.zeros(batch_shape)
resnet_gluon = vision.resnet50_v2(pretrained=True)
resnet_gluon.hybridize()
resnet_gluon.forward(input_data)
resnet_gluon.export('resnet50_v2')
resnet_sym, arg_params, aux_params = mx.model.load_checkpoint('resnet50_v2', 0)
# Load the model into nGraph as a static graph
model = resnet_sym.simple_bind(ctx=mx.cpu(), data=batch_shape, grad_req='null')
model.copy_params_from(arg_params, aux_params)
# To test the model's performance, we've provided this helpful, customizable
# code snippet.
dry_run = 5
num_batches = 100
for i in range(dry_run + num_batches):
    if i == dry_run:
        start_time = time.time()
    outputs = model.forward(data=input_data, is_train=False)
    for output in outputs:
        output.wait_to_read()
print("Average Latency = ", (time.time() - start_time) / num_batches * 1000, "ms")
{%- if builder != 'singlehtml' %}
<div role="search">
  <form id="rtd-search-form" class="wy-form" action="{{ pathto('search') }}" method="get">
    <input type="text" name="q" placeholder="Search nGraph Documentation" />
    <input type="hidden" name="check_keywords" value="yes" />
    <input type="hidden" name="area" value="default" />
  </form>
......
@@ -347,11 +347,6 @@ big, small {
  clear: both;
}
/*! */
/* NeoSansIntel FONT */
@@ -579,7 +574,7 @@ a .fa, a .rst-content .admonition-title, .rst-content a .admonition-title, a .rs
  color: #fff;
  background: #638470;
  margin: -5px;
  font-family: "NeoSansIntel", sans;
  font-weight: bold;
  padding: 0.33em 0.74em;
  margin-bottom: 0.33em;
@@ -1228,7 +1223,7 @@ textarea {
  overflow: auto;
  vertical-align: top;
  width: 100%;
  font-family: "NeoSansIntel", "NanumGothicCoding", Arial, sans;
}
select, textarea {
@@ -1632,7 +1627,7 @@ html {
}
body {
  font-family: "NeoSansIntel", "NanumGothicCoding", Arial, sans;
  font-weight: normal;
  color: #404040;
  min-height: 100%;
@@ -1712,7 +1707,7 @@ a.wy-text-neutral:hover {
h1, h2, .rst-content .toctree-wrapper p.caption, h3, h4, h5, h6, legend {
  margin-top: 0;
  font-weight: 700;
  font-family: "NeoSansIntel", Arial, sans;
}
p {
@@ -1856,7 +1851,7 @@ code, p.caption {
caption-text {
  font-family: 'NeoSansIntel', Lato, monospace;
}
@@ -2223,7 +2218,7 @@ div[class^='highlight'] pre {
  color: #fcfcfc;
  background: #1f1d1d;
  border-top: solid 10px #5f5f5f;
  font-family: "NeoSansIntel", "Helvetica Neue", Arial, sans;
  z-index: 400;
}
.rst-versions a {
@@ -2539,7 +2534,7 @@ div[class^='highlight'] pre {
  font-family: monospace;
  line-height: normal;
  background: white;
  color: #0071c5;
  border-top: solid 0.31em #cad8a5;
  padding: 6px;
  position: relative;
@@ -2704,13 +2699,13 @@ span[id*='MathJax-Span'] {
  font-family: "Roboto Slab";
  font-style: normal;
  font-weight: 400;
  src: local("Roboto Slab Regular"), local("NeoSansIntel-Regular"), url(../fonts/NeoSansIntel-Regular.ttf) format("truetype");
}
@font-face {
  font-family: "Roboto Slab";
  font-style: normal;
  font-weight: 700;
  src: local("Roboto Slab Bold"), local("NeoSansIntel-Bold"), url(../fonts/NeoSansIntel-Bold.ttf) format("truetype");
}
.wy-affix {
  position: fixed;
@@ -2967,7 +2962,7 @@ span[id*='MathJax-Span'] {
}
.wy-nav .wy-menu-vertical header {
  color: #0071c5;
}
.wy-nav .wy-menu-vertical a {
  color: #dadada;
@@ -3038,13 +3033,14 @@ span[id*='MathJax-Span'] {
.wy-nav-top {
  display: none;
  background: #0071c5;
  color: #fff;
  padding: 0.4045em 0.809em;
  position: relative;
  line-height: 50px;
  text-align: center;
  font-size: 100%;
  font-family: 'NeoSansIntel';
  *zoom: 1;
}
.wy-nav-top:before, .wy-nav-top:after {
@@ -3062,7 +3058,7 @@ span[id*='MathJax-Span'] {
  margin-right: 12px;
  height: 45px;
  width: 45px;
  background-color: #0071c5;
  padding: 5px;
  border-radius: 100%;
}
......
.. buildlb.rst:

######################
Build the C++ Library
######################

* :ref:`ubuntu`
* :ref:`centos`
......
@@ -73,11 +73,11 @@ author = 'Intel Corporation'
# built documents.
#
# The short X.Y version.
version = '0.11'

# The Documentation full version, including alpha/beta/rc tags. Some features
# available in the latest code will not necessarily be documented first
release = '0.11.1'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
@@ -170,7 +170,7 @@ latex_elements = {
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, 'IntelnGraphlibrary.tex', 'Intel nGraph Library',
     'Intel Corporation', 'manual'),
]
@@ -180,7 +180,7 @@ latex_documents = [
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    (master_doc, 'intelngraphlibrary', 'Intel nGraph Library',
     [author], 1)
]
@@ -191,8 +191,8 @@ man_pages = [
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'IntelnGraphlibrary', 'Intel nGraph Library',
     author, 'IntelnGraphlibrary', 'Documentation for Intel nGraph Library code base',
     'Miscellaneous'),
]
......
.. distr/index.rst:

##############################
Distributed Training in nGraph
##############################

Why distributed training?
=========================

A tremendous amount of data is required to train DNNs in diverse areas -- from
computer vision to natural language processing. Meanwhile, computation used in
@@ -22,7 +23,7 @@ nGraph backend computes the gradients in back-propagation, aggregates the gradients
across all workers, and then updates the weights.

How? (Generic frameworks)
=========================
* :doc:`../howto/distribute-train` * :doc:`../howto/distribute-train`
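To make the mechanism concrete, below is a minimal C++ sketch of the core idea, assuming a build with distributed support enabled; the exact header layout and constructor signature of the ``AllReduce`` op may differ between nGraph releases:

.. code-block:: cpp

   #include <memory>
   #include <ngraph/ngraph.hpp>

   using namespace ngraph;

   // Wrap the node that computes a worker-local gradient in an AllReduce
   // op, so its value is summed across all workers before the weight update.
   std::shared_ptr<Node> distribute_gradient(const std::shared_ptr<Node>& grad)
   {
       return std::make_shared<op::AllReduce>(grad);
   }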
@@ -54,12 +55,8 @@ mini-batch training, one could train ResNet-50 with Imagenet-1k data to the
`arxiv.org/abs/1709.05011`_.

MXNet
=====

We implemented a KVStore in MXNet\* (KVStore is unique to MXNet) to modify
the SGD update op so the nGraph graph will contain the allreduce op and generate
@@ -71,14 +68,11 @@ I1K training in MXNet 1, 2, 4, (and 8 if available) nodes, x-axis is the number
of nodes while y-axis is the throughput (images/sec).

.. TODO add figure graphics/distributed-training-ngraph-backends.png

TensorFlow
==========

We plan to support the same in nGraph-TensorFlow. It is still a work in progress.
Meanwhile, users can still use Horovod and the current nGraph TensorFlow,
@@ -87,8 +81,9 @@ Figure: a bar chart shows preliminary results Resnet-50 I1K training in TF 1,
2, 4, (and 8 if available) nodes, x-axis is the number of nodes while y-axis
is the throughput (images/sec).

Future work
===========

Model parallelism with more communication ops support is in the works. For
more general parallelism, such as model parallel, we plan to add more
......
@@ -28,9 +28,13 @@ as an optimizing compiler available through the framework.

MXNet\* bridge
===============

* See the README on `nGraph-MXNet`_ Integration.
* **Testing latency for Inference**: See the :doc:`frameworks/testing-latency`
  doc for a fully-documented example of how to compile and test latency with an
  MXNet-supported model.
* **Training**: For experimental or alternative approaches to distributed training
  methodologies, including data parallel training, see the MXNet-relevant sections
  of the docs on :doc:`distr/index` and :doc:`How to <howto/index>` topics like
  :doc:`howto/distribute-train`.
......
.. frameworks/generic-configs.rst:

Configurations available to any framework
#########################################

Enabling Deep Learning paradigms
================================

Framework architects or engineers who can't quite find what they need among
the existing DL tools may need to build something new off a "stock" framework,
or something entirely from scratch. For this category of developer, we have
:doc:`documented several ways <../howto/index>` you can incorporate built-in
compiler support for users of your framework; this includes out-of-box support
for things like Intel® MKL-DNN and PlaidML when your framework supports nGraph
as a "backend" or engine.

.. important:: nGraph does not provide an interface for "users" of frameworks
   (for example, we cannot dictate or control how TensorFlow\* or MXNet\* presents
   interfaces to users). Please keep in mind that designing and documenting
   the :abbr:`User Interface (UI)` is entirely in the realm of the framework
   owner or developer and beyond the scope of the nGraph Compiler stack.
   However, any framework can be designed to make direct use of nGraph
   Compiler stack-based features and then expose an accompanying UI, output
   message, or other detail to a user.

The nGraph :abbr:`IR (Intermediate Representation)` is a format that can
understand inputs from a framework. Today, there are two primary tasks that
can be accomplished in the "bridge code" space of the nGraph IR:

#. Compiling a dataflow graph
#. Executing a pre-compiled graph.

See the :doc:`../framework-integration-guides` for how we built bridges with our
initially-supported frameworks. For more in-depth help in writing things like
graph optimizations and bridge code, we provide articles on how to
:doc:`../fusion/index`, and programmatically :doc:`../howto/execute` that can
target various compute resources using nGraph when a framework provides some
inputs to be computed.

.. note:: Configuration options can be added manually on the command line or via
   scripting. Please keep in mind that fine-tuning of parameters is as much of
   an art as it is a science; there are virtually limitless ways to do so and
   our documentation provides only a sampling.

Integrating nGraph with new frameworks
======================================
@@ -51,6 +76,21 @@ something like:

   export LD_LIBRARY_PATH=path/to/ngraph_dist/lib/

FMV
===

FMV stands for Function Multi-Versioning, and it can also provide a number of
generic ways to patch or bring architecture-based optimizations to the
:abbr:`Operating System (OS)` that is handling your ML environment. See the
`GCC wiki for details`_.

If your nGraph build backs a neural network configured on Clear Linux\* OS
for Intel® Architecture, and the system includes at least one older CPU, the
`following article may be helpful`_.
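As a minimal sketch of what FMV looks like in practice (hypothetical function
and target list; requires GCC 6+ with ``target_clones`` support and an
ifunc-capable toolchain), the compiler emits one clone per listed ISA and
dispatches to the best match at runtime:

.. code-block:: cpp

   #include <cstdio>

   // GCC emits an AVX-512, an AVX2, and a default clone of this function,
   // then selects the fastest supported version for the running CPU.
   __attribute__((target_clones("avx512f", "avx2", "default")))
   double dot(const double* a, const double* b, int n)
   {
       double sum = 0.0;
       for (int i = 0; i < n; ++i)
           sum += a[i] * b[i];  // loop is auto-vectorized per clone
       return sum;
   }

   int main()
   {
       double a[4] = {1, 2, 3, 4};
       double b[4] = {4, 3, 2, 1};
       std::printf("%f\n", dot(a, b, 4));
       return 0;
   }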
Training Deep Neural Networks
==============================

@@ -68,14 +108,14 @@ For CPU (and most cuDNN) backends, the preferred layout is currently ``NCHW``.

* **H** -- Height of the image
* **W** -- Width of the image

Intel® Math Kernel Library for Deep Neural Networks
---------------------------------------------------

The following `KMP options`_ were originally optimized for models using the
Intel® `MKL-DNN`_ to train models with the ``NCHW`` data layout; however, other
configurations can be explored. MKL-DNN is automatically enabled as part of an
nGraph compilation; you do *not* need to add MKL-DNN separately or as an
additional component to be able to use these configuration settings.

* ``KMP_BLOCKTIME`` Sets the time, in milliseconds, that a thread should wait
  after completing the execution of a parallel region, before sleeping; a
  sample configuration using this and related variables appears below.
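As a concrete starting point, the values below are widely used in Intel's
published MKL-DNN guidance for ``NCHW`` CPU workloads; the thread count is a
placeholder you should tune to your own machine:

.. code-block:: console

   export KMP_BLOCKTIME=1
   export KMP_AFFINITY=granularity=fine,compact,1,0
   export OMP_NUM_THREADS=16   # placeholder: set to your number of physical cores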
@@ -158,7 +198,7 @@ Intel® Xeon® processor-based platforms, which is a key requirement for
many kinds of inference-engine computations. See the next section on NUMA
performance to learn more about this performance feature available to systems
utilizing nGraph.

NUMA performance
~~~~~~~~~~~~~~~~~

@@ -174,10 +214,16 @@ increasing bandwidth demands on the Intel® Ultra-Path Interconnect (Intel® UPI)
This situation is exacerbated with a larger number of sockets found in 4, 8, and
16-socket systems. We believe that users need to be aware of system-level
optimizations in addition to framework-specific configuration parameters to
achieve the best performance for NN workloads on CPU platforms. The nGraph
Compiler stack runs on transformers handled by Intel® Architecture (IA), and
thus can make more efficient use of the underlying hardware.

.. _KMP options: https://software.intel.com/en-us/node/522691
.. _MKL-DNN: https://github.com/intel/mkl-dnn
.. _gnu.org site: https://gcc.gnu.org/onlinedocs/libgomp/Environment-Variables.html
.. _Movidius: https://www.movidius.com/
.. _GCC wiki for details: https://gcc.gnu.org/wiki/FunctionMultiVersioning
.. _following article may be helpful: https://clearlinux.org/documentation/clear-linux/tutorials/fmv
.. frameworks/index.rst:

#####################
Connecting Frameworks
#####################

While a :abbr:`Deep Learning (DL)` :term:`framework` is ultimately meant for
end use by data scientists, or for deployment in cloud container environments,
nGraph Core ops and the nGraph C++ Library are designed for framework builders
themselves. We invite anyone working on new and novel frameworks or neural
network designs to explore our highly-modular stack of components that can
be implemented or integrated in virtually limitless ways.

Please read the articles in this section if you are considering incorporating
components from the nGraph Compiler stack in your framework or neural network
design. Articles here are also useful if you are working on something
built-from-scratch, or on an existing framework that is less widely-supported
than popular frameworks like TensorFlow and PyTorch.

.. toctree::
   :maxdepth: 1

   generic-configs.rst
   testing-latency.rst
   validation.rst

Understanding users of frameworks
=================================

A data scientist or ML engineer may not initially know which framework is the
"best" framework to use to start working on his or her problem set. While there
are several to choose from, it can be daunting and time-consuming to scope the
wide array of features and customization options offered by some of the more
popular frameworks:

#. First, **find** a tested and working DL model that does something *similar*
   to what the data scientist or ML engineer wants to do. To assist with this
   stage, we've already provided organized tables of :doc:`validation` examples.

#. Next, **replicate** that result using well-known datasets to confirm that the
   model does indeed work. To assist with this stage, we've released several
   :doc:`pip installation options <../framework-integration-guides>` that can
   be used to test basic examples.

#. Finally, **modify** some aspect: add new datasets, or adjust an algorithm's
   parameters to hone in on specifics that can better train, forecast, or predict
   scenarios modeling the real-world problem. This is also the stage where it
   makes sense to `tune the workload to extract best performance`_.

.. important:: nGraph does not provide an interface for "users" of frameworks
   (for example, we cannot dictate or control how TensorFlow\* or MXNet\* presents
   interfaces to users). Please keep in mind that designing and documenting
   the :abbr:`User Interface (UI)` is entirely in the realm of the framework owner
   or developer and beyond the scope of the nGraph Compiler stack. However, any
   framework can be designed to make direct use of nGraph Compiler stack-based
   features and then expose an accompanying UI, output message, or other detail
   to a user.

Clearly, one challenge of the framework developer is to differentiate from
the pack by providing a means for the data scientist to obtain reproducible
results. The other challenge is to provide sufficient documentation, or to
provide sufficient hints for how to do any "fine-tuning" for specific use cases.
With the nGraph Compiler stack powering your framework, it becomes much easier
to help your users get reproducible results with nothing more complex than the
CPU that powers their operating system.
In general, the larger and more complex a framework is, the harder it becomes
to navigate and extract the best performance; configuration options that are
@@ -42,55 +71,8 @@ adjustments can increase performance. Likewise, a minimalistic framework that
is designed around one specific kind of model can sometimes offer significant
performance-improvement opportunities by lowering overhead.

See :doc:`generic-configs` to get started.

.. _tune the workload to extract best performance: https://ai.intel.com/accelerating-deep-learning-training-inference-system-level-optimizations
.. _a few small: https://software.intel.com/en-us/articles/boosting-deep-learning-training-inference-performance-on-xeon-and-xeon-phi
.. frameworks/testing-latency.rst:
Testing latency
###############
Many open-source DL frameworks provide a layer where experts in data science
can make use of optimizations contributed by machine learning engineers. Having
a common API benefits both: it simplifies deployment and makes it easier for ML
engineers working on advanced deep learning hardware to bring highly-optimized
performance to a wide range of models, especially in inference.
One DL framework with advancing efforts on graph optimizations is Apache
MXNet\*, where `Intel has contributed efforts showing`_ how to work with our
nGraph Compiler stack as an `experimental backend`_. Our approach provides
**more opportunities** to start working with different kinds of graph
optimizations **than would be available to the MXNet framework alone**, for
reasons outlined in our `features`_ documentation.
.. TODO : Link to latest on mxnet when/if they do this instead of linking to PR;
keep in mind this tutorial will still work regardless of the merge status of
the experimental backend if you already use the ngraph-mxnet Github repo
.. figure:: ../graphics/ngraph-mxnet-models.png
   :width: 533px
   :alt: Up to 45X faster

   Up to 45X faster compilation with nGraph backend
Tutorial: Testing inference latency of ResNet-50-V2 with MXNet
==============================================================
This tutorial supports compiling MXNet with nGraph's CPU backend.

Begin by cloning MXNet from GitHub:

.. code-block:: console

   git clone --recursive https://github.com/apache/incubator-mxnet

To compile, run:

.. code-block:: console

   cd incubator-mxnet
   make -j USE_NGRAPH=1

MXNet's build system will automatically download, configure, and build the
nGraph library, then link it into ``libmxnet.so``. Once this is complete, we
recommend building a python3 virtual environment for testing, and then
installing MXNet into the virtual environment:

.. code-block:: console

   python3 -m venv .venv
   . .venv/bin/activate
   cd python
   pip install -e .
   cd ../
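Optionally, confirm that the virtual environment picked up the freshly-built
package before proceeding (a quick sanity check):

.. code-block:: console

   python -c "import mxnet as mx; print(mx.__version__)"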
Now we're ready to use nGraph to run any model on a CPU backend. Building MXNet
with nGraph automatically enables nGraph for your model scripts, and you
shouldn't need to do anything special. If you run into trouble, you can disable
nGraph by setting:

.. code-block:: console

   MXNET_SUBGRAPH_BACKEND=
If you do see trouble, please report it and we'll address it as soon as possible.
Running ResNet-50-V2 Inference
------------------------------
To show a working example, we'll demonstrate how MXNet may be used to run
ResNet-50 Inference. For ease, we'll consider the standard MXNet ResNet-50-V2
model from the `gluon model zoo`_, and we'll test with ``batch_size=1``.
Note that the nGraph-MXNet bridge supports static graphs only (dynamic graphs
are in the works); so for this example, we begin by converting the gluon model
into a static graph. Also note that any model with a saved checkpoint can be
considered a "static graph" in nGraph. For this example, we'll presume that the
model is pre-trained.
.. literalinclude:: ../../../examples/subgraph_snippets/mxnet-gluon-example.py
   :language: python
   :lines: 17-32
To load the model into nGraph, we simply bind the symbol into an Executor.
.. literalinclude:: ../../../examples/subgraph_snippets/mxnet-gluon-example.py
   :language: python
   :lines: 34-35
At binding, the MXNet Subgraph API finds nGraph, determines how to partition
the graph, and in the case of ResNet, sends the entire graph to nGraph for
compilation. This produces a single call to an NNVM ``NGraphSubgraphOp`` embedded
with the compiled model. At this point, we can test the model's performance.
.. literalinclude:: ../../../examples/subgraph_snippets/mxnet-gluon-example.py
   :language: python
   :lines: 40-48
.. _experimental backend: https://github.com/apache/incubator-mxnet/pull/12502
.. _Intel has contributed efforts showing: https://cwiki.apache.org/confluence/display/MXNET/MXNet+nGraph+integration+using+subgraph+backend+interface
.. _features: http://ngraph.nervanasys.com/docs/latest/project/about.html#features
.. _gluon model zoo: https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/model_zoo/vision/resnet.py#L499
.. _subgraph acceleration API: https://cwiki.apache.org/confluence/display/MXNET/Unified+integration+with+external+backend+libraries
.. _nGraph-MXNet: https://github.com/NervanaSystems/ngraph-mxnet/blob/master/README.md
.. frameworks/validation.rst:

##############################
Validated Models and Workloads
##############################

We validated performance for the following TensorFlow\* and MXNet\* workloads:

TensorFlow
==========

.. csv-table::
   :header: "TensorFlow Workload", "Genre of Deep Learning"
   :widths: 27, 53
   :escape: ~

@@ -40,7 +40,7 @@ MXNet
=====

.. csv-table::
   :header: "MXNet Workload", "Genre of Deep Learning"
   :widths: 27, 53
   :escape: ~

@@ -66,11 +66,12 @@ MXNet

ONNX
=====

Additionally, we validated the following workloads are functional through
`nGraph ONNX importer`_:

.. csv-table::
   :header: "ONNX Workload", "Genre of Deep Learning"
   :widths: 27, 53
   :escape: ~

@@ -90,31 +91,30 @@ Additionally, we validated the following workloads are functional through nGraph
   BVLC R-CNN ILSVRC13, Object detection

.. important:: Please see Intel's `Optimization Notice`_ for details on disclaimers.

.. rubric:: Footnotes

.. [#1] Benchmarking performance of DL systems is a young discipline; it is a
   good idea to be vigilant for results based on atypical distortions in the
   configuration parameters. Every topology is different, and performance
   changes can be attributed to multiple causes. Also watch out for the word
   "theoretical" in comparisons; actual performance should not be compared to
   theoretical performance.

.. _Optimization Notice: https://software.intel.com/en-us/articles/optimization-notice
.. _nGraph ONNX importer: https://github.com/NervanaSystems/ngraph-onnx/blob/master/README.md

.. Notice revision #20110804: Intel's compilers may or may not optimize to the same degree for
   non-Intel microprocessors for optimizations that are not unique to Intel microprocessors.
   These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations.
   Intel does not guarantee the availability, functionality, or effectiveness of any optimization
   on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this
   product are intended for use with Intel microprocessors. Certain optimizations not specific
   to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the
   applicable product User and Reference Guides for more information regarding the specific
   instruction sets covered by this notice.
@@ -25,8 +25,10 @@ Glossary

   framework

      Frameworks provide expressive user-facing APIs for constructing,
      training, validating, and deploying DL/ML models: TensorFlow\*,
      PaddlePaddle\*, MXNet\*, PyTorch\*, and Caffe\* are all examples of
      well-known frameworks.

   function graph
......
@@ -74,15 +74,15 @@ skip ahead to the next section, :ref:`install_ngonnx`.

   $ cd onnx/
   $ . bin/activate

#. Check for the binary wheel file under ``ngraph/python/dist`` and install it
   with pip.

   .. code-block:: console

      (onnx)$ pip install -U python/dist/ngraph-0.9.0-cp36-cp36m-linux_x86_64.whl

#. Confirm ``ngraph`` is properly installed through a Python interpreter:

   .. code-block:: console

@@ -116,6 +116,7 @@ Install the ``ngraph-onnx`` companion tool using pip:

   (onnx) $ pip install git+https://github.com/NervanaSystems/ngraph-onnx/

Importing a serialized model
=============================
......
@@ -23,18 +23,18 @@ Welcome

See the latest :doc:`project/release-notes`.

nGraph is an open-source C++ library, compiler stack, and runtime accelerator
for software and neural network engineering within the :abbr:`Deep Learning (DL)`
ecosystem. nGraph simplifies development and makes it possible to design, write,
compile, and deploy :abbr:`Deep Neural Network (DNN)`-based solutions that can
be adapted and deployed across many frameworks and backends. See our project
:doc:`project/about` and `ecosystem`_ for more details.

.. figure:: graphics/ngcompiler-ecosystem.png
   :width: 650px
   :alt: ecosystem

   The Intel nGraph Compiler stack supports a broad ecosystem of frameworks and backends.

.. _quickstart:

@@ -44,10 +44,17 @@ Quick Start

We have many documentation pages to help you get started.

* **TensorFlow or MXNet users** can get started with
  :doc:`framework-integration-guides`.

  * `TensorFlow bridge to nGraph`_
  * `Compiling MXNet with nGraph`_

  .. note:: The ``pip`` package option works only with Ubuntu 16.04
     or greater and Intel® Xeon® CPUs. CPUs without Intel® Advanced Vector Extensions
     512 (Intel® AVX-512) will not run these packages; the alternative is to
     build from source. Wider support for other CPUs will be offered starting
     in early 2019.

* **Data scientists** interested in the `ONNX`_ format will find the
  `nGraph ONNX companion tool`_ of interest.

@@ -55,7 +62,7 @@ We have many documentation pages to help you get started.

* **Framework authors and architects** will likely want to :doc:`buildlb`
  and learn how nGraph can be used to :doc:`howto/execute`. For examples
  of generic configurations or optimizations available when designing or
  bridging a framework directly with nGraph, see :doc:`frameworks/index`.

* To start learning about nGraph's set of **Core ops** and how they can
  be used with Ops from other frameworks, go to :doc:`ops/index`.

@@ -102,20 +109,32 @@ Contents

.. toctree::
   :maxdepth: 1
   :caption: Python Ops for ONNX

   python_api/index.rst

.. toctree::
   :maxdepth: 1
   :caption: Core Documentation

   buildlb.rst
   framework-integration-guides.rst
   frameworks/validation.rst
   frameworks/index.rst
   graph-basics.rst
   howto/index.rst
   ops/about.rst
   ops/index.rst
   fusion/index.rst
   programmable/index.rst
   distr/index.rst

.. toctree::
   :maxdepth: 2
   :caption: Project Metadata

   project/index.rst
   glossary.rst

Indices and tables
......
.. ops/about.rst:
##############
About Core Ops
##############
An ``Op``'s primary role is to function as a node in a directed acyclic
computation graph, where edges express data dependencies.
*Core ops* are ops that are available and generally useful to all framework
bridges and that can be compiled by all transformers. A framework bridge may
define framework-specific ops to simplify graph construction, provided that the
bridge can enable every transformer to replace all such ops with equivalent
clusters or subgraphs composed of core ops. Similarly, transformers may define
transformer-specific ops to represent kernels or other intermediate operations.
If a framework supports extending the set of ops it offers, a bridge may even
expose transformer-specific ops to the framework user.
.. figure:: ../graphics/tablengraphops.png
   :width: 535px
   :alt: Operations Available in the nGraph IR

   Operations Available in the nGraph IR
.. important:: Our design philosophy is that the graph is not a script for
   running kernels; rather, our compilation will match ``ops`` to appropriate
   kernels for the backend(s) in use. Thus, we expect that the addition of new
   Core ops will be infrequent and that most functionality will instead be
   added with new functions that build sub-graphs from existing core ops.
It is easiest to define a new op by adapting an existing op. Some of the tasks
that must be performed are:

- Op constructor:

  * Checking type-consistency of arguments
  * Specifying the result type for a call

- Serializer/Deserializer

- Transformer handlers:

  * Interpreter (reference) implementation of behavior. The
    implementation should favor clarity over efficiency.
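For orientation only, here is a schematic sketch of the shape such a class
tends to take. The base-class details (constructor arguments, output
registration via ``set_output_type``) are assumptions that change between
nGraph releases, so adapt from a real op such as ``Add`` rather than from
this outline:

.. code-block:: cpp

   #include <ngraph/ngraph.hpp>

   using namespace ngraph;

   // Schematic element-wise op: the constructor checks type-consistency
   // of its argument and specifies the result type for a call.
   class MyScale : public op::Op
   {
   public:
       MyScale(const std::shared_ptr<Node>& arg, float factor)
           : Op("MyScale", NodeVector{arg})  // assumed registration form
           , m_factor(factor)
       {
           // Result has the same element type and shape as the argument.
           set_output_type(0, arg->get_element_type(), arg->get_shape());
       }

       float get_factor() const { return m_factor; }

   private:
       float m_factor;
   };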
.. ops/index.rst .. ops/index.rst
About Core Ops ####################
============== List of Core ``ops``
####################
Not currently a comprehensive list.
.. about:

Architecture, Features, FAQs

@@ -25,7 +25,7 @@ frameworks and backends currently are functioning.

   :alt:

Bridge
------

Starting from the top of the stack, nGraph receives a computational
graph from a deep learning framework such as TensorFlow\* or MXNet\*.
@@ -38,7 +38,7 @@ Parts of the graph that are not encapsulated will default to framework
implementation when executed.

nGraph Core
-----------

nGraph uses a strongly-typed and platform-neutral
``Intermediate Representation (IR)`` to construct a "stateless"
@@ -56,7 +56,7 @@ ResNet\* for TensorFlow\*, the same optimization can be readily applied
to MXNet\* or ONNX\* implementations of ResNet\*.

Hybrid Transformer
------------------

Hybrid transformer takes the nGraph IR and partitions it into
subgraphs, which can then be assigned to the best-performing backend.
@@ -72,7 +72,7 @@ Once the subgraphs are assigned, the corresponding backend will execute
the IR.

Backends
--------

Focusing on the CPU backend, when the IR is passed to the
Intel® Architecture (IA) transformer, it can be executed in two modes:
@@ -104,7 +104,7 @@ Flow Graph.

.. _features:

Features
########

nGraph performs a combination of device-specific and non-device-specific
optimizations:
@@ -126,45 +126,20 @@ optimizations:
with nGraph translating element order to work best for whatever device is
given or available.
Beta Limitations
================

In this Beta release, nGraph supports just-in-time (JIT) compilation only;
we plan to add support for ahead-of-time (AOT) compilation in the official
release of nGraph. nGraph currently has limited support for dynamic
graphs.
.. _no-lockin:
Develop without lock-in
=======================

Being able to increase training performance or reduce inference latency by
simply adding another device of *any* form factor -- more compute (CPU), GPU or
VPU processing power, custom ASIC or FPGA, or a yet-to-be invented generation of
NNP or accelerator -- is a key benefit for framework developers building with
nGraph. Our commitment to bake flexibility into our ecosystem ensures developers'
freedom to design user-facing APIs for various hardware deployments directly
into their frameworks.
.. figure:: ../graphics/develop-without-lockin.png
.. _faq:

FAQs
####

Why nGraph?
===========

The value we're offering to the developer community is empowerment: we are
confident that Intel® Architecture already provides the best computational
resources available for the breadth of ML/DL tasks.
How does it work?
=================

The :doc:`nGraph Core <../ops/index>` uses a **strongly-typed** and
**platform-neutral** :abbr:`Intermediate Representation (IR)` to construct a
@@ -174,7 +149,7 @@ outputs from zero or more tensor inputs.
How do I connect a framework?
=============================

The nGraph Library manages framework bridges for some of the more widely-known
frameworks. A bridge acts as an intermediary between the nGraph core and the
@@ -203,7 +178,7 @@ MXNet currently make use of custom bridge code, see the section on
How do I run an inference model?
================================

Framework bridge code is *not* the only way to connect a model (function graph)
to nGraph's :doc:`../ops/index`. We've also built an importer for models that
@@ -215,7 +190,7 @@ the :doc:`../howto/import` documentation.

.. _whats_next:

What's next?
############

We developed nGraph to simplify the realization of optimized deep learning
performance across frameworks and hardware platforms. You can read more about
.. code-contributor-README:

######################
Code Contributor Guide
######################

License
=======
@@ -13,7 +13,7 @@ contributed with another license will need the license reviewed by
Intel before it can be accepted.

Code formatting
===============

All C/C++ source code in the repository, including the test code, must
adhere to the source-code formatting and style guidelines described
@@ -260,4 +260,4 @@ it is automatically enforced and reduces merge conflicts.

.. _Apache 2: https://www.apache.org/licenses/LICENSE-2.0
.. _repo wiki: https://github.com/NervanaSystems/ngraph/wiki
Core Contributor Guidelines
===========================
Code formatting
---------------
All C/C++ source code in the repository, including the test code, must
adhere to the source-code formatting and style guidelines described
here.
### Adding ops to nGraph Core
Our design philosophy is that the graph is not a script for running
kernels; rather, the graph is a snapshot of the computation's building
blocks which we call `ops`. Compilation should match `ops` to
appropriate kernels for the backend(s) in use. Thus, we expect that
adding new Core ops should be infrequent and that most functionality
instead gets added with new functions that build sub-graphs from
existing core ops.
The coding style described here applies to both Core `ops` and to any
functions that build sub-graphs from the core.
### Coding style
We have a coding standard to help us get development done. If part of
the standard is impeding progress, we either adjust that part or remove
it. To this end, we employ coding standards that facilitate
understanding of *what nGraph components are doing*. Programs are
easiest to understand when they can be understood locally; if most local
changes have local impact, you do not need to dig through multiple files
to understand what something does.
#### Names
Names should *briefly* describe the thing being named and follow these
casing standards:
- Define C++ class or type names with `CamelCase`.
- Assign template parameters with `UPPER_SNAKE_CASE`.
- Case variable and function names with `snake_case`.
Method names for basic accessors are prefixed by `get_` or `set_` and
should have simple $\mathcal{O}(1)$ implementations:
- A `get_` method should be externally idempotent. It may perform some
simple initialization and cache the result for later use.
- An `is_` prefix may be used instead of `get_` for boolean accessors.
  Trivial `get_` methods can be defined in a header file.
- A `set_` method should change the value returned by the
corresponding `get_` method.
- Use `set_is_` if using `is_` to get a value.
- Trivial `set_` methods may be defined in a header file.
- Names of variables should indicate the use of the variable.
- Member variables should be prefixed with `m_`.
- Static member variables should be rare and be prefixed with
`s_`.
- Do not use `using` to define a type alias at top-level in a header
  file. If the abstraction is useful, give it a class.
- C++ does not enforce the abstraction. For example if `X` and `Y`
are aliases for the same type, you can pass an `X` to something
expecting a `Y`.
- If one of the aliases were later changed, or turned into a real
type, many callers could require changes.
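
The following sketch (all names hypothetical) pulls these conventions
together in one place:

``` {.sourceCode .cpp}
#include <cstddef>

// CamelCase class name; UPPER_SNAKE_CASE template parameter.
template <typename ELEMENT_TYPE>
class TensorBuffer
{
public:
    // Trivial O(1) accessors, defined in the header.
    size_t get_size() const { return m_size; }
    void set_size(size_t new_size) { m_size = new_size; }

    // Boolean accessor uses is_; its setter uses set_is_.
    bool is_pinned() const { return m_is_pinned; }
    void set_is_pinned(bool pinned) { m_is_pinned = pinned; }

    ELEMENT_TYPE* get_data() { return m_data; }

private:
    size_t m_size = 0;             // member variables: m_ prefix
    bool m_is_pinned = false;      // snake_case names that indicate their use
    ELEMENT_TYPE* m_data = nullptr;
    static int s_buffer_count;     // static members (rare): s_ prefix
};
```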
#### Namespaces
- `ngraph` is for the public API, although this is not
currently enforced.
- Use a nested namespace for implementation classes.
- Use an unnamed namespace or `static` for file-local names. This
helps prevent unintended name collisions during linking and when
using shared and dynamically-loaded libraries.
- Never use `using` at top-level in a header file.
- Doing so leaks the alias into users of the header, including
headers that follow.
- It is okay to use `using` with local scope, such as inside a class
  definition.
- Be careful of C++'s implicit namespace inclusions. For example,
if a parameter's type is from another namespace, that namespace
can be visible in the body.
- Only use `using std` and/or `using ngraph` in `.cpp` files.
  `using` a nested namespace can result in unexpected behavior.
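
A short sketch of these namespace rules (the file-local helper below is
hypothetical; `ngraph::runtime` is used only as an example of nesting):

``` {.sourceCode .cpp}
#include <string>

namespace ngraph
{
    // Nested namespace for implementation classes, mirrored by the
    // directory hierarchy.
    namespace runtime
    {
        class Backend;
    }
}

// In a .cpp file only: an unnamed namespace keeps helpers file-local and
// avoids name collisions when linking shared or dynamically-loaded libraries.
namespace
{
    std::string decorate(const std::string& name)
    {
        return "ngraph::" + name;
    }
}
```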
#### File Names
- Do not use the same file name in multiple directories. At least one
IDE/debugger ignores the directory name when setting breakpoints.
- Use `.hpp` for headers and `.cpp` for implementation.
- Reflect the namespace nesting in the directory hierarchy.
- Unit test files are in the `tests` directory.
- Transformer-dependent tests are tests running on the default
transformer or specifying a transformer. For these, use the form
``` {.sourceCode .cpp}
TEST(file_name, test_name)
```
- Transformer-independent tests:
- File name is `file_name.in.cpp`
- Add `#include "test_control.hpp"` to the file's includes
- Add the line
`static std::string s_manifest = "${MANIFEST}";` to the top
of the file.
- Use
``` {.sourceCode .cpp}
NGRAPH_TEST(${BACKEND_NAME}, test_name)
```
for each test. Files are generated for each transformer and
the `${BACKEND_NAME}` is replaced with the transformer name.
Individual unit tests may be disabled by adding the name of
the test to the `unit_test.manifest` file found in the
transformer's source file directory.
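
Combining those steps, a transformer-independent test file (the file and test
names here are invented; `${BACKEND_NAME}` stays as a literal placeholder that
is substituted when the per-transformer files are generated) might look like:

``` {.sourceCode .cpp}
// my_feature.in.cpp -- hypothetical transformer-independent test skeleton.
#include <string>

#include "gtest/gtest.h"
#include "test_control.hpp"

// Required so disabled tests can be listed in unit_test.manifest.
static std::string s_manifest = "${MANIFEST}";

NGRAPH_TEST(${BACKEND_NAME}, my_feature_basic)
{
    // The body runs once per generated transformer variant.
    EXPECT_EQ(2 + 2, 4);
}
```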
#### Formatting
Things that look different should look different because they are
different. We use **clang format** to enforce certain formatting.
Although not always ideal, it is automatically enforced and reduces
merge conflicts.
- The `.clang-format` file located in the root of the project specifies
  our format.
    - The script `maint/apply-code-format.sh` enforces that formatting
      at the C/C++ syntactic level.
    - The script at `maint/check-code-format.sh` verifies that the
      formatting rules are met by all C/C++ code (again, at the
      syntax level). The script has an exit code of `0` when code
      meets the standard and non-zero otherwise. This script does
      *not* modify the source code.
- Formatting with `#include` files:
- Put headers in groups separated by a blank line. Logically order
the groups downward from system-level to 3rd-party to `ngraph`.
- Formatting will keep the files in each group in
alphabetic order.
- Use this syntax for files that **do not change during
development**; they will not be checked for changes
during builds. Normally this will be everything but the ngraph
files:
``` {.sourceCode .cpp}
#include <file>
```
- Use this syntax for files that **are changing during
development**; they will be checked for changes during builds.
Normally this will be ngraph headers:
``` {.sourceCode .cpp}
#include "file"
```
- Use this syntax for system C headers with C++ wrappers:
``` {.sourceCode .cpp}
#include <c...>
```
- To guard against multiple inclusion, avoid using the `#define X_H`
style. Use this syntax instead:
``` {.sourceCode .cpp}
#pragma once
```
- The initialization
``` {.sourceCode .cpp}
Foo x{4, 5};
```
is preferred over
``` {.sourceCode .cpp}
Foo x(4, 5);
```
- Indentation should be accompanied by braces; this includes
single-line bodies for conditionals and loops.
- Exception checking:
- Throw an exception to report a problem.
- Nothing that calls `abort`, `exit` or `terminate` should
be used. Remember that ngraph is a guest of the framework.
- Do not use exclamation points in messages!
- Be as specific as practical. Keep in mind that the person who
sees the error is likely to be on the other side of the
framework and the message might be the only information they see
about the problem.
- If you use `auto`, know what you are doing. `auto` uses the same
type-stripping rules as template parameters. If something returns a
reference, `auto` will strip the reference unless you use `auto&`:
- Don't do things like
``` {.sourceCode .cpp}
auto s = Shape{2,3};
```
Instead, use
``` {.sourceCode .cpp}
Shape s{2, 3};
```
- Indicate the type in the variable name.
- One variable declaration/definition per line
- Don't use the C-style
``` {.sourceCode .cpp}
int x, y, *z;
```
Instead, use:
``` {.sourceCode .cpp}
int x;
int y;
int* z;
```
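
Taken together, the top of a hypothetical `.cpp` file that follows these
formatting rules might look like this (the function and check are invented
for illustration):

``` {.sourceCode .cpp}
// my_pass.cpp -- illustrative only.

// System headers first: angle brackets, not checked for changes during builds.
#include <memory>
#include <stdexcept>

// ngraph headers last: quotes, checked for changes during builds.
#include "ngraph/node.hpp"
#include "ngraph/shape.hpp"

using namespace ngraph; // `using` at file scope is acceptable in .cpp files

void check_is_matrix(const std::shared_ptr<Node>& node)
{
    Shape expected_shape{2, 3};          // brace initialization preferred
    size_t rank = expected_shape.size(); // one declaration per line

    if (node->get_shape().size() != rank) // braces even for short bodies
    {
        // Throw to report problems; never abort/exit/terminate. Be specific,
        // and skip the exclamation points.
        throw std::invalid_argument("expected a rank-2 (matrix) input node");
    }
}
```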
.. project/index.rst

#################
More about nGraph
#################
This section contains documentation about the project and how to contribute.
@@ -13,4 +13,3 @@
release-notes.rst
code-contributor-README.rst
doc-contributor-README.rst
.. python_api/index.rst
###########
Python API
###########
This section contains the documentation for the nGraph™ Python API, which
exposes nGraph C++ operations to Python users. For a quick start, you can
find an example of API usage below.
.. literalinclude:: ../../../../python/examples/basic.py
   :language: python