.. about:

About
=====

Welcome to the Intel nGraph project, an open-source C++ library for
developers of :abbr:`Deep Learning (DL)` systems. Here you will find a
suite of components, APIs, and documentation that can be used to compile
and run :abbr:`Deep Neural Network (DNN)` models defined in a variety of
frameworks.

.. figure:: ../graphics/ngraph-hub.png

The nGraph library translates a framework's representation of computations
into an :abbr:`Intermediate Representation (IR)` designed to promote
computational efficiency on target hardware. Initially supported backends
include Intel Architecture CPUs, the Intel® Nervana Neural Network
Processor™ (NNP), and NVIDIA\* GPUs. Currently supported compiler
optimizations include efficient memory management and data layout
abstraction.

Why is this needed?
--------------------

When :abbr:`Deep Learning (DL)` frameworks first emerged as the vehicle for
training and inference models, they were designed around kernels optimized
for a particular platform. As a result, many backend details leaked into
the model definitions, making it inherently complex and expensive to adapt
and port DL models to other or more advanced backends.

The traditional approach means that an algorithm developer cannot easily
adapt their model to different backends. Making a model run on a different
framework is also problematic: the user must separate the essence of the
model from the performance adjustments made for the original backend,
translate the model to similar ops in the new framework, and finally make
the necessary changes for the preferred backend configuration on the new
framework.

We designed the Intel nGraph project to substantially reduce these kinds of
engineering complexities. While optimized kernels for deep-learning
primitives are provided through the project and via libraries like the
Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN), there
are several compiler-inspired ways in which performance can be further
optimized.

How does it work?
------------------

The *nGraph core* uses a strongly-typed and platform-neutral stateless
graph representation for computations. Each node, or *op*, in the graph
corresponds to one step in a computation, where each step produces zero or
more tensor outputs from zero or more tensor inputs.

There is a *framework bridge* for each supported framework which acts as an
intermediary between the *nGraph core* and the framework. A *transformer*
plays a similar role between the nGraph core and the various execution
platforms. Transformers compile the graph using a combination of generic
and platform-specific graph transformations. The result is a function that
can be executed from the framework bridge. Transformers also allocate and
deallocate tensors, and read and write them, under the direction of the
bridge. A sketch of both steps follows below.

We developed Intel nGraph to simplify the realization of optimized deep
learning performance across frameworks and hardware platforms. You can read
more about our design decisions and what is tentatively in the development
pipeline in our `SysML conference paper`_.

.. _frontend: http://neon.nervanasys.com/index.html/
.. _SysML conference paper: https://arxiv.org/pdf/1801.08058.pdf
.. _MXNet: http://mxnet.incubator.apache.org/
.. _TensorFlow: https://www.tensorflow.org/
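To make the stateless graph representation described above concrete, here
is a minimal sketch of building a small computation, ``(a + b) * c``, with
the nGraph core C++ API. The class names used here (``op::Parameter``,
``op::Add``, ``op::Multiply``, ``Function``, ``op::ParameterVector``)
reflect our reading of the public API and may differ between releases;
treat this as an illustration rather than a definitive recipe.

.. code-block:: cpp

   #include <memory>
   #include "ngraph/ngraph.hpp"

   using namespace ngraph;

   // Build a graph for the computation (a + b) * c.
   // Each Parameter node is a graph input; each op is one step in the
   // computation and produces a tensor output from its tensor inputs.
   std::shared_ptr<Function> make_abc_function()
   {
       Shape shape{2, 3};
       auto a = std::make_shared<op::Parameter>(element::f32, shape);
       auto b = std::make_shared<op::Parameter>(element::f32, shape);
       auto c = std::make_shared<op::Parameter>(element::f32, shape);

       // Ops are strongly typed: element types and shapes of the
       // inputs are checked when the nodes are constructed.
       auto sum     = std::make_shared<op::Add>(a, b);
       auto product = std::make_shared<op::Multiply>(sum, c);

       // A Function bundles the result node with the ordered
       // parameters; this is the unit a transformer compiles.
       return std::make_shared<Function>(product,
                                         op::ParameterVector{a, b, c});
   }

Note that nothing in this graph refers to a particular device or kernel
library; the same ``Function`` can be handed to any transformer.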
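Continuing the sketch, a framework bridge hands such a function to a
transformer for compilation and execution, and directs tensor allocation
and I/O. The calls below (``runtime::Backend::create``, ``create_tensor``,
``write``, ``read``, ``call``, ``shape_size``) are assumptions based on the
transformer interface of this era and are shown only to illustrate the
flow described above.

.. code-block:: cpp

   #include <memory>
   #include <vector>
   #include "ngraph/ngraph.hpp"

   using namespace ngraph;

   void run_on_cpu(std::shared_ptr<Function> f)
   {
       Shape shape{2, 3};

       // Select a transformer/backend by name; "CPU" targets Intel
       // Architecture CPUs using Intel MKL-DNN kernels.
       auto backend = runtime::Backend::create("CPU");

       // The bridge directs tensor allocation on the backend.
       auto t_a      = backend->create_tensor(element::f32, shape);
       auto t_b      = backend->create_tensor(element::f32, shape);
       auto t_c      = backend->create_tensor(element::f32, shape);
       auto t_result = backend->create_tensor(element::f32, shape);

       // Write input data into the backend tensors.
       std::vector<float> v(shape_size(shape), 1.0f);
       t_a->write(v.data(), 0, v.size() * sizeof(float));
       t_b->write(v.data(), 0, v.size() * sizeof(float));
       t_c->write(v.data(), 0, v.size() * sizeof(float));

       // Compile and execute the function, then read back the result.
       backend->call(f, {t_result}, {t_a, t_b, t_c});
       std::vector<float> result(shape_size(shape));
       t_result->read(result.data(), 0, result.size() * sizeof(float));
   }

Because the graph itself is platform-neutral, the only line tied to a
particular execution platform is the backend selection; in principle an
NNP or GPU transformer could compile the same function without changes to
the model definition.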