Commit ab110c0a authored by Alexander Alekhin

Merge pull request #10979 from dkurt:unite_dnn_samples

parents cc06935a 538fd423
......@@ -13,50 +13,53 @@ We will demonstrate results of this example on the following picture.
Source Code
-----------
We will be using snippets from the example application, that can be downloaded [here](https://github.com/opencv/opencv/blob/master/samples/dnn/caffe_googlenet.cpp).
We will be using snippets from the example application, that can be downloaded [here](https://github.com/opencv/opencv/blob/master/samples/dnn/classification.cpp).
@include dnn/caffe_googlenet.cpp
@include dnn/classification.cpp
Explanation
-----------
-# First, download the GoogLeNet model files:
[bvlc_googlenet.prototxt](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/dnn/bvlc_googlenet.prototxt) and
[bvlc_googlenet.prototxt](https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/bvlc_googlenet.prototxt) and
[bvlc_googlenet.caffemodel](http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel)
Also you need a file with the names of [ILSVRC2012](http://image-net.org/challenges/LSVRC/2012/browse-synsets) classes:
[synset_words.txt](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/dnn/synset_words.txt).
[classification_classes_ILSVRC2012.txt](https://github.com/opencv/opencv/tree/master/samples/dnn/classification_classes_ILSVRC2012.txt).
Put these files into the working directory of this program example.
-# Read and initialize the network using paths to the .prototxt and .caffemodel files
@snippet dnn/caffe_googlenet.cpp Read and initialize network
@snippet dnn/classification.cpp Read and initialize network
-# Check that network was read successfully
@snippet dnn/caffe_googlenet.cpp Check that network was read successfully
You can skip the `framework` argument if one of the `model` or `config` files has a
`.caffemodel` or `.prototxt` extension.
This way the function cv::dnn::readNet can automatically detect the model's format.
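As a minimal sketch (using the file names from this tutorial), all of these calls load the same network:
@code{.cpp}
// The framework is inferred from the .caffemodel/.prototxt extensions,
// so the explicit "caffe" tag in the last call is optional:
Net net1 = readNet("bvlc_googlenet.caffemodel", "bvlc_googlenet.prototxt");
Net net2 = readNet("bvlc_googlenet.prototxt", "bvlc_googlenet.caffemodel"); // argument order does not matter
Net net3 = readNet("bvlc_googlenet.caffemodel", "bvlc_googlenet.prototxt", "caffe");
@endcode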
-# Read the input image and convert it to a blob acceptable by GoogLeNet
@snippet dnn/caffe_googlenet.cpp Prepare blob
We convert the image to a 4-dimensional blob (a so-called batch) with 1x3x224x224 shape after applying the necessary pre-processing like resizing and mean subtraction using the cv::dnn::blobFromImage function.
@snippet dnn/classification.cpp Open a video file or an image file or a camera stream
-# Pass the blob to the network
@snippet dnn/caffe_googlenet.cpp Set input blob
In bvlc_googlenet.prototxt the network input blob is named "data", therefore this blob is labeled as ".data" in the opencv_dnn API.
cv::VideoCapture can load both images and videos.
@snippet dnn/classification.cpp Create a 4D blob from a frame
We convert the image to a 4-dimensional blob (a so-called batch) with `1x3x224x224` shape
after applying the necessary pre-processing like resizing and mean subtraction
`(-104, -117, -123)` for the blue, green and red channels correspondingly, using the cv::dnn::blobFromImage function.
Other blobs are labeled as "name_of_layer.name_of_layer_output".
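A minimal sketch of this step, with the values used throughout this tutorial:
@code{.cpp}
// Resize to 224x224 and subtract the mean (104, 117, 123) from the
// B, G and R channels; the result is a 1x3x224x224 floating-point blob.
Mat img = imread("space_shuttle.jpg");
Mat blob = blobFromImage(img, 1.0, Size(224, 224), Scalar(104, 117, 123),
                         false /*swapRB*/, false /*crop*/);
@endcode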
-# Pass the blob to the network
@snippet dnn/classification.cpp Set input blob
-# Make forward pass
@snippet dnn/caffe_googlenet.cpp Make forward pass
During the forward pass, the output of each network layer is computed, but in this example we need the output of the "prob" layer only.
@snippet dnn/classification.cpp Make forward pass
During the forward pass, the output of each network layer is computed, but in this example we need the output of the last layer only.
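In a sketch:
@code{.cpp}
// forward() without arguments runs the whole network and returns
// the output blob of its last layer.
Mat prob = net.forward();
@endcode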
-# Determine the best class
@snippet dnn/caffe_googlenet.cpp Gather output
We put the output of the "prob" layer, which contains probabilities for each of the 1000 ILSVRC2012 image classes, into the `prob` blob.
Then we find the index of the element with the maximal value; this index corresponds to the class of the image.
-# Print results
@snippet dnn/caffe_googlenet.cpp Print results
For our image we get:
> Best class: #812 'space shuttle'
>
> Probability: 99.6378%
@snippet dnn/classification.cpp Get a class with a highest score
We take the output of the network, which contains probabilities for each of the 1000 ILSVRC2012 image classes, as the `prob` blob.
Then we find the index of the element with the maximal value; this index corresponds to the class of the image.
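A minimal sketch of this search using cv::minMaxLoc:
@code{.cpp}
// Reshape the output to a single row and locate the maximal element;
// its column index is the predicted class id.
Point classIdPoint;
double confidence;
minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.x;
@endcode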
-# Run the example from the command line
@code
./example_dnn_classification --model=bvlc_googlenet.caffemodel --config=bvlc_googlenet.prototxt --width=224 --height=224 --classes=classification_classes_ILSVRC2012.txt --input=space_shuttle.jpg --mean="104 117 123"
@endcode
For our image we get a prediction of the class `space shuttle` with more than 99% confidence.
......@@ -74,46 +74,7 @@ When you build OpenCV add the following configuration flags:
- `HALIDE_ROOT_DIR` - path to Halide build directory
## Sample
@include dnn/squeezenet_halide.cpp
## Explanation
Download the Caffe model from the SqueezeNet repository: [train_val.prototxt](https://github.com/DeepScale/SqueezeNet/blob/master/SqueezeNet_v1.1/train_val.prototxt) and [squeezenet_v1.1.caffemodel](https://github.com/DeepScale/SqueezeNet/blob/master/SqueezeNet_v1.1/squeezenet_v1.1.caffemodel).
Also you need a file with the names of [ILSVRC2012](http://image-net.org/challenges/LSVRC/2012/browse-synsets) classes:
[synset_words.txt](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/dnn/synset_words.txt).
Put these files into the working directory of this program example.
-# Read and initialize the network using paths to the .prototxt and .caffemodel files
@snippet dnn/squeezenet_halide.cpp Read and initialize network
-# Check that network was read successfully
@snippet dnn/squeezenet_halide.cpp Check that network was read successfully
-# Read the input image and convert it to the 4-dimensional blob acceptable by SqueezeNet v1.1
@snippet dnn/squeezenet_halide.cpp Prepare blob
-# Pass the blob to the network
@snippet dnn/squeezenet_halide.cpp Set input blob
-# Enable Halide backend for layers where it is implemented
@snippet dnn/squeezenet_halide.cpp Enable Halide backend
-# Make forward pass
@snippet dnn/squeezenet_halide.cpp Make forward pass
Remember that the first forward pass after initialization requires considerably more
time than the subsequent ones, because the Halide pipelines are compiled at runtime
on the first invocation.
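A hedged sketch of excluding this warm-up from a measurement (cv::TickMeter is used here only for illustration):
@code{.cpp}
// The first call triggers runtime compilation of the Halide pipelines,
// so it is excluded from the timing below.
net.forward();              // warm-up pass
TickMeter tm;
tm.start();
Mat prob = net.forward();   // runs the already compiled pipelines
tm.stop();
std::cout << "Inference time: " << tm.getTimeMilli() << " ms" << std::endl;
@endcode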
-# Determine the best class
@snippet dnn/squeezenet_halide.cpp Determine the best class
-# Print results
@snippet dnn/squeezenet_halide.cpp Print results
For our image we get:
> Best class: #812 'space shuttle'
>
> Probability: 97.9812%
## Set Halide as the preferable backend
@code
net.setPreferableBackend(DNN_BACKEND_HALIDE);
@endcode
......@@ -18,40 +18,26 @@ VIDEO DEMO:
Source Code
-----------
The latest version of sample source code can be downloaded [here](https://github.com/opencv/opencv/blob/master/samples/dnn/yolo_object_detection.cpp).
Use the universal sample for object detection models, written
[in C++](https://github.com/opencv/opencv/blob/master/samples/dnn/object_detection.cpp) and
[in Python](https://github.com/opencv/opencv/blob/master/samples/dnn/object_detection.py).
@include dnn/yolo_object_detection.cpp
How to compile from the command line with pkg-config
----------------------------------------------
@code{.bash}
# g++ `pkg-config --cflags opencv` yolo_object_detection.cpp -o yolo_object_detection `pkg-config --libs opencv`
@endcode
Usage examples
--------------
Execute with a webcam:
@code{.bash}
$ yolo_object_detection -camera_device=0 -cfg=[PATH-TO-DARKNET]/cfg/yolo.cfg -model=[PATH-TO-DARKNET]/yolo.weights -class_names=[PATH-TO-DARKNET]/data/coco.names
@endcode
Execute with an image:
@code{.bash}
$ yolo_object_detection -source=[PATH-IMAGE] -cfg=[PATH-TO-DARKNET]/cfg/yolo.cfg -model=[PATH-TO-DARKNET]/yolo.weights -class_names=[PATH-TO-DARKNET]/data/coco.names
$ example_dnn_object_detection --config=[PATH-TO-DARKNET]/cfg/yolo.cfg --model=[PATH-TO-DARKNET]/yolo.weights --classes=object_detection_classes_pascal_voc.txt --width=416 --height=416 --scale=0.00392
@endcode
Execute with a video file:
Execute with an image or video file:
@code{.bash}
$ yolo_object_detection -source=[PATH-TO-VIDEO] -cfg=[PATH-TO-DARKNET]/cfg/yolo.cfg -model=[PATH-TO-DARKNET]/yolo.weights -class_names=[PATH-TO-DARKNET]/data/coco.names
$ example_dnn_object_detection --config=[PATH-TO-DARKNET]/cfg/yolo.cfg --model=[PATH-TO-DARKNET]/yolo.weights --classes=object_detection_classes_pascal_voc.txt --width=416 --height=416 --scale=0.00392 --input=[PATH-TO-IMAGE-OR-VIDEO-FILE]
@endcode
......
......@@ -3159,7 +3159,7 @@ protected:
struct Param {
enum { INT=0, BOOLEAN=1, REAL=2, STRING=3, MAT=4, MAT_VECTOR=5, ALGORITHM=6, FLOAT=7,
UNSIGNED_INT=8, UINT64=9, UCHAR=11 };
UNSIGNED_INT=8, UINT64=9, UCHAR=11, SCALAR=12 };
};
......@@ -3252,6 +3252,14 @@ template<> struct ParamType<uchar>
enum { type = Param::UCHAR };
};
template<> struct ParamType<Scalar>
{
typedef const Scalar& const_param_type;
typedef Scalar member_type;
enum { type = Param::SCALAR };
};
//! @} core_basic
} //namespace cv
......
......@@ -104,6 +104,12 @@ static void from_str(const String& str, int type, void* dst)
ss >> *(double*)dst;
else if( type == Param::STRING )
*(String*)dst = str;
else if( type == Param::SCALAR)
{
Scalar& scalar = *(Scalar*)dst;
for (int i = 0; i < 4 && !ss.eof(); ++i)
ss >> scalar[i];
}
else
CV_Error(Error::StsBadArg, "unknown/unsupported parameter type");
......
......@@ -261,4 +261,26 @@ TEST(AutoBuffer, allocate_test)
EXPECT_EQ(6u, abuf.size());
}
TEST(CommandLineParser, testScalar)
{
static const char * const keys3 =
"{ s0 | 3 4 5 | default scalar }"
"{ s1 | | single value scalar }"
"{ s2 | | two values scalar (default with zeros) }"
"{ s3 | | three values scalar }"
"{ s4 | | four values scalar }"
"{ s5 | | five values scalar }";
const char* argv[] = {"<bin>", "--s1=1.1", "--s3=1.1 2.2 3",
"--s4=-4.2 1 0 3", "--s5=5 -4 3 2 1"};
const int argc = 5;
CommandLineParser parser(argc, argv, keys3);
EXPECT_EQ(parser.get<Scalar>("s0"), Scalar(3, 4, 5));
EXPECT_EQ(parser.get<Scalar>("s1"), Scalar(1.1));
EXPECT_EQ(parser.get<Scalar>("s2"), Scalar(0));
EXPECT_EQ(parser.get<Scalar>("s3"), Scalar(1.1, 2.2, 3));
EXPECT_EQ(parser.get<Scalar>("s4"), Scalar(-4.2, 1, 0, 3));
EXPECT_EQ(parser.get<Scalar>("s5"), Scalar(5, -4, 3, 2));
}
}} // namespace
......@@ -153,7 +153,7 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN
*/
int inputNameToIndex(String inputName);
int outputNameToIndex(String outputName);
int outputNameToIndex(const String& outputName);
};
/** @brief Classical recurrent layer
......
......@@ -222,7 +222,7 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN
/** @brief Returns index of output blob in output array.
* @see inputNameToIndex()
*/
virtual int outputNameToIndex(String outputName);
CV_WRAP virtual int outputNameToIndex(const String& outputName);
/**
* @brief Ask layer if it support specific backend for doing computations.
......@@ -683,6 +683,29 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN
*/
CV_EXPORTS_W Net readNetFromTorch(const String &model, bool isBinary = true);
/**
* @brief Read deep learning network represented in one of the supported formats.
* @param[in] model Binary file containing trained weights. The following file
* extensions are expected for models from different frameworks:
* * `*.caffemodel` (Caffe, http://caffe.berkeleyvision.org/)
* * `*.pb` (TensorFlow, https://www.tensorflow.org/)
* * `*.t7` | `*.net` (Torch, http://torch.ch/)
* * `*.weights` (Darknet, https://pjreddie.com/darknet/)
* @param[in] config Text file containing the network configuration. It could be a
* file with the following extensions:
* * `*.prototxt` (Caffe, http://caffe.berkeleyvision.org/)
* * `*.pbtxt` (TensorFlow, https://www.tensorflow.org/)
* * `*.cfg` (Darknet, https://pjreddie.com/darknet/)
* @param[in] framework Explicit framework name tag to determine a format.
* @returns Net object.
*
* This function automatically detects the origin framework of the trained model
* and calls an appropriate function such as @ref readNetFromCaffe, @ref readNetFromTensorflow,
* @ref readNetFromTorch or @ref readNetFromDarknet. The order of the @p model and @p config
* arguments does not matter.
*/
CV_EXPORTS_W Net readNet(const String& model, const String& config = "", const String& framework = "");
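// Illustrative usage only (a hedged sketch; the file names below are placeholders):
//   Net caffeNet   = readNet("weights.caffemodel", "deploy.prototxt"); // Caffe; argument order does not matter
//   Net darknetNet = readNet("yolo.weights", "yolo.cfg");              // Darknet
//   Net torchNet   = readNet("model.t7");                              // Torch; config may stay empty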
/** @brief Loads blob which was serialized as torch.Tensor object of Torch7 framework.
* @warning This function has the same limitations as readNetFromTorch().
*/
......
......@@ -399,7 +399,7 @@ struct DataLayer : public Layer
void forward(std::vector<Mat*>&, std::vector<Mat>&, std::vector<Mat> &) {}
void forward(InputArrayOfArrays inputs, OutputArrayOfArrays outputs, OutputArrayOfArrays internals) {}
int outputNameToIndex(String tgtName)
int outputNameToIndex(const String& tgtName)
{
int idx = (int)(std::find(outNames.begin(), outNames.end(), tgtName) - outNames.begin());
return (idx < (int)outNames.size()) ? idx : -1;
......@@ -2521,7 +2521,7 @@ int Layer::inputNameToIndex(String)
return -1;
}
int Layer::outputNameToIndex(String)
int Layer::outputNameToIndex(const String&)
{
return -1;
}
......@@ -2813,5 +2813,43 @@ BackendWrapper::BackendWrapper(const Ptr<BackendWrapper>& base, const MatShape&
BackendWrapper::~BackendWrapper() {}
Net readNet(const String& _model, const String& _config, const String& _framework)
{
String framework = _framework.toLowerCase();
String model = _model;
String config = _config;
const std::string modelExt = model.substr(model.rfind('.') + 1);
const std::string configExt = config.substr(config.rfind('.') + 1);
if (framework == "caffe" || modelExt == "caffemodel" || configExt == "caffemodel" ||
modelExt == "prototxt" || configExt == "prototxt")
{
if (modelExt == "prototxt" || configExt == "caffemodel")
std::swap(model, config);
return readNetFromCaffe(config, model);
}
if (framework == "tensorflow" || modelExt == "pb" || configExt == "pb" ||
modelExt == "pbtxt" || configExt == "pbtxt")
{
if (modelExt == "pbtxt" || configExt == "pb")
std::swap(model, config);
return readNetFromTensorflow(model, config);
}
if (framework == "torch" || modelExt == "t7" || modelExt == "net" ||
configExt == "t7" || configExt == "net")
{
return readNetFromTorch(model.empty() ? config : model);
}
if (framework == "darknet" || modelExt == "weights" || configExt == "weights" ||
modelExt == "cfg" || configExt == "cfg")
{
if (modelExt == "cfg" || configExt == "weights")
std::swap(model, config);
return readNetFromDarknet(config, model);
}
CV_Error(Error::StsError, "Cannot determine an origin framework of files: " +
model + (config.empty() ? "" : ", " + config));
return Net();
}
CV__DNN_EXPERIMENTAL_NS_END
}} // namespace
......@@ -355,7 +355,7 @@ int LSTMLayer::inputNameToIndex(String inputName)
return -1;
}
int LSTMLayer::outputNameToIndex(String outputName)
int LSTMLayer::outputNameToIndex(const String& outputName)
{
if (outputName.toLowerCase() == "h")
return 0;
......
......@@ -57,4 +57,22 @@ TEST(imagesFromBlob, Regression)
}
}
TEST(readNet, Regression)
{
Net net = readNet(findDataFile("dnn/squeezenet_v1.1.prototxt", false),
findDataFile("dnn/squeezenet_v1.1.caffemodel", false));
EXPECT_FALSE(net.empty());
net = readNet(findDataFile("dnn/opencv_face_detector.caffemodel", false),
findDataFile("dnn/opencv_face_detector.prototxt", false));
EXPECT_FALSE(net.empty());
net = readNet(findDataFile("dnn/openface_nn4.small2.v1.t7", false));
EXPECT_FALSE(net.empty());
net = readNet(findDataFile("dnn/tiny-yolo-voc.cfg", false),
findDataFile("dnn/tiny-yolo-voc.weights", false));
EXPECT_FALSE(net.empty());
net = readNet(findDataFile("dnn/ssd_mobilenet_v1_coco.pbtxt", false),
findDataFile("dnn/ssd_mobilenet_v1_coco.pb", false));
EXPECT_FALSE(net.empty());
}
}} // namespace
Unlabeled 0 0 0
Road 128 64 128
Sidewalk 244 35 232
Building 70 70 70
Wall 102 102 156
Fence 190 153 153
Pole 153 153 153
TrafficLight 250 170 30
TrafficSign 220 220 0
Vegetation 107 142 35
Terrain 152 251 152
Sky 70 130 180
Person 220 20 60
Rider 255 0 0
Car 0 0 142
Truck 0 0 70
Bus 0 60 100
Train 0 80 100
Motorcycle 0 0 230
Bicycle 119 11 32
\ No newline at end of file
Unlabeled
Road
Sidewalk
Building
Wall
Fence
Pole
TrafficLight
TrafficSign
Vegetation
Terrain
Sky
Person
Rider
Car
Truck
Bus
Train
Motorcycle
Bicycle
person
bicycle
car
motorcycle
airplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
couch
potted plant
bed
dining table
toilet
tv
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
background 0 0 0
aeroplane 128 0 0
bicycle 0 128 0
bird 128 128 0
boat 0 0 128
bottle 128 0 128
bus 0 128 128
car 128 128 128
cat 64 0 0
chair 192 0 0
cow 64 128 0
diningtable 192 128 0
dog 64 0 128
horse 192 0 128
motorbike 64 128 128
person 192 128 128
pottedplant 0 64 0
sheep 128 64 0
sofa 0 192 0
train 128 192 0
tvmonitor 0 64 128
# OpenCV deep learning module samples
## Model Zoo
### Object detection
| Model | Scale | Size WxH| Mean subtraction | Channels order |
|---------------|-------|-----------|--------------------|-------|
| [MobileNet-SSD, Caffe](https://github.com/chuanqi305/MobileNet-SSD/) | `0.00784 (2/255)` | `300x300` | `127.5 127.5 127.5` | BGR |
| [OpenCV face detector](https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector) | `1.0` | `300x300` | `104 177 123` | BGR |
| [SSDs from TensorFlow](https://github.com/tensorflow/models/tree/master/research/object_detection/) | `0.00784 (2/255)` | `300x300` | `127.5 127.5 127.5` | RGB |
| [YOLO](https://pjreddie.com/darknet/yolo/) | `0.00392 (1/255)` | `416x416` | `0 0 0` | RGB |
| [VGG16-SSD](https://github.com/weiliu89/caffe/tree/ssd) | `1.0` | `300x300` | `104 117 123` | BGR |
| [Faster-RCNN](https://github.com/rbgirshick/py-faster-rcnn) | `1.0` | `800x600` | `102.9801 115.9465 122.7717` | BGR |
| [R-FCN](https://github.com/YuwenXiong/py-R-FCN) | `1.0` | `800x600` | `102.9801 115.9465 122.7717` | BGR |
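The table values map directly onto the samples' preprocessing parameters. A hedged C++ sketch for the MobileNet-SSD row (file names are placeholders):
```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
using namespace cv;
using namespace cv::dnn;

int main()
{
    // Values come from the MobileNet-SSD row above:
    // scale 0.00784 (2/255), input 300x300, mean 127.5 per channel, BGR order.
    Net net = readNet("MobileNetSSD_deploy.caffemodel", "MobileNetSSD_deploy.prototxt");
    Mat img = imread("example.jpg");                        // any BGR test image
    Mat blob = blobFromImage(img, 0.00784, Size(300, 300),
                             Scalar(127.5, 127.5, 127.5),   // mean subtraction
                             false);                        // keep BGR channel order
    net.setInput(blob);
    Mat detections = net.forward();
    return 0;
}
```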
### Classification
| Model | Scale | Size WxH| Mean subtraction | Channels order |
|---------------|-------|-----------|--------------------|-------|
| GoogLeNet | `1.0` | `224x224` | `104 117 123` | BGR |
| [SqueezeNet](https://github.com/DeepScale/SqueezeNet) | `1.0` | `227x227` | `0 0 0` | BGR |
### Semantic segmentation
| Model | Scale | Size WxH| Mean subtraction | Channels order |
|---------------|-------|-----------|--------------------|-------|
| [ENet](https://github.com/e-lab/ENet-training) | `0.00392 (1/255)` | `1024x512` | `0 0 0` | RGB |
| FCN8s | `1.0` | `500x500` | `0 0 0` | BGR |
## References
* [Models downloading script](https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/download_models.py)
* [Configuration files adopted for OpenCV](https://github.com/opencv/opencv_extra/tree/master/testdata/dnn)
* [How to import models from TensorFlow Object Detection API](https://github.com/opencv/opencv/wiki/TensorFlow-Object-Detection-API)
* [Names of classes from different datasets](https://github.com/opencv/opencv/tree/master/samples/data/dnn)
/**M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/core/utils/trace.hpp>
using namespace cv;
using namespace cv::dnn;
#include <fstream>
#include <iostream>
#include <cstdlib>
using namespace std;
/* Find the best class for the blob (i.e. the class with maximal probability) */
static void getMaxClass(const Mat &probBlob, int *classId, double *classProb)
{
Mat probMat = probBlob.reshape(1, 1); //reshape the blob to 1x1000 matrix
Point classNumber;
minMaxLoc(probMat, NULL, classProb, NULL, &classNumber);
*classId = classNumber.x;
}
static std::vector<String> readClassNames(const char *filename )
{
std::vector<String> classNames;
std::ifstream fp(filename);
if (!fp.is_open())
{
std::cerr << "File with classes labels not found: " << filename << std::endl;
exit(-1);
}
std::string name;
while (!fp.eof())
{
std::getline(fp, name);
if (name.length())
classNames.push_back( name.substr(name.find(' ')+1) );
}
fp.close();
return classNames;
}
const char* params
= "{ help | false | Sample app for loading googlenet model }"
"{ proto | bvlc_googlenet.prototxt | model configuration }"
"{ model | bvlc_googlenet.caffemodel | model weights }"
"{ label | synset_words.txt | names of ILSVRC2012 classes }"
"{ image | space_shuttle.jpg | path to image file }"
"{ opencl | false | enable OpenCL }"
;
int main(int argc, char **argv)
{
CV_TRACE_FUNCTION();
CommandLineParser parser(argc, argv, params);
if (parser.get<bool>("help"))
{
parser.printMessage();
return 0;
}
String modelTxt = parser.get<string>("proto");
String modelBin = parser.get<string>("model");
String imageFile = parser.get<String>("image");
String classNameFile = parser.get<String>("label");
Net net;
try {
//! [Read and initialize network]
net = dnn::readNetFromCaffe(modelTxt, modelBin);
//! [Read and initialize network]
}
catch (const cv::Exception& e) {
std::cerr << "Exception: " << e.what() << std::endl;
//! [Check that network was read successfully]
if (net.empty())
{
std::cerr << "Can't load network by using the following files: " << std::endl;
std::cerr << "prototxt: " << modelTxt << std::endl;
std::cerr << "caffemodel: " << modelBin << std::endl;
std::cerr << "bvlc_googlenet.caffemodel can be downloaded here:" << std::endl;
std::cerr << "http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" << std::endl;
exit(-1);
}
//! [Check that network was read successfully]
}
if (parser.get<bool>("opencl"))
{
net.setPreferableTarget(DNN_TARGET_OPENCL);
}
//! [Prepare blob]
Mat img = imread(imageFile);
if (img.empty())
{
std::cerr << "Can't read image from the file: " << imageFile << std::endl;
exit(-1);
}
//GoogLeNet accepts only 224x224 BGR-images
Mat inputBlob = blobFromImage(img, 1.0f, Size(224, 224),
Scalar(104, 117, 123), false); //Convert Mat to batch of images
//! [Prepare blob]
net.setInput(inputBlob, "data"); //set the network input
Mat prob = net.forward("prob"); //compute output
cv::TickMeter t;
for (int i = 0; i < 10; i++)
{
CV_TRACE_REGION("forward");
//! [Set input blob]
net.setInput(inputBlob, "data"); //set the network input
//! [Set input blob]
t.start();
//! [Make forward pass]
prob = net.forward("prob"); //compute output
//! [Make forward pass]
t.stop();
}
//! [Gather output]
int classId;
double classProb;
getMaxClass(prob, &classId, &classProb);//find the best class
//! [Gather output]
//! [Print results]
std::vector<String> classNames = readClassNames(classNameFile.c_str());
std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
std::cout << "Probability: " << classProb * 100 << "%" << std::endl;
//! [Print results]
std::cout << "Time: " << (double)t.getTimeMilli() / t.getCounter() << " ms (average from " << t.getCounter() << " iterations)" << std::endl;
return 0;
} //main
#include <fstream>
#include <sstream>
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
const char* keys =
"{ help h | | Print help message. }"
"{ input i | | Path to input image or video file. Skip this argument to capture frames from a camera.}"
"{ model m | | Path to a binary file of model contains trained weights. "
"It could be a file with extensions .caffemodel (Caffe), "
".pb (TensorFlow), .t7 or .net (Torch), .weights (Darknet) }"
"{ config c | | Path to a text file of model contains network configuration. "
"It could be a file with extensions .prototxt (Caffe), .pbtxt (TensorFlow), .cfg (Darknet) }"
"{ framework f | | Optional name of an origin framework of the model. Detect it automatically if it does not set. }"
"{ classes | | Optional path to a text file with names of classes. }"
"{ mean | | Preprocess input image by subtracting mean values. Mean values should be in BGR order and delimited by spaces. }"
"{ scale | 1 | Preprocess input image by multiplying on a scale factor. }"
"{ width | | Preprocess input image by resizing to a specific width. }"
"{ height | | Preprocess input image by resizing to a specific height. }"
"{ rgb | | Indicate that model works with RGB input images instead BGR ones. }"
"{ backend | 0 | Choose one of computation backends: "
"0: default C++ backend, "
"1: Halide language (http://halide-lang.org/), "
"2: Intel's Deep Learning Inference Engine (https://software.seek.intel.com/deep-learning-deployment)}"
"{ target | 0 | Choose one of target computation devices: "
"0: CPU target (by default),"
"1: OpenCL }";
using namespace cv;
using namespace dnn;
std::vector<std::string> classes;
int main(int argc, char** argv)
{
CommandLineParser parser(argc, argv, keys);
parser.about("Use this script to run classification deep learning networks using OpenCV.");
if (argc == 1 || parser.has("help"))
{
parser.printMessage();
return 0;
}
float scale = parser.get<float>("scale");
Scalar mean = parser.get<Scalar>("mean");
bool swapRB = parser.get<bool>("rgb");
CV_Assert(parser.has("width"), parser.has("height"));
int inpWidth = parser.get<int>("width");
int inpHeight = parser.get<int>("height");
String model = parser.get<String>("model");
String config = parser.get<String>("config");
String framework = parser.get<String>("framework");
int backendId = parser.get<int>("backend");
int targetId = parser.get<int>("target");
// Open file with classes names.
if (parser.has("classes"))
{
std::string file = parser.get<String>("classes");
std::ifstream ifs(file.c_str());
if (!ifs.is_open())
CV_Error(Error::StsError, "File " + file + " not found");
std::string line;
while (std::getline(ifs, line))
{
classes.push_back(line);
}
}
CV_Assert(parser.has("model"));
//! [Read and initialize network]
Net net = readNet(model, config, framework);
net.setPreferableBackend(backendId);
net.setPreferableTarget(targetId);
//! [Read and initialize network]
// Create a window
static const std::string kWinName = "Deep learning image classification in OpenCV";
namedWindow(kWinName, WINDOW_NORMAL);
//! [Open a video file or an image file or a camera stream]
VideoCapture cap;
if (parser.has("input"))
cap.open(parser.get<String>("input"));
else
cap.open(0);
//! [Open a video file or an image file or a camera stream]
// Process frames.
Mat frame, blob;
while (waitKey(1) < 0)
{
cap >> frame;
if (frame.empty())
{
waitKey();
break;
}
//! [Create a 4D blob from a frame]
blobFromImage(frame, blob, scale, Size(inpWidth, inpHeight), mean, swapRB, false);
//! [Create a 4D blob from a frame]
//! [Set input blob]
net.setInput(blob);
//! [Set input blob]
//! [Make forward pass]
Mat prob = net.forward();
//! [Make forward pass]
//! [Get a class with a highest score]
Point classIdPoint;
double confidence;
minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.x;
//! [Get a class with a highest score]
// Put efficiency information.
std::vector<double> layersTimes;
double freq = getTickFrequency() / 1000;
double t = net.getPerfProfile(layersTimes) / freq;
std::string label = format("Inference time: %.2f ms", t);
putText(frame, label, Point(0, 15), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0));
// Print predicted class.
label = format("%s: %.4f", (classes.empty() ? format("Class #%d", classId).c_str() :
classes[classId].c_str()),
confidence);
putText(frame, label, Point(0, 40), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0));
imshow(kWinName, frame);
}
return 0;
}