Commit ab110c0a authored by Alexander Alekhin

Merge pull request #10979 from dkurt:unite_dnn_samples

parents cc06935a 538fd423
......@@ -13,50 +13,53 @@ We will demonstrate results of this example on the following picture.
Source Code
-----------
We will be using snippets from the example application, that can be downloaded [here](https://github.com/opencv/opencv/blob/master/samples/dnn/caffe_googlenet.cpp).
We will be using snippets from the example application, that can be downloaded [here](https://github.com/opencv/opencv/blob/master/samples/dnn/classification.cpp).
@include dnn/caffe_googlenet.cpp
@include dnn/classification.cpp
Explanation
-----------
-# First, download the GoogLeNet model files:
[bvlc_googlenet.prototxt](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/dnn/bvlc_googlenet.prototxt) and
[bvlc_googlenet.prototxt](https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/bvlc_googlenet.prototxt) and
[bvlc_googlenet.caffemodel](http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel)
Also you need a file with the names of [ILSVRC2012](http://image-net.org/challenges/LSVRC/2012/browse-synsets) classes:
[synset_words.txt](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/dnn/synset_words.txt).
[classification_classes_ILSVRC2012.txt](https://github.com/opencv/opencv/tree/master/samples/dnn/classification_classes_ILSVRC2012.txt).
Put these files into the working directory of this program example.
-# Read and initialize the network using paths to the .prototxt and .caffemodel files
@snippet dnn/caffe_googlenet.cpp Read and initialize network
@snippet dnn/classification.cpp Read and initialize network
-# Check that network was read successfully
@snippet dnn/caffe_googlenet.cpp Check that network was read successfully
You can skip the `framework` argument if one of the `model` or `config` files has a
`.caffemodel` or `.prototxt` extension.
This way the function cv::dnn::readNet can automatically detect the model's format.
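As a minimal sketch (using the file names from this tutorial), all of these calls load the same network:
@code{.cpp}
// The framework is inferred from the .caffemodel/.prototxt extensions,
// so the explicit "caffe" tag in the last call is optional:
Net net1 = readNet("bvlc_googlenet.caffemodel", "bvlc_googlenet.prototxt");
Net net2 = readNet("bvlc_googlenet.prototxt", "bvlc_googlenet.caffemodel"); // argument order does not matter
Net net3 = readNet("bvlc_googlenet.caffemodel", "bvlc_googlenet.prototxt", "caffe");
@endcode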
-# Read the input image and convert it to a blob acceptable by GoogLeNet
@snippet dnn/caffe_googlenet.cpp Prepare blob
We convert the image to a 4-dimensional blob (a so-called batch) with 1x3x224x224 shape after applying the necessary pre-processing like resizing and mean subtraction using the cv::dnn::blobFromImage function.
@snippet dnn/classification.cpp Open a video file or an image file or a camera stream
-# Pass the blob to the network
@snippet dnn/caffe_googlenet.cpp Set input blob
In bvlc_googlenet.prototxt the network input blob is named "data", therefore this blob is labeled as ".data" in the opencv_dnn API.
cv::VideoCapture can load both images and videos.
@snippet dnn/classification.cpp Create a 4D blob from a frame
We convert the image to a 4-dimensional blob (a so-called batch) with `1x3x224x224` shape
after applying the necessary pre-processing like resizing and mean subtraction
`(-104, -117, -123)` for the blue, green and red channels correspondingly, using the cv::dnn::blobFromImage function.
Other blobs are labeled as "name_of_layer.name_of_layer_output".
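A minimal sketch of this step, with the values used throughout this tutorial:
@code{.cpp}
// Resize to 224x224 and subtract the mean (104, 117, 123) from the
// B, G and R channels; the result is a 1x3x224x224 floating-point blob.
Mat img = imread("space_shuttle.jpg");
Mat blob = blobFromImage(img, 1.0, Size(224, 224), Scalar(104, 117, 123),
                         false /*swapRB*/, false /*crop*/);
@endcode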
-# Pass the blob to the network
@snippet dnn/classification.cpp Set input blob
-# Make forward pass
@snippet dnn/caffe_googlenet.cpp Make forward pass
During the forward pass, the output of each network layer is computed, but in this example we need the output of the "prob" layer only.
@snippet dnn/classification.cpp Make forward pass
During the forward pass, the output of each network layer is computed, but in this example we need the output of the last layer only.
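In a sketch:
@code{.cpp}
// forward() without arguments runs the whole network and returns
// the output blob of its last layer.
Mat prob = net.forward();
@endcode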
-# Determine the best class
@snippet dnn/caffe_googlenet.cpp Gather output
We put the output of the "prob" layer, which contains probabilities for each of the 1000 ILSVRC2012 image classes, into the `prob` blob.
Then we find the index of the element with the maximal value; this index corresponds to the class of the image.
-# Print results
@snippet dnn/caffe_googlenet.cpp Print results
For our image we get:
> Best class: #812 'space shuttle'
>
> Probability: 99.6378%
@snippet dnn/classification.cpp Get a class with a highest score
We take the output of the network, which contains probabilities for each of the 1000 ILSVRC2012 image classes, as the `prob` blob.
Then we find the index of the element with the maximal value; this index corresponds to the class of the image.
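A minimal sketch of this search using cv::minMaxLoc:
@code{.cpp}
// Reshape the output to a single row and locate the maximal element;
// its column index is the predicted class id.
Point classIdPoint;
double confidence;
minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.x;
@endcode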
-# Run the example from the command line
@code
./example_dnn_classification --model=bvlc_googlenet.caffemodel --config=bvlc_googlenet.prototxt --width=224 --height=224 --classes=classification_classes_ILSVRC2012.txt --input=space_shuttle.jpg --mean="104 117 123"
@endcode
For our image we get a prediction of the class `space shuttle` with more than 99% confidence.
......@@ -74,46 +74,7 @@ When you build OpenCV add the following configuration flags:
- `HALIDE_ROOT_DIR` - path to Halide build directory
## Sample
@include dnn/squeezenet_halide.cpp
## Explanation
Download the Caffe model from the SqueezeNet repository: [train_val.prototxt](https://github.com/DeepScale/SqueezeNet/blob/master/SqueezeNet_v1.1/train_val.prototxt) and [squeezenet_v1.1.caffemodel](https://github.com/DeepScale/SqueezeNet/blob/master/SqueezeNet_v1.1/squeezenet_v1.1.caffemodel).
Also you need a file with the names of [ILSVRC2012](http://image-net.org/challenges/LSVRC/2012/browse-synsets) classes:
[synset_words.txt](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/dnn/synset_words.txt).
Put these files into the working directory of this program example.
-# Read and initialize the network using paths to the .prototxt and .caffemodel files
@snippet dnn/squeezenet_halide.cpp Read and initialize network
-# Check that network was read successfully
@snippet dnn/squeezenet_halide.cpp Check that network was read successfully
-# Read the input image and convert it to the 4-dimensional blob acceptable by SqueezeNet v1.1
@snippet dnn/squeezenet_halide.cpp Prepare blob
-# Pass the blob to the network
@snippet dnn/squeezenet_halide.cpp Set input blob
-# Enable Halide backend for layers where it is implemented
@snippet dnn/squeezenet_halide.cpp Enable Halide backend
-# Make forward pass
@snippet dnn/squeezenet_halide.cpp Make forward pass
Remember that the first forward pass after initialization requires considerably more
time than the subsequent ones, because the Halide pipelines are compiled at runtime
on the first invocation.
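A hedged sketch of excluding this warm-up from a measurement (cv::TickMeter is used here only for illustration):
@code{.cpp}
// The first call triggers runtime compilation of the Halide pipelines,
// so it is excluded from the timing below.
net.forward();              // warm-up pass
TickMeter tm;
tm.start();
Mat prob = net.forward();   // runs the already compiled pipelines
tm.stop();
std::cout << "Inference time: " << tm.getTimeMilli() << " ms" << std::endl;
@endcode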
-# Determine the best class
@snippet dnn/squeezenet_halide.cpp Determine the best class
-# Print results
@snippet dnn/squeezenet_halide.cpp Print results
For our image we get:
> Best class: #812 'space shuttle'
>
> Probability: 97.9812%
## Set Halide as the preferable backend
@code
net.setPreferableBackend(DNN_BACKEND_HALIDE);
@endcode
......@@ -18,40 +18,26 @@ VIDEO DEMO:
Source Code
-----------
The latest version of sample source code can be downloaded [here](https://github.com/opencv/opencv/blob/master/samples/dnn/yolo_object_detection.cpp).
Use the universal sample for object detection models, written
[in C++](https://github.com/opencv/opencv/blob/master/samples/dnn/object_detection.cpp) and
[in Python](https://github.com/opencv/opencv/blob/master/samples/dnn/object_detection.py).
@include dnn/yolo_object_detection.cpp
How to compile from the command line with pkg-config
----------------------------------------------
@code{.bash}
# g++ `pkg-config --cflags opencv` yolo_object_detection.cpp -o yolo_object_detection `pkg-config --libs opencv`
@endcode
Usage examples
--------------
Execute with a webcam:
@code{.bash}
$ yolo_object_detection -camera_device=0 -cfg=[PATH-TO-DARKNET]/cfg/yolo.cfg -model=[PATH-TO-DARKNET]/yolo.weights -class_names=[PATH-TO-DARKNET]/data/coco.names
@endcode
Execute with an image:
@code{.bash}
$ yolo_object_detection -source=[PATH-IMAGE] -cfg=[PATH-TO-DARKNET]/cfg/yolo.cfg -model=[PATH-TO-DARKNET]/yolo.weights -class_names=[PATH-TO-DARKNET]/data/coco.names
$ example_dnn_object_detection --config=[PATH-TO-DARKNET]/cfg/yolo.cfg --model=[PATH-TO-DARKNET]/yolo.weights --classes=object_detection_classes_pascal_voc.txt --width=416 --height=416 --scale=0.00392
@endcode
Execute with a video file:
Execute with an image or video file:
@code{.bash}
$ yolo_object_detection -source=[PATH-TO-VIDEO] -cfg=[PATH-TO-DARKNET]/cfg/yolo.cfg -model=[PATH-TO-DARKNET]/yolo.weights -class_names=[PATH-TO-DARKNET]/data/coco.names
$ example_dnn_object_detection --config=[PATH-TO-DARKNET]/cfg/yolo.cfg --model=[PATH-TO-DARKNET]/yolo.weights --classes=object_detection_classes_pascal_voc.txt --width=416 --height=416 --scale=0.00392 --input=[PATH-TO-IMAGE-OR-VIDEO-FILE]
@endcode
......
......@@ -3159,7 +3159,7 @@ protected:
struct Param {
enum { INT=0, BOOLEAN=1, REAL=2, STRING=3, MAT=4, MAT_VECTOR=5, ALGORITHM=6, FLOAT=7,
UNSIGNED_INT=8, UINT64=9, UCHAR=11 };
UNSIGNED_INT=8, UINT64=9, UCHAR=11, SCALAR=12 };
};
......@@ -3252,6 +3252,14 @@ template<> struct ParamType<uchar>
enum { type = Param::UCHAR };
};
template<> struct ParamType<Scalar>
{
typedef const Scalar& const_param_type;
typedef Scalar member_type;
enum { type = Param::SCALAR };
};
//! @} core_basic
} //namespace cv
......
......@@ -104,6 +104,12 @@ static void from_str(const String& str, int type, void* dst)
ss >> *(double*)dst;
else if( type == Param::STRING )
*(String*)dst = str;
else if( type == Param::SCALAR)
{
Scalar& scalar = *(Scalar*)dst;
for (int i = 0; i < 4 && !ss.eof(); ++i)
ss >> scalar[i];
}
else
CV_Error(Error::StsBadArg, "unknown/unsupported parameter type");
......
......@@ -261,4 +261,26 @@ TEST(AutoBuffer, allocate_test)
EXPECT_EQ(6u, abuf.size());
}
TEST(CommandLineParser, testScalar)
{
static const char * const keys3 =
"{ s0 | 3 4 5 | default scalar }"
"{ s1 | | single value scalar }"
"{ s2 | | two values scalar (default with zeros) }"
"{ s3 | | three values scalar }"
"{ s4 | | four values scalar }"
"{ s5 | | five values scalar }";
const char* argv[] = {"<bin>", "--s1=1.1", "--s3=1.1 2.2 3",
"--s4=-4.2 1 0 3", "--s5=5 -4 3 2 1"};
const int argc = 5;
CommandLineParser parser(argc, argv, keys3);
EXPECT_EQ(parser.get<Scalar>("s0"), Scalar(3, 4, 5));
EXPECT_EQ(parser.get<Scalar>("s1"), Scalar(1.1));
EXPECT_EQ(parser.get<Scalar>("s2"), Scalar(0));
EXPECT_EQ(parser.get<Scalar>("s3"), Scalar(1.1, 2.2, 3));
EXPECT_EQ(parser.get<Scalar>("s4"), Scalar(-4.2, 1, 0, 3));
EXPECT_EQ(parser.get<Scalar>("s5"), Scalar(5, -4, 3, 2));
}
}} // namespace
......@@ -153,7 +153,7 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN
*/
int inputNameToIndex(String inputName);
int outputNameToIndex(String outputName);
int outputNameToIndex(const String& outputName);
};
/** @brief Classical recurrent layer
......
......@@ -222,7 +222,7 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN
/** @brief Returns index of output blob in output array.
* @see inputNameToIndex()
*/
virtual int outputNameToIndex(String outputName);
CV_WRAP virtual int outputNameToIndex(const String& outputName);
/**
* @brief Ask layer if it support specific backend for doing computations.
......@@ -683,6 +683,29 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN
*/
CV_EXPORTS_W Net readNetFromTorch(const String &model, bool isBinary = true);
/**
* @brief Read deep learning network represented in one of the supported formats.
* @param[in] model Binary file containing trained weights. The following file
* extensions are expected for models from different frameworks:
* * `*.caffemodel` (Caffe, http://caffe.berkeleyvision.org/)
* * `*.pb` (TensorFlow, https://www.tensorflow.org/)
* * `*.t7` | `*.net` (Torch, http://torch.ch/)
* * `*.weights` (Darknet, https://pjreddie.com/darknet/)
* @param[in] config Text file containing the network configuration. It could be a
* file with the following extensions:
* * `*.prototxt` (Caffe, http://caffe.berkeleyvision.org/)
* * `*.pbtxt` (TensorFlow, https://www.tensorflow.org/)
* * `*.cfg` (Darknet, https://pjreddie.com/darknet/)
* @param[in] framework Explicit framework name tag to determine a format.
* @returns Net object.
*
* This function automatically detects the origin framework of the trained model
* and calls an appropriate function such as @ref readNetFromCaffe, @ref readNetFromTensorflow,
* @ref readNetFromTorch or @ref readNetFromDarknet. The order of the @p model and @p config
* arguments does not matter.
*/
CV_EXPORTS_W Net readNet(const String& model, const String& config = "", const String& framework = "");
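// Illustrative usage only (a hedged sketch; the file names below are placeholders):
//   Net caffeNet   = readNet("weights.caffemodel", "deploy.prototxt"); // Caffe; argument order does not matter
//   Net darknetNet = readNet("yolo.weights", "yolo.cfg");              // Darknet
//   Net torchNet   = readNet("model.t7");                              // Torch; config may stay empty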
/** @brief Loads blob which was serialized as torch.Tensor object of Torch7 framework.
* @warning This function has the same limitations as readNetFromTorch().
*/
......
......@@ -399,7 +399,7 @@ struct DataLayer : public Layer
void forward(std::vector<Mat*>&, std::vector<Mat>&, std::vector<Mat> &) {}
void forward(InputArrayOfArrays inputs, OutputArrayOfArrays outputs, OutputArrayOfArrays internals) {}
int outputNameToIndex(String tgtName)
int outputNameToIndex(const String& tgtName)
{
int idx = (int)(std::find(outNames.begin(), outNames.end(), tgtName) - outNames.begin());
return (idx < (int)outNames.size()) ? idx : -1;
......@@ -2521,7 +2521,7 @@ int Layer::inputNameToIndex(String)
return -1;
}
int Layer::outputNameToIndex(String)
int Layer::outputNameToIndex(const String&)
{
return -1;
}
......@@ -2813,5 +2813,43 @@ BackendWrapper::BackendWrapper(const Ptr<BackendWrapper>& base, const MatShape&
BackendWrapper::~BackendWrapper() {}
Net readNet(const String& _model, const String& _config, const String& _framework)
{
String framework = _framework.toLowerCase();
String model = _model;
String config = _config;
const std::string modelExt = model.substr(model.rfind('.') + 1);
const std::string configExt = config.substr(config.rfind('.') + 1);
if (framework == "caffe" || modelExt == "caffemodel" || configExt == "caffemodel" ||
modelExt == "prototxt" || configExt == "prototxt")
{
if (modelExt == "prototxt" || configExt == "caffemodel")
std::swap(model, config);
return readNetFromCaffe(config, model);
}
if (framework == "tensorflow" || modelExt == "pb" || configExt == "pb" ||
modelExt == "pbtxt" || configExt == "pbtxt")
{
if (modelExt == "pbtxt" || configExt == "pb")
std::swap(model, config);
return readNetFromTensorflow(model, config);
}
if (framework == "torch" || modelExt == "t7" || modelExt == "net" ||
configExt == "t7" || configExt == "net")
{
return readNetFromTorch(model.empty() ? config : model);
}
if (framework == "darknet" || modelExt == "weights" || configExt == "weights" ||
modelExt == "cfg" || configExt == "cfg")
{
if (modelExt == "cfg" || configExt == "weights")
std::swap(model, config);
return readNetFromDarknet(config, model);
}
CV_Error(Error::StsError, "Cannot determine an origin framework of files: " +
model + (config.empty() ? "" : ", " + config));
return Net();
}
CV__DNN_EXPERIMENTAL_NS_END
}} // namespace
......@@ -355,7 +355,7 @@ int LSTMLayer::inputNameToIndex(String inputName)
return -1;
}
int LSTMLayer::outputNameToIndex(String outputName)
int LSTMLayer::outputNameToIndex(const String& outputName)
{
if (outputName.toLowerCase() == "h")
return 0;
......
......@@ -57,4 +57,22 @@ TEST(imagesFromBlob, Regression)
}
}
TEST(readNet, Regression)
{
Net net = readNet(findDataFile("dnn/squeezenet_v1.1.prototxt", false),
findDataFile("dnn/squeezenet_v1.1.caffemodel", false));
EXPECT_FALSE(net.empty());
net = readNet(findDataFile("dnn/opencv_face_detector.caffemodel", false),
findDataFile("dnn/opencv_face_detector.prototxt", false));
EXPECT_FALSE(net.empty());
net = readNet(findDataFile("dnn/openface_nn4.small2.v1.t7", false));
EXPECT_FALSE(net.empty());
net = readNet(findDataFile("dnn/tiny-yolo-voc.cfg", false),
findDataFile("dnn/tiny-yolo-voc.weights", false));
EXPECT_FALSE(net.empty());
net = readNet(findDataFile("dnn/ssd_mobilenet_v1_coco.pbtxt", false),
findDataFile("dnn/ssd_mobilenet_v1_coco.pb", false));
EXPECT_FALSE(net.empty());
}
}} // namespace
Unlabeled 0 0 0
Road 128 64 128
Sidewalk 244 35 232
Building 70 70 70
Wall 102 102 156
Fence 190 153 153
Pole 153 153 153
TrafficLight 250 170 30
TrafficSign 220 220 0
Vegetation 107 142 35
Terrain 152 251 152
Sky 70 130 180
Person 220 20 60
Rider 255 0 0
Car 0 0 142
Truck 0 0 70
Bus 0 60 100
Train 0 80 100
Motorcycle 0 0 230
Bicycle 119 11 32
\ No newline at end of file
Unlabeled
Road
Sidewalk
Building
Wall
Fence
Pole
TrafficLight
TrafficSign
Vegetation
Terrain
Sky
Person
Rider
Car
Truck
Bus
Train
Motorcycle
Bicycle
person
bicycle
car
motorcycle
airplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
couch
potted plant
bed
dining table
toilet
tv
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
background 0 0 0
aeroplane 128 0 0
bicycle 0 128 0
bird 128 128 0
boat 0 0 128
bottle 128 0 128
bus 0 128 128
car 128 128 128
cat 64 0 0
chair 192 0 0
cow 64 128 0
diningtable 192 128 0
dog 64 0 128
horse 192 0 128
motorbike 64 128 128
person 192 128 128
pottedplant 0 64 0
sheep 128 64 0
sofa 0 192 0
train 128 192 0
tvmonitor 0 64 128
# OpenCV deep learning module samples
## Model Zoo
### Object detection
| Model | Scale | Size WxH| Mean subtraction | Channels order |
|---------------|-------|-----------|--------------------|-------|
| [MobileNet-SSD, Caffe](https://github.com/chuanqi305/MobileNet-SSD/) | `0.00784 (2/255)` | `300x300` | `127.5 127.5 127.5` | BGR |
| [OpenCV face detector](https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector) | `1.0` | `300x300` | `104 177 123` | BGR |
| [SSDs from TensorFlow](https://github.com/tensorflow/models/tree/master/research/object_detection/) | `0.00784 (2/255)` | `300x300` | `127.5 127.5 127.5` | RGB |
| [YOLO](https://pjreddie.com/darknet/yolo/) | `0.00392 (1/255)` | `416x416` | `0 0 0` | RGB |
| [VGG16-SSD](https://github.com/weiliu89/caffe/tree/ssd) | `1.0` | `300x300` | `104 117 123` | BGR |
| [Faster-RCNN](https://github.com/rbgirshick/py-faster-rcnn) | `1.0` | `800x600` | `102.9801 115.9465 122.7717` | BGR |
| [R-FCN](https://github.com/YuwenXiong/py-R-FCN) | `1.0` | `800x600` | `102.9801 115.9465 122.7717` | BGR |
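The table values map directly onto the samples' preprocessing parameters. A hedged C++ sketch for the MobileNet-SSD row (file names are placeholders):
```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
using namespace cv;
using namespace cv::dnn;

int main()
{
    // Values come from the MobileNet-SSD row above:
    // scale 0.00784 (2/255), input 300x300, mean 127.5 per channel, BGR order.
    Net net = readNet("MobileNetSSD_deploy.caffemodel", "MobileNetSSD_deploy.prototxt");
    Mat img = imread("example.jpg");                        // any BGR test image
    Mat blob = blobFromImage(img, 0.00784, Size(300, 300),
                             Scalar(127.5, 127.5, 127.5),   // mean subtraction
                             false);                        // keep BGR channel order
    net.setInput(blob);
    Mat detections = net.forward();
    return 0;
}
```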
### Classification
| Model | Scale | Size WxH| Mean subtraction | Channels order |
|---------------|-------|-----------|--------------------|-------|
| GoogLeNet | `1.0` | `224x224` | `104 117 123` | BGR |
| [SqueezeNet](https://github.com/DeepScale/SqueezeNet) | `1.0` | `227x227` | `0 0 0` | BGR |
### Semantic segmentation
| Model | Scale | Size WxH| Mean subtraction | Channels order |
|---------------|-------|-----------|--------------------|-------|
| [ENet](https://github.com/e-lab/ENet-training) | `0.00392 (1/255)` | `1024x512` | `0 0 0` | RGB |
| FCN8s | `1.0` | `500x500` | `0 0 0` | BGR |
## References
* [Models downloading script](https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/download_models.py)
* [Configuration files adopted for OpenCV](https://github.com/opencv/opencv_extra/tree/master/testdata/dnn)
* [How to import models from TensorFlow Object Detection API](https://github.com/opencv/opencv/wiki/TensorFlow-Object-Detection-API)
* [Names of classes from different datasets](https://github.com/opencv/opencv/tree/master/samples/data/dnn)
/**M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/core/utils/trace.hpp>
using namespace cv;
using namespace cv::dnn;
#include <fstream>
#include <iostream>
#include <cstdlib>
using namespace std;
/* Find the best class for the blob (i.e. the class with maximal probability) */
static void getMaxClass(const Mat &probBlob, int *classId, double *classProb)
{
Mat probMat = probBlob.reshape(1, 1); //reshape the blob to 1x1000 matrix
Point classNumber;
minMaxLoc(probMat, NULL, classProb, NULL, &classNumber);
*classId = classNumber.x;
}
static std::vector<String> readClassNames(const char *filename )
{
std::vector<String> classNames;
std::ifstream fp(filename);
if (!fp.is_open())
{
std::cerr << "File with classes labels not found: " << filename << std::endl;
exit(-1);
}
std::string name;
while (!fp.eof())
{
std::getline(fp, name);
if (name.length())
classNames.push_back( name.substr(name.find(' ')+1) );
}
fp.close();
return classNames;
}
const char* params
= "{ help | false | Sample app for loading googlenet model }"
"{ proto | bvlc_googlenet.prototxt | model configuration }"
"{ model | bvlc_googlenet.caffemodel | model weights }"
"{ label | synset_words.txt | names of ILSVRC2012 classes }"
"{ image | space_shuttle.jpg | path to image file }"
"{ opencl | false | enable OpenCL }"
;
int main(int argc, char **argv)
{
CV_TRACE_FUNCTION();
CommandLineParser parser(argc, argv, params);
if (parser.get<bool>("help"))
{
parser.printMessage();
return 0;
}
String modelTxt = parser.get<string>("proto");
String modelBin = parser.get<string>("model");
String imageFile = parser.get<String>("image");
String classNameFile = parser.get<String>("label");
Net net;
try {
//! [Read and initialize network]
net = dnn::readNetFromCaffe(modelTxt, modelBin);
//! [Read and initialize network]
}
catch (const cv::Exception& e) {
std::cerr << "Exception: " << e.what() << std::endl;
//! [Check that network was read successfully]
if (net.empty())
{
std::cerr << "Can't load network by using the following files: " << std::endl;
std::cerr << "prototxt: " << modelTxt << std::endl;
std::cerr << "caffemodel: " << modelBin << std::endl;
std::cerr << "bvlc_googlenet.caffemodel can be downloaded here:" << std::endl;
std::cerr << "http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" << std::endl;
exit(-1);
}
//! [Check that network was read successfully]
}
if (parser.get<bool>("opencl"))
{
net.setPreferableTarget(DNN_TARGET_OPENCL);
}
//! [Prepare blob]
Mat img = imread(imageFile);
if (img.empty())
{
std::cerr << "Can't read image from the file: " << imageFile << std::endl;
exit(-1);
}
//GoogLeNet accepts only 224x224 BGR-images
Mat inputBlob = blobFromImage(img, 1.0f, Size(224, 224),
Scalar(104, 117, 123), false); //Convert Mat to batch of images
//! [Prepare blob]
net.setInput(inputBlob, "data"); //set the network input
Mat prob = net.forward("prob"); //compute output
cv::TickMeter t;
for (int i = 0; i < 10; i++)
{
CV_TRACE_REGION("forward");
//! [Set input blob]
net.setInput(inputBlob, "data"); //set the network input
//! [Set input blob]
t.start();
//! [Make forward pass]
prob = net.forward("prob"); //compute output
//! [Make forward pass]
t.stop();
}
//! [Gather output]
int classId;
double classProb;
getMaxClass(prob, &classId, &classProb);//find the best class
//! [Gather output]
//! [Print results]
std::vector<String> classNames = readClassNames(classNameFile.c_str());
std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
std::cout << "Probability: " << classProb * 100 << "%" << std::endl;
//! [Print results]
std::cout << "Time: " << (double)t.getTimeMilli() / t.getCounter() << " ms (average from " << t.getCounter() << " iterations)" << std::endl;
return 0;
} //main
#include <fstream>
#include <sstream>
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
const char* keys =
"{ help h | | Print help message. }"
"{ input i | | Path to input image or video file. Skip this argument to capture frames from a camera.}"
"{ model m | | Path to a binary file of model contains trained weights. "
"It could be a file with extensions .caffemodel (Caffe), "
".pb (TensorFlow), .t7 or .net (Torch), .weights (Darknet) }"
"{ config c | | Path to a text file of model contains network configuration. "
"It could be a file with extensions .prototxt (Caffe), .pbtxt (TensorFlow), .cfg (Darknet) }"
"{ framework f | | Optional name of an origin framework of the model. Detect it automatically if it does not set. }"
"{ classes | | Optional path to a text file with names of classes. }"
"{ mean | | Preprocess input image by subtracting mean values. Mean values should be in BGR order and delimited by spaces. }"
"{ scale | 1 | Preprocess input image by multiplying on a scale factor. }"
"{ width | | Preprocess input image by resizing to a specific width. }"
"{ height | | Preprocess input image by resizing to a specific height. }"
"{ rgb | | Indicate that model works with RGB input images instead BGR ones. }"
"{ backend | 0 | Choose one of computation backends: "
"0: default C++ backend, "
"1: Halide language (http://halide-lang.org/), "
"2: Intel's Deep Learning Inference Engine (https://software.seek.intel.com/deep-learning-deployment)}"
"{ target | 0 | Choose one of target computation devices: "
"0: CPU target (by default),"
"1: OpenCL }";
using namespace cv;
using namespace dnn;
std::vector<std::string> classes;
int main(int argc, char** argv)
{
CommandLineParser parser(argc, argv, keys);
parser.about("Use this script to run classification deep learning networks using OpenCV.");
if (argc == 1 || parser.has("help"))
{
parser.printMessage();
return 0;
}
float scale = parser.get<float>("scale");
Scalar mean = parser.get<Scalar>("mean");
bool swapRB = parser.get<bool>("rgb");
CV_Assert(parser.has("width"), parser.has("height"));
int inpWidth = parser.get<int>("width");
int inpHeight = parser.get<int>("height");
String model = parser.get<String>("model");
String config = parser.get<String>("config");
String framework = parser.get<String>("framework");
int backendId = parser.get<int>("backend");
int targetId = parser.get<int>("target");
// Open file with classes names.
if (parser.has("classes"))
{
std::string file = parser.get<String>("classes");
std::ifstream ifs(file.c_str());
if (!ifs.is_open())
CV_Error(Error::StsError, "File " + file + " not found");
std::string line;
while (std::getline(ifs, line))
{
classes.push_back(line);
}
}
CV_Assert(parser.has("model"));
//! [Read and initialize network]
Net net = readNet(model, config, framework);
net.setPreferableBackend(backendId);
net.setPreferableTarget(targetId);
//! [Read and initialize network]
// Create a window
static const std::string kWinName = "Deep learning image classification in OpenCV";
namedWindow(kWinName, WINDOW_NORMAL);
//! [Open a video file or an image file or a camera stream]
VideoCapture cap;
if (parser.has("input"))
cap.open(parser.get<String>("input"));
else
cap.open(0);
//! [Open a video file or an image file or a camera stream]
// Process frames.
Mat frame, blob;
while (waitKey(1) < 0)
{
cap >> frame;
if (frame.empty())
{
waitKey();
break;
}
//! [Create a 4D blob from a frame]
blobFromImage(frame, blob, scale, Size(inpWidth, inpHeight), mean, swapRB, false);
//! [Create a 4D blob from a frame]
//! [Set input blob]
net.setInput(blob);
//! [Set input blob]
//! [Make forward pass]
Mat prob = net.forward();
//! [Make forward pass]
//! [Get a class with a highest score]
Point classIdPoint;
double confidence;
minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.x;
//! [Get a class with a highest score]
// Put efficiency information.
std::vector<double> layersTimes;
double freq = getTickFrequency() / 1000;
double t = net.getPerfProfile(layersTimes) / freq;
std::string label = format("Inference time: %.2f ms", t);
putText(frame, label, Point(0, 15), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0));
// Print predicted class.
label = format("%s: %.4f", (classes.empty() ? format("Class #%d", classId).c_str() :
classes[classId].c_str()),
confidence);
putText(frame, label, Point(0, 40), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0));
imshow(kWinName, frame);
}
return 0;
}