Commit 7d9bbdca authored by Maksim Shabunin

Remove all sphinx files

parent 61f36de5
.. _Table-Of-Content-Bioinspired:
*bioinspired* module. Algorithms inspired from biological models
----------------------------------------------------------------
Here you will learn how to use the additional OpenCV functionality defined in the "bioinspired" module.
.. include:: ../../definitions/tocDefinitions.rst
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|RetinaDemoImg| **Title:** :ref:`Retina_Model`
*Compatibility:* > OpenCV 2.4
*Author:* |Author_AlexB|
You will learn how to process images and video streams with a model of retina filter for details enhancement, spatio-temporal noise removal, luminance correction and spatio-temporal events detection.
=============== ======================================================
.. |RetinaDemoImg| image:: images/retina_TreeHdr_small.jpg
:height: 90pt
:width: 90pt
.. raw:: latex
\pagebreak
.. toctree::
:hidden:
../retina_model/retina_model
.. _Table-Of-Content-CVV:
*cvv* module. GUI for Interactive Visual Debugging
--------------------------------------------------
Here you will learn how to use the cvv module to ease programming computer vision software through visual debugging aids.
.. include:: ../../definitions/tocDefinitions.rst
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|cvvIntro| **Title:** :ref:`Visual_Debugging_Introduction`
*Compatibility:* > OpenCV 2.4.8
*Author:* |Author_Bihlmaier|
We will learn how to debug our applications in a visual and interactive way.
=============== ======================================================
.. |cvvIntro| image:: images/Visual_Debugging_Introduction_Tutorial_Cover.jpg
:height: 90pt
:width: 90pt
.. raw:: latex
\pagebreak
.. toctree::
:hidden:
../visual_debugging_introduction/visual_debugging_introduction
.. ximgproc:
Structured forests for fast edge detection
******************************************
Introduction
------------
In this tutorial you will learn how to use structured forests for the purpose of edge detection in an image.
Examples
--------
.. image:: images/01.jpg
:height: 238pt
:width: 750pt
:alt: First example
:align: center
.. image:: images/02.jpg
:height: 238pt
:width: 750pt
:alt: Second example
:align: center
.. image:: images/03.jpg
:height: 238pt
:width: 750pt
:alt: Third example
:align: center
.. image:: images/04.jpg
:height: 238pt
:width: 750pt
:alt: Fourth example
:align: center
.. image:: images/05.jpg
:height: 238pt
:width: 750pt
:alt: Fifth example
:align: center
.. image:: images/06.jpg
:height: 238pt
:width: 750pt
:alt: Sixth example
:align: center
.. image:: images/07.jpg
:height: 238pt
:width: 750pt
:alt: Seventh example
:align: center
.. image:: images/08.jpg
:height: 238pt
:width: 750pt
:alt: Eighth example
:align: center
.. image:: images/09.jpg
:height: 238pt
:width: 750pt
:alt: Ninth example
:align: center
.. image:: images/10.jpg
:height: 238pt
:width: 750pt
:alt: Tenth example
:align: center
.. image:: images/11.jpg
:height: 238pt
:width: 750pt
:alt: Eleventh example
:align: center
.. image:: images/12.jpg
:height: 238pt
:width: 750pt
:alt: Twelfth example
:align: center
**Note:** binarization techniques like the Canny edge detector are applicable
to edges produced by both algorithms (``Sobel`` and ``StructuredEdgeDetection::detectEdges``).
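As an illustration (not part of the original sample), the floating-point edge map produced by ``detectEdges`` in the code below can be rescaled to 8 bits and then binarized, e.g. with a fixed threshold or with Canny; the threshold values here are arbitrary:

.. code-block:: cpp

   // assumes `edges` holds the CV_32F output of detectEdges, with values in [0;1]
   cv::Mat edges8u, binaryEdges;
   edges.convertTo(edges8u, CV_8U, 255);                            // rescale to [0;255]
   cv::threshold(edges8u, binaryEdges, 60, 255, cv::THRESH_BINARY); // simple binarization
   // alternatively: cv::Canny(edges8u, binaryEdges, 50, 150);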
Source Code
-----------
.. literalinclude:: ../../../../modules/ximgproc/samples/cpp/structured_edge_detection.cpp
:language: cpp
:linenos:
:tab-width: 4
Explanation
-----------
1. **Load source color image**
.. code-block:: cpp
cv::Mat image = cv::imread(inFilename, 1);
if ( image.empty() )
{
printf("Cannot read image file: %s\n", inFilename.c_str());
return -1;
}
2. **Convert source image to [0;1] range**
.. code-block:: cpp
image.convertTo(image, cv::DataType<float>::type, 1/255.0);
3. **Run main algorithm**
.. code-block:: cpp
cv::Mat edges(image.size(), image.type());
cv::Ptr<StructuredEdgeDetection> pDollar =
cv::createStructuredEdgeDetection(modelFilename);
pDollar->detectEdges(image, edges);
4. **Show results**
.. code-block:: cpp
if ( outFilename == "" )
{
cv::namedWindow("edges", 1);
cv::imshow("edges", edges);
cv::waitKey(0);
}
else
cv::imwrite(outFilename, 255*edges);
Literature
----------
For more information, refer to the following papers :
.. [Dollar2013] Dollar P., Zitnick C. L., "Structured forests for fast edge detection",
IEEE International Conference on Computer Vision (ICCV), 2013,
pp. 1841-1848. `DOI <http://dx.doi.org/10.1109/ICCV.2013.231>`_
.. [Lim2013] Lim J. J., Zitnick C. L., Dollar P., "Sketch Tokens: A Learned
Mid-level Representation for Contour and Object Detection",
Comoputer Vision and Pattern Recognition (CVPR), 2013,
pp. 3158-3165. `DOI <http://dx.doi.org/10.1109/CVPR.2013.406>`_
.. ximgproc:
Structured forest training
**************************
Introduction
------------
In this tutorial we show how to train your own structured forest using the author's original Matlab implementation.
Training pipeline
-----------------
1. Download "Piotr's Toolbox" from `link <http://vision.ucsd.edu/~pdollar/toolbox/doc/index.html>`_
and put it into separate directory, e.g. PToolbox
2. Download BSDS500 dataset from `link <http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/BSR/>`
and put it into separate directory named exactly BSR
3. Add both directory and their subdirectories to Matlab path.
4. Download detector code from `link <http://research.microsoft.com/en-us/downloads/389109f6-b4e8-404c-84bf-239f7cbf4e3d/>`
and put it into root directory. Now you should have ::
.
BSR
PToolbox
models
private
Contents.m
edgesChns.m
edgesDemo.m
edgesDemoRgbd.m
edgesDetect.m
edgesEval.m
edgesEvalDir.m
edgesEvalImg.m
edgesEvalPlot.m
edgesSweeps.m
edgesTrain.m
license.txt
readme.txt
5. Rename models/forest/modelFinal.mat to models/forest/modelFinal.mat.backup
6. Open edgesChns.m and comment out lines 26--41. Add the following after the commented lines: ::
shrink=opts.shrink;
chns = single(getFeatures( im2double(I) ));
7. Now it is time to compile the promised getFeatures. I do it with the following code:
.. code-block:: cpp
#include <cv.h>
#include <highgui.h>
#include <mat.h>
#include <mex.h>
#include "MxArray.hpp" // https://github.com/kyamagu/mexopencv
class NewRFFeatureGetter : public cv::RFFeatureGetter
{
public:
NewRFFeatureGetter() : name("NewRFFeatureGetter"){}
virtual void getFeatures(const cv::Mat &src, NChannelsMat &features,
const int gnrmRad, const int gsmthRad,
const int shrink, const int outNum, const int gradNum) const
{
// here your feature extraction code, the default one is:
// resulting features Mat should be n-channels, floating point matrix
}
protected:
cv::String name;
};
MEXFUNCTION_LINKAGE void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
if (nlhs != 1) mexErrMsgTxt("nlhs != 1");
if (nrhs != 1) mexErrMsgTxt("nrhs != 1");
cv::Mat src = MxArray(prhs[0]).toMat();
src.convertTo(src, cv::DataType<float>::type);
// the single input argument is the image; construct the custom feature getter directly
cv::Ptr<NewRFFeatureGetter> pDollar = cv::makePtr<NewRFFeatureGetter>();
cv::Mat edges;
pDollar->getFeatures(src, edges, 4, 0, 2, 13, 4);
// you can use other numbers here
edges.convertTo(edges, cv::DataType<double>::type);
plhs[0] = MxArray(edges);
}
8. Place the compiled mex file into the root directory and run edgesDemo.
You will need to wait a couple of hours; after that the new model
will appear inside models/forest/.
9. The final step is converting the trained model from Matlab binary format
to YAML, which you can use with our cv::StructuredEdgeDetection.
For this purpose run opencv_contrib/doc/tutorials/ximgproc/training/modelConvert(model, "model.yml")
How to use your model
---------------------
Just use the expanded constructor with the above-defined class NewRFFeatureGetter:
.. code-block:: cpp
cv::Ptr<cv::StructuredEdgeDetection> pDollar =
cv::createStructuredEdgeDetection( modelName, makePtr<NewRFFeatureGetter>() );
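Detection then proceeds exactly as in the previous tutorial; a brief fragment for completeness (``inFilename`` is an assumed input path):

.. code-block:: cpp

   cv::Mat image = cv::imread(inFilename, 1);
   image.convertTo(image, cv::DataType<float>::type, 1/255.0);   // [0;1] range, as before
   cv::Mat edges(image.size(), image.type());
   pDollar->detectEdges(image, edges);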
********************************************************************
bioinspired. Biologically inspired vision models and derived tools
********************************************************************
The module provides biological visual system models (the human visual system and others). It also provides derived objects that take advantage of those bio-inspired models.
.. toctree::
:maxdepth: 2
Human retina documentation <retina>
Custom Calibration Pattern
==========================
.. highlight:: cpp
CustomPattern
-------------
A custom pattern class that can be used to calibrate a camera and to further track the translation and rotation of the pattern. By default, it uses an ``ORB`` feature detector and a ``BruteForce-Hamming(2)`` descriptor matcher to find the location of the pattern feature points that will subsequently be used for calibration.
.. ocv:class:: CustomPattern : public Algorithm
CustomPattern::CustomPattern
----------------------------
CustomPattern constructor.
.. ocv:function:: CustomPattern()
CustomPattern::create
---------------------
A method that initializes the class and generates the necessary detectors, extractors and matchers.
.. ocv:function:: bool create(InputArray pattern, const Size2f boardSize, OutputArray output = noArray())
:param pattern: The image, which will be used as a pattern. If the desired pattern is part of a bigger image, you can crop it out using image(roi).
:param boardSize: The size of the pattern in physical dimensions. These will be used to scale the points when the calibration occurs.
:param output: A matrix that is the same as the input pattern image, but has all the feature points drawn on it.
:return Returns whether the initialization was successful or not. A possible reason for failure is that no feature points were detected.
.. seealso::
:ocv:func:`getFeatureDetector`,
:ocv:func:`getDescriptorExtractor`,
:ocv:func:`getDescriptorMatcher`
.. note::
* The number of detected feature points can be determined through the :ocv:func:`getPatternPoints` method.
* The feature detector, extractor and matcher cannot be changed after initialization.
CustomPattern::findPattern
--------------------------
Finds the pattern in the input image
.. ocv:function:: bool findPattern(InputArray image, OutputArray matched_features, OutputArray pattern_points, const double ratio = 0.7, const double proj_error = 8.0, const bool refine_position = false, OutputArray out = noArray(), OutputArray H = noArray(), OutputArray pattern_corners = noArray());
:param image: The input image where the pattern is searched for.
:param matched_features: A ``vector<Point2f>`` of the projections of calibration pattern points, matched in the image. The points correspond to the ``pattern_points``. ``matched_features`` and ``pattern_points`` have the same size.
:param pattern_points: A ``vector<Point3f>`` of calibration pattern points in the calibration pattern coordinate space.
:param ratio: A ratio used to threshold matches based on D. Lowe's point ratio test.
:param proj_error: The maximum projection error that is allowed when the found points are back projected. A lower projection error will be beneficial for eliminating mismatches. Higher values are recommended when the camera lens has greater distortions.
:param refine_position: Whether to refine the position of the feature points with :ocv:func:`cornerSubPix`.
:param out: An image showing the matched feature points and a contour around the estimated pattern.
:param H: The homography transformation matrix between the pattern and the current image.
:param pattern_corners: A ``vector<Point2f>`` containing the 4 corners of the found pattern.
:return The method returns whether the pattern was found or not.
CustomPattern::isInitialized
----------------------------
.. ocv:function:: bool isInitialized()
:return If the class is initialized or not.
CustomPattern::getPatternPoints
-------------------------------
.. ocv:function:: void getPatternPoints(OutputArray original_points)
:param original_points: Fills the vector with the points found in the pattern.
CustomPattern::getPixelSize
---------------------------
.. ocv:function:: double getPixelSize()
:return Get the physical pixel size as initialized by the pattern.
CustomPattern::setFeatureDetector
---------------------------------
.. ocv:function:: bool setFeatureDetector(Ptr<FeatureDetector> featureDetector)
:param featureDetector: Set a new FeatureDetector.
:return Whether it was set successfully. Fails if the object is already initialized by :ocv:func:`create`.
.. note::
* It is left to the user's discretion to select a matching feature detector, extractor and matcher. Please consult the documentation of each to confirm coherence.
CustomPattern::setDescriptorExtractor
-------------------------------------
.. ocv:function:: bool setDescriptorExtractor(Ptr<DescriptorExtractor> extractor)
:param extractor: Set a new DescriptorExtractor.
:return Whether it was set successfully. Fails if the object is already initialized by :ocv:func:`create`.
CustomPattern::setDescriptorMatcher
-----------------------------------
.. ocv:function:: bool setDescriptorMatcher(Ptr<DescriptorMatcher> matcher)
:param matcher: Set a new DescriptorMatcher.
:return Whether it was set successfully. Fails if the object is already initialized by :ocv:func:`create`.
CustomPattern::getFeatureDetector
---------------------------------
.. ocv:function:: Ptr<FeatureDetector> getFeatureDetector()
:return The used FeatureDetector.
CustomPattern::getDescriptorExtractor
-------------------------------------
.. ocv:function:: Ptr<DescriptorExtractor> getDescriptorExtractor()
:return The used DescriptorExtractor.
CustomPattern::getDescriptorMatcher
-----------------------------------
.. ocv:function:: Ptr<DescriptorMatcher> getDescriptorMatcher()
:return The used DescriptorMatcher.
CustomPattern::calibrate
------------------------
Calibrates the camera.
.. ocv:function:: double calibrate(InputArrayOfArrays objectPoints, InputArrayOfArrays imagePoints, Size imageSize, InputOutputArray cameraMatrix, InputOutputArray distCoeffs, OutputArrayOfArrays rvecs, OutputArrayOfArrays tvecs, int flags = 0, TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON))
See :ocv:func:`calibrateCamera` for parameter information.
CustomPattern::findRt
---------------------
Finds the rotation and translation vectors of the pattern.
.. ocv:function:: bool findRt(InputArray objectPoints, InputArray imagePoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int flags = ITERATIVE)
.. ocv:function:: bool findRt(InputArray image, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int flags = ITERATIVE)
:param image: The image in which the rotation and translation of the pattern will be found.
See :ocv:func:`solvePnP` for parameter information.
CustomPattern::findRtRANSAC
---------------------------
Finds the rotation and translation vectors of the pattern using RANSAC.
.. ocv:function:: bool findRtRANSAC(InputArray objectPoints, InputArray imagePoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int iterationsCount = 100, float reprojectionError = 8.0, int minInliersCount = 100, OutputArray inliers = noArray(), int flags = ITERATIVE)
.. ocv:function:: bool findRtRANSAC(InputArray image, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int iterationsCount = 100, float reprojectionError = 8.0, int minInliersCount = 100, OutputArray inliers = noArray(), int flags = ITERATIVE)
:param image: The image in which the rotation and translation of the pattern will be found.
See :ocv:func:`solvePnPRANSAC` for parameter information.
CustomPattern::drawOrientation
------------------------------
Draws the ``(x,y,z)`` axis on the image, in the center of the pattern, showing the orientation of the pattern.
.. ocv:function:: void drawOrientation(InputOutputArray image, InputArray tvec, InputArray rvec, InputArray cameraMatrix, InputArray distCoeffs, double axis_length = 3, int axis_width = 2)
:param image: The image, based on which the rotation and translation was calculated. The axes will be drawn in color: ``x`` in red, ``y`` in green, ``z`` in blue.
:param tvec: Translation vector.
:param rvec: Rotation vector.
:param cameraMatrix: The camera matrix.
:param distCoeffs: The distortion coefficients.
:param axis_length: The length of the axis symbol.
:param axis_width: The width of the axis symbol.
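To tie the methods above together, here is a minimal usage sketch. The ``ccalib`` header, the namespace, the file names and the board size are assumptions and may need adjusting for your build; in practice you would accumulate points from many views before calling ``calibrate``.

.. code-block:: cpp

   #include <opencv2/core.hpp>
   #include <opencv2/highgui.hpp>
   #include <opencv2/ccalib.hpp>   // assumed header of the ccalib module

   #include <cstdio>
   #include <vector>

   using namespace cv;

   int main()
   {
       ccalib::CustomPattern pattern;                        // assumed namespace
       Mat patternImg = imread("pattern.png");               // assumed pattern image
       pattern.create(patternImg, Size2f(21.0f, 29.7f));     // physical size of the pattern

       std::vector<Mat> imagePoints, objectPoints;
       Mat frame = imread("view0.png");                      // assumed calibration view
       Mat matched, points;
       if (pattern.findPattern(frame, matched, points))
       {
           imagePoints.push_back(matched);
           objectPoints.push_back(points);
       }

       Mat cameraMatrix, distCoeffs;
       std::vector<Mat> rvecs, tvecs;
       double rms = pattern.calibrate(objectPoints, imagePoints, frame.size(),
                                      cameraMatrix, distCoeffs, rvecs, tvecs);
       printf("RMS reprojection error: %f\n", rms);
       return 0;
   }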
*********************************************************************
cvv. GUI for Interactive Visual Debugging of Computer Vision Programs
*********************************************************************
The module provides an interactive GUI to debug and incrementally design computer vision algorithms. The debug statements can remain in the code after development and aid in further changes because they have negligible overhead if the program is compiled in release mode.
.. toctree::
:maxdepth: 2
CVV API Documentation <cvv_api>
CVV GUI Documentation <cvv_gui>
CVV : the API
*************
.. highlight:: cpp
Introduction
++++++++++++
The namespace for all functions is **cvv**, e.g. *cvv::showImage()*.
Compilation:
* For development, i.e. for cvv GUI to show up, compile your code using cvv with *g++ -DCVVISUAL_DEBUGMODE*.
* For release, i.e. cvv calls doing nothing, compile your code without the above flag.
See cvv tutorial for a commented example application using cvv.
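A minimal, commented sketch of this workflow (the cvv header layout below is an assumption and may differ between versions; build with *-DCVVISUAL_DEBUGMODE* to actually see the GUI):

.. code-block:: cpp

   #include <opencv2/core.hpp>
   #include <opencv2/imgproc.hpp>
   #include <opencv2/highgui.hpp>

   #include <opencv2/cvv/show_image.hpp>   // assumed header layout of the cvv module
   #include <opencv2/cvv/filter.hpp>
   #include <opencv2/cvv/final_show.hpp>

   int main()
   {
       cv::Mat input = cv::imread("input.png");                      // assumed input image
       cvv::showImage(input, CVVISUAL_LOCATION, "original frame");   // single-image view

       cv::Mat blurred;
       cv::GaussianBlur(input, blurred, cv::Size(9, 9), 2.0);
       cvv::debugFilter(input, blurred, CVVISUAL_LOCATION, "Gaussian blur"); // compare input/output

       cvv::finalShow();   // must be called once, after all cvv calls
       return 0;
   }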
API Functions
+++++++++++++
showImage
---------
Add a single image to the debug GUI (similar to :imshow:`imshow <>`).
.. ocv:function:: void showImage(InputArray img, const CallMetaData& metaData, const string& description, const string& view)
:param img: Image to show in the debug GUI.
:param metaData: Properly initialized CallMetaData struct, i.e. information about file, line and function name for the GUI. Use the CVVISUAL_LOCATION macro.
:param description: Human readable description to provide context to the image.
:param view: Preselect the view that will be used to visualize this image in the GUI. Other views can still be selected in the GUI later on.
debugFilter
-----------
Add two images to debug GUI for comparison. Usually the input and output of some filter operation, whose result should be inspected.
.. ocv:function:: void debugFilter(InputArray original, InputArray result, const CallMetaData& metaData, const string& description, const string& view)
:param original: First image for comparison, e.g. filter input.
:param result: Second image for comparison, e.g. filter output.
:param metaData: See :ocv:func:`showImage`
:param description: See :ocv:func:`showImage`
:param view: See :ocv:func:`showImage`
debugDMatch
-----------
Add a filled-in :basicstructures:`DMatch <dmatch>` to the debug GUI. The matches are visualized for interactive inspection in different GUI views (one is similar to an interactive :draw_matches:`drawMatches<>`).
.. ocv:function:: void debugDMatch(InputArray img1, std::vector<cv::KeyPoint> keypoints1, InputArray img2, std::vector<cv::KeyPoint> keypoints2, std::vector<cv::DMatch> matches, const CallMetaData& metaData, const string& description, const string& view, bool useTrainDescriptor)
:param img1: First image used in :basicstructures:`DMatch <dmatch>`.
:param keypoints1: Keypoints of first image.
:param img2: Second image used in DMatch.
:param keypoints2: Keypoints of second image.
:param metaData: See :ocv:func:`showImage`
:param description: See :ocv:func:`showImage`
:param view: See :ocv:func:`showImage`
:param useTrainDescriptor: Use :basicstructures:`DMatch <dmatch>`'s train descriptor index instead of query descriptor index.
finalShow
---------
This function **must** be called *once*, *after* all cvv calls (if any).
As an alternative, create an instance of FinalShowCaller, which calls finalShow() in its destructor (RAII-style).
.. ocv:function:: void finalShow()
setDebugFlag
------------
Enable or disable cvv for the current translation unit and thread (disabling it this way has higher, but still low, overhead compared to using the compile flags).
.. ocv:function:: void setDebugFlag(bool active)
:param active: See above
CVV : the GUI
*************
.. highlight:: cpp
Introduction
++++++++++++
For now: See cvv tutorial.
Overview
++++++++
Filter
------
Views
++++++++
HMDB: A Large Human Motion Database
===================================
.. ocv:class:: AR_hmdb
Implements loading dataset:
_`"HMDB: A Large Human Motion Database"`: http://serre-lab.clps.brown.edu/resource/hmdb-a-large-human-motion-database/
.. note:: Usage
1. From link above download dataset files: hmdb51_org.rar & test_train_splits.rar.
2. Unpack them. Unpack all archives from directory: hmdb51_org/ and remove them.
3. To load data run: ./opencv/build/bin/example_datasets_ar_hmdb -p=/home/user/path_to_unpacked_folders/
Benchmark
"""""""""
A benchmark was implemented for this dataset, with accuracy 0.107407 (using precomputed HOG/HOF "STIP" features from the site, averaged over the 3 splits).
To run this benchmark execute:
.. code-block:: bash
./opencv/build/bin/example_datasets_ar_hmdb_benchmark -p=/home/user/path_to_unpacked_folders/
(precomputed features should be unpacked in the same folder: /home/user/path_to_unpacked_folders/hmdb51_org_stips/. Also unpack all archives from directory: hmdb51_org_stips/ and remove them.)
**References:**
.. [Kuehne11] H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre. HMDB: A Large Video Database for Human Motion Recognition. ICCV, 2011
.. [Laptev08] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning Realistic Human Actions From Movies. CVPR, 2008
Sports-1M Dataset
=================
.. ocv:class:: AR_sports
Implements loading dataset:
_`"Sports-1M Dataset"`: http://cs.stanford.edu/people/karpathy/deepvideo/
.. note:: Usage
1. From link above download dataset files (git clone https://code.google.com/p/sports-1m-dataset/).
2. To load data run: ./opencv/build/bin/example_datasets_ar_sports -p=/home/user/path_to_downloaded_folders/
**References:**
.. [KarpathyCVPR14] Andrej Karpathy and George Toderici and Sanketh Shetty and Thomas Leung and Rahul Sukthankar and Li Fei-Fei. Large-scale Video Classification with Convolutional Neural Networks. CVPR, 2014
*******************************************************
datasets. Framework for working with different datasets
*******************************************************
.. highlight:: cpp
The datasets module includes classes for working with different datasets: loading data, evaluating different algorithms on them, running benchmarks, etc.
It is planned to have:
* basic: loading code for all datasets, to help start working with them.
* next stage: quick benchmarks for all datasets to show how to solve them using OpenCV and to implement evaluation code.
* finally: implement, on top of OpenCV, state-of-the-art algorithms which solve these tasks.
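All dataset classes follow the same create()/load() pattern; a minimal sketch for AR_hmdb is shown below (the accessors ``getNumSplits`` and ``getTrain`` are assumed from the bundled samples, so check the header of the class you use):

.. code-block:: cpp

   #include <opencv2/datasets/ar_hmdb.hpp>   // each dataset class has its own header

   #include <cstdio>
   #include <string>

   using namespace cv;
   using namespace cv::datasets;

   int main()
   {
       // path to the unpacked dataset, as described in the per-dataset usage notes
       std::string path = "/home/user/path_to_unpacked_folders/";

       Ptr<AR_hmdb> dataset = AR_hmdb::create();
       dataset->load(path);

       printf("splits: %d\n", dataset->getNumSplits());
       printf("train samples in split 0: %d\n", (int)dataset->getTrain(0).size());
       return 0;
   }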
.. toctree::
:hidden:
ar_hmdb
ar_sports
fr_adience
fr_lfw
gr_chalearn
gr_skig
hpe_humaneva
hpe_parse
ir_affine
ir_robot
is_bsds
is_weizmann
msm_epfl
msm_middlebury
or_imagenet
or_mnist
or_sun
pd_caltech
slam_kitti
slam_tumindoor
tr_chars
tr_svt
Action Recognition
------------------
:doc:`ar_hmdb` [#f1]_
:doc:`ar_sports`
Face Recognition
----------------
:doc:`fr_adience`
:doc:`fr_lfw` [#f1]_
Gesture Recognition
-------------------
:doc:`gr_chalearn`
:doc:`gr_skig`
Human Pose Estimation
---------------------
:doc:`hpe_humaneva`
:doc:`hpe_parse`
Image Registration
------------------
:doc:`ir_affine`
:doc:`ir_robot`
Image Segmentation
------------------
:doc:`is_bsds`
:doc:`is_weizmann`
Multiview Stereo Matching
-------------------------
:doc:`msm_epfl`
:doc:`msm_middlebury`
Object Recognition
------------------
:doc:`or_imagenet`
:doc:`or_mnist` [#f2]_
:doc:`or_sun`
Pedestrian Detection
--------------------
:doc:`pd_caltech` [#f2]_
SLAM
----
:doc:`slam_kitti`
:doc:`slam_tumindoor`
Text Recognition
----------------
:doc:`tr_chars`
:doc:`tr_svt` [#f1]_
*Footnotes*
.. [#f1] Benchmark implemented
.. [#f2] Not used in Vision Challenge
Adience
=======
.. ocv:class:: FR_adience
Implements loading dataset:
_`"Adience"`: http://www.openu.ac.il/home/hassner/Adience/data.html
.. note:: Usage
1. From link above download any dataset file: faces.tar.gz\\aligned.tar.gz and files with splits: fold_0_data.txt-fold_4_data.txt, fold_frontal_0_data.txt-fold_frontal_4_data.txt. (For the face recognition task, different splits should be created.)
2. Unpack dataset file to some folder and place split files into the same folder.
3. To load data run: ./opencv/build/bin/example_datasets_fr_adience -p=/home/user/path_to_created_folder/
**References:**
.. [Eidinger] E. Eidinger, R. Enbar, and T. Hassner. Age and Gender Estimation of Unfiltered Faces
Labeled Faces in the Wild
=========================
.. ocv:class:: FR_lfw
Implements loading dataset:
_`"Labeled Faces in the Wild"`: http://vis-www.cs.umass.edu/lfw/
.. note:: Usage
1. From link above download any dataset file: lfw.tgz\\lfwa.tar.gz\\lfw-deepfunneled.tgz\\lfw-funneled.tgz and files with pairs: 10 test splits: pairs.txt and developer train split: pairsDevTrain.txt.
2. Unpack dataset file and place pairs.txt and pairsDevTrain.txt in created folder.
3. To load data run: ./opencv/build/bin/example_datasets_fr_lfw -p=/home/user/path_to_unpacked_folder/lfw2/
Benchmark
"""""""""
A benchmark was implemented for this dataset, with accuracy 0.623833 +- 0.005223 (train split: pairsDevTrain.txt, dataset: lfwa).
To run this benchmark execute:
.. code-block:: bash
./opencv/build/bin/example_datasets_fr_lfw_benchmark -p=/home/user/path_to_unpacked_folder/lfw2/
**References:**
.. [Huang07] G.B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. 2007
ChaLearn Looking at People
==========================
.. ocv:class:: GR_chalearn
Implements loading dataset:
_`"ChaLearn Looking at People"`: http://gesture.chalearn.org/
.. note:: Usage
1. Follow the instructions from the site above and download the files for dataset "Track 3: Gesture Recognition": Train1.zip-Train5.zip, Validation1.zip-Validation3.zip (register on the site www.codalab.org and accept the terms and conditions of the competition: https://www.codalab.org/competitions/991#learn_the_details. There are three mirrors for downloading the dataset files; when I downloaded the data, only the "Universitat Oberta de Catalunya" mirror worked).
2. Unpack train archives Train1.zip-Train5.zip to folder Train/, validation archives Validation1.zip-Validation3.zip to folder Validation/
3. Unpack all archives in Train/ & Validation/ in the folders with the same names, for example: Sample0001.zip to Sample0001/
4. To load data run: ./opencv/build/bin/example_datasets_gr_chalearn -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Escalera14] S. Escalera, X. Baró, J. Gonzàlez, M.A. Bautista, M. Madadi, M. Reyes, V. Ponce-López, H.J. Escalante, J. Shotton, I. Guyon, "ChaLearn Looking at People Challenge 2014: Dataset and Results", ECCV Workshops, 2014
Sheffield Kinect Gesture Dataset
================================
.. ocv:class:: GR_skig
Implements loading dataset:
_`"Sheffield Kinect Gesture Dataset"`: http://lshao.staff.shef.ac.uk/data/SheffieldKinectGesture.htm
.. note:: Usage
1. From link above download dataset files: subject1_dep.7z-subject6_dep.7z, subject1_rgb.7z-subject6_rgb.7z.
2. Unpack them.
3. To load data run: ./opencv/build/bin/example_datasets_gr_skig -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Liu13] L. Liu and L. Shao, “Learning Discriminative Representations from RGB-D Video Data”, In Proc. International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013.
HumanEva Dataset
================
.. ocv:class:: HPE_humaneva
Implements loading dataset:
_`"HumanEva Dataset"`: http://humaneva.is.tue.mpg.de
.. note:: Usage
1. From link above download dataset files for HumanEva-I (tar) & HumanEva-II.
2. Unpack them to HumanEva_1 & HumanEva_2 accordingly.
3. To load data run: ./opencv/build/bin/example_datasets_hpe_humaneva -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Sigal10] L. Sigal, A. Balan and M. J. Black. HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion, In International Journal of Computer Vision, Vol. 87 (1-2), 2010
.. [Sigal06] L. Sigal and M. J. Black. HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion, Technical Report CS-06-08, Brown University, 2006
PARSE Dataset
=============
.. ocv:class:: HPE_parse
Implements loading dataset:
_`"PARSE Dataset"`: http://www.ics.uci.edu/~dramanan/papers/parse/
.. note:: Usage
1. From link above download dataset file: people.zip.
2. Unpack it.
3. To load data run: ./opencv/build/bin/example_datasets_hpe_parse -p=/home/user/path_to_unpacked_folder/people_all/
**References:**
.. [Ramanan06] D. Ramanan "Learning to Parse Images of Articulated Bodies." Neural Info. Proc. Systems (NIPS) To appear. Dec 2006.
Affine Covariant Regions Datasets
=================================
.. ocv:class:: IR_affine
Implements loading dataset:
_`"Affine Covariant Regions Datasets"`: http://www.robots.ox.ac.uk/~vgg/data/data-aff.html
.. note:: Usage
1. From link above download dataset files: bark\\bikes\\boat\\graf\\leuven\\trees\\ubc\\wall.tar.gz.
2. Unpack them.
3. To load data, for example, for "bark", run: ./opencv/build/bin/example_datasets_ir_affine -p=/home/user/path_to_unpacked_folder/bark/
**References:**
.. [Mikolajczyk05] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, L. Van Gool. A Comparison of Affine Region Detectors. International Journal of Computer Vision, Volume 65, Number 1/2, page 43--72, 2005
Robot Data Set
==============
.. ocv:class:: IR_robot
Implements loading dataset:
_`"Robot Data Set, Point Feature Data Set – 2010"`: http://roboimagedata.compute.dtu.dk/?page_id=24
.. note:: Usage
1. From link above download dataset files: SET001_6.tar.gz-SET055_60.tar.gz
2. Unpack them to one folder.
3. To load data run: ./opencv/build/bin/example_datasets_ir_robot -p=/home/user/path_to_unpacked_folder/
**References:**
.. [aanæsinteresting] Aanæs, H. and Dahl, A.L. and Steenstrup Pedersen, K. Interesting Interest Points. International Journal of Computer Vision. 2012.
The Berkeley Segmentation Dataset and Benchmark
===============================================
.. ocv:class:: IS_bsds
Implements loading dataset:
_`"The Berkeley Segmentation Dataset and Benchmark"`: https://www.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
.. note:: Usage
1. From link above download dataset files: BSDS300-human.tgz & BSDS300-images.tgz.
2. Unpack them.
3. To load data run: ./opencv/build/bin/example_datasets_is_bsds -p=/home/user/path_to_unpacked_folder/BSDS300/
**References:**
.. [MartinFTM01] D. Martin and C. Fowlkes and D. Tal and J. Malik. A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics. 2001
Weizmann Segmentation Evaluation Database
=========================================
.. ocv:class:: IS_weizmann
Implements loading dataset:
_`"Weizmann Segmentation Evaluation Database"`: http://www.wisdom.weizmann.ac.il/~vision/Seg_Evaluation_DB/
.. note:: Usage
1. From link above download dataset files: Weizmann_Seg_DB_1obj.ZIP & Weizmann_Seg_DB_2obj.ZIP.
2. Unpack them.
3. To load data, for example, for 1 object dataset, run: ./opencv/build/bin/example_datasets_is_weizmann -p=/home/user/path_to_unpacked_folder/1obj/
**References:**
.. [AlpertGBB07] Sharon Alpert and Meirav Galun and Ronen Basri and Achi Brandt. Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration. 2007
EPFL Multi-View Stereo
======================
.. ocv:class:: MSM_epfl
Implements loading dataset:
_`"EPFL Multi-View Stereo"`: http://cvlabwww.epfl.ch/~strecha/multiview/denseMVS.html
.. note:: Usage
1. From link above download dataset files: castle_dense\\castle_dense_large\\castle_entry\\fountain\\herzjesu_dense\\herzjesu_dense_large_bounding\\cameras\\images\\p.tar.gz.
2. Unpack them in separate folder for each object. For example, for "fountain", in folder fountain/ : fountain_dense_bounding.tar.gz -> bounding/, fountain_dense_cameras.tar.gz -> camera/, fountain_dense_images.tar.gz -> png/, fountain_dense_p.tar.gz -> P/
3. To load data, for example, for "fountain", run: ./opencv/build/bin/example_datasets_msm_epfl -p=/home/user/path_to_unpacked_folder/fountain/
**References:**
.. [Strecha08] C. Strecha, W. von Hansen, L. Van Gool, P. Fua, U. Thoennessen. On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery. CVPR, 2008
Stereo – Middlebury Computer Vision
===================================
.. ocv:class:: MSM_middlebury
Implements loading dataset:
_`"Stereo – Middlebury Computer Vision"`: http://vision.middlebury.edu/mview/
.. note:: Usage
1. From link above download dataset files: dino\\dinoRing\\dinoSparseRing\\temple\\templeRing\\templeSparseRing.zip
2. Unpack them.
3. To load data, for example "temple" dataset, run: ./opencv/build/bin/example_datasets_msm_middlebury -p=/home/user/path_to_unpacked_folder/temple/
**References:**
.. [Seitz06] S. M. Seitz, B. Curless, J. Diebel, D. Scharstein, R. Szeliski. A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, CVPR, 2006
ImageNet
========
.. ocv:class:: OR_imagenet
Implements loading dataset:
_`"ImageNet"`: http://www.image-net.org/
.. note:: Usage
1. From link above download dataset files: ILSVRC2010_images_train.tar\\ILSVRC2010_images_test.tar\\ILSVRC2010_images_val.tar & devkit: ILSVRC2010_devkit-1.0.tar.gz (Implemented loading of 2010 dataset as only this dataset has ground truth for test data, but structure for ILSVRC2014 is similar)
2. Unpack them to: some_folder/train/\\some_folder/test/\\some_folder/val & some_folder/ILSVRC2010_validation_ground_truth.txt\\some_folder/ILSVRC2010_test_ground_truth.txt.
3. Create file with labels: some_folder/labels.txt, for example, using :ref:`python script <python-script>` below (each file's row format: synset,labelID,description. For example: "n07751451,18,plum").
4. Unpack all tar files in train.
5. To load data run: ./opencv/build/bin/example_datasets_or_imagenet -p=/home/user/some_folder/
.. _python-script:
Python script to parse meta.mat:
::
import scipy.io
meta_mat = scipy.io.loadmat("devkit-1.0/data/meta.mat")
labels_dic = dict((m[0][1][0], m[0][0][0][0]-1) for m in meta_mat['synsets'])
label_names_dic = dict((m[0][1][0], m[0][2][0]) for m in meta_mat['synsets'])
for label in labels_dic.keys():
print "{0},{1},{2}".format(label, labels_dic[label], label_names_dic[label])
**References:**
.. [ILSVRCarxiv14] Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. 2014
MNIST
=====
.. ocv:class:: OR_mnist
Implements loading dataset:
_`"MNIST"`: http://yann.lecun.com/exdb/mnist/
.. note:: Usage
1. From link above download dataset files: t10k-images-idx3-ubyte.gz, t10k-labels-idx1-ubyte.gz, train-images-idx3-ubyte.gz, train-labels-idx1-ubyte.gz.
2. Unpack them.
3. To load data run: ./opencv/build/bin/example_datasets_or_mnist -p=/home/user/path_to_unpacked_files/
**References:**
.. [LeCun98a] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
SUN Database
============
.. ocv:class:: OR_sun
Implements loading dataset:
_`"SUN Database, Scene Recognition Benchmark. SUN397"`: http://vision.cs.princeton.edu/projects/2010/SUN/
.. note:: Usage
1. From link above download dataset file: SUN397.tar & file with splits: Partitions.zip
2. Unpack SUN397.tar into folder: SUN397/ & Partitions.zip into folder: SUN397/Partitions/
3. To load data run: ./opencv/build/bin/example_datasets_or_sun -p=/home/user/path_to_unpacked_files/SUN397/
**References:**
.. [Xiao10] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN Database: Large-scale Scene Recognition from Abbey to Zoo. IEEE Conference on Computer Vision and Pattern Recognition. CVPR, 2010
.. [Xiao14] J. Xiao, K. A. Ehinger, J. Hays, A. Torralba, and A. Oliva. SUN Database: Exploring a Large Collection of Scene Categories. International Journal of Computer Vision. IJCV, 2014
Caltech Pedestrian Detection Benchmark
======================================
.. ocv:class:: PD_caltech
Implements loading dataset:
_`"Caltech Pedestrian Detection Benchmark"`: http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/
.. note:: First version of Caltech Pedestrian dataset loading.
The code to unpack all frames from the seq files is commented out, as their number is huge.
So currently only the meta information is loaded, without the data.
The ground truth isn't processed either, as it first needs to be converted from the mat files.
.. note:: Usage
1. From link above download dataset files: set00.tar-set10.tar.
2. Unpack them to separate folder.
3. To load data run: ./opencv/build/bin/example_datasets_pd_caltech -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Dollár12] P. Dollár, C. Wojek, B. Schiele and P. Perona. Pedestrian Detection: An Evaluation of the State of the Art. PAMI, 2012.
.. [DollárCVPR09] P. Dollár, C. Wojek, B. Schiele and P. Perona. Pedestrian Detection: A Benchmark. CVPR, 2009
KITTI Vision Benchmark
======================
.. ocv:class:: SLAM_kitti
Implements loading dataset:
_`"KITTI Vision Benchmark"`: http://www.cvlibs.net/datasets/kitti/eval_odometry.php
.. note:: Usage
1. From link above download "Odometry" dataset files: data_odometry_gray\\data_odometry_color\\data_odometry_velodyne\\data_odometry_poses\\data_odometry_calib.zip.
2. Unpack data_odometry_poses.zip, it creates folder dataset/poses/. After that unpack data_odometry_gray.zip, data_odometry_color.zip, data_odometry_velodyne.zip. Folder dataset/sequences/ will be created with folders 00/..21/. Each of these folders will contain: image_0/, image_1/, image_2/, image_3/, velodyne/ and files calib.txt & times.txt. These two last files will be replaced after unpacking data_odometry_calib.zip at the end.
3. To load data run: ./opencv/build/bin/example_datasets_slam_kitti -p=/home/user/path_to_unpacked_folder/dataset/
**References:**
.. [Geiger2012CVPR] Andreas Geiger and Philip Lenz and Raquel Urtasun. Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. CVPR, 2012
.. [Geiger2013IJRR] Andreas Geiger and Philip Lenz and Christoph Stiller and Raquel Urtasun. Vision meets Robotics: The KITTI Dataset. IJRR, 2013
.. [Fritsch2013ITSC] Jannik Fritsch and Tobias Kuehnl and Andreas Geiger. A New Performance Measure and Evaluation Benchmark for Road Detection Algorithms. ITSC, 2013
TUMindoor Dataset
=================
.. ocv:class:: SLAM_tumindoor
Implements loading dataset:
_`"TUMindoor Dataset"`: http://www.navvis.lmt.ei.tum.de/dataset/
.. note:: Usage
1. From link above download dataset files: dslr\\info\\ladybug\\pointcloud.tar.bz2 for each dataset: 11-11-28 (1st floor)\\11-12-13 (1st floor N1)\\11-12-17a (4th floor)\\11-12-17b (3rd floor)\\11-12-17c (Ground I)\\11-12-18a (Ground II)\\11-12-18b (2nd floor)
2. Unpack them in separate folder for each dataset. dslr.tar.bz2 -> dslr/, info.tar.bz2 -> info/, ladybug.tar.bz2 -> ladybug/, pointcloud.tar.bz2 -> pointcloud/.
3. To load each dataset run: ./opencv/build/bin/example_datasets_slam_tumindoor -p=/home/user/path_to_unpacked_folders/
**References:**
.. [TUMindoor] R. Huitl and G. Schroth and S. Hilsenbeck and F. Schweiger and E. Steinbach. {TUM}indoor: An Extensive Image and Point Cloud Dataset for Visual Indoor Localization and Mapping. 2012
The Chars74K Dataset
====================
.. ocv:class:: TR_chars
Implements loading dataset:
_`"The Chars74K Dataset"`: http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
.. note:: Usage
1. From link above download dataset files: EnglishFnt\\EnglishHnd\\EnglishImg\\KannadaHnd\\KannadaImg.tgz, ListsTXT.tgz.
2. Unpack them.
3. Move .m files from folder ListsTXT/ to appropriate folder. For example, English/list_English_Img.m for EnglishImg.tgz.
4. To load data, for example "EnglishImg", run: ./opencv/build/bin/example_datasets_tr_chars -p=/home/user/path_to_unpacked_folder/English/
**References:**
.. [Campos09] T. E. de Campos, B. R. Babu and M. Varma. Character recognition in natural images. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP), 2009
The Street View Text Dataset
============================
.. ocv:class:: TR_svt
Implements loading dataset:
_`"The Street View Text Dataset"`: http://vision.ucsd.edu/~kai/svt/
.. note:: Usage
1. From link above download dataset file: svt.zip.
2. Unpack it.
3. To load data run: ./opencv/build/bin/example_datasets_tr_svt -p=/home/user/path_to_unpacked_folder/svt/svt1/
Benchmark
"""""""""
A benchmark was implemented for this dataset, with accuracy (mean f1) 0.217.
To run benchmark execute:
.. code-block:: bash
./opencv/build/bin/example_datasets_tr_svt_benchmark -p=/home/user/path_to_unpacked_folders/svt/svt1/
**References:**
.. [Wang11] Kai Wang, Boris Babenko and Serge Belongie. End-to-end Scene Text Recognition. ICCV, 2011
.. [Wang10] Kai Wang and Serge Belongie. Word Spotting in the Wild. ECCV, 2010
***************************************
face. Face Recognition
***************************************
The module contains some recently added functionality that has not been stabilized, or functionality that is considered optional.
.. toctree::
:maxdepth: 2
FaceRecognizer Documentation <index>
Changelog
=========
Release 0.05
------------
This library is now included in the official OpenCV distribution (from 2.4 on).
The :ocv:class:`FaceRecognizer` is now an :ocv:class:`Algorithm`, which better fits into the overall
OpenCV API.
To reduce the confusion on the user side and minimize my work, libfacerec and OpenCV
have been synchronized and are now based on the same interfaces and implementation.
The library now has an extensive documentation:
* The API is explained in detail and with a lot of code examples.
* The face recognition guide I had written for Python and GNU Octave/MATLAB has been adapted to the new OpenCV C++ ``cv::FaceRecognizer``.
* A tutorial for gender classification with Fisherfaces.
* A tutorial for face recognition in videos (e.g. webcam).
Release highlights
++++++++++++++++++
* There are no single highlights to pick from, this release is a highlight itself.
Release 0.04
------------
This version is fully Windows-compatible and works with OpenCV 2.3.1. Several
bugfixes, but none influenced the recognition rate.
Release highlights
++++++++++++++++++
* A whole lot of exceptions with meaningful error messages.
* A tutorial for Windows users: `http://bytefish.de/blog/opencv_visual_studio_and_libfacerec <http://bytefish.de/blog/opencv_visual_studio_and_libfacerec>`_
Release 0.03
------------
Reworked the library to provide separate implementations in cpp files, because
it's the preferred way of contributing OpenCV libraries. This means the library
is not header-only anymore. Slight API changes were done, please see the
documentation for details.
Release highlights
++++++++++++++++++
* New Unit Tests (for LBP Histograms) make the library more robust.
* Added more documentation.
Release 0.02
------------
Reworked the library to provide separate implementations in cpp files, because
it's the preferred way of contributing OpenCV libraries. This means the library
is not header-only anymore. Slight API changes were done, please see the
documentation for details.
Release highlights
++++++++++++++++++
* New Unit Tests (for LBP Histograms) make the library more robust.
* Added a documentation and changelog in reStructuredText.
Release 0.01
------------
Initial release as header-only library.
Release highlights
++++++++++++++++++
* Colormaps for OpenCV to enhance the visualization.
* Face Recognition algorithms implemented:
* Eigenfaces [TP91]_
* Fisherfaces [BHK97]_
* Local Binary Patterns Histograms [AHP04]_
* Added persistence facilities to store the models with a common API.
* Unit Tests (using `gtest <http://code.google.com/p/googletest/>`_).
* Providing a CMakeLists.txt to enable easy cross-platform building.
FaceRecognizer - Face Recognition with OpenCV
##############################################
OpenCV 2.4 now comes with the very new :ocv:class:`FaceRecognizer` class for face recognition. This documentation is going to explain :doc:`the API <facerec_api>` to you in detail and it will give you a lot of help to get started (full source code examples). :doc:`Face Recognition with OpenCV <facerec_tutorial>` is the definitive guide to the new :ocv:class:`FaceRecognizer`. There's also a :doc:`tutorial on gender classification <tutorial/facerec_gender_classification>`, a :doc:`tutorial for face recognition in videos <tutorial/facerec_video_recognition>`, and it's shown :doc:`how to load & save your results <tutorial/facerec_save_load>`.
These documents are the help I wished for when I was working my way into face recognition. I hope you also think the new :ocv:class:`FaceRecognizer` is a useful addition to OpenCV.
Please issue any feature requests and/or bugs on the official OpenCV bug tracker at:
* http://code.opencv.org/projects/opencv/issues
Contents
========
.. toctree::
:maxdepth: 1
FaceRecognizer API <facerec_api>
Guide to Face Recognition with OpenCV <facerec_tutorial>
Tutorial on Gender Classification <tutorial/facerec_gender_classification>
Tutorial on Face Recognition in Videos <tutorial/facerec_video_recognition>
Tutorial On Saving & Loading a FaceRecognizer <tutorial/facerec_save_load>
Changelog <facerec_changelog>
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
Saving and Loading a FaceRecognizer
===================================
Introduction
------------
Saving and loading a :ocv:class:`FaceRecognizer` is very important. Training a FaceRecognizer can be a very time-intensive task, plus it's often impossible to ship the whole face database to the user of your product. The task of saving and loading a FaceRecognizer is easy with :ocv:class:`FaceRecognizer`. You only have to call :ocv:func:`FaceRecognizer::load` for loading and :ocv:func:`FaceRecognizer::save` for saving a :ocv:class:`FaceRecognizer`.
I'll adapt the Eigenfaces example from the :doc:`../facerec_tutorial`: Imagine we want to learn the Eigenfaces of the `AT&T Facedatabase <http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html>`_, store the model to a YAML file and then load it again.
From the loaded model, we'll get a prediction, show the mean, Eigenfaces and the image reconstruction.
Using FaceRecognizer::save and FaceRecognizer::load
-----------------------------------------------------
The source code for this demo application is also available in the ``src`` folder coming with this documentation:
* :download:`src/facerec_save_load.cpp <../src/facerec_save_load.cpp>`
.. literalinclude:: ../src/facerec_save_load.cpp
:language: cpp
:linenos:
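If you only need the essential calls, a minimal sketch looks like the following (the header and namespace depend on your OpenCV version; this assumes the OpenCV 2.4 contrib layout with ``createEigenFaceRecognizer``):

.. code-block:: cpp

   #include <opencv2/core/core.hpp>
   #include <opencv2/contrib/contrib.hpp>   // OpenCV 2.4; in opencv_contrib use the face module header instead

   #include <vector>

   using namespace cv;

   void trainAndSave(const std::vector<Mat>& images, const std::vector<int>& labels)
   {
       Ptr<FaceRecognizer> model = createEigenFaceRecognizer();
       model->train(images, labels);
       model->save("eigenfaces_at.yml");    // serializes the model state to YAML
   }

   int predictFromSaved(const Mat& testSample)
   {
       Ptr<FaceRecognizer> model = createEigenFaceRecognizer();
       model->load("eigenfaces_at.yml");    // restores the trained state
       return model->predict(testSample);
   }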
Results
-------
``eigenfaces_at.yml`` then contains the model state; we'll simply look at the first 10 lines with ``head eigenfaces_at.yml``:
.. code-block:: none
philipp@mango:~/github/libfacerec-build$ head eigenfaces_at.yml
%YAML:1.0
num_components: 399
mean: !!opencv-matrix
rows: 1
cols: 10304
dt: d
data: [ 8.5558897243107765e+01, 8.5511278195488714e+01,
8.5854636591478695e+01, 8.5796992481203006e+01,
8.5952380952380949e+01, 8.6162907268170414e+01,
8.6082706766917283e+01, 8.5776942355889716e+01,
And here is the Reconstruction, which is the same as the original:
.. image:: ../img/eigenface_reconstruction_opencv.png
:align: center
Face Recognition in Videos with OpenCV
=======================================
.. contents:: Table of Contents
:depth: 3
Introduction
------------
Whenever you hear the term *face recognition*, you instantly think of surveillance in videos. So performing face recognition in videos (e.g. webcam) is one of the most requested features I have got. I have heard your cries, so here it is. An application, that shows you how to do face recognition in videos! For the face detection part we'll use the awesome :ocv:class:`CascadeClassifier` and we'll use :ocv:class:`FaceRecognizer` for face recognition. This example uses the Fisherfaces method for face recognition, because it is robust against large changes in illumination.
Here is what the final application looks like. As you can see I am only writing the id of the recognized person above the detected face (by the way this id is Arnold Schwarzenegger for my data set):
.. image:: ../img/tutorial/facerec_video/facerec_video.png
:align: center
:scale: 70%
This demo is a basis for your research and it shows you how to implement face recognition in videos. You probably want to extend the application and make it more sophisticated: You could combine the id with the name, then show the confidence of the prediction, recognize the emotion, and so on. But before you send mails asking what this Haar-Cascade thing is or what a CSV is: make sure you have read the entire tutorial. It's all explained in here. If you just want to scroll down to the code, please note:
* The available Haar-Cascades for face detection are located in the ``data`` folder of your OpenCV installation! One of the available Haar-Cascades for face detection is for example ``/path/to/opencv/data/haarcascades/haarcascade_frontalface_default.xml``.
I encourage you to experiment with the application. Play around with the available :ocv:class:`FaceRecognizer` implementations, try the available cascades in OpenCV and see if you can improve your results!
Prerequisites
--------------
You want to do face recognition, so you need some face images to learn a :ocv:class:`FaceRecognizer` on. I have decided to reuse the images from the gender classification example: :doc:`facerec_gender_classification`.
I have the following celebrities in my training data set:
* Angelina Jolie
* Arnold Schwarzenegger
* Brad Pitt
* George Clooney
* Johnny Depp
* Justin Timberlake
* Katy Perry
* Keanu Reeves
* Patrick Stewart
* Tom Cruise
In the demo I have decided to read the images from a very simple CSV file. Why? Because it's the simplest platform-independent approach I can think of. However, if you know a simpler solution please ping me about it. Basically all the CSV file needs to contain are lines composed of a ``filename`` followed by a ``;`` followed by the ``label`` (as *integer number*), making up a line like this:
.. code-block:: none
/path/to/image.ext;0
Let's dissect the line. ``/path/to/image.ext`` is the path to an image, probably something like this if you are in Windows: ``C:/faces/person0/image0.jpg``. Then there is the separator ``;`` and finally we assign a label ``0`` to the image. Think of the label as the subject (the person, the gender or whatever comes to your mind). In the face recognition scenario, the label is the person this image belongs to. In the gender classification scenario, the label is the gender the person has. So my CSV file looks like this:
.. code-block:: none
/home/philipp/facerec/data/c/keanu_reeves/keanu_reeves_01.jpg;0
/home/philipp/facerec/data/c/keanu_reeves/keanu_reeves_02.jpg;0
/home/philipp/facerec/data/c/keanu_reeves/keanu_reeves_03.jpg;0
...
/home/philipp/facerec/data/c/katy_perry/katy_perry_01.jpg;1
/home/philipp/facerec/data/c/katy_perry/katy_perry_02.jpg;1
/home/philipp/facerec/data/c/katy_perry/katy_perry_03.jpg;1
...
/home/philipp/facerec/data/c/brad_pitt/brad_pitt_01.jpg;2
/home/philipp/facerec/data/c/brad_pitt/brad_pitt_02.jpg;2
/home/philipp/facerec/data/c/brad_pitt/brad_pitt_03.jpg;2
...
/home/philipp/facerec/data/c1/crop_arnold_schwarzenegger/crop_08.jpg;6
/home/philipp/facerec/data/c1/crop_arnold_schwarzenegger/crop_05.jpg;6
/home/philipp/facerec/data/c1/crop_arnold_schwarzenegger/crop_02.jpg;6
/home/philipp/facerec/data/c1/crop_arnold_schwarzenegger/crop_03.jpg;6
All images for this example were chosen to have a frontal face perspective. They have been cropped, scaled and rotated to be aligned at the eyes, just like this set of George Clooney images:
.. image:: ../img/tutorial/gender_classification/clooney_set.png
:align: center
Face Recognition from Videos
-----------------------------
The source code for the demo is available in the ``src`` folder coming with this documentation:
* :download:`src/facerec_video.cpp <../src/facerec_video.cpp>`
This demo uses the :ocv:class:`CascadeClassifier`:
.. literalinclude:: ../src/facerec_video.cpp
:language: cpp
:linenos:
Running the Demo
----------------
You'll need:
* The path to a valid Haar-Cascade for detecting a face with a :ocv:class:`CascadeClassifier`.
* The path to a valid CSV File for learning a :ocv:class:`FaceRecognizer`.
* A webcam and its device id (you don't know the device id? Simply start from 0 on and see what happens).
If you are in Windows, then simply start the demo by running (from command line):
.. code-block:: none
facerec_video.exe <C:/path/to/your/haar_cascade.xml> <C:/path/to/your/csv.ext> <video device>
If you are in Linux, then simply start the demo by running:
.. code-block:: none
./facerec_video </path/to/your/haar_cascade.xml> </path/to/your/csv.ext> <video device>
An example. If the haar-cascade is at ``C:/opencv/data/haarcascades/haarcascade_frontalface_default.xml``, the CSV file is at ``C:/facerec/data/celebrities.txt`` and I have a webcam with deviceId ``1``, then I would call the demo with:
.. code-block:: none
facerec_video.exe C:/opencv/data/haarcascades/haarcascade_frontalface_default.xml C:/facerec/data/celebrities.txt 1
That's it.
Results
-------
Enjoy!
Appendix
--------
Creating the CSV File
+++++++++++++++++++++
You don't really want to create the CSV file by hand. I have prepared you a little Python script ``create_csv.py`` (you find it at ``/src/create_csv.py`` coming with this tutorial) that automatically creates you a CSV file. If you have your images in a hierarchy like this (``/basepath/<subject>/<image.ext>``):
.. code-block:: none
philipp@mango:~/facerec/data/at$ tree
.
|-- s1
| |-- 1.pgm
| |-- ...
| |-- 10.pgm
|-- s2
| |-- 1.pgm
| |-- ...
| |-- 10.pgm
...
|-- s40
| |-- 1.pgm
| |-- ...
| |-- 10.pgm
Then simply call ``create_csv.py`` with the path to the folder, just like this, and you can save the output:
.. code-block:: none
philipp@mango:~/facerec/data$ python create_csv.py
at/s13/2.pgm;0
at/s13/7.pgm;0
at/s13/6.pgm;0
at/s13/9.pgm;0
at/s13/5.pgm;0
at/s13/3.pgm;0
at/s13/4.pgm;0
at/s13/10.pgm;0
at/s13/8.pgm;0
at/s13/1.pgm;0
at/s17/2.pgm;1
at/s17/7.pgm;1
at/s17/6.pgm;1
at/s17/9.pgm;1
at/s17/5.pgm;1
at/s17/3.pgm;1
[...]
Here is the script, if you can't find it:
.. literalinclude:: ../src/create_csv.py
:language: python
:linenos:
Aligning Face Images
++++++++++++++++++++
An accurate alignment of your image data is especially important in tasks like emotion detection, where you need as much detail as possible. Believe me... You don't want to do this by hand. So I've prepared you a tiny Python script. The code is really easy to use. To scale, rotate and crop the face image you just need to call *CropFace(image, eye_left, eye_right, offset_pct, dest_sz)*, where:
* *eye_left* is the position of the left eye
* *eye_right* is the position of the right eye
* *offset_pct* is the percent of the image you want to keep next to the eyes (horizontal, vertical direction)
* *dest_sz* is the size of the output image
If you are using the same *offset_pct* and *dest_sz* for your images, they are all aligned at the eyes.
.. literalinclude:: ../src/crop_face.py
:language: python
:linenos:
Imagine we are given `this photo of Arnold Schwarzenegger <http://en.wikipedia.org/wiki/File:Arnold_Schwarzenegger_edit%28ws%29.jpg>`_, which is under a Public Domain license. The (x,y)-position of the eyes is approximately *(252,364)* for the left and *(420,366)* for the right eye. Now you only need to define the horizontal offset, vertical offset and the size your scaled, rotated & cropped face should have.
Here are some examples:
+---------------------------------+----------------------------------------------------------------------------+
| Configuration | Cropped, Scaled, Rotated Face |
+=================================+============================================================================+
| 0.1 (10%), 0.1 (10%), (200,200) | .. image:: ../img/tutorial/gender_classification/arnie_10_10_200_200.jpg |
+---------------------------------+----------------------------------------------------------------------------+
| 0.2 (20%), 0.2 (20%), (200,200) | .. image:: ../img/tutorial/gender_classification/arnie_20_20_200_200.jpg |
+---------------------------------+----------------------------------------------------------------------------+
| 0.3 (30%), 0.3 (30%), (200,200) | .. image:: ../img/tutorial/gender_classification/arnie_30_30_200_200.jpg |
+---------------------------------+----------------------------------------------------------------------------+
| 0.2 (20%), 0.2 (20%), (70,70) | .. image:: ../img/tutorial/gender_classification/arnie_20_20_70_70.jpg |
+---------------------------------+----------------------------------------------------------------------------+
Latent SVM
===============================================================
Discriminatively Trained Part Based Models for Object Detection
---------------------------------------------------------------
The object detector described below has been initially proposed by
P.F. Felzenszwalb in [Felzenszwalb2010a]_. It is based on a
Dalal-Triggs detector that uses a single filter on histogram of
oriented gradients (HOG) features to represent an object category.
This detector uses a sliding window approach, where a filter is
applied at all positions and scales of an image. The first
innovation is enriching the Dalal-Triggs model using a
star-structured part-based model defined by a "root" filter
(analogous to the Dalal-Triggs filter) plus a set of part filters
and associated deformation models. The score of one of the star models
at a particular position and scale within an image is the score of
the root filter at the given location plus the sum, over parts, of the
maximum, over placements of that part, of the part filter score at
its location minus a deformation cost measuring the deviation of the
part from its ideal location relative to the root. Both root and
part filter scores are defined by the dot product between a filter
(a set of weights) and a subwindow of a feature pyramid computed
from the input image. Another improvement is a representation of the
class of models by a mixture of star models. The score of a mixture
model at a particular position and scale is the maximum over
components, of the score of that component model at the given
location.
The detector was dramatically sped up with the cascade algorithm
proposed by P.F. Felzenszwalb in [Felzenszwalb2010b]_. The algorithm
prunes partial hypotheses using thresholds on their scores. The basic
idea of the algorithm is to use a hierarchy of models defined by an
ordering of the original model's parts. For a model with (n+1)
parts, including the root, a sequence of (n+1) models is obtained.
The i-th model in this sequence is defined by the first i parts from
the original model. Using this hierarchy, low scoring hypotheses can be
pruned after looking at the best configuration of a subset of the parts.
Hypotheses that score high under a weak model are evaluated further using
a richer model.
OpenCV provides a C++ implementation of Latent SVM.
.. highlight:: cpp
LSVMDetector
-----------------
.. ocv:class:: LSVMDetector
This is a C++ abstract class; it provides the external user API for working with Latent SVM.
LSVMDetector::ObjectDetection
----------------------------------
.. ocv:struct:: LSVMDetector::ObjectDetection
This structure contains the detection information.
.. ocv:member:: Rect rect
bounding box for a detected object
.. ocv:member:: float score
confidence level
.. ocv:member:: int classID
class (model or detector) ID that detected the object
LSVMDetector::~LSVMDetector
-------------------------------------
Destructor.
.. ocv:function:: LSVMDetector::~LSVMDetector()
LSVMDetector::create
-----------------------
Load the trained models from the given ``.xml`` files and return a ``cv::Ptr<LSVMDetector>``.
.. ocv:function:: static cv::Ptr<LSVMDetector> LSVMDetector::create( const vector<string>& filenames, const vector<string>& classNames=vector<string>() )
:param filenames: A set of filenames storing the trained detectors (models). Each file contains one model. See examples of such files in ``/opencv_extra/testdata/cv/LSVMDetector/models_VOC2007/``.
:param classNames: A set of trained model names. If it is empty, the name of each model is constructed from the name of the file containing it, e.g. the model stored in "/home/user/cat.xml" will get the name "cat".
LSVMDetector::detect
-------------------------
Find rectangular regions in the given image that are likely to contain objects of the loaded classes (models), together with the corresponding confidence levels. A usage sketch is given below, after the parameter list.
.. ocv:function:: void LSVMDetector::detect( const Mat& image, vector<ObjectDetection>& objectDetections, float overlapThreshold=0.5f, int numThreads=-1 )
:param image: An image.
:param objectDetections: The detections: rectangles, scores and class IDs.
:param overlapThreshold: Threshold for the non-maximum suppression algorithm.
:param numThreads: Number of threads used in the parallel version of the algorithm.
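The following is a minimal usage sketch, not part of the original documentation. The header name and the file names (``cat.xml``, ``scene.jpg``) are assumptions for illustration only, and depending on your build the class may live in a nested namespace; only ``create``, ``detect``, ``getClassNames``, ``getClassCount`` and ``ObjectDetection`` come from the API described on this page.

.. code-block:: cpp

    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <opencv2/latentsvm.hpp> // assumed header name; adapt to your build of the module

    #include <iostream>
    #include <string>
    #include <vector>

    using namespace cv;

    int main()
    {
        // "cat.xml" is a placeholder for a trained model file
        std::vector<std::string> files;
        files.push_back("cat.xml");

        Ptr<LSVMDetector> detector = LSVMDetector::create(files);
        std::cout << detector->getClassCount() << " model(s) loaded" << std::endl;

        Mat image = imread("scene.jpg"); // placeholder input image
        std::vector<LSVMDetector::ObjectDetection> detections;
        detector->detect(image, detections, 0.5f /* overlapThreshold */);

        // draw and print every detection
        for (size_t i = 0; i < detections.size(); ++i)
        {
            const LSVMDetector::ObjectDetection& d = detections[i];
            rectangle(image, d.rect, Scalar(0, 255, 0), 2);
            std::cout << detector->getClassNames()[d.classID] << " score=" << d.score << std::endl;
        }
        imwrite("detections.jpg", image);
        return 0;
    }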
LSVMDetector::getClassNames
--------------------------------
Return the class (model) names that were passed to the constructor or to the ``load`` method, or that were extracted from the model filenames in those methods.
.. ocv:function:: const vector<string>& LSVMDetector::getClassNames() const
LSVMDetector::getClassCount
--------------------------------
Return the number of loaded models (classes).
.. ocv:function:: size_t LSVMDetector::getClassCount() const
.. [Felzenszwalb2010a] Felzenszwalb, P. F. and Girshick, R. B. and McAllester, D. and Ramanan, D. *Object Detection with Discriminatively Trained Part Based Models*. PAMI, vol. 32, no. 9, pp. 1627-1645, September 2010
.. [Felzenszwalb2010b] Felzenszwalb, P. F. and Girshick, R. B. and McAllester, D. *Cascade Object Detection with Deformable Part Models*. CVPR 2010, pp. 2241-2248
.. _LSDDetector:
Line Segments Detector
======================
Lines extraction methodology
----------------------------
The lines extraction methodology described in the following is mainly based on [EDLN]_.
The extraction starts with a Gaussian pyramid generated from the original image, downsampled N-1 times and blurred N times, to obtain N layers (one for each octave), with layer 0 corresponding to the input image. Then, from each layer (octave) of the pyramid, lines are extracted using the LSD algorithm.
Unlike the EDLine extractor used in the original article, LSD provides information only about a line's extremes; additional information such as the slope and the equation of the line is therefore computed analytically. The number of pixels is obtained using *LineIterator*. Extracted lines are returned in the form of KeyLine objects but, since extraction is based on a method different from the one used in the *BinaryDescriptor* class, the data associated to a line's extremes in the original image and in the octave it was extracted from coincide. KeyLine's *class_id* field is used as an index to indicate the order of extraction of a line inside a single octave.
LSDDetector::createLSDDetector
------------------------------
Creates an LSDDetector object, using smart pointers.
.. ocv:function:: Ptr<LSDDetector> LSDDetector::createLSDDetector()
LSDDetector::detect
-------------------
Detect lines inside an image.
.. ocv:function:: void LSDDetector::detect( const Mat& image, std::vector<KeyLine>& keylines, int scale, int numOctaves, const Mat& mask=Mat())
.. ocv:function:: void LSDDetector::detect( const std::vector<Mat>& images, std::vector<std::vector<KeyLine> >& keylines, int scale, int numOctaves, const std::vector<Mat>& masks=std::vector<Mat>() ) const
:param image: input image
:param images: input images
:param keylines: vector or set of vectors that will store extracted lines for one or more images
:param mask: mask matrix to detect only KeyLines of interest
:param masks: vector of mask matrices to detect only KeyLines of interest from each input image
:param scale: scale factor used in pyramids generation
:param numOctaves: number of octaves inside pyramid
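A minimal detection sketch follows; it is not part of the original documentation. The header and namespace names and the input file name are assumptions for illustration; the ``createLSDDetector`` and ``detect`` calls are the ones documented above.

.. code-block:: cpp

    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/line_descriptor.hpp> // assumed header name

    #include <vector>

    using namespace cv;
    using namespace cv::line_descriptor;   // assumed namespace

    int main()
    {
        Mat image = imread("building.jpg", 0); // placeholder input, loaded as grayscale

        // extract lines from a 2-octave pyramid built with a reduction factor of 2
        Ptr<LSDDetector> lsd = LSDDetector::createLSDDetector();
        std::vector<KeyLine> keylines;
        lsd->detect(image, keylines, 2 /* scale */, 2 /* numOctaves */, Mat());

        // keep only the lines found at the input resolution (octave 0)
        std::vector<KeyLine> octave0;
        for (size_t i = 0; i < keylines.size(); ++i)
            if (keylines[i].octave == 0)
                octave0.push_back(keylines[i]);
        return 0;
    }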
References
----------
.. [EDLN] Von Gioi, R. Grompone, et al. *LSD: A fast line segment detector with a false detection control*, IEEE Transactions on Pattern Analysis and Machine Intelligence 32.4 (2010): 722-732.
.. _binary_descriptor:
BinaryDescriptor Class
======================
.. highlight:: cpp
The BinaryDescriptor class implements both the detection of lines and the computation of their binary descriptors. Its interface is mainly based on those of classical detectors and extractors, such as Feature2d's `FeatureDetector <http://docs.opencv.org/modules/features2d/doc/common_interfaces_of_feature_detectors.html?highlight=featuredetector#featuredetector>`_ and `DescriptorExtractor <http://docs.opencv.org/modules/features2d/doc/common_interfaces_of_descriptor_extractors.html?highlight=extractor#DescriptorExtractor : public Algorithm>`_.
Retrieved information about lines is stored in *KeyLine* objects.
BinaryDescriptor::Params
-----------------------------------------------------------------------
.. ocv:struct:: BinaryDescriptor::Params
List of BinaryDescriptor parameters::
struct CV_EXPORTS_W_SIMPLE Params{
CV_WRAP Params();
/* the number of image octaves (default = 1) */
CV_PROP_RW int numOfOctave_;
/* the width of band; (default = 7) */
CV_PROP_RW int widthOfBand_;
/* image's reduction ratio in construction of Gaussian pyramids (default = 2) */
CV_PROP_RW int reductionRatio;
/* read parameters from a FileNode object and store them (struct function) */
void read( const FileNode& fn );
/* store parameters to a FileStorage object (struct function) */
void write( FileStorage& fs ) const;
};
BinaryDescriptor::BinaryDescriptor
----------------------------------
Constructor
.. ocv:function:: bool BinaryDescriptor::BinaryDescriptor( const BinaryDescriptor::Params &parameters = BinaryDescriptor::Params() )
:param parameters: configuration parameters :ocv:struct:`BinaryDescriptor::Params`
If no argument is provided, the constructor sets default values (see the comments in the code snippet in the previous section). Default values are strongly recommended. A short construction sketch is shown below.
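A minimal construction sketch, assuming the line_descriptor header and namespace names used in the other sketches on this page:

.. code-block:: cpp

    #include <opencv2/core/core.hpp>
    #include <opencv2/line_descriptor.hpp> // assumed header name

    using namespace cv;
    using namespace cv::line_descriptor;   // assumed namespace

    int main()
    {
        BinaryDescriptor::Params params;
        params.numOfOctave_   = 2; // use two octaves instead of the default 1
        params.widthOfBand_   = 7; // keep the default band width
        params.reductionRatio = 2; // keep the default pyramid reduction ratio

        BinaryDescriptor bd(params); // construction with explicit parameters
        // or, equivalently, through the factory returning a smart pointer:
        Ptr<BinaryDescriptor> pbd = BinaryDescriptor::createBinaryDescriptor(params);
        return 0;
    }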
BinaryDescriptor::getNumOfOctaves
---------------------------------
Get current number of octaves
.. ocv:function:: int BinaryDescriptor::getNumOfOctaves()
BinaryDescriptor::setNumOfOctaves
---------------------------------
Set number of octaves
.. ocv:function:: void BinaryDescriptor::setNumOfOctaves( int octaves )
:param octaves: number of octaves
BinaryDescriptor::getWidthOfBand
--------------------------------
Get current width of bands
.. ocv:function:: int BinaryDescriptor::getWidthOfBand()
BinaryDescriptor::setWidthOfBand
--------------------------------
Set width of bands
.. ocv:function:: void BinaryDescriptor::setWidthOfBand( int width )
:param width: width of bands
BinaryDescriptor::getReductionRatio
-----------------------------------
Get current reduction ratio (used in Gaussian pyramids)
.. ocv:function:: int BinaryDescriptor::getReductionRatio()
BinaryDescriptor::setReductionRatio
-----------------------------------
Set reduction ratio (used in Gaussian pyramids)
.. ocv:function:: void BinaryDescriptor::setReductionRatio( int rRatio )
:param rRatio: reduction ratio
BinaryDescriptor::createBinaryDescriptor
----------------------------------------
Create a BinaryDescriptor object with default parameters (or with the ones provided) and return a smart pointer to it
.. ocv:function:: Ptr<BinaryDescriptor> BinaryDescriptor::createBinaryDescriptor()
.. ocv:function:: Ptr<BinaryDescriptor> BinaryDescriptor::createBinaryDescriptor( Params parameters )
BinaryDescriptor::operator()
----------------------------
Defines operator '()' to perform detection of KeyLines and computation of their descriptors in a single call. A usage sketch is given below, after the parameter list.
.. ocv:function:: void BinaryDescriptor::operator()( InputArray image, InputArray mask, vector<KeyLine>& keylines, OutputArray descriptors, bool useProvidedKeyLines=false, bool returnFloatDescr ) const
:param image: input image
:param mask: mask matrix to select which lines in KeyLines must be accepted among the ones extracted (used when *keylines* is not empty)
:param keylines: vector that contains input lines (when filled, the detection part will be skipped and input lines will be passed as input to the algorithm computing descriptors)
:param descriptors: matrix that will store final descriptors
:param useProvidedKeyLines: flag (when set to true, detection phase will be skipped and only computation of descriptors will be executed, using lines provided in *keylines*)
:param returnFloatDescr: flag (when set to true, original non-binary descriptors are returned)
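A minimal sketch of the one-call interface (the header/namespace names and the input file name are assumptions for illustration):

.. code-block:: cpp

    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/line_descriptor.hpp> // assumed header name

    #include <vector>

    using namespace cv;
    using namespace cv::line_descriptor;   // assumed namespace

    int main()
    {
        Mat image = imread("building.jpg", 0); // placeholder input, grayscale

        Ptr<BinaryDescriptor> bd = BinaryDescriptor::createBinaryDescriptor();
        std::vector<KeyLine> keylines; // left empty, so detection is not skipped
        Mat descriptors;

        // detect lines and compute their binary descriptors in a single call
        (*bd)(image, Mat(), keylines, descriptors, false /* useProvidedKeyLines */, false /* returnFloatDescr */);
        return 0;
    }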
BinaryDescriptor::read
----------------------
Read parameters from a FileNode object and store them
.. ocv:function:: void BinaryDescriptor::read( const FileNode& fn )
:param fn: source FileNode file
BinaryDescriptor::write
-----------------------
Store parameters to a FileStorage object
.. ocv:function:: void BinaryDescriptor::write( FileStorage& fs ) const
:param fs: output FileStorage file
BinaryDescriptor::defaultNorm
-----------------------------
Return norm mode
.. ocv:function:: int BinaryDescriptor::defaultNorm() const
BinaryDescriptor::descriptorType
--------------------------------
Return data type
.. ocv:function:: int BinaryDescriptor::descriptorType() const
BinaryDescriptor::descriptorSize
--------------------------------
Return descriptor size
.. ocv:function:: int BinaryDescriptor::descriptorSize() const
BinaryDescriptor::detect
------------------------
Runs line detection (for one or more images)
.. ocv:function:: void detect( const Mat& image, vector<KeyLine>& keylines, Mat& mask=Mat() )
.. ocv:function:: void detect( const vector<Mat>& images, vector<vector<KeyLine> >& keylines, vector<Mat>& masks=vector<Mat>() ) const
:param image: input image
:param images: input images
:param keylines: vector or set of vectors that will store extracted lines for one or more images
:param mask: mask matrix to detect only KeyLines of interest
:param masks: vector of mask matrices to detect only KeyLines of interest from each input image
BinaryDescriptor::compute
-------------------------
Runs descriptor computation (for one or more images). A usage sketch is given below, after the parameter list.
.. ocv:function:: void compute( const Mat& image, vector<KeyLine>& keylines, Mat& descriptors, bool returnFloatDescr ) const
.. ocv:function:: void compute( const vector<Mat>& images, vector<vector<KeyLine> >& keylines, vector<Mat>& descriptors, bool returnFloatDescr ) const
:param image: input image
:param images: input images
:param keylines: vector or set of vectors containing lines for which descriptors must be computed
:param mask: mask to select for which lines, among the ones provided in input, descriptors must be computed
:param masks: set of masks to select for which lines, among the ones provided in input, descriptors must be computed
:param returnFloatDescr: flag (when set to true, original non-binary descriptors are returned)
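A minimal sketch of the two-step interface, detecting lines first and then computing their descriptors (the header/namespace names and the input file name are assumptions for illustration):

.. code-block:: cpp

    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/line_descriptor.hpp> // assumed header name

    #include <vector>

    using namespace cv;
    using namespace cv::line_descriptor;   // assumed namespace

    int main()
    {
        Mat image = imread("building.jpg", 0); // placeholder input, grayscale

        Ptr<BinaryDescriptor> bd = BinaryDescriptor::createBinaryDescriptor();

        // step 1: detect lines
        std::vector<KeyLine> keylines;
        bd->detect(image, keylines);

        // step 2: compute binary descriptors for the detected lines
        Mat descriptors;
        bd->compute(image, keylines, descriptors, false /* returnFloatDescr */);
        return 0;
    }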
Related pages
-------------
* :ref:`line_descriptor`
* :ref:`matching`
* :ref:`drawing`
.. _drawing:
Drawing Functions for Keylines and Matches
==========================================
.. highlight:: cpp
drawLineMatches
---------------
Draws the found matches of keylines from two images.
.. ocv:function:: void drawLineMatches( const Mat& img1, const std::vector<KeyLine>& keylines1, const Mat& img2, const std::vector<KeyLine>& keylines2, const std::vector<DMatch>& matches1to2, Mat& outImg, const Scalar& matchColor=Scalar::all(-1), const Scalar& singleLineColor=Scalar::all(-1), const std::vector<char>& matchesMask=std::vector<char>(), int flags=DrawLinesMatchesFlags::DEFAULT )
:param img1: first image
:param keylines1: keylines extracted from first image
:param img2: second image
:param keylines2: keylines extracted from second image
:param matches1to2: vector of matches
:param outImg: output matrix to draw on
:param matchColor: drawing color for matches (chosen randomly in case of default value)
:param singleLineColor: drawing color for keylines (chosen randomly in case of default value)
:param matchesMask: mask to indicate which matches must be drawn
:param flags: drawing flags
.. note:: If both *matchColor* and *singleLineColor* are set to their default values, the function draws the matched lines and the lines connecting them with the same color
The structure of drawing flags is shown in the following:
.. code-block:: cpp
/* struct for drawing options */
struct CV_EXPORTS DrawLinesMatchesFlags
{
enum
{
DEFAULT = 0, // Output image matrix will be created (Mat::create),
// i.e. existing memory of output image may be reused.
// Two source images, matches, and single keylines
// will be drawn.
DRAW_OVER_OUTIMG = 1, // Output image matrix will not be
// created (using Mat::create). Matches will be drawn
// on existing content of output image.
NOT_DRAW_SINGLE_LINES = 2 // Single keylines will not be drawn.
};
};
..
drawKeylines
------------
Draws keylines.
.. ocv:function:: void drawKeylines( const Mat& image, const std::vector<KeyLine>& keylines, Mat& outImage, const Scalar& color=Scalar::all(-1), int flags=DrawLinesMatchesFlags::DEFAULT )
:param image: input image
:param keylines: keylines to be drawn
:param outImage: output image to draw on
:param color: color of lines to be drawn (if set to the default value, the color is chosen randomly)
:param flags: drawing flags
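A minimal drawing sketch (the header/namespace names and file names are assumptions for illustration; detection is performed on a grayscale copy, drawing on the color image):

.. code-block:: cpp

    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <opencv2/line_descriptor.hpp> // assumed header name

    #include <vector>

    using namespace cv;
    using namespace cv::line_descriptor;   // assumed namespace

    int main()
    {
        Mat imageColor = imread("building.jpg"); // placeholder input
        Mat imageGray;
        cvtColor(imageColor, imageGray, COLOR_BGR2GRAY);

        // detect lines on the grayscale copy
        Ptr<BinaryDescriptor> bd = BinaryDescriptor::createBinaryDescriptor();
        std::vector<KeyLine> keylines;
        bd->detect(imageGray, keylines);

        // draw them with random colors on the color image
        Mat output;
        drawKeylines(imageColor, keylines, output, Scalar::all(-1), DrawLinesMatchesFlags::DEFAULT);
        imwrite("keylines.jpg", output);
        return 0;
    }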
Related pages
-------------
* :ref:`line_descriptor`
* :ref:`binary_descriptor`
* :ref:`matching`
.. _line_descriptor:
Binary descriptors for lines extracted from an image
====================================================
.. highlight:: cpp
Introduction
------------
One of the most challenging activities in computer vision is the extraction of useful information from a given image. Such information usually comes in the form of points that preserve some kind of property (for instance, they are scale-invariant) and are actually representative of the input image.
The goal of this module is to seek a new kind of representative information inside an image and to provide the functionality for its extraction and representation. In particular, unlike previous methods for the detection of relevant elements inside an image, lines are extracted in place of points; a new class is defined ad hoc to summarize a line's properties, for reuse and plotting purposes.
A class to represent a line: KeyLine
------------------------------------
As mentioned above, it was necessary to design a class that fully stores the information needed to completely characterize a line and plot it on the image it was extracted from, when required.
The *KeyLine* class has been created for this goal; it is mainly inspired by Feature2d's KeyPoint class, since KeyLine shares some of *KeyPoint*'s fields, even if some of them take on a different meaning when applied to lines.
In particular:
* the *class_id* field is used to gather lines extracted from different octaves which refer to the same line inside the original image (such lines and the one they represent in the original image share the same *class_id* value)
* the *angle* field represents line's slope with respect to (positive) X axis
* the *pt* field represents line's midpoint
* the *response* field is computed as the ratio between the line's length and the maximum between the image's width and height
* the *size* field is the area of the smallest rectangle containing the line
Apart from the fields inspired by the KeyPoint class, KeyLine stores information about the extremes of the line in the original image and in the octave it was extracted from, the line's length and the number of pixels it covers. The code for the KeyLine class is reported in the following snippet:
.. ocv:class:: KeyLine
::
class CV_EXPORTS_W KeyLine
{
public:
/* orientation of the line */
float angle;
/* object ID, that can be used to cluster keylines by the line they represent */
int class_id;
/* octave (pyramid layer), from which the keyline has been extracted */
int octave;
/* coordinates of the middlepoint */
Point pt;
/* the response, by which the strongest keylines have been selected.
It's represented by the ratio between line's length and maximum between
image's width and height */
float response;
/* minimum area containing line */
float size;
/* line's extremes in original image */
float startPointX;
float startPointY;
float endPointX;
float endPointY;
/* line's extremes in image it was extracted from */
float sPointInOctaveX;
float sPointInOctaveY;
float ePointInOctaveX;
float ePointInOctaveY;
/* the length of line */
float lineLength;
/* number of pixels covered by the line */
unsigned int numOfPixels;
/* constructor */
KeyLine(){}
};
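As a small sketch of how these fields can be read back after extraction (the header/namespace names and the input file name are assumptions for illustration; the detector used here is the LSDDetector documented earlier in this module):

.. code-block:: cpp

    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/line_descriptor.hpp> // assumed header name

    #include <iostream>
    #include <vector>

    using namespace cv;
    using namespace cv::line_descriptor;   // assumed namespace

    int main()
    {
        Mat image = imread("building.jpg", 0); // placeholder input, grayscale

        Ptr<LSDDetector> lsd = LSDDetector::createLSDDetector();
        std::vector<KeyLine> keylines;
        lsd->detect(image, keylines, 2 /* scale */, 1 /* numOctaves */, Mat());

        // print the extremes, slope and length of every extracted line
        for (size_t i = 0; i < keylines.size(); ++i)
        {
            const KeyLine& kl = keylines[i];
            std::cout << "line " << kl.class_id
                      << ": (" << kl.startPointX << ", " << kl.startPointY << ") -> ("
                      << kl.endPointX << ", " << kl.endPointY << ")"
                      << ", angle = " << kl.angle
                      << ", length = " << kl.lineLength << std::endl;
        }
        return 0;
    }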
Computation of binary descriptors
---------------------------------
To obtain a binary descriptor representing a certain line detected from a certain octave of an image, we first compute a non-binary descriptor as described in [LBD]_. The algorithm works on lines extracted using the EDLine detector, as explained in [EDL]_. Given a line, we consider a rectangular region centered at it, called the *line support region (LSR)*. This region is divided into a set of bands :math:`\{B_1, B_2, ..., B_m\}`, whose length equals that of the line.
If we indicate with :math:`\bf{d}_L` the direction of the line, the direction :math:`\bf{d}_{\perp}` orthogonal and clockwise to it can be determined; these two directions are used to construct a reference frame centered at the midpoint of the line. The gradient :math:`\bf{g}` of each pixel inside the LSR can be projected onto the newly determined frame, obtaining its local equivalent :math:`\bf{g'} = (\bf{g}^T \cdot \bf{d}_{\perp}, \bf{g}^T \cdot \bf{d}_L)^T \triangleq (\bf{g'}_{d_{\perp}}, \bf{g'}_{d_L})^T`.
Later on, a Gaussian function is applied to all the LSR's pixels along the :math:`\bf{d}_\perp` direction; first, we assign a global weighting coefficient :math:`f_g(i) = (1/\sqrt{2\pi}\sigma_g)e^{-d^2_i/2\sigma^2_g}` to the *i*-th row in the LSR, where :math:`d_i` is the distance of the *i*-th row from the center row of the LSR, :math:`\sigma_g = 0.5(m \cdot w - 1)` and :math:`w` is the width of the bands (the same for every band). Secondly, considering a band :math:`B_j` and its neighbor bands :math:`B_{j-1}, B_{j+1}`, we assign a local weighting coefficient :math:`f_l(k) = (1/\sqrt{2\pi}\sigma_l)e^{-d'^2_k/2\sigma_l^2}`, where :math:`d'_k` is the distance of the *k*-th row from the center row of :math:`B_j` and :math:`\sigma_l = w`. Using the global and local weights we reduce, respectively, the role played by gradients far from the line and the boundary effects between neighboring bands.
Each band :math:`B_j` in the LSR has an associated *band descriptor (BD)*, which is computed considering the previous and next band (the top and bottom bands are ignored when computing the descriptor of the first and last band). Once each band has been assigned its BD, the LBD descriptor of the line is simply given by
.. math::
LBD = (BD_1^T, BD_2^T, ... , BD^T_m)^T.
To compute the descriptor of a band :math:`B_j`, each *k*-th row in it is considered and the gradients in that row are accumulated:
.. math::
\begin{matrix} \bf{V1}^k_j = \lambda \sum\limits_{\bf{g}'_{d_\perp}>0}\bf{g}'_{d_\perp}, & \bf{V2}^k_j = \lambda \sum\limits_{\bf{g}'_{d_\perp}<0} -\bf{g}'_{d_\perp}, \\ \bf{V3}^k_j = \lambda \sum\limits_{\bf{g}'_{d_L}>0}\bf{g}'_{d_L}, & \bf{V4}^k_j = \lambda \sum\limits_{\bf{g}'_{d_L}<0} -\bf{g}'_{d_L}\end{matrix}.
with :math:`\lambda = f_g(k)f_l(k)`.
By stacking previous results, we obtain the *band description matrix (BDM)*
.. math::
BDM_j = \left(\begin{matrix} \bf{V1}_j^1 & \bf{V1}_j^2 & \ldots & \bf{V1}_j^n \\ \bf{V2}_j^1 & \bf{V2}_j^2 & \ldots & \bf{V2}_j^n \\ \bf{V3}_j^1 & \bf{V3}_j^2 & \ldots & \bf{V3}_j^n \\ \bf{V4}_j^1 & \bf{V4}_j^2 & \ldots & \bf{V4}_j^n \end{matrix} \right) \in \mathbb{R}^{4\times n},
with :math:`n` the number of rows in band :math:`B_j`:
.. math::
n = \begin{cases} 2w, & j = 1 \;\mbox{or}\; j = m; \\ 3w, & \mbox{otherwise}. \end{cases}
Each :math:`BD_j` can be obtained using the standard deviation vector :math:`S_j` and mean vector :math:`M_j` of :math:`BDM_j`. Thus, finally:
.. math::
LBD = (M_1^T, S_1^T, M_2^T, S_2^T, \ldots, M_m^T, S_m^T)^T \in \mathbb{R}^{8m}
Once the LBD has been obtained, it must be converted into a binary form. For this purpose, we consider 32 possible pairs of BDs inside it; each pair of BDs is compared bit by bit and each comparison generates an 8-bit string. Concatenating the 32 comparison strings, we get the final 256-bit binary representation of a single LBD.
References
----------
.. [LBD] Zhang, Lilian, and Reinhard Koch. *An efficient and robust line segment matching approach based on LBD descriptor and pairwise geometric consistency*, Journal of Visual Communication and Image Representation 24.7 (2013): 794-805.
.. [EDL] Von Gioi, R. Grompone, et al. *LSD: A fast line segment detector with a false detection control*, IEEE Transactions on Pattern Analysis and Machine Intelligence 32.4 (2010): 722-732.
Summary
-------
.. toctree::
:maxdepth: 2
binary_descriptor
LSDDetector
matching
drawing_functions
tutorial
.. _matching:
Matching with binary descriptors
================================
.. highlight:: cpp
Once descriptors have been extracted from an image (whether they represent lines or points), it becomes interesting to match a descriptor with another one extracted from a different image and representing the same line or point, seen from a different perspective or at a different scale.
The main difficulty in reaching this goal is designing an efficient search algorithm that associates a query descriptor with one extracted from a dataset.
In the following, a matching modality based on *Multi-Index Hashing (MiHashing)* will be described.
Multi-Index Hashing
-------------------
The theory described in this section is based on [MIH]_.
Given a dataset populated with binary codes, each code is indexed *m* times into *m* different hash tables, according to the *m* substrings it has been divided into. Thus, given a query code, all the entries close to it in at least one substring are returned by the search as *neighbor candidates*. Returned entries are then checked for validity by verifying that their full codes are not distant (in Hamming space) more than *r* bits from the query code.
In detail, each binary code **h** composed of *b* bits is divided into *m* disjoint substrings :math:`\mathbf{h}^{(1)}, ..., \mathbf{h}^{(m)}`, each of length :math:`\lfloor b/m \rfloor` or :math:`\lceil b/m \rceil` bits. Formally, when two codes **h** and **g** differ by at most *r* bits, in at least one of their *m* substrings they differ by at most :math:`\lfloor r/m \rfloor` bits. In particular, when :math:`||\mathbf{h}-\mathbf{g}||_H \le r` (where :math:`||.||_H` is the Hamming norm), there must exist a substring *k* (with :math:`1 \le k \le m`) such that
.. math::
||\mathbf{h}^{(k)} - \mathbf{g}^{(k)}||_H \le \left\lfloor \frac{r}{m} \right\rfloor .
This means that if the Hamming distance between **h** and **g** in each of the *m* substrings is strictly greater than :math:`\lfloor r/m \rfloor`, then :math:`||\mathbf{h}-\mathbf{g}||_H` must be larger than *r*, which is a contradiction.
If the codes in the dataset are divided into *m* substrings, then *m* tables are built. Given a query **q** with substrings :math:`\{\mathbf{q}^{(i)}\}^m_{i=1}`, the *i*-th hash table is searched for entries within distance :math:`\lfloor r/m \rfloor` of :math:`\mathbf{q}^{(i)}`, and a set of candidates :math:`\mathcal{N}_i(\mathbf{q})` is obtained.
The union of sets :math:`\mathcal{N}(\mathbf{q}) = \bigcup_i \mathcal{N}_i(\mathbf{q})` is a superset of the *r*-neighbors of **q**. The last step of the algorithm is then computing the Hamming distance between **q** and each element of :math:`\mathcal{N}(\mathbf{q})`, discarding the codes whose distance from **q** exceeds *r*.
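As a concrete illustration (the numbers are only an example, not values prescribed by the module): for 256-bit codes such as the LBD descriptors above, choosing *m* = 32 gives substrings of :math:`256/32 = 8` bits; a search with radius *r* = 32 then only needs to look, in each of the 32 hash tables, for entries whose corresponding substring differs from the query substring by at most :math:`\lfloor 32/32 \rfloor = 1` bit, and the resulting (much smaller) candidate set is verified with full 256-bit Hamming distances.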
BinaryDescriptorMatcher Class
=============================
The BinaryDescriptorMatcher class provides all the functionality for querying a dataset supplied by the user, or one internal to the class (which the user must, in any case, populate), following the model of Feature2d's `DescriptorMatcher <http://docs.opencv.org/modules/features2d/doc/common_interfaces_of_descriptor_matchers.html?highlight=bfmatcher#descriptormatcher>`_.
BinaryDescriptorMatcher::BinaryDescriptorMatcher
--------------------------------------------------
Constructor.
.. ocv:function:: BinaryDescriptorMatcher::BinaryDescriptorMatcher()
The constructed BinaryDescriptorMatcher is able to store and manage 256-bit long entries.
BinaryDescriptorMatcher::createBinaryDescriptorMatcher
------------------------------------------------------
Create a BinaryDescriptorMatcher object and return a smart pointer to it.
.. ocv:function:: Ptr<BinaryDescriptorMatcher> BinaryDescriptorMatcher::createBinaryDescriptorMatcher()
BinaryDescriptorMatcher::add
----------------------------
Store new descriptors locally so they can later be inserted into the dataset, without updating the dataset.
.. ocv:function:: void BinaryDescriptorMatcher::add( const std::vector<Mat>& descriptors )
:param descriptors: matrices containing descriptors to be inserted into dataset
.. note:: Each matrix *i* in **descriptors** should contain descriptors relative to lines extracted from *i*-th image.
BinaryDescriptorMatcher::train
------------------------------
Update the dataset by inserting all descriptors that were stored locally by the *add* function.
.. ocv:function:: void BinaryDescriptorMatcher::train()
.. note:: Every time this function is invoked, the current dataset is deleted and the locally stored descriptors are inserted into the dataset. The local copy of the just-inserted descriptors is then removed.
BinaryDescriptorMatcher::clear
------------------------------
Clear dataset and internal data
.. ocv:function:: void BinaryDescriptorMatcher::clear()
BinaryDescriptorMatcher::match
------------------------------
For every input query descriptor, retrieve the best matching one from a dataset provided by the user or from the one internal to the class (a usage sketch is given below, after the parameter list)
.. ocv:function:: void BinaryDescriptorMatcher::match( const Mat& queryDescriptors, const Mat& trainDescriptors, std::vector<DMatch>& matches, const Mat& mask=Mat() ) const
.. ocv:function:: void BinaryDescriptorMatcher::match( const Mat& queryDescriptors, std::vector<DMatch>& matches, const std::vector<Mat>& masks=std::vector<Mat>() )
:param queryDescriptors: query descriptors
:param trainDescriptors: dataset of descriptors furnished by user
:param matches: vector to host retrieved matches
:param mask: mask to select which input descriptors must be matched to one in dataset
:param masks: vector of masks to select which input descriptors must be matched to one in dataset (the *i*-th mask in vector indicates whether each input query can be matched with descriptors in dataset relative to *i*-th image)
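A minimal end-to-end matching sketch, not part of the original documentation (the header/namespace names and file names are assumptions for illustration; only the ``add``/``train``/``match`` calls and the BinaryDescriptor calls come from the documented API):

.. code-block:: cpp

    #include <opencv2/core/core.hpp>
    #include <opencv2/features2d/features2d.hpp> // DMatch
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/line_descriptor.hpp>       // assumed header name

    #include <vector>

    using namespace cv;
    using namespace cv::line_descriptor;         // assumed namespace

    int main()
    {
        Mat img1 = imread("left.jpg", 0);  // placeholder query image, grayscale
        Mat img2 = imread("right.jpg", 0); // placeholder train image, grayscale

        // extract binary line descriptors from both images
        Ptr<BinaryDescriptor> bd = BinaryDescriptor::createBinaryDescriptor();
        std::vector<KeyLine> kl1, kl2;
        Mat desc1, desc2;
        bd->detect(img1, kl1);
        bd->compute(img1, kl1, desc1, false);
        bd->detect(img2, kl2);
        bd->compute(img2, kl2, desc2, false);

        // populate the matcher's internal dataset with the descriptors of img2 ...
        Ptr<BinaryDescriptorMatcher> matcher = BinaryDescriptorMatcher::createBinaryDescriptorMatcher();
        matcher->add(std::vector<Mat>(1, desc2));
        matcher->train();

        // ... and query it with the descriptors of img1
        std::vector<DMatch> matches;
        matcher->match(desc1, matches);
        return 0;
    }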
BinaryDescriptorMatcher::knnMatch
---------------------------------
For every input query descriptor, retrieve the best *k* matching ones from a dataset provided by the user or from the one internal to the class
.. ocv:function:: void BinaryDescriptorMatcher::knnMatch( const Mat& queryDescriptors, const Mat& trainDescriptors, std::vector<std::vector<DMatch> >& matches, int k, const Mat& mask=Mat(), bool compactResult=false ) const
.. ocv:function:: void BinaryDescriptorMatcher::knnMatch( const Mat& queryDescriptors, std::vector<std::vector<DMatch> >& matches, int k, const std::vector<Mat>& masks=std::vector<Mat>(), bool compactResult=false )
:param queryDescriptors: query descriptors
:param trainDescriptors: dataset of descriptors furnished by user
:param matches: vector to host retrieved matches
:param k: number of the closest descriptors to be returned for every input query
:param mask: mask to select which input descriptors must be matched to ones in dataset
:param masks: vector of masks to select which input descriptors must be matched to ones in dataset (the *i*-th mask in vector indicates whether each input query can be matched with descriptors in dataset relative to *i*-th image)
:param compactResult: flag to obtain a compact result (if true, a vector that doesn't contain any matches for a given query is not inserted in final result)
BinaryDescriptorMatcher::radiusMatch
------------------------------------
For every input query descriptor, retrieve, from a dataset provided by the user or from the one internal to the class, all the descriptors that are not farther than *maxDistance* from the input query
.. ocv:function:: void BinaryDescriptorMatcher::radiusMatch( const Mat& queryDescriptors, const Mat& trainDescriptors, std::vector<std::vector<DMatch> >& matches, float maxDistance, const Mat& mask=Mat(), bool compactResult=false ) const
.. ocv:function:: void BinaryDescriptorMatcher::radiusMatch( const Mat& queryDescriptors, std::vector<std::vector<DMatch> >& matches, float maxDistance, const std::vector<Mat>& masks=std::vector<Mat>(), bool compactResult=false )
:param queryDescriptors: query descriptors
:param trainDescriptors: dataset of descriptors furnished by user
:param matches: vector to host retrieved matches
:param maxDistance: search radius
:param mask: mask to select which input descriptors must be matched to ones in dataset
:param masks: vector of masks to select which input descriptors must be matched to ones in dataset (the *i*-th mask in vector indicates whether each input query can be matched with descriptors in dataset relative to *i*-th image)
:param compactResult: flag to obtain a compact result (if true, a vector that doesn't contain any matches for a given query is not inserted in final result)
Related pages
-------------
* :ref:`line_descriptor`
* :ref:`binary_descriptor`
* :ref:`drawing`
References
----------
.. [MIH] Norouzi, Mohammad, Ali Punjani, and David J. Fleet. *Fast search in hamming space with multi-index hashing*, Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
**************************
reg. Image Registration
**************************
.. toctree::
:maxdepth: 2
registration