Commit ec2b3ed6 authored by Vadim Pisarevsky's avatar Vadim Pisarevsky

Merge pull request #185 from mshabunin/remove-docs

Removed sphinx documentation
parents f861dc9e 380e6111
.. _Table-Of-Content-Bioinspired:
*bioinspired* module. Algorithms inspired from biological models
----------------------------------------------------------------
Here you will learn how to use the additional functionality of OpenCV provided by the "bioinspired" module.
.. include:: ../../definitions/tocDefinitions.rst
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|RetinaDemoImg| **Title:** :ref:`Retina_Model`
*Compatibility:* > OpenCV 2.4
*Author:* |Author_AlexB|
You will learn how to process images and video streams with a model of retina filter for details enhancement, spatio-temporal noise removal, luminance correction and spatio-temporal events detection.
=============== ======================================================
.. |RetinaDemoImg| image:: images/retina_TreeHdr_small.jpg
   :height: 90pt
   :width: 90pt
.. raw:: latex
\pagebreak
.. toctree::
   :hidden:

   ../retina_model/retina_model
.. _Table-Of-Content-CVV:
*cvv* module. GUI for Interactive Visual Debugging
--------------------------------------------------
Here you will learn how to use the cvv module to ease programming computer vision software through visual debugging aids.
.. include:: ../../definitions/tocDefinitions.rst
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|cvvIntro| **Title:** :ref:`Visual_Debugging_Introduction`
*Compatibility:* > OpenCV 2.4.8
*Author:* |Author_Bihlmaier|
We will learn how to debug our applications in a visual and interactive way.
=============== ======================================================
.. |cvvIntro| image:: images/Visual_Debugging_Introduction_Tutorial_Cover.jpg
   :height: 90pt
   :width: 90pt
.. raw:: latex
\pagebreak
.. toctree::
   :hidden:

   ../visual_debugging_introduction/visual_debugging_introduction
.. _ximgproc:
Structured forests for fast edge detection
******************************************
Introduction
------------
In this tutorial you will learn how to use structured forests for the purpose of edge detection in an image.
Examples
--------
.. image:: images/01.jpg
   :height: 238pt
   :width: 750pt
   :alt: First example
   :align: center

.. image:: images/02.jpg
   :height: 238pt
   :width: 750pt
   :alt: Second example
   :align: center

.. image:: images/03.jpg
   :height: 238pt
   :width: 750pt
   :alt: Third example
   :align: center

.. image:: images/04.jpg
   :height: 238pt
   :width: 750pt
   :alt: Fourth example
   :align: center

.. image:: images/05.jpg
   :height: 238pt
   :width: 750pt
   :alt: Fifth example
   :align: center

.. image:: images/06.jpg
   :height: 238pt
   :width: 750pt
   :alt: Sixth example
   :align: center

.. image:: images/07.jpg
   :height: 238pt
   :width: 750pt
   :alt: Seventh example
   :align: center

.. image:: images/08.jpg
   :height: 238pt
   :width: 750pt
   :alt: Eighth example
   :align: center

.. image:: images/09.jpg
   :height: 238pt
   :width: 750pt
   :alt: Ninth example
   :align: center

.. image:: images/10.jpg
   :height: 238pt
   :width: 750pt
   :alt: Tenth example
   :align: center

.. image:: images/11.jpg
   :height: 238pt
   :width: 750pt
   :alt: Eleventh example
   :align: center

.. image:: images/12.jpg
   :height: 238pt
   :width: 750pt
   :alt: Twelfth example
   :align: center
**Note:** binarization techniques such as the Canny edge detector can be applied
to the edge maps produced by both algorithms (``Sobel`` and ``StructuredEdgeDetection::detectEdges``).
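For example, one could binarize the floating-point edge map like this (a minimal sketch; the threshold value of 64 is an arbitrary assumption):

.. code-block:: cpp

    // edges is the floating-point output of detectEdges, scaled to [0;1]
    cv::Mat edges8u, binaryEdges;
    edges.convertTo(edges8u, CV_8U, 255);                            // back to the 8-bit range
    cv::threshold(edges8u, binaryEdges, 64, 255, cv::THRESH_BINARY); // hard binarization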
Source Code
-----------
.. literalinclude:: ../../../../modules/ximgproc/samples/cpp/structured_edge_detection.cpp
   :language: cpp
   :linenos:
   :tab-width: 4
Explanation
-----------
1. **Load source color image**
.. code-block:: cpp

    cv::Mat image = cv::imread(inFilename, 1);
    if ( image.empty() )
    {
        printf("Cannot read image file: %s\n", inFilename.c_str());
        return -1;
    }
2. **Convert source image to [0;1] range**
.. code-block:: cpp

    image.convertTo(image, cv::DataType<float>::type, 1/255.0);
3. **Run main algorithm**
.. code-block:: cpp

    cv::Mat edges(image.size(), image.type());

    cv::Ptr<StructuredEdgeDetection> pDollar =
        cv::createStructuredEdgeDetection(modelFilename);

    pDollar->detectEdges(image, edges);
4. **Show results**
.. code-block:: cpp

    if ( outFilename == "" )
    {
        cv::namedWindow("edges", 1);
        cv::imshow("edges", edges);
        cv::waitKey(0);
    }
    else
        cv::imwrite(outFilename, 255*edges);
Literature
----------
For more information, refer to the following papers:
.. [Dollar2013] Dollar P., Zitnick C. L., "Structured forests for fast edge detection",
IEEE International Conference on Computer Vision (ICCV), 2013,
pp. 1841-1848. `DOI <http://dx.doi.org/10.1109/ICCV.2013.231>`_
.. [Lim2013] Lim J. J., Zitnick C. L., Dollar P., "Sketch Tokens: A Learned
Mid-level Representation for Contour and Object Detection",
Computer Vision and Pattern Recognition (CVPR), 2013,
pp. 3158-3165. `DOI <http://dx.doi.org/10.1109/CVPR.2013.406>`_
@@ -57,7 +57,7 @@ namespace bgsegm
/** @brief Gaussian Mixture-based Background/Foreground Segmentation Algorithm.
The class implements the algorithm described in @cite KB2001 .
*/
class CV_EXPORTS_W BackgroundSubtractorMOG : public BackgroundSubtractor
{
@@ -86,9 +86,9 @@ means some automatic value.
CV_EXPORTS_W Ptr<BackgroundSubtractorMOG>
createBackgroundSubtractorMOG(int history=200, int nmixtures=5,
double backgroundRatio=0.7, double noiseSigma=0);
/** @brief Background Subtractor module based on the algorithm given in @cite Gold2012 .
Takes a series of images and returns a sequence of mask (8UC1)
images of the same size, where 255 indicates Foreground and 0 represents Background.
......
********************************************************************
bioinspired. Biologically inspired vision models and derived tools
********************************************************************
The module provides biological visual system models (human visual system and others). It also provides derived objects that take advantage of those bio-inspired models.
.. toctree::
   :maxdepth: 2

   Human retina documentation <retina>
Bioinspired Module Retina Introduction {#bioinspired_retina}
======================================
Retina
------
@note do not forget that the retina model is included in the following namespace: cv::bioinspired
### Introduction
@@ -18,14 +17,13 @@ separable spatio-temporal filter modelling the two main retina information chann
From a general point of view, this filter whitens the image spectrum and corrects luminance thanks
to local adaptation. Another important property is its ability to filter out spatio-temporal noise
while enhancing details. This model originates from Jeanny Herault's work @cite Herault2010 . It has been
involved in Alexandre Benoit's PhD and his current research @cite Benoit2010, @cite Strat2013 (he
currently maintains this module within OpenCV). It includes the work of other PhD students of
Jeanny, such as @cite Chaix2007, and the log polar transformations of Barthelemy Durette described
in Jeanny's book.
@note
- For ease of use in computer vision applications, the two retina channels are applied
homogeneously on all the input images. This does not follow the real retina topology but this
can still be done using the log sampling capabilities proposed within the class.
@@ -71,7 +69,7 @@ described hereafter. XML parameters file samples are shown at the end of the pag
Here is an overview of the abstract Retina interface; allocate one instance with the *createRetina*
functions:
@code{.cpp}
namespace cv{namespace bioinspired{
class Retina : public Algorithm
@@ -122,6 +120,7 @@ functions.:
cv::Ptr<Retina> createRetina (Size inputSize);
cv::Ptr<Retina> createRetina (Size inputSize, const bool colorMode, RETINA_COLORSAMPLINGMETHOD colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const double reductionFactor=1.0, const double samplingStrenght=10.0);
}} // cv and bioinspired namespaces end
@endcode
### Description
@@ -146,59 +145,47 @@ Use : this model can be used basically for spatio-temporal video effects but also
- performing motion analysis that also benefits from the previously cited properties (check out the
magnocellular retina channel output, by using the provided **getMagno** methods)
- general image/video sequence description using either one or both channels. An example of the
use of Retina in a Bag of Words approach is given in @cite Strat2013 .
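Following the interface listed above, grabbing both channel outputs from a camera stream looks roughly like this (a minimal sketch; the camera index and the display loop are assumptions):

@code{.cpp}
#include <opencv2/bioinspired.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>

int main()
{
    cv::VideoCapture cap(0);
    cv::Mat frame, parvo, magno;
    cap >> frame; // the first frame fixes the retina input size
    cv::Ptr<cv::bioinspired::Retina> retina =
        cv::bioinspired::createRetina(frame.size());
    while (cap.read(frame))
    {
        retina->run(frame);      // feed the new frame
        retina->getParvo(parvo); // details channel
        retina->getMagno(magno); // transient/motion channel
        cv::imshow("parvo", parvo);
        cv::imshow("magno", magno);
        if (cv::waitKey(5) >= 0) break;
    }
    return 0;
}
@endcode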
Literature
----------
For more information, refer to the following papers:

- Model description: @cite Benoit2010
- Model use in a Bag of Words approach: @cite Strat2013
- Please have a look at the reference work of Jeanny Herault that you can read in his book: @cite Herault2010
This retina filter code includes the research contributions of PhD/research colleagues from which
code has been redrawn by the author:

- take a look at the *retinacolor.hpp* module to discover Brice Chaix de Lavarene's PhD color
  mosaicing/demosaicing and his reference paper: @cite Chaix2007
- take a look at *imagelogpolprojection.hpp* to discover retina spatial log sampling, which
  originates from Barthelemy Durette's PhD with Jeanny Herault. A Retina / V1 cortex projection is
  also proposed and originates from Jeanny's discussions. More information can be found in
  Jeanny Herault's book cited above.
- Meylan et al.'s work on HDR tone mapping, implemented as a specific method within the model: @cite Meylan2007
Demos and experiments!
----------------------
@note Complementary to the following examples, have a look at the Retina tutorial in the
tutorial/contrib section for additional explanations.
Take a look at the C++ examples provided with OpenCV:
- **samples/cpp/retinademo.cpp** shows how to use the retina module for details enhancement (Parvo channel output) and transient maps observation (Magno channel output). You can play with images, video sequences and webcam video.
  Typical uses are (provided your OpenCV installation is situated in folder *OpenCVReleaseFolder*)
  - image processing : **OpenCVReleaseFolder/bin/retinademo -image myPicture.jpg**
  - video processing : **OpenCVReleaseFolder/bin/retinademo -video myMovie.avi**
  - webcam processing: **OpenCVReleaseFolder/bin/retinademo -video**
@note This demo generates the file *RetinaDefaultParameters.xml* which contains the
default parameters of the retina. Then, rename this as *RetinaSpecificParameters.xml*, adjust
the parameters the way you want and reload the program to check the effect.
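In code, the two steps map to calls along these lines (a sketch; *retina* is an already allocated instance and error handling is omitted):

@code{.cpp}
// write the current (default) parameters to an XML file
retina->write("RetinaDefaultParameters.xml");
// later, load the hand-tuned parameter file and reset the temporal state
retina->setup("RetinaSpecificParameters.xml");
retina->clearBuffers();
@endcode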
@@ -217,7 +204,7 @@ Take a look at the provided C++ examples provided with OpenCV :
Note that some sliders are made available to allow you to play with luminance compression.
If the 'fast' option is not used, tone mapping is performed using the full retina model
@cite Benoit2010 . It includes spectral whitening that allows luminance energy to be reduced.
When using the 'fast' option, a simpler method is used: an adaptation of the
algorithm presented in @cite Meylan2007 . This method also gives good results and is faster to
process, but it sometimes requires some more parameter adjustment.
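A minimal sketch of the 'fast' path (assuming the *RetinaFastToneMapping* class exposed by the bioinspired module; the input file name is a placeholder):

@code{.cpp}
cv::Mat hdrImage = cv::imread("memorial.hdr", -1); // any high dynamic range input
cv::Mat toneMapped;
cv::Ptr<cv::bioinspired::RetinaFastToneMapping> toneMapper =
    cv::bioinspired::createRetinaFastToneMapping(hdrImage.size());
toneMapper->applyFastToneMapping(hdrImage, toneMapped);
@endcode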
Custom Calibration Pattern
==========================
.. highlight:: cpp
CustomPattern
-------------
A custom pattern class that can be used to calibrate a camera and to further track the translation and rotation of the pattern. By default, it uses an ``ORB`` feature detector and a ``BruteForce-Hamming(2)`` descriptor matcher to find the location of the pattern feature points that will subsequently be used for calibration.
.. ocv:class:: CustomPattern : public Algorithm
CustomPattern::CustomPattern
----------------------------
CustomPattern constructor.
.. ocv:function:: CustomPattern()
CustomPattern::create
---------------------
A method that initializes the class and generates the necessary detectors, extractors and matchers.
.. ocv:function:: bool create(InputArray pattern, const Size2f boardSize, OutputArray output = noArray())
:param pattern: The image, which will be used as a pattern. If the desired pattern is part of a bigger image, you can crop it out using image(roi).
:param boardSize: The size of the pattern in physical dimensions. These will be used to scale the points when the calibration occurs.
:param output: A matrix that is the same as the input pattern image, but has all the feature points drawn on it.
:return: Whether the initialization was successful. A possible reason for failure is that no feature points were detected.
.. seealso::
:ocv:func:`getFeatureDetector`,
:ocv:func:`getDescriptorExtractor`,
:ocv:func:`getDescriptorMatcher`
.. note::
* The number of detected feature points can be determined through the :ocv:func:`getPatternPoints` method.
* The feature detector, extractor and matcher cannot be changed after initialization.
CustomPattern::findPattern
--------------------------
Finds the pattern in the input image.
.. ocv:function:: bool findPattern(InputArray image, OutputArray matched_features, OutputArray pattern_points, const double ratio = 0.7, const double proj_error = 8.0, const bool refine_position = false, OutputArray out = noArray(), OutputArray H = noArray(), OutputArray pattern_corners = noArray())
:param image: The input image where the pattern is searched for.
:param matched_features: A ``vector<Point2f>`` of the projections of calibration pattern points, matched in the image. The points correspond to the ``pattern_points``. ``matched_features`` and ``pattern_points`` have the same size.
:param pattern_points: A ``vector<Point3f>`` of calibration pattern points in the calibration pattern coordinate space.
:param ratio: A ratio used to threshold matches based on D. Lowe's point ratio test.
:param proj_error: The maximum projection error that is allowed when the found points are back projected. A lower projection error will be beneficial for eliminating mismatches. Higher values are recommended when the camera lens has greater distortions.
:param refine_position: Whether to refine the position of the feature points with :ocv:func:`cornerSubPix`.
:param out: An image showing the matched feature points and a contour around the estimated pattern.
:param H: The homography transformation matrix between the pattern and the current image.
:param pattern_corners: A ``vector<Point2f>`` containing the 4 corners of the found pattern.
:return: Whether the pattern was found.
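A minimal usage sketch tying ``create`` and ``findPattern`` together (the header name, the ``cv::ccalib`` namespace and the image paths are assumptions):

.. code-block:: cpp

    #include <opencv2/ccalib.hpp>

    cv::Mat patternImg = cv::imread("pattern.png");
    cv::ccalib::CustomPattern pattern;
    pattern.create(patternImg, cv::Size2f(21.0f, 29.7f)); // physical pattern size, e.g. in cm

    cv::Mat frame = cv::imread("view.png");
    std::vector<cv::Point2f> matchedFeatures;
    std::vector<cv::Point3f> patternPoints;
    if (pattern.findPattern(frame, matchedFeatures, patternPoints))
    {
        // matchedFeatures / patternPoints can now be fed to calibrate() or findRt()
    }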
CustomPattern::isInitialized
----------------------------
.. ocv:function:: bool isInitialized()
:return: Whether the class is initialized.
CustomPattern::getPatternPoints
-------------------------------
.. ocv:function:: void getPatternPoints(OutputArray original_points)
:param original_points: The vector to be filled with the points found in the pattern.
CustomPattern::getPixelSize
---------------------------
.. ocv:function:: double getPixelSize()
:return: The physical pixel size as initialized by the pattern.
CustomPattern::setFeatureDetector
---------------------------------
.. ocv:function:: bool setFeatureDetector(Ptr<FeatureDetector> featureDetector)
:param featureDetector: Set a new FeatureDetector.
:return: Whether it was set successfully. Fails if the object is already initialized by :ocv:func:`create`.
.. note::
* It is left to the user's discretion to select a matching feature detector, extractor and matcher. Please consult the documentation of each to confirm coherence.
CustomPattern::setDescriptorExtractor
-------------------------------------
.. ocv:function:: bool setDescriptorExtractor(Ptr<DescriptorExtractor> extractor)
:param extractor: Set a new DescriptorExtractor.
:return: Whether it was set successfully. Fails if the object is already initialized by :ocv:func:`create`.
CustomPattern::setDescriptorMatcher
-----------------------------------
.. ocv:function:: bool setDescriptorMatcher(Ptr<DescriptorMatcher> matcher)
:param matcher: Set a new DescriptorMatcher.
:return: Whether it was set successfully. Fails if the object is already initialized by :ocv:func:`create`.
CustomPattern::getFeatureDetector
---------------------------------
.. ocv:function:: Ptr<FeatureDetector> getFeatureDetector()
:return: The used FeatureDetector.
CustomPattern::getDescriptorExtractor
-------------------------------------
.. ocv:function:: Ptr<DescriptorExtractor> getDescriptorExtractor()
:return: The used DescriptorExtractor.
CustomPattern::getDescriptorMatcher
-----------------------------------
.. ocv:function:: Ptr<DescriptorMatcher> getDescriptorMatcher()
:return: The used DescriptorMatcher.
CustomPattern::calibrate
------------------------
Calibrates the camera.
.. ocv:function:: double calibrate(InputArrayOfArrays objectPoints, InputArrayOfArrays imagePoints, Size imageSize, InputOutputArray cameraMatrix, InputOutputArray distCoeffs, OutputArrayOfArrays rvecs, OutputArrayOfArrays tvecs, int flags = 0, TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON))
See :ocv:func:`calibrateCamera` for parameter information.
CustomPattern::findRt
---------------------
Finds the rotation and translation vectors of the pattern.
.. ocv:function:: bool findRt(InputArray objectPoints, InputArray imagePoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int flags = ITERATIVE)
.. ocv:function:: bool findRt(InputArray image, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int flags = ITERATIVE)
:param image: The image, in which the rotation and translation of the pattern will be found.
See :ocv:func:`solvePnP` for parameter information.
CustomPattern::findRtRANSAC
---------------------------
Finds the rotation and translation vectors of the pattern using RANSAC.
.. ocv:function:: bool findRtRANSAC(InputArray objectPoints, InputArray imagePoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int iterationsCount = 100, float reprojectionError = 8.0, int minInliersCount = 100, OutputArray inliers = noArray(), int flags = ITERATIVE)
.. ocv:function:: bool findRtRANSAC(InputArray image, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int iterationsCount = 100, float reprojectionError = 8.0, int minInliersCount = 100, OutputArray inliers = noArray(), int flags = ITERATIVE)
:param image: The image, in which the rotation and translation of the pattern will be found.
See :ocv:func:`solvePnPRANSAC` for parameter information.
CustomPattern::drawOrientation
------------------------------
Draws the ``(x,y,z)`` axes on the image, in the center of the pattern, showing the orientation of the pattern.
.. ocv:function:: void drawOrientation(InputOutputArray image, InputArray tvec, InputArray rvec, InputArray cameraMatrix, InputArray distCoeffs, double axis_length = 3, int axis_width = 2)
:param image: The image based on which the rotation and translation were calculated. The axes are drawn in color: ``x`` in red, ``y`` in green, ``z`` in blue.
:param tvec: Translation vector.
:param rvec: Rotation vector.
:param cameraMatrix: The camera matrix.
:param distCoeffs: The distortion coefficients.
:param axis_length: The length of the axis symbol.
:param axis_width: The width of the axis symbol.
*********************************************************************
cvv. GUI for Interactive Visual Debugging of Computer Vision Programs
*********************************************************************
The module provides an interactive GUI to debug and incrementally design computer vision algorithms. The debug statements can remain in the code after development and aid in further changes because they have negligible overhead if the program is compiled in release mode.
.. toctree::
   :maxdepth: 2

   CVV API Documentation <cvv_api>
   CVV GUI Documentation <cvv_gui>
CVV : the API
*************
.. highlight:: cpp
Introduction
++++++++++++
Namespace for all functions is **cvv**, e.g. *cvv::showImage()*.
Compilation:
* For development, i.e. for the cvv GUI to show up, compile your code using cvv with *g++ -DCVVISUAL_DEBUGMODE*.
* For release, i.e. with cvv calls doing nothing, compile your code without the above flag.
See cvv tutorial for a commented example application using cvv.
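A minimal self-contained example (a sketch; the solid-color test image just gives the GUI something to show):

.. code-block:: cpp

    #include <opencv2/core.hpp>
    #include <opencv2/cvv.hpp>

    int main()
    {
        cv::Mat img(256, 256, CV_8UC3, cv::Scalar(0, 128, 255));
        cvv::showImage(img, CVVISUAL_LOCATION, "solid test image");
        cvv::finalShow(); // must be called once after all cvv calls
        return 0;
    }

Compile it with *g++ -DCVVISUAL_DEBUGMODE* for the GUI to show up.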
API Functions
+++++++++++++
showImage
---------
Add a single image to the debug GUI (similar to :imshow:`imshow <>`).
.. ocv:function:: void showImage(InputArray img, const CallMetaData& metaData, const string& description, const string& view)
:param img: Image to show in debug GUI.
:param metaData: Properly initialized CallMetaData struct, i.e. information about file, line and function name for GUI. Use CVVISUAL_LOCATION macro.
:param description: Human readable description to provide context to image.
:param view: Preselect view that will be used to visualize this image in GUI. Other views can still be selected in GUI later on.
debugFilter
-----------
Add two images to debug GUI for comparison. Usually the input and output of some filter operation, whose result should be inspected.
.. ocv:function:: void debugFilter(InputArray original, InputArray result, const CallMetaData& metaData, const string& description, const string& view)
:param original: First image for comparison, e.g. filter input.
:param result: Second image for comparison, e.g. filter output.
:param metaData: See :ocv:func:`showImage`
:param description: See :ocv:func:`showImage`
:param view: See :ocv:func:`showImage`
debugDMatch
-----------
Add a filled-in :basicstructures:`DMatch <dmatch>` to the debug GUI. The matches are visualized for interactive inspection in different GUI views (one similar to an interactive :draw_matches:`drawMatches<>`).
.. ocv:function:: void debugDMatch(InputArray img1, std::vector<cv::KeyPoint> keypoints1, InputArray img2, std::vector<cv::KeyPoint> keypoints2, std::vector<cv::DMatch> matches, const CallMetaData& metaData, const string& description, const string& view, bool useTrainDescriptor)
:param img1: First image used in :basicstructures:`DMatch <dmatch>`.
:param keypoints1: Keypoints of first image.
:param img2: Second image used in DMatch.
:param keypoints2: Keypoints of second image.
:param matches: The matches between the two sets of keypoints.
:param metaData: See :ocv:func:`showImage`
:param description: See :ocv:func:`showImage`
:param view: See :ocv:func:`showImage`
:param useTrainDescriptor: Use :basicstructures:`DMatch <dmatch>`'s train descriptor index instead of query descriptor index.
finalShow
---------
This function **must** be called *once*, *after* all cvv calls (if any).
As an alternative, create an instance of FinalShowCaller, which calls finalShow() in its destructor (RAII style).
.. ocv:function:: void finalShow()
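A sketch of the RAII alternative (assuming ``FinalShowCaller`` lives in the ``cvv`` namespace and is default-constructible, as the description above suggests):

.. code-block:: cpp

    {
        cvv::FinalShowCaller guard; // calls cvv::finalShow() when leaving the scope
        // ... cvv::showImage() / cvv::debugFilter() calls ...
    }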
setDebugFlag
------------
Enable or disable cvv for the current translation unit and thread (disabling it this way has higher, but still low, overhead compared to using the compile flags).
.. ocv:function:: void setDebugFlag(bool active)
:param active: See above
CVV : the GUI
*************
.. highlight:: cpp
Introduction
++++++++++++
For now: See cvv tutorial.
Overview
++++++++
Filter
------
Views
++++++++
Interactive Visual Debugging of Computer Vision applications {#tutorial_cvv_introduction}
============================================================
What is the most common way to debug computer vision applications? Usually the answer is temporary,
hacked together, custom code that must be removed from the code for release compilation.
In this tutorial we will show how to use the visual debugging features of the **cvv** module
(*opencv2/cvv.hpp*) instead.
Goals
-----
In this tutorial you will learn how to:
- Add cvv debug calls to your application
- Use the visual debug GUI
- Enable and disable the visual debug features during compilation (with zero runtime overhead when
disabled)
Code
----
The example code
- captures images (*videoio*), e.g. from a webcam,
- applies some filters to each image (*imgproc*),
- detects image features and matches them to the previous image (*features2d*).
If the program is compiled without visual debugging (see CMakeLists.txt below) the only result is
some information printed to the command line. We want to demonstrate how much debugging or
development functionality is added by just a few lines of *cvv* commands.
@includelineno cvv/samples/cvv_demo.cpp
@code{.cmake}
cmake_minimum_required(VERSION 2.8)
project(cvvisual_test)
SET(CMAKE_PREFIX_PATH ~/software/opencv/install)
SET(CMAKE_CXX_COMPILER "g++-4.8")
SET(CMAKE_CXX_FLAGS "-std=c++11 -O2 -pthread -Wall -Werror")
# (un)set: cmake -DCVV_DEBUG_MODE=OFF ..
OPTION(CVV_DEBUG_MODE "cvvisual-debug-mode" ON)
if(CVV_DEBUG_MODE MATCHES ON)
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DCVVISUAL_DEBUGMODE")
endif()
FIND_PACKAGE(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})
add_executable(cvvt main.cpp)
target_link_libraries(cvvt
opencv_core opencv_videoio opencv_imgproc opencv_features2d
opencv_cvv
)
@endcode
Explanation
-----------
-# We compile the program either using the above CMakeLists.txt with option *CVV_DEBUG_MODE=ON*
(*cmake -DCVV_DEBUG_MODE=ON*) or by adding the corresponding define *CVVISUAL_DEBUGMODE* to
our compiler (e.g. *g++ -DCVVISUAL_DEBUGMODE*).
-# The first cvv call simply shows the image (similar to *imshow*) with the imgIdString as comment.
@code{.cpp}
cvv::showImage(imgRead, CVVISUAL_LOCATION, imgIdString.c_str());
@endcode
The image is added to the overview tab in the visual debug GUI and the cvv call blocks.
![image](images/01_overview_single.jpg)
The image can then be selected and viewed
![image](images/02_single_image_view.jpg)
Whenever you want to continue in the code, i.e. unblock the cvv call, you can either continue
until the next cvv call (*Step*), continue until the last cvv call (*\>\>*) or run the
application until it exits (*Close*).
We decide to press the green *Step* button.
-# The next cvv calls are used to debug all kinds of filter operations, i.e. operations that take a
picture as input and return a picture as output.
@code{.cpp}
cvv::debugFilter(imgRead, imgGray, CVVISUAL_LOCATION, "to gray");
@endcode
As with every cvv call, you first end up in the overview.
![image](images/03_overview_two.jpg)
We decide not to care about the conversion to gray scale and press *Step*.
@code{.cpp}
cvv::debugFilter(imgGray, imgGraySmooth, CVVISUAL_LOCATION, "smoothed");
@endcode
If you open the filter call, you will end up in the so-called "DefaultFilterView". Both images
are shown next to each other and you can (synchronized) zoom into them.
![image](images/04_default_filter_view.jpg)
When you go to very high zoom levels, each pixel is annotated with its numeric values.
![image](images/05_default_filter_view_high_zoom.jpg)
We press *Step* twice and have a look at the dilated image.
@code{.cpp}
cvv::debugFilter(imgEdges, imgEdgesDilated, CVVISUAL_LOCATION, "dilated edges");
@endcode
The DefaultFilterView showing both images
![image](images/06_default_filter_view_edges.jpg)
Now we use the *View* selector in the top right and select the "DualFilterView". We select
"Changed Pixels" as filter and apply it (middle image).
![image](images/07_dual_filter_view_edges.jpg)
After we had a close look at these images, perhaps using different views, filters or other GUI
features, we decide to let the program run through. Therefore we press the yellow *\>\>* button.
The program will block at
@code{.cpp}
cvv::finalShow();
@endcode
and display the overview with everything that was passed to cvv in the meantime.
![image](images/08_overview_all.jpg)
-# The cvv debugDMatch call is used in a situation where there are two images each with a set of
descriptors that are matched to each other.
We pass both images, both sets of keypoints and their matching to the visual debug module.
@code{.cpp}
cvv::debugDMatch(prevImgGray, prevKeypoints, imgGray, keypoints, matches, CVVISUAL_LOCATION, allMatchIdString.c_str());
@endcode
Since we want to have a look at matches, we use the filter capabilities (*\#type match*) in the
overview to only show match calls.
![image](images/09_overview_filtered_type_match.jpg)
We want to have a closer look at one of them, e.g. to tune our parameters that use the matching.
The view has various settings how to display keypoints and matches. Furthermore, there is a
mouseover tooltip.
![image](images/10_line_match_view.jpg)
We see (visual debugging!) that there are many bad matches. We decide that only 70% of the
matches should be shown: those 70% with the lowest match distance.
![image](images/11_line_match_view_portion_selector.jpg)
Having successfully reduced the visual distraction, we want to see more clearly what changed
between the two images. We select the "TranslationMatchView", which shows where each keypoint
was matched to, in a different way.
![image](images/12_translation_match_view_portion_selector.jpg)
It is easy to see that the cup was moved to the left between the two images.
Although cvv is all about interactively *seeing* computer vision bugs, this is complemented
by a "RawView" that allows you to examine the underlying numeric data.
![image](images/13_raw_view.jpg)
-# There are many more useful features contained in the cvv GUI. For instance, one can group the
calls in the overview tab.
![image](images/14_overview_group_by_line.jpg)
Result
------
- By adding a few expressive lines to our computer vision program we can interactively debug it
through different visualizations.
- Once we are done developing/debugging we do not have to remove those lines. We simply disable
cvv debugging (*cmake -DCVV_DEBUG_MODE=OFF* or g++ without *-DCVVISUAL_DEBUGMODE*) and our
program runs without any debug overhead.
Enjoy computer vision!
HMDB: A Large Human Motion Database
===================================
.. ocv:class:: AR_hmdb
Implements loading dataset:
_`"HMDB: A Large Human Motion Database"`: http://serre-lab.clps.brown.edu/resource/hmdb-a-large-human-motion-database/
.. note:: Usage
1. From link above download dataset files: hmdb51_org.rar & test_train_splits.rar.
2. Unpack them. Unpack all archives from directory: hmdb51_org/ and remove them.
3. To load data run: ./opencv/build/bin/example_datasets_ar_hmdb -p=/home/user/path_to_unpacked_folders/
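The sample essentially performs the following (a sketch assuming the module's usual ``create()``/``load()`` pattern):

.. code-block:: cpp

    #include <opencv2/datasets/ar_hmdb.hpp>
    #include <cstdio>

    using namespace cv::datasets;

    int main()
    {
        cv::Ptr<AR_hmdb> dataset = AR_hmdb::create();
        dataset->load("/home/user/path_to_unpacked_folders/");
        printf("splits: %u\n", (unsigned int)dataset->getNumSplits());
        return 0;
    }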
Benchmark
"""""""""
For this dataset a benchmark was implemented, with accuracy 0.107407 (using the precomputed HOG/HOF "STIP" features from the site, averaged over the 3 splits).
To run this benchmark execute:
.. code-block:: bash

    ./opencv/build/bin/example_datasets_ar_hmdb_benchmark -p=/home/user/path_to_unpacked_folders/
(The precomputed features should be unpacked in the same folder: /home/user/path_to_unpacked_folders/hmdb51_org_stips/. Also unpack all archives inside hmdb51_org_stips/ and remove them.)
**References:**
.. [Kuehne11] H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre. HMDB: A Large Video Database for Human Motion Recognition. ICCV, 2011
.. [Laptev08] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning Realistic Human Actions From Movies. CVPR, 2008
Sports-1M Dataset
=================
.. ocv:class:: AR_sports
Implements loading dataset:
_`"Sports-1M Dataset"`: http://cs.stanford.edu/people/karpathy/deepvideo/
.. note:: Usage
1. From link above download dataset files (git clone https://code.google.com/p/sports-1m-dataset/).
2. To load data run: ./opencv/build/bin/example_datasets_ar_sports -p=/home/user/path_to_downloaded_folders/
**References:**
.. [KarpathyCVPR14] Andrej Karpathy and George Toderici and Sanketh Shetty and Thomas Leung and Rahul Sukthankar and Li Fei-Fei. Large-scale Video Classification with Convolutional Neural Networks. CVPR, 2014
*******************************************************
datasets. Framework for working with different datasets
*******************************************************
.. highlight:: cpp
The datasets module includes classes for working with different datasets: loading the data, evaluating different algorithms on them, benchmarks, etc.
It is planned to have:
* basic: loading code for all datasets to help start work with them.
* next stage: quick benchmarks for all datasets to show how to solve them using OpenCV and implement evaluation code.
* finally: implement on OpenCV state-of-the-art algorithms, which solve these tasks.
.. toctree::
   :hidden:

   ar_hmdb
   ar_sports
   fr_adience
   fr_lfw
   gr_chalearn
   gr_skig
   hpe_humaneva
   hpe_parse
   ir_affine
   ir_robot
   is_bsds
   is_weizmann
   msm_epfl
   msm_middlebury
   or_imagenet
   or_mnist
   or_sun
   pd_caltech
   slam_kitti
   slam_tumindoor
   tr_chars
   tr_svt
Action Recognition
------------------
:doc:`ar_hmdb` [#f1]_
:doc:`ar_sports`
Face Recognition
----------------
:doc:`fr_adience`
:doc:`fr_lfw` [#f1]_
Gesture Recognition
-------------------
:doc:`gr_chalearn`
:doc:`gr_skig`
Human Pose Estimation
---------------------
:doc:`hpe_humaneva`
:doc:`hpe_parse`
Image Registration
------------------
:doc:`ir_affine`
:doc:`ir_robot`
Image Segmentation
------------------
:doc:`is_bsds`
:doc:`is_weizmann`
Multiview Stereo Matching
-------------------------
:doc:`msm_epfl`
:doc:`msm_middlebury`
Object Recognition
------------------
:doc:`or_imagenet`
:doc:`or_mnist` [#f2]_
:doc:`or_sun`
Pedestrian Detection
--------------------
:doc:`pd_caltech` [#f2]_
SLAM
----
:doc:`slam_kitti`
:doc:`slam_tumindoor`
Text Recognition
----------------
:doc:`tr_chars`
:doc:`tr_svt` [#f1]_
*Footnotes*
.. [#f1] Benchmark implemented
.. [#f2] Not used in Vision Challenge
Adience
=======
.. ocv:class:: FR_adience
Implements loading dataset:
_`"Adience"`: http://www.openu.ac.il/home/hassner/Adience/data.html
.. note:: Usage
1. From link above download any dataset file: faces.tar.gz\\aligned.tar.gz and files with splits: fold_0_data.txt-fold_4_data.txt, fold_frontal_0_data.txt-fold_frontal_4_data.txt. (For the face recognition task, different splits should be created.)
2. Unpack dataset file to some folder and place split files into the same folder.
3. To load data run: ./opencv/build/bin/example_datasets_fr_adience -p=/home/user/path_to_created_folder/
**References:**
.. [Eidinger] E. Eidinger, R. Enbar, and T. Hassner. Age and Gender Estimation of Unfiltered Faces
Labeled Faces in the Wild
=========================
.. ocv:class:: FR_lfw
Implements loading dataset:
_`"Labeled Faces in the Wild"`: http://vis-www.cs.umass.edu/lfw/
.. note:: Usage
1. From link above download any dataset file: lfw.tgz\\lfwa.tar.gz\\lfw-deepfunneled.tgz\\lfw-funneled.tgz and files with pairs: 10 test splits: pairs.txt and developer train split: pairsDevTrain.txt.
2. Unpack dataset file and place pairs.txt and pairsDevTrain.txt in created folder.
3. To load data run: ./opencv/build/bin/example_datasets_fr_lfw -p=/home/user/path_to_unpacked_folder/lfw2/
Benchmark
"""""""""
For this dataset a benchmark was implemented, with accuracy 0.623833 +- 0.005223 (train split: pairsDevTrain.txt, dataset: lfwa).
To run this benchmark execute:
.. code-block:: bash

    ./opencv/build/bin/example_datasets_fr_lfw_benchmark -p=/home/user/path_to_unpacked_folder/lfw2/
**References:**
.. [Huang07] G.B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. 2007
ChaLearn Looking at People
==========================
.. ocv:class:: GR_chalearn
Implements loading dataset:
_`"ChaLearn Looking at People"`: http://gesture.chalearn.org/
.. note:: Usage
1. Follow the instructions from the site above and download the files for dataset "Track 3: Gesture Recognition": Train1.zip-Train5.zip, Validation1.zip-Validation3.zip (register on www.codalab.org and accept the terms and conditions of the competition: https://www.codalab.org/competitions/991#learn_the_details . There are three mirrors for downloading the dataset files; when I downloaded the data, only the "Universitat Oberta de Catalunya" mirror worked).
2. Unpack train archives Train1.zip-Train5.zip to folder Train/, validation archives Validation1.zip-Validation3.zip to folder Validation/
3. Unpack all archives in Train/ & Validation/ in the folders with the same names, for example: Sample0001.zip to Sample0001/
4. To load data run: ./opencv/build/bin/example_datasets_gr_chalearn -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Escalera14] S. Escalera, X. Baró, J. Gonzàlez, M.A. Bautista, M. Madadi, M. Reyes, V. Ponce-López, H.J. Escalante, J. Shotton, I. Guyon, "ChaLearn Looking at People Challenge 2014: Dataset and Results", ECCV Workshops, 2014
Sheffield Kinect Gesture Dataset
================================
.. ocv:class:: GR_skig
Implements loading dataset:
_`"Sheffield Kinect Gesture Dataset"`: http://lshao.staff.shef.ac.uk/data/SheffieldKinectGesture.htm
.. note:: Usage
1. From link above download dataset files: subject1_dep.7z-subject6_dep.7z, subject1_rgb.7z-subject6_rgb.7z.
2. Unpack them.
3. To load data run: ./opencv/build/bin/example_datasets_gr_skig -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Liu13] L. Liu and L. Shao, “Learning Discriminative Representations from RGB-D Video Data”, In Proc. International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013.
HumanEva Dataset
================
.. ocv:class:: HPE_humaneva
Implements loading dataset:
_`"HumanEva Dataset"`: http://humaneva.is.tue.mpg.de
.. note:: Usage
1. From link above download dataset files for HumanEva-I (tar) & HumanEva-II.
2. Unpack them to HumanEva_1 & HumanEva_2 accordingly.
3. To load data run: ./opencv/build/bin/example_datasets_hpe_humaneva -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Sigal10] L. Sigal, A. Balan and M. J. Black. HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion, In International Journal of Computer Vision, Vol. 87 (1-2), 2010
.. [Sigal06] L. Sigal and M. J. Black. HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion, Technical Report CS-06-08, Brown University, 2006
PARSE Dataset
=============
.. ocv:class:: HPE_parse
Implements loading dataset:
_`"PARSE Dataset"`: http://www.ics.uci.edu/~dramanan/papers/parse/
.. note:: Usage
1. From link above download dataset file: people.zip.
2. Unpack it.
3. To load data run: ./opencv/build/bin/example_datasets_hpe_parse -p=/home/user/path_to_unpacked_folder/people_all/
**References:**
.. [Ramanan06] D. Ramanan "Learning to Parse Images of Articulated Bodies." Neural Info. Proc. Systems (NIPS) To appear. Dec 2006.
Affine Covariant Regions Datasets
=================================
.. ocv:class:: IR_affine
Implements loading dataset:
_`"Affine Covariant Regions Datasets"`: http://www.robots.ox.ac.uk/~vgg/data/data-aff.html
.. note:: Usage
1. From link above download dataset files: bark\\bikes\\boat\\graf\\leuven\\trees\\ubc\\wall.tar.gz.
2. Unpack them.
3. To load data, for example, for "bark", run: ./opencv/build/bin/example_datasets_ir_affine -p=/home/user/path_to_unpacked_folder/bark/
**References:**
.. [Mikolajczyk05] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, L. Van Gool. A Comparison of Affine Region Detectors. International Journal of Computer Vision, Volume 65, Number 1/2, page 43--72, 2005
Robot Data Set
==============
.. ocv:class:: IR_robot
Implements loading dataset:
_`"Robot Data Set, Point Feature Data Set – 2010"`: http://roboimagedata.compute.dtu.dk/?page_id=24
.. note:: Usage
1. From link above download dataset files: SET001_6.tar.gz-SET055_60.tar.gz
2. Unpack them to one folder.
3. To load data run: ./opencv/build/bin/example_datasets_ir_robot -p=/home/user/path_to_unpacked_folder/
**References:**
.. [aanæsinteresting] Aan{\ae}s, H. and Dahl, A.L. and Steenstrup Pedersen, K. Interesting Interest Points. International Journal of Computer Vision. 2012.
The Berkeley Segmentation Dataset and Benchmark
===============================================
.. ocv:class:: IS_bsds
Implements loading dataset:
_`"The Berkeley Segmentation Dataset and Benchmark"`: https://www.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
.. note:: Usage
1. From link above download dataset files: BSDS300-human.tgz & BSDS300-images.tgz.
2. Unpack them.
3. To load data run: ./opencv/build/bin/example_datasets_is_bsds -p=/home/user/path_to_unpacked_folder/BSDS300/
**References:**
.. [MartinFTM01] D. Martin and C. Fowlkes and D. Tal and J. Malik. A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics. 2001
Weizmann Segmentation Evaluation Database
=========================================
.. ocv:class:: IS_weizmann
Implements loading dataset:
_`"Weizmann Segmentation Evaluation Database"`: http://www.wisdom.weizmann.ac.il/~vision/Seg_Evaluation_DB/
.. note:: Usage
1. From link above download dataset files: Weizmann_Seg_DB_1obj.ZIP & Weizmann_Seg_DB_2obj.ZIP.
2. Unpack them.
3. To load data, for example, for 1 object dataset, run: ./opencv/build/bin/example_datasets_is_weizmann -p=/home/user/path_to_unpacked_folder/1obj/
**References:**
.. [AlpertGBB07] Sharon Alpert and Meirav Galun and Ronen Basri and Achi Brandt. Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration. 2007
EPFL Multi-View Stereo
======================
.. ocv:class:: MSM_epfl
Implements loading dataset:
_`"EPFL Multi-View Stereo"`: http://cvlabwww.epfl.ch/~strecha/multiview/denseMVS.html
.. note:: Usage
1. From link above download dataset files: castle_dense\\castle_dense_large\\castle_entry\\fountain\\herzjesu_dense\\herzjesu_dense_large_bounding\\cameras\\images\\p.tar.gz.
2. Unpack them in separate folder for each object. For example, for "fountain", in folder fountain/ : fountain_dense_bounding.tar.gz -> bounding/, fountain_dense_cameras.tar.gz -> camera/, fountain_dense_images.tar.gz -> png/, fountain_dense_p.tar.gz -> P/
3. To load data, for example, for "fountain", run: ./opencv/build/bin/example_datasets_msm_epfl -p=/home/user/path_to_unpacked_folder/fountain/
**References:**
.. [Strecha08] C. Strecha, W. von Hansen, L. Van Gool, P. Fua, U. Thoennessen. On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery. CVPR, 2008
Stereo – Middlebury Computer Vision
===================================
.. ocv:class:: MSM_middlebury
Implements loading dataset:
_`"Stereo – Middlebury Computer Vision"`: http://vision.middlebury.edu/mview/
.. note:: Usage
1. From link above download dataset files: dino\\dinoRing\\dinoSparseRing\\temple\\templeRing\\templeSparseRing.zip
2. Unpack them.
3. To load data, for example "temple" dataset, run: ./opencv/build/bin/example_datasets_msm_middlebury -p=/home/user/path_to_unpacked_folder/temple/
**References:**
.. [Seitz06] S. M. Seitz, B. Curless, J. Diebel, D. Scharstein, R. Szeliski. A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, CVPR, 2006
ImageNet
========
.. ocv:class:: OR_imagenet
Implements loading dataset:
_`"ImageNet"`: http://www.image-net.org/
.. note:: Usage
1. From link above download dataset files: ILSVRC2010_images_train.tar\\ILSVRC2010_images_test.tar\\ILSVRC2010_images_val.tar & devkit: ILSVRC2010_devkit-1.0.tar.gz (loading of the 2010 dataset is implemented, as only this dataset has ground truth for the test data; the structure of ILSVRC2014 is similar).
2. Unpack them to: some_folder/train/\\some_folder/test/\\some_folder/val & some_folder/ILSVRC2010_validation_ground_truth.txt\\some_folder/ILSVRC2010_test_ground_truth.txt.
3. Create file with labels: some_folder/labels.txt, for example, using :ref:`python script <python-script>` below (each file's row format: synset,labelID,description. For example: "n07751451,18,plum").
4. Unpack all tar files in train.
5. To load data run: ./opencv/build/bin/example_datasets_or_imagenet -p=/home/user/some_folder/
.. _python-script:
Python script to parse meta.mat:
::
    import scipy.io

    meta_mat = scipy.io.loadmat("devkit-1.0/data/meta.mat")

    labels_dic = dict((m[0][1][0], m[0][0][0][0]-1) for m in meta_mat['synsets'])
    label_names_dic = dict((m[0][1][0], m[0][2][0]) for m in meta_mat['synsets'])

    for label in labels_dic.keys():
        print "{0},{1},{2}".format(label, labels_dic[label], label_names_dic[label])
**References:**
.. [ILSVRCarxiv14] Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. 2014
MNIST
=====
.. ocv:class:: OR_mnist
Implements loading dataset:
_`"MNIST"`: http://yann.lecun.com/exdb/mnist/
.. note:: Usage
1. From link above download dataset files: t10k-images-idx3-ubyte.gz, t10k-labels-idx1-ubyte.gz, train-images-idx3-ubyte.gz, train-labels-idx1-ubyte.gz.
2. Unpack them.
3. To load data run: ./opencv/build/bin/example_datasets_or_mnist -p=/home/user/path_to_unpacked_files/
**References:**
.. [LeCun98a] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
SUN Database
============
.. ocv:class:: OR_sun
Implements loading dataset:
_`"SUN Database, Scene Recognition Benchmark. SUN397"`: http://vision.cs.princeton.edu/projects/2010/SUN/
.. note:: Usage
1. From link above download dataset file: SUN397.tar & file with splits: Partitions.zip
2. Unpack SUN397.tar into folder: SUN397/ & Partitions.zip into folder: SUN397/Partitions/
3. To load data run: ./opencv/build/bin/example_datasets_or_sun -p=/home/user/path_to_unpacked_files/SUN397/
**References:**
.. [Xiao10] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN Database: Large-scale Scene Recognition from Abbey to Zoo. IEEE Conference on Computer Vision and Pattern Recognition. CVPR, 2010
.. [Xiao14] J. Xiao, K. A. Ehinger, J. Hays, A. Torralba, and A. Oliva. SUN Database: Exploring a Large Collection of Scene Categories. International Journal of Computer Vision. IJCV, 2014
Caltech Pedestrian Detection Benchmark
======================================
.. ocv:class:: PD_caltech
Implements loading dataset:
_`"Caltech Pedestrian Detection Benchmark"`: http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/
.. note:: First version of Caltech Pedestrian dataset loading.
The code to unpack all frames from the seq files is commented out, as their number is huge!
So currently only meta information is loaded, without the data.
Ground truth is not processed either, as it first needs to be converted from the mat files.
.. note:: Usage
1. From link above download dataset files: set00.tar-set10.tar.
2. Unpack them to separate folder.
3. To load data run: ./opencv/build/bin/example_datasets_pd_caltech -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Dollár12] P. Dollár, C. Wojek, B. Schiele and P. Perona. Pedestrian Detection: An Evaluation of the State of the Art. PAMI, 2012.
.. [DollárCVPR09] P. Dollár, C. Wojek, B. Schiele and P. Perona. Pedestrian Detection: A Benchmark. CVPR, 2009
KITTI Vision Benchmark
======================
.. ocv:class:: SLAM_kitti
Implements loading dataset:
_`"KITTI Vision Benchmark"`: http://www.cvlibs.net/datasets/kitti/eval_odometry.php
.. note:: Usage
1. From link above download "Odometry" dataset files: data_odometry_gray\\data_odometry_color\\data_odometry_velodyne\\data_odometry_poses\\data_odometry_calib.zip.
2. Unpack data_odometry_poses.zip, it creates folder dataset/poses/. After that unpack data_odometry_gray.zip, data_odometry_color.zip, data_odometry_velodyne.zip. Folder dataset/sequences/ will be created with folders 00/..21/. Each of these folders will contain: image_0/, image_1/, image_2/, image_3/, velodyne/ and files calib.txt & times.txt. These two last files will be replaced after unpacking data_odometry_calib.zip at the end.
3. To load data run: ./opencv/build/bin/example_datasets_slam_kitti -p=/home/user/path_to_unpacked_folder/dataset/
**References:**
.. [Geiger2012CVPR] Andreas Geiger and Philip Lenz and Raquel Urtasun. Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. CVPR, 2012
.. [Geiger2013IJRR] Andreas Geiger and Philip Lenz and Christoph Stiller and Raquel Urtasun. Vision meets Robotics: The KITTI Dataset. IJRR, 2013
.. [Fritsch2013ITSC] Jannik Fritsch and Tobias Kuehnl and Andreas Geiger. A New Performance Measure and Evaluation Benchmark for Road Detection Algorithms. ITSC, 2013
TUMindoor Dataset
=================
.. ocv:class:: SLAM_tumindoor
Implements loading dataset:
_`"TUMindoor Dataset"`: http://www.navvis.lmt.ei.tum.de/dataset/
.. note:: Usage
1. From link above download dataset files: dslr\\info\\ladybug\\pointcloud.tar.bz2 for each dataset: 11-11-28 (1st floor)\\11-12-13 (1st floor N1)\\11-12-17a (4th floor)\\11-12-17b (3rd floor)\\11-12-17c (Ground I)\\11-12-18a (Ground II)\\11-12-18b (2nd floor)
2. Unpack them in separate folder for each dataset. dslr.tar.bz2 -> dslr/, info.tar.bz2 -> info/, ladybug.tar.bz2 -> ladybug/, pointcloud.tar.bz2 -> pointcloud/.
3. To load each dataset run: ./opencv/build/bin/example_datasets_slam_tumindoor -p=/home/user/path_to_unpacked_folders/
**References:**
.. [TUMindoor] R. Huitl and G. Schroth and S. Hilsenbeck and F. Schweiger and E. Steinbach. {TUM}indoor: An Extensive Image and Point Cloud Dataset for Visual Indoor Localization and Mapping. 2012
The Chars74K Dataset
====================
.. ocv:class:: TR_chars
Implements loading dataset:
_`"The Chars74K Dataset"`: http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
.. note:: Usage
1. From link above download dataset files: EnglishFnt\\EnglishHnd\\EnglishImg\\KannadaHnd\\KannadaImg.tgz, ListsTXT.tgz.
2. Unpack them.
3. Move .m files from folder ListsTXT/ to appropriate folder. For example, English/list_English_Img.m for EnglishImg.tgz.
4. To load data, for example "EnglishImg", run: ./opencv/build/bin/example_datasets_tr_chars -p=/home/user/path_to_unpacked_folder/English/
**References:**
.. [Campos09] T. E. de Campos, B. R. Babu and M. Varma. Character recognition in natural images. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP), 2009
The Street View Text Dataset
============================
.. ocv:class:: TR_svt
Implements loading dataset:
_`"The Street View Text Dataset"`: http://vision.ucsd.edu/~kai/svt/
.. note:: Usage
1. From link above download dataset file: svt.zip.
2. Unpack it.
3. To load data run: ./opencv/build/bin/example_datasets_tr_svt -p=/home/user/path_to_unpacked_folder/svt/svt1/
Benchmark
"""""""""
For this dataset a benchmark was implemented, with accuracy (mean f1) 0.217.
To run benchmark execute:
.. code-block:: bash

    ./opencv/build/bin/example_datasets_tr_svt_benchmark -p=/home/user/path_to_unpacked_folders/svt/svt1/
**References:**
.. [Wang11] Kai Wang, Boris Babenko and Serge Belongie. End-to-end Scene Text Recognition. ICCV, 2011
.. [Wang10] Kai Wang and Serge Belongie. Word Spotting in the Wild. ECCV, 2010
***************************************
face. Face Recognition
***************************************
The module contains some recently added functionality that has not been stabilized, or functionality that is considered optional.
.. toctree::
   :maxdepth: 2

   FaceRecognizer Documentation <index>
Changelog
=========
Release 0.05
------------
This library is now included in the official OpenCV distribution (from 2.4 on).
The :ocv:class:`FaceRecognizer` is now an :ocv:class:`Algorithm`, which better fits into the overall
OpenCV API.
To reduce confusion on the user side and to minimize my work, libfacerec and OpenCV
have been synchronized and are now based on the same interfaces and implementation.
The library now has an extensive documentation:
* The API is explained in detail and with a lot of code examples.
* The face recognition guide I had written for Python and GNU Octave/MATLAB has been adapted to the new OpenCV C++ ``cv::FaceRecognizer``.
* A tutorial for gender classification with Fisherfaces.
* A tutorial for face recognition in videos (e.g. webcam).
Release highlights
++++++++++++++++++
* There is no single highlight to pick from; this release is a highlight in itself.
Release 0.04
------------
This version is fully Windows-compatible and works with OpenCV 2.3.1. Several
bugfixes, but none influenced the recognition rate.
Release highlights
++++++++++++++++++
* A whole lot of exceptions with meaningful error messages.
* A tutorial for Windows users: `http://bytefish.de/blog/opencv_visual_studio_and_libfacerec <http://bytefish.de/blog/opencv_visual_studio_and_libfacerec>`_
Release 0.03
------------
Reworked the library to provide separate implementations in cpp files, because
it's the preferred way of contributing OpenCV libraries. This means the library
is not header-only anymore. Slight API changes were done, please see the
documentation for details.
Release highlights
++++++++++++++++++
* New Unit Tests (for LBP Histograms) make the library more robust.
* Added more documentation.
Release 0.02
------------
Reworked the library to provide separate implementations in cpp files, because
it's the preferred way of contributing OpenCV libraries. This means the library
is not header-only anymore. Slight API changes were done, please see the
documentation for details.
Release highlights
++++++++++++++++++
* New Unit Tests (for LBP Histograms) make the library more robust.
* Added a documentation and changelog in reStructuredText.
Release 0.01
------------
Initial release as header-only library.
Release highlights
++++++++++++++++++
* Colormaps for OpenCV to enhance the visualization.
* Face Recognition algorithms implemented:
* Eigenfaces [TP91]_
* Fisherfaces [BHK97]_
* Local Binary Patterns Histograms [AHP04]_
* Added persistence facilities to store the models with a common API.
* Unit Tests (using `gtest <http://code.google.com/p/googletest/>`_).
* Providing a CMakeLists.txt to enable easy cross-platform building.
FaceRecognizer - Face Recognition with OpenCV
##############################################
OpenCV 2.4 now comes with the very new :ocv:class:`FaceRecognizer` class for face recognition. This documentation explains :doc:`the API <facerec_api>` in detail and gives you a lot of help to get started (with full source code examples). :doc:`Face Recognition with OpenCV <facerec_tutorial>` is the definitive guide to the new :ocv:class:`FaceRecognizer`. There's also a :doc:`tutorial on gender classification <tutorial/facerec_gender_classification>`, a :doc:`tutorial on face recognition in videos <tutorial/facerec_video_recognition>`, and a guide on :doc:`how to load & save your results <tutorial/facerec_save_load>`.
These documents are the help I wished for when I was working my way into face recognition. I hope you also find the new :ocv:class:`FaceRecognizer` a useful addition to OpenCV.
Please issue any feature requests and/or bugs on the official OpenCV bug tracker at:
* http://code.opencv.org/projects/opencv/issues
Contents
========
.. toctree::
   :maxdepth: 1

   FaceRecognizer API <facerec_api>
   Guide to Face Recognition with OpenCV <facerec_tutorial>
   Tutorial on Gender Classification <tutorial/facerec_gender_classification>
   Tutorial on Face Recognition in Videos <tutorial/facerec_video_recognition>
   Tutorial on Saving & Loading a FaceRecognizer <tutorial/facerec_save_load>
   Changelog <facerec_changelog>
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
Saving and Loading a FaceRecognizer
===================================
Introduction
------------
Saving and loading a :ocv:class:`FaceRecognizer` is very important. Training a FaceRecognizer can be a very time-intensive task, and it's often impossible to ship the whole face database to the user of your product. Fortunately the task itself is easy: you only have to call :ocv:func:`FaceRecognizer::save` for saving and :ocv:func:`FaceRecognizer::load` for loading a :ocv:class:`FaceRecognizer`.
I'll adapt the Eigenfaces example from the :doc:`../facerec_tutorial`: imagine we want to learn the Eigenfaces of the `AT&T Facedatabase <http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html>`_, store the model to a YAML file and then load it again.
From the loaded model, we'll get a prediction, show the mean, the Eigenfaces and the image reconstruction.
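In a nutshell, the round trip looks like the minimal sketch below (assuming OpenCV 2.4's ``contrib`` module; the two synthetic images and the 100x100 size are placeholders so the sketch runs end-to-end, replace them with real face data):

.. code-block:: cpp

   // Minimal sketch: train a FaceRecognizer, save it to YAML, load it back.
   #include "opencv2/core/core.hpp"
   #include "opencv2/contrib/contrib.hpp"

   #include <vector>

   using namespace cv;

   int main() {
       // Placeholder data; use real, equally sized grayscale face images.
       std::vector<Mat> images;
       std::vector<int> labels;
       images.push_back(Mat::zeros(100, 100, CV_8UC1));       labels.push_back(0);
       images.push_back(Mat::ones(100, 100, CV_8UC1) * 255);  labels.push_back(1);

       // Train and serialize the complete model state to a YAML file.
       Ptr<FaceRecognizer> model = createEigenFaceRecognizer();
       model->train(images, labels);
       model->save("eigenfaces_at.yml");

       // Restore the model state in a fresh object and query it.
       Ptr<FaceRecognizer> loaded = createEigenFaceRecognizer();
       loaded->load("eigenfaces_at.yml");
       int predicted = loaded->predict(images[0]);
       return predicted == 0 ? 0 : 1;
   }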
Using FaceRecognizer::save and FaceRecognizer::load
-----------------------------------------------------
The source code for this demo application is also available in the ``src`` folder coming with this documentation:
* :download:`src/facerec_save_load.cpp <../src/facerec_save_load.cpp>`
.. literalinclude:: ../src/facerec_save_load.cpp
   :language: cpp
   :linenos:
Results
-------
``eigenfaces_at.yml`` then contains the model state; we'll simply look at the first 10 lines with ``head eigenfaces_at.yml``:

.. code-block:: none

   philipp@mango:~/github/libfacerec-build$ head eigenfaces_at.yml
   %YAML:1.0
   num_components: 399
   mean: !!opencv-matrix
      rows: 1
      cols: 10304
      dt: d
      data: [ 8.5558897243107765e+01, 8.5511278195488714e+01,
          8.5854636591478695e+01, 8.5796992481203006e+01,
          8.5952380952380949e+01, 8.6162907268170414e+01,
          8.6082706766917283e+01, 8.5776942355889716e+01,
And here is the reconstruction, which is the same as the original:

.. image:: ../img/eigenface_reconstruction_opencv.png
   :align: center
Face Recognition in Videos with OpenCV
=======================================
.. contents:: Table of Contents
   :depth: 3
Introduction
------------
Whenever you hear the term *face recognition*, you instantly think of surveillance in videos. So performing face recognition in videos (e.g. from a webcam) is one of the most requested features I have got. I have heard your cries, so here it is: an application that shows you how to do face recognition in videos! For the face detection part we'll use the awesome :ocv:class:`CascadeClassifier`, and we'll use :ocv:class:`FaceRecognizer` for face recognition. This example uses the Fisherfaces method for face recognition, because it is robust against large changes in illumination.
Here is what the final application looks like. As you can see, I am only writing the id of the recognized person above the detected face (by the way, this id is Arnold Schwarzenegger for my data set):

.. image:: ../img/tutorial/facerec_video/facerec_video.png
   :align: center
   :scale: 70%
This demo is a basis for your research and it shows you how to implement face recognition in videos. You probably want to extend the application and make it more sophisticated: you could combine the id with a name, show the confidence of the prediction, recognize the emotion... and so on. But before you send mails asking what this Haar-Cascade thing is or what a CSV is: make sure you have read the entire tutorial. It's all explained in here. If you just want to scroll down to the code, please note:
* The available Haar-Cascades for face detection are located in the ``data`` folder of your OpenCV installation! One of the available Haar-Cascades for face detection is, for example, ``/path/to/opencv/data/haarcascades/haarcascade_frontalface_default.xml``.
I encourage you to experiment with the application. Play around with the available :ocv:class:`FaceRecognizer` implementations, try the available cascades in OpenCV and see if you can improve your results!
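If you want to see the moving parts before the full demo, here is a stripped-down sketch of the core detect-then-predict loop (assuming OpenCV 2.4; the cascade filename, the ``fisherfaces_at.yml`` model file and the 100x100 training size are placeholder assumptions for your own setup):

.. code-block:: cpp

   // Sketch: detect faces with a CascadeClassifier, then classify each
   // detection with a FaceRecognizer trained and saved beforehand.
   #include "opencv2/contrib/contrib.hpp"
   #include "opencv2/core/core.hpp"
   #include "opencv2/highgui/highgui.hpp"
   #include "opencv2/imgproc/imgproc.hpp"
   #include "opencv2/objdetect/objdetect.hpp"

   #include <vector>

   using namespace cv;

   int main() {
       CascadeClassifier haar_cascade("haarcascade_frontalface_default.xml");
       Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
       model->load("fisherfaces_at.yml"); // a previously trained, saved model

       VideoCapture cap(0); // device id 0; adjust for your webcam
       Mat frame, gray;
       while (cap.read(frame)) {
           cvtColor(frame, gray, CV_BGR2GRAY);
           std::vector<Rect> faces;
           haar_cascade.detectMultiScale(gray, faces);
           for (size_t i = 0; i < faces.size(); i++) {
               // Fisherfaces needs inputs of the training size, so resize.
               Mat face_resized;
               resize(gray(faces[i]), face_resized, Size(100, 100));
               int prediction = model->predict(face_resized);
               rectangle(frame, faces[i], CV_RGB(0, 255, 0), 1);
               putText(frame, format("Prediction = %d", prediction),
                       Point(faces[i].x, faces[i].y - 10),
                       FONT_HERSHEY_PLAIN, 1.0, CV_RGB(0, 255, 0), 2);
           }
           imshow("face_recognizer", frame);
           if ((char)waitKey(20) == 27) break; // ESC ends the demo
       }
       return 0;
   }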
Prerequisites
--------------
You want to do face recognition, so you need some face images to learn a :ocv:class:`FaceRecognizer` on. I have decided to reuse the images from the gender classification example: :doc:`facerec_gender_classification`.
I have the following celebrities in my training data set:
* Angelina Jolie
* Arnold Schwarzenegger
* Brad Pitt
* George Clooney
* Johnny Depp
* Justin Timberlake
* Katy Perry
* Keanu Reeves
* Patrick Stewart
* Tom Cruise
In the demo I have decided to read the images from a very simple CSV file. Why? Because it's the simplest platform-independent approach I can think of. However, if you know of a simpler solution, please ping me about it. Basically, all the CSV file needs to contain are lines composed of a ``filename``, followed by a ``;``, followed by the ``label`` (as an *integer number*), making up a line like this:

.. code-block:: none

   /path/to/image.ext;0

Let's dissect the line. ``/path/to/image.ext`` is the path to an image, probably something like this if you are on Windows: ``C:/faces/person0/image0.jpg``. Then there is the separator ``;``, and finally we assign the label ``0`` to the image. Think of the label as the subject (the person, the gender or whatever comes to your mind). In the face recognition scenario, the label is the person this image belongs to. In the gender classification scenario, the label is the gender of the person. So my CSV file looks like this:
.. code-block:: none

   /home/philipp/facerec/data/c/keanu_reeves/keanu_reeves_01.jpg;0
   /home/philipp/facerec/data/c/keanu_reeves/keanu_reeves_02.jpg;0
   /home/philipp/facerec/data/c/keanu_reeves/keanu_reeves_03.jpg;0
   ...
   /home/philipp/facerec/data/c/katy_perry/katy_perry_01.jpg;1
   /home/philipp/facerec/data/c/katy_perry/katy_perry_02.jpg;1
   /home/philipp/facerec/data/c/katy_perry/katy_perry_03.jpg;1
   ...
   /home/philipp/facerec/data/c/brad_pitt/brad_pitt_01.jpg;2
   /home/philipp/facerec/data/c/brad_pitt/brad_pitt_02.jpg;2
   /home/philipp/facerec/data/c/brad_pitt/brad_pitt_03.jpg;2
   ...
   /home/philipp/facerec/data/c1/crop_arnold_schwarzenegger/crop_08.jpg;6
   /home/philipp/facerec/data/c1/crop_arnold_schwarzenegger/crop_05.jpg;6
   /home/philipp/facerec/data/c1/crop_arnold_schwarzenegger/crop_02.jpg;6
   /home/philipp/facerec/data/c1/crop_arnold_schwarzenegger/crop_03.jpg;6
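For completeness, here is a sketch of how such a file can be parsed in C++; the demo sources ship a helper along these lines (the name ``read_csv`` and the ``celebrities.txt`` path here are illustrative, not mandated by anything):

.. code-block:: cpp

   // Sketch: parse "<filename>;<label>" lines into images and labels.
   #include "opencv2/core/core.hpp"
   #include "opencv2/highgui/highgui.hpp"

   #include <cstdlib>
   #include <fstream>
   #include <sstream>
   #include <string>
   #include <vector>

   using namespace cv;

   static void read_csv(const std::string& filename,
                        std::vector<Mat>& images,
                        std::vector<int>& labels) {
       std::ifstream file(filename.c_str());
       std::string line, path, classlabel;
       while (std::getline(file, line)) {
           std::stringstream liness(line);
           std::getline(liness, path, ';');   // everything before the ';'
           std::getline(liness, classlabel);  // the integer label after it
           if (!path.empty() && !classlabel.empty()) {
               images.push_back(imread(path, CV_LOAD_IMAGE_GRAYSCALE));
               labels.push_back(atoi(classlabel.c_str()));
           }
       }
   }

   int main() {
       std::vector<Mat> images;
       std::vector<int> labels;
       read_csv("celebrities.txt", images, labels); // illustrative path
       return images.empty() ? 1 : 0;
   }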
All images for this example were chosen to have a frontal face perspective. They have been cropped, scaled and rotated to be aligned at the eyes, just like this set of George Clooney images:
.. image:: ../img/tutorial/gender_classification/clooney_set.png
   :align: center
Face Recognition from Videos
----------------------------
The source code for the demo is available in the ``src`` folder coming with this documentation:
* :download:`src/facerec_video.cpp <../src/facerec_video.cpp>`
This demo uses the :ocv:class:`CascadeClassifier`:
.. literalinclude:: ../src/facerec_video.cpp
   :language: cpp
   :linenos:
Running the Demo
----------------
You'll need:
* The path to a valid Haar-Cascade for detecting a face with a :ocv:class:`CascadeClassifier`.
* The path to a valid CSV file for learning a :ocv:class:`FaceRecognizer`.
* A webcam and its device id (you don't know the device id? Simply start at 0 and see what happens).
If you are on Windows, then simply start the demo by running (from the command line):

.. code-block:: none

   facerec_video.exe <C:/path/to/your/haar_cascade.xml> <C:/path/to/your/csv.ext> <video device>

If you are on Linux, then simply start the demo by running:

.. code-block:: none

   ./facerec_video </path/to/your/haar_cascade.xml> </path/to/your/csv.ext> <video device>

An example: if the Haar-Cascade is at ``C:/opencv/data/haarcascades/haarcascade_frontalface_default.xml``, the CSV file is at ``C:/facerec/data/celebrities.txt`` and I have a webcam with device id ``1``, then I would call the demo with:

.. code-block:: none

   facerec_video.exe C:/opencv/data/haarcascades/haarcascade_frontalface_default.xml C:/facerec/data/celebrities.txt 1
That's it.
Results
-------
Enjoy!
Appendix
--------
Creating the CSV File
+++++++++++++++++++++
You don't really want to create the CSV file by hand. I have prepared a little Python script ``create_csv.py`` (you'll find it at ``/src/create_csv.py``, coming with this tutorial) that automatically creates a CSV file for you. If you have your images in a hierarchy like this (``/basepath/<subject>/<image.ext>``):
.. code-block:: none

   philipp@mango:~/facerec/data/at$ tree
   .
   |-- s1
   |   |-- 1.pgm
   |   |-- ...
   |   |-- 10.pgm
   |-- s2
   |   |-- 1.pgm
   |   |-- ...
   |   |-- 10.pgm
   ...
   |-- s40
   |   |-- 1.pgm
   |   |-- ...
   |   |-- 10.pgm
Then simply call ``create_csv.py`` with the path to the folder, just like this, and you can redirect the output into a CSV file:
.. code-block:: none

   philipp@mango:~/facerec/data$ python create_csv.py
   at/s13/2.pgm;0
   at/s13/7.pgm;0
   at/s13/6.pgm;0
   at/s13/9.pgm;0
   at/s13/5.pgm;0
   at/s13/3.pgm;0
   at/s13/4.pgm;0
   at/s13/10.pgm;0
   at/s13/8.pgm;0
   at/s13/1.pgm;0
   at/s17/2.pgm;1
   at/s17/7.pgm;1
   at/s17/6.pgm;1
   at/s17/9.pgm;1
   at/s17/5.pgm;1
   at/s17/3.pgm;1
   [...]
Here is the script, if you can't find it:
.. literalinclude:: ../src/create_csv.py
   :language: python
   :linenos:
Aligning Face Images
++++++++++++++++++++
An accurate alignment of your image data is especially important in tasks like emotion detection, where you need as much detail as possible. Believe me... you don't want to do this by hand. So I've prepared a tiny Python script for you. The code is really easy to use. To scale, rotate and crop the face image you just need to call *CropFace(image, eye_left, eye_right, offset_pct, dest_sz)*, where:
* *eye_left* is the position of the left eye
* *eye_right* is the position of the right eye
* *offset_pct* is the percentage of the image you want to keep next to the eyes (horizontal, vertical direction)
* *dest_sz* is the size of the output image
If you use the same *offset_pct* and *dest_sz* for all of your images, they are all aligned at the eyes.
.. literalinclude:: ../src/crop_face.py
   :language: python
   :linenos:
Imagine we are given `this photo of Arnold Schwarzenegger <http://en.wikipedia.org/wiki/File:Arnold_Schwarzenegger_edit%28ws%29.jpg>`_, which is under a Public Domain license. The (x,y)-position of the eyes is approximately *(252,364)* for the left and *(420,366)* for the right eye. Now you only need to define the horizontal offset, vertical offset and the size your scaled, rotated & cropped face should have.
Here are some examples:
+---------------------------------+----------------------------------------------------------------------------+
| Configuration | Cropped, Scaled, Rotated Face |
+=================================+============================================================================+
| 0.1 (10%), 0.1 (10%), (200,200) | .. image:: ../img/tutorial/gender_classification/arnie_10_10_200_200.jpg |
+---------------------------------+----------------------------------------------------------------------------+
| 0.2 (20%), 0.2 (20%), (200,200) | .. image:: ../img/tutorial/gender_classification/arnie_20_20_200_200.jpg |
+---------------------------------+----------------------------------------------------------------------------+
| 0.3 (30%), 0.3 (30%), (200,200) | .. image:: ../img/tutorial/gender_classification/arnie_30_30_200_200.jpg |
+---------------------------------+----------------------------------------------------------------------------+
| 0.2 (20%), 0.2 (20%), (70,70) | .. image:: ../img/tutorial/gender_classification/arnie_20_20_70_70.jpg |
+---------------------------------+----------------------------------------------------------------------------+