Commit 2bfcf0f1 authored by Samyak Datta

Merge remote-tracking branch 'upstream/master'

parents ce27f041 c05a7e01
.. _Table-Of-Content-Bioinspired:
*bioinspired* module. Algorithms inspired by biological models
----------------------------------------------------------------
Here you will learn how to use the additional algorithms that OpenCV provides in the *bioinspired* module.
.. include:: ../../definitions/tocDefinitions.rst
+
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|RetinaDemoImg| **Title:** :ref:`Retina_Model`
*Compatibility:* > OpenCV 2.4
*Author:* |Author_AlexB|
You will learn how to process images and video streams with a model of retina filter for details enhancement, spatio-temporal noise removal, luminance correction and spatio-temporal events detection.
=============== ======================================================
.. |RetinaDemoImg| image:: images/retina_TreeHdr_small.jpg
:height: 90pt
:width: 90pt
.. raw:: latex
\pagebreak
.. toctree::
:hidden:
../retina_model/retina_model
.. _Table-Of-Content-CVV:
*cvv* module. GUI for Interactive Visual Debugging
--------------------------------------------------
Here you will learn how to use the cvv module to ease programming computer vision software through visual debugging aids.
.. include:: ../../definitions/tocDefinitions.rst
+
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|cvvIntro| **Title:** :ref:`Visual_Debugging_Introduction`
*Compatibility:* > OpenCV 2.4.8
*Author:* |Author_Bihlmaier|
We will learn how to debug our applications in a visual and interactive way.
=============== ======================================================
.. |cvvIntro| image:: images/Visual_Debugging_Introduction_Tutorial_Cover.jpg
:height: 90pt
:width: 90pt
.. raw:: latex
\pagebreak
.. toctree::
:hidden:
../visual_debugging_introduction/visual_debugging_introduction
.. _ximgproc:
Structured forests for fast edge detection
******************************************
Introduction
------------
In this tutorial you will learn how to use structured forests for the purpose of edge detection in an image.
Examples
--------
.. image:: images/01.jpg
:height: 238pt
:width: 750pt
:alt: Example 1
:align: center
.. image:: images/02.jpg
:height: 238pt
:width: 750pt
:alt: Example 2
:align: center
.. image:: images/03.jpg
:height: 238pt
:width: 750pt
:alt: Example 3
:align: center
.. image:: images/04.jpg
:height: 238pt
:width: 750pt
:alt: Example 4
:align: center
.. image:: images/05.jpg
:height: 238pt
:width: 750pt
:alt: Example 5
:align: center
.. image:: images/06.jpg
:height: 238pt
:width: 750pt
:alt: Example 6
:align: center
.. image:: images/07.jpg
:height: 238pt
:width: 750pt
:alt: Example 7
:align: center
.. image:: images/08.jpg
:height: 238pt
:width: 750pt
:alt: Example 8
:align: center
.. image:: images/09.jpg
:height: 238pt
:width: 750pt
:alt: Example 9
:align: center
.. image:: images/10.jpg
:height: 238pt
:width: 750pt
:alt: Example 10
:align: center
.. image:: images/11.jpg
:height: 238pt
:width: 750pt
:alt: Example 11
:align: center
.. image:: images/12.jpg
:height: 238pt
:width: 750pt
:alt: Example 12
:align: center
**Note:** binarization techniques such as the Canny edge detector can be applied
to the edge maps produced by both algorithms (``Sobel`` and ``StructuredEdgeDetection::detectEdges``).
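As a rough sketch of this note (assuming the float edge map ``edges`` computed in the sample below; the threshold value is an arbitrary illustrative choice):
.. code-block:: cpp
// rescale the [0;1] float edge map to 8 bits, then binarize it
cv::Mat edges8u, binaryEdges;
edges.convertTo(edges8u, CV_8U, 255.0);
cv::threshold(edges8u, binaryEdges, 50, 255, cv::THRESH_BINARY);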
Source Code
-----------
.. literalinclude:: ../../../../modules/ximgproc/samples/cpp/structured_edge_detection.cpp
:language: cpp
:linenos:
:tab-width: 4
Explanation
-----------
1. **Load source color image**
.. code-block:: cpp
cv::Mat image = cv::imread(inFilename, 1);
if ( image.empty() )
{
printf("Cannot read image file: %s\n", inFilename.c_str());
return -1;
}
2. **Convert source image to [0;1] range**
.. code-block:: cpp
image.convertTo(image, cv::DataType<float>::type, 1/255.0);
3. **Run main algorithm**
.. code-block:: cpp
cv::Mat edges(image.size(), image.type());
cv::Ptr<StructuredEdgeDetection> pDollar =
cv::createStructuredEdgeDetection(modelFilename);
pDollar->detectEdges(image, edges);
4. **Show results**
.. code-block:: cpp
if ( outFilename == "" )
{
cv::namedWindow("edges", 1);
cv::imshow("edges", edges);
cv::waitKey(0);
}
else
cv::imwrite(outFilename, 255*edges);
Literature
----------
For more information, refer to the following papers:
.. [Dollar2013] Dollar P., Zitnick C. L., "Structured forests for fast edge detection",
IEEE International Conference on Computer Vision (ICCV), 2013,
pp. 1841-1848. `DOI <http://dx.doi.org/10.1109/ICCV.2013.231>`_
.. [Lim2013] Lim J. J., Zitnick C. L., Dollar P., "Sketch Tokens: A Learned
Mid-level Representation for Contour and Object Detection",
Computer Vision and Pattern Recognition (CVPR), 2013,
pp. 3158-3165. `DOI <http://dx.doi.org/10.1109/CVPR.2013.406>`_
......@@ -51,3 +51,5 @@ $ cmake -D OPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -D BUILD_opencv_re
21. **opencv_xobjdetect**: Integral Channel Features Detector Framework.
22. **opencv_xphoto**: Additional photo processing algorithms: Color balance / Denoising / Inpainting.
23. **opencv_stereo**: Stereo Correspondence done with different descriptors: Census / CS-Census / MCT / BRIEF / MV / RT.
......@@ -13,7 +13,7 @@ endif()
project(${the_target})
ocv_include_directories("${OpenCV_SOURCE_DIR}/include/opencv")
ocv_include_modules(${OPENCV_${the_target}_DEPS})
ocv_include_modules_recurse(${OPENCV_${the_target}_DEPS})
file(GLOB ${the_target}_SOURCES ${CMAKE_CURRENT_SOURCE_DIR}/*.cpp)
......
......@@ -13,7 +13,7 @@ endif()
project(${the_target})
ocv_include_directories("${OpenCV_SOURCE_DIR}/include/opencv")
ocv_include_modules(${OPENCV_${the_target}_DEPS})
ocv_include_modules_recurse(${OPENCV_${the_target}_DEPS})
file(GLOB ${the_target}_SOURCES ${CMAKE_CURRENT_SOURCE_DIR}/*.cpp)
......
set(the_description "Background Segmentation Algorithms")
ocv_define_module(bgsegm opencv_core opencv_imgproc opencv_video opencv_highgui)
ocv_define_module(bgsegm opencv_core opencv_imgproc opencv_video opencv_highgui WRAP python)
......@@ -57,7 +57,7 @@ namespace bgsegm
/** @brief Gaussian Mixture-based Background/Foreground Segmentation Algorithm.
The class implements the algorithm described in @cite KB2001.
The class implements the algorithm described in @cite KB2001 .
*/
class CV_EXPORTS_W BackgroundSubtractorMOG : public BackgroundSubtractor
{
......@@ -86,9 +86,9 @@ means some automatic value.
CV_EXPORTS_W Ptr<BackgroundSubtractorMOG>
createBackgroundSubtractorMOG(int history=200, int nmixtures=5,
double backgroundRatio=0.7, double noiseSigma=0);
/** @brief Background Subtractor module based on the algorithm given in @cite Gold2012.
/** @brief Background Subtractor module based on the algorithm given in @cite Gold2012 .
Takes a series of images and returns a sequence of mask (8UC1)
images of the same size, where 255 indicates Foreground and 0 represents Background.
......
set(the_description "Biologically inspired algorithms")
ocv_warnings_disable(CMAKE_CXX_FLAGS -Wundef)
ocv_define_module(bioinspired opencv_core OPTIONAL opencv_highgui opencv_ocl)
ocv_define_module(bioinspired opencv_core OPTIONAL opencv_highgui opencv_ocl WRAP java python)
Biologically inspired vision models and derived tools
=======================================================
1. A biological retina model for spatio-temporal image noise removal and luminance-change enhancement
2. A transient areas (spatio-temporal events) segmentation tool to use at the output of the Retina
3. A High Dynamic Range (HDR, >8-bit images) tone mapping (conversion to 8-bit) use case of the retina
......@@ -9,6 +9,15 @@
publisher={Elsevier}
}
@INPROCEEDINGS{Benoit2014,
author={Strat, S.T. and Benoit, A. and Lambert, P.},
booktitle={Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European},
title={Retina enhanced bag of words descriptors for video classification},
year={2014},
month={Sept},
pages={1307-1311}
}
@inproceedings{Strat2013,
title={Retina enhanced SIFT descriptors for video indexing},
author={Strat, Sabin Tiberius and Benoit, Alexandre and Lambert, Patrick},
......
********************************************************************
bioinspired. Biologically inspired vision models and derived tools
********************************************************************
The module provides biological visual system models (human visual system and others). It also provides derived objects that take advantage of those bio-inspired models.
.. toctree::
:maxdepth: 2
Human retina documentation <retina>
......@@ -12,12 +12,12 @@
**
** Maintainers : Listic lab (code author current affiliation & applications)
**
** Creation - enhancement process 2007-2013
** Creation - enhancement process 2007-2015
** Author: Alexandre Benoit (benoit.alexandre.vision@gmail.com), LISTIC lab, Annecy le vieux, France
**
** These algorithms have been developed by Alexandre BENOIT since his thesis with Alice Caplier at Gipsa-Lab (www.gipsa-lab.inpg.fr) and the research he pursues at LISTIC Lab (www.listic.univ-savoie.fr).
** Refer to the following research paper for more information:
** Strat S. T. , Benoit A.Lambert P. , Caplier A., "Retina Enhanced SURF Descriptors for Spatio-Temporal Concept Detection", Multimedia Tools and Applications, 2012 (DOI: 10.1007/s11042-012-1280-0)
** Strat, S.T.; Benoit, A.; Lambert, P., "Retina enhanced bag of words descriptors for video classification," Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European , vol., no., pp.1307,1311, 1-5 Sept. 2014 (http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6952461&isnumber=6951911)
** Benoit A., Caplier A., Durette B., Herault, J., "USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI: http://dx.doi.org/10.1016/j.cviu.2010.01.011
** This work has been carried out thanks to Jeanny Herault, whose research and great discussions are the basis of all this work; please take a look at his book:
** Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing), by Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891.
......@@ -30,7 +30,7 @@
** Copyright (C) 2008-2011, Willow Garage Inc., all rights reserved.
**
** For Human Visual System tools (bioinspired)
** Copyright (C) 2007-2011, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
** Copyright (C) 2007-2015, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
**
** Third party copyrights are property of their respective owners.
**
......@@ -74,10 +74,37 @@ namespace cv
{
namespace bioinspired
{
//! @addtogroup bioinspired
//! @{
/** @brief parameter structure that stores the transient events detector setup parameters
*/
struct SegmentationParameters{ // CV_EXPORTS_W_MAP to export to python native dictionaries
// default structure instance construction with default values
SegmentationParameters():
thresholdON(100),
thresholdOFF(100),
localEnergy_temporalConstant(0.5),
localEnergy_spatialConstant(5),
neighborhoodEnergy_temporalConstant(1),
neighborhoodEnergy_spatialConstant(15),
contextEnergy_temporalConstant(1),
contextEnergy_spatialConstant(75){};
// all properties list
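//! detection thresholds applied to the ON and OFF transient channels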
float thresholdON;
float thresholdOFF;
//! the time constant of the first order low pass filter, use it to cut high temporal frequencies (noise or fast motion), unit is frames, typical value is 0.5 frame
float localEnergy_temporalConstant;
//! the spatial constant of the first order low pass filter, use it to cut high spatial frequencies (noise or thick contours), unit is pixels, typical value is 5 pixels
float localEnergy_spatialConstant;
//! local neighborhood energy filtering parameters : the aim is to get information about the energy neighborhood to perform a center surround energy analysis
float neighborhoodEnergy_temporalConstant;
float neighborhoodEnergy_spatialConstant;
//! context neighborhood energy filtering parameters : the aim is to get information about the energy on a wide neighborhood area to filter out local effects
float contextEnergy_temporalConstant;
float contextEnergy_spatialConstant;
};
/** @brief class which provides a transient/moving areas segmentation module
perform a locally adapted segmentation by using the retina magno input data Based on Alexandre
......@@ -96,30 +123,6 @@ class CV_EXPORTS_W TransientAreasSegmentationModule: public Algorithm
{
public:
//! parameters structure
struct CV_EXPORTS_W SegmentationParameters{
SegmentationParameters():
thresholdON(100),
thresholdOFF(100),
localEnergy_temporalConstant(0.5),
localEnergy_spatialConstant(5),
neighborhoodEnergy_temporalConstant(1),
neighborhoodEnergy_spatialConstant(15),
contextEnergy_temporalConstant(1),
contextEnergy_spatialConstant(75){};// default setup
CV_PROP_RW float thresholdON;
CV_PROP_RW float thresholdOFF;
//! the time constant of the first order low pass filter, use it to cut high temporal frequencies (noise or fast motion), unit is frames, typical value is 0.5 frame
CV_PROP_RW float localEnergy_temporalConstant;
//! the spatial constant of the first order low pass filter, use it to cut high spatial frequencies (noise or thick contours), unit is pixels, typical value is 5 pixel
CV_PROP_RW float localEnergy_spatialConstant;
//! local neighborhood energy filtering parameters : the aim is to get information about the energy neighborhood to perform a center surround energy analysis
CV_PROP_RW float neighborhoodEnergy_temporalConstant;
CV_PROP_RW float neighborhoodEnergy_spatialConstant;
//! context neighborhood energy filtering parameters : the aim is to get information about the energy on a wide neighborhood area to filtered out local effects
CV_PROP_RW float contextEnergy_temporalConstant;
CV_PROP_RW float contextEnergy_spatialConstant;
};
/** @brief return the size of the managed input and output images
*/
......@@ -141,7 +144,7 @@ public:
@param fs : the open Filestorage which contains segmentation parameters
@param applyDefaultSetupOnFailure : set to true to fall back to the default setup if loading fails
*/
CV_WRAP virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure=true)=0;
virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure=true)=0;
/** @brief try to open an XML segmentation parameters file to adjust current segmentation instance setup
......@@ -149,11 +152,11 @@ public:
- warning: exceptions are thrown if the read XML file is not valid
@param newParameters : a parameters structure updated with the new target configuration
*/
CV_WRAP virtual void setup(SegmentationParameters newParameters)=0;
virtual void setup(SegmentationParameters newParameters)=0;
/** @brief return the current parameters setup
*/
CV_WRAP virtual SegmentationParameters getParameters()=0;
virtual SegmentationParameters getParameters()=0;
/** @brief parameters setup display method
@return a string which contains formatted parameters information
......@@ -168,7 +171,7 @@ public:
/** @brief write xml/yml formated parameters information
@param fs : a cv::Filestorage object ready to be filled
*/
CV_WRAP virtual void write( cv::FileStorage& fs ) const=0;
virtual void write( cv::FileStorage& fs ) const=0;
/** @brief main processing method, get result using methods getSegmentationPicture()
@param inputToSegment : the image to process, it must match the instance buffer size !
......
//============================================================================
// Name : retinademo.cpp
// Author : Alexandre Benoit, benoit.alexandre.vision@gmail.com
// Version : 0.1
// Copyright : LISTIC/GIPSA French Labs, May 2015
// Description : Gipsa/LISTIC Labs quick retina demo in C++, Ansi-style
//============================================================================
// include bioinspired module and OpenCV core utilities
#include "opencv2/bioinspired.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/videoio.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
#include <cstring>
// main function
int main(int argc, char* argv[]) {
// basic input arguments checking
if (argc>1)
{
std::cout<<"****************************************************"<<std::endl;
std::cout<<"* Retina demonstration : demonstrates the use of is a wrapper class of the Gipsa/Listic Labs retina model."<<std::endl;
std::cout<<"* This retina model allows spatio-temporal image processing (applied on a webcam sequences)."<<std::endl;
std::cout<<"* As a summary, these are the retina model properties:"<<std::endl;
std::cout<<"* => It applies a spectral whithening (mid-frequency details enhancement)"<<std::endl;
std::cout<<"* => high frequency spatio-temporal noise reduction"<<std::endl;
std::cout<<"* => low frequency luminance to be reduced (luminance range compression)"<<std::endl;
std::cout<<"* => local logarithmic luminance compression allows details to be enhanced in low light conditions\n"<<std::endl;
std::cout<<"* for more information, reer to the following papers :"<<std::endl;
std::cout<<"* Benoit A., Caplier A., Durette B., Herault, J., \"USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING\", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI: http://dx.doi.org/10.1016/j.cviu.2010.01.011"<<std::endl;
std::cout<<"* Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing),By: Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891."<<std::endl;
std::cout<<"* => reports comments/remarks at benoit.alexandre.vision@gmail.com"<<std::endl;
std::cout<<"* => more informations and papers at : http://sites.google.com/site/benoitalexandrevision/"<<std::endl;
std::cout<<"****************************************************"<<std::endl;
std::cout<<" NOTE : this program generates the default retina parameters file 'RetinaDefaultParameters.xml'"<<std::endl;
std::cout<<" => you can use this to fine tune parameters and load them if you save to file 'RetinaSpecificParameters.xml'"<<std::endl;
if (strcmp(argv[1], "help")==0){
std::cout<<"No help provided for now, please test the retina Demo for a more complete program"<<std::endl;
}
}
// grab the optional input media argument only if it was provided (avoids reading argv[1] when argc==1)
std::string inputMediaType = (argc > 1) ? argv[1] : "";
// declare the retina input buffer.
cv::Mat inputFrame;
// setup webcam reader and grab a first frame to get its size
cv::VideoCapture videoCapture(0);
videoCapture>>inputFrame;
// allocate a retina instance with input size equal to that of the first grabbed frame
cv::Ptr<cv::bioinspired::Retina> myRetina = cv::bioinspired::createRetina(inputFrame.size());
/* sample use of the retina parameters management methods:
-> save current (here default) retina parameters to an XML file (you may run this once to generate the file, then modify it)
*/
myRetina->write("RetinaDefaultParameters.xml");
// -> load parameters if file exists
myRetina->setup("RetinaSpecificParameters.xml");
// reset all retina buffers (open your eyes)
myRetina->clearBuffers();
// declare retina output buffers
cv::Mat retinaOutput_parvo;
cv::Mat retinaOutput_magno;
//main processing loop
bool stillProcess=true;
while(stillProcess){
// if a video stream is available, grab a new frame; otherwise keep the last input and stop after this pass
if (videoCapture.isOpened())
videoCapture>>inputFrame;
else
stillProcess=false;
// run retina filter
myRetina->run(inputFrame);
// Retrieve and display retina output
myRetina->getParvo(retinaOutput_parvo);
myRetina->getMagno(retinaOutput_magno);
cv::imshow("retina input", inputFrame);
cv::imshow("Retina Parvo", retinaOutput_parvo);
cv::imshow("Retina Magno", retinaOutput_magno);
cv::waitKey(5);
}
}
......@@ -351,7 +351,7 @@ void RetinaOCLImpl::setupIPLMagnoChannel(const bool normaliseOutput, const float
_retinaParameters.IplMagno.localAdaptintegration_k = localAdaptintegration_k;
}
void RetinaOCLImpl::run(const InputArray input)
void RetinaOCLImpl::run(InputArray input)
{
oclMat &inputMatToConvert = getOclMatRef(input);
bool colorMode = convertToColorPlanes(inputMatToConvert, _inputBuffer);
......
......@@ -12,12 +12,12 @@
**
** Maintainers : Listic lab (code author current affiliation & applications)
**
** Creation - enhancement process 2007-2013
** Creation - enhancement process 2007-2015
** Author: Alexandre Benoit (benoit.alexandre.vision@gmail.com), LISTIC lab, Annecy le vieux, France
**
** These algorithms have been developed by Alexandre BENOIT since his thesis with Alice Caplier at Gipsa-Lab (www.gipsa-lab.inpg.fr) and the research he pursues at LISTIC Lab (www.listic.univ-savoie.fr).
** Refer to the following research paper for more information:
** Strat S. T. , Benoit A.Lambert P. , Caplier A., "Retina Enhanced SURF Descriptors for Spatio-Temporal Concept Detection", Multimedia Tools and Applications, 2012 (DOI: 10.1007/s11042-012-1280-0)
** Strat, S.T.; Benoit, A.; Lambert, P., "Retina enhanced bag of words descriptors for video classification," Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European , vol., no., pp.1307,1311, 1-5 Sept. 2014 (http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6952461&isnumber=6951911)
** Benoit A., Caplier A., Durette B., Herault, J., "USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI: http://dx.doi.org/10.1016/j.cviu.2010.01.011
** This work has been carried out thanks to Jeanny Herault, whose research and great discussions are the basis of all this work; please take a look at his book:
** Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing), by Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891.
......@@ -30,7 +30,7 @@
** Copyright (C) 2008-2011, Willow Garage Inc., all rights reserved.
**
** For Human Visual System tools (bioinspired)
** Copyright (C) 2007-2011, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
** Copyright (C) 2007-2015, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
**
** Third party copyrights are property of their respective owners.
**
......@@ -132,12 +132,12 @@ public:
* @param newParameters : a parameters structure updated with the new target configuration
* @param applyDefaultSetupOnFailure : set to true to fall back to the default setup if loading fails
*/
void setup(TransientAreasSegmentationModule::SegmentationParameters newParameters);
void setup(SegmentationParameters newParameters);
/**
* @return the current parameters setup
*/
struct TransientAreasSegmentationModule::SegmentationParameters getParameters();
struct SegmentationParameters getParameters();
/**
* parameters setup display method
......@@ -203,7 +203,7 @@ protected:
*/
inline const std::valarray<float> &getMotionContextPicture() const {return _contextMotionEnergy;};
struct cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters _segmentationParameters;
struct cv::bioinspired::SegmentationParameters _segmentationParameters;
// template buffers and related access pointers
std::valarray<float> _inputToSegment;
std::valarray<float> _contextMotionEnergy;
......@@ -221,7 +221,7 @@ protected:
void _convertValarrayBuffer2cvMat(const std::valarray<bool> &grayMatrixToConvert, const unsigned int nbRows, const unsigned int nbColumns, OutputArray outBuffer);
bool _convertCvMat2ValarrayBuffer(InputArray inputMat, std::valarray<float> &outputValarrayMatrix);
const TransientAreasSegmentationModuleImpl & operator = (const TransientAreasSegmentationModuleImpl &);
const TransientAreasSegmentationModuleImpl & operator = (const TransientAreasSegmentationModuleImpl &);
};
class TransientAreasSegmentationModuleImpl_: public TransientAreasSegmentationModule
......@@ -232,9 +232,9 @@ public:
inline virtual void write( cv::FileStorage& fs ) const{_segmTool.write(fs);};
inline virtual void setup(String segmentationParameterFile, const bool applyDefaultSetupOnFailure){_segmTool.setup(segmentationParameterFile, applyDefaultSetupOnFailure);};
inline virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure){_segmTool.setup(fs, applyDefaultSetupOnFailure);};
inline virtual void setup(TransientAreasSegmentationModule::SegmentationParameters newParameters){_segmTool.setup(newParameters);};
inline virtual void setup(SegmentationParameters newParameters){_segmTool.setup(newParameters);};
inline virtual const String printSetup(){return _segmTool.printSetup();};
inline virtual struct TransientAreasSegmentationModule::SegmentationParameters getParameters(){return _segmTool.getParameters();};
inline virtual struct SegmentationParameters getParameters(){return _segmTool.getParameters();};
inline virtual void write( String fs ) const{_segmTool.write(fs);};
inline virtual void run(InputArray inputToSegment, const int channelIndex){_segmTool.run(inputToSegment, channelIndex);};
inline virtual void getSegmentationPicture(OutputArray transientAreas){return _segmTool.getSegmentationPicture(transientAreas);};
......@@ -286,7 +286,7 @@ void TransientAreasSegmentationModuleImpl::clearAllBuffers()
_segmentedAreas=0;
}
struct TransientAreasSegmentationModule::SegmentationParameters TransientAreasSegmentationModuleImpl::getParameters()
struct SegmentationParameters TransientAreasSegmentationModuleImpl::getParameters()
{
return _segmentationParameters;
};
......@@ -305,7 +305,7 @@ void TransientAreasSegmentationModuleImpl::setup(String segmentationParameterFil
if (applyDefaultSetupOnFailure)
{
printf("Retina::setup: resetting retina with default parameters\n");
cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters defaults;
cv::bioinspired::SegmentationParameters defaults;
setup(defaults);
}
else
......@@ -344,7 +344,7 @@ void TransientAreasSegmentationModuleImpl::setup(cv::FileStorage &fs, const bool
std::cout<<"Retina::setup: resetting retina with default parameters"<<std::endl;
if (applyDefaultSetupOnFailure)
{
struct cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters defaults;
struct cv::bioinspired::SegmentationParameters defaults;
setup(defaults);
}
std::cout<<"SegmentationModule::setup: wrong/unappropriate xml parameter file : error report :`n=>"<<e.what()<<std::endl;
......@@ -356,11 +356,11 @@ void TransientAreasSegmentationModuleImpl::setup(cv::FileStorage &fs, const bool
}
// setup parameters for the 2 filters that allow the segmentation
void TransientAreasSegmentationModuleImpl::setup(cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters newParameters)
void TransientAreasSegmentationModuleImpl::setup(cv::bioinspired::SegmentationParameters newParameters)
{
// copy structure contents
memcpy(&_segmentationParameters, &newParameters, sizeof(cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters));
memcpy(&_segmentationParameters, &newParameters, sizeof(cv::bioinspired::SegmentationParameters));
// apply setup
// init local motion energy extraction low pass filter
BasicRetinaFilter::setLPfilterParameters(0, newParameters.localEnergy_temporalConstant, newParameters.localEnergy_spatialConstant);
......
set(the_description "Custom Calibration Pattern")
ocv_define_module(ccalib opencv_core opencv_imgproc opencv_calib3d opencv_features2d)
ocv_define_module(ccalib opencv_core opencv_imgproc opencv_calib3d opencv_features2d WRAP python)
Custom Calibration Pattern
==========================
.. highlight:: cpp
CustomPattern
-------------
A custom pattern class that can be used to calibrate a camera and to further track the translation and rotation of the pattern. By default, it uses an ``ORB`` feature detector and a ``BruteForce-Hamming(2)`` descriptor matcher to find the location of the pattern feature points that will subsequently be used for calibration.
.. ocv:class:: CustomPattern : public Algorithm
CustomPattern::CustomPattern
----------------------------
CustomPattern constructor.
.. ocv:function:: CustomPattern()
CustomPattern::create
---------------------
A method that initializes the class and generates the necessary detectors, extractors and matchers.
.. ocv:function:: bool create(InputArray pattern, const Size2f boardSize, OutputArray output = noArray())
:param pattern: The image, which will be used as a pattern. If the desired pattern is part of a bigger image, you can crop it out using image(roi).
:param boardSize: The size of the pattern in physical dimensions. These will be used to scale the points when the calibration occurs.
:param output: A matrix that is the same as the input pattern image, but has all the feature points drawn on it.
:return: Whether the initialization was successful. A possible reason for failure is that no feature points were detected.
.. seealso::
:ocv:func:`getFeatureDetector`,
:ocv:func:`getDescriptorExtractor`,
:ocv:func:`getDescriptorMatcher`
.. note::
* The number of detected feature points can be determined through the :ocv:func:`getPatternPoints` method.
* The feature detector, extractor and matcher cannot be changed after initialization.
CustomPattern::findPattern
--------------------------
Finds the pattern in the input image.
.. ocv:function:: bool findPattern(InputArray image, OutputArray matched_features, OutputArray pattern_points, const double ratio = 0.7, const double proj_error = 8.0, const bool refine_position = false, OutputArray out = noArray(), OutputArray H = noArray(), OutputArray pattern_corners = noArray());
:param image: The input image where the pattern is searched for.
:param matched_features: A ``vector<Point2f>`` of the projections of calibration pattern points, matched in the image. The points correspond to the ``pattern_points``. ``matched_features`` and ``pattern_points`` have the same size.
:param pattern_points: A ``vector<Point3f>`` of calibration pattern points in the calibration pattern coordinate space.
:param ratio: A ratio used to threshold matches based on D. Lowe's point ratio test.
:param proj_error: The maximum projection error that is allowed when the found points are back projected. A lower projection error will be beneficial for eliminating mismatches. Higher values are recommended when the camera lens has greater distortions.
:param refine_position: Whether to refine the position of the feature points with :ocv:func:`cornerSubPix`.
:param out: An image showing the matched feature points and a contour around the estimated pattern.
:param H: The homography transformation matrix between the pattern and the current image.
:param pattern_corners: A ``vector<Point2f>`` containing the 4 corners of the found pattern.
:return: Whether the pattern was found.
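A minimal usage sketch (the file name, physical size and the previously captured ``frame`` are illustrative assumptions):
.. code-block:: cpp
#include <opencv2/ccalib.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>
// initialize the pattern from an image of it (physical size here: an A4 sheet in cm)
cv::ccalib::CustomPattern pattern;
cv::Mat patternImage = cv::imread("pattern.png");
pattern.create(patternImage, cv::Size2f(21.0f, 29.7f));
// look for the pattern in a previously captured camera frame
std::vector<cv::Point2f> matchedFeatures;
std::vector<cv::Point3f> patternPoints;
bool found = pattern.findPattern(frame, matchedFeatures, patternPoints);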
CustomPattern::isInitialized
----------------------------
.. ocv:function:: bool isInitialized()
:return: Whether the class is initialized.
CustomPattern::getPatternPoints
-------------------------------
.. ocv:function:: void getPatternPoints(OutputArray original_points)
:param original_points: Fills the vector with the points found in the pattern.
CustomPattern::getPixelSize
---------------------------
.. ocv:function:: double getPixelSize()
:return: The physical pixel size as initialized by the pattern.
CustomPattern::setFeatureDetector
---------------------------------
.. ocv:function:: bool setFeatureDetector(Ptr<FeatureDetector> featureDetector)
:param featureDetector: Set a new FeatureDetector.
:return: Whether it was set successfully. Fails if the object has already been initialized by :ocv:func:`create`.
.. note::
* It is left to the user's discretion to select a matching feature detector, extractor and matcher. Please consult the documentation of each to confirm compatibility.
CustomPattern::setDescriptorExtractor
-------------------------------------
.. ocv:function:: bool setDescriptorExtractor(Ptr<DescriptorExtractor> extractor)
:param extractor: Set a new DescriptorExtractor.
:return: Whether it was set successfully. Fails if the object has already been initialized by :ocv:func:`create`.
CustomPattern::setDescriptorMatcher
-----------------------------------
.. ocv:function:: bool setDescriptorMatcher(Ptr<DescriptorMatcher> matcher)
:param matcher: Set a new DescriptorMatcher.
:return: Whether it was set successfully. Fails if the object has already been initialized by :ocv:func:`create`.
CustomPattern::getFeatureDetector
---------------------------------
.. ocv:function:: Ptr<FeatureDetector> getFeatureDetector()
:return: The used FeatureDetector.
CustomPattern::getDescriptorExtractor
-------------------------------------
.. ocv:function:: Ptr<DescriptorExtractor> getDescriptorExtractor()
:return: The used DescriptorExtractor.
CustomPattern::getDescriptorMatcher
-----------------------------------
.. ocv:function:: Ptr<DescriptorMatcher> getDescriptorMatcher()
:return: The used DescriptorMatcher.
CustomPattern::calibrate
------------------------
Calibrates the camera.
.. ocv:function:: double calibrate(InputArrayOfArrays objectPoints, InputArrayOfArrays imagePoints, Size imageSize, InputOutputArray cameraMatrix, InputOutputArray distCoeffs, OutputArrayOfArrays rvecs, OutputArrayOfArrays tvecs, int flags = 0, TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON))
See :ocv:func:`calibrateCamera` for parameter information.
CustomPattern::findRt
---------------------
Finds the rotation and translation vectors of the pattern.
.. ocv:function:: bool findRt(InputArray objectPoints, InputArray imagePoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int flags = ITERATIVE)
.. ocv:function:: bool findRt(InputArray image, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int flags = ITERATIVE)
:param image: The image, in which the rotation and translation of the pattern will be found.
See :ocv:func:`solvePnP` for parameter information.
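Continuing the sketch above (``cameraMatrix`` and ``distCoeffs`` are assumed to come from a previous calibration):
.. code-block:: cpp
// estimate the pattern pose in the same frame (sketch, reusing names from above)
cv::Mat rvec, tvec;
bool poseFound = pattern.findRt(frame, cameraMatrix, distCoeffs, rvec, tvec);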
CustomPattern::findRtRANSAC
---------------------------
Finds the rotation and translation vectors of the pattern using RANSAC.
.. ocv:function:: bool findRtRANSAC(InputArray objectPoints, InputArray imagePoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int iterationsCount = 100, float reprojectionError = 8.0, int minInliersCount = 100, OutputArray inliers = noArray(), int flags = ITERATIVE)
.. ocv:function:: bool findRtRANSAC(InputArray image, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess = false, int iterationsCount = 100, float reprojectionError = 8.0, int minInliersCount = 100, OutputArray inliers = noArray(), int flags = ITERATIVE)
:param image: The image, in which the rotation and translation of the pattern will be found.
See :ocv:func:`solvePnPRANSAC` for parameter information.
CustomPattern::drawOrientation
------------------------------
Draws the ``(x,y,z)`` axes on the image, in the center of the pattern, showing the orientation of the pattern.
.. ocv:function:: void drawOrientation(InputOutputArray image, InputArray tvec, InputArray rvec, InputArray cameraMatrix, InputArray distCoeffs, double axis_length = 3, int axis_width = 2)
:param image: The image, based on which the rotation and translation was calculated. The axes are drawn in color: ``x`` in red, ``y`` in green, ``z`` in blue.
:param tvec: Translation vector.
:param rvec: Rotation vector.
:param cameraMatrix: The camera matrix.
:param distCoeffs: The distortion coefficients.
:param axis_length: The length of the axis symbol.
:param axis_width: The width of the axis symbol.
......@@ -17,4 +17,4 @@ foreach(dt5_dep Core Gui Widgets)
endforeach()
set(the_description "Debug visualization framework")
ocv_define_module(cvv opencv_core opencv_imgproc opencv_features2d ${CVV_LIBRARIES})
ocv_define_module(cvv opencv_core opencv_imgproc opencv_features2d ${CVV_LIBRARIES} WRAP python)
*********************************************************************
cvv. GUI for Interactive Visual Debugging of Computer Vision Programs
*********************************************************************
The module provides an interactive GUI to debug and incrementally design computer vision algorithms. The debug statements can remain in the code after development and aid in further changes because they have negligible overhead if the program is compiled in release mode.
.. toctree::
:maxdepth: 2
CVV API Documentation <cvv_api>
CVV GUI Documentation <cvv_gui>
CVV : the API
*************
.. highlight:: cpp
Introduction
++++++++++++
The namespace for all functions is **cvv**, e.g. *cvv::showImage()*.
Compilation:
* For development, i.e. for the cvv GUI to show up, compile your code using cvv with *g++ -DCVVISUAL_DEBUGMODE*.
* For release, i.e. with cvv calls doing nothing, compile your code without the above flag.
See cvv tutorial for a commented example application using cvv.
API Functions
+++++++++++++
showImage
---------
Add a single image to debug GUI (similar to :imshow:`imshow <>`).
.. ocv:function:: void showImage(InputArray img, const CallMetaData& metaData, const string& description, const string& view)
:param img: Image to show in debug GUI.
:param metaData: Properly initialized CallMetaData struct, i.e. information about file, line and function name for GUI. Use CVVISUAL_LOCATION macro.
:param description: Human readable description to provide context to image.
:param view: Preselect view that will be used to visualize this image in GUI. Other views can still be selected in GUI later on.
debugFilter
-----------
Add two images to debug GUI for comparison. Usually the input and output of some filter operation, whose result should be inspected.
.. ocv:function:: void debugFilter(InputArray original, InputArray result, const CallMetaData& metaData, const string& description, const string& view)
:param original: First image for comparison, e.g. filter input.
:param result: Second image for comparison, e.g. filter output.
:param metaData: See :ocv:func:`showImage`
:param description: See :ocv:func:`showImage`
:param view: See :ocv:func:`showImage`
debugDMatch
-----------
Add a filled-in :basicstructures:`DMatch <dmatch>` to the debug GUI. The matches are visualized for interactive inspection in different GUI views (one similar to an interactive :draw_matches:`drawMatches<>`).
.. ocv:function:: void debugDMatch(InputArray img1, std::vector<cv::KeyPoint> keypoints1, InputArray img2, std::vector<cv::KeyPoint> keypoints2, std::vector<cv::DMatch> matches, const CallMetaData& metaData, const string& description, const string& view, bool useTrainDescriptor)
:param img1: First image used in :basicstructures:`DMatch <dmatch>`.
:param keypoints1: Keypoints of first image.
:param img2: Second image used in DMatch.
:param keypoints2: Keypoints of second image.
:param metaData: See :ocv:func:`showImage`
:param description: See :ocv:func:`showImage`
:param view: See :ocv:func:`showImage`
:param useTrainDescriptor: Use :basicstructures:`DMatch <dmatch>`'s train descriptor index instead of query descriptor index.
finalShow
---------
This function **must** be called *once*, *after* all cvv calls (if any).
As an alternative, create an instance of FinalShowCaller, which calls finalShow() in its destructor (RAII-style).
.. ocv:function:: void finalShow()
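A minimal sketch of the call pattern (assuming compilation with *-DCVVISUAL_DEBUGMODE*; FinalShowCaller is the RAII alternative mentioned above):
.. code-block:: cpp
#include <opencv2/core.hpp>
#include <opencv2/cvv.hpp>
void debugOneImage(cv::Mat img)
{
// show the image in the debug GUI, then release the GUI explicitly
cvv::showImage(img, CVVISUAL_LOCATION, "input image");
cvv::finalShow(); // must be the last cvv call
}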
setDebugFlag
------------
Enable or disable cvv for the current translation unit and thread (disabling it this way has higher, but still low, overhead compared to using the compile flags).
.. ocv:function:: void setDebugFlag(bool active)
:param active: See above
CVV : the GUI
*************
.. highlight:: cpp
Introduction
++++++++++++
For now: See cvv tutorial.
Overview
++++++++
Filter
------
Views
++++++++
Interactive Visual Debugging of Computer Vision applications {#tutorial_cvv_introduction}
============================================================
What is the most common way to debug computer vision applications? Usually the answer is temporary,
hacked together, custom code that must be removed from the code for release compilation.
In this tutorial we will show how to use the visual debugging features of the **cvv** module
(*opencv2/cvv.hpp*) instead.
Goals
-----
In this tutorial you will learn how to:
- Add cvv debug calls to your application
- Use the visual debug GUI
- Enable and disable the visual debug features during compilation (with zero runtime overhead when
disabled)
Code
----
The example code
- captures images (*videoio*), e.g. from a webcam,
- applies some filters to each image (*imgproc*),
- detects image features and matches them to the previous image (*features2d*).
If the program is compiled without visual debugging (see CMakeLists.txt below) the only result is
some information printed to the command line. We want to demonstrate how much debugging or
development functionality is added by just a few lines of *cvv* commands.
@includelineno cvv/samples/cvv_demo.cpp
@code{.cmake}
cmake_minimum_required(VERSION 2.8)
project(cvvisual_test)
SET(CMAKE_PREFIX_PATH ~/software/opencv/install)
SET(CMAKE_CXX_COMPILER "g++-4.8")
SET(CMAKE_CXX_FLAGS "-std=c++11 -O2 -pthread -Wall -Werror")
# (un)set: cmake -DCVV_DEBUG_MODE=OFF ..
OPTION(CVV_DEBUG_MODE "cvvisual-debug-mode" ON)
if(CVV_DEBUG_MODE MATCHES ON)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DCVVISUAL_DEBUGMODE")
endif()
FIND_PACKAGE(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})
add_executable(cvvt main.cpp)
target_link_libraries(cvvt
opencv_core opencv_videoio opencv_imgproc opencv_features2d
opencv_cvv
)
@endcode
Explanation
-----------
-# We compile the program either using the above CMakeLists.txt with option *CVV_DEBUG_MODE=ON*
(*cmake -DCVV_DEBUG_MODE=ON*) or by adding the corresponding define *CVVISUAL_DEBUGMODE* to
our compiler (e.g. *g++ -DCVVISUAL_DEBUGMODE*).
-# The first cvv call simply shows the image (similar to *imshow*) with the imgIdString as comment.
@code{.cpp}
cvv::showImage(imgRead, CVVISUAL_LOCATION, imgIdString.c_str());
@endcode
The image is added to the overview tab in the visual debug GUI and the cvv call blocks.
![image](images/01_overview_single.jpg)
The image can then be selected and viewed
![image](images/02_single_image_view.jpg)
Whenever you want to continue in the code, i.e. unblock the cvv call, you can either continue
until the next cvv call (*Step*), continue until the last cvv call (*\>\>*) or run the
application until it exits (*Close*).
We decide to press the green *Step* button.
-# The next cvv calls are used to debug all kinds of filter operations, i.e. operations that take a
picture as input and return a picture as output.
@code{.cpp}
cvv::debugFilter(imgRead, imgGray, CVVISUAL_LOCATION, "to gray");
@endcode
As with every cvv call, you first end up in the overview.
![image](images/03_overview_two.jpg)
We decide not to care about the conversion to gray scale and press *Step*.
@code{.cpp}
cvv::debugFilter(imgGray, imgGraySmooth, CVVISUAL_LOCATION, "smoothed");
@endcode
If you open the filter call, you will end up in the so-called "DefaultFilterView". Both images
are shown next to each other and you can zoom into them (synchronized).
![image](images/04_default_filter_view.jpg)
When you go to very high zoom levels, each pixel is annotated with its numeric values.
![image](images/05_default_filter_view_high_zoom.jpg)
We press *Step* twice and have a look at the dilated image.
@code{.cpp}
cvv::debugFilter(imgEdges, imgEdgesDilated, CVVISUAL_LOCATION, "dilated edges");
@endcode
The DefaultFilterView showing both images
![image](images/06_default_filter_view_edges.jpg)
Now we use the *View* selector in the top right and select the "DualFilterView". We select
"Changed Pixels" as filter and apply it (middle image).
![image](images/07_dual_filter_view_edges.jpg)
After we had a close look at these images, perhaps using different views, filters or other GUI
features, we decide to let the program run through. Therefore we press the yellow *\>\>* button.
The program will block at
@code{.cpp}
cvv::finalShow();
@endcode
and display the overview with everything that was passed to cvv in the meantime.
![image](images/08_overview_all.jpg)
-# The cvv debugDMatch call is used in a situation where there are two images each with a set of
descriptors that are matched to each other.
We pass both images, both sets of keypoints and their matching to the visual debug module.
@code{.cpp}
cvv::debugDMatch(prevImgGray, prevKeypoints, imgGray, keypoints, matches, CVVISUAL_LOCATION, allMatchIdString.c_str());
@endcode
Since we want to have a look at matches, we use the filter capabilities (*\#type match*) in the
overview to only show match calls.
![image](images/09_overview_filtered_type_match.jpg)
We want to have a closer look at one of them, e.g. to tune our parameters that use the matching.
The view has various settings for how to display keypoints and matches. Furthermore, there is a
mouseover tooltip.
![image](images/10_line_match_view.jpg)
We see (visual debugging!) that there are many bad matches. We decide that only 70% of the
matches should be shown - those 70% with the lowest match distance.
![image](images/11_line_match_view_portion_selector.jpg)
Having successfully reduced the visual distraction, we want to see more clearly what changed
between the two images. We select the "TranslationMatchView" that shows, in a different way,
where each keypoint was matched to.
![image](images/12_translation_match_view_portion_selector.jpg)
It is easy to see that the cup was moved to the left between the two images.
Although cvv is all about interactively *seeing* computer vision bugs, this is complemented
by a "RawView" that allows one to inspect the underlying numeric data.
![image](images/13_raw_view.jpg)
-# There are many more useful features contained in the cvv GUI. For instance, one can group the
overview tab.
![image](images/14_overview_group_by_line.jpg)
Result
------
- By adding a few expressive lines to our computer vision program we can interactively debug it
through different visualizations.
- Once we are done developing/debugging we do not have to remove those lines. We simply disable
cvv debugging (*cmake -DCVV_DEBUG_MODE=OFF* or g++ without *-DCVVISUAL_DEBUGMODE*) and our
program runs without any debug overhead.
Enjoy computer vision!
set(the_description "datasets framework")
ocv_define_module(datasets opencv_core opencv_face opencv_ml opencv_flann opencv_text)
ocv_define_module(datasets opencv_core opencv_face opencv_ml opencv_flann opencv_text WRAP python)
ocv_warnings_disable(CMAKE_CXX_FLAGS /wd4267) # flann, Win64
HMDB: A Large Human Motion Database
===================================
.. ocv:class:: AR_hmdb
Implements loading dataset:
_`"HMDB: A Large Human Motion Database"`: http://serre-lab.clps.brown.edu/resource/hmdb-a-large-human-motion-database/
.. note:: Usage
1. From link above download dataset files: hmdb51_org.rar & test_train_splits.rar.
2. Unpack them. Unpack all archives from directory: hmdb51_org/ and remove them.
3. To load data run: ./opencv/build/bin/example_datasets_ar_hmdb -p=/home/user/path_to_unpacked_folders/
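After unpacking, loading the data programmatically looks roughly like this (a sketch; the path is the one used above):
.. code-block:: cpp
#include <opencv2/datasets/ar_hmdb.hpp>
#include <cstdio>
int main()
{
// create the loader and point it at the unpacked dataset folders
cv::Ptr<cv::datasets::AR_hmdb> dataset = cv::datasets::AR_hmdb::create();
dataset->load("/home/user/path_to_unpacked_folders/");
printf("loaded train samples: %u\n", (unsigned)dataset->getTrain().size());
return 0;
}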
Benchmark
"""""""""
For this dataset, a benchmark was implemented with accuracy 0.107407 (using precomputed HOG/HOF "STIP" features from the site, averaged over 3 splits)
To run this benchmark execute:
.. code-block:: bash
./opencv/build/bin/example_datasets_ar_hmdb_benchmark -p=/home/user/path_to_unpacked_folders/
(precomputed features should be unpacked in the same folder: /home/user/path_to_unpacked_folders/hmdb51_org_stips/. Also unpack all archives from directory: hmdb51_org_stips/ and remove them.)
**References:**
.. [Kuehne11] H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre. HMDB: A Large Video Database for Human Motion Recognition. ICCV, 2011
.. [Laptev08] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning Realistic Human Actions From Movies. CVPR, 2008
Sports-1M Dataset
=================
.. ocv:class:: AR_sports
Implements loading dataset:
_`"Sports-1M Dataset"`: http://cs.stanford.edu/people/karpathy/deepvideo/
.. note:: Usage
1. From link above download dataset files (git clone https://code.google.com/p/sports-1m-dataset/).
2. To load data run: ./opencv/build/bin/example_datasets_ar_sports -p=/home/user/path_to_downloaded_folders/
**References:**
.. [KarpathyCVPR14] Andrej Karpathy and George Toderici and Sanketh Shetty and Thomas Leung and Rahul Sukthankar and Li Fei-Fei. Large-scale Video Classification with Convolutional Neural Networks. CVPR, 2014
*******************************************************
datasets. Framework for working with different datasets
*******************************************************
.. highlight:: cpp
The datasets module includes classes for working with different datasets: loading data, evaluating different algorithms on them, running benchmarks, etc.
It is planned to have:
* basic: loading code for all datasets to help start work with them.
* next stage: quick benchmarks for all datasets to show how to solve them using OpenCV and implement evaluation code.
* finally: implement, on top of OpenCV, state-of-the-art algorithms that solve these tasks.
.. toctree::
:hidden:
ar_hmdb
ar_sports
fr_adience
fr_lfw
gr_chalearn
gr_skig
hpe_humaneva
hpe_parse
ir_affine
ir_robot
is_bsds
is_weizmann
msm_epfl
msm_middlebury
or_imagenet
or_mnist
or_sun
pd_caltech
slam_kitti
slam_tumindoor
tr_chars
tr_svt
Action Recognition
------------------
:doc:`ar_hmdb` [#f1]_
:doc:`ar_sports`
Face Recognition
----------------
:doc:`fr_adience`
:doc:`fr_lfw` [#f1]_
Gesture Recognition
-------------------
:doc:`gr_chalearn`
:doc:`gr_skig`
Human Pose Estimation
---------------------
:doc:`hpe_humaneva`
:doc:`hpe_parse`
Image Registration
------------------
:doc:`ir_affine`
:doc:`ir_robot`
Image Segmentation
------------------
:doc:`is_bsds`
:doc:`is_weizmann`
Multiview Stereo Matching
-------------------------
:doc:`msm_epfl`
:doc:`msm_middlebury`
Object Recognition
------------------
:doc:`or_imagenet`
:doc:`or_mnist` [#f2]_
:doc:`or_sun`
Pedestrian Detection
--------------------
:doc:`pd_caltech` [#f2]_
SLAM
----
:doc:`slam_kitti`
:doc:`slam_tumindoor`
Text Recognition
----------------
:doc:`tr_chars`
:doc:`tr_svt` [#f1]_
*Footnotes*
.. [#f1] Benchmark implemented
.. [#f2] Not used in Vision Challenge
Adience
=======
.. ocv:class:: FR_adience
Implements loading dataset:
_`"Adience"`: http://www.openu.ac.il/home/hassner/Adience/data.html
.. note:: Usage
1. From link above download any dataset file: faces.tar.gz\\aligned.tar.gz and files with splits: fold_0_data.txt-fold_4_data.txt, fold_frontal_0_data.txt-fold_frontal_4_data.txt. (For the face recognition task, other splits should be created.)
2. Unpack dataset file to some folder and place split files into the same folder.
3. To load data run: ./opencv/build/bin/example_datasets_fr_adience -p=/home/user/path_to_created_folder/
**References:**
.. [Eidinger] E. Eidinger, R. Enbar, and T. Hassner. Age and Gender Estimation of Unfiltered Faces
Labeled Faces in the Wild
=========================
.. ocv:class:: FR_lfw
Implements loading dataset:
_`"Labeled Faces in the Wild"`: http://vis-www.cs.umass.edu/lfw/
.. note:: Usage
1. From link above download any dataset file: lfw.tgz\\lfwa.tar.gz\\lfw-deepfunneled.tgz\\lfw-funneled.tgz and files with pairs: 10 test splits: pairs.txt and developer train split: pairsDevTrain.txt.
2. Unpack dataset file and place pairs.txt and pairsDevTrain.txt in created folder.
3. To load data run: ./opencv/build/bin/example_datasets_fr_lfw -p=/home/user/path_to_unpacked_folder/lfw2/
Benchmark
"""""""""
For this dataset, a benchmark was implemented with accuracy 0.623833 ± 0.005223 (train split: pairsDevTrain.txt, dataset: lfwa)
To run this benchmark execute:
.. code-block:: bash
./opencv/build/bin/example_datasets_fr_lfw_benchmark -p=/home/user/path_to_unpacked_folder/lfw2/
**References:**
.. [Huang07] G.B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. 2007
ChaLearn Looking at People
==========================
.. ocv:class:: GR_chalearn
Implements loading dataset:
_`"ChaLearn Looking at People"`: http://gesture.chalearn.org/
.. note:: Usage
1. Follow the instructions from the site above, download files for dataset "Track 3: Gesture Recognition": Train1.zip-Train5.zip, Validation1.zip-Validation3.zip (register on site: www.codalab.org and accept the terms and conditions of the competition: https://www.codalab.org/competitions/991#learn_the_details There are three mirrors for downloading dataset files; when the data was downloaded, only the "Universitat Oberta de Catalunya" mirror worked).
2. Unpack train archives Train1.zip-Train5.zip to folder Train/, validation archives Validation1.zip-Validation3.zip to folder Validation/
3. Unpack all archives in Train/ & Validation/ in the folders with the same names, for example: Sample0001.zip to Sample0001/
4. To load data run: ./opencv/build/bin/example_datasets_gr_chalearn -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Escalera14] S. Escalera, X. Baró, J. Gonzàlez, M.A. Bautista, M. Madadi, M. Reyes, V. Ponce-López, H.J. Escalante, J. Shotton, I. Guyon, "ChaLearn Looking at People Challenge 2014: Dataset and Results", ECCV Workshops, 2014
Sheffield Kinect Gesture Dataset
================================
.. ocv:class:: GR_skig
Implements loading dataset:
_`"Sheffield Kinect Gesture Dataset"`: http://lshao.staff.shef.ac.uk/data/SheffieldKinectGesture.htm
.. note:: Usage
1. From link above download dataset files: subject1_dep.7z-subject6_dep.7z, subject1_rgb.7z-subject6_rgb.7z.
2. Unpack them.
3. To load data run: ./opencv/build/bin/example_datasets_gr_skig -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Liu13] L. Liu and L. Shao, “Learning Discriminative Representations from RGB-D Video Data”, In Proc. International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013.
HumanEva Dataset
================
.. ocv:class:: HPE_humaneva
Implements loading dataset:
_`"HumanEva Dataset"`: http://humaneva.is.tue.mpg.de
.. note:: Usage
1. From link above download dataset files for HumanEva-I (tar) & HumanEva-II.
2. Unpack them to HumanEva_1 & HumanEva_2 accordingly.
3. To load data run: ./opencv/build/bin/example_datasets_hpe_humaneva -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Sigal10] L. Sigal, A. Balan and M. J. Black. HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion, In International Journal of Computer Vision, Vol. 87 (1-2), 2010
.. [Sigal06] L. Sigal and M. J. Black. HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion, Technical Report CS-06-08, Brown University, 2006
PARSE Dataset
=============
.. ocv:class:: HPE_parse
Implements loading dataset:
_`"PARSE Dataset"`: http://www.ics.uci.edu/~dramanan/papers/parse/
.. note:: Usage
1. From link above download dataset file: people.zip.
2. Unpack it.
3. To load data run: ./opencv/build/bin/example_datasets_hpe_parse -p=/home/user/path_to_unpacked_folder/people_all/
**References:**
.. [Ramanan06] D. Ramanan "Learning to Parse Images of Articulated Bodies." Neural Info. Proc. Systems (NIPS) To appear. Dec 2006.
Affine Covariant Regions Datasets
=================================
.. ocv:class:: IR_affine
Implements loading dataset:
_`"Affine Covariant Regions Datasets"`: http://www.robots.ox.ac.uk/~vgg/data/data-aff.html
.. note:: Usage
1. From link above download dataset files: bark\\bikes\\boat\\graf\\leuven\\trees\\ubc\\wall.tar.gz.
2. Unpack them.
3. To load data, for example, for "bark", run: ./opencv/build/bin/example_datasets_ir_affine -p=/home/user/path_to_unpacked_folder/bark/
**References:**
.. [Mikolajczyk05] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, L. Van Gool. A Comparison of Affine Region Detectors. International Journal of Computer Vision, Volume 65, Number 1/2, page 43--72, 2005
Robot Data Set
==============
.. ocv:class:: IR_robot
Implements loading dataset:
_`"Robot Data Set, Point Feature Data Set – 2010"`: http://roboimagedata.compute.dtu.dk/?page_id=24
.. note:: Usage
1. From link above download dataset files: SET001_6.tar.gz-SET055_60.tar.gz
2. Unpack them to one folder.
3. To load data run: ./opencv/build/bin/example_datasets_ir_robot -p=/home/user/path_to_unpacked_folder/
**References:**
.. [aanæsinteresting] Aanæs, H. and Dahl, A.L. and Steenstrup Pedersen, K. Interesting Interest Points. International Journal of Computer Vision. 2012.
The Berkeley Segmentation Dataset and Benchmark
===============================================
.. ocv:class:: IS_bsds
Implements loading of the dataset:
_`"The Berkeley Segmentation Dataset and Benchmark"`: https://www.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
.. note:: Usage
1. From the link above, download the dataset files: BSDS300-human.tgz & BSDS300-images.tgz.
2. Unpack them.
3. To load the data, run: ./opencv/build/bin/example_datasets_is_bsds -p=/home/user/path_to_unpacked_folder/BSDS300/
**References:**
.. [MartinFTM01] D. Martin, C. Fowlkes, D. Tal and J. Malik. A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics. 2001
Weizmann Segmentation Evaluation Database
=========================================
.. ocv:class:: IS_weizmann
Implements loading of the dataset:
_`"Weizmann Segmentation Evaluation Database"`: http://www.wisdom.weizmann.ac.il/~vision/Seg_Evaluation_DB/
.. note:: Usage
1. From the link above, download the dataset files: Weizmann_Seg_DB_1obj.ZIP & Weizmann_Seg_DB_2obj.ZIP.
2. Unpack them.
3. To load the data, for example for the 1-object dataset, run: ./opencv/build/bin/example_datasets_is_weizmann -p=/home/user/path_to_unpacked_folder/1obj/
**References:**
.. [AlpertGBB07] Sharon Alpert, Meirav Galun, Ronen Basri and Achi Brandt. Image Segmentation by Probabilistic Bottom-Up Aggregation and Cue Integration. 2007
EPFL Multi-View Stereo
======================
.. ocv:class:: MSM_epfl
Implements loading of the dataset:
_`"EPFL Multi-View Stereo"`: http://cvlabwww.epfl.ch/~strecha/multiview/denseMVS.html
.. note:: Usage
1. From the link above, download the dataset files: castle_dense\\castle_dense_large\\castle_entry\\fountain\\herzjesu_dense\\herzjesu_dense_large_bounding\\cameras\\images\\p.tar.gz.
2. Unpack them into a separate folder for each object. For example, for "fountain", in the folder fountain/: fountain_dense_bounding.tar.gz -> bounding/, fountain_dense_cameras.tar.gz -> camera/, fountain_dense_images.tar.gz -> png/, fountain_dense_p.tar.gz -> P/.
3. To load the data, for example for "fountain", run: ./opencv/build/bin/example_datasets_msm_epfl -p=/home/user/path_to_unpacked_folder/fountain/
**References:**
.. [Strecha08] C. Strecha, W. von Hansen, L. Van Gool, P. Fua, U. Thoennessen. On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery. CVPR, 2008
Stereo – Middlebury Computer Vision
===================================
.. ocv:class:: MSM_middlebury
Implements loading of the dataset:
_`"Stereo – Middlebury Computer Vision"`: http://vision.middlebury.edu/mview/
.. note:: Usage
1. From the link above, download the dataset files: dino\\dinoRing\\dinoSparseRing\\temple\\templeRing\\templeSparseRing.zip.
2. Unpack them.
3. To load the data, for example the "temple" dataset, run: ./opencv/build/bin/example_datasets_msm_middlebury -p=/home/user/path_to_unpacked_folder/temple/
**References:**
.. [Seitz06] S. M. Seitz, B. Curless, J. Diebel, D. Scharstein, R. Szeliski. A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, CVPR, 2006
ImageNet
========
.. ocv:class:: OR_imagenet
Implements loading of the dataset:
_`"ImageNet"`: http://www.image-net.org/
.. note:: Usage
1. From the link above, download the dataset files ILSVRC2010_images_train.tar\\ILSVRC2010_images_test.tar\\ILSVRC2010_images_val.tar and the devkit ILSVRC2010_devkit-1.0.tar.gz. (Loading is implemented for the 2010 dataset, since only this release has ground truth for the test data; the structure of ILSVRC2014 is similar.)
2. Unpack them into some_folder/train/\\some_folder/test/\\some_folder/val & some_folder/ILSVRC2010_validation_ground_truth.txt\\some_folder/ILSVRC2010_test_ground_truth.txt.
3. Create a file with labels, some_folder/labels.txt, for example using the :ref:`Python script <python-script>` below (row format: synset,labelID,description; for example: "n07751451,18,plum").
4. Unpack all the tar files in train/.
5. To load the data, run: ./opencv/build/bin/example_datasets_or_imagenet -p=/home/user/some_folder/
.. _python-script:
Python script to parse meta.mat:
::

    import scipy.io

    meta_mat = scipy.io.loadmat("devkit-1.0/data/meta.mat")
    # synset -> 0-based label ID, and synset -> human-readable description
    labels_dic = dict((m[0][1][0], m[0][0][0][0]-1) for m in meta_mat['synsets'])
    label_names_dic = dict((m[0][1][0], m[0][2][0]) for m in meta_mat['synsets'])
    for label in labels_dic.keys():
        print("{0},{1},{2}".format(label, labels_dic[label], label_names_dic[label]))
**References:**
.. [ILSVRCarxiv14] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. 2014
MNIST
=====
.. ocv:class:: OR_mnist
Implements loading of the dataset:
_`"MNIST"`: http://yann.lecun.com/exdb/mnist/
.. note:: Usage
1. From the link above, download the dataset files: t10k-images-idx3-ubyte.gz, t10k-labels-idx1-ubyte.gz, train-images-idx3-ubyte.gz, train-labels-idx1-ubyte.gz.
2. Unpack them.
3. To load the data, run: ./opencv/build/bin/example_datasets_or_mnist -p=/home/user/path_to_unpacked_files/ (see the C++ sketch below)
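Once loaded, individual samples can be inspected. A minimal sketch, assuming OR_mnistObj exposes the sample's digit label and its image as a Mat (field names follow the pattern of the other dataset objects in this module):

.. code-block:: cpp

   #include "opencv2/datasets/or_mnist.hpp"
   #include <opencv2/core.hpp>

   #include <cstdio>

   using namespace cv;
   using namespace cv::datasets;

   int main()
   {
       Ptr<OR_mnist> dataset = OR_mnist::create();
       dataset->load("/home/user/path_to_unpacked_files/");
       // first training sample: a digit label and a small grayscale image
       OR_mnistObj *example = static_cast<OR_mnistObj *>(dataset->getTrain()[0].get());
       printf("label: %d, image: %dx%d\n",
              (int)example->label, example->image.cols, example->image.rows);
       return 0;
   }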
**References:**
.. [LeCun98a] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
SUN Database
============
.. ocv:class:: OR_sun
Implements loading of the dataset:
_`"SUN Database, Scene Recognition Benchmark. SUN397"`: http://vision.cs.princeton.edu/projects/2010/SUN/
.. note:: Usage
1. From the link above, download the dataset file SUN397.tar and the file with splits, Partitions.zip.
2. Unpack SUN397.tar into the folder SUN397/ and Partitions.zip into SUN397/Partitions/.
3. To load the data, run: ./opencv/build/bin/example_datasets_or_sun -p=/home/user/path_to_unpacked_files/SUN397/ (see the split-walking sketch below)
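The partitions from Partitions.zip are exposed through the generic split API (the same getNumSplits()/getTrain()/getTest() calls used by the LFW sample in this module). A sketch, assuming the loader populates one split per partition:

.. code-block:: cpp

   #include "opencv2/datasets/or_sun.hpp"
   #include <opencv2/core.hpp>

   #include <cstdio>

   using namespace cv;
   using namespace cv::datasets;

   int main()
   {
       Ptr<OR_sun> dataset = OR_sun::create();
       dataset->load("/home/user/path_to_unpacked_files/SUN397/");
       // each split pairs a training partition with a test partition
       for (int s = 0; s < (int)dataset->getNumSplits(); ++s)
           printf("split %d: train %u, test %u\n", s,
                  (unsigned int)dataset->getTrain(s).size(),
                  (unsigned int)dataset->getTest(s).size());
       return 0;
   }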
**References:**
.. [Xiao10] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN Database: Large-scale Scene Recognition from Abbey to Zoo. IEEE Conference on Computer Vision and Pattern Recognition. CVPR, 2010
.. [Xiao14] J. Xiao, K. A. Ehinger, J. Hays, A. Torralba, and A. Oliva. SUN Database: Exploring a Large Collection of Scene Categories. International Journal of Computer Vision. IJCV, 2014
Caltech Pedestrian Detection Benchmark
======================================
.. ocv:class:: PD_caltech
Implements loading of the dataset:
_`"Caltech Pedestrian Detection Benchmark"`: http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/
.. note:: This is a first version of the Caltech Pedestrian dataset loader. The code that unpacks all frames from the seq files is commented out, since their number is huge; currently only meta information is loaded, without the data. Ground truth is not processed either, as it first needs to be converted from the mat files.
.. note:: Usage
1. From the link above, download the dataset files: set00.tar-set10.tar.
2. Unpack them into a separate folder.
3. To load the data, run: ./opencv/build/bin/example_datasets_pd_caltech -p=/home/user/path_to_unpacked_folders/
**References:**
.. [Dollár12] P. Dollár, C. Wojek, B. Schiele and P. Perona. Pedestrian Detection: An Evaluation of the State of the Art. PAMI, 2012.
.. [DollárCVPR09] P. Dollár, C. Wojek, B. Schiele and P. Perona. Pedestrian Detection: A Benchmark. CVPR, 2009
KITTI Vision Benchmark
======================
.. ocv:class:: SLAM_kitti
Implements loading of the dataset:
_`"KITTI Vision Benchmark"`: http://www.cvlibs.net/datasets/kitti/eval_odometry.php
.. note:: Usage
1. From the link above, download the "Odometry" dataset files: data_odometry_gray\\data_odometry_color\\data_odometry_velodyne\\data_odometry_poses\\data_odometry_calib.zip.
2. Unpack data_odometry_poses.zip; it creates the folder dataset/poses/. After that, unpack data_odometry_gray.zip, data_odometry_color.zip and data_odometry_velodyne.zip. The folder dataset/sequences/ will be created with the subfolders 00/..21/. Each of these folders will contain: image_0/, image_1/, image_2/, image_3/, velodyne/ and the files calib.txt & times.txt. These last two files are replaced when data_odometry_calib.zip is unpacked at the end.
3. To load the data, run: ./opencv/build/bin/example_datasets_slam_kitti -p=/home/user/path_to_unpacked_folder/dataset/
**References:**
.. [Geiger2012CVPR] Andreas Geiger, Philip Lenz and Raquel Urtasun. Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. CVPR, 2012
.. [Geiger2013IJRR] Andreas Geiger, Philip Lenz, Christoph Stiller and Raquel Urtasun. Vision meets Robotics: The KITTI Dataset. IJRR, 2013
.. [Fritsch2013ITSC] Jannik Fritsch, Tobias Kuehnl and Andreas Geiger. A New Performance Measure and Evaluation Benchmark for Road Detection Algorithms. ITSC, 2013
TUMindoor Dataset
=================
.. ocv:class:: SLAM_tumindoor
Implements loading of the dataset:
_`"TUMindoor Dataset"`: http://www.navvis.lmt.ei.tum.de/dataset/
.. note:: Usage
1. From the link above, download the dataset files dslr\\info\\ladybug\\pointcloud.tar.bz2 for each dataset: 11-11-28 (1st floor)\\11-12-13 (1st floor N1)\\11-12-17a (4th floor)\\11-12-17b (3rd floor)\\11-12-17c (Ground I)\\11-12-18a (Ground II)\\11-12-18b (2nd floor).
2. Unpack them into a separate folder for each dataset: dslr.tar.bz2 -> dslr/, info.tar.bz2 -> info/, ladybug.tar.bz2 -> ladybug/, pointcloud.tar.bz2 -> pointcloud/.
3. To load each dataset, run: ./opencv/build/bin/example_datasets_slam_tumindoor -p=/home/user/path_to_unpacked_folders/
**References:**
.. [TUMindoor] R. Huitl, G. Schroth, S. Hilsenbeck, F. Schweiger and E. Steinbach. TUMindoor: An Extensive Image and Point Cloud Dataset for Visual Indoor Localization and Mapping. 2012
The Chars74K Dataset
====================
.. ocv:class:: TR_chars
Implements loading of the dataset:
_`"The Chars74K Dataset"`: http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
.. note:: Usage
1. From the link above, download the dataset files: EnglishFnt\\EnglishHnd\\EnglishImg\\KannadaHnd\\KannadaImg.tgz, ListsTXT.tgz.
2. Unpack them.
3. Move the .m files from the folder ListsTXT/ to the appropriate folders. For example, English/list_English_Img.m for EnglishImg.tgz.
4. To load the data, for example for "EnglishImg", run: ./opencv/build/bin/example_datasets_tr_chars -p=/home/user/path_to_unpacked_folder/English/
**References:**
.. [Campos09] T. E. de Campos, B. R. Babu and M. Varma. Character recognition in natural images. In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP), 2009
The Street View Text Dataset
============================
.. ocv:class:: TR_svt
Implements loading of the dataset:
_`"The Street View Text Dataset"`: http://vision.ucsd.edu/~kai/svt/
.. note:: Usage
1. From the link above, download the dataset file svt.zip.
2. Unpack it.
3. To load the data, run: ./opencv/build/bin/example_datasets_tr_svt -p=/home/user/path_to_unpacked_folder/svt/svt1/
Benchmark
"""""""""
A benchmark with accuracy (mean F1) of 0.217 has been implemented for this dataset.
To run the benchmark, execute:
.. code-block:: bash
./opencv/build/bin/example_datasets_tr_svt_benchmark -p=/home/user/path_to_unpacked_folders/svt/svt1/
**References:**
.. [Wang11] Kai Wang, Boris Babenko and Serge Belongie. End-to-end Scene Text Recognition. ICCV, 2011
.. [Wang10] Kai Wang and Serge Belongie. Word Spotting in the Wild. ECCV, 2010
@@ -282,7 +282,7 @@ Usage:
Implements loading of the dataset:
"EPFL Multi-View Stereo": <http://cvlabwww.epfl.ch/~strecha/multiview/denseMVS.html>
"EPFL Multi-View Stereo": <http://cvlab.epfl.ch/data/strechamvs>
Usage:
-# From the link above, download the dataset files:
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2015, Itseez Inc, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Itseez Inc or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#ifndef OPENCV_DATASETS_PD_INRIA_HPP
#define OPENCV_DATASETS_PD_INRIA_HPP
#include <string>
#include <vector>
#include "opencv2/datasets/dataset.hpp"
#include <opencv2/core.hpp>
namespace cv
{
namespace datasets
{
//! @addtogroup datasets_pd
//! @{
enum sampleType
{
POS = 0,
NEG = 1
};
struct PD_inriaObj : public Object
{
// image file name
std::string filename;
// positive or negative
sampleType sType;
// image size
int width;
int height;
int depth;
// bounding boxes
std::vector< Rect > bndboxes;
};
class CV_EXPORTS PD_inria : public Dataset
{
public:
virtual void load(const std::string &path) = 0;
static Ptr<PD_inria> create();
};
//! @}
}
}
#endif
@@ -78,7 +78,9 @@ int main(int argc, const char *argv[])
{
const char *keys =
"{ help h usage ? | | show this message }"
"{ path p |true| path to dataset (lfw2 folder) }";
"{ path p |true| path to dataset (lfw2 folder) }"
"{ train t |dev | train method: 'dev'(pairsDevTrain.txt) or 'split'(pairs.txt) }";
CommandLineParser parser(argc, argv, keys);
string path(parser.get<string>("path"));
if (parser.has("help") || path=="true")
@@ -86,6 +88,7 @@ int main(int argc, const char *argv[])
parser.printMessage();
return -1;
}
string trainMethod(parser.get<string>("train"));
// These vectors hold the images and corresponding labels.
vector<Mat> images;
@@ -97,24 +100,12 @@ int main(int argc, const char *argv[])
unsigned int numSplits = dataset->getNumSplits();
printf("splits number: %u\n", numSplits);
printf("train size: %u\n", (unsigned int)dataset->getTrain().size());
if (trainMethod == "dev")
printf("train size: %u\n", (unsigned int)dataset->getTrain().size());
else
printf("train size: %u\n", (numSplits-1) * (unsigned int)dataset->getTest().size());
printf("test size: %u\n", (unsigned int)dataset->getTest().size());
for (unsigned int i=0; i<dataset->getTrain().size(); ++i)
{
FR_lfwObj *example = static_cast<FR_lfwObj *>(dataset->getTrain()[i].get());
int currNum = getLabel(example->image1);
Mat img = imread(path+example->image1, IMREAD_GRAYSCALE);
images.push_back(img);
labels.push_back(currNum);
currNum = getLabel(example->image2);
img = imread(path+example->image2, IMREAD_GRAYSCALE);
images.push_back(img);
labels.push_back(currNum);
}
// 2200 pairsDevTrain, first split: correct: 373, from: 600 -> 62.1667%
Ptr<FaceRecognizer> model = createLBPHFaceRecognizer();
// 2200 pairsDevTrain, first split: correct: 369, from: 600 -> 61.5%
@@ -122,14 +113,58 @@ int main(int argc, const char *argv[])
// 2200 pairsDevTrain, first split: correct: 372, from: 600 -> 62%
//Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
model->train(images, labels);
//string saveModelPath = "face-rec-model.txt";
//cout << "Saving the trained model to " << saveModelPath << endl;
//model->save(saveModelPath);
if (trainMethod == "dev") // train on personsDevTrain.txt
{
for (unsigned int i=0; i<dataset->getTrain().size(); ++i)
{
FR_lfwObj *example = static_cast<FR_lfwObj *>(dataset->getTrain()[i].get());
int currNum = getLabel(example->image1);
Mat img = imread(path+example->image1, IMREAD_GRAYSCALE);
images.push_back(img);
labels.push_back(currNum);
currNum = getLabel(example->image2);
img = imread(path+example->image2, IMREAD_GRAYSCALE);
images.push_back(img);
labels.push_back(currNum);
}
model->train(images, labels);
//string saveModelPath = "face-rec-model.txt";
//cout << "Saving the trained model to " << saveModelPath << endl;
//model->save(saveModelPath);
}
vector<double> p;
for (unsigned int j=0; j<numSplits; ++j)
{
if (trainMethod == "split") // train on the remaining 9 splits from pairs.txt
{
images.clear();
labels.clear();
for (unsigned int j2=0; j2<numSplits; ++j2)
{
if (j==j2) continue; // skip test split for training
vector < Ptr<Object> > &curr = dataset->getTest(j2);
for (unsigned int i=0; i<curr.size(); ++i)
{
FR_lfwObj *example = static_cast<FR_lfwObj *>(curr[i].get());
int currNum = getLabel(example->image1);
Mat img = imread(path+example->image1, IMREAD_GRAYSCALE);
images.push_back(img);
labels.push_back(currNum);
currNum = getLabel(example->image2);
img = imread(path+example->image2, IMREAD_GRAYSCALE);
images.push_back(img);
labels.push_back(currNum);
}
}
model->train(images, labels);
}
unsigned int incorrect = 0, correct = 0;
vector < Ptr<Object> > &curr = dataset->getTest(j);
for (unsigned int i=0; i<curr.size(); ++i)
@@ -168,7 +203,7 @@ int main(int argc, const char *argv[])
sigma += (*it - mu)*(*it - mu);
}
sigma = sqrt(sigma/p.size());
double se = sigma/sqrt(p.size());
double se = sigma/sqrt(double(p.size()));
printf("estimated mean accuracy: %f and the standard error of the mean: %f\n", mu, se);
return 0;
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2015, Itseez Inc, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Itseez Inc or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "opencv2/datasets/pd_inria.hpp"
#include <opencv2/core.hpp>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
using namespace cv;
using namespace cv::datasets;
int main(int argc, char *argv[])
{
const char *keys =
"{ help h usage ? | | show this message }"
"{ path p |true| path to dataset }";
CommandLineParser parser(argc, argv, keys);
string path(parser.get<string>("path"));
if (parser.has("help") || path=="true")
{
parser.printMessage();
return -1;
}
Ptr<PD_inria> dataset = PD_inria::create();
dataset->load(path);
cout << "train size: " << (unsigned int)dataset->getTrain().size() << endl;
PD_inriaObj *example = static_cast<PD_inriaObj *>(dataset->getTrain()[0].get());
cout << "first train object: " << endl;
cout << "name: " << example->filename << endl;
// image size
cout << "image size: " << endl;
cout << " - width: " << example->width << endl;
cout << " - height: " << example->height << endl;
cout << " - depth: " << example->depth << endl;
// bounding boxes
for (unsigned int i = 0; i < (example->bndboxes).size(); i++)
{
cout << "object " << i << endl;
int x = (example->bndboxes)[i].x;
int y = (example->bndboxes)[i].y;
cout << " - xmin = " << x << endl;
cout << " - ymin = " << y << endl;
cout << " - xmax = " << (example->bndboxes)[i].width + x << endl;
cout << " - ymax = " << (example->bndboxes)[i].height + y<< endl;
}
return 0;
}
@@ -84,6 +84,7 @@ void OR_mnistImp::loadDatasetPart(const string &imagesFile, const string &labels
fclose(f);
if (num*imageSize != res)
{
delete[] images;
return;
}
f = fopen(labelsFile.c_str(), "rb");
@@ -93,6 +94,8 @@ void OR_mnistImp::loadDatasetPart(const string &imagesFile, const string &labels
fclose(f);
if (num != res)
{
delete[] images;
delete[] labels;
return;
}
set(the_description "Face recognition etc")
ocv_define_module(face opencv_core opencv_imgproc)
ocv_define_module(face opencv_core opencv_imgproc WRAP python)
File mode changed from 100755 to 100644
***************************************
face. Face Recognition
***************************************
The module contains some recently added functionality that has not been stabilized, or functionality that is considered optional.
.. toctree::
:maxdepth: 2
FaceRecognizer Documentation <index>
Changelog
=========
Release 0.05
------------
This library is now included in the official OpenCV distribution (from 2.4 on).
The :ocv:class:`FaceRecognizer` is now an :ocv:class:`Algorithm`, which better fits into the overall
OpenCV API.
To reduce confusion on the user side and to minimize my work, libfacerec and OpenCV
have been synchronized and are now based on the same interfaces and implementation.
The library now has an extensive documentation:
* The API is explained in detail and with a lot of code examples.
* The face recognition guide I had written for Python and GNU Octave/MATLAB has been adapted to the new OpenCV C++ ``cv::FaceRecognizer``.
* A tutorial for gender classification with Fisherfaces.
* A tutorial for face recognition in videos (e.g. webcam).
Release highlights
++++++++++++++++++
* There are no single highlights to pick from; this release is a highlight in itself.
Release 0.04
------------
This version is fully Windows-compatible and works with OpenCV 2.3.1. Several
bugfixes were made, but none influenced the recognition rate.
Release highlights
++++++++++++++++++
* A whole lot of exceptions with meaningful error messages.
* A tutorial for Windows users: `http://bytefish.de/blog/opencv_visual_studio_and_libfacerec <http://bytefish.de/blog/opencv_visual_studio_and_libfacerec>`_
Release 0.03
------------
Reworked the library to provide separate implementations in cpp files, because
it's the preferred way of contributing OpenCV libraries. This means the library
is not header-only anymore. Slight API changes were made; please see the
documentation for details.
Release highlights
++++++++++++++++++
* New Unit Tests (for LBP Histograms) make the library more robust.
* Added more documentation.
Release 0.02
------------
Reworked the library to provide separate implementations in cpp files, because
it's the preferred way of contributing OpenCV libraries. This means the library
is not header-only anymore. Slight API changes were made; please see the
documentation for details.
Release highlights
++++++++++++++++++
* New Unit Tests (for LBP Histograms) make the library more robust.
* Added a documentation and changelog in reStructuredText.
Release 0.01
------------
Initial release as header-only library.
Release highlights
++++++++++++++++++
* Colormaps for OpenCV to enhance the visualization.
* Face Recognition algorithms implemented:
* Eigenfaces [TP91]_
* Fisherfaces [BHK97]_
* Local Binary Patterns Histograms [AHP04]_
* Added persistence facilities to store the models with a common API.
* Unit Tests (using `gtest <http://code.google.com/p/googletest/>`_).
* Providing a CMakeLists.txt to enable easy cross-platform building.
FaceRecognizer - Face Recognition with OpenCV
##############################################
OpenCV 2.4 now comes with the very new :ocv:class:`FaceRecognizer` class for face recognition. This documentation explains :doc:`the API <facerec_api>` in detail and gives you a lot of help to get started (with full source code examples). :doc:`Face Recognition with OpenCV <facerec_tutorial>` is the definitive guide to the new :ocv:class:`FaceRecognizer`. There's also a :doc:`tutorial on gender classification <tutorial/facerec_gender_classification>`, a :doc:`tutorial for face recognition in videos <tutorial/facerec_video_recognition>`, and a guide on :doc:`how to load & save your results <tutorial/facerec_save_load>`.
These documents are the help I wished for when I was working my way into face recognition. I hope you, too, find the new :ocv:class:`FaceRecognizer` a useful addition to OpenCV.
Please submit any feature requests and bug reports on the official OpenCV bug tracker at:
* http://code.opencv.org/projects/opencv/issues
Contents
========
.. toctree::
:maxdepth: 1
FaceRecognizer API <facerec_api>
Guide to Face Recognition with OpenCV <facerec_tutorial>
Tutorial on Gender Classification <tutorial/facerec_gender_classification>
Tutorial on Face Recognition in Videos <tutorial/facerec_video_recognition>
Tutorial On Saving & Loading a FaceRecognizer <tutorial/facerec_save_load>
Changelog <facerec_changelog>
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
File mode changed from 100755 to 100644
Saving and Loading a FaceRecognizer
===================================
Introduction
------------
Saving and loading a :ocv:class:`FaceRecognizer` is very important: training a FaceRecognizer can be a very time-intensive task, and it's often impossible to ship the whole face database to the user of your product. Fortunately, both operations are easy. You only have to call :ocv:func:`FaceRecognizer::save` for saving and :ocv:func:`FaceRecognizer::load` for loading a :ocv:class:`FaceRecognizer`.
I'll adapt the Eigenfaces example from the :doc:`../facerec_tutorial`: Imagine we want to learn the Eigenfaces of the `AT&T Facedatabase <http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html>`_, store the model to a YAML file and then load it again.
From the loaded model, we'll get a prediction, show the mean, Eigenfaces and the image reconstruction.
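In isolation, the two calls look like this. A minimal sketch, with the factory and file name taken from this tutorial's Eigenfaces example; the training data and the test sample are assumed to be prepared as in the full source below:

.. code-block:: cpp

   #include "opencv2/core.hpp"
   #include "opencv2/face.hpp"  // face module header in opencv_contrib

   #include <vector>

   using namespace cv;
   using namespace cv::face;

   // train once, persist the model, and later restore it without retraining
   void saveAndReload(const std::vector<Mat> &images, const std::vector<int> &labels,
                      const Mat &testSample)
   {
       Ptr<FaceRecognizer> model = createEigenFaceRecognizer();
       model->train(images, labels);
       model->save("eigenfaces_at.yml");    // writes the model state as YAML

       Ptr<FaceRecognizer> restored = createEigenFaceRecognizer();
       restored->load("eigenfaces_at.yml"); // restores num_components, mean, eigenvectors, ...
       int predicted = restored->predict(testSample);
       (void)predicted; // prediction now works without retraining
   }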
Using FaceRecognizer::save and FaceRecognizer::load
-----------------------------------------------------
The source code for this demo application is also available in the ``src`` folder that comes with this documentation:
* :download:`src/facerec_save_load.cpp <../src/facerec_save_load.cpp>`
.. literalinclude:: ../src/facerec_save_load.cpp
:language: cpp
:linenos:
Results
-------
``eigenfaces_at.yml`` then contains the model state; we'll simply look at the first 10 lines with ``head eigenfaces_at.yml``:
.. code-block:: none
philipp@mango:~/github/libfacerec-build$ head eigenfaces_at.yml
%YAML:1.0
num_components: 399
mean: !!opencv-matrix
rows: 1
cols: 10304
dt: d
data: [ 8.5558897243107765e+01, 8.5511278195488714e+01,
8.5854636591478695e+01, 8.5796992481203006e+01,
8.5952380952380949e+01, 8.6162907268170414e+01,
8.6082706766917283e+01, 8.5776942355889716e+01,
And here is the reconstruction, which is the same as the original image:
.. image:: ../img/eigenface_reconstruction_opencv.png
:align: center